The present disclosure generally relates to a monitoring system, and more particularly to a monitoring system, method, and/or program configured to extrapolate a 3-dimensional (“3D”) joint coordinate representation of a vehicle occupant and estimate the location of occluded joints in the joint coordinate representation.
According to one aspect of the present disclosure, a monitoring system for a vehicle includes at least one imaging device configured to capture a first image type and a second image type in a sequence. A first illumination source is configured to emit a flood illumination captured by the at least one imaging device in the first image type. A second illumination source is configured to emit a structured light illumination captured by the at least one imaging device in the second image type. At least one processor is configured to extract a 2-dimensional (“2D”) joint coordinate representation of a vehicle occupant from the first image type, measure a depth of at least a portion of the 2D joint coordinate representation with the second image type, extrapolate at least a partial 3-dimensional (“3D”) joint coordinate representation of the vehicle occupant, determine that at least one joint in the 2D joint coordinate representation or the 3D joint coordinate representation is obscured, and estimate, based on the 3D joint coordinate representation, a location of the at least one joint that is obscured.
According to another aspect of the present disclosure, a monitoring system for a vehicle includes at least one imaging device configured to capture a first image type and a second image type in a sequence. At least one processor is configured to extract a 2-dimensional (“2D”) joint coordinate representation of a vehicle occupant from the first image type, measure a depth of at least a portion of the 2D joint coordinate representation with the second image type, extrapolate at least a partial 3-dimensional (“3D”) joint coordinate representation of the vehicle occupant, determine that at least one joint in the 2D joint coordinate representation or the 3D joint coordinate representation is obscured, and estimate, based on the 3D joint coordinate representation, a location of the at least one joint that is obscured.
According to yet another aspect of the present disclosure, a computer program product includes a non-transitory computer-readable storage medium readable by one or more processing circuits and storing instructions for execution by one or more processors for performing a method of estimating a location of an obscured joint in a driver monitoring system based on feedback from at least one imaging device. The method includes extracting a 2-dimensional (“2D”) joint coordinate representation of a vehicle occupant from a first image type, measuring a depth of at least a portion of the 2D joint coordinate representation with a second image type, extrapolating at least a partial 3-dimensional (“3D”) joint coordinate representation of the vehicle occupant, determining that at least one joint in the 2D joint coordinate representation or the 3D joint coordinate representation is obscured, and estimating, based on the 3D joint coordinate representation, a location of the at least one joint that is obscured.
These and other features, advantages, and objects of the present disclosure will be further understood and appreciated by those skilled in the art by reference to the following specification, claims, and appended drawings.
In the drawings:
The present illustrated embodiments reside primarily in combinations of method steps and apparatus related to a monitoring system configured to extrapolate a 3-dimensional (“3D”) joint coordinate representation of a vehicle occupant and estimate the location of occluded joints in the joint coordinate representation. Accordingly, the apparatus components and method steps have been represented, where appropriate, by conventional symbols in the drawings, showing only those specific details that are pertinent to understanding the embodiments of the present disclosure so as not to obscure the disclosure with details that will be readily apparent to those of ordinary skill in the art having the benefit of the description herein. Further, like numerals in the description and drawings represent like elements.
For purposes of description herein, the terms “upper,” “lower,” “right,” “left,” “rear,” “front,” “vertical,” “horizontal,” and derivatives thereof, shall relate to the disclosure as oriented in
The terms “including,” “comprises,” “comprising,” or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. An element preceded by “comprises a . . . ” does not, without more constraints, preclude the existence of additional identical elements in the process, method, article, or apparatus that comprises the element.
Referring to
With reference now to
With reference now to
The control system 100 (e.g., the processor 104) may be configured to process the 2D information about the occupant 28 to detect locations within the second image type 18A that correspond to joints 25A-25H of the occupant 28 to extract the 2D joint coordinate representation 26. For example, the control system 100 (e.g., the processor 104) may utilize deep neural network modeling methods to predict the location of joints 25A-25H in 2D space and classify the joints 25A-25H between visible and obscured (i.e., occluded). In this manner, the process of extrapolating the 3D joint coordinate representation 30 from 2D information and, more particularly, the 2D joint coordinate representation 26 may be entirely on the basis of the second image type 18A. It is contemplated that the first mode of operation may be completed with only the second image type 18A (e.g., the structured light) such that the first illumination source 20 may be absent or otherwise not utilized for extracting the 2D joint coordinate representation 26 and, consequently, the 3D joint coordinate representation 30. In some embodiments, the first image type 16 may be captured via ambient lighting (e.g., without the first illumination source 20).
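As an illustrative sketch only (the disclosure does not specify a network architecture), a pose network of this kind typically outputs one confidence heatmap per joint; the 2D location can be decoded from each heatmap's peak, and the peak confidence can classify the joint as visible or obscured. The decoding function, joint count, and threshold below are hypothetical:

```python
import numpy as np

def decode_joint_heatmaps(heatmaps, visibility_threshold=0.5):
    """Decode per-joint heatmaps of shape (J, H, W) from a hypothetical pose
    network into 2D pixel coordinates, classifying each joint as visible or
    obscured by its peak confidence."""
    joints = []
    for hm in heatmaps:
        idx = np.unravel_index(np.argmax(hm), hm.shape)  # (row, col) of peak
        confidence = hm[idx]
        joints.append({"xy": (idx[1], idx[0]),            # (u, v) pixel coords
                       "visible": confidence >= visibility_threshold})
    return joints

# Toy example: one confident joint, one low-confidence (obscured) joint.
heatmaps = np.zeros((2, 4, 4))
heatmaps[0, 1, 2] = 0.9   # strong peak -> visible
heatmaps[1, 3, 0] = 0.2   # weak peak -> classified as obscured
decoded = decode_joint_heatmaps(heatmaps)
```

In this toy example, the second joint's low peak confidence (0.2) falls below the threshold and is therefore classified as obscured.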
With reference to
With reference now to
With reference now to
With continued reference to
The objective function of the QCQP minimizes the Euclidean distance between an optimized location in a 3D space of at least one visible joint 25A-25H and its observed 3D location, such that skeleton/joint constraints are satisfied and the implied location of the obscured joint 25A-25H falls inside a feasible space constrained by the 2D location of the obscured joint 25A-25H and the other skeleton/joint constraints.
The control system 100 (e.g., the processor 104) may constrain each visible joint 25A-25H within a ball around its observed location (e.g., the location calculated from the 2D skeleton and the depth using triangulation), where the ball diameter is determined by the size of the corresponding body portion or organ. For example, the hip may have a larger allowed distance compared to the wrist. This constraint is related to the objective function; however, whereas the objective function minimizes the sum of all such distances, the constraint ensures that each individual joint remains within its defined distance.
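A minimal sketch of this kind of constrained optimization, using a general-purpose nonlinear solver rather than a dedicated QCQP solver, might look as follows. The joint names, observed coordinates, bone length, and ball radii are hypothetical values for illustration, not parameters from the disclosure:

```python
import numpy as np
from scipy.optimize import minimize

# Observed 3D locations of the visible joints (hypothetical values, metres).
observed = {"shoulder": np.array([0.0, 0.0, 1.0]),
            "elbow":    np.array([0.0, -0.30, 1.05])}
bone_length = 0.28                                # assumed elbow-to-wrist length (m)
ball_radius = {"shoulder": 0.10, "elbow": 0.05}   # larger for larger body portions

# Decision vector: [shoulder(3), elbow(3), wrist(3)]; the wrist is obscured.
def objective(x):
    s, e = x[0:3], x[3:6]
    # Sum of squared distances between optimized and observed visible joints.
    return np.sum((s - observed["shoulder"])**2) + np.sum((e - observed["elbow"])**2)

constraints = [
    # Each visible joint stays inside a ball around its observed location.
    {"type": "ineq", "fun": lambda x: ball_radius["shoulder"]**2
        - np.sum((x[0:3] - observed["shoulder"])**2)},
    {"type": "ineq", "fun": lambda x: ball_radius["elbow"]**2
        - np.sum((x[3:6] - observed["elbow"])**2)},
    # Skeleton constraint: elbow-to-wrist distance equals the bone length.
    {"type": "eq", "fun": lambda x: np.sum((x[6:9] - x[3:6])**2) - bone_length**2},
]

# Start from the observed joints, with the wrist one bone length below the elbow.
x0 = np.concatenate([observed["shoulder"], observed["elbow"],
                     observed["elbow"] + np.array([0.0, -bone_length, 0.0])])
result = minimize(objective, x0, constraints=constraints)
wrist_estimate = result.x[6:9]
```

The quadratic objective and quadratic constraints above mirror the QCQP structure described in the text; a production system would likely use a dedicated QCQP solver rather than a generic SLSQP routine.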
In addition, using absolute scale and 3D coordinates obtained from the 3D joint coordinate representation 30, a bone length or a distance between joints 25A-25H can be obtained via the image types 16, 18A and used to estimate the location of the obscured joint 25A-25H by proximate visible joints 25A-25H. In some embodiments, the control system 100 (e.g., the processor 104) may include pre-saved ratio information that relates to common ratios between joint distances. For example, the pre-saved ratio information may include a common ratio of a first distance between shoulders 25B and wrists 25E to determine an average or likely range of a second distance between hips/torso 25H and ankles 25G. In some embodiments, the control system 100 (e.g., the processor 104) may be able to measure the distance between two joints 25A-25H (e.g., shoulders 25B and wrists 25E of a first arm) and apply that measurement to a distance between like joints 25A-25H (e.g., a second arm that is obscured). In cases where the distance cannot be measured, the control system 100 utilizes the known physical lengths of human bones. For example, the bone connecting the elbow and the wrist is, on average, expected to be between 25 and 35 cm.
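For illustration, one simple way to apply a known or assumed bone length is to project a prior guess for the obscured joint onto a sphere of that length centred on its visible neighbor. The coordinates and bone length below are hypothetical:

```python
import numpy as np

def estimate_obscured_joint(visible_joint, prior_guess, bone_length):
    """Project a prior guess for an obscured joint onto the sphere of radius
    `bone_length` centred on the connected visible joint, so the estimate
    respects the known (or assumed average) bone length."""
    direction = prior_guess - visible_joint
    norm = np.linalg.norm(direction)
    if norm == 0.0:
        # Degenerate guess coinciding with the visible joint: fall back to
        # an arbitrary direction rather than dividing by zero.
        direction, norm = np.array([1.0, 0.0, 0.0]), 1.0
    return visible_joint + bone_length * direction / norm

# Hypothetical example: elbow visible, wrist obscured, forearm assumed ~30 cm.
elbow = np.array([0.0, 0.0, 1.0])
last_seen_wrist = np.array([0.1, -0.4, 1.1])   # stale estimate from an earlier frame
wrist = estimate_obscured_joint(elbow, last_seen_wrist, bone_length=0.30)
```

The resulting estimate always lies exactly one bone length from the visible joint, which is the property the constraint enforces.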
In some embodiments, the control system 100 (e.g., the processor 104) may be configured to estimate the location of the at least one joint 25A-25H that is obscured by reviewing the first image type 16 or the second image type 18A that was previously obtained in the sequence 19A. For example, if a joint 25A-25H is initially visible in an earlier image 16, 18A, but becomes obscured, the control system 100 (e.g., the processor 104) may estimate a current location of the joint 25A-25H based on the period of time 48 between images 16, 18A and direction of detected motion between two or more images 16, 18A.
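This temporal estimate can be sketched as a constant-velocity extrapolation from the last two frames in which the joint was visible (a simplifying assumption; the timestamps and coordinates below are hypothetical):

```python
import numpy as np

def extrapolate_joint(prev_pos, last_pos, dt_between, dt_since_last):
    """Linearly extrapolate an obscured joint from its last two visible
    observations, assuming roughly constant velocity between frames."""
    velocity = (last_pos - prev_pos) / dt_between
    return last_pos + velocity * dt_since_last

# Hypothetical example: wrist seen at t=0 s and t=0.1 s, obscured at t=0.2 s.
p0 = np.array([0.00, -0.30, 1.00])
p1 = np.array([0.02, -0.28, 1.00])
current = extrapolate_joint(p0, p1, dt_between=0.1, dt_since_last=0.1)
```

The constant-velocity assumption degrades quickly as the period of occlusion grows, so an estimate of this kind would plausibly be combined with the skeleton constraints described above.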
In some embodiments, the control system 100 (e.g., the processor 104) may be configured to estimate the location of the at least one joint 25A-25H that is obscured by constraining the estimated location of the at least one joint 25A-25H that is obscured to fall inside the generated 3D cone that bounds a receptive field of a corresponding pixel in the first image type 16. For example, the generated 3D cone may originate at the closest 3D location on the lens 42 that is projected onto the corresponding 2D pixel; the cone direction is defined by the angle associated with that 2D pixel, and its spatial angle is determined by the built-in receptive field of the pixel.
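Under a simple pinhole-camera assumption (the disclosure does not prescribe a camera model), the cone axis can be obtained by back-projecting the pixel through the intrinsic matrix, and membership in the cone reduces to an angle test. The intrinsic values and half-angle below are hypothetical:

```python
import numpy as np

# Hypothetical pinhole intrinsics (fx, fy, cx, cy in pixels).
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

def pixel_cone_direction(u, v):
    """Back-project pixel (u, v) to the unit ray (cone axis) it observes."""
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])
    return ray / np.linalg.norm(ray)

def inside_pixel_cone(point, u, v, half_angle_rad):
    """True if a 3D point (in camera coordinates) falls inside the cone
    bounding the receptive field of pixel (u, v)."""
    axis = pixel_cone_direction(u, v)
    cos_angle = point @ axis / np.linalg.norm(point)
    return cos_angle >= np.cos(half_angle_rad)

# A point straight along the principal axis lies inside the centre pixel's cone.
ok = inside_pixel_cone(np.array([0.0, 0.0, 2.0]), 320.0, 240.0,
                       half_angle_rad=0.002)
```

In an optimization setting, the same angle test can be recast as a convex cone constraint on the obscured joint's 3D coordinates.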
In some embodiments, the control system 100 (e.g., the processor 104) may be configured to estimate the location of the at least one joint 25A-25H that is obscured by setting a minimum threshold depth of a depth extracted from a corresponding pixel in the first image type 16. For example, by determining a depth of the object obscuring the joint 25A-25H from corresponding pixels, a minimum depth of the obscured joint 25A-25H can be determined that is equal to or greater than the depth of the object obscuring the joint 25A-25H. It should be appreciated that the control system 100 (e.g., the processor 104) may be configured to extract depth information related to one of the joints 25A-25H only after it has been determined that the joint 25A-25H is obscured. In this manner, processing resources can be reduced.
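This constraint amounts to clamping the estimated depth of the obscured joint so it is never less than the occluder's measured depth (the depth values below are hypothetical):

```python
def constrain_depth(estimated_depth, occluder_depth):
    """Enforce the minimum-depth constraint: an obscured joint must lie at
    or behind the object occluding it, so its depth is clamped to be no
    less than the occluder's measured depth."""
    return max(estimated_depth, occluder_depth)

# Hypothetical depths in metres: a joint estimated at 0.8 m behind a 1.1 m
# occluder is pushed back to the occluder's depth.
joint_depth = constrain_depth(estimated_depth=0.8, occluder_depth=1.1)
```

An estimate already behind the occluder passes through unchanged, so the clamp only activates when the prior estimate is physically impossible.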
With reference now to
With continued reference to
With reference now to
With reference now to
With reference now to
With continued reference to
With still continued reference to
With reference now to
The computer program product 200 may include, for instance, one or more computer-readable media 202 (e.g., non-transitory memory) to store computer-readable program code means or logic 204 in order to provide and facilitate one or more functions and method steps described in the present disclosure. The program code contained or stored in/on a computer-readable medium 202 can be obtained and executed by a computer, such as the control system 100, to cause the computer to behave/function in a particular manner. The program code can be transmitted using any appropriate medium, including (but not limited to) wireless, wireline, optical fiber, and/or radio-frequency. The program code 204 includes instructions for carrying out operations to perform, achieve, or facilitate aspects of the disclosure and may be written in one or more programming languages. In some embodiments, the programming language(s) include object-oriented and/or procedural programming languages such as C, C++, C#, Java, and/or the like. The program code 204 may execute entirely on the control system 100.
In one example, the program code 204 includes one or more program instructions obtained for execution by one or more processors (e.g., the processor 104). The instructions contained on the program code 204 may be provided to the one or more processors of, for example, the control system 100, such that the program instructions, when executed by the one or more processors, perform, achieve, or facilitate aspects of the functionalities and methods described herein.
Step 312 may further include solving a quadratically constrained quadratic program to minimize a Euclidean distance between an optimized location in a 3D joint coordinate representation of at least one visible joint and an observed 3D location of the at least one visible joint. Step 312 may still further include satisfying a skeleton constraint by estimating the location of at least one joint that is obscured within a feasible space constrained by the 2D joint coordinate representation of the at least one joint that is obscured. Step 312 may further include estimating the location of the at least one joint that is obscured by determining a bone length between the at least one joint that is obscured and at least one joint that is visible. Step 312 may still further include estimating the location of the at least one joint that is obscured by constraining each visible joint within a ball around its observed location (e.g., the location calculated from the 2D skeleton and the depth using triangulation), where the ball diameter is determined by the size of the corresponding body portion or organ. Step 312 may further include estimating the location of the at least one joint that is obscured by reviewing the first image type or the second image type that was previously obtained in the sequence. Step 312 may still further include estimating the location of the at least one joint that is obscured by determining if the at least one joint that is obscured is proximate to a generated 3D cone that bounds a receptive field of a corresponding pixel in the first image type. Step 312 may include estimating the location of the at least one joint that is obscured by setting a minimum threshold depth of a depth extracted from a corresponding pixel in the first image type.
The disclosure herein is further summarized in the following paragraphs and is further characterized by combinations of any and all of the various aspects described therein.
According to one aspect of the present disclosure, a monitoring system for a vehicle includes at least one imaging device configured to capture a first image type and a second image type in a sequence. A first illumination source is configured to emit a flood illumination captured by the at least one imaging device in the first image type. A second illumination source is configured to emit a structured light illumination captured by the at least one imaging device in the second image type. At least one processor is configured to extract a 2-dimensional (“2D”) joint coordinate representation of a vehicle occupant from the first image type, measure a depth of at least a portion of the 2D joint coordinate representation with the second image type, extrapolate at least a partial 3-dimensional (“3D”) joint coordinate representation of the vehicle occupant, determine that at least one joint in the 2D joint coordinate representation or the 3D joint coordinate representation is obscured, and estimate, based on the 3D joint coordinate representation, a location of the at least one joint that is obscured.
According to another aspect, at least one processor is further configured to estimate the location of the at least one joint that is obscured by solving a quadratically constrained quadratic program to minimize the Euclidean distance between an optimized location in a 3D space of at least one visible joint and its observed 3D location, such that skeleton constraints are satisfied and the implied location of at least one joint that is obscured falls inside a feasible space constrained by the 2D location of the at least one joint that is obscured and the other skeleton constraints.
According to yet another aspect, at least one processor is further configured to estimate the location of the at least one joint that is obscured by determining a bone length between the at least one joint that is obscured and at least one joint that is visible.
According to still yet another aspect, at least one processor is further configured to estimate the location of the at least one joint that is obscured by constraining each visible joint within a ball around its observed location (e.g., the location calculated from the 2D skeleton and the depth using triangulation), where the ball diameter is determined by the size of the corresponding body portion or organ.
According to another aspect, at least one processor is further configured to estimate the location of the at least one joint that is obscured by reviewing the first image type or the second image type that was previously obtained in the sequence.
According to yet another aspect, at least one processor is further configured to estimate the location of the at least one joint that is obscured by determining if the at least one joint that is obscured is proximate a generated 3D cone that bounds a receptive field of a corresponding pixel in the first image type.
According to still yet another aspect, at least one processor is further configured to estimate the location of the at least one joint that is obscured by setting a minimum threshold depth of a depth extracted from a corresponding pixel in the first image type.
According to another aspect, a flood illumination and a structured light illumination are substantially within an infrared spectrum.
According to yet another aspect, a depth measurement in a second image type is obtained by at least one of a time-of-flight configuration, a stereo vision configuration, or a structured light configuration.
According to another aspect of the present disclosure, a monitoring system for a vehicle includes at least one imaging device configured to capture a first image type and a second image type in a sequence. At least one processor is configured to extract a 2-dimensional (“2D”) joint coordinate representation of a vehicle occupant from the first image type, measure a depth of at least a portion of the 2D joint coordinate representation with the second image type, extrapolate at least a partial 3-dimensional (“3D”) joint coordinate representation of the vehicle occupant, determine that at least one joint in the 2D joint coordinate representation or the 3D joint coordinate representation is obscured, and estimate, based on the 3D joint coordinate representation, a location of the at least one joint that is obscured.
According to another aspect, at least one processor is configured to solve a quadratically constrained quadratic program to minimize a Euclidean distance between an optimized location in a 3D joint coordinate representation of at least one visible joint and an observed 3D location of the at least one visible joint. The processor is further configured to satisfy a skeleton constraint by estimating the location of at least one joint that is obscured within a feasible space constrained by the 2D joint coordinate representation of the at least one joint that is obscured.
According to yet another aspect, at least one processor is configured to estimate a location of at least one joint that is obscured by constraining each visible joint within a ball around its observed location (e.g., the location calculated from the 2D skeleton and the depth using triangulation), where the ball diameter is determined by the size of the corresponding body portion or organ.
According to still another aspect, at least one processor is configured to estimate a location of at least one joint that is obscured by determining if the at least one joint that is obscured is proximate a generated 3D cone that bounds the receptive field of a corresponding pixel in the first image type.
According to another aspect, a rearview mirror assembly includes a monitoring system for estimating a location of an obscured joint in a driver monitoring system based on feedback from at least one imaging device.
According to yet another aspect of the present disclosure, a computer program product includes a non-transitory computer-readable storage medium readable by one or more processing circuits and storing instructions for execution by one or more processors for performing a method of estimating a location of an obscured joint in a driver monitoring system based on feedback from at least one imaging device. The method includes extracting a 2-dimensional (“2D”) joint coordinate representation of a vehicle occupant from a first image type, measuring a depth of at least a portion of the 2D joint coordinate representation with a second image type, extrapolating at least a partial 3-dimensional (“3D”) joint coordinate representation of the vehicle occupant, determining that at least one joint in the 2D joint coordinate representation or the 3D joint coordinate representation is obscured, and estimating, based on the 3D joint coordinate representation, a location of the at least one joint that is obscured.
According to another aspect, the computer program product includes instructions to estimate a location of at least one joint that is obscured by solving a quadratically constrained quadratic program to minimize a Euclidean distance between an optimized location in a 3D joint coordinate representation of at least one visible joint and an observed 3D location of the at least one visible joint.
According to yet another aspect, the computer program product includes instructions to estimate a location of at least one joint that is obscured by satisfying a skeleton constraint by estimating the location of at least one joint that is obscured within a feasible space constrained by the 2D joint coordinate representation of the at least one joint that is obscured.
According to still another aspect, the computer program product includes instructions to estimate a location of at least one joint that is obscured by constraining each visible joint within a ball around its observed location (e.g., the location calculated from the 2D skeleton and the depth using triangulation), where the ball diameter is determined by the size of the corresponding body portion or organ.
It will be understood by one having ordinary skill in the art that construction of the described disclosure and other components is not limited to any specific material. Other exemplary embodiments of the disclosure disclosed herein may be formed from a wide variety of materials, unless described otherwise herein.
For purposes of this disclosure, the term “coupled” (in all of its forms, couple, coupling, coupled, etc.) generally means the joining of two components (electrical or mechanical) directly or indirectly to one another. Such joining may be stationary in nature or movable in nature. Such joining may be achieved with the two components (electrical or mechanical) and any additional intermediate members being integrally formed as a single unitary body with one another or with the two components. Such joining may be permanent in nature or may be removable or releasable in nature unless otherwise stated.
As used herein, the term “about” means that amounts, sizes, formulations, parameters, and other quantities and characteristics are not and need not be exact, but may be approximate and/or larger or smaller, as desired, reflecting tolerances, conversion factors, rounding off, measurement error and the like, and other factors known to those of skill in the art. When the term “about” is used in describing a value or an end-point of a range, the disclosure should be understood to include the specific value or end-point referred to. Whether or not a numerical value or end-point of a range in the specification recites “about,” the numerical value or end-point of a range is intended to include two embodiments: one modified by “about,” and one not modified by “about.” It will be further understood that the end-points of each of the ranges are significant both in relation to the other end-point, and independently of the other end-point.
The terms “substantial,” “substantially,” and variations thereof as used herein are intended to note that a described feature is equal or approximately equal to a value or description. For example, a “substantially planar” surface is intended to denote a surface that is planar or approximately planar. Moreover, “substantially” is intended to denote that two values are equal or approximately equal. In some embodiments, “substantially” may denote values within about 10% of each other, such as within about 5% of each other, or within about 2% of each other.
It is also important to note that the construction and arrangement of the elements of the disclosure, as shown in the exemplary embodiments, is illustrative only. Although only a few embodiments of the present innovations have been described in detail in this disclosure, those skilled in the art who review this disclosure will readily appreciate that many modifications are possible (e.g., variations in sizes, dimensions, structures, shapes and proportions of the various elements, values of parameters, mounting arrangements, use of materials, colors, orientations, etc.) without materially departing from the novel teachings and advantages of the subject matter recited. For example, elements shown as integrally formed may be constructed of multiple parts, or elements shown as multiple parts may be integrally formed, the operation of the interfaces may be reversed or otherwise varied, the length or width of the structures and/or members or connectors or other elements of the system may be varied, and the nature or number of adjustment positions provided between the elements may be varied. It should be noted that the elements and/or assemblies of the system may be constructed from any of a wide variety of materials that provide sufficient strength or durability, in any of a wide variety of colors, textures, and combinations. Accordingly, all such modifications are intended to be included within the scope of the present innovations. Other substitutions, modifications, changes, and omissions may be made in the design, operating conditions, and arrangement of the desired and other exemplary embodiments without departing from the spirit of the present innovations.
It will be understood that any described processes or steps within described processes may be combined with other disclosed processes or steps to form structures within the scope of the present disclosure. The exemplary structures and processes disclosed herein are for illustrative purposes and are not to be construed as limiting.
It is also to be understood that variations and modifications can be made on the aforementioned structures and methods without departing from the concepts of the present disclosure, and further it is to be understood that such concepts are intended to be covered by the following claims unless these claims by their language expressly state otherwise.
This application claims priority to and the benefit under 35 U.S.C. § 119(e) of U.S. Provisional Application No. 63/526,777, filed on Jul. 14, 2023, entitled “RECONSTRUCTION 3D HUMAN POSE USING CONSTRAINED OPTIMIZATION,” the disclosure of which is hereby incorporated herein by reference in its entirety.
Number | Date | Country
---|---|---
63526777 | Jul 2023 | US