Anthropometric measurements provide indicators of child health and wellbeing. Today, there is a well-developed protocol to measure the proportionate size of the infant body, but it is slow, requires bulky and costly equipment, is subject to accuracy and precision errors, and requires initial and ongoing training of field staff. Similarly, anthropometric measurements can be used in livestock farming to ensure adequate nutrition and growth of the animals.
Anthropometric data are used for many purposes. For example, nationally-representative surveys include anthropometric indicators such as stunting, wasting and overweight, and this information is used to track progress over time and to inform policy and program development both nationally and globally. Anthropometric indices are also used to evaluate the impact of interventions to improve child health and nutrition and to allow comparisons of cost-effectiveness among interventions. Finally, anthropometric measurements have important clinical applications in evaluating patients with severe and chronic malnutrition and in monitoring child neurodevelopment. Poor measurement compromises all of these uses. Even extremely well-trained anthropometrists demonstrate a Technical Error of Measurement (TEM) that can overwhelm subtle effects of an intervention. Field-trained personnel can be expected to have even higher TEM.
Therefore, what are needed are systems and methods that overcome challenges in the art, some of which are described above. The systems and methods described herein provide an automated alternative to manual anthropometric measurement: a fully automatic, objective measure of subject surface geometry, with automatic extraction and storage of the anthropometric measures of interest.
Described and disclosed herein are embodiments of a robust, low-cost, easy-to-use, objective, automated system to extract anthropometric information from infants between the ages of 0-24 months, as well as children 25-60 months, using three-dimensional (3-D) imaging technology. The automated measurements have been validated against the current gold standard: physical measurements of infant head and arm circumference and body length/height. An advantage of the disclosed embodiments is that measurements can be obtained from test subjects who, unlike older children and adults, are not capable of standing still for a measurement.
Disclosed and described herein are embodiments of a system, method and computer-program product for determining anthropometric measurements of a non-stationary subject. One embodiment comprises scanning a non-stationary subject using a three-dimensional (3-D) scanner to create a plurality (N) of point clouds of data corresponding to the subject. Once the plurality (N) of point clouds have been captured, a processor executing computer-executable instructions estimates, for each of the N point clouds, a rough size and a rough pose of the subject, and changes the estimated rough size and estimated rough pose to best match each point cloud to the surface of a skinning weight articulated model, thereby creating N skinning weight articulated models.
The N skinning weight articulated models are optimized by the processor executing computer-executable instructions to find one set of size parameters and N sets of fitted pose parameters that minimize the distance between the nth point cloud data set and the nth articulated model vertices for all N point clouds. The processor executing computer-executable instructions moves each of the N skinning weight articulated models to a neutral position from its fitted position, wherein the fitted position is based on the skinning weight articulated model's fitted pose parameters. The processor executing computer-executable instructions determines a transformation based on knowing the fitted and neutral position of each of the N skinning weight articulated models, applies the transformation to each of the plurality (N) of point clouds to produce a single merged point cloud in the neutral pose space, matches the merged point cloud in the neutral pose space to a final skinning weight articulated model in the neutral pose, and obtains anthropometric measurements from the final skinning weight articulated model in the neutral pose.
In one aspect, scanning the subject using the three-dimensional (3-D) scanner to create a plurality (N) of point clouds of data corresponding to the subject comprises scanning the subject using a 3-D hand scanner. The scanner may be used to capture bursts of data. For example, each burst of data may range from 0.10 to 0.50 seconds of scan data so that there is little to no movement of the subject while the burst is captured. Three to ten bursts of data may be captured during the scan of the subject, and each burst comprises one of the plurality (N) of point clouds of data corresponding to the subject. Each of the plurality (N) of point clouds of data corresponding to the subject comprises a pose of the subject.
In some aspects, each burst of data may be captured over a period of 0.3 to 1.5 seconds.
Each burst of data is evaluated to determine whether it is accepted or rejected. In one aspect, after rejecting one or more bursts of data, the scan of the subject is rejected if three or fewer bursts of data are accepted.
In one aspect, estimating the rough size of the subject using the point cloud of data comprises estimating the rough size of the subject based on an age of the subject. In some aspects, this may be performed using a lookup table. For example, the lookup table may be a table compiled by the World Health Organization (WHO).
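By way of illustration only, a minimal Python sketch of such an age-based lookup follows. The table values are rough placeholder approximations of WHO length/height-for-age medians, not the actual WHO tables, and the function name is hypothetical.

```python
import bisect

# Illustrative placeholder values only (cm), loosely approximating WHO
# length/height-for-age medians; a real system would load the full WHO tables.
MEDIAN_LENGTH_CM = {0: 50, 6: 67, 12: 76, 24: 87, 36: 96, 48: 103, 60: 110}

def rough_height_cm(age_months):
    """Linearly interpolate a rough subject height between tabulated ages."""
    ages = sorted(MEDIAN_LENGTH_CM)
    if age_months <= ages[0]:
        return float(MEDIAN_LENGTH_CM[ages[0]])
    if age_months >= ages[-1]:
        return float(MEDIAN_LENGTH_CM[ages[-1]])
    i = bisect.bisect_right(ages, age_months)
    a0, a1 = ages[i - 1], ages[i]
    h0, h1 = MEDIAN_LENGTH_CM[a0], MEDIAN_LENGTH_CM[a1]
    return h0 + (h1 - h0) * (age_months - a0) / (a1 - a0)
```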
In some aspects, estimating the rough pose of the subject using the point cloud of data comprises a search through a generated database of possible poses. The search through the generated database of possible poses may be performed using a sub-space search technique that uses principal component analysis.
In some aspects, changing the estimated rough size and the estimated rough pose of the point cloud of data of the subject to best match the surface of the skinning weight articulated model comprises using an adaptation of an iterated closest point algorithm for articulated models. The skinning weight articulated model may comprise a computer-generated hierarchical set of bones and joints forming a skeleton created by an animator, with a computer-generated skin surface attached to the skeleton by a weighting technique.
In some aspects, optimizing the N skinning weight articulated models to find one set of size parameters and N sets of fitted pose parameters comprises using a modified iterated closest point algorithm: determining the one set of size parameters by adjusting a size parameter of each of the skinning weight articulated models to match all of the skinning weight articulated models to their corresponding point cloud data, and determining the N sets of fitted pose parameters by adjusting a pose parameter for each of the skinning weight articulated models to match the skinning weight articulated model to its corresponding point cloud.
Generally, obtaining anthropometric measurements from the final skinning weight articulated model in the neutral pose comprises measuring a distance along defined arcs on the final skinning weight articulated model.
It is to be noted that the disclosed systems, methods and computer program product may be used to obtain anthropometric measurements of humans as well as non-humans such as swine and bovine, among other animals.
In some instances, cloud computing and storage infrastructure is used to perform some or all of the described processing and/or data storage and retrieval.
It should be understood that the above-described subject matter may also be implemented as a computer-controlled apparatus, a computing system, or an article of manufacture, such as a computer-readable storage medium.
Other systems, methods, features and/or advantages will be or may become apparent to one with skill in the art upon examination of the following drawings and detailed description. It is intended that all such additional systems, methods, features and/or advantages be included within this description and be protected by the accompanying claims.
The components in the drawings are not necessarily to scale relative to each other. Like reference numerals designate corresponding parts throughout the several views.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art. Methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present disclosure.
As used in the specification and the appended claims, the singular forms “a,” “an” and “the” include plural referents unless the context clearly dictates otherwise. Ranges may be expressed herein as from “about” one particular value, and/or to “about” another particular value. When such a range is expressed, another embodiment includes from the one particular value and/or to the other particular value. Similarly, when values are expressed as approximations, by use of the antecedent “about,” it will be understood that the particular value forms another embodiment. It will be further understood that the endpoints of each of the ranges are significant both in relation to the other endpoint, and independently of the other endpoint.
“Optional” or “optionally” means that the subsequently described event or circumstance may or may not occur, and that the description includes instances where said event or circumstance occurs and instances where it does not.
Throughout the description and claims of this specification, the word “comprise” and variations of the word, such as “comprising” and “comprises,” means “including but not limited to,” and is not intended to exclude, for example, other additives, components, integers or steps. “Exemplary” means “an example of” and is not intended to convey an indication of a preferred or ideal embodiment. “Such as” is not used in a restrictive sense, but for explanatory purposes.
Disclosed are components that can be used to perform the disclosed methods and systems. These and other components are disclosed herein, and it is understood that when combinations, subsets, interactions, groups, etc. of these components are disclosed that while specific reference of each various individual and collective combinations and permutation of these may not be explicitly disclosed, each is specifically contemplated and described herein, for all methods and systems. This applies to all aspects of this application including, but not limited to, steps in disclosed methods. Thus, if there are a variety of additional steps that can be performed it is understood that each of these additional steps can be performed with any specific embodiment or combination of embodiments of the disclosed methods.
The present methods and systems may be understood more readily by reference to the following detailed description of preferred embodiments and the Examples included therein and to the Figures and their previous and following description.
For younger test subjects that cannot in general respond appropriately to requests to stand still, the acquisition software executing on the acquisition device 102 is designed to capture multiple short bursts of point cloud data. Each burst of point cloud data is incorporated into the anthropometric estimation. Generally, such short scans that create the bursts of data range from 0.10 to 0.25 seconds of data acquisition at approximately 30 frames of data per second, with the frames of each burst combined into a single point cloud. The amount of acquisition time is determined by the operator 112 based on the ability of the subject 110 to stand still, with older children being acquired at the 0.25-second bursts and younger and uncooperative children at 0.10-second bursts. The trade-off is between a better point cloud (smoother and with a more complete surface) and the artifacts induced by subject movement. Capturing bursts of point cloud data accommodates capturing data from non-stationary subjects.
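A minimal sketch of how a burst might be accumulated into a single point cloud follows. Here grab_frame is a hypothetical stand-in for the scanner's frame-acquisition call; the frame rate and burst lengths follow the description above.

```python
import numpy as np

FRAME_RATE = 30.0  # approximate frames of data per second

def capture_burst(grab_frame, burst_seconds=0.25):
    """Concatenate the frames captured during one short burst
    (0.10-0.25 s) into a single point cloud.

    grab_frame: hypothetical callable returning one depth frame
    as an (M, 3) array of 3-D points."""
    n_frames = max(1, round(burst_seconds * FRAME_RATE))
    return np.vstack([grab_frame() for _ in range(n_frames)])
```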
For each of the front and back poses of the subject 110, the acquisition software executing on the acquisition device 102 captures anywhere from three to ten bursts of data. Each burst of data is an imperfect point cloud representing one aspect of the subject's surface geometry. The result of a complete scan is six to twenty 3-D point clouds of the subject 110 from the front and the back.
The acquisition software executing on the acquisition device 102 automatically uploads all of the 3-D point cloud data for the subject into a database. In one aspect, this database may reside in the cloud 106. In addition to the point cloud data, the subject's 110 name, age, weight, and other demographic data elements of interest can be uploaded and stored automatically. In addition, in some embodiments the manual anthropometry of the subject is acquired and recorded onto the device. This manual anthropometric data may then also be automatically uploaded into the database.
The subject 110, while generally not moving much during a capture burst, will likely have moved during the course of a scan sequence. Generally this is referred to as the subject 110 being in different poses. The anthropometric estimation software accommodates these noisy multiple point clouds of a subject in various poses; while the pose of the subject is different at each burst, the size of the subject is the same.
Overall the anthropometric estimation process proceeds by fitting a generic articulated model of a human to the point cloud data of the subject 110 using the anthropometric estimation software. The anthropometric metrics of interest can then be directly extracted from the fitted model. The anthropometric estimation software is designed to estimate the articulated model of a human being that best fits the multiple point clouds given a single size model at multiple poses.
The output location of each vertex is given by

$$v_{out} = \sum_{i=1}^{n} \left( (v \cdot BSM) \cdot IBM_i \cdot JM_i \right) \cdot JW_i$$

where $v_{out}$ is the output vertex location in the world coordinate system; $BSM$ is the bind-shape matrix; $IBM_i$ is the inverse bind-pose matrix of joint $i$; $JM_i$ is the transformation matrix of joint $i$; $JW_i$ is the weight of the influence of joint $i$ on vertex $v$; $n$ is the number of bones to which this vertex is weighted; and $v$ is the location of the vertex in the world system at model creation.
$v$, $BSM$, $IBM_i$, and $JW_i$ are constants for a given skeletal animation. In practice, as the model is moved in the animation or game, the joint transformation matrices are updated at each time step. This may be a parameter set of anywhere from 2 to 100 (or more, for very detailed facial animation). After the joints are updated, the output location of all vertices is calculated, which may be a surface mesh comprising hundreds, thousands, or even tens of thousands of vertices.
While not useful in traditional animation or gaming settings, it is also possible to include a scaling matrix within the transformation matrix of the joints, in addition to rotations. That scaling matrix is included in the present implementation, in order to facilitate differential fitting of the articulated model to the set of 3D point clouds.
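As a concrete illustration, the following sketch evaluates the skinning equation using 4×4 homogeneous matrices in a column-vector convention (the equation above uses a row-vector convention) and shows how a uniform per-joint scale can be folded into the joint transformation matrix. It is a sketch of the general technique under those assumptions, not the exact implementation.

```python
import numpy as np

def skin_vertex(v, BSM, IBM, JM, JW):
    """v: (3,) vertex position at model creation; BSM: (4, 4) bind-shape
    matrix; IBM, JM: lists of n (4, 4) inverse bind-pose and joint
    transformation matrices; JW: (n,) joint weights for this vertex."""
    vh = np.append(v, 1.0)                 # homogeneous coordinates
    v_bind = BSM @ vh                      # apply bind-shape matrix
    out = np.zeros(4)
    for ibm, jm, w in zip(IBM, JM, JW):
        out += w * (jm @ (ibm @ v_bind))   # weighted influence of joint i
    return out[:3]

def scaled_joint_matrix(rotation, translation, scale):
    """Compose a joint transform carrying a uniform scale, enabling the
    differential size fitting described above."""
    m = np.eye(4)
    m[:3, :3] = scale * rotation           # (3, 3) rotation times scalar scale
    m[:3, 3] = translation
    return m
```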
A flowchart illustrating an exemplary overall model fitting process is shown in the accompanying drawings.
The data process starts by operating on a single 3-D point cloud at a time. The first step 404 is foreground-background segmentation. The 3-D scan in general captures data on the test subject and potentially other items in the room: other people in the background, items such as chairs or walls, and so on. The anthropometric estimation software uses as a clue the location of the point cloud in the center of the image, which is assumed to be the subject. The depth of the subject relative to the imaging device is determined, and any 3-D point farther away than some constant value is discarded. In addition, the central point of the remaining point cloud is taken as a seed, and any points not connected to the central point are discarded. In practice this produces a single point cloud representing the test subject.
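A minimal sketch of this segmentation follows, under the assumption that the point cloud is an (N, 3) array with depth along z (meters) and the image center at x = y = 0. The threshold values are illustrative, not taken from the disclosure.

```python
import numpy as np
from scipy.spatial import cKDTree

def segment_subject(points, depth_margin=0.5, link_radius=0.02):
    """Keep points near the central point in depth, then keep only the
    connected component containing the central seed point."""
    # Take the point nearest the image center as the subject seed.
    seed_idx = int(np.argmin(np.linalg.norm(points[:, :2], axis=1)))
    subject_depth = points[seed_idx, 2]

    # Discard any point farther away than the subject plus a margin.
    keep = points[points[:, 2] <= subject_depth + depth_margin]

    # Flood-fill connectivity from the seed using a k-d tree.
    tree = cKDTree(keep)
    seed = int(np.argmin(np.linalg.norm(keep - points[seed_idx], axis=1)))
    connected, frontier = {seed}, [seed]
    while frontier:
        for nb in tree.query_ball_point(keep[frontier.pop()], r=link_radius):
            if nb not in connected:
                connected.add(nb)
                frontier.append(nb)
    return keep[sorted(connected)]
```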
The next step 406 is to initialize the model to this point cloud. The size is initialized by knowing the age of the subject and using a look-up table provided by the WHO to estimate the height of the subject. The pose is estimated through a search through a generated database of possible poses, using the well-known principal component analysis sub-space search technique. The pose initialization process involves a number of steps. In a first step, prior to working on any new data, two databases of images (front and back) are created by projecting the articulated model in various poses, encompassing all possible poses of the subject. As the process for the front and back is identical, only one will be described in detail. In creating the database of model poses, the main joint articulations (right and left shoulder, right and left elbow, right and left hip, and right and left knee) are modified over their plausible range in the sagittal plane in increments of 5 or 10 degrees. This produces a universe of models in a wide range of expected or possible poses. As noted above, this is repeated for the front and the back, thus creating two databases of images.
These two databases are combined into two data matrices encompassing all of these images. The base articulated model of the subject is projected onto an imaging plane such that the entire model is contained within a 101×101 pixel image, with the depth of the model coded in the intensity of the image. The power in each image is normalized to one, and each image is then vectorized, producing a single data vector of 10,201 elements. This process is repeated for each of the (perhaps) 5,000 pose images, producing a 10,201×5,000 image data matrix. The average image vector is calculated and subtracted from all of the image vectors. This results in the complete image reference database. In summary, when creating the image reference database, which is built one time before being used in the initialization portion of the algorithm, the base articulated model of a subject is run through the various poses described above, producing approximately 5,000 versions of the model in different poses, and each of those models is then projected onto two image planes (front and back) to produce the pose image reference database.
The two data matrices are then decomposed into a principal component sub-space using the singular value decomposition algorithm, creating a sub-space that adequately represents the images (and corresponding poses). The first K eigenvectors (in this case, K=50) are chosen to represent the data matrix. These K eigenvectors are then multiplied against each image vector to produce a point in K-dimensional space that represents that image, with 5,000 of these K-dimensional points being stored for comparison.
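The following sketch illustrates building one such reference sub-space (the front and back databases are built identically). Here render_pose is a hypothetical stand-in for projecting the articulated model in a given pose onto the 101×101 depth image described above.

```python
import numpy as np

def build_pose_subspace(poses, render_pose, K=50):
    """poses: list of pose parameter sets (e.g., ~5,000 sampled poses);
    render_pose: hypothetical renderer returning a (101, 101) depth image."""
    vectors = []
    for pose in poses:
        vec = render_pose(pose).ravel().astype(float)  # 10,201 elements
        vectors.append(vec / np.linalg.norm(vec))      # power-normalize to one
    X = np.stack(vectors, axis=1)            # 10,201 x n_poses data matrix
    mean = X.mean(axis=1, keepdims=True)     # average image vector
    U, _, _ = np.linalg.svd(X - mean, full_matrices=False)
    basis = U[:, :K]                         # first K eigenvectors
    coords = basis.T @ (X - mean)            # one K-dim point per stored pose
    return basis, mean, coords
```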
Once a new data set is to be initialized, it is decomposed into its principal component sub-space representation. The point cloud under consideration, after segmentation, is projected onto the same size imaging plane as above and is again power-normalized.
The new sub-space representation is compared to the existing sub-space database of images/poses, and the closest fit is taken as the initial estimate of the subject pose.
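A companion sketch of this lookup follows, reusing the basis, mean image vector and stored pose coordinates from the construction sketch above; cloud_image is the segmented point cloud rendered to the same 101×101 plane.

```python
import numpy as np

def closest_pose(cloud_image, basis, mean, coords, poses):
    """Return the stored pose whose K-dimensional sub-space point is
    closest to the projection of the new data."""
    vec = cloud_image.ravel().astype(float)
    vec /= np.linalg.norm(vec)                     # power-normalize
    q = basis.T @ (vec[:, None] - mean)            # (K, 1) query point
    dists = np.linalg.norm(coords - q, axis=0)     # distance to each stored pose
    return poses[int(np.argmin(dists))]            # closest fit = initial estimate
```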
Because in the disclosed embodiments the source object is not rigid, but is rather an articulated model with many degrees of freedom, a modified ICP algorithm is used. The modified ICP algorithm proceeds as follows: (1) for each point in the 3-D point cloud, find the closest point on the articulated model; (2) using optimization techniques on a limited set of parameters, find the parameter values that best align the model points with their corresponding point cloud points; (3) apply the calculated parameters to the joints and calculate new vertex locations; and (4) iterate over successive parameter sets until the alignment stops improving.
Step (1) of the modified ICP algorithm uses the k-nearest neighbor search technique to find the nearest model points to each cloud point. For step (2), a nonlinear programming multivariable derivative-free method is used that minimizes the sum of the distances between the model points and the corresponding point cloud points. Step (3) of the modified ICP algorithm uses the skinned-weight mesh algorithm previously described. For step (4) of the modified ICP algorithm, the first set of parameters is the overall model position and orientation, followed by overall model size, upper arm orientation, upper leg orientation, lower arm orientation, lower leg orientation, torso size, arm size, and leg size.
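The following sketch illustrates one way such a modified ICP loop could be structured. The model object and its get_params/set_params/skinned_vertices methods are hypothetical stand-ins for an articulated skinned-weight model; the parameter ordering follows step (4) above, and Nelder-Mead stands in for the derivative-free optimizer.

```python
from scipy.spatial import cKDTree
from scipy.optimize import minimize

# Parameter groups optimized in sequence, per step (4) above.
PARAM_GROUPS = ["position_orientation", "overall_size",
                "upper_arm", "upper_leg", "lower_arm", "lower_leg",
                "torso_size", "arm_size", "leg_size"]

def fit_model_to_cloud(model, cloud, param_groups=PARAM_GROUPS, n_rounds=5):
    """model: hypothetical articulated skinned-weight model object;
    cloud: (N, 3) segmented point cloud."""
    for _ in range(n_rounds):
        for group in param_groups:          # one small parameter set at a time
            x0 = model.get_params(group)

            def cost(x, group=group):
                model.set_params(group, x)
                verts = model.skinned_vertices()    # step (3): re-skin the mesh
                d, _ = cKDTree(verts).query(cloud)  # step (1): nearest model point
                return d.sum()                      # step (2): total distance

            # Derivative-free minimization of the summed distances.
            res = minimize(cost, x0, method="Nelder-Mead")
            model.set_params(group, res.x)
    return model
```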
On termination of the modified ICP algorithm, at 410 the fit between the model and the point cloud is evaluated. If the standard deviation of the distances between the model points and the point cloud points is too high (expressed as some multiple of the mean distance), it is an indication that some part of the model did not fit well, and this individual point cloud is rejected. Otherwise, the point cloud is stored 412.
The above procedure is repeated for all of the point clouds individually until there is a set of N point clouds and a corresponding set of N animator's models that minimize the distance between the nth point cloud data set and the nth articulated model vertices for all N point clouds. The mean size of these N models is calculated at 414, and all of the models are adjusted to match this mean size. The anthropometric estimation software will now work on all of the data sets and models as a group, optimizing to find one set of size parameters and N sets of pose parameters that match the animator's models to the 3-D point clouds. This is done by another extension of the ICP algorithm, as follows.
For each individual point cloud: (1) for each point in the articulated model, find the closest point in the corresponding 3-D point cloud; (2) using optimization techniques on a limited number of parameters, and holding all joint scalings fixed, find the joint rotations that best align each model point to its corresponding 3-D point cloud point; (3) apply the calculated parameters to the joints and calculate new vertex locations using the skinned-weight mesh model described previously; and (4) iterate until the alignment stops improving.
Once the models converge to the 3-D point clouds as well as they can, the fitting algorithm terminates, leaving multiple articulated models in various poses but of one size.
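Reusing the hypothetical model API and fit_model_to_cloud() from the sketch above, this group stage might be structured as follows: average the N fitted sizes into one shared size (step 414), hold it fixed, and refit only the pose (joint rotation) groups of each model against that model's own point cloud.

```python
import numpy as np

# Rotation-only groups; all joint scalings are held fixed, per step (2) above.
POSE_GROUPS = ["position_orientation", "upper_arm", "upper_leg",
               "lower_arm", "lower_leg"]

def fit_group(models, clouds):
    """models: N hypothetical articulated models individually fitted to
    clouds; clouds: the N corresponding (M, 3) point clouds."""
    sizes = [m.get_params("overall_size") for m in models]
    shared_size = np.mean(sizes, axis=0)       # one set of size parameters
    for model, cloud in zip(models, clouds):
        model.set_params("overall_size", shared_size)
        fit_model_to_cloud(model, cloud, param_groups=POSE_GROUPS)
    return shared_size                         # N models, various poses, one size
```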
All of the articulated models at 420 can be moved back to their initial, neutral pose—that is, all of the articulated models can be automatically moved such that their joint rotations are brought back to zero. Knowing the fitted and neutral pose for each of these models, a transformation can be calculated from a model point on the posed model to the same point on the neutral pose model. Knowing that transformation, and knowing which 3D cloud points correspond to which posed model points, allows the appropriate transformation to be applied to each of the 3-D point cloud points. This in turn produces a set of point clouds that are all in a single coordinate system.
Performing this set of transformations on all of the individual point clouds produces a single merged point cloud 422 in the neutral pose space. This merged point cloud is then matched to a final skinning weight articulated model in the neutral pose.
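A minimal sketch of this merge follows. It simplifies the per-point transformation to the displacement of the nearest fitted-model vertex between its fitted and neutral positions, which conveys the idea without the full local rotation handling a complete implementation would carry.

```python
import numpy as np
from scipy.spatial import cKDTree

def merge_to_neutral(clouds, fitted_vertex_sets, neutral_vertices):
    """clouds: N point clouds; fitted_vertex_sets[n]: (V, 3) vertices of
    the nth fitted model; neutral_vertices: (V, 3) vertices of the model
    in the neutral (zero joint rotation) pose, same vertex order."""
    merged = []
    for cloud, fitted in zip(clouds, fitted_vertex_sets):
        _, idx = cKDTree(fitted).query(cloud)  # corresponding model vertex
        offset = neutral_vertices[idx] - fitted[idx]
        merged.append(cloud + offset)          # carry point into neutral space
    return np.vstack(merged)                   # single merged point cloud
```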
At this point 426, any anthropometric measure of interest can be extracted by measuring distance along defined arcs on the model.
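A minimal sketch of such an arc measurement follows, assuming each measurement is defined once, on the base model, as a closed loop of vertex indices (for example, a loop around the head for head circumference).

```python
import numpy as np

def arc_length(vertices, loop_indices, closed=True):
    """Sum segment lengths along a predefined loop of model vertices.
    vertices: (V, 3) final fitted model vertices; loop_indices: ordered
    vertex indices tracing the arc."""
    pts = vertices[np.asarray(loop_indices)]
    total = np.linalg.norm(np.diff(pts, axis=0), axis=1).sum()
    if closed:
        total += np.linalg.norm(pts[0] - pts[-1])  # close the circumference
    return total
```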
The fitted model and the anthropometric measures of interest are stored in the cloud database, and the metrics of interest are transmitted back to the data acquisition device via a network connection.
When the logical operations described herein are implemented in software, the process may execute on any type of computing architecture or platform. For example, the logical operations may be implemented on a computing device 1000 that includes at least one processing unit 1006 and a system memory 1004 connected by a bus.
Computing device 1000 may have additional features/functionality. For example, computing device 1000 may include additional storage such as removable storage 1008 and non-removable storage 1010 including, but not limited to, magnetic or optical disks or tapes. Computing device 1000 may also contain network connection(s) 1016 that allow the device to communicate with other devices. Computing device 1000 may also have input device(s) 1014 such as a keyboard, mouse, touch screen, etc. Output device(s) 1012 such as a display, speakers, printer, etc. may also be included. The additional devices may be connected to the bus in order to facilitate communication of data among the components of the computing device 1000. All these devices are well known in the art and need not be discussed at length here.
The processing unit 1006 may be configured to execute program code encoded in tangible, computer-readable media. Computer-readable media refers to any media capable of providing data that causes the computing device 1000 (i.e., a machine) to operate in a particular fashion. Various computer-readable media may be utilized to provide instructions to the processing unit 1006 for execution. Common forms of computer-readable media include, for example, magnetic media, optical media, physical media, memory chips or cartridges, a carrier wave, or any other medium from which a computer can read. Example computer-readable media may include, but are not limited to, volatile media, non-volatile media and transmission media. Volatile and non-volatile media may be implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data, and common forms are discussed in detail below. Transmission media may include coaxial cables, copper wires and/or fiber optic cables, as well as acoustic or light waves, such as those generated during radio-wave and infra-red data communication. Example tangible, computer-readable recording media include, but are not limited to, an integrated circuit (e.g., field-programmable gate array or application-specific IC), a hard disk, an optical disk, a magneto-optical disk, a floppy disk, a magnetic tape, a holographic storage medium, a solid-state device, RAM, ROM, electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices.
In an example implementation, the processing unit 1006 may execute program code stored in the system memory 1004. For example, the bus may carry data to the system memory 1004, from which the processing unit 1006 receives and executes instructions. The data received by the system memory 1004 may optionally be stored on the removable storage 1008 or the non-removable storage 1010 before or after execution by the processing unit 1006.
Computing device 1000 typically includes a variety of computer-readable media. Computer-readable media can be any available media that can be accessed by device 1000 and includes both volatile and non-volatile media, removable and non-removable media. Computer storage media include volatile and non-volatile, and removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. System memory 1004, removable storage 1008, and non-removable storage 1010 are all examples of computer storage media. Computer storage media include, but are not limited to, RAM, ROM, electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computing device 1000. Any such computer storage media may be part of computing device 1000.
It should be understood that the various techniques described herein may be implemented in connection with hardware or software or, where appropriate, with a combination thereof. Thus, the methods and apparatuses of the presently disclosed subject matter, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other machine-readable storage medium wherein, when the program code is loaded into and executed by a machine, such as a computing device, the machine becomes an apparatus for practicing the presently disclosed subject matter. In the case of program code execution on programmable computers, the computing device generally includes a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. One or more programs may implement or utilize the processes described in connection with the presently disclosed subject matter, e.g., through the use of an application programming interface (API), reusable controls, or the like. Such programs may be implemented in a high level procedural or object-oriented programming language to communicate with a computer system. However, the program(s) can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language and it may be combined with hardware implementations.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.
This application claims priority to and benefit of U.S. Provisional Patent Application No. 62/478,772 filed Mar. 30, 2017, which is fully incorporated by reference and made a part hereof.