GAIT AND POSTURE ANALYSIS

Information

  • Patent Application
  • 20240057892
  • Publication Number
    20240057892
  • Date Filed
    December 29, 2021
  • Date Published
    February 22, 2024
Abstract
Systems and methods described herein provide techniques for analyzing gait and posture of a subject with respect to control data. The systems and methods, in some embodiments, process video data, identify keypoints representing body parts, determine metrics data at a stride level, and compare the metrics data to control data.
Description
FIELD OF THE INVENTION

The invention, in some aspects, relates to automated gait and posture analysis of subjects by processing video data.


BACKGROUND

In humans, the ability to quantitate gait and posture at high precision and sensitivity has shown that they can be used to determine proper function of numerous neural and muscular systems. Many psychiatric, neurodegenerative, and neuromuscular illnesses are associated with alterations in gait and posture, including autism spectrum disorder, schizophrenia, bipolar disorder, and Alzheimer's disease. This is because proper gait, balance, and posture are under the control of multiple nervous system processes, which include critical sensory centers that process visual, vestibular, auditory, proprioceptive, and visceral inputs. Regions of the brain that directly control movement, such as the cerebellum, motor cortex, and brain stem, respond to cognitive and emotionality cues. Thus, gait and posture integrity reflects proper neural functioning of many neural systems in humans. In rodent models of human psychiatric conditions, there has not been any demonstrated utility of gait and posture metrics as in humans. This may be due to the lack of readily implementable technology with sufficient accuracy to detect gait and posture differences between different mouse strains.


SUMMARY OF THE INVENTION

According to one aspect of the invention, a computer-implemented method is provided, the method including: receiving video data representing a video capturing movements of a subject; processing the video data to identify point data tracking movement, over a time period, of a set of body parts of the subject; determining, using the point data, a plurality of stance phases and a corresponding plurality of swing phases represented in the video data during the time period; determining, based on the plurality of stance phases and the plurality of swing phases, a plurality of stride intervals represented in the video data during the time period; determining, using the point data, metrics data for the subject, the metrics data being based on each stride interval of the plurality of stride intervals; comparing the metrics data for the subject to control metrics data; and determining, based on the comparing, a difference between the subject's metrics data and the control metrics data. In certain embodiments, the set of body parts includes the nose, base of neck, mid spine, left hind paw, right hind paw, base of tail, middle of tail and tip of tail; and wherein the plurality of stance phases and the plurality of swing phases are determined based on the change in movement speed of the left hind paw and the right hind paw. In certain embodiments, the method also includes determining a transition from a first stance phase of the plurality of stance phases to a first swing phase of the plurality of swing phases based on a toe-off event of the left hind paw or the right hind paw; and determining a transition from a second swing phase of the plurality of swing phases to a second stance phase of the plurality of stance phases based on a foot strike event of the left hind paw or the right hind paw. In some embodiments, the metrics data correspond to gait measurements of the subject during each stride interval.
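As a rough illustration of how stance phases, swing phases, and stride intervals might be derived from hind paw speed, the following sketch thresholds per-frame paw speed. The function names, the fixed `swing_threshold`, and the choice to delimit a stride by consecutive foot strikes of the same paw are assumptions made for illustration; the text above states only that the phases are determined from the change in movement speed of the hind paws.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def paw_speeds(points: Sequence[Point], fps: float) -> List[float]:
    """Per-frame speed of one paw keypoint, in distance units per second."""
    if not points:
        return []
    speeds = [0.0]  # frame 0 has no previous frame to difference against
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        speeds.append(((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5 * fps)
    return speeds


def gait_events(speeds: Sequence[float],
                swing_threshold: float) -> Tuple[List[int], List[int]]:
    """Return (toe_off_frames, foot_strike_frames) for one paw.

    ASSUMPTION: the paw is treated as being in swing whenever its speed
    exceeds swing_threshold, and in stance otherwise.
    """
    in_swing = [s > swing_threshold for s in speeds]
    toe_offs: List[int] = []
    foot_strikes: List[int] = []
    for i in range(1, len(in_swing)):
        if in_swing[i] and not in_swing[i - 1]:
            toe_offs.append(i)       # stance -> swing transition
        elif in_swing[i - 1] and not in_swing[i]:
            foot_strikes.append(i)   # swing -> stance transition
    return toe_offs, foot_strikes


def stride_intervals(foot_strikes: Sequence[int]) -> List[Tuple[int, int]]:
    """ASSUMPTION: a stride spans two consecutive foot strikes of one paw."""
    return list(zip(foot_strikes, foot_strikes[1:]))
```

In practice the threshold would likely need tuning to the frame rate and tracking noise, and the speed trace would probably be smoothed before thresholding.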
In some embodiments, the set of body parts includes a left hind paw and a right hind paw, and wherein determining the metrics data includes: determining, using the point data, a step length for each stride interval, the step length representing a distance that the right hind paw travels past a previous left hind paw strike; determining, using the point data, a stride length for each stride interval, the stride length representing a distance that the left hind paw travels during each stride interval; between the left forepaw and the left hind paw for each stride interval from a toe-off event to a foot-strike event; and determining, using the point data, a step width for each stride interval, the step width representing a distance between the left hind paw and the right hind paw. In some embodiments, the set of body parts includes a tail base, and wherein determining the metrics data includes determining, using the point data, speed data of the subject based on movement of the tail base for each stride interval. In certain embodiments, the set of body parts includes a tail base, and wherein determining the metrics data includes: determining, using the point data, a set of speed data of the subject based on movement of the tail base during a set of frames representing a stride interval of the plurality of stride intervals; and determining a stride speed, for the stride interval, by averaging the set of speed data.
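The spatial metrics described above can be sketched from three hind paw foot strike positions. This is a minimal sketch under stated assumptions: step length is taken as the projection of the right hind paw strike onto the line between two consecutive left hind paw strikes, and step width as the perpendicular distance of the right strike from that line. The function and parameter names are hypothetical, and the exact geometric construction is not specified in the text above.

```python
import math
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float]


def spatial_stride_metrics(left_strike_start: Point,
                           right_strike: Point,
                           left_strike_end: Point) -> Dict[str, float]:
    """Stride length, step length, and step width for one stride interval."""
    dx = left_strike_end[0] - left_strike_start[0]
    dy = left_strike_end[1] - left_strike_start[1]
    stride_length = math.hypot(dx, dy)  # distance covered by the left hind paw
    if stride_length == 0:
        raise ValueError("left hind paw did not move during the stride")
    ux, uy = dx / stride_length, dy / stride_length  # unit travel direction
    rx = right_strike[0] - left_strike_start[0]
    ry = right_strike[1] - left_strike_start[1]
    # ASSUMPTION: step length is the along-track distance of the right strike
    # past the previous left strike; step width is the across-track distance.
    step_length = rx * ux + ry * uy
    step_width = abs(rx * uy - ry * ux)
    return {"stride_length": stride_length,
            "step_length": step_length,
            "step_width": step_width}


def stride_speed(tail_base: Sequence[Point], fps: float) -> float:
    """Average tail base speed over the frames of one stride interval."""
    if len(tail_base) < 2:
        raise ValueError("need at least two frames to compute a speed")
    speeds = [math.hypot(x1 - x0, y1 - y0) * fps
              for (x0, y0), (x1, y1) in zip(tail_base, tail_base[1:])]
    return sum(speeds) / len(speeds)
```

The units follow the input coordinates, so pixel coordinates yield pixels and pixels per second unless a calibration factor is applied first.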
In some embodiments, the set of body parts includes a right hind paw and a left hind paw, and wherein determining the metrics data includes: determining, using the point data, first stance duration representing an amount of time that the right hind paw is in contact with ground during a stride interval of the plurality of stride intervals; determining a first duty factor based on the first stance duration and the duration of the stride interval; determining, using the point data, second stance duration representing an amount of time that the left hind paw is in contact with ground during the stride interval; determining a second duty factor based on the second stance duration and the duration of the stride interval; and determining an average duty factor for the stride interval based on the first duty factor and the second duty factor. In some embodiments, the set of body parts includes a tail base and a neck base, and wherein determining the metrics data includes: determining, using the point data, a set of vectors connecting the tail base and the neck base during a set of frames representing a stride interval of the plurality of stride intervals; and determining, using the set of vectors, an angular velocity of the subject for the stride interval. In certain embodiments, the metrics data correspond to posture measurements of the subject during each stride interval. In some embodiments, the set of body parts includes a spine center of the subject, wherein a stride interval of the plurality of stride intervals is associated with a set of frames of the video data, and wherein determining the metrics data includes determining, using the point data, a displacement vector for the stride interval, the displacement vector connecting the spine center represented in a first frame of the set of frames and the spine center represented in a last frame of the set of frames. 
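The duty factor and angular velocity computations described above might look like the following sketch. Counting stance duration in frames, and summarizing angular velocity as the mean frame-to-frame change in heading of the tail-base-to-neck-base vector, are illustrative assumptions; the function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def average_duty_factor(right_stance_frames: int,
                        left_stance_frames: int,
                        stride_frames: int) -> float:
    """Mean of the two hind paw duty factors (stance time / stride time)."""
    right = right_stance_frames / stride_frames
    left = left_stance_frames / stride_frames
    return (right + left) / 2.0


def angular_velocity(tail_bases: Sequence[Point],
                     neck_bases: Sequence[Point],
                     fps: float) -> float:
    """Mean angular velocity over one stride, in degrees per second.

    ASSUMPTION: heading is the angle of the vector from tail base to neck
    base, and the stride value is the mean per-frame heading change. The
    sign convention depends on whether the y axis points up or down.
    """
    angles = [math.atan2(ny - ty, nx - tx)
              for (tx, ty), (nx, ny) in zip(tail_bases, neck_bases)]
    if len(angles) < 2:
        raise ValueError("need at least two frames")
    deltas = []
    for a0, a1 in zip(angles, angles[1:]):
        # wrap each change into [-pi, pi) so crossing +/-180 degrees is safe
        deltas.append((a1 - a0 + math.pi) % (2.0 * math.pi) - math.pi)
    return math.degrees(sum(deltas) / len(deltas)) * fps
```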
In some embodiments, the set of body parts further includes a nose of the subject, and wherein determining the metrics data includes determining, using the point data, a set of lateral displacements of the nose for the stride interval based on a perpendicular distance of the nose from the displacement vector for each frame in the set of frames. In certain embodiments, the lateral displacement of the nose is further based on a body length of the subject. In some embodiments, determining the metrics data further includes determining a nose displacement phase offset by: performing an interpolation using the set of lateral displacements of the nose to generate a smooth curve lateral displacement of the nose for the stride interval; determining, using the smooth curve lateral displacement of the nose, when a maximum displacement of the nose occurs during the stride interval; and determining a percent stride location representing a percent of the stride interval that is completed when the maximum displacement of the nose occurs. In some embodiments, the set of body parts further includes a tail base of the subject, and wherein determining the metrics data includes: determining, using the point data, a set of lateral displacements of the tail base for the stride interval based on a perpendicular distance of the tail base from the displacement vector for each frame in the set of frames.
In some embodiments, determining the metrics data further includes determining a tail base displacement phase offset by: performing an interpolation using the set of lateral displacements of the tail base to generate a smooth curve lateral displacement of the tail base for the stride interval; determining, using the smooth curve lateral displacement of the tail base, when a maximum displacement of the tail base occurs during the stride interval; and determining a percent stride location representing a percent of the stride interval that is completed when the maximum displacement of the tail base occurs. In certain embodiments, the set of body parts also includes a tail tip of the subject, and wherein determining the metrics data includes: determining, using the point data, a set of lateral displacements of the tail tip for the stride interval based on a perpendicular distance of the tail tip from the displacement vector for each frame in the set of frames. In some embodiments, determining the metrics data also includes determining a tail tip displacement phase offset by: performing an interpolation using the set of lateral displacements of the tail tip to generate a smooth curve lateral displacement of the tail tip for the stride interval; determining, using the smooth curve lateral displacement of the tail tip, when a maximum displacement of the tail tip occurs during the stride interval; and determining a percent stride location representing a percent of the stride interval that is completed when the maximum displacement of the tail tip occurs. In some embodiments, processing the video data includes processing the video data using a machine-learning model. In certain embodiments, processing the video data includes processing the video data using a neural network model. In certain embodiments, the video captures subject-determined movements of the subject in an open arena with a top-down view. 
In some embodiments, the control metrics data is obtained from a control organism or plurality thereof. In some embodiments, the subject is an organism and the control organism and the subject organism are the same species. In certain embodiments, the species is a member of the Order Rodentia, and optionally is rat or mouse. In certain embodiments, the control organism is a laboratory strain of the species. In some embodiments, the laboratory strain is one listed in FIG. 14E. In some embodiments, a statistically significant difference in the subject's metrics data compared to the control metrics data indicates a difference in the phenotype of the subject compared to the phenotype of the control organism. In some embodiments, the phenotypic difference indicates the presence of a disease or condition in the subject. In certain embodiments, the phenotypic difference indicates a difference between the genetic background of the subject and the genetic background of the control organism. In some embodiments, a statistically significant difference in the subject's metrics data and the control metrics data indicates a difference in the genotype of the subject compared to the genotype of the control organism. In certain embodiments, the difference in the genotype indicates a strain difference between the subject and the control organism. In certain embodiments, the difference in the genotype indicates the presence of a disease or condition in the subject. In some embodiments, the disease or condition is Rett syndrome, Down syndrome, amyotrophic lateral sclerosis (ALS), autism spectrum disorder (ASD), schizophrenia, bipolar disorder, a neurodegenerative disorder, dementia, or a brain injury. In some embodiments, the control organism and the subject organism are the same gender. In certain embodiments, the control organism and the subject organism are not the same gender. 
In some embodiments, the control metrics data corresponds to elements including: control stride length, control step length and control step width, wherein the subject's metrics data includes elements including stride lengths for the subject during the time period, step lengths for the subject during the time period and step widths for the subject during the time period, and wherein the difference between the one or more of the elements of the control data and the metrics data is indicative of a phenotypic difference between the subject and the control.
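One way the comparison of subject metrics to control metrics could be sketched is shown below. The text above does not name a statistical test, so the use of a Welch t statistic here is an assumption, as are the function names and the dictionary layout (metric name mapped to a list of per-stride values). No p-value is computed.

```python
import math
import statistics
from typing import Dict, Sequence


def welch_t(subject: Sequence[float], control: Sequence[float]) -> float:
    """Welch's t statistic for two samples with possibly unequal variances."""
    se = math.sqrt(statistics.variance(subject) / len(subject)
                   + statistics.variance(control) / len(control))
    if se == 0:
        raise ValueError("both samples have zero variance")
    return (statistics.mean(subject) - statistics.mean(control)) / se


def compare_metrics(subject: Dict[str, Sequence[float]],
                    control: Dict[str, Sequence[float]]
                    ) -> Dict[str, Dict[str, float]]:
    """Per-metric mean difference and Welch t for metrics present in both."""
    report = {}
    for name in sorted(subject.keys() & control.keys()):
        report[name] = {
            "mean_difference": (statistics.mean(subject[name])
                                - statistics.mean(control[name])),
            "welch_t": welch_t(subject[name], control[name]),
        }
    return report
```

Strides from the same animal are not independent observations, so a real analysis would likely aggregate per animal or use a mixed model before drawing conclusions about significance.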


According to another aspect of the invention, methods of assessing one or more of an activity and behavior of a subject known to have, suspected of having, or at risk of having a disease or condition, are provided, the method including: obtaining metrics data for the subject, wherein a means for the obtaining comprises a computer-generated method of any embodiment of an aforementioned method or system of the invention, and based at least in part on the obtained metrics data, determining presence or absence of the disease or condition. In some embodiments the method also includes selecting a therapeutic regimen for the subject, based at least in part on the determined presence of the disease or condition. In some embodiments, the method also includes administering the selected therapeutic regimen to the subject. In some embodiments, the method also includes obtaining the metrics data for the subject at a time subsequent to the administration of the therapeutic regimen, and optionally comparing the initial obtained metrics data and the subsequent obtained metrics data and determining efficacy of the administered therapeutic regimen. In some embodiments, the method also includes repeating, increasing, or decreasing administration of the selected therapeutic regimen to the subject, based at least in part on the comparison of the initial and subsequent metrics data obtained for the subject. In some embodiments, the method also includes comparing the obtained metrics data to control metrics data. In some embodiments the disease or condition is: a neurodegenerative disorder, neuromuscular disorder, neuropsychiatric disorder, ALS, autism, Down syndrome, Rett syndrome, bipolar disorder, dementia, depression, a hyperkinetic disorder, an anxiety disorder, a developmental disorder, a sleep disorder, Alzheimer's disease, Parkinson's disease, a physical injury, etc. 
Additional diseases and disorders and animal models that can be assessed using a method and/or system of the invention are known in the art, see for example: Barrot M. Neuroscience 2012; 211: 39-50; Graham, D. M., Lab Anim (NY) 2016; 45: 99-101; Sewell, R. D. E., Ann Transl Med 2018; 6: S42. 2019/01/08; and Jourdan, D., et al., Pharmacol Res 2001; 43: 103-110.


According to another aspect of the invention, a method of identifying a subject as an animal model for a disease or condition is provided, the method including obtaining metrics data for the subject, wherein a means for the obtaining comprises a computer-generated method of any one embodiment of an aforementioned method or system of the invention, and based at least in part on the obtained metrics data, determining one or more characteristics of the disease or condition in the subject, wherein the presence of the one or more characteristics of the disease or condition in the subject identifies the subject as an animal model for the disease or condition. In some embodiments, the method also includes additional assessment of the subject. In some embodiments the disease or condition is: a neurodegenerative disorder, neuromuscular disorder, neuropsychiatric disorder, ALS, autism, Down syndrome, Rett syndrome, bipolar disorder, dementia, depression, a hyperkinetic disorder, an anxiety disorder, a developmental disorder, a sleep disorder, Alzheimer's disease, Parkinson's disease, a physical injury, etc. In some embodiments, the method also includes comparing the obtained metrics data to control metrics data, and identifying one or more similarities or differences in the obtained metrics data and the control metrics data, wherein identified similarities or differences assist in identifying the subject as an animal model for the disease or condition.


According to another aspect of the invention, a method of determining the presence of an effect of a candidate compound on a disease or condition is provided, the method including: obtaining first metrics data for a subject, wherein a means for the obtaining includes a computer-generated method of any embodiment of the aforementioned computer generated aspect of the invention, and wherein the subject has the disease or condition or is an animal model for the disease or condition; administering to the subject the candidate compound; obtaining post-administration metrics data for the organism; comparing the first and post-administration metrics data, wherein a difference in the first and post-administration metrics data identifies an effect of the candidate compound on the disease or condition. In some embodiments, the method also includes additional testing of the compound's effect in treatment of the disease or condition.


According to another aspect of the invention, a method of identifying the presence of an effect of a candidate compound on a disease or condition is provided, the method including: administering the candidate compound to a subject that has the disease or condition or that is an animal model for the disease or condition; obtaining metrics data for the subject, wherein a means for the obtaining includes a computer-generated method of any embodiment of the aforementioned computer generated aspect of the invention; comparing the obtained metrics data to a control metrics data, wherein a difference in the obtained metrics data and the control metrics data identifies the presence of an effect of the candidate compound on the disease or condition.


According to another aspect of the invention, a system is provided, the system including: at least one processor; and at least one memory comprising instructions that, when executed by the at least one processor, cause the system to: receive video data representing a video capturing movements of a subject; process the video data to identify point data tracking movement, over a time period, of a set of body parts of the subject; determine, using the point data, a plurality of stance phases and a corresponding plurality of swing phases represented in the video data during the time period; determine, based on the plurality of stance phases and the plurality of swing phases, a plurality of stride intervals represented in the video data during the time period; determine, using the point data, metrics data for the subject, the metrics data being based on each stride interval of the plurality of stride intervals; compare the metrics data for the subject to control metrics data; and determine, based on the comparing, a difference between the subject's metrics data and the control metrics data. In some embodiments, the set of body parts includes the nose, base of neck, mid spine, left hind paw, right hind paw, base of tail, middle of tail and tip of tail; and wherein the plurality of stance phases and the plurality of swing phases are determined based on the change in movement speed of the left hind paw and the right hind paw.
In certain embodiments, the at least one memory also includes instructions that, when executed by the at least one processor, further cause the system to: determine a transition from a first stance phase of the plurality of stance phases to a first swing phase of the plurality of swing phases based on a toe-off event of the left hind paw or the right hind paw; and determine a transition from a second swing phase of the plurality of swing phases to a second stance phase of the plurality of stance phases based on a foot strike event of the left hind paw or the right hind paw. In certain embodiments, the metrics data correspond to gait measurements of the subject during each stride interval. In some embodiments, the set of body parts includes a left hind paw and a right hind paw, and wherein the instruction that causes the system to determine the metrics data further causes the system to: determine, using the point data, a step length for each stride interval, the step length representing a distance that the right hind paw travels past a previous left hind paw strike; determine, using the point data, a stride length for each stride interval, the stride length representing a distance that the left hind paw travels during each stride interval; and determine, using the point data, a step width for each stride interval, the step width representing a distance between the left hind paw and the right hind paw. In some embodiments, the set of body parts includes a tail base, and wherein the instruction that causes the system to determine the metrics data further causes the system to: determine, using the point data, speed data of the subject based on movement of the tail base for each stride interval.
In certain embodiments, the set of body parts includes a tail base, and wherein the instruction that causes the system to determine the metrics data further causes the system to: determine, using the point data, a set of speed data of the subject based on movement of the tail base during a set of frames representing a stride interval of the plurality of stride intervals; and determine a stride speed, for the stride interval, by averaging the set of speed data. In certain embodiments, the set of body parts includes a right hind paw and a left hind paw, and wherein the instruction that causes the system to determine the metrics data further causes the system to: determine, using the point data, first stance duration representing an amount of time that the right hind paw is in contact with ground during a stride interval of the plurality of stride intervals; determine a first duty factor based on the first stance duration and the duration of the stride interval; determine, using the point data, second stance duration representing an amount of time that the left hind paw is in contact with ground during the stride interval; determine a second duty factor based on the second stance duration and the duration of the stride interval; and determine an average duty factor for the stride interval based on the first duty factor and the second duty factor. In some embodiments, the set of body parts includes a tail base and a neck base, and wherein the instruction that causes the system to determine the metrics data further causes the system to: determine, using the point data, a set of vectors connecting the tail base and the neck base during a set of frames representing a stride interval of the plurality of stride intervals; and determine, using the set of vectors, an angular velocity of the subject for the stride interval. In some embodiments, the metrics data correspond to posture measurements of the subject during each stride interval. 
In some embodiments, the set of body parts includes a spine center of the subject, wherein a stride interval of the plurality of stride intervals is associated with a set of frames of the video data, and wherein the instruction that causes the system to determine the metrics data further causes the system to determine, using the point data, a displacement vector for the stride interval, the displacement vector connecting the spine center represented in a first frame of the set of frames and the spine center represented in a last frame of the set of frames. In certain embodiments, the set of body parts also includes a nose of the subject, and wherein the instruction that causes the system to determine the metrics data further causes the system to: determine, using the point data, a set of lateral displacements of the nose for the stride interval based on a perpendicular distance of the nose from the displacement vector for each frame in the set of frames. In some embodiments, the lateral displacement of the nose is further based on a body length of the subject. In some embodiments, the instruction that causes the system to determine the metrics data further causes the system to determine a nose displacement phase offset by: performing an interpolation using the set of lateral displacements of the nose to generate a smooth curve lateral displacement of the nose for the stride interval; determining, using the smooth curve lateral displacement of the nose, when a maximum displacement of the nose occurs during the stride interval; and determining a percent stride location representing a percent of the stride interval that is completed when the maximum displacement of the nose occurs.
In certain embodiments, the set of body parts also includes a tail base of the subject, and wherein the instruction that causes the system to determine the metrics data further causes the system to: determine, using the point data, a set of lateral displacements of the tail base for the stride interval based on a perpendicular distance of the tail base from the displacement vector for each frame in the set of frames. In some embodiments, the instruction that causes the system to determine the metrics data further causes the system to determine a tail base displacement phase offset by: performing an interpolation using the set of lateral displacements of the tail base to generate a smooth curve lateral displacement of the tail base for the stride interval; determining, using the smooth curve lateral displacement of the tail base, when a maximum displacement of the tail base occurs during the stride interval; and determining a percent stride location representing a percent of the stride interval that is completed when the maximum displacement of the tail base occurs. In certain embodiments, the set of body parts also includes a tail tip of the subject, and wherein the instruction that causes the system to determine the metrics data further causes the system to: determine, using the point data, a set of lateral displacements of the tail tip for the stride interval based on a perpendicular distance of the tail tip from the displacement vector for each frame in the set of frames. 
In some embodiments, the instruction that causes the system to determine the metrics data further causes the system to determine a tail tip displacement phase offset by: performing an interpolation using the set of lateral displacements of the tail tip to generate a smooth curve lateral displacement of the tail tip for the stride interval; determining, using the smooth curve lateral displacement of the tail tip, when a maximum displacement of the tail tip occurs during the stride interval; and determining a percent stride location representing a percent of the stride interval that is completed when the maximum displacement of the tail tip occurs. In certain embodiments, the instruction that causes the system to process the video data further causes the system to process the video data using a machine learning model. In some embodiments, the instruction that causes the system to process the video data further causes the system to process the video data using a neural network model. In some embodiments, the video captures subject-determined movements of the subject in an open arena with a top-down view. In certain embodiments, the control metrics data is obtained from a control organism or plurality thereof. In some embodiments, the subject is an organism and the control organism and the subject organism are the same species. In some embodiments, the species is a member of the Order Rodentia, and optionally is rat or mouse. In certain embodiments, the control organism is a laboratory strain of the species. In certain embodiments, the laboratory strain is one listed in FIG. 14E. In some embodiments, a statistically significant difference in the subject's metrics data compared to the control metrics data indicates a difference in the phenotype of the subject compared to the phenotype of the control organism. In some embodiments, the phenotypic difference indicates the presence of a disease or condition in the subject. 
In certain embodiments, the phenotypic difference indicates a difference between the genetic background of the subject and the genetic background of the control organism. In some embodiments, a statistically significant difference in the subject's metrics data and the control metrics data indicates a difference in the genotype of the subject compared to the genotype of the control organism. In some embodiments, the difference in the genotype indicates a strain difference between the subject and the control organism. In some embodiments, the difference in the genotype indicates the presence of a disease or condition in the subject. In certain embodiments, the disease or condition is Rett syndrome, Down syndrome, amyotrophic lateral sclerosis (ALS), autism spectrum disorder (ASD), schizophrenia, bipolar disorder, a neurodegenerative disorder, dementia, or a brain injury. In certain embodiments, the control organism and the subject organism are the same gender. In some embodiments, the control organism and the subject organism are not the same gender. In some embodiments, the control metrics data corresponds to elements including: control stride length, control step length and control step width, wherein the subject's metrics data includes elements including stride lengths for the subject during the time period, step lengths for the subject during the time period and step widths for the subject during the time period, and wherein the difference between the one or more of the elements of the control data and the metrics data is indicative of a phenotypic difference between the subject and the control.





BRIEF DESCRIPTION OF THE DRAWINGS

For a more complete understanding of the present disclosure, reference is now made to the following description taken in conjunction with the accompanying drawings.



FIG. 1 is a conceptual diagram of an example system for determining subject gait and posture metrics, according to embodiments of the present disclosure.



FIG. 2 is a flowchart illustrating an example process that may be performed by a system shown in FIG. 1 for analyzing video data of a subject(s) to determine subject gait and posture metrics, according to embodiments of the present disclosure.



FIG. 3 is a flowchart illustrating an example process that may be performed by a point tracker component shown in FIG. 1 for tracking subject body parts in the video data, according to embodiments of the present disclosure.



FIG. 4 is a flowchart illustrating an example process that may be performed by the system shown in FIG. 1 for determining stride intervals, according to embodiments of the present disclosure.



FIG. 5 is a flowchart illustrating an example process that may be performed by a gait analysis component shown in FIG. 1 for determining subject gait metrics, according to embodiments of the present disclosure.



FIG. 6 is a flowchart illustrating an example process that may be performed by a posture analysis component shown in FIG. 1 for determining subject posture metrics, according to embodiments of the present disclosure.



FIG. 7A-C shows schematic diagrams and graphs illustrating a deep convolutional neural network for pose estimation. FIG. 7A shows the HRNet-W32 neural network architecture for performing pose estimation. FIG. 7B shows the inference pipeline, which sends video frames into the HRNet and generates twelve keypoint heatmaps as output. FIG. 7C presents training loss curves showing network convergence without overfitting.



FIG. 8A-J shows schematic diagrams and graphs illustrating deriving gait phenotypes from video pose estimation. FIG. 8A-B shows spatial and temporal characteristics of gait (based on figure from Green et al., Dev Med Child Neurol (2009) 51:311). FIG. 8A is an illustration showing how three spatial stride metrics were derived from hind paw foot strike positions: Step Length, Step Width and Stride Length. FIG. 8B is a Hildebrand Plot in which all metrics shown have percent stride time for units. This illustrates the relationship between foot strike and toe-off events with the stance and swing phases of stride.



FIG. 8C shows a single frame of input video with hind paw tracks plotted fifty frames in the past and fifty frames in the future. The location of hind foot strike events is indicated with black circles. The outermost line of the three is the Hind Paw Right, the middle line of the three is the Tail Base, and the innermost line of the three is the Hind Paw Left. FIG. 8D-F shows three plots showing different aspects of the mouse's movement over the same one hundred-frame interval. The centered vertical line indicates the current frame (displayed in FIG. 8C). FIG. 8D shows three lines indicating speed of the left hind paw, the right hind paw, and the base of tail. The vertical dark lines in the plot indicate the inferred start frame of each stride. FIG. 8G illustrates the distribution of confidence values for each of the 12 points that were estimated. FIG. 8H provides an aggregate view of Hildebrand plot for hind paws binned according to angular velocity. FIG. 8I shows results similar to FIG. 8H except binned by speed. FIG. 8J illustrates that limb duty factor changes as a function of speed.



FIG. 9A-I provides schematic diagrams and graphs illustrating extraction of cyclic whole-body posture metrics during gait cycle. Several metrics relate to the cyclic lateral displacement observed in pose key points. The measures of lateral displacement were defined as an orthogonal offset from the relevant stride displacement vector. The displacement vector was defined as the line connecting the mouse's center of spine on the first frame of a stride to the mouse's center of spine on the last frame of stride. This offset was calculated at each frame of a stride and then a cubic interpolation was performed in order to generate a smooth displacement curve. The phase offset of displacement was defined as the percent stride location where maximum displacement occurs on this smoothed curve. The lateral displacement metric assigned to stride was the difference between maximum displacement value and minimum displacement value observed during a stride. Lateral displacement of (FIG. 9A) the tail tip and (FIG. 9B) the nose was measured. Displacement could also be averaged across many strides within a cohort to form a consensus view such as (FIG. 9D) C57BL/6J vs. (FIG. 9E) NOR/LtJ, or averaged across many strides within individuals: (FIG. 9F) C57BL/6J vs. (FIG. 9G) NOR/LtJ. FIG. 9H and FIG. 9I illustrate the diversity of lateral displacement between a set of strains selected from the strain survey. The light (translucent) bands for these two plots represent the 95% confidence interval of the mean for each respective strain.
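The lateral displacement measures described for FIG. 9 can be sketched in a few lines. The following is a minimal, illustrative sketch only: the function names, the sign convention of the offset, and the toy coordinates are assumptions, and the cubic interpolation step is omitted, so the phase offset here is estimated at frame resolution rather than on a smoothed curve.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def lateral_offsets(spine_start: Point, spine_end: Point,
                    keypoint_track: List[Point]) -> List[float]:
    """Signed orthogonal offset of each keypoint sample from the stride
    displacement vector (spine center on first frame -> last frame)."""
    dx = spine_end[0] - spine_start[0]
    dy = spine_end[1] - spine_start[1]
    length = math.hypot(dx, dy)
    if length == 0.0:
        raise ValueError("stride displacement vector has zero length")
    # 2D cross product of the displacement vector with the vector from the
    # stride start to the keypoint, normalized by the displacement length.
    return [(dx * (py - spine_start[1]) - dy * (px - spine_start[0])) / length
            for px, py in keypoint_track]


def lateral_displacement(offsets: List[float]) -> float:
    """Difference between maximum and minimum offset within one stride."""
    return max(offsets) - min(offsets)


def phase_offset_percent(offsets: List[float]) -> float:
    """Percent-stride location of the maximum offset (no smoothing applied)."""
    if len(offsets) < 2:
        raise ValueError("need at least two frames in a stride")
    peak_frame = max(range(len(offsets)), key=offsets.__getitem__)
    return 100.0 * peak_frame / (len(offsets) - 1)


# Toy stride of five frames: the animal moves along +x, the nose sways in y.
offsets = lateral_offsets((0.0, 0.0), (10.0, 0.0),
                          [(0.0, 0.0), (2.0, 1.0), (5.0, 2.0), (8.0, -1.0), (10.0, 0.0)])
print(offsets)                        # [0.0, 1.0, 2.0, -1.0, 0.0]
print(lateral_displacement(offsets))  # 3.0
print(phase_offset_percent(offsets))  # 50.0
```

A fuller version would fit a cubic interpolant through the per-frame offsets before locating the maximum, as the description states.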



FIG. 10A-E shows results indicating genetic validation of gait mutants. FIG. 10A shows q-values (left) and effect sizes (right) obtained from a linear mixed effects model and circular-linear model adjusting for body length and age. In FIG. 10B, kernel density estimates and cumulative distribution functions of speed distributions were compared to test for differences in stride speeds between controls and mutants. In FIG. 10C, total distance covered and speed were compared between controls and mutants using linear and linear mixed models, respectively, adjusting for body length and age. FIG. 10D illustrates results of body length adjusted gait metrics that were found to be different for linear mixed effects model. FIG. 10E shows results of lateral displacement of nose and tail tip for Ts65Dn strain. The solid lines represent the mean displacement of stride while the light (translucent) bands provide a 95% confidence interval for the mean.



FIG. 11A-F provides tables and graphs illustrating genetic validation of autism mutants. FIG. 11A shows q-values (left) and effect sizes (right) that were obtained from model M1 for linear phenotypes and circular-linear models for circular phenotypes. FIG. 11B shows q-values (left) and effect sizes (right) obtained from model M3 for linear phenotypes and circular-linear models for circular phenotypes. In FIG. 11C, total distance covered and speed are compared between controls and mutants using linear and linear mixed models, respectively, adjusting for body length and age. In each pair shown, the left data is that of the control and the right data is that of the mutant. FIG. 11D shows body length adjusted gait metrics that were found to be different for linear mixed effects model. FIG. 11E illustrates use of the first two principal components to build a 2D representation of the multidimensional space in which controls and mutants are best separated. FIG. 11F shows cumulative distribution of speed in the ASD models. The upper curves are Controls and lower curves are Mutant. Cntnap2, Fmr, and Del4Aam have lower stride speeds, whereas Shank3 has higher stride speeds.



FIG. 12A-E shows results from strains tested. In FIG. 12A, each boxplot corresponds to a strain, with vertical position indicating residuals of stride length adjusted for body length. Strains are ordered by their median residual stride length value. FIG. 12B shows z-scores of body length adjusted gait metrics for all strains color coded by the cluster membership (see FIG. 12C). FIG. 12C shows use of K-means algorithm to build, using the first two principal components, a 2D representation of the multidimensional space in which strains are best separated. Top right region is cluster 1, lower region is cluster 2, and top left region shown is cluster 3. FIG. 12D provides a consensus view of lateral displacement of nose and tail tip across the clusters. The solid lines represent the mean displacement of stride while the translucent bands provide a 95% confidence interval for the mean. FIG. 12E shows post-clustering plots summarizing the residual gait metrics across different clusters. In each set of three, left is cluster 1, middle is cluster 2, and right is cluster 3.



FIG. 13A-D provides GWAS results for gait phenotypes. FIG. 13A provides heritability estimates for each phenotype mean (left) and variance (right). FIG. 13B-D provide Manhattan plots of all mean phenotypes (FIG. 13B), variance phenotypes (FIG. 13C), and all of them combined (FIG. 13D); colors correspond to the phenotype with the lowest p-value for the single nucleotide polymorphism (SNP).



FIG. 14A-D provides listings of animal strains used in certain implementations of the invention. FIG. 14A shows control strains and official identifiers for gait mutants. FIG. 14B shows control strains and official identifiers for autism mutants. FIG. 14C shows a table summarizing body length and weight of animals in experiments. FIG. 14D provides a listing that summarizes animals used in the strain survey studies.



FIG. 15A-E provides heat maps, curves, and plots. FIG. 15A is a heat map summarizing the effect sizes and q-values obtained from model M3: Phenotype˜Genotype+TestAge+Speed+BodyLength+(1|MouseID/TestAge). FIG. 15B shows kernel density (left) and cumulative density (right) curves of speed across all strains. FIG. 15C is a plot showing positive association between body length and sex across different gait mutant strains. In each pair of results, Controls are on left of pair and Mutants are on right of pair. FIG. 15D shows body length (M1), speed (M2), body length and speed (M3) adjusted residuals for limb duty factor and step length for Mecp2 gait mutant. FIG. 15E shows body length (M1), speed (M2), body length and speed (M3) adjusted residuals for step width and stride length for Mecp2 gait mutant.



FIG. 16A-E provides heat maps, curves and plots. FIG. 16A is a heat map summarizing the effect sizes and q-values obtained from model M2: Phenotype˜Genotype+TestAge+Speed+(1|MouseID/TestAge). FIG. 16B shows kernel density curves of speed across all strains. FIG. 16C is a plot showing positive association between body length and sex across different gait mutant strains. In each pair of results, Controls are on left of pair and Mutants are on right of pair. FIG. 16D shows body length (M1), speed (M2), body length and speed (M3) adjusted residuals for step length and stride length for Shank3 autism mutant. FIG. 16E shows body length (M1), speed (M2), body length and speed (M3) adjusted residuals for step length and stride length for Del4Aam autism mutant. In each pair of results shown Controls are on left of pair and Mutants are on right of pair.



FIG. 17A-F shows results of body length adjusted phenotypes that were compared across 62 strains in the strain survey. The box plots are displayed in an ascending order with respect to the median measure from left to right. Each panel (FIG. 17A-F) corresponds to a different gait phenotype.



FIG. 18A-E shows results of body length adjusted phenotypes that were compared across 62 strains in the strain survey. The box plots are displayed in an ascending order with respect to the median measure from left to right. Each panel (FIG. 18A-E) corresponds to a different gait phenotype.



FIG. 19 provides a listing summarizing effect sizes and FDR adjusted p-values obtained from models M1, M2, M3 for all phenotypes and gait strains.



FIG. 20 provides a listing summarizing effect sizes and FDR adjusted p-values obtained from models M1, M2, M3 for all phenotypes and autism strains.



FIG. 21A-D shows three optimal clusters in strain survey data. Thirty clustering indices were examined for choosing the optimal number of clusters (Bates et al., J Stat Softw (2015) 67:1). FIG. 21A shows that the majority indicated that there may be 2 or 3 clusters in the strain survey data. One major criterion for choosing the optimal number of clusters is to maximize the between-cluster distances while keeping the within-cluster distances small. To this end, the following were examined versus the number of clusters employed: within-sum-of-squares (WSS) (shown in FIG. 21B); Calinski-Harabasz (CH) index (shown in FIG. 21C) [Calinski, T. & Harabasz, Communications in Statistics-theory and Methods 3, 1-27 (1974)]; and gap statistic [Tibshirani, R. et al. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 63, 411-423 (2001)] (shown in FIG. 21D). All these indicated that 3 clusters is the optimal choice.



FIG. 22 shows a table of significant GWAS hits for gait and posture phenotypes. The information includes results of studies showing Quantitative Trait Loci (QTL) peak SNP, QTL peak SNP position, QTL start position, QTL end position, Allele 1, Allele 2, Allele 1 frequency, p wald, protein coding genes, and groups in which the QTL was found to be significant.



FIG. 23 is a block diagram conceptually illustrating example components of a device according to embodiments of the present disclosure.



FIG. 24 is a block diagram conceptually illustrating example components of a server according to embodiments of the present disclosure.


DETAILED DESCRIPTION

The invention includes, in part, a method for processing video data to first track body parts of a subject, then determine data representing gait metrics and posture metrics, and then perform statistical analysis to determine any differences/deviations from a control. Methods and systems of the invention provide a reliable and scalable automated system for extracting gait-level and posture-level features, dramatically lower the time and labor costs associated with neurogenetic behavior experiments, and reduce variability in such experiments.


Analysis of human and animal movement, including gait, has a storied past. Aristotle was the first to write a philosophical treatise on animal movement and gait using physical and metaphysical principles. During the Renaissance, Borelli applied the laws of physics and biomechanics to muscles, tendons, and joints of the entire body to understand gait. The first application of imaging technologies to the study of gait is credited to the work of Muybridge and Marey, who took sequential photographic images of humans and animals in motion in order to derive quantitative measurements of gait. Modern animal gait analysis methods are credited to Hildebrand, who in the 1970s classified gait based on quantified metrics. He defined a gait cycle in terms of contact of limb to the ground (stance and swing phases). Fundamentally, this concept has not changed over the past 40 years: while current methods of mouse gait analysis have increased efficiency of the imaging approaches of Muybridge and Marey, they are fundamentally still based on the timing of limbs contacting the ground. This is in contrast to human gait and posture analysis, which, since the time of Borelli, has focused on body posture, and is akin to the quantitation of whole body movement rather than just contact with the ground. This difference between mouse and human is probably due in part to the difficulty in automatically estimating the posture of rodents, which appear as deformable objects due to their fur, which obscures joint positions. In addition, unlike humans, parts of mice cannot be easily marked with wearables for localization. In rodents, recent methods have made progress in determination of whole body coordination; however, these still require specialized equipment and force the animal to walk in a fixed direction on a treadmill or in a narrow corridor for proper imaging and accurate determination of limb position. 
This is highly unnatural, and animals often require training to perform this behavior properly, limiting the use of this type of assay in correlating to human gait. Imaging from the side leads to perspective hurdles, which are overcome by limiting the movement of the animal to one depth field. Furthermore, as the animal defecates and urinates, or when bedding is present, the resulting occlusion makes long term monitoring from this perspective impractical. Indeed, ethologically relevant gait data in which animals can move freely often produce results that differ from treadmill-based assays. Furthermore, commercial treadmill- or corridor-based systems for gait analysis often produce a plethora of measures that show differing results with the same animal models. The exact causes of these disparities are challenging to determine with closed, proprietary systems. Thus, there is currently a lack of an easily and broadly implementable tool to measure gait and posture in free-moving animals.


The open field assay is one of the oldest and most commonly used assays in behavioral neurogenetics. In rodents, it has classically been used to measure endophenotypes associated with emotionality, such as hyperactivity, anxiety, exploration, and habituation. For video-based open field assays, rich and complex behaviors of animal movement are abstracted to a simple point in order to extract behavioral measures. This oversimplified abstraction is necessary mainly due to technological limitations that have prohibited accurate extraction of complex poses from video data. New technology has the potential to overcome this limitation. Gait, an important indicator of neural function, is not typically analyzed by conventional systems in the open field, mainly due to the technical difficulty of determining limb position when animals are moving freely. The ability to combine open field measures with gait and posture analysis would offer key insights into neural and genetic regulation of animal behavior in an ethologically relevant manner. The invention of the present disclosure leverages modern machine learning models, such as neural networks, to carry out subject gait and posture analysis in the open field. The invention relates to systems and methods to measure gait and whole body posture parameters from a top-down perspective that is invariant to the high level of visual diversity seen in a subject, such as a mouse, including coat color, fur differences, and size differences. Altogether, the invention provides a system that is sensitive, accurate, and scalable and can detect previously undescribed differences in gait and posture in mouse models of diseases and conditions.


The present disclosure relates to techniques for gait and posture analysis that include several modular components, one of which, in some embodiments, is a neural network (e.g., a deep convolutional neural network) that has been trained to perform pose estimation using top-down videos of an open field. The neural network may provide, for each frame of video, multiple two-dimensional markers (in some embodiments, twelve such markers) of a subject's anatomical locations (also referred to as “keypoints”), describing the pose of the subject at each time point. Another one of the modular components may be capable of processing the time series of poses and identifying intervals that represent individual strides. Another one of the modular components may be capable of extracting several gait metrics on a per-stride basis, and another modular component may be capable of extracting several posture metrics. Finally, another modular component may be configured to perform statistical analysis on the gait metrics and the posture metrics, as well as to enable aggregation of large amounts of data in order to provide consensus views of the structure of a subject's gait.


The system 100 of the present disclosure may operate using various components as illustrated in FIG. 1. The system 100 may include an image capture device 101, a device 102 and one or more systems 150 connected across one or more networks 199. The image capture device 101 may be part of, included in, or connected to another device (e.g., device 1600), and may be a camera, a high speed video camera, or other types of devices capable of capturing images and videos. The device 101, in addition to or instead of an image capture device, may include a motion detection sensor, infrared sensor, temperature sensor, atmospheric conditions detection sensor, and other sensors configured to detect various characteristics/environmental conditions. The device 102 may be a laptop, a desktop, a tablet, a smartphone, or other types of computing devices, and may include one or more components described in connection with device 1600 below.


The image capture device 101 may capture video (or one or more images) of one or more subjects on whom the formalin assay is performed, and may send video data 104 representing the video to the system(s) 150 for processing as described herein. The system(s) 150 may include one or more components shown in FIG. 1, and may be configured to process the video data 104 to determine gait and posture behaviors of the subject(s) over time. The system(s) 150 may determine difference data 148 representing one or more differences between the subject gait and/or posture and a control gait and/or posture. The difference data 148 may be sent to the device 102 for output to a user to observe the results of processing the video data 104.


Details of the components of the system(s) 150 are described below. The various components may be located on the same or different physical devices. Communication between the various components may occur directly or across a network(s) 199. Communication between the device 101, the system(s) 150 and the device 102 may occur directly or across a network(s) 199. One or more components shown as part of the system(s) 150 may be located at the device 102 or at a computing device (e.g., device 1600) connected to the image capture device 101. In an example embodiment, the system(s) 150 may include a point tracker component 110, a gait analysis component 120, a posture analysis component 130, and a statistical analysis component 140. In other embodiments, the system(s) 150 may include fewer or more components than shown in FIG. 1 to perform the same or similar functionality as described below.



FIG. 2 is a flowchart illustrating an example process 200 that may be performed by the system 100 shown in FIG. 1 for analyzing video data 104 of a subject to determine gait and posture metrics, according to embodiments of the present disclosure. At a high level, the process 200 begins with the image capture device 101 recording a video(s) of a subject's movements. In some embodiments, the video data 104 is a top-down perspective of the subject. In some embodiments, the subject(s) may be in an enclosure that has an open arena, for example, without a treadmill, a tunnel, etc. to direct the subject(s) in a particular fashion. This allows for observing subjects without having to train subjects to perform certain movements, such as walking on a treadmill or moving within a tunnel. At a step 202, the system(s) 150 may receive the video data 104 from the image capture device 101 (or a device 1600 connected to the image capture device 101 or within which the image capture device 101 is included). At a step 204, the point tracker component 110 of the system(s) 150 may process the video data 104 to determine point data 112. The point data 112 may represent data tracking movements of a set of subject body parts over a time period represented in the video data 104. Further details on the point tracker component 110 are described below in relation to FIG. 3. At a step 206, the gait analysis component 120 of the system(s) 150 may process the point data 112 to determine metrics data 122. The metrics data 122 may represent gait metrics for the subject. Further details on the gait analysis component 120 are described below in relation to FIG. 5. At a step 208, the posture analysis component 130 of the system(s) 150 may process the point data 112 to determine metrics data 132. The metrics data 132 may represent posture metrics for the subject. Further details on the posture analysis component 130 are described below in relation to FIG. 6. 
In some embodiments, the step 208 may be performed before the step 206. In some embodiments, the steps 206 and 208 may be performed in parallel; for example, the gait analysis component 120 may process the point data 112 while the posture analysis component 130 is processing the point data 112. In some embodiments, depending on system configuration, only one of the steps 206 and 208 may be performed. For example, in some embodiments, the system(s) 150 may be configured to only determine gait metrics, and thus, only the step 206 may be performed by the gait analysis component 120. In another example, in some embodiments, the system(s) 150 may be configured to only determine posture metrics, and thus, only the step 208 may be performed by the posture analysis component 130. At a step 210, the statistical analysis component 140 of the system(s) 150 may process the metrics data 122, the metrics data 132 and control data 144 to determine difference data 148. Further details on the statistical analysis component 140 are described below.



FIG. 3 is a flowchart illustrating an example process 300 that may be performed by the point tracker component 110 for tracking subject body parts in the video data 104, according to embodiments of the present disclosure. At a step 302, the point tracker component 110 may process the video data 104 using a machine learning model(s) to locate subject body part(s). At a step 304, the point tracker component 110 may generate a heatmap(s) for the subject body part(s) based on processing the video data 104 using the machine learning model(s). The point tracker component 110 may use the machine learning model(s) to estimate a two-dimensional pixel coordinate where a subject body part appears within a video frame of the video data 104. The point tracker component 110 may generate a heatmap estimating a location of one subject body part for one video frame. For example, the point tracker component 110 may generate a first heatmap, where each cell in the heatmap may correspond to a pixel within the video frame, and may represent a likelihood of a first subject body part (e.g., a right forepaw) being located at the respective pixel. Continuing with the example, the point tracker component 110 may generate a second heatmap, where each cell may represent a likelihood of a second subject body part (e.g., a left forepaw) being located at the respective pixel. At a step 306, the point tracker component 110 may determine the point data 112 using the generated heatmap(s). The heatmap cell with the highest/maximum value may identify the pixel coordinate where the respective subject body part is located within the video frame.


The point tracker component 110 may be configured to locate two-dimensional coordinates of a set of subject body parts, identified as keypoints, in an image or video. In some embodiments, the set of subject body parts may be pre-defined and may be based on which keypoints are visually salient, such as ears or nose, and/or which keypoints capture important information for analyzing the gait and posture of the subject, such as limb joints or paws. In an example embodiment, the set of subject body parts may include twelve keypoints. In other embodiments, the set of subject body parts may include fewer than or more than twelve keypoints. In an example embodiment, the set of subject body parts may include: nose, left ear, right ear, base of neck, left forepaw, right forepaw, mid spine, left hind paw, right hind paw, base of tail, mid tail and tip of tail (as illustrated in FIG. 7B).


The point tracker component 110 may implement one or more pose estimation techniques. The point tracker component 110 may include one or more machine learning models configured to process the video data 104. In some embodiments, the one or more machine learning models may be a neural network, such as a deep neural network, a deep convolutional neural network, a recurrent neural network, etc. In other embodiments, the one or more machine learning models may be other types of models than a neural network. The point tracker component 110 may be configured to determine the point data 112 with high accuracy and precision because the metrics data 122, 132 may be sensitive to errors in the point data 112. The point tracker component 110 may implement an architecture that maintains high-resolution features throughout the machine learning model stack, thereby preserving spatial precision. In some embodiments, the point tracker component 110 architecture may include one or more transpose convolutions to cause matching between a heatmap output resolution and the video data 104 resolution. The point tracker component 110 may be configured to determine the point data 112 at near real-time speeds and may run on a high-processing-capacity GPU. The point tracker component 110 may be configured such that modifications and extensions can be made easily. In some embodiments, the point tracker component 110 may be configured to generate an inference at a fixed scale, rather than processing at multiple scales, to save computing resources and time.


In some embodiments, the video data 104 may track movements of one subject, and the point tracker component 110 may not be configured to perform any object detection techniques/algorithms. In other embodiments, the video data 104 may track movements of more than one subject, and the point tracker component 110 may be configured to perform object detection techniques to distinguish one subject from another subject within the video data 104.


In some embodiments, the point tracker component 110 may generate multiple heatmaps, each heatmap representing an inference of where one keypoint representing one subject body part is located within a frame of the video data 104. In one example, the video data 104 may have a 480×480 frame, and the point tracker component 110 may generate twelve 480×480 heatmaps. The maximum value in each heatmap may represent the highest confidence location for each respective keypoint. In some embodiments, the point tracker component 110 may take the location of the maximum value of each of the twelve heatmaps and output those locations as the point data 112; thus, the point data 112 may include twelve (x,y) coordinates.
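As an illustration of the reduction from heatmaps to keypoint coordinates described above, the following minimal sketch takes the location of the highest-valued cell in each heatmap. The function names and the toy 3×4 heatmaps are assumptions made for the example; they do not represent the actual implementation of the point tracker component 110.

```python
from typing import List, Tuple

Heatmap = List[List[float]]  # rows of per-pixel likelihood values


def heatmap_to_keypoint(heatmap: Heatmap) -> Tuple[int, int, float]:
    """Return (x, y, confidence) for the highest-valued cell of one heatmap."""
    best_x, best_y, best_value = 0, 0, float("-inf")
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            if value > best_value:
                best_x, best_y, best_value = x, y, value
    return best_x, best_y, best_value


def heatmaps_to_point_data(heatmaps: List[Heatmap]) -> List[Tuple[int, int]]:
    """Reduce one heatmap per keypoint to one (x, y) pixel coordinate each."""
    return [heatmap_to_keypoint(h)[:2] for h in heatmaps]


# Toy example: two 3x4 heatmaps stand in for twelve 480x480 heatmaps.
toy_heatmaps = [
    [[0.0, 0.1, 0.0, 0.0],
     [0.0, 0.2, 0.9, 0.0],
     [0.0, 0.0, 0.1, 0.0]],
    [[0.7, 0.1, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.3]],
]
point_data = heatmaps_to_point_data(toy_heatmaps)
print(point_data)  # [(2, 1), (0, 0)]
```

The maximum value itself can be retained alongside the coordinate as a per-keypoint confidence, which is useful when deciding whether a frame should contribute to downstream metrics.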


In some embodiments, the point tracker component 110 may be trained using a loss function defined against a target, for example, a Gaussian distribution centered on the respective keypoint. The output of the neural network of the point tracker component 110 may be compared with the keypoint-centered Gaussian distribution, and the loss may be calculated as the mean squared difference between the keypoint-centered Gaussian distribution and the heatmap generated by the point tracker component 110. In some embodiments, the point tracker component 110 may be trained using an optimization algorithm, for example, a stochastic gradient descent optimization algorithm. The point tracker component 110 may be trained using training video data of subjects having varying physical characteristics, such as different coat colors, different body lengths, different body sizes, etc.
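One reading of the training objective described above is a mean squared error between the predicted heatmap and a Gaussian target centered on the labeled keypoint. The sketch below illustrates that reading only; the spread parameter `sigma`, the unnormalized peak value of 1.0, and the function names are assumptions that the description does not specify.

```python
import math
from typing import List

Heatmap = List[List[float]]


def gaussian_target(width: int, height: int, cx: float, cy: float,
                    sigma: float = 2.0) -> Heatmap:
    """Target heatmap: an unnormalized 2D Gaussian centered on the keypoint.
    sigma is a hypothetical choice; the description does not give a value."""
    return [[math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
             for x in range(width)]
            for y in range(height)]


def mse_loss(predicted: Heatmap, target: Heatmap) -> float:
    """Mean squared difference between two heatmaps of equal shape."""
    total, count = 0.0, 0
    for p_row, t_row in zip(predicted, target):
        for p, t in zip(p_row, t_row):
            total += (p - t) ** 2
            count += 1
    return total / count


target = gaussian_target(width=5, height=5, cx=2, cy=2)
blank_prediction = [[0.0] * 5 for _ in range(5)]

perfect_loss = mse_loss(target, target)          # 0.0 for an exact match
blank_loss = mse_loss(blank_prediction, target)  # positive for a blank heatmap
print(perfect_loss, blank_loss > 0.0)
```

In a training loop, this loss would be averaged over all keypoints and frames in a batch and minimized with an optimizer such as stochastic gradient descent.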


The point tracker component 110 may estimate given keypoints with varying levels of confidence depending on the position of the subject body part on the subject body. For example, the location of the hind paws may be estimated with a higher confidence than the location of the forepaws because the forepaws may be more occluded than the hind paws in a top-down perspective. In another example, body parts that are not visually salient, like the spine center, may have a lower confidence since they may be more difficult for the point tracker component 110 to locate accurately.


Referring now to the gait analysis component 120 and the posture analysis component 130, gait metrics, as used herein, may refer to metrics derived from the subject's paw movements. Gait metrics may include, but are not limited to, step width, step length, stride length, speed, angular velocity, and limb duty factor. As used herein, posture metrics may refer to metrics derived from the movements of the subject's whole body. In some embodiments, the posture metrics may be based on movements of the subject's nose and tail. Posture metrics may include, but are not limited to, lateral displacement of nose, lateral displacement of tail base, lateral displacement of tail tip, nose lateral displacement phase offset, tail base displacement phase offset, and tail tip displacement phase offset.


The gait analysis component 120 and the posture analysis component 130 may determine one or more of the gait metrics and the posture metrics on a per-stride basis. The system(s) 150 may determine a stride interval(s) represented in a video frame of the video data 104. In some embodiments, the stride interval may be based on a stance phase and a swing phase. FIG. 4 is a flowchart illustrating an example process 400 that may be performed by the gait analysis component 120 and/or the posture analysis component 130 to determine a set of stride intervals for analysis.


In example embodiments, the approach for detecting stride intervals is based on the cyclic structure of gait. During a stride cycle, each of the paws may have a stance phase and a swing phase. During the stance phase, the subject's paw is supporting the weight of the subject and is in static contact with the ground. During the swing phase, the paw is moving forward and is not supporting the subject's weight. The transition from a stance phase to a swing phase is referred to herein as a toe-off event, and the transition from a swing phase to a stance phase is referred to herein as a foot-strike event. FIG. 8A-C illustrates an example stance phase, an example swing phase, an example toe-off event and an example foot-strike event.


At a step 402, the system(s) 150 may determine a plurality of stance and swing phases represented in a time period. In an example embodiment, the stance and swing phases may be determined for the hind paws of the subject. The system(s) 150 may calculate a paw speed and may infer that a paw is in the stance phase when the speed falls below a threshold value, and may infer that the paw is in the swing phase when it exceeds that threshold value. At a step 404, the system(s) 150 may determine that the foot strike events occur at the video frame where the transition from the swing phase to the stance phase occurs. At a step 406, the system(s) 150 may determine the stride intervals represented in the time period. A stride interval may span over multiple video frames of the video data 104. The system(s) 150, for example, may determine that a time period of 10 seconds has 5 stride intervals, and that one of the 5 stride intervals is represented in 5 consecutive video frames of the video data 104. In an example embodiment, the left hind foot strike event may be defined as the event that separates/differentiates stride intervals. In another example embodiment, the right hind foot strike event may be defined as the event that separates/differentiates the stride intervals. In yet another example embodiment, a combination of the left hind foot strike event and the right hind foot strike event may be used to define the separate stride intervals. In some other embodiments, the system(s) 150 may determine the stance and swing phases for the fore paws, may calculate a paw speed based on the fore paws, and may differentiate between the stride intervals based on the right and/or left forepaw foot strike event. In some other embodiments, the transition from the stance phase to the swing phase—the toe-off event—may be used to separate/differentiate the stride intervals.


In some embodiments, it may be preferred to determine the stride intervals based on a hind foot strike event, rather than a forepaw strike event, because the keypoint inference quality (determined by the point tracker component 110) for the forepaws is, in some cases, of low confidence. This may be a result of the forepaws being occluded more often than the hind paws in a top-down view, and therefore being more difficult to locate accurately.
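
The stance/swing thresholding and foot-strike detection of steps 402 to 406 might be sketched as follows. This is a minimal illustration only: the function names, the use of frame-to-frame keypoint displacement as the paw speed, the treatment of the first frame as having zero speed, and the handling of a speed exactly equal to the threshold (counted as swing) are assumptions and are not taken from the disclosure.

```python
from math import hypot

def paw_speeds(points, fps):
    """Per-frame speed of one paw keypoint; points is a list of (x, y) per frame."""
    speeds = [0.0]  # assumption: no previous frame exists for frame 0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        speeds.append(hypot(x1 - x0, y1 - y0) * fps)
    return speeds

def foot_strike_frames(speeds, threshold):
    """Frames at which the paw passes from swing to stance (speed drops below threshold)."""
    in_stance = [s < threshold for s in speeds]
    return [i for i in range(1, len(in_stance))
            if in_stance[i] and not in_stance[i - 1]]

def stride_intervals(strikes):
    """Consecutive foot strikes of the same paw delimit one stride interval."""
    return list(zip(strikes, strikes[1:]))
```

Applied to the left hind paw, each returned pair of frame indices would correspond to one candidate stride interval, which may then be filtered at the step 408.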


At a step 408, the system(s) 150 may filter the determined stride intervals to determine which stride intervals are used to determine the metrics data 122, 132. In some embodiments, such filtering may remove spurious or low confidence stride intervals. In some embodiments, the criteria for removing the stride intervals may include, but are not limited to: low confidence keypoint estimates, physiologically unrealistic keypoint estimates, a missing right hind paw strike event, and insufficient overall body speed of the subject (e.g., a speed under 10 cm/sec).


In some embodiments, the filtering of the stride intervals may be based on a confidence level in determining the keypoints used to determine the stride intervals. For example, stride intervals determined with a confidence level below a threshold value may be removed from the set of stride intervals used to determine the metrics data 122, 132. In some embodiments, the first and last strides are removed in a continuous sequence of strides to avoid starting and stopping behaviors from adding noise to the data to be analyzed. For example, a sequence of seven strides will result in at most five strides being used for analysis. After determining the stride intervals represented in the video data 104, the system(s) 150 may determine the gait metrics and the posture metrics. FIG. 5 is a flowchart illustrating an example process 500 that may be performed by the gait analysis component 120 for determining subject gait metrics, according to embodiments of the present disclosure. The steps of the process 500 may be performed in the optional sequence shown in FIG. 5. In other embodiments, the steps of the process 500 may be performed in a different sequence. In yet other embodiments, the steps of the process 500 may be performed in parallel.
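
The stride filtering described above, including removal of the first and last strides of each continuous sequence, might be sketched as follows. The dictionary keys, the definition of a continuous sequence (each stride starting at the frame where the previous one ended), and the per-stride confidence and speed values are hypothetical and chosen only for illustration; the 10 cm/sec default follows the example speed given for the step 408.

```python
def split_sequences(strides):
    """Group strides into runs in which each stride starts where the previous one ended."""
    runs = []
    for stride in strides:
        if runs and runs[-1][-1]["end"] == stride["start"]:
            runs[-1].append(stride)
        else:
            runs.append([stride])
    return runs

def filter_strides(strides, min_confidence, min_speed=10.0):
    """Keep interior strides of each run that pass the confidence and speed checks."""
    kept = []
    for run in split_sequences(strides):
        for stride in run[1:-1]:  # drop the first and last stride of the run
            if stride["confidence"] >= min_confidence and stride["speed"] >= min_speed:
                kept.append(stride)
    return kept
```

With this sketch, a continuous sequence of seven strides yields at most five strides for analysis, consistent with the example above.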


At a step 502, the gait analysis component 120 may determine, using the point data 112, a step length for a stride interval determined to be analyzed at the step 408 shown in FIG. 4. The gait analysis component 120 may determine a step length for each stride interval for the time period. In some embodiments, the point data 112 may be for the keypoints representing a left hind paw, a left forepaw, a right hind paw and a right forepaw. In some embodiments, the step length may be a distance between the left forepaw and the right hind paw for the stride interval. In some embodiments, the step length may be a distance between the right forepaw and the left hind paw for the stride interval. In some embodiments, the step length may be a distance that the right hind paw travels past the previous left hind paw strike.


At a step 504, the gait analysis component 120 may determine, using the point data 112, a stride length for a stride interval determined to be analyzed at the step 408. The gait analysis component 120 may determine a stride length for each stride interval for the time period. In some embodiments, the point data 112 may be for the keypoints representing a left hind paw, a left forepaw, a right hind paw and a right forepaw. In some embodiments, the stride length may be a distance between the left forepaw and the left hind paw for each stride interval. In some embodiments, the stride length may be a distance between the right forepaw and the right hind paw. In some embodiments, the stride length may be the full distance that the left hind paw travels for a stride from a toe-off event to a foot-strike event.


At a step 506, the gait analysis component 120 may determine, using the point data 112, a step width for a stride interval determined to be analyzed at the step 408. The gait analysis component 120 may determine a step width for each stride interval for the time period. In some embodiments, the point data 112 may be for the keypoints representing a left hind paw, a left forepaw, a right hind paw and a right forepaw. In some embodiments, the step width is a distance between the left forepaw and the right forepaw. In some embodiments, the step width is a distance between the left hind paw and the right hind paw. In some embodiments, the step width is an averaged lateral distance separating the hind paws. This may be calculated as the length of the shortest line segment that connects the right hind paw strike to the line that connects the left hind paw's toe-off location to its subsequent foot strike position.
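
The last step-width definition above is a point-to-line distance, which might be computed as in the following sketch. The function name and the fallback for a degenerate line (toe-off and foot strike at the same location) are assumptions made for illustration.

```python
from math import hypot

def step_width(right_strike, left_toe_off, left_strike):
    """Shortest distance from the right hind paw strike to the line through
    the left hind paw's toe-off location and its subsequent foot strike."""
    (px, py), (ax, ay), (bx, by) = right_strike, left_toe_off, left_strike
    dx, dy = bx - ax, by - ay
    length = hypot(dx, dy)
    if length == 0.0:
        # Assumption: with no left paw travel, fall back to point-to-point distance.
        return hypot(px - ax, py - ay)
    # Magnitude of the 2D cross product divided by the line length.
    return abs(dx * (py - ay) - dy * (px - ax)) / length
```

For example, a right hind paw strike at (5, 3) relative to a left hind paw path from (0, 0) to (10, 0) gives a step width of 3 in the same units as the keypoint coordinates.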


At a step 508, the gait analysis component 120 may determine, using the point data 112, a paw speed for a stride interval determined to be analyzed at the step 408. The gait analysis component 120 may determine a paw speed for each stride interval for the time period. In some embodiments, the point data 112 may be for the keypoints representing a left hind paw, a right hind paw, a left forepaw, and a right forepaw. In some embodiments, the paw speed may be a speed of one of the paws during the stride interval. In some embodiments, the paw speed may be a speed of the subject and may be based on a tail base of the subject.


At a step 510, the gait analysis component 120 may determine, using the point data 112, a stride speed for a stride interval determined to be analyzed at the step 408. The gait analysis component 120 may determine a stride speed for each stride interval for the time period. In some embodiments, the point data 112 may be for the keypoints representing a tail base. In some embodiments, the stride speed may be determined by determining a set of speed data for the subject based on the movement of the subject's tail base during a set of video frames representing the stride interval. Each speed value in the set of speed data may correspond to one frame of the set of video frames. The stride speed may be calculated by averaging (or combining in another manner) the set of speed data.
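
As a minimal sketch of the averaging described above, the stride speed might be computed from the tail base keypoint as follows. The function name, the use of frame-to-frame displacement as the per-frame speed, and the `cm_per_px` calibration factor are assumptions not stated in the disclosure.

```python
from math import hypot

def stride_speed(tail_base, start, end, fps, cm_per_px=1.0):
    """Mean tail base speed over the frames start..end (inclusive) of one stride."""
    if end <= start:
        raise ValueError("a stride interval must span at least two frames")
    speeds = []
    for i in range(start + 1, end + 1):
        (x0, y0), (x1, y1) = tail_base[i - 1], tail_base[i]
        # Displacement between consecutive frames, converted to cm per second.
        speeds.append(hypot(x1 - x0, y1 - y0) * cm_per_px * fps)
    return sum(speeds) / len(speeds)
```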


At a step 512, the gait analysis component 120 may determine, using the point data 112, a limb duty factor for a stride interval determined to be analyzed at the step 408. The gait analysis component 120 may determine a limb duty factor for each stride interval for the time period. In some embodiments, the point data 112 may be for the keypoints representing a right hind paw and a left hind paw. In some embodiments, the limb duty factor for the stride interval may be an average of a first duty factor and a second duty factor. The gait analysis component 120 may determine a first stance time representing an amount of time that the right hind paw is in contact with the ground during the stride interval, and then may determine the first duty factor based on the first stance time and the length of time for the stride interval. The gait analysis component 120 may determine a second stance time representing an amount of time that the left hind paw is in contact with the ground during the stride interval, and then may determine the second duty factor based on the second stance time and the length of time for the stride interval. In other embodiments, the limb duty factor may be based on the stance time and duty factors of the forepaws.
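
The limb duty factor described above might be sketched as follows, assuming that a per-frame boolean stance flag is already available for each hind paw (for example, from the speed thresholding of the step 402). The function names and the half-open frame range are hypothetical.

```python
def duty_factor(in_stance, start, end):
    """Fraction of the stride's frames [start, end) in which the paw is in stance."""
    frames = in_stance[start:end]
    if not frames:
        raise ValueError("empty stride interval")
    return sum(frames) / len(frames)

def limb_duty_factor(left_in_stance, right_in_stance, start, end):
    """Average of the left and right hind paw duty factors for one stride."""
    left = duty_factor(left_in_stance, start, end)
    right = duty_factor(right_in_stance, start, end)
    return (left + right) / 2
```

Because the frames are evenly spaced in time, the ratio of stance frames to total frames equals the ratio of stance time to stride time.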


At a step 514, the gait analysis component 120 may determine, using the point data 112, an angular velocity for a stride interval determined to be analyzed at the step 408. The gait analysis component 120 may determine an angular velocity for each stride interval for the time period. In some embodiments, the point data 112 may be for the keypoints representing a tail base and a neck base. The gait analysis component 120 may determine a set of vectors connecting the tail base and the neck base, where each vector in the set corresponds to a frame of a set of frames for the stride interval. The gait analysis component 120 may determine the angular velocity based on the set of vectors. The vectors may represent an angle of the subject, and a first derivative of the angle value may be the angular velocity for the frame. In some embodiments, the gait analysis component 120 may determine a stride angular velocity by averaging the angular velocities for the frames for the stride intervals.
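
The angular velocity computation above might be sketched as follows. The function names are hypothetical, and two details are assumptions: the angle is taken with `atan2` on the tail-base-to-neck-base vector, and frame-to-frame angle differences are wrapped into the range from -pi to pi so that a turn across the ±180 degree boundary is not counted as a near-full rotation.

```python
from math import atan2, pi

def wrap(delta):
    """Wrap an angle difference (radians) into [-pi, pi)."""
    return (delta + pi) % (2 * pi) - pi

def heading_angles(tail_base, neck_base):
    """Angle of the vector from tail base to neck base in each frame."""
    return [atan2(ny - ty, nx - tx)
            for (tx, ty), (nx, ny) in zip(tail_base, neck_base)]

def stride_angular_velocity(tail_base, neck_base, fps):
    """Mean first derivative of the heading angle over one stride, in rad/s."""
    angles = heading_angles(tail_base, neck_base)
    if len(angles) < 2:
        raise ValueError("need at least two frames")
    rates = [wrap(b - a) * fps for a, b in zip(angles, angles[1:])]
    return sum(rates) / len(rates)
```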



FIG. 6 is a flowchart illustrating an example process 600 that may be performed by the posture analysis component 130 for determining subject posture metrics, according to embodiments of the present disclosure. At a high level, the posture analysis component 130 may determine lateral displacements of a nose, a tail tip and a tail base on the subject for individual stride intervals. Based on the lateral displacements of the nose, the tail tip, and the tail base, the posture analysis component 130 may determine a displacement phase offset of each respective body part. In that respect, the steps of the process 600 may be performed in a different sequence than that shown in FIG. 6. For example, the posture analysis component 130 may determine the lateral displacement of the nose and the nose displacement phase offset after, or in parallel with, determining the lateral displacement of the tail tip and the tail tip displacement phase offset.


To determine the lateral displacements, the posture analysis component 130 may first, at a step 602, determine using the point data 112, a displacement vector for a stride interval determined to be analyzed at the step 408. The posture analysis component 130 may determine the displacement vector for each stride interval for the time period. In some embodiments, the point data 112 may be for the keypoints representing a spine center of the subject. The stride interval may span over multiple video frames. In some embodiments, the displacement vector may be a vector connecting the spine center in a first video frame of the stride interval and the spine center in the last video frame of the stride interval.


At a step 604, the posture analysis component 130 may determine, using the point data 112 and the displacement vector (from the step 602), a lateral displacement of the subject nose for the stride interval. The posture analysis component 130 may determine the lateral displacement of the nose for each stride interval for the time period. In some embodiments, the point data 112 may be for the keypoints representing a spine center and a nose of the subject. In some embodiments, the posture analysis component 130 may determine a set of lateral displacements of the nose, where each lateral displacement of the nose may correspond to a video frame of the stride interval. The lateral displacement may be a perpendicular distance of the nose, in the respective video frame, from the displacement vector for the stride interval. In some embodiments, the posture analysis component 130 may subtract the minimum distance from the maximum distance and divide that by the subject body length so that the displacement measured in larger subjects may be comparable to the displacement measured in smaller subjects.
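
The displacement vector of the step 602 and the normalized lateral displacement of the step 604 might be sketched as follows. The function names are hypothetical, and the use of a signed perpendicular distance (positive on one side of the displacement vector, negative on the other) is an assumption made so that the maximum-minus-minimum range captures the full side-to-side excursion.

```python
from math import hypot

def signed_perp_distance(p, a, b):
    """Signed perpendicular distance of point p from the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = hypot(dx, dy)
    if length == 0.0:
        raise ValueError("displacement vector has zero length")
    return (dx * (p[1] - a[1]) - dy * (p[0] - a[0])) / length

def lateral_displacement(keypoint, spine_center, body_length):
    """Per-frame offsets of a keypoint from the stride's displacement vector,
    plus the (max - min) range normalized by the subject's body length."""
    a, b = spine_center[0], spine_center[-1]  # first and last frame of the stride
    offsets = [signed_perp_distance(p, a, b) for p in keypoint]
    return offsets, (max(offsets) - min(offsets)) / body_length
```

The same helper could be reused with the tail base or tail tip keypoints in place of the nose.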


At a step 606, the posture analysis component 130 may determine, using the set of lateral displacements of the nose for the stride interval, a nose displacement phase offset. The posture analysis component 130 may perform an interpolation using the set of lateral displacements of the nose to generate a smooth curve lateral displacement of the nose for the stride interval, then may determine, using the smooth curve lateral displacement of the nose, when a maximum displacement of the nose occurs during the stride interval. The posture analysis component 130 may determine a percent stride location representing a percent of the stride interval that is completed when the maximum displacement of the nose occurs. In some embodiments, the posture analysis component 130 may perform a cubic spline interpolation in order to generate the smooth curve for the displacement, and because of the cubic interpolation the maximum displacement may occur at time points between video frames.
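
The phase offset computation might be sketched as follows. To keep the example dependency-free, it uses uniform Catmull-Rom cubic interpolation as a stand-in for the cubic spline mentioned above (a library routine such as `scipy.interpolate.CubicSpline` would be the more usual choice); the function names, the end-point handling, and the sampling density are likewise assumptions.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Uniform Catmull-Rom cubic between p1 (t=0) and p2 (t=1)."""
    return 0.5 * (2 * p1 + (p2 - p0) * t
                  + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t ** 2
                  + (3 * p1 - p0 - 3 * p2 + p3) * t ** 3)

def phase_offset(displacements, samples_per_frame=20):
    """Percent of the stride completed when the interpolated displacement peaks."""
    n = len(displacements)
    if n < 2:
        raise ValueError("need at least two frames")
    best_pos, best_val = 0.0, displacements[0]
    for i in range(n - 1):
        p0 = displacements[max(i - 1, 0)]      # repeat the end value at the start
        p1, p2 = displacements[i], displacements[i + 1]
        p3 = displacements[min(i + 2, n - 1)]  # repeat the end value at the end
        for k in range(1, samples_per_frame + 1):
            t = k / samples_per_frame
            value = catmull_rom(p0, p1, p2, p3, t)
            if value > best_val:
                best_pos, best_val = i + t, value
    return 100.0 * best_pos / (n - 1)
```

Because the curve is sampled between frames, the peak may fall at a fractional frame position, as noted above for the cubic interpolation.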


At a step 608, the posture analysis component 130 may determine, using the point data 112 and the displacement vector (from the step 602), a lateral displacement of the subject tail base for the stride interval. The posture analysis component 130 may determine the lateral displacement of the tail base for each stride interval for the time period. In some embodiments, the point data 112 may be for the keypoints representing a spine center and a tail base of the subject. In some embodiments, the posture analysis component 130 may determine a set of lateral displacements of the tail base, where each lateral displacement of the tail base may correspond to a video frame of the stride interval. The lateral displacement may be a perpendicular distance of the tail base, in the respective video frame, from the displacement vector for the stride interval. In some embodiments, the posture analysis component 130 may subtract the minimum distance from the maximum distance and divide that by the subject body length so that the displacement measured in larger subjects may be comparable to the displacement measured in smaller subjects.


At a step 610, the posture analysis component 130 may determine, using the set of lateral displacements of the tail base for the stride interval, a tail base displacement phase offset. The posture analysis component 130 may perform an interpolation using the set of lateral displacements of the tail base to generate a smooth curve lateral displacement of the tail base for the stride interval, then may determine, using the smooth curve lateral displacement of the tail base, when a maximum displacement of the tail base occurs during the stride interval. The posture analysis component 130 may determine a percent stride location representing a percent of the stride interval that is completed when the maximum displacement of the tail base occurs. In some embodiments, the posture analysis component 130 may perform a cubic spline interpolation in order to generate the smooth curve for the displacement, and because of the cubic interpolation the maximum displacement may occur at time points between video frames.


At a step 612, the posture analysis component 130 may determine, using the point data 112 and the displacement vector (from the step 602), a lateral displacement of the subject tail tip for the stride interval. The posture analysis component 130 may determine the lateral displacement of the tail tip for each stride interval for the time period. In some embodiments, the point data 112 may be for the keypoints representing a spine center and a tail tip of the subject. In some embodiments, the posture analysis component 130 may determine a set of lateral displacements of the tail tip, where each lateral displacement of the tail tip may correspond to a video frame of the stride interval. The lateral displacement may be a perpendicular distance of the tail tip, in the respective video frame, from the displacement vector for the stride interval. In some embodiments, the posture analysis component 130 may subtract the minimum distance from the maximum distance and divide that by the subject body length so that the displacement measured in larger subjects may be comparable to the displacement measured in smaller subjects.


At a step 614, the posture analysis component 130 may determine, using the set of lateral displacements of the tail tip for the stride interval, a tail tip displacement phase offset. The posture analysis component 130 may perform an interpolation using the set of lateral displacements of the tail tip to generate a smooth curve lateral displacement of the tail tip for the stride interval, then may determine, using the smooth curve lateral displacement of the tail tip, when a maximum displacement of the tail tip occurs during the stride interval. The posture analysis component 130 may determine a percent stride location representing a percent of the stride interval that is completed when the maximum displacement of the tail tip occurs. In some embodiments, the posture analysis component 130 may perform a cubic spline interpolation in order to generate the smooth curve for the displacement, and because of the cubic interpolation the maximum displacement may occur at time points between video frames.


In reference to the statistical analysis component 140 of the system(s) 150, the statistical analysis component 140 may take as input the metrics data 122 (determined by the gait analysis component 120) and the metrics data 132 (determined by the posture analysis component 130). In some embodiments of the invention the statistical analysis component 140 may only take the metrics data 122, based on the system being configured for processing gait metrics data only. In other embodiments, the statistical analysis component 140 may only take the metrics data 132 based on the system being configured for processing posture metrics data only.


Subject body size and subject speed can affect the gait and/or posture of the subject. For example, a subject that moves faster will have a different gait than a subject that moves slowly. As a further example, a subject with a larger body will have a different gait than a subject with a smaller body. However, in some cases a difference (as compared to a control subject gait) in stride speed may be a defining feature of gait and posture changes due to genetic or pharmacological perturbation. The system(s) 150 collects multiple repeated measurements for each subject (via the video data 104 and a subject in an open area), and each subject has a different number of strides, giving rise to imbalanced data. Averaging over repeated strides, which yields one average value per subject, may be misleading as it removes variation and introduces false confidence. At the same time, classical linear models do not discriminate between stable intra-subject variations and inter-subject fluctuations, which can bias the statistical analysis. To address these issues, the statistical analysis component 140, in some embodiments, employs a linear mixed model(s) (LMM) to dissociate within-subject variation from genotype-based variation between subjects. In some embodiments, the statistical analysis component 140 may capture the main effects such as subject size, genotype, and age, and may additionally capture a random effect for the intra-subject variation. The techniques of the invention collect multiple repeated measurements at different ages of the subject, giving rise to a nested hierarchical data structure. Example statistical models implemented at the statistical analysis component 140 are shown below as models M1, M2 and M3. These models follow the standard LMM notation with (Genotype, BodyLength, Speed, TestAge) denoting the fixed effects and (MouseID/TestAge) (where the test age is nested within the subject) denoting the random effect.

    • M1: Phenotype ~ Genotype + TestAge + BodyLength + (1 | MouseID/TestAge)
    • M2: Phenotype ~ Genotype + TestAge + Speed + (1 | MouseID/TestAge)
    • M3: Phenotype ~ Genotype + TestAge + Speed + BodyLength + (1 | MouseID/TestAge)
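
For illustration, model M3 might be written out as below. This is one reading under stated assumptions: the index i denotes the mouse, j the test age within that mouse, and k the stride; the nested term (1 | MouseID/TestAge) is expanded in the usual mixed-model-formula way into a per-mouse intercept plus a per-mouse-per-age intercept; Genotype is coded as an indicator variable; and the Gaussian distributions are the standard LMM assumption rather than something stated in the disclosure.

```latex
\begin{aligned}
\text{Phenotype}_{ijk} &= \beta_0 + \beta_1\,\text{Genotype}_{i} + \beta_2\,\text{TestAge}_{ij}
  + \beta_3\,\text{Speed}_{ijk} + \beta_4\,\text{BodyLength}_{ijk}
  + u_i + v_{ij} + \varepsilon_{ijk}, \\
u_i &\sim \mathcal{N}(0, \sigma_u^2), \qquad
v_{ij} \sim \mathcal{N}(0, \sigma_v^2), \qquad
\varepsilon_{ijk} \sim \mathcal{N}(0, \sigma^2)
\end{aligned}
```

Models M1 and M2 follow by dropping the Speed term or the BodyLength term, respectively.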


The model M1 takes age and body length as inputs, the model M2 takes age and speed as inputs, and the model M3 takes age, speed and body length as inputs. In some embodiments, the models of the statistical analysis component 140 do not include subject sex as an effect because the sex may be highly correlated with the body length/size of the subject. In other embodiments, the models of the statistical analysis component 140 may take subject sex as an input. Using the point data 112 (determined by the point tracker component 110) enables determination of subject body size and speed for these models. Therefore, no additional measurements are needed to obtain these variables for the models.


One or more of the data included in the metrics data 122, 132 may be circular variables (e.g., stride length, angular velocities, etc.), and the statistical analysis component 140 may model such circular variables as a function of linear variables using a circular-linear regression model. The linear variables, such as body length and speed, may be included as covariates in the model. In some embodiments, the statistical analysis component 140 may implement a multivariate outlier detection algorithm at the individual subject level to identify subjects with injuries and developmental effects.


The statistical analysis component 140 may, in some embodiments, also implement a linear discriminant analysis that processes the metrics data 122, 132 with respect to the control data 144 and outputs the difference data 148. The linear discriminant analysis allows for quantitatively distinguishing between the subject's gait and/or posture metrics and a control subject's gait and/or posture metrics.


Stitching Video Feeds

In some embodiments, the video data 104 may be generated using multiple video feeds capturing movements of the subject from multiple different angles/views. The video data 104 may be generated by stitching/combining a first video of a top view of the subject and a second video of a side view of the subject. The first video may be captured using a first image capture device (e.g., device 101a) and the second video may be captured using a second image capture device (e.g., device 101b). Other views of the subject may include a right side view, a left side view, a top-down view, a bottom-up view, a front side view, a back side view, and other views. Videos from these different views may be combined to generate the video data 104 to provide a comprehensive/expansive view of the subject's movements that may result in more accurate and/or efficient classification of subject behavior by the automated phenotyping system. In some embodiments, videos from different views may be combined to provide a wide field of view with a short focal distance, while preserving a top-down perspective over the entirety of the view. In some embodiments, the multiple videos from different views may be processed using one or more ML models (e.g., neural networks) to generate the video data 104. In some embodiments, the system may generate 3D video data using 2D video/images.


In some embodiments, the videos captured by the multiple image capture devices 101 may be synced using various techniques. For example, the multiple image capture devices 101 may be synced to a central clock system and controlled by a master node. Synchronization of multiple video feeds may involve the use of various hardware and software such as an adapter, a multiplexer, USB connections between the image capture devices, wireless or wired connections to the network(s) 199, software to control the devices (e.g., MotionEyeOS), etc.


In an example embodiment, the image capture device 101 may be an ultra-wide-angle lens (i.e., a FishEye lens) that produces strong visual distortion intended to create a wide panoramic or hemispherical image, and capable of achieving extremely wide angles of view. In an example implementation, the system to capture the videos for video data 104 may include 4 FishEye lens cameras connected to 4 single-board computing devices (e.g., a Raspberry Pi), and an additional image capture device to capture a top-down view. The system may synchronize these components using various techniques. One technique involves pixel/spatial interpolation. For example, where a point-of-interest (e.g., a body part on the subject) is located at (x, y), the system identifies, with respect to time, a position within the top-down view video along the x and y axes. In an example, the pixel interpolation for the x-axis may be calculated by the single-board computing device per the following equation:





$$x = \frac{\Delta X_{\text{Pi offset}}}{\Delta T_{\text{Pi offset}}} \times \Delta T_{\text{top-down view offset}} + x_{\text{initial}}$$


The equation then may be used to calculate the point-of-interest position for the y axis. In some embodiments, to address lens distortion during video calibration, padding may be added to one or more video feeds (instead of scaling the video feed).


Subjects

Some aspects of the invention include use of gait and posture analysis methods with a subject. As used herein, the term “subject” may refer to a human, non-human primate, cow, horse, pig, sheep, goat, dog, cat, bird, rodent, or other suitable vertebrate or invertebrate organism. In certain embodiments of the invention, a subject is a mammal and in certain embodiments of the invention, a subject is a human. In some embodiments, a subject used in a method of the invention is a rodent, including but not limited to a mouse, rat, gerbil, hamster, etc. In some embodiments of the invention, a subject is a normal, healthy subject and in some embodiments, a subject is known to have, is at risk of having, or is suspected of having a disease or condition. In certain embodiments of the invention, a subject is an animal model for a disease or condition. For example, though not intended to be limiting, in some embodiments of the invention a subject is a mouse that is an animal model for autism.


As a non-limiting example, a subject assessed with a method and system of the invention may be a subject that is an animal model for a condition such as a model for one or more of: psychiatric illness, neurodegenerative illness, neuromuscular illness, autism spectrum disorder, schizophrenia, bipolar disorder, Alzheimer's disease, Rett syndrome, ALS, and Down syndrome.


In some embodiments of the invention, a subject is a wild-type subject. As used herein the term “wild-type” refers to the phenotype and/or genotype of the typical form of a species as it occurs in nature. In certain embodiments of the invention a subject is a non-wild-type subject, for example, a subject with one or more genetic modifications compared to the wild-type genotype and/or phenotype of the subject's species. In some instances, a genotypic/phenotypic difference of a subject compared to wild-type results from a hereditary (germline) mutation or an acquired (somatic) mutation. Factors that may result in a subject exhibiting one or more somatic mutations include but are not limited to: environmental factors, toxins, ultraviolet radiation, a spontaneous error arising in cell division, a teratogenic event such as but not limited to radiation, maternal infection, chemicals, etc.


In certain embodiments of methods of the invention, a subject is a genetically modified organism, also referred to as an engineered subject. An engineered subject may include a pre-selected and/or intentional genetic modification and as such exhibits one or more genotypic and/or phenotypic traits that differ from the traits in a non-engineered subject. In some embodiments of the invention, routine genetic engineering techniques can be used to produce an engineered subject that exhibits genotypic and/or phenotypic differences compared to a non-engineered subject of the species. As a non-limiting example, a subject may be a genetically engineered mouse in which a functional gene product is missing or is present at a reduced level; a method or system of the invention can be used to assess the genetically engineered mouse phenotype, and the results may be compared to results obtained from a control (control results).


In some embodiments of the invention, a subject may be monitored using a gait level determining method or system of the invention and the presence or absence of an activity disorder or condition can be detected. In certain embodiments of the invention, a test subject that is an animal model of an activity and/or movement condition may be used to assess the test subject's response to the condition. In addition, a test subject that is an animal model of a movement and/or activity condition may be administered a candidate therapeutic agent or method, monitored using a gait monitoring method and/or system of the invention and results can be used to determine an efficacy of the candidate therapeutic agent to treat the condition. The terms “activity” and “action” may be used interchangeably herein.


As described elsewhere herein, trained models of the invention may be configured to detect behavior of a subject, regardless of the subject's physical characteristics. In some embodiments of the invention, one or more physical characteristics of a subject may be pre-identified characteristics. For example, though not intended to be limiting, a pre-identified physical characteristic may be one or more of: a body shape, a body size, a coat color, a gender, an age, and a phenotype of a disease or condition.


Controls and Candidate Compound Testing and Screening

Results obtained for a subject using the method or system of the invention can be compared to control results. Methods of the invention can also be used to assess a difference in a phenotype in a subject versus a control. Thus, some aspects of the invention provide methods of determining the presence or absence of a change in an activity in a subject compared to a control. Some embodiments of the invention include using gait and posture analysis of the invention to identify phenotypic characteristics of a disease or condition.


Results obtained using the method or system of the invention can be advantageously compared to a control. In some embodiments of the invention, one or more subjects can be assessed using an automated gait analysis method followed by retesting the subjects following administration of a candidate therapeutic compound to the subject(s). The term “test” subject may be used herein in relation to a subject that is assessed using a method or system of the invention. In certain embodiments of the invention, a result obtained using an automated gait analysis method to assess a test subject is compared to results obtained from the automated gait analysis methods performed on other test subjects. In some embodiments of the invention, a test subject's results are compared to results of the automated gait analysis method performed on the test subject at a different time. In some embodiments of the invention, a result obtained using an automated gait analysis method to assess a test subject is compared to a control result.


A control value may be a value obtained from testing a plurality of subjects using a gait analysis method of the invention. As used herein a control result may be a predetermined value, which can take a variety of forms. It can be a single cut-off value, such as a median or mean. It can be established based upon comparative groups, such as subjects that have been assessed using an automated gait analysis method of the invention under similar conditions as the test subject, wherein the test subject is administered a candidate therapeutic agent and the comparative group has not been contacted with the candidate therapeutic agent. Another example of comparative groups may include subjects known to have a disease or condition and groups without the disease or condition. Another comparative group may be subjects with a family history of a disease or condition and subjects from a group without such a family history. A predetermined value can be arranged, for example, where a tested population is divided equally (or unequally) into groups based on results of testing. Those skilled in the art are able to select appropriate control groups and values for use in comparative methods of the invention. Non-limiting examples of types of candidate compounds include chemicals, nucleic acids, proteins, small molecules, antibodies, etc.


A subject assessed using an automated gait analysis method or system of the invention may be monitored for the presence or absence of a change that occurs in a test condition versus a control condition. As non-limiting examples, in a subject, a change that occurs may include, but is not limited to, one or more of: a frequency of movement, a response to an external stimulus, etc. Methods and systems of the invention can be used with test subjects to assess the effects of a disease or condition of the test subject and can be used to assess efficacy of candidate therapeutic agents to treat a disease or condition. As a non-limiting example of use of a method of the invention to assess the presence or absence of a change in a test subject as a means to identify efficacy of a candidate therapeutic agent, a test subject known to be an animal model of a disease such as autism is assessed using an automated gait analysis method of the invention. The test subject is administered a candidate therapeutic agent and assessed again using the automated gait analysis method. The presence or absence of a change in the test subject's results indicates a presence or absence, respectively, of an effect of the candidate therapeutic agent on the autism in the test subject. Diseases and conditions that can be assessed using a gait analysis method of the invention include, but are not limited to: ALS, autism, Down syndrome, Rett syndrome, bipolar disorder, dementia, depression, a hyperkinetic disorder, an anxiety disorder, a developmental disorder, a sleep disorder, Alzheimer's disease, Parkinson's disease, a physical injury, etc.


It will be understood that in some embodiments of the invention, a test subject may serve as its own control, for example by being assessed two or more times using an automated gait analysis method of the invention and comparing the results obtained at two or more of the different assessments. Methods and systems of the invention can be used to assess progression or regression of a disease or condition in a subject, by identifying and comparing changes in gait characteristics in a subject over time using two or more assessments of the subject using an embodiment of a method or system of the invention.


Diseases and Disorders

Methods and systems of the invention can be used to assess activity and/or behavior of a subject known to have, suspected of having, or at risk of having a disease or condition. In some embodiments, the disease and/or condition is one associated with an abnormal level of an activity or behavior. In a non-limiting example, a test subject that may be a subject with anxiety or a subject that is an animal model of anxiety may have one or more activities or behaviors that are associated with anxiety that can be detected using an embodiment of a method of the invention. Results of assessing the test subject can be compared to control results of the assessment, for example of a control subject that does not have anxiety, a control subject that is not a subject that is an animal model of anxiety, a control standard obtained from a plurality of subjects without the condition, etc. Differences in the results of the test subject and the control can be compared. Some embodiments of methods of the invention can be used to identify subjects that have a disease or condition that is associated with abnormal activity and/or behavior.


Onset, progression, and/or regression of a disease or a condition associated with an abnormal activity and/or behavior can also be assessed and tracked using embodiments of methods of the invention. For example, in certain embodiments of methods of the invention, 2, 3, 4, 5, 6, 7, or more assessments of an activity and/or behavior of a subject are carried out at different times. A comparison of two or more of the results of the assessments made at different times can show differences in the activity and/or behavior of the subject. An increase in a determined level or type of an activity may indicate onset and/or progression in the subject of a disease or condition associated with the assessed activity. A decrease in a determined level or type of an activity may indicate regression in the subject of a disease or condition associated with the assessed activity. A determination that an activity has ceased in a subject may indicate the cessation in the subject of the disease or condition associated with the assessed activity.


Certain embodiments of methods of the invention can be used to assess efficacy of a therapy to treat a disease or condition associated with abnormal activity and/or behavior. For example, a test subject may be administered a candidate therapy and methods of the invention used to determine in the subject, a presence or absence of a change in activity associated with the disease or condition. A reduction in an abnormal activity following administration of a candidate therapy may indicate efficacy of the candidate therapy against the disease or condition.


As indicated elsewhere herein, a gait analysis method of the invention may be used to assess a disease or condition in a subject and may also be used to assess animal models of diseases and conditions. Numerous different animal models for diseases and conditions are known in the art, including but not limited to numerous mouse models. A subject assessed with a system and/or method of the invention may be a subject that is an animal model for a disease or condition such as, but not limited to: neurodegenerative disorders, neuromuscular disorders, neuropsychiatric disorders, ALS, autism, Down syndrome, Rett syndrome, bipolar disorder, dementia, depression, a hyperkinetic disorder, an anxiety disorder, a developmental disorder, a sleep disorder, Alzheimer's disease, Parkinson's disease, a physical injury, etc. Additional models of diseases and disorders that may be assessed using a method and/or system of the invention are known in the art, see for example: Barrot M. Neuroscience 2012; 211: 39-50; Graham, D. M., Lab Anim (NY) 2016; 45: 99-101; Sewell, R. D. E., Ann Transl Med 2018; 6: S42. 2019/01/08; and Jourdan, D., et al., Pharmacol Res 2001; 43: 103-110, the contents of which are incorporated herein by reference in their entirety.


In addition to testing subjects with known diseases or disorders, methods of the invention may also be used to assess new genetic variants, such as engineered organisms. Thus, methods of the invention can be used to assess an engineered organism for one or more characteristics of a disease or condition. In this manner, new strains of organisms, such as new mouse strains can be assessed and the results used to determine whether the new strain is an animal model for a disease or disorder.


EXAMPLES
Example 1. Model Development: Data Training, Testing, and Model Validation

Methods


Training Data

Labeled data consisted of 8,910 480×480 grayscale frames, each containing a single mouse in the open field, along with twelve manually labeled pose keypoints per frame. Strains were selected from a diverse set of mouse strains with different appearances, accounting for variation in coat color, body size and obesity. FIG. 8C shows a representative frame generated by the open field apparatus. The frames were generated from the same open field apparatus as was used to generate experimental data previously (Geuther, B. Q. et al., Commun Biol (2019) 2:1-11). Pose keypoint annotations were performed by several Kumar lab members. Frame images and keypoint annotations were stored together using an HDF5 format, which was used for neural network training. Frame annotations were split into a training dataset (7,910 frames) and a validation dataset (1,000 frames) for training.


Neural Network Training

The network was trained over 600 epochs and validations were performed at the end of every epoch. The training loss curves (FIG. 8C) show a fast convergence of the training loss without an overfitting of the validation loss. Transfer learning (Weiss, K. et al., J Big Data (2016) 3:9; Tan, C. et al. 27th Intl Conference on Artificial Neural Networks (2018), 270-279, arXiv:1808.01974 [cs.LG]) was used on the network in order to minimize the labeling requirements and improve the generality of the model. Initially, the ImageNet model provided by the authors of the HRNet paper (hrnet_w32-36af842e.pth) was used, and the weights were frozen up to the second stage during training. In order to further improve the generality of the network, several data augmentation techniques were employed during training, including: rotation, flipping, scaling, brightness, contrast and occlusion. The ADAM optimizer was used to train the network. The learning rate was initially set to 5×10−4, then reduced to 5×10−5 at the 400th epoch and 5×10−6 at the 500th epoch.
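The step schedule for the learning rate can be written as a small function. This is a minimal sketch that reads the three rates as 5×10−4, 5×10−5 and 5×10−6 and assumes epochs are counted from 1; the function name `learning_rate` is hypothetical and is not taken from the training code.

```python
def learning_rate(epoch: int) -> float:
    """Piecewise-constant learning rate for a 1-indexed epoch number.

    Assumption: the reduced rate applies starting at the named epoch,
    i.e. epoch 400 already uses 5e-5 and epoch 500 already uses 5e-6.
    """
    if epoch < 400:
        return 5e-4
    if epoch < 500:
        return 5e-5
    return 5e-6


# Rates at a few representative epochs of the 600-epoch run.
schedule = {epoch: learning_rate(epoch) for epoch in (1, 399, 400, 500, 600)}
```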


Statistical Analysis

The following LMM model was considered for repeated measurements:






$$ y_{ij} = x_{ij}^{T}\beta + \gamma_i + \varepsilon_{ij}, \quad i = 1, \ldots, n, \quad j = 1, \ldots, n_i $$


where n is the total number of subjects; yij is the jth repeat measurement on the ith subject, ni denotes the number of repeat measurements on subject i; xij is a p×1 vector of covariates such as body length, speed, genotype, age; β is a p×1 vector of unknown fixed population-level effects; γi is a random intercept, which describes subject-specific deviation from the population mean effect; and εij is the error term that describes the intrasubject variation of the ith subject that is assumed to be independent of the random effect. To test fixed effects and get p-values, the F test with Satterthwaite's approximation to the denominator degrees of freedom was used. The LMM models were fit using the lme4 package in R (Bates, D. et al., J Stat Softw (2015) 67:1-48).


The circular phase variables in FIG. 14A were modeled as a function of linear variables using a circular-linear regression model. Analyzing circular data is not straightforward and statistical models developed for linear data do not apply to circular data [Calínski, T. & Harabasz, Communications in Statistics-theory and Methods 3, 1-27 (1974)]. The circular response variables were assumed to have been drawn from a von-Mises distribution with unknown mean direction μ and concentration parameter κ. The mean direction parameter was related to the variables X through the equation:






$$ Y_i \sim \mathrm{vonMises}(\mu_i, \kappa), \quad \mu_i = \mu + g(\gamma_1 X_1 + \cdots + \gamma_p X_p), \quad i = 1, \ldots, n $$


where g(u)=2 tan−1(u) is a link function such that for −∞<u<∞, −π<g(u)<π. The parameters μ, γ1, . . . , γp and κ were estimated via maximum likelihood. The model was fitted using the circular package in R [Tibshirani, R. et al. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 63, 411-423 (2001)].
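The role of the link function can be illustrated numerically. The sketch below only evaluates the mean direction for given coefficients; it does not perform the maximum likelihood fit, and the function names and example coefficient values are hypothetical.

```python
import math
from typing import Sequence


def link(u: float) -> float:
    """Link function g(u) = 2 * atan(u), mapping the real line into (-pi, pi)."""
    return 2.0 * math.atan(u)


def mean_direction(mu: float, gammas: Sequence[float], xs: Sequence[float]) -> float:
    """Mean direction mu_i = mu + g(gamma_1 * x_1 + ... + gamma_p * x_p)."""
    linear_predictor = sum(g * x for g, x in zip(gammas, xs))
    return mu + link(linear_predictor)


# Hypothetical coefficients for two covariates (for example speed and body length).
example = mean_direction(mu=0.5, gammas=[0.02, -0.1], xs=[25.0, 8.0])
```

Because g is bounded, arbitrarily large covariate values can shift the mean direction by at most half a turn in either direction.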


Animals

Animal strains used in experiments are shown in FIG. 14B-D.


Experimental Descriptions and Results

The approach to gait and posture analysis was composed of several modular components. At the base of the toolkit was a deep convolutional neural network that had been trained to perform pose estimation on top-down video of an open field. This network provided twelve two-dimensional markers of mouse anatomical location, or “keypoints”, for each frame of video, describing the pose of the mouse at each time point. Downstream components were also developed that were capable of processing the time series of poses and identifying intervals that represent individual strides. These strides formed the basis of almost all of the phenotypic and statistical analyses that followed. The methods permit extraction of several important gait metrics on a per-stride basis because pose information was obtained for each stride interval (see FIG. 14A for a list of metrics). This resulted in significant power to perform statistical analysis on stride metrics and allowed aggregation of large amounts of data in order to provide consensus views of the structure of mouse gait.


Pose Estimation

Pose estimation located the 2D coordinates of a pre-defined set of keypoints in an image or video, and was a foundation of methods for quantifying and analyzing gait. The selected pose keypoints were either visually salient, such as ears or nose, or captured important information for understanding pose, such as limb joints or paws. Twelve keypoints were selected to capture mouse pose: nose, left ear, right ear, base of neck, left forepaw, right forepaw, mid spine, left hind paw, right hind paw, base of tail, mid tail and tip of tail (FIG. 7B).


Much effort has been spent developing and refining pose estimation techniques for human pose (Moeslund, T. B. et al., Comput Vis Image Underst (2006) 104:90-126; Dang, Q. et al., Tsinghua Sci Technol (2019) 24:663-676). Traditional approaches to pose estimation relied on techniques such as the use of local body part detectors and modeling of skeletal articulation. These approaches were limited in their ability to overcome complicating factors such as complex configurations and body part occlusion. Some of these shortcomings were addressed by DeepPose, a deep neural network developed for pose estimation (Toshev, A. & Szegedy, C., Proc IEEE Conf Comp Vis Pattern Recognit (2014), 1653-1660). DeepPose was able to demonstrate improvements on the state-of-the-art performance for pose estimation using several benchmarks. After the publication of DeepPose, the majority of successful work on pose estimation leveraged deep convolutional neural network architectures. Some prominent examples include: DeeperCut (Insafutdinov, E., et al., European Conference on Computer Vision (2016), 34-50), Stacked Hourglass Networks (Newell, A. et al., European Conference on Computer Vision (2016), 483-499), and Deep High-Resolution architecture (HRNet) (Sun, K. et al., Proc IEEE Conf Comp Vis Pattern Recognit (2019), 5693-5703). Some concepts used in high performance pose estimation architectures developed for human pose estimation were considered in the development of the rodent pose estimation methods included in methods of the invention.


There were several important considerations on which the rodent pose estimation architecture selection was based.

    • High accuracy and precision for pose inference: the gait inference method is sensitive to errors in pose estimation so it is desirable to reduce those errors as much as possible
    • Speed of inference: should be able to infer at or near real time speeds (30 fps) on a modern high end GPU
    • Simplicity and generality of architecture to facilitate modification and extension.
    • Fixed scale inference: because all of the images are at fixed scale, approaches that are designed to work at multiple scales waste network capacity and inference time.
    • Available open source implementation
    • Modularity of architecture in order to facilitate potential future upgrades.


Based on these criteria, the HRNet architecture (Sun, K. et al., Proc IEEE Conf Comp Vis Pattern Recognit (2019), 5693-5703) was selected for the network and it was modified for the experimental setup. The main differentiator of this architecture is that it maintains high-resolution features throughout the network stack, thereby preserving spatial precision (FIG. 7A). HRNet showed highly competitive performance in terms of both GPU efficiency and pose accuracy. The interface was also highly modular and is expected to allow for relatively simple network upgrades if needed. The smaller HRNet-W32 architecture was used rather than HRNet-W48 because it was shown to provide significant speed and memory improvements for only a small reduction in accuracy. Two 5×5 transpose convolutions were added to the head of the network to match the heatmap output resolution with the resolution of the video input (FIG. 7B). Because all of the experiments had a single mouse in an open field, it was not necessary to rely on object detection for instancing. Thus, this step was eliminated from the inference algorithm, which also led to clear runtime performance benefits. Instead of performing pose estimation after object detection, full resolution pose keypoint heatmaps were used to infer the posture of a single mouse at every frame. This means that for each 480×480 frame of video, twelve 480×480 heatmaps were generated (one heatmap per keypoint). The maximum value in each heatmap represented the highest confidence location for each respective point. Thus, taking the argmax of each of the twelve heatmaps resulted in twelve (x, y) coordinates.
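The heatmap decoding step can be sketched in a few lines. The example below is a plain-Python illustration on tiny heatmaps rather than the network's actual 480×480 output; the function names are hypothetical, and treating the peak heatmap value as the keypoint confidence is an assumption of this sketch.

```python
from typing import List, Tuple

Heatmap = List[List[float]]  # rows are indexed by y, columns by x


def heatmap_argmax(heatmap: Heatmap) -> Tuple[int, int, float]:
    """Return (x, y, peak value) of the highest-valued cell in one heatmap."""
    best_x, best_y, best_val = 0, 0, heatmap[0][0]
    for y, row in enumerate(heatmap):
        for x, val in enumerate(row):
            if val > best_val:
                best_x, best_y, best_val = x, y, val
    return best_x, best_y, best_val


def decode_pose(heatmaps: List[Heatmap]) -> List[Tuple[int, int, float]]:
    """Decode one (x, y, peak value) triple per keypoint heatmap."""
    return [heatmap_argmax(h) for h in heatmaps]


# Two toy 3x4 heatmaps standing in for two of the twelve keypoints.
toy_heatmaps = [
    [[0.0, 0.1, 0.0, 0.0],
     [0.0, 0.0, 0.9, 0.0],
     [0.0, 0.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0],
     [0.7, 0.2, 0.0, 0.0]],
]
pose = decode_pose(toy_heatmaps)
```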


In order to train the network, it was necessary to select a loss function and an optimization algorithm. For loss, the approach used in the original HRNet description (Sun, K. et al., Proc IEEE Conf Comp Vis Pattern Recognit (2019), 5693-5703) was used. For each keypoint label, a 2D Gaussian distribution centered on the respective keypoint was generated. The output of the network was then compared with the keypoint-centered Gaussian, and the loss was calculated as the mean squared difference between the labeled keypoint Gaussian and the heatmap generated by the network. The network was trained using the ADAM optimization algorithm, which is a variant of stochastic gradient descent (Kingma, D. P. & Ba, J. (2014) arXiv:1412.6980). FIG. 7C shows that the validation loss converges rapidly. Labels were intentionally generated that represented a wide diversity of mouse appearances, including variation in coat color, body length and obesity, to ensure that the resulting network operates robustly across these differences. Eight thousand nine hundred and ten (8,910) frames across these diverse strains were hand labeled for training (see Methods). The resulting network was able to track dozens of mouse strains with varying body size, shape and coat color (Geuther, B. Q. et al., Commun Biol (2019) 2:1-11).
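The heatmap loss can be illustrated with a small plain-Python sketch: a Gaussian target is built around a labeled keypoint and compared to a predicted heatmap by mean squared error. The Gaussian width `sigma` and the function names are assumptions made for illustration; the source does not state the width used.

```python
import math
from typing import List

Heatmap = List[List[float]]  # rows are indexed by y, columns by x


def gaussian_target(width: int, height: int, cx: float, cy: float,
                    sigma: float = 2.0) -> Heatmap:
    """Unnormalized 2D Gaussian centered on the labeled keypoint (cx, cy)."""
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def mse_loss(predicted: Heatmap, target: Heatmap) -> float:
    """Mean squared difference over all heatmap cells."""
    total, count = 0.0, 0
    for pred_row, target_row in zip(predicted, target):
        for p, t in zip(pred_row, target_row):
            total += (p - t) ** 2
            count += 1
    return total / count


target = gaussian_target(width=5, height=5, cx=2, cy=2)
blank = [[0.0] * 5 for _ in range(5)]
loss_when_perfect = mse_loss(target, target)  # identical maps give zero loss
loss_when_blank = mse_loss(blank, target)     # an empty prediction is penalized
```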


Stride Inference

The approach to detecting stride intervals was based on the cyclic structure of gait as described by Hildebrand (FIG. 8A) (Hildebrand, M. J Mammalogy (1977) 58:131-156; Hildebrand, M. Bioscience (1989) 39:766). During a stride cycle, each of the paws has a stance phase and a swing phase (Lakes, E. H. & Allen, K. D. Osteoarthr Cartil (2016) 24:1837-1849). During the stance phase, the mouse's paw is supporting the weight of the mouse and is in static contact with the ground. During the swing phase, the paw is moving forward and is not supporting the mouse's weight. Following Hildebrand, the transition from stance phase to swing phase is referred to as the toe-off event and the transition from swing phase to stance phase is referred to as the foot-strike event.


In order to calculate stride intervals, stance and swing phases were determined for the hind paws. Paw speed was calculated and it was inferred that a paw was in stance phase when the speed fell below a threshold and that it was in swing phase when it exceeded that threshold (FIG. 8C-F). It could then be determined that foot strike events occurred at the transition frame from swing phase to stance phase (FIG. 8C). The left hind foot strike was defined as the event that separates stride cycles. An example of the relationship between paw speed and foot strike events is shown in FIG. 8D for hind paws. Clean, high-amplitude oscillations of the hind paws, but not forepaws, were observed, as shown in FIG. 8E. This difference in inference quality between the forepaws and hind paws is likely due to the fact that forepaws are occluded more often than hind paws from the top-down view and are therefore more difficult to accurately locate. A corresponding decrease in confidence of forepaw inferences was observed, as shown in FIG. 8G. For this reason, forepaws were excluded from consideration when deriving stride intervals and instead the focus was on hind paws. A significant amount of filtering was also performed on strides to remove spurious or low quality stride cycles from the dataset (FIG. 8G). Criteria for removing strides included: low confidence or physiologically unrealistic pose estimates, a missing right hind paw strike event, and insufficient overall body speed of the mouse, which was any speed under 10 cm/sec. FIG. 8G shows the distribution of confidences for each keypoint. The filtering method used 0.3 as a confidence threshold. Very high confidence keypoints are close to 1.0. The first and last strides in a continuous sequence of strides were always removed to avoid starting and stopping behaviors from adding noise to the stride data (FIG. 8C-D, labeled A and D, in Track A and B). This meant that a sequence of seven strides would result in at most five strides being used for analysis. The distribution of keypoint confidence varies by keypoint type (FIG. 8G). Keypoints that tended to be occluded in a top-down view, such as forepaws, had confidence distributions shifted down compared to other keypoints. It was also observed that keypoints that were not visually salient, such as the spine center, would have lower confidence because they were more difficult to locate precisely. Finally, an instantaneous angular velocity was also calculated, which permitted determination of the directionality of each stride (FIG. 8F). The angular velocity was calculated by taking the first derivative of the angle formed by the line that connects the base of the mouse's tail to the base of its neck. Combined, this approach allowed identification of individual high quality strides of a mouse in the open field.
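The core of the stride inference described above can be sketched as follows. This is a simplified illustration, not the implementation used in the experiments: it covers speed thresholding, foot strike detection, the missing right hind paw strike filter, and trimming of the first and last stride of each continuous sequence, while omitting the confidence and body speed filters. The threshold value and all function names are assumptions.

```python
from typing import List, Tuple


def stance_mask(paw_speed: List[float], threshold: float) -> List[bool]:
    """True for frames where the paw is inferred to be in stance (speed below threshold)."""
    return [speed < threshold for speed in paw_speed]


def foot_strikes(stance: List[bool]) -> List[int]:
    """Frames at which the paw transitions from swing phase to stance phase."""
    return [i for i in range(1, len(stance)) if stance[i] and not stance[i - 1]]


def stride_intervals(left_speed: List[float], right_speed: List[float],
                     threshold: float) -> List[Tuple[int, int]]:
    """Stride intervals (start frame, end frame) delimited by left hind foot strikes."""
    left = foot_strikes(stance_mask(left_speed, threshold))
    right = foot_strikes(stance_mask(right_speed, threshold))
    candidates = list(zip(left[:-1], left[1:]))
    kept: List[Tuple[int, int]] = []
    run: List[Tuple[int, int]] = []  # current continuous sequence of valid strides
    for start, end in candidates:
        if any(start < strike < end for strike in right):
            run.append((start, end))
        else:
            kept.extend(run[1:-1])  # drop first and last stride of the sequence
            run = []
    kept.extend(run[1:-1])
    return kept


# Synthetic paw speeds (cm/s): 5 frames of stance, 5 of swing, hind paws in antiphase.
left_speed = ([0.0] * 5 + [30.0] * 5) * 9
right_speed = ([30.0] * 5 + [0.0] * 5) * 9
strides = stride_intervals(left_speed, right_speed, threshold=10.0)
```

In this synthetic trace there are eight left hind foot strikes, giving seven candidate strides, of which five remain after the first and last are removed.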


To validate that the gait quantitation was functioning properly, data from a commonly used inbred strain, C57BL/6NJ, was analyzed. Percent of stance and swing were calculated from 15,667 strides from 31 animals using approximately 1 hour of open field video per mouse. Data from hind paws was analyzed because these showed the highest amplitude oscillations during stance and swing (FIG. 8D, E). The data was stratified into 9 angular velocity and 8 speed bins based on the tail base point (FIG. 8H, I, respectively). As expected, an increase in stance percent over a stride of the left hind paw was observed when the animal was turning left. Reciprocally, when the animal was turning right, the stance percent of the right hind paw was increased (FIG. 8H). The strides in the central angular velocity bin (−20 to 20 deg/sec) were then analyzed to determine if stance percent during a stride cycle decreased as the speed of the stride increased. It was determined that the stance time decreased as the stride speed increased (FIG. 8I). A duty factor was calculated for the hind paws to quantitate this relationship with speed (FIG. 8J). Combined, it was concluded that the methods were able to quantitatively and accurately extract strides from these open field videos from a top-down perspective.
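Stance percent for one stride can be computed directly from a per-frame stance mask. The sketch below assumes a boolean mask per hind paw and a stride given as a half-open frame interval; defining the duty factor as the mean stance fraction of the two hind paws is an assumption of this example, and the function names are hypothetical.

```python
from typing import List


def stance_percent(stance: List[bool], start: int, end: int) -> float:
    """Percent of frames in [start, end) during which the paw is in stance."""
    frames = stance[start:end]
    if not frames:
        raise ValueError("empty stride interval")
    return 100.0 * sum(frames) / len(frames)


def duty_factor(left_stance: List[bool], right_stance: List[bool],
                start: int, end: int) -> float:
    """Mean stance fraction of the two hind paws over one stride (assumed definition)."""
    left = stance_percent(left_stance, start, end)
    right = stance_percent(right_stance, start, end)
    return (left + right) / 200.0


# One 10-frame stride: the left paw is in stance for 7 frames, the right for 5.
left_stance = [True] * 7 + [False] * 3
right_stance = [False] * 5 + [True] * 5
left_pct = stance_percent(left_stance, 0, 10)
factor = duty_factor(left_stance, right_stance, 0, 10)
```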


After the stride intervals had been determined, frame poses could be used in conjunction with stance and swing phase intervals to derive several stride metrics as defined in FIG. 14A. All relevant spatiotemporal metrics were able to be extracted from the hind paws, which served as the primary data source for the statistical analyses (Lakes, E. H. & Allen, K. D. Osteoarthr Cartil (2016) 24:1837-1849).


Whole Body Posture Estimation During Gait Cycle

The top-down videos allow determination of the relative position of the spine with 6 keypoints (nose, neck base, spine center, tail base, tail middle, and tail tip). With these, the whole body pose during a stride cycle was extracted. Only three points were used (nose, base of tail, and tip of tail) to capture the lateral movement during a stride cycle (FIG. 9A-C). These measures were circular, with opposite phases of the nose and the tip of tail. For display, C57BL/6J and NOR/LtJ were used, which have different tip of tail phases during a stride cycle. It was possible to extract these phase plots for each stride, which provided high sensitivity (FIG. 9D-E). Because several hours of video were obtained across each strain, it was possible to extract thousands of strides, enabling a high level of sensitivity. These could be combined at one speed and angular velocity bin to determine a consensus stride phase plot for each animal and strain (FIG. 9F-G). Finally, these phase plots were compared between several strains and striking diversity was found among whole body posture during the gait cycle.


Several of the metrics related to the cyclic lateral displacement observed in pose keypoints (FIG. 9A-I). The measures of lateral displacement were defined as an orthogonal offset from the relevant stride displacement vector. The displacement vector was defined as the line connecting the mouse's center of spine on the first frame of a stride to the mouse's center of spine on the last frame of the stride. This offset was calculated at each frame of a stride and then a cubic interpolation was performed in order to generate a smooth displacement curve. The phase offset of displacement was defined as the percent stride location where maximum displacement occurred on this smoothed curve. As an example, if a value of 90 for phase offset was observed, it indicated that the peak lateral displacement occurred at the point where a stride cycle was 90% complete. The lateral displacement metric assigned to a stride was the difference between the maximum displacement value and the minimum displacement value observed during the stride (FIG. 9A). This analysis was very sensitive and allowed detection of subtle, but highly significant, differences in overall posture during a stride. The previous classical spatiotemporal measures based on Hildebrand's methods were used with the combined whole body posture metrics for the analysis. Because of the cyclic nature of phase-offset metrics, care was taken to apply circular statistics to these in the analysis. The other measures were analyzed using linear methods.
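The lateral displacement and phase offset definitions can be sketched as below. This simplified example works on the raw per-frame offsets and omits the cubic interpolation step, so the phase offset is resolved only to the nearest frame; the function names and the sign convention (positive offsets lie to the left of the direction of travel when the y axis points up) are assumptions.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def lateral_offsets(spine_center: List[Point], keypoint: List[Point]) -> List[float]:
    """Signed orthogonal offset of a keypoint from the stride displacement vector.

    The displacement vector runs from the spine center on the first frame of
    the stride to the spine center on the last frame.
    """
    (x0, y0), (x1, y1) = spine_center[0], spine_center[-1]
    dx, dy = x1 - x0, y1 - y0
    length = math.hypot(dx, dy)
    if length == 0.0:
        raise ValueError("stride has zero displacement")
    # The 2D cross product divided by the vector length is the perpendicular distance.
    return [(dx * (py - y0) - dy * (px - x0)) / length for px, py in keypoint]


def displacement_metrics(offsets: List[float]) -> Tuple[float, float]:
    """Return (lateral displacement, phase offset in percent of the stride)."""
    peak_frame = max(range(len(offsets)), key=lambda i: offsets[i])
    phase_offset = 100.0 * peak_frame / (len(offsets) - 1)
    return max(offsets) - min(offsets), phase_offset


# Synthetic 11-frame stride: the spine moves along x while the nose sways laterally.
spine = [(float(i), 0.0) for i in range(11)]
sway = [0.0, 1.0, 2.0, 1.0, 0.0, -1.0, -2.0, -1.0, 0.0, 0.0, 0.0]
nose = [(float(i), sway[i]) for i in range(11)]
amplitude, phase = displacement_metrics(lateral_offsets(spine, nose))
```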


Statistical Analysis and Genetic Validation of Gait Measures

Following gait and posture extraction, a statistical framework was established for analysis of the data. In order to validate the methods, three mouse models were phenotyped, each previously shown to have gait defects and to be a preclinical model of a human disease: Rett syndrome, Amyotrophic Lateral Sclerosis (ALS or Lou Gehrig's disease), and Down syndrome. The three models, Mecp2 knockout, SOD1 G93A transgene, and Ts65Dn Trisomic, respectively, were tested with appropriate controls at two ages in a one hour open field assay (FIG. 14B). Gait metrics are highly correlated with animal size and speed of stride (Hildebrand, M. Bioscience (1989) 39:766) (FIG. 8I-J). However, in many cases a change in stride speed is a defining feature of gait change due to genetic or pharmacological perturbation. In addition, the methods were used to collect multiple repeated measurements for each subject (mouse) and each subject had a different number of strides, giving rise to imbalanced data. Averaging over repeated strides, which yields one average value per subject, can be misleading as it removes variation and introduces false confidence. At the same time, classical linear models do not discriminate between stable intra-subject variations and inter-subject fluctuations, which severely biases the estimates. To address this, a linear mixed model (LMM) was used to dissociate within-subject variation from genotype-based variation between subjects (Laird, N. M. & Ware, J. H., Biometrics (1982) 38:963-974; Pinheiro, J. & Bates, D. Mixed-effects models in S and S-PLUS, New York: Springer-Verlag, 2000). Specifically, in addition to the main effects such as animal size, genotype, and age, a random effect that captures the intra-subject variation is included. Finally, multiple repeated measurements were taken at two different ages, giving rise to a nested hierarchical data structure. The models (M1, M2, M3) follow the standard LMM notation with (Genotype, BodyLength, Speed, TestAge) denoting the fixed effects and (MouseID/TestAge) (test age nested within the animal) denoting the random effect. In order to compare the results with previously published data that do not take animal size and sometimes speed of stride into account, the results were statistically modeled with three models that adjust for age and body length (M1), age and speed (M2), and age, speed, and body length (M3) (FIGS. 10 and 14). The models were: M1: Phenotype~Genotype+TestAge+BodyLength+(1|MouseID/TestAge); M2: Phenotype~Genotype+TestAge+Speed+(1|MouseID/TestAge); and M3: Phenotype~Genotype+TestAge+Speed+BodyLength+(1|MouseID/TestAge).
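For readers less familiar with the formula notation, model M3 can be written out as an equation. The expansion below is an illustrative reading, assuming the usual lme4 interpretation in which (1|MouseID/TestAge) contributes one random intercept per mouse and one per test age within mouse, with normally distributed random effects and errors; the index and variance symbols are introduced here and do not appear in the source.

```latex
% Illustrative expansion of M3: i indexes mice, j test ages within a mouse,
% and k strides within a mouse at a given test age.
\begin{aligned}
y_{ijk} ={}& \beta_0 + \beta_1\,\mathrm{Genotype}_{i} + \beta_2\,\mathrm{TestAge}_{ij}
            + \beta_3\,\mathrm{Speed}_{ijk} + \beta_4\,\mathrm{BodyLength}_{ij} \\
           & + \gamma_{i} + \gamma_{ij} + \varepsilon_{ijk}, \\
\gamma_{i} \sim{}& N(0, \sigma^2_{\mathrm{mouse}}), \quad
\gamma_{ij} \sim N(0, \sigma^2_{\mathrm{age\,within\,mouse}}), \quad
\varepsilon_{ijk} \sim N(0, \sigma^2).
\end{aligned}
```

M1 and M2 follow the same pattern with the Speed term or the BodyLength term removed, respectively.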


Sex was not included in the models because it is highly correlated with body length. The correlation, measured using ANOVA and denoted by η, is strong for both SOD1 (η=0.81) and Ts65Dn (η=0.16 overall, η=0.89 for controls, η=0.61 for mutants). The Mecp2 males and females were analyzed separately. The circular phase variables in FIG. 14A were modeled as a function of linear variables using a circular-linear regression model (Fisher, N. I. & Lee, A. J., Biometrics (1992) 48:665-677). To adjust for linear variables such as body length and speed, they were included as covariates in the model (also see Methods). FIGS. 10 and 11 report p-values and normalized effect size. For clarity, exact statistics are reported in detail in FIGS. 19 and 20.


Validation Using a Rett Syndrome Model

Rett syndrome, an inherited neurodevelopmental disorder, is caused by mutations in the X-linked MECP2 gene (Amir, R. E. et al., Nat Genet (1999) 23:185-188). Studies included a commonly studied deletion of Mecp2 that recapitulates many of the features of Rett syndrome, including reduced movement, abnormal gait, limb clasping, low birth weight, and lethality (Guy, J. et al., Nature Genet, (2001) 27:322-326). Hemizygous males (n=8), heterozygous females (n=8), and littermate controls (n=8 of each sex) were tested (FIG. 14B). Null males are normal at birth and have an expected lifespan of about 50-60 days. They start to show age-dependent phenotypes by 3-8 weeks and lethality by 10 weeks. Heterozygous females have mild symptoms at a much older age (Guy, J. et al., Nature Genet, (2001) 27:322-326). Male mice were tested twice at 43 and 56 days and females at 43 and 86 days.


Studies of this knockout have shown changes in stride length and stance width in an age-dependent manner in hemizygous males (Kerr, B. et al., PLoS One (2010) 5(7):e11534; Robinson, L. et al., Brain (2012) 135:2699-2710). Recent analysis showed increased step width, reduced stride length, and changes in stride time, step angle, and overlap distance (Gadalla, K. K. et al., PLoS One (2014) 9(11):e112889). However, these studies did not adjust for the reduced body size seen in Mecp2 hemizygous males (FIG. 14D) and in some cases did not model speed of the stride. The most relevant comparison of the experimental data obtained to previously published data was using M2, which models speed but not body length (Gadalla, K. K. et al., PLoS One (2014) 9(11):e112889; FIG. 14D). It was found that most of the gait metrics and several body coordination metrics were significantly different in the hemizygous males versus controls, including limb duty factor, step and stride length, step width and temporal symmetry. However, most gait metrics are dependent on the size of the animal and the hemizygous males are 13% smaller in body length (FIG. 14D) (Guy, J. et al., Nature Genet, (2001) 27:322-326). In addition, the analysis was limited to stride speeds between 20 and 30 cm/s, which allowed reduction in variation introduced by differences in speed. Therefore, a model that includes body length instead of speed as a covariate (M1, FIG. 10A) and one in which both body length and speed are included (M3, FIG. 15A) were also compared. Results of the M2 model indicated significant differences in stride speed, step width, stride length, and whole body coordination phenotypes (tail tip amplitude, phase of tail tip and nose) in hemizygous males (FIG. 10B). Most phenotypes were dependent on age with severe effects in males by 7 weeks (56 days) (FIG. 10D).
The model that includes both speed and body length (M3) showed a significant decrease in step width and suggestive difference in stride length, and robust differences in whole body coordination metrics (tail tip amplitude, phase of tail tip, tail base, and nose) (FIG. 15). Very few significant differences were observed in Mecp2 heterozygous females and they were consistent across all three models. All three models consistently find tail tip amplitude to be significantly higher suggesting more lateral movement in the females (FIG. 10A-B and FIG. 15). Combined, these results demonstrated that the method permitted accurate detection of previously described differences in Mecp2. In addition, the whole body coordination metrics were able to detect differences that had not been previously described.
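The three linear mixed models compared above can be summarized schematically. The following is only a sketch consistent with the covariates named in this section (M1 adjusts for body length, M2 for stride speed, M3 for both). The symbols, the inclusion of test age as a fixed effect in every model, and the per-animal random intercept $u_i$ are assumptions made for illustration and are not a statement of the exact model specification.

```latex
% y_{ij}: gait or posture phenotype of animal i at test occasion j (hypothetical notation)
\begin{align*}
\text{M1:}\quad y_{ij} &= \beta_0 + \beta_g\,\mathrm{Genotype}_i + \beta_a\,\mathrm{Age}_{ij}
  + \beta_b\,\mathrm{BodyLength}_{ij} + u_i + \varepsilon_{ij} \\
\text{M2:}\quad y_{ij} &= \beta_0 + \beta_g\,\mathrm{Genotype}_i + \beta_a\,\mathrm{Age}_{ij}
  + \beta_s\,\mathrm{Speed}_{ij} + u_i + \varepsilon_{ij} \\
\text{M3:}\quad y_{ij} &= \beta_0 + \beta_g\,\mathrm{Genotype}_i + \beta_a\,\mathrm{Age}_{ij}
  + \beta_b\,\mathrm{BodyLength}_{ij} + \beta_s\,\mathrm{Speed}_{ij} + u_i + \varepsilon_{ij}
\end{align*}
```

Under this reading, a genotype effect that is significant in M2 but weakens in M3 would be attributable at least in part to body length rather than to genotype itself.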


Validation Using an ALS Model

Mice carrying the SOD1-G93A transgene are a preclinical model of ALS with progressive loss of motor neurons (Gurney, M. E. et al., Science (1994) 264:1772-1775; Rosen, D. R. et al., Nature (1993) 362:59-62). The SOD1-G93A model has been shown to have changes in gait phenotypes, particularly of hindlimbs (Wooley, C. M. et al., Muscle & Nerve (2005) 32:43-50; Amende, I. et al., J Neuroeng Rehabilitation (2005) 2:20; Preisig, D. F. et al., Behavioural Brain Research (2016) 311:340-353; Tesla, R. et al., PNAS (2012) 109:17016-17021; Mead, R. J. et al., PLoS ONE (2011) 6:e23244; Vergouts, M. et al., Metabolic Brain Disease (2015) 30:1369-1377; Mancuso, R. et al., Brain Research (2011) 1406:65-73). The most salient phenotypes are an increase in stance time (duty factor), and decreased stride length in an age-dependent manner. However, several other studies have observed opposite results (Wooley, C. M. et al., Muscle & Nerve (2005) 32:43-50; Amende, I. et al., J Neuroeng Rehabilitation (2005) 2:20; Mead, R. J. et al., PLoS ONE (2011) 6:e23244; Vergouts, M. et al., Metabolic Brain Disease (2015) 30:1369-1377), and some have not seen significant gait effects (Guillot, T. S. et al., Journal of Motor Behavior (2008) 40: 568-577). These studies did not adjust for body size difference or in some cases for speed. SOD1-G93A transgenes and appropriate controls were tested at 64 and 100 days, during time of disease onset (Wooley, C. M. et al., Muscle & Nerve (2005) 32:43-50; Preisig, D. F. et al., Behavioural Brain Research (2016) 311:340-353; Vergouts, M. et al., Metabolic Brain Disease (2015) 30:1369-1377; Mancuso, R. et al., Brain Research (2011) 1406:65-73; Knippenberg, S. et al., Behavioural Brain Research (2010) 213: 82-87).


Surprisingly, it was found that the phenotypes differing between transgene carriers and controls varied considerably depending on the linear mixed model used. M1, which adjusts for body length and age but not speed, finds stride speed, length, and duty factor as significantly different (FIG. 10A). However, when speed was in the model (M2) or speed and body length were in the model (M3), the only differences were small changes in phase of tail tip and nose (FIGS. 10B and 15). This indicated that the changes seen in duty factor and stride length using M1 are due to changes in speed of the strides. These results argue that the major effect of the SOD1 transgene is on stride speed, which leads to changes in stride time and duty factor. Slight changes in whole body coordination are due to decrease in body size (FIG. 14D). The results were congruent with reports that gait changes may not be the most sensitive preclinical phenotype in this ALS model, and other phenotypes such as visible clinical signs and motor learning tasks such as rotarod are more sensitive measures (Mead, R. J. et al., PLoS ONE (2011) 6:e23244; Guillot, T. S. et al., Journal of Motor Behavior (2008) 40: 568-577). In sum, the testing results validated the statistical model and may help explain some of the discordant results in the literature.


Validation Using a Down Syndrome Model

Down syndrome, caused by trisomy of all or part of chromosome 21, has complex neurological and neurosensorial phenotypes (Haslam, R. H. Down syndrome: living and learning in the community. New York: Wiley-Liss, 107-14 (1995)). Although there is a spectrum of phenotypes such as intellectual disability, seizures, strabismus, nystagmus, and hypoacusis, the more noticeable phenotypes are developmental delays in fine motor skills (Shumway-Cook, A. & Woollacott, M. H. Physical Therapy 65:1315-1322 (1985); Morris, A. et al., Journal of Mental Deficiency Research (1982) 26:41-46). These are often described as clumsiness or uncoordinated movements (Vimercati, S. et al., Journal of Intellectual Disability Research (2015) 59:248-256; Latash, M. L. Perceptual-motor behavior in Down Syndrome (2000) 199-223). Ts65Dn mice, one of the best studied models, are trisomic for a region of mouse chromosome 16 that is syntenic to human chromosome 21 and recapitulate many of the features of Down syndrome (Reeves, R. et al., Nat Genet (1995) 11:177-184; Herault, Y. et al., Dis Model Mech (2017) 10:1165-1186). Ts65Dn mice have been studied for gait phenotypes using traditional inkblot footprint analysis or treadmill methods (Hampton, T. G. and Amende, I. J Mot Behav (2009) 42:1-4; Costa, A. C. et al., Physiol Behav (1999) 68:211-220; Faizi, M. et al., Neurobiol Dis (2011) 43, 397-413). The inkblot analysis showed mice with shorter and more “erratic” and “irregular” gait, similar to motor coordination deficits seen in patients (Costa, A. C. et al., Physiol Behav (1999) 68:211-220). Treadmill-based analysis revealed further changes in stride length, frequency, some kinetic parameters, and footprint size (Faizi, M. et al., Neurobiol Dis (2011) 43, 397-413; Hampton, T. G. et al., Physiol Behav (2004) 82:381-389). These previous analyses have not studied the whole body posture of these mice.


Using methods of the invention, Ts65Dn mice were analyzed along with control mice at approximately 10 and 14 weeks (FIG. 14B) and all three linear mixed models M1-M3 found consistent changes. The Ts65Dn mice are not hyperactive in the open field (FIG. 10C), although they have increased stride speed (FIG. 10A, C). This indicated that the Ts65Dn mice take quicker steps but travel the same distance as controls. Step width was increased and step and stride lengths were significantly reduced. The most divergent results from controls are obtained with M3, which accounts for speed and body length. In particular, whole body coordination phenotypes were highly affected in the Ts65Dn mice. The amplitude of tail base and tip, and the phase of tail base, tip, and nose were significantly decreased (FIG. 15A). This was confirmed with a phase plot of nose and tail tip (FIG. 10E). Surprisingly, it was found that there were large differences in phase. The tail tip phase peak was near 30% of the stride cycle in controls and close to 60% in mutants at multiple speeds (FIG. 10E). Similar changes were seen in the phase plot for the nose. Combined, these results confirmed previously reported differences in traditional gait measures, and highlight the utility of the novel open field whole body coordination measures in broadening the assayable phenotypic features in models of human disease. Indeed, the most salient feature of the Ts65Dn gait was the alteration of whole body coordination, which previously was reported as a qualitative trait using inkblot analysis (Costa, A. C. et al., Physiol Behav (1999) 68:211-220) and is now quantifiable using methods of the invention.


Characterization of Autism Spectrum Disorder-Related Mutants

To further validate the analysis approach, gait was investigated in four autism spectrum disorder (ASD) mouse models, in addition to Mecp2 above that also falls on this spectrum. In humans, gait and posture defects are often seen in ASD patients and sometimes gait and motor defects precede classical deficiencies in verbal and social communication and stereotyped behaviors (Licari, M. K. et al., Autism Research (2020) 13:298-306; Green et al., Dev Med Child Neurol (2009) 51:311-316). Recent studies indicate that motor changes are often undiagnosed in ASD cases (Hughes, V. Motor problems in autism move into research focus. Spectrum News (2011)). It is unclear if these differences have genetic etiologies or are secondary to lack of social interactions that may help children develop learned motor coordination (Zeliadt, N., Autism in motion: Could motor problems trigger social ones. Scientific American, Spectrum, Mental Health (2017)). In mouse models of ASD, gait defects have been poorly characterized, and thus studies were performed to determine if any gait phenotypes occur in four commonly used ASD genetic models, which were characterized with appropriate controls at 10 weeks (FIG. 14C). Similar to the three models with known gait defects, these mutants and controls were tested in the one hour open field assay and gait and posture metrics were extracted (FIG. 14A). The results were modeled using the same approach used for gait mutants (M1 and M3 results are presented in FIG. 11, M2 results are in FIG. 16).


Cntnap2 is a member of the neurexin gene family, which functions as a cell adhesion molecule between neurons and glia (Poliak, S. et al., Neuron (1999) 24:1037-1047). Mutations in Cntnap2 have been linked to neurological disorders such as ASD, schizophrenia, bipolar disorder, and epilepsy (Toma, C. et al., PLoS Genetics (2018) 14:e1007535). Cntnap2 knockout mice have previously been shown to have mild gait effects, with increased stride speed leading to decreased stride duration (Brunner, D. et al., PloS One (2015) 10(8):e0134572). Model M2 was used to compare the present results to the previous study, and it was found that Cntnap2 mice show significant differences in a majority of the gait measures (FIG. 16). These mice are significantly smaller in body length and weight than controls (FIG. 14D, FIG. 16C). In the open field, Cntnap2 mice were not hyperactive (FIG. 11C) but showed a markedly increased stride speed (M1, FIG. 11A, C and FIG. 16C). These results argue that the Cntnap2 mice do not travel more, but take quicker steps when moving, similar to Ts65Dn mice.


Because Cntnap2 mice are smaller and have faster stride speeds, results from M3 were used to determine if gait parameters are altered after adjusting for body size and stride speed (FIG. 14D). Results indicated that Cntnap2 mice were significantly different from controls for a majority of the traditional gait metrics as well as whole body coordination measures in both models M1 and M3 (FIG. 11B). The Cntnap2 mice have reduced limb duty factor, step length, step width, and highly reduced stride length (FIG. 11B, D and FIG. 16C). The mice also showed altered phase of tail tip, tail base, and nose, as well as significant but small changes in amplitude of tail tip, tail base, and nose. Another salient feature of gait in Cntnap2 mice is the decrease in inter-animal variance compared to controls, particularly for limb duty factor (Fligner-Killeen test, p<0.01), step length (Fligner-Killeen test, p<0.01), and stride length (Fligner-Killeen test, p<0.02) (FIG. 11D). This may indicate a more stereotyped gait in these mutants. Combined, these results imply that Cntnap2 mice are not hyperactive as measured by total distance traveled in the open field, but are hyperactive at the individual stride level. They take quicker steps with shorter stride and step length, and narrower step width. Finally, studies were performed to distinguish Cntnap2 mice from controls based on all combined gait measures using unsupervised clustering. First, a principal component analysis (PCA) was performed on the linear gait phenotypes and then Gaussian mixture modeling (GMM) was used on the PCs to cluster the animals into two separate groups. It was determined that the gait metrics allowed Cntnap2 mice to be distinguished from controls (FIG. 11E). This analysis argued that Cntnap2 mice could be distinguished from controls based on their gait patterns in the open field, and that these phenotypes are more dramatic than previously detected (Brunner, D. et al., PloS One (2015) 10(8):e0134572).


Mutations in Shank3, a scaffolding postsynaptic protein, have been found in multiple cases of ASD (Durand, C. M. et al., Nat Genet (2007) 39:25-27). Mutations in Fmr1, an RNA-binding protein that functions as a translational regulator, are associated with Fragile X syndrome, the most commonly inherited form of mental illness in humans (Crawford, D. C. et al., Genetics in Medicine (2001) 3:359-371). Fragile X syndrome has a broad spectrum of phenotypes that overlaps with ASD features (Belmonte, M. K. and Bourgeron, T. Nat Neurosci (2006) 9:1221-1225). Del4Aam mice contain a deletion of 0.39 Mb on mouse chromosome 7 that is syntenic to human chromosome 16p11.2 (Horev, G. et al., PNAS (2011) 108:17076-17081). Copy number variations (CNVs) of human 16p11.2 have been associated with a variety of ASD features, including intellectual disability, stereotypy, and social and language deficits (Weiss, L. A. et al., NEJM (2008) 358:667-675). Fmr1 mutant mice travel more in the open field (FIG. 11C) and have higher stride speed (FIG. 11A, C). After adjustment for stride speed (M2) or for stride speed and body length (M3), these mice had slight but significant changes in limb duty factor. Shank3 and Del4Aam mice are both hypoactive in the open field compared to controls. Shank3 mice had a significant decrease in stride speed, whereas Del4Aam mice had faster stride speeds (FIG. 11A, C). All three statistical models show a suggestive or significant decrease in step length in both strains. Using M3, it was determined that Shank3 mice had longer step and stride length, whereas Del4Aam mice had shorter steps and strides. In whole body coordination, Shank3 mice had a decrease in nose phase and Del4Aam mice had an increase in tail tip phase. These results indicated that, even though both Shank3 and Del4Aam mice are hypoactive in the open field, Shank3 mice take slower and longer strides and steps, whereas Del4Aam mice take faster strides with shorter steps and strides. Both mutants have some defects in whole body coordination.
In sum, it was determined that each of the ASD models had some gait deficits, with Cntnap2 having the strongest phenotypes. All had some change in stride speed, although the directionality of change and the variance of the phenotype differ.


Strain Survey

After validation of the analysis methods, experiments were performed in order to understand the range of gait and posture phenotypes in the open field in standard laboratory mouse strains. Forty-four classical inbred laboratory strains, 7 wild-derived inbred strains, and 11 F1 hybrid strains were surveyed (1,898 animals, 1,740 hours of video). All animals were isogenic and both males and females were surveyed in a one hour open field assay (FIG. 14E) (Geuther, B. Q. et al., Commun Biol (2019) 2:1-11). Gait metrics were then extracted from each video and the data analyzed on a per-animal level (FIG. 12A-B, FIG. 17, and FIG. 18). Stride data were analyzed when animals were traveling at medium speed (20 to 30 cm/sec) and in a straight direction (angular velocity between −20 and +20 degrees/sec). Such a selective analysis could be performed because of the large amount of data that could be collected and processed in freely moving mice. Because these mice varied considerably in their size, residuals from M1, which adjusts for body size (Geuther, B. Q. et al., Commun Biol (2019) 2:1-11), were used. M1 allowed extraction of stride speed as a feature, which was determined to be important in ASD mutants. In order to visualize differences between strains, a z-score was calculated for each strain's phenotype and k-means clustering was performed (FIG. 12B). Overall, high inter-strain variability was observed in most of the classical gait and whole body posture metrics, indicating high levels of heritability of these traits. Emerging patterns were also observed in open field gait movements of laboratory mice, with certain strains showing similar behaviors.
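The stride selection and standardization steps described above can be sketched as follows. This is an illustrative sketch only: the dictionary keys, function names, inclusive boundaries, and the use of the population standard deviation are assumptions, and the example values are invented.

```python
from statistics import mean, pstdev


def select_strides(strides, speed_range=(20.0, 30.0), max_abs_angular_velocity=20.0):
    """Keep strides at medium speed (cm/s) and in a roughly straight direction (deg/s)."""
    low, high = speed_range
    return [
        s for s in strides
        if low <= s["speed"] <= high
        and abs(s["angular_velocity"]) <= max_abs_angular_velocity
    ]


def z_scores(values_by_strain):
    """Standardize one phenotype across strains: (value - mean) / standard deviation."""
    values = list(values_by_strain.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    return {strain: (v - mu) / sigma for strain, v in values_by_strain.items()}


# Hypothetical strides: only the first passes both filters.
strides = [
    {"speed": 24.0, "angular_velocity": 5.0},
    {"speed": 35.0, "angular_velocity": 2.0},    # too fast
    {"speed": 22.0, "angular_velocity": -40.0},  # turning
]
kept = select_strides(strides)

# Hypothetical per-strain values of a single phenotype.
scores = z_scores({"strain_a": 6.0, "strain_b": 7.0, "strain_c": 8.0})
```

Standardizing each phenotype in this way places metrics with different units on a common scale before clustering.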


Studies were performed to determine if strains could be clustered based on their open field gait and posture phenotypes. A k-means clustering algorithm was applied on the principal components obtained by performing a PCA on the original linear gait features, as was done for the Cntnap2 mutant. Circular phase metrics were not included in the clustering analysis as both PCA and k-means clustering algorithms assume the metrics to lie in a Euclidean space. The first 2 PCs were selected as they explain 53% of the total variance in the original feature space. Four criteria were examined to assess the quality of clustering and to choose the number of clusters for the k-means algorithm, all of which indicated 3 as the optimal number of clusters (FIG. 21). It was determined that there were three clusters of strains that could be distinguished based on their open field gait behaviors (FIG. 12C-E). Cluster 1 consisted of mostly classical strains such as A/J, C3H/HeJ, 129S1/SvImJ; cluster 2 consisted of several classical strains and a large number of wild derived strains such as MOLF/EiJ and CAST/EiJ. Cluster 3 mainly consisted of C57 and related strains, including the reference C57BL/6J. A consensus stride phase plot of the nose and tail tip for each cluster was constructed. Cluster 3 had much higher amplitude, while clusters 1 and 2 had similar amplitude but shifted phase offset (FIG. 12D). An examination of the linear gait metrics revealed individual metrics that distinguished the clusters (FIG. 12E). For example, cluster 1 had longer stride and step length, cluster 3 had higher lateral displacement of tail base and tip, and cluster 2 had low lateral displacement of nose. Overall, an analysis of individual metrics revealed a significant difference in 9 of 11 measures. Combined, this analysis revealed high levels of heritable variation in gait and whole body posture in the laboratory mouse.
A combined analysis using multidimensional clustering of these metrics found three subtypes of gait in the laboratory mouse. The results also showed that the reference mouse strain, C57BL/6J, is distinct from other common mouse strains and wild derived strains.
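For illustration, the k-means step applied to the first two principal components can be sketched with a minimal Lloyd's algorithm. The example points stand in for hypothetical per-strain PC scores, and seeding the centroids with the first k points is a simplification chosen for reproducibility; the source does not state which implementation or initialization was used, and an established library with multiple random restarts would normally be preferred.

```python
import math


def kmeans(points, k, iterations=100):
    """Minimal Lloyd's algorithm; returns (labels, centroids).

    Centroids are seeded with the first k points (a simplification).
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids


# Hypothetical (PC1, PC2) scores for six strains forming three groups.
pc_scores = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0), (0.2, 0.1), (5.1, 4.9), (9.8, 0.2)]
labels, centroids = kmeans(pc_scores, k=3)
```

Because the distance used is Euclidean, circular phase metrics cannot be passed to such a routine unchanged, which is consistent with their exclusion from the clustering analysis.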


GWAS

The strain survey demonstrated that the gait features measured were highly variable, and therefore, studies were performed to investigate the heritable components and the genetic architecture of mouse gait in the open field. In human GWAS, both mean and variance of gait traits are highly heritable (Adams, H. H. et al., J of Gerontol A Biol Sci Med Sci (2016) 71:740-746). The strides of each animal were separated into four different bins according to the speed it was travelling (10-15, 15-20, 20-25, and 25-30 cm/s) and the mean and variance of each trait were calculated for each animal in order to conduct a GWAS to identify Quantitative Trait Loci (QTL) in the mouse genome. GEMMA (Zhou, X. and Stephens, M. Nat Genet (2012) 44: 821-824) was used to conduct a genome-wide association analysis using a linear mixed model, taking into account sex and body length as fixed effects, and population structure as a random effect. Because linear mixed models do not handle circular values, phase gait data was excluded from this analysis. The heritability was estimated by determining the proportion of variance of a phenotype that is explained by the typed genotypes (PVE) (FIG. 13A left panel). Heritability of gait measures showed a broad range and the majority of the phenotypes are moderately to highly heritable. The mean phenotypes with lowest heritability are angular velocity and temporal symmetry, indicating that variance in the symmetrical nature of gait or turning behaviors were not due to genetic variance in the laboratory mouse. In contrast, it was found that measures of whole body coordination (amplitude measures) and traditional gait measures were moderately to highly heritable. Variance of phenotypes showed moderate heritability, even for traits with low heritability of mean traits (FIG. 13A right panel). 
For instance, mean AngularVelocity phenotypes have low heritability (PVE<0.1), whereas the variance AngularVelocity phenotypes have moderate heritability (PVE between 0.25-0.4). These heritability results indicated that the gait and posture traits are appropriate for GWAS of mean and variance traits.
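The speed binning and the per-animal mean and variance traits used as GWAS phenotypes can be sketched as below. The bin edges come from the text; the half-open bin convention, the field names, the function names, and the use of the population variance are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, pvariance

# Speed bins in cm/s as listed in the text; treated as half-open [low, high).
SPEED_BINS = [(10, 15), (15, 20), (20, 25), (25, 30)]


def speed_bin(speed):
    """Return the (low, high) bin containing speed, or None if outside 10-30 cm/s."""
    for low, high in SPEED_BINS:
        if low <= speed < high:
            return (low, high)
    return None


def summarize(strides, trait):
    """Per-animal, per-bin mean and variance of one trait."""
    grouped = defaultdict(list)
    for s in strides:
        b = speed_bin(s["speed"])
        if b is not None:
            grouped[(s["animal"], b)].append(s[trait])
    return {
        key: {"mean": mean(vals), "variance": pvariance(vals)}
        for key, vals in grouped.items()
    }


# Hypothetical strides for one animal; the last one falls outside all bins.
strides = [
    {"animal": "m1", "speed": 12.0, "step_width": 2.0},
    {"animal": "m1", "speed": 14.0, "step_width": 4.0},
    {"animal": "m1", "speed": 22.0, "step_width": 3.0},
    {"animal": "m1", "speed": 45.0, "step_width": 9.0},
]
summary = summarize(strides, "step_width")
```

Each (animal, bin) entry then yields two phenotypes, one for the mean and one for the variance of the trait.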


For the significance threshold, an empirical p-value correction was calculated for the association of a SNP with a phenotype by shuffling the values (total distance traveled in the open field) between the individuals 1000 times. In each permutation, the lowest p-value was extracted to find the threshold that represented a corrected p-value of 0.05 (1.9×10⁻⁵). The minimal p-value over all mean phenotypes, variance phenotypes, and both classes combined for each SNP was taken to generate combined Manhattan plots (FIG. 13B-D). Each SNP is colored according to the phenotype associated with the SNP with the lowest p-value. The different speed bins were usually consistent for each phenotype and it was decided to combine all bins of the same phenotype by taking the minimal p-value of the four bins for each SNP.
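The two bookkeeping steps described above, deriving a threshold from the per-permutation minimum p-values and collapsing the speed bins by taking the minimum p-value per SNP, can be sketched as follows. The sketch assumes the association scan has already been rerun on each shuffled phenotype and its lowest p-value recorded; the function names, SNP identifiers, bin labels, and the quantile convention are hypothetical.

```python
def empirical_threshold(min_pvalues, alpha=0.05):
    """Cutoff at or below which a fraction alpha of the permutation minima fall."""
    ordered = sorted(min_pvalues)
    # With 1000 permutations and alpha = 0.05 this selects the 50th smallest minimum.
    index = max(0, int(round(alpha * len(ordered))) - 1)
    return ordered[index]


def combine_bins(pvalues_by_bin):
    """For each SNP, keep the smallest p-value observed across speed bins."""
    snps = set().union(*(d.keys() for d in pvalues_by_bin.values()))
    return {
        snp: min(d[snp] for d in pvalues_by_bin.values() if snp in d)
        for snp in snps
    }


# Hypothetical minima from 1000 permutations, evenly spread for illustration.
permutation_minima = [i / 1000 for i in range(1, 1001)]
threshold = empirical_threshold(permutation_minima)

# Hypothetical per-bin p-values for two SNPs.
combined = combine_bins({
    "10-15": {"snp_1": 1e-3, "snp_2": 0.2},
    "15-20": {"snp_1": 1e-6, "snp_2": 0.5},
})
```

Taking the per-permutation minimum controls for the number of SNPs tested, since the threshold reflects the most extreme p-value expected by chance across the whole genome scan.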


In total, 239 QTL were identified for mean traits and 239 QTL for variance traits (FIG. 13B-C). The least heritable phenotype, mean AngularVelocity, showed only one significant associated genomic region, whereas the variance of AngularVelocity had 53 associated genomic loci. The phenotype with the most associated loci was stride count with 95 loci. Overall, when considering all the phenotypes together, 400 significant genomic regions associated with at least one phenotype (FIG. 22) were found, indicating that only 78 QTL were identified for both a mean phenotype and a variance phenotype. Most phenotypes had limited to no overlap between QTL associated with the mean of the feature and its variance. Of note, QTL associated with mean TemporalSymmetry and variance TemporalSymmetry had many overlapping regions. Out of 28 loci associated with the mean phenotype and 52 with variance, ten QTL overlapped. These data argue that the genetic architecture of mean and variance traits in the mouse is largely independent. These results also begin to outline the genetic landscape of mouse gait and posture in the open field.
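The count of shared QTL follows from inclusion-exclusion, assuming the 400 regions are the union of the mean-trait and variance-trait QTL sets. The set symbols $M$ and $V$ are introduced here only for illustration.

```latex
% M: QTL associated with a mean trait; V: QTL associated with a variance trait
\[
|M \cap V| \;=\; |M| + |V| - |M \cup V| \;=\; 239 + 239 - 400 \;=\; 78
\]
```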


Discussion

Gait and posture are important indicators of health and are perturbed in many neurological, neuromuscular, and neuropsychiatric diseases. The goal of these experiments was to develop a simple and reliable automated system that is capable of performing pose estimation on mice and to extract key gait and posture metrics from pose. The information herein presents a solution that allows researchers to adapt a video imaging system used for open field analysis to extract gait metrics. The approach has some clear advantages and limitations. The methods permit processing a large amount of data with low effort and low cost because the only data that needs to be captured is top-down gray scale video of a mouse in an open field, and all pose estimation and gait metric extraction is fully automated after that. Because the method does not require expensive specialized equipment, it is also possible to allow the mouse time to acclimate to the open field and collect data over long periods of time. Additionally, the methods of the invention allow the animal to move of its own volition (unforced behavior) in an environment that is familiar to it, which makes for a more ethologically relevant assay (Jacobs, B. Y. et al., Curr Pain Headache Rep (2014) 18:456). It was not possible to measure kinetic properties of gait because of the use of video methods (Lakes, E. H. & Allen, K. D. Osteoarthr Cartil (2016) 24:1837-1849). The decision to use top-down video also meant that some pose keypoints were often occluded by the mouse's body. The pose estimation network is robust to some amount of occlusion, as is the case with the hind paws, but the forepaws, which are almost always occluded during gait, have pose estimates that are too inaccurate and so have been excluded from the analysis. Regardless, in all genetic models that were tested, hind paw data was sufficient to detect robust differences in gait and body posture.
In addition, the ability to analyze large amounts of data in freely moving animals proved to be highly sensitive, even with very strict heuristic rules around what was considered to be a gait.


The gait measures that were extracted are commonly quantified in experiments (e.g. step width and stride length), but measures of whole body coordination such as lateral displacement and phase of tail (phase and amplitude of keypoints during stride) are typically not measured in rodent gait experiments. Gait and whole body posture are frequently measured in humans as an endophenotype of psychiatric illness (Sanders 2010; Licari 2020; Flyckt 1999; Walther 2012). The results of the studies described herein in mice indicate that gait and whole body coordination measures are highly heritable and perturbed in disease models. Specifically, tests were performed to assess neurodegenerative (Sod1), neurodevelopmental (Down syndrome, Mecp2) and ASD models (Cntnap2, Shank3, Fmr1, Del4Aam) and altered gait features were identified in all of these mutants. Others have also found similar results with neurodegenerative models (Machado 2015). Of note are the data for Down syndrome. In humans, miscoordination and clumsiness are prominent features of Down syndrome. In mouse models, this miscoordination was previously characterized in inkblot gait assays as a disorganized hind footprint. Here, the analysis revealed perturbed whole body coordination differences between control and Ts65Dn mice. The approach described herein thus enables quantitation of a previously qualitative trait.


The analysis of a large number of mouse strains for gait and posture identified three distinct classes of overall movement. The reference C57BL/6J and related strains were found to belong to a distinct cluster separate from other common laboratory as well as wild-derived strains. The main difference was seen in the high amplitude of tail and nose movement of the C57BL/6 and related strains. This may be important when analyzing gait and posture in differing genetic backgrounds. The GWAS revealed 400 QTL for gait and posture in the open field for both mean and variance phenotypes. It was found that the mean and variance of traits are regulated by distinct genetic loci. Indeed, methods of the invention identified that most variance phenotypes show moderate heritability, even for mean traits with low heritability. Human GWAS have been conducted for gait and posture, albeit with underpowered samples, which has led to good estimates of heritability but only a few significantly associated loci. The results presented herein in the mouse support a conclusion that a well-powered study in humans may identify hundreds of genetic factors that regulate gait and posture.


Example 2
Devices and Systems

One or more of the machine learning models of the system(s) 150 may take many forms, including a neural network. A neural network may include a number of layers, from an input layer through an output layer. Each layer is configured to take as input a particular type of data and output another type of data. The output from one layer is taken as the input to the next layer. While values for the input data/output data of a particular layer are not known until a neural network is actually operating during runtime, the data describing the neural network describes the structure, parameters, and operations of the layers of the neural network.


One or more of the middle layers of the neural network may also be known as the hidden layer. Each node of the hidden layer is connected to each node in the input layer and each node in the output layer. In the case where the neural network comprises multiple middle layers, each node in a hidden layer will connect to each node in the next higher layer and next lower layer. Each node of the input layer represents a potential input to the neural network and each node of the output layer represents a potential output of the neural network. Each connection from one node to another node in the next layer may be associated with a weight or score. A neural network may output a single output or a weighted set of possible outputs.


In some embodiments, the neural network may be a convolutional neural network (CNN), which may be a regularized version of a multilayer perceptron. Multilayer perceptrons may be fully connected networks, that is, each neuron in one layer is connected to all neurons in the next layer.


In one aspect, the neural network may be constructed with recurrent connections such that the output of the hidden layer of the network feeds back into the hidden layer again for the next set of inputs. Each node of the input layer connects to each node of the hidden layer. Each node of the hidden layer connects to each node of the output layer. The output of the hidden layer is fed back into the hidden layer for processing of the next set of inputs. A neural network incorporating recurrent connections may be referred to as a recurrent neural network (RNN).
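The recurrent connection described above can be sketched as a single update in which the previous hidden state is fed back alongside the current input. The weight layout, the tanh activation, and the function names are assumptions for illustration and do not describe any particular trained model.

```python
import math


def rnn_step(x, h_prev, w_in, w_rec, bias):
    """One recurrent step: new hidden state from current input and previous hidden state.

    w_in[j][i] weights input i into hidden node j;
    w_rec[j][k] weights previous hidden node k into hidden node j.
    """
    h_new = []
    for j in range(len(bias)):
        total = bias[j]
        total += sum(w_in[j][i] * x[i] for i in range(len(x)))
        total += sum(w_rec[j][k] * h_prev[k] for k in range(len(h_prev)))
        h_new.append(math.tanh(total))  # tanh activation is an assumption
    return h_new


def run_rnn(inputs, w_in, w_rec, bias):
    """Process a sequence, feeding each hidden state back in for the next input."""
    h = [0.0] * len(bias)
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_in, w_rec, bias)
        states.append(h)
    return states


# One input feature and one hidden node; the second state depends on the first.
states = run_rnn([[1.0], [0.0]], w_in=[[1.0]], w_rec=[[1.0]], bias=[0.0])
```

In this toy sequence the second input is zero, so the second hidden state is driven entirely by the state carried over from the first step.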


In some embodiments, the neural network may be a long short-term memory (LSTM) network. In some embodiments, the LSTM may be a bidirectional LSTM. The bidirectional LSTM runs inputs from two temporal directions, one from past states to future states and one from future states to past states, where the past state may correspond to characteristics for the video data for a first time frame and the future state may correspond to characteristics for the video data for a second subsequent time frame.


Processing by a neural network is determined by the learned weights on each node input and the structure of the network. Given a particular input, the neural network determines the output one layer at a time until the output layer of the entire network is calculated.
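Layer-by-layer evaluation can be sketched as a forward pass through fully connected layers. The sigmoid activation, the weight layout, and the example values are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, layers):
    """Compute the network output one layer at a time.

    layers is a list of (weights, biases); weights[j][i] connects input i to node j.
    """
    activation = list(x)
    for weights, biases in layers:
        activation = [
            sigmoid(sum(w * a for w, a in zip(row, activation)) + b)
            for row, b in zip(weights, biases)
        ]
    return activation


# Two inputs, a hidden layer of two nodes, and a single output node.
layers = [
    ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0]),  # hidden layer: every node outputs 0.5
    ([[1.0, 1.0]], [-1.0]),                  # output layer: sigmoid(0.5 + 0.5 - 1.0)
]
output = forward([3.0, -2.0], layers)
```

The output of each layer becomes the input to the next, so only the final list of activations is returned.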


Connection weights may be initially learned by the neural network during training, where given inputs are associated with known outputs. In a set of training data, a variety of training examples are fed into the network. Each example typically sets the weights of the correct connections from input to output to 1 and gives all other connections a weight of 0. As examples in the training data are processed by the neural network, an input may be sent to the network and compared with the associated output to determine how the network performance compares to the target performance. Using a training technique, such as back propagation, the weights of the neural network may be updated to reduce errors made by the neural network when processing the training data.
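The idea of updating weights to reduce error can be sketched with gradient descent on a single linear unit, which is the one-layer special case of the gradient computation that back propagation carries out layer by layer. The squared-error loss, the learning rate, and the example data are assumptions for illustration.

```python
def train_step(weights, bias, x, target, learning_rate=0.1):
    """One gradient-descent update for a linear unit with loss 0.5 * (prediction - target)**2."""
    prediction = sum(w * xi for w, xi in zip(weights, x)) + bias
    error = prediction - target
    # Gradient of the loss with respect to each weight is error * input.
    new_weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
    new_bias = bias - learning_rate * error
    return new_weights, new_bias, 0.5 * error ** 2


# Repeatedly present one hypothetical training example and record the loss.
weights, bias = [0.0, 0.0], 0.0
losses = []
for _ in range(20):
    weights, bias, loss = train_step(weights, bias, x=[1.0, 2.0], target=1.0)
    losses.append(loss)
```

Each recorded loss is computed before the corresponding update, so the sequence shows the error shrinking as the weights are adjusted.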


Various machine learning techniques may be used to train and operate models to perform various steps described herein, such as user recognition feature extraction, encoding, user recognition scoring, user recognition confidence determination, etc. Models may be trained and operated according to various machine learning techniques. Such techniques may include, for example, neural networks (such as deep neural networks and/or recurrent neural networks), inference engines, trained classifiers, etc. Examples of trained classifiers include Support Vector Machines (SVMs), neural networks, decision trees, AdaBoost (short for “Adaptive Boosting”) combined with decision trees, and random forests. Focusing on SVM as an example, SVM is a supervised learning model with associated learning algorithms that analyze data and recognize patterns in the data, and which are commonly used for classification and regression analysis. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples into one category or the other, making it a non-probabilistic binary linear classifier. More complex SVM models may be built with the training set identifying more than two categories, with the SVM determining which category is most similar to input data. An SVM model may be mapped so that the examples of the separate categories are divided by clear gaps. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gaps they fall on. Classifiers may issue a “score” indicating which category the data most closely matches. The score may provide an indication of how closely the data matches the category.


In order to apply the machine learning techniques, the machine learning processes themselves need to be trained. Training a machine learning component such as, in this case, one of the first or second models, requires establishing a “ground truth” for the training examples. In machine learning, the term “ground truth” refers to the accuracy of a training set's classification for supervised learning techniques. Various techniques may be used to train the models including backpropagation, statistical learning, supervised learning, semi-supervised learning, stochastic learning, or other known techniques.



FIG. 23 is a block diagram conceptually illustrating a device 1600 that may be used with the system. FIG. 24 is a block diagram conceptually illustrating example components of a remote device, such as the system(s) 150, which may assist processing of video data, identifying subject behavior, etc. A system(s) 150 may include one or more servers. A “server” as used herein may refer to a traditional server as understood in a server/client computing structure but may also refer to a number of different computing components that may assist with the operations discussed herein. For example, a server may include one or more physical computing components (such as a rack server) that are connected to other devices/components either physically and/or over a network and are capable of performing computing operations. A server may also include one or more virtual machines that emulate a computer system and are run on one or across multiple devices. A server may also include other combinations of hardware, software, firmware, or the like to perform operations discussed herein. The server(s) may be configured to operate using one or more of a client-server model, a computer bureau model, grid computing techniques, fog computing techniques, mainframe techniques, utility computing techniques, a peer-to-peer model, sandbox techniques, or other computing techniques.


Multiple systems 150 may be included in the overall system of the present disclosure, such as one or more systems 150 for performing keypoint/body part tracking, one or more systems 150 for gait metrics extraction, one or more systems 150 for posture metrics extraction, one or more systems 150 for statistical analysis, one or more systems 150 for training/configuring the system, etc. In operation, each of these systems may include computer-readable and computer-executable instructions that reside on the respective device 150, as will be discussed further below.


Each of these devices (1600/150) may include one or more controllers/processors (1604/1704), which may each include a central processing unit (CPU) for processing data and computer-readable instructions, and a memory (1606/1706) for storing data and instructions of the respective device. The memories (1606/1706) may individually include volatile random access memory (RAM), non-volatile read only memory (ROM), non-volatile magnetoresistive memory (MRAM), and/or other types of memory. Each device (1600/150) may also include a data storage component (1608/1708) for storing data and controller/processor-executable instructions. Each data storage component (1608/1708) may individually include one or more non-volatile storage types such as magnetic storage, optical storage, solid-state storage, etc. Each device (1600/150) may also be connected to removable or external non-volatile memory and/or storage (such as a removable memory card, memory key drive, networked storage, etc.) through respective input/output device interfaces (1602/1702).


Computer instructions for operating each device (1600/150) and its various components may be executed by the respective device's controller(s)/processor(s) (1604/1704), using the memory (1606/1706) as temporary “working” storage at runtime. A device's computer instructions may be stored in a non-transitory manner in non-volatile memory (1606/1706), storage (1608/1708), or an external device(s). Alternatively, some or all of the executable instructions may be embedded in hardware or firmware on the respective device in addition to or instead of software.


Each device (1600/150) includes input/output device interfaces (1602/1702). A variety of components may be connected through the input/output device interfaces (1602/1702), as will be discussed further below. Additionally, each device (1600/150) may include an address/data bus (1624/1724) for conveying data among components of the respective device. Each component within a device (1600/150) may also be directly connected to other components in addition to (or instead of) being connected to other components across the bus (1624/1724).


Referring to FIG. 23, the device 1600 may include input/output device interfaces 1602 that connect to a variety of components such as an audio output component such as a speaker 1612, a wired headset or a wireless headset (not illustrated), or other component capable of outputting audio. The device 1600 may additionally include a display 1616 for displaying content. The device 1600 may further include a camera 1618.


Via antenna(s) 1614, the input/output device interfaces 1602 may connect to one or more networks 199 via a wireless local area network (WLAN) (such as WiFi) radio, Bluetooth, and/or wireless network radio, such as a radio capable of communication with a wireless communication network such as a Long Term Evolution (LTE) network, WiMAX network, 3G network, 4G network, 5G network, etc. A wired connection such as Ethernet may also be supported. Through the network(s) 199, the system may be distributed across a networked environment. The I/O device interface (1602/1702) may also include communication components that allow data to be exchanged between devices such as different physical servers in a collection of servers or other components.


The components of the device(s) 1600 or the system(s) 150 may include their own dedicated processors, memory, and/or storage. Alternatively, one or more of the components of the device(s) 1600, or the system(s) 150 may utilize the I/O interfaces (1602/1702), processor(s) (1604/1704), memory (1606/1706), and/or storage (1608/1708) of the device(s) 1600, or the system(s) 150, respectively.


As noted above, multiple devices may be employed in a single system. In such a multi-device system, each of the devices may include different components for performing different aspects of the system's processing. The multiple devices may include overlapping components. The components of the device 1600, and the system(s) 150, as described herein, are illustrative, and may be located as a stand-alone device or may be included, in whole or in part, as a component of a larger device or system.


The concepts disclosed herein may be applied within a number of different devices and computer systems, including, for example, general-purpose computing systems, video/image processing systems, and distributed computing environments.


The above aspects of the present disclosure are meant to be illustrative. They were chosen to explain the principles and application of the disclosure and are not intended to be exhaustive or to limit the disclosure. Many modifications and variations of the disclosed aspects may be apparent to those of skill in the art. Persons having ordinary skill in the field of computers and video processing should recognize that components and process steps described herein may be interchangeable with other components or steps, or combinations of components or steps, and still achieve the benefits and advantages of the present disclosure. Moreover, it should be apparent to one skilled in the art that the disclosure may be practiced without some or all of the specific details and steps disclosed herein.


Aspects of the disclosed system may be implemented as a computer method or as an article of manufacture such as a memory device or non-transitory computer readable storage medium. The computer readable storage medium may be readable by a computer and may comprise instructions for causing a computer or other device to perform processes described in the present disclosure. The computer readable storage medium may be implemented by a volatile computer memory, non-volatile computer memory, hard drive, solid-state memory, flash drive, removable disk, and/or other media. In addition, components of the system may be implemented in firmware or hardware.


EQUIVALENTS

Although several embodiments of the present invention have been described and illustrated herein, those of ordinary skill in the art will readily envision a variety of other means and/or structures for performing the functions and/or obtaining the results and/or one or more of the advantages described herein, and each of such variations and/or modifications is deemed to be within the scope of the present invention. More generally, those skilled in the art will readily appreciate that all parameters, dimensions, materials, and configurations described herein are meant to be exemplary and that the actual parameters, dimensions, materials, and/or configurations will depend upon the specific application or applications for which the teachings of the present invention is/are used. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. It is, therefore, to be understood that the foregoing embodiments are presented by way of example only and that, within the scope of the appended claims and equivalents thereto, the invention may be practiced otherwise than as specifically described and claimed. The present invention is directed to each individual feature, system, article, material, and/or method described herein. In addition, any combination of two or more such features, systems, articles, materials, and/or methods, if such features, systems, articles, materials, and/or methods are not mutually inconsistent, is included within the scope of the present invention. All definitions, as defined and used herein, should be understood to control over dictionary definitions, definitions in documents incorporated by reference, and/or ordinary meanings of the defined terms.


The indefinite articles “a” and “an,” as used herein in the specification and in the claims, unless clearly indicated to the contrary, should be understood to mean “at least one.” The phrase “and/or,” as used herein in the specification and in the claims, should be understood to mean “either or both” of the elements so conjoined, i.e., elements that are conjunctively present in some cases and disjunctively present in other cases. Other elements may optionally be present other than the elements specifically identified by the “and/or” clause, whether related or unrelated to those elements specifically identified, unless clearly indicated to the contrary.


Conditional language used herein, such as, among others, “can,” “could,” “might,” “may,” “e.g.,” and the like, unless specifically stated otherwise, or otherwise understood within the context as used, is generally intended to convey that certain embodiments include, while other embodiments do not include, certain features, elements and/or steps. Thus, such conditional language is not generally intended to imply that features, elements, and/or steps are in any way required for one or more embodiments or that one or more embodiments necessarily include logic for deciding, with or without other input or prompting, whether these features, elements, and/or steps are included or are to be performed in any particular embodiment. The terms “comprising,” “including,” “having,” and the like are synonymous and are used inclusively, in an open-ended fashion, and do not exclude additional elements, features, acts, operations, and so forth. Also, the term “or” is used in its inclusive sense (and not in its exclusive sense) so that when used, for example, to connect a list of elements, the term “or” means one, some, or all of the elements in the list.


All literature references, patents and patent applications and publications that are cited or referred to in this application are incorporated by reference in their entirety herein.

Claims
  • 1. A computer-implemented method comprising: receiving video data representing a video capturing movements of a subject; processing the video data to identify point data tracking movement, over a time period, of a set of body parts of the subject; determining, using the point data, a plurality of stance phases and a corresponding plurality of swing phases represented in the video data during the time period; determining, based on the plurality of stance phases and the plurality of swing phases, a plurality of stride intervals represented in the video data during the time period; determining, using the point data, metrics data for the subject, the metrics data being based on each stride interval of the plurality of stride intervals; comparing the metrics data for the subject to control metrics data; and determining, based on the comparing, a difference between the subject's metrics data and the control metrics data.
  • 2. The computer-implemented method of claim 1, wherein the set of body parts comprises the nose, base of neck, mid spine, left hind paw, right hind paw, base of tail, middle of tail and tip of tail; and wherein the plurality of stance phases and the plurality of swing phases are determined based on the change in movement speed of the left hind paw and the right hind paw.
  • 3. The computer-implemented method of claim 2, further comprising: determining a transition from a first stance phase of the plurality of stance phases and a first swing phase of the plurality of swing phases based on a toe-off event of the left hind paw or the right hind paw; and determining a transition from a second swing phase of the plurality of swing phases to a second stance phase of the plurality of stance phases based on a foot strike event of the left hind paw or the right hind paw.
  • 4. The computer-implemented method of claim 1, wherein the metrics data correspond to gait measurements of the subject during each stride interval.
  • 5. The computer-implemented method of claim 1 or 4, wherein the set of body parts comprises a left hind paw and a right hind paw, and wherein determining the metrics data comprises: determining, using the point data, a step length for each stride interval, the step length representing a distance that the right hind paw travels past a previous left hind paw strike; determining, using the point data, a stride length for the each stride interval, the stride length representing a distance that the left hind paw travels during the each stride interval; determining, using the point data, a step width for the each stride interval, the step width representing a distance between the left hind paw and the right hind paw.
  • 6. The computer-implemented method of claim 1 or 4, wherein the set of body parts comprises a tail base, and wherein determining the metrics data comprises: determining, using the point data, speed data of the subject based on movement of the tail base for the each stride interval.
  • 7. The computer-implemented method of claim 1 or 4, wherein the set of body parts comprises a tail base, and wherein determining the metrics data comprises: determining, using the point data, a set of speed data of the subject based on movement of the tail base during a set of frames representing a stride interval of the plurality of stride intervals; and determining a stride speed, for the stride interval, by averaging the set of speed data.
  • 8. The computer-implemented method of claim 1 or 4, wherein the set of body parts comprises a right hind paw and a left hind paw, and wherein determining the metrics data comprises: determining, using the point data, first stance duration representing an amount of time that the right hind paw is in contact with ground during a stride interval of the plurality of stride intervals; determining a first duty factor based on the first stance duration and the duration of the stride interval; determining, using the point data, second stance duration representing an amount of time that the left hind paw is in contact with ground during the stride interval; determining a second duty factor based on the second stance duration and the duration of the stride interval; and determining an average duty factor for the stride interval based on the first duty factor and the second duty factor.
  • 9. The computer-implemented method of claim 1 or 4, wherein the set of body parts comprises a tail base and a neck base, and wherein determining the metrics data comprises: determining, using the point data, a set of vectors connecting the tail base and the neck base during a set of frames representing a stride interval of the plurality of stride intervals; and determining, using the set of vectors, an angular velocity of the subject for the stride interval.
  • 10. The computer-implemented method of claim 1, wherein the metrics data correspond to posture measurements of the subject during each stride interval.
  • 11. The computer-implemented method of claim 1 or 10, wherein the set of body parts comprises a spine center of the subject, wherein a stride interval of the plurality of stride intervals is associated with a set of frames of the video data, and wherein determining the metrics data comprises determining, using the point data, a displacement vector for the stride interval, the displacement vector connecting the spine center represented in a first frame of the set of frames and the spine center represented in a last frame of the set of frames.
  • 12. The computer-implemented method of claim 11, wherein the set of body parts further comprises a nose of the subject, and wherein determining the metrics data comprises: determining, using the point data, a set of lateral displacements of the nose for the stride interval based on a perpendicular distance of the nose from the displacement vector for each frame in the set of frames.
  • 13. The computer-implemented method of claim 12, wherein the lateral displacement of the nose is further based on a body length of the subject.
  • 14. The computer-implemented method of claim 12, wherein determining the metrics data further comprises determining a tail tip displacement phase offset by: performing an interpolation using the set of lateral displacements of the nose to generate a smooth curve lateral displacement of the nose for the stride interval; determining, using the smooth curve lateral displacement of the nose, when a maximum displacement of the nose occurs during the stride interval; and determining a percent stride location representing a percent of the stride interval that is completed when the maximum displacement of the nose occurs.
  • 15. The computer-implemented method of claim 11, wherein the set of body parts further comprises a tail base of the subject, and wherein determining the metrics data comprises: determining, using the point data, a set of lateral displacements of the tail base for the stride interval based on a perpendicular distance of the tail base from the displacement vector for each frame in the set of frames.
  • 16. The computer-implemented method of claim 15, wherein determining the metrics data further comprises determining a tail base displacement phase offset by: performing an interpolation using the set of lateral displacements of the tail base to generate a smooth curve lateral displacement of the tail base for the stride interval; determining, using the smooth curve lateral displacement of the tail base, when a maximum displacement of the tail base occurs during the stride interval; and determining a percent stride location representing a percent of the stride interval that is completed when the maximum displacement of the tail base occurs.
  • 17. The computer-implemented method of claim 11, wherein the set of body parts further comprises a tail tip of the subject, and wherein determining the metrics data comprises: determining, using the point data, a set of lateral displacements of the tail tip for the stride interval based on a perpendicular distance of the tail tip from the displacement vector for each frame in the set of frames.
  • 18. The computer-implemented method of claim 17, wherein determining the metrics data further comprises determining a tail tip displacement phase offset by: performing an interpolation using the set of lateral displacements of the tail tip to generate a smooth curve lateral displacement of the tail tip for the stride interval; determining, using the smooth curve lateral displacement of the tail tip, when a maximum displacement of the tail tip occurs during the stride interval; and determining a percent stride location representing a percent of the stride interval that is completed when the maximum displacement of the tail tip occurs.
  • 19. The computer-implemented method of claim 11, wherein processing the video data comprises processing the video data using a machine learning model.
  • 20. The computer-implemented method of claim 1, wherein processing the video data comprises processing the video data using a neural network model.
  • 21. The computer-implemented method of claim 1, wherein the video captures subject-determined movements of the subject in an open arena with a top-down view.
  • 22. The computer-implemented method of claim 1, wherein the control metrics data is obtained from a control organism or plurality thereof.
  • 23. The computer-implemented method of claim 22, wherein the subject is an organism and the control organism and the subject organism are the same species.
  • 24. The computer-implemented method of claim 23, wherein the control organism is a laboratory strain of the species, and optionally wherein the laboratory strain is one listed in FIG. 14E.
  • 25. The computer-implemented method of claim 22, wherein a statistically significant difference in the subject's metrics data compared to the control metrics data indicates a difference in the phenotype of the subject compared to the phenotype of the control organism.
  • 26. The computer-implemented method of claim 25, wherein the phenotypic difference indicates the presence of a disease or condition in the subject.
  • 27. The computer-implemented method of claim 25 or 26, wherein the phenotypic difference indicates a difference between the genetic background of the subject and the genetic background of the control organism.
  • 28. The computer-implemented method of claim 22, wherein a statistically significant difference in the subject's metrics data and the control metrics data indicates a difference in the genotype of the subject compared to the genotype of the control organism.
  • 29. The computer-implemented method of claim 28, wherein the difference in the genotype indicates a strain difference between the subject and the control organism.
  • 30. The computer-implemented method of claim 28, wherein the difference in the genotype indicates the presence of a disease or condition in the subject.
  • 31. The computer-implemented method of claim 1, wherein the control metrics data corresponds to elements including: control stride length, control step length and control step width, wherein the subject's metrics data comprises elements including stride lengths for the subject during the time period, step lengths for the subject during the time period and step widths for the subject during the time period, and wherein the difference between the one or more of the elements of the control data and the metrics data is indicative of a phenotypic difference between the subject and the control.
  • 32. A method of determining the presence of an effect of a candidate compound on a disease or condition, comprising: obtaining first metrics data for a subject, wherein a means for the obtaining comprises a computer-generated method of any one of claims 1-31, and wherein the subject has the disease or condition or is an animal model for the disease or condition; administering to the subject the candidate compound; obtaining post-administration metrics data for the organism; and comparing the first and post-administration metrics data, wherein a difference in the first and post-administration metrics data identifies an effect of the candidate compound on the disease or condition.
  • 33. The method of claim 32, further comprising additional testing of the compound's effect in treatment of the disease or condition.
  • 34. A method of identifying the presence of an effect of a candidate compound on a disease or condition, the method comprising: administering the candidate compound to a subject that has the disease or condition or that is an animal model for the disease or condition; obtaining metrics data for the subject, wherein a means for the obtaining comprises a computer-generated method of any one of claims 1-32; and comparing the obtained metrics data to a control metrics data, wherein a difference in the obtained metrics data and the control metrics data identifies the presence of an effect of the candidate compound on the disease or condition.
  • 35. A system comprising: at least one processor; and at least one memory comprising instructions that, when executed by the at least one processor, cause the system to: receive video data representing a video capturing movements of a subject; process the video data to identify point data tracking movement, over a time period, of a set of body parts of the subject; determine, using the point data, a plurality of stance phases and a corresponding plurality of swing phases represented in the video data during the time period; determine, based on the plurality of stance phases and the plurality of swing phases, a plurality of stride intervals represented in the video data during the time period; determine, using the point data, metrics data for the subject, the metrics data being based on each stride interval of the plurality of stride intervals; compare the metrics data for the subject to control metrics data; and determine, based on the comparing, a difference between the subject's metrics data and the control metrics data.
  • 36. The system of claim 35, wherein the set of body parts comprises the nose, base of neck, mid spine, left hind paw, right hind paw, base of tail, middle of tail and tip of tail; and wherein the plurality of stance phases and the plurality of swing phases are determined based on the change in movement speed of the left hind paw and the right hind paw.
  • 37. The system of claim 36, wherein the at least one memory further comprises instructions that, when executed by the at least one processor, further cause the system to: determine a transition from a first stance phase of the plurality of stance phases and a first swing phase of the plurality of swing phases based on a toe-off event of the left hind paw or the right hind paw; and determine a transition from a second swing phase of the plurality of swing phases to a second stance phase of the plurality of stance phases based on a foot strike event of the left hind paw or the right hind paw.
  • 38. The system of claim 35, wherein the metrics data correspond to gait measurements of the subject during each stride interval.
  • 39. The system of claim 35 or 38, wherein the set of body parts comprises a left hind paw and a right hind paw, and wherein the instruction that causes the system to determine the metrics data further causes the system to: determine, using the point data, a step length for each stride interval, the step length representing a distance that the right hind paw travels past a previous left hind paw strike; determine, using the point data, a stride length for the each stride interval, the stride length representing a distance that the left hind paw travels during the each stride interval; determine, using the point data, a step width for the each stride interval, the step width representing a distance between the left hind paw and the right hind paw.
  • 40. The system of claim 35 or 38, wherein the set of body parts comprises a tail base, and wherein the instruction that causes the system to determine the metrics data further causes the system to: determine, using the point data, speed data of the subject based on movement of the tail base for the each stride interval.
  • 41. The system of claim 35 or 38, wherein the set of body parts comprises a tail base, and wherein the instruction that causes the system to determine the metrics data further causes the system to: determine, using the point data, a set of speed data of the subject based on movement of the tail base during a set of frames representing a stride interval of the plurality of stride intervals; and determine a stride speed, for the stride interval, by averaging the set of speed data.
  • 42. The system of claim 35 or 38, wherein the set of body parts comprises a right hind paw and a left hind paw, and wherein the instruction that causes the system to determine the metrics data further causes the system to: determine, using the point data, first stance duration representing an amount of time that the right hind paw is in contact with ground during a stride interval of the plurality of stride intervals; determine a first duty factor based on the first stance duration and the duration of the stride interval; determine, using the point data, second stance duration representing an amount of time that the left hind paw is in contact with ground during the stride interval; determine a second duty factor based on the second stance duration and the duration of the stride interval; and determine an average duty factor for the stride interval based on the first duty factor and the second duty factor.
  • 43. The system of claim 35 or 38, wherein the set of body parts comprises a tail base and a neck base, and wherein the instruction that causes the system to determine the metrics data further causes the system to: determine, using the point data, a set of vectors connecting the tail base and the neck base during a set of frames representing a stride interval of the plurality of stride intervals; and determine, using the set of vectors, an angular velocity of the subject for the stride interval.
  • 44. The system of claim 35, wherein the metrics data correspond to posture measurements of the subject during each stride interval.
  • 45. The system of claim 35 or 44, wherein the set of body parts comprises a spine center of the subject, wherein a stride interval of the plurality of stride intervals is associated with a set of frames of the video data, and wherein the instruction that causes the system to determine the metrics data further causes the system to determine, using the point data, a displacement vector for the stride interval, the displacement vector connecting the spine center represented in a first frame of the set of frames and the spine center represented in a last frame of the set of frames.
  • 46. The system of claim 45, wherein the set of body parts further comprises a nose of the subject, and wherein the instruction that causes the system to determine the metrics data further causes the system to: determine, using the point data, a set of lateral displacements of the nose for the stride interval based on a perpendicular distance of the nose from the displacement vector for each frame in the set of frames.
  • 47. The system of claim 46, wherein the lateral displacement of the nose is further based on a body length of the subject.
  • 48. The system of claim 46, wherein the instruction that causes the system to determine the metrics data further causes the system to determine a tail tip displacement phase offset by: performing an interpolation using the set of lateral displacements of the nose to generate a smooth curve lateral displacement of the nose for the stride interval; determining, using the smooth curve lateral displacement of the nose, when a maximum displacement of the nose occurs during the stride interval; and determining a percent stride location representing a percent of the stride interval that is completed when the maximum displacement of the nose occurs.
  • 49. The system of claim 45, wherein the set of body parts further comprises a tail base of the subject, and wherein the instruction that causes the system to determine the metrics data further causes the system to: determine, using the point data, a set of lateral displacements of the tail base for the stride interval based on a perpendicular distance of the tail base from the displacement vector for each frame in the set of frames.
  • 50. The system of claim 49, wherein the instruction that causes the system to determine the metrics data further causes the system to determine a tail base displacement phase offset by: performing an interpolation using the set of lateral displacements of the tail base to generate a smooth curve lateral displacement of the tail base for the stride interval; determining, using the smooth curve lateral displacement of the tail base, when a maximum displacement of the tail base occurs during the stride interval; and determining a percent stride location representing a percent of the stride interval that is completed when the maximum displacement of the tail base occurs.
  • 51. The system of claim 45, wherein the set of body parts further comprises a tail tip of the subject, and wherein the instruction that causes the system to determine the metrics data further causes the system to: determine, using the point data, a set of lateral displacements of the tail tip for the stride interval based on a perpendicular distance of the tail tip from the displacement vector for each frame in the set of frames.
  • 52. The system of claim 51, wherein the instruction that causes the system to determine the metrics data further causes the system to determine a tail tip displacement phase offset by: performing an interpolation using the set of lateral displacements of the tail tip to generate a smooth curve lateral displacement of the tail tip for the stride interval; determining, using the smooth curve lateral displacement of the tail tip, when a maximum displacement of the tail tip occurs during the stride interval; and determining a percent stride location representing a percent of the stride interval that is completed when the maximum displacement of the tail tip occurs.
  • 53. The system of claim 35, wherein the instruction that causes the system to process the video data further causes the system to process the video data using a machine learning model.
  • 54. The system of claim 35, wherein the instruction that causes the system to process the video data further causes the system to process the video data using a neural network model.
  • 55. The system of claim 35, wherein the video captures subject-determined movements of the subject in an open arena with a top-down view.
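By way of non-limiting illustration, the stride-level computations recited in claims 43 and 45 to 52 might be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the disclosed system. The function names, the `fps` and `body_length` parameters, and the sign convention for lateral displacement are hypothetical. The claims recite only "an interpolation"; the sketch substitutes a three-point parabolic interpolation around the sampled maximum, and it takes the signed maximum rather than the maximum magnitude. Keypoints are assumed to be (x, y) pixel coordinates, one per frame of a single stride interval.

```python
import math

def displacement_vector(spine_centers):
    """Vector from the spine center in the first frame to the spine center in the last frame."""
    (x0, y0), (x1, y1) = spine_centers[0], spine_centers[-1]
    return (x1 - x0, y1 - y0)

def lateral_displacements(points, spine_centers, body_length=1.0):
    """Signed perpendicular distance of each point from the stride displacement line.

    Assumption: positive values lie to the left of the direction of travel,
    and each distance is divided by body_length (1.0 means no normalization).
    """
    ox, oy = spine_centers[0]
    dx, dy = displacement_vector(spine_centers)
    norm = math.hypot(dx, dy)
    if norm == 0:
        raise ValueError("displacement vector has zero length")
    return [(dx * (py - oy) - dy * (px - ox)) / norm / body_length
            for px, py in points]

def phase_offset_percent(displacements):
    """Percent of the stride completed when the displacement peaks.

    Assumption: a parabola through the sampled maximum and its two neighbors
    stands in for the smooth interpolated curve.
    """
    n = len(displacements)
    if n < 2:
        raise ValueError("need at least two frames")
    k = max(range(n), key=lambda i: displacements[i])
    peak = float(k)
    if 0 < k < n - 1:
        a, b, c = displacements[k - 1], displacements[k], displacements[k + 1]
        denom = a - 2 * b + c
        if denom != 0:
            peak = k + 0.5 * (a - c) / denom  # vertex of the fitted parabola
    return 100.0 * peak / (n - 1)

def angular_velocity(tail_bases, neck_bases, fps):
    """Mean rate of change, in degrees per second, of the tail-base-to-neck-base vector angle."""
    angles = [math.atan2(ny - ty, nx - tx)
              for (tx, ty), (nx, ny) in zip(tail_bases, neck_bases)]
    if len(angles) < 2:
        raise ValueError("need at least two frames")
    steps = []
    for a0, a1 in zip(angles, angles[1:]):
        # Wrap each frame-to-frame change into [-pi, pi) so a crossing of
        # the +/-180 degree boundary is not counted as a full turn.
        steps.append((a1 - a0 + math.pi) % (2 * math.pi) - math.pi)
    return math.degrees(sum(steps) / len(steps)) * fps
```

In this sketch the same `lateral_displacements` and `phase_offset_percent` functions serve the nose, tail base, and tail tip alike; only the keypoint sequence passed in changes.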
RELATED APPLICATIONS

This application claims benefit under 35 U.S.C. § 119(e) of U.S. Provisional application Ser. No. 63/144,052, filed Feb. 1, 2021 and U.S. Provisional application Ser. No. 63/131,498, filed Dec. 29, 2020, the entire contents of each of which is incorporated by reference herein in its entirety.

GOVERNMENT SUPPORT

This invention was made with government support under R21DA048634 and UM1OD023222 awarded by the National Institutes of Health. The government has certain rights in the invention.

PCT Information
Filing Document | Filing Date | Country | Kind
PCT/US21/65425 | 12/29/2021 | WO |
Provisional Applications (2)
Number | Date | Country
63144052 | Feb 2021 | US
63131498 | Dec 2020 | US