The presently disclosed subject matter relates to the motor skills learning field, and, more particularly, to facilitating the learning of motor skills using a computing device.
In a computerized process of training a user to perform motor skills, such as a user that wishes to learn a dance from a dancing teacher, the user watches the dancing teacher perform the dance, and performs the same dance by mimicking the teacher's dance. Performance of the dance is tracked, e.g. by a camera or sensors attached to the user, processed, and then feedback of the performed dance is provided to the user. In some known solutions, the system provides feedback to the user by reporting a set of measurements related to the dance performed by the user. However, in many known systems, interpreting the measurements, and deciding how exactly the dance execution should be improved, is left to the user. Hence, it is desired to provide the user with a more accurate evaluation of his dance, and also to provide guiding feedback on the dance that was performed.
One goal of motor learning by a trainee from a trainer is to achieve optimized performance of a motor skill performed by the trainer, at a high rate of success and precision. Continuous practice of the motor skill by the trainee, while mimicking the trainer, may eventually result in an improved performance of the motor skill. However, it is desired to optimize the learning process, such that an improved performance is achieved as quickly as possible. In order to optimize this process, it is advantageous that the trainee's learning attempts are analyzed in a precise manner, and that focused feedback is provided during the learning process.
In addition, in cases where the motor skill includes a series of moves, as opposed to a single move, it may be advantageous to divide the motor skill into smaller segments, and teach the trainee each segment, or a combination of several segments, individually, and to provide the trainee with feedback on the segments. Enabling the trainee to repeat learning of segments of the motor skill, as opposed to learning of the whole motor skill, and providing feedback for learning attempts on the segments, may facilitate the trainee to improve his performance of the entire motor skill.
It should be noted that for purpose of illustration only, the following description is provided for a dance motor skill. However, various examples of motor skills are applicable to the presently disclosed subject matter, such as yoga, fitness, boxing, or different katas.
According to one aspect of the presently disclosed subject matter there is provided a computerized method for facilitating motor learning of a motor skill by a trainee in relation to a trainer's motor skill, the method comprising:
In addition to the above features, the computerized method according to this aspect of the presently disclosed subject matter can comprise one or more of features (i) to (xxvi) listed below, in any desired combination or permutation which is technically possible:
According to another aspect of the presently disclosed subject matter there is provided a computerized system for facilitating motor learning of a motor skill by a trainee in relation to a trainer's motor skill, the system comprising a processing and memory circuitry (PMC) configured to:
According to another aspect of the presently disclosed subject matter there is provided a non-transitory computer readable storage medium tangibly embodying a program of instructions that, when executed by a computer, cause the computer to perform a method for facilitating motor learning of a motor skill by a trainee in relation to a trainer's motor skill, the method comprising:
According to another aspect of the presently disclosed subject matter there is provided in a trainee device, a computerized method for facilitating motor learning of a motor skill by a trainee in relation to a trainer's motor skill, the method comprising:
This aspect of the disclosed subject matter can comprise one or more of features (i) to (xxvi) listed above with respect to the system, mutatis mutandis, in any desired combination or permutation which is technically possible.
In accordance with an aspect of the presently disclosed subject matter, there is provided a computerized method for scoring performance of a move of a trainee in a trainee frame input in relation to a move of a trainer in a trainer video, wherein the trainer move includes at least one trainer keyframe, the method comprising:
In addition to the above features, the computerized method according to this aspect of the presently disclosed subject matter can comprise one or more of features (i) to (xix) listed below, in any desired combination or permutation which is technically possible:
According to another aspect of the presently disclosed subject matter, there is provided a system for scoring performance of a move of a trainee in a trainee frame input in relation to a move of a trainer in a trainer video, wherein the trainer move includes at least one trainer keyframe, by a processor and memory circuitry (PMC), the processor being configured to:
According to another aspect of the presently disclosed subject matter, there is yet further provided a non-transitory computer readable storage medium tangibly embodying a program of instructions that, when executed by a computer, cause the computer to perform a method for scoring performance of a move of a trainee in a trainee frame input in relation to a move of a trainer in a trainer video, wherein the trainer move includes at least one trainer keyframe, the method comprising:
In order to understand the invention and to see how it can be carried out in practice, embodiments will be described, by way of non-limiting examples, with reference to the accompanying drawings, in which:
In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the invention. However, it will be understood by those skilled in the art that the presently disclosed subject matter may be practiced without these specific details. In other instances, well-known methods, procedures, components and circuits have not been described in detail so as not to obscure the presently disclosed subject matter.
Unless specifically stated otherwise, as apparent from the following discussions, it is appreciated that throughout the specification discussions utilizing terms such as “processing”, “computing”, “providing”, “determining”, “selecting”, “obtaining”, “scoring”, “calculating”, “transforming”, “fusing”, “pre-processing”, “associating”, “aggregating”, “normalizing”, “presenting”, “comparing”, “displaying”, “prioritizing”, “facilitating”, “superimposing”, “learning”, “organizing”, “proceeding”, “calibrating”, “receiving”, “detecting”, “generating”, “manipulating”, “identifying”, “filtering out”, “customizing” or the like, refer to the action(s) and/or process(es) of a computer that manipulate and/or transform data into other data, said data represented as physical, such as electronic, quantities and/or said data representing the physical objects.
The terms “computer”, “computer/computerized device”, “computer/computerized system”, or the like, as disclosed herein, should be broadly construed to include any kind of hardware-based electronic device with data processing capabilities including, by way of non-limiting example, the processor and memory circuitry (PMC) 120 disclosed in the present application. The processing circuitry can comprise, for example, one or more computer processors operatively connected to computer memory, loaded with executable instructions for executing operations, as further described below.
The terms “non-transitory memory” and “non-transitory storage medium” used herein should be expansively construed to cover any volatile or non-volatile computer memory suitable to the presently disclosed subject matter.
The operations in accordance with the teachings herein may be performed by a computer specially constructed for the desired purposes or by a general-purpose computer specially configured for the desired purpose by a computer program stored in a non-transitory computer-readable storage medium.
Embodiments of the presently disclosed subject matter are not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the presently disclosed subject matter as described herein.
In known methods of learning motor skills by a trainee user (also referred to hereinbelow as a trainee), the trainee is required to observe and mimic a motor skill, usually comprising an atomic move, of a teaching user (also referred to hereinbelow as a trainer). The trainee performs the motor skill, and his/her performance is tracked by sensors, e.g. by a video camera. The performance is processed, and then feedback on the performed motor skill is provided to the trainee. If the motor skill comprises an atomic move only, in some known solutions the system provides feedback to the trainee by reporting a set of measurements relating to the move performed by the trainee. Consider the example of a user performing a ball-kicking move. Known learning systems provide feedback to the trainee on the specific move of the kick, with reference to the force or speed of kicking the ball, as measured by sensors on the user or the ball. In an example of a tennis player hitting a ball with a racquet, the learning system may provide feedback relating to the angle of the hand holding the racquet.
However, while processing an atomic trainee move with reference to a trainer move can facilitate learning the specific move of the trainer, it is advantageous, according to certain embodiments of the presently disclosed subject matter, to provide a method that facilitates learning of motor skills that include a series of moves, in a personalized and automatic manner. According to certain embodiments of the presently disclosed subject matter, the motor skill can be divided, in a pre-processing stage, into individual, reusable moves. In some examples, a move may include an atomic unit, e.g. a smallest learning unit of a dance that the user learns. A “reusable” move can include a move that can be used or repeated in more than one dance of one or more trainers. In some examples, the learning system 100 can be associated with a database, e.g. stored in memory 220, including a library of stored moves. The moves in the library can appear in one or more dances. A learning process of a move, including analysis of the trainee's moves when learning the move, and the feedback that can be associated with performance of the move, can be stored in the library, and can be used in more than one dance, e.g. by retrieving the move and the learning data associated with the move, e.g. by learning system 100. For example, a move of a “left kick” or a “kick with left leg forward at knee height” may be present in more than one dance. In some examples, splitting a dance may include not only dividing the trainer video into individual moves, but also selecting certain, differentiable moves from a library of reusable moves. Hence, the motor skill can be divided, in a pre-processing stage, into segments, where a segment can include one move, a small, limited number of moves, or a set of consecutive moves. In such a manner, the motor skill can be learnt in stages. For example, in the first stage, each segment can be taught separately. Then, two segments can be combined, etc.
Therefore, in some cases, the motor skill can be divided into segments, where each segment comprises one or more moves of the motor skill. The trainee can learn each segment separately, and can receive feedback on each performed segment, thus optimizing the learning of the entire motor skill and achieving better performance of the motor skill. In some examples, the motor skill can be divided into segments in a hierarchical manner, such that the trainee can learn first shorter segments, including a small number of moves, and then proceed to longer segments including a larger number of moves, and, optionally, moves that were included in the shorter segments may now be combined with new moves.
In addition, according to certain embodiments of the presently disclosed subject matter, it is advantageous to further focus on the differences in aspects of the performance of the move, e.g. the accuracy, the timing, and the style aspects of the performed move when compared to the trainer move. Learning how to mimic the trainer can be enhanced by providing relevant feedback to the trainee regarding the aspects of the performance, e.g. for each segment separately, in an automatic manner. For instance, two different trainee users might do the same move (e.g. ball dribbling or playing the guitar) in an accurate and similar manner when compared to a trainer move, yet with different style. Hence, each trainee should receive different feedback. While both feedbacks may include an indication of the high similarity to the trainer move, each feedback should focus on other aspects of the performed move, such as the style of the move. Hence, it is advantageous to process the trainee movement with respect to various aspects, such as accuracy, timing and style, and to provide feedback based on the aspects of the performance, such that feedback facilitates the trainee to improve the performance of a future move with respect to the trainer move.
Reference is made to
In some examples, the learning system 100 comprises a processor and memory circuitry (PMC) 120 comprising a processor and a memory (not shown), communication interface 130, a camera 140 operatively communicating with the PMC 120 and a monitor 150. In some examples, the learning system 100 is configured to implement a computerized method in a trainee's mobile device, while the camera 140 can be the mobile device's camera and the monitor 150 can be the screen of the mobile device. The learning system 100 is configured to facilitate learning of motor skills of a trainee 110, in a personalized and automatic manner.
According to certain embodiments of the presently disclosed subject matter, trainee 110 tries to learn a trainer's motor skill including an atomic move or series of moves by mimicking the trainer's moves. In some examples, the learning process can include several phases. A Presentation Phase may include a presentation of the trainer's moves, e.g. by presenting to the trainee a video of the trainer with execution of a motor skill. In some examples, execution of the motor skill includes execution of the correct moves of the motor skill, however, execution may include presentation of one or more common mistakes as well.
The learning process can also include a Learning Phase. The learning phase may include a phase where the trainee is prompted (implicitly or explicitly) to perform the motor skill including the moves, as shown in the trainer video. The trainee's moves are tracked by the learning system 100, analysed, and the trainee is then provided with feedback on his performance.
In some examples, the Learning Phase can include a journey flow. The journey flow may include a plurality of selectable items (referred to also as segments of the motor skill), that can be displayed to the trainee, where one item can be the entire dance, and each other item can include a portion of the dance, e.g. a move or several moves of the dance. The trainee can learn the dance by selecting, in a repetitive manner, portions of the dance to learn, as included in the plurality of displayed items. In some examples, the items to be displayed can be automatically selected by learning system 100. The journey flow is created individually for each trainee, based on the order in which the items are selected. The journey flow is further described below.
It should be noted that although the presentation phase and the learning phase are described separately and sequentially, this should not be considered as limiting, and those versed in the art would realise that the phases can be executed in a reversed order, or alternately.
Referring to
In order to execute the learning process, by starting from the presentation phase, in some cases a trainer video may be obtained by obtaining module 230, e.g. by retrieving the trainer video from memory 220 operatively communicating with PMC 120. The trainer video may be displayed on monitor 150 appearing in
Reference is made to
Another example of a manipulation on a move can include a trainer video incorporating a visual overlay. A visual overlay can be incorporated in either or both of the trainer and trainee videos (usage of the visual overlay on the trainee video may also be part of the feedback provided to the trainee, and is further referred to below as visual cues). The visual overlay can be displayed alongside the videos and/or over one or more of the videos, e.g. by superimposing it on the trainer's or the trainee's videos. The visual overlay can include one or more visual guiding symbols highlighting portions of a move, such as circles, directional arrows, springs, waves, balls, lightning, and others. In some examples, the semantics of the visual guidance are related to the semantics of the move, such that the visual guidance may highlight one or more notable characteristics in the body pose or the move (for example, a water animation may “flow” in the direction to which the trainee should move his arm), or an easy association to the move (like a turning hourglass used for the “flip hands” move). For example, the visual guidance may be displayed as a visual overlay, simultaneously with the trainer's move, optionally corresponding to the characteristic in the body pose that the visual guidance intends to highlight or note to the user. For example, visual guidance of sticks crossing can be associated and displayed next to arms that should be crossed in a body pose, or visual guidance of springs being pushed down can be associated and displayed next to arms that should be pushing down in a body pose. Alternatively or additionally, a screenshot of the trainer's video, including the body pose associated with the visual guidance, together with the visual guidance, can be displayed to the trainee. Reference is made to
In some examples, one or more frames from the trainer's video can be selected e.g. by learning system 100 to include body poses that should be associated with one or more visual guidance steps. Next, the body poses can be associated with visual guidance, and a corresponding visual overlay can be added to the trainer's video in the respective segment that the body poses appear. In some examples, a visual guidance is shown at least for one move. In some examples, a visual guidance is shown for more than one or every move in a dance, or in a segment of dance. Yet, in some examples, the visual guidance is shown on the trainer's video during the presentation phase, where the entire segment or dance is shown to the trainee.
In some examples, the visual guidance can be added in real time, and can also be shown on the trainee's video, e.g. during the learning phase. In order to show visual guidance, the locations of joints are detected, so that these body parts can be associated with one or more visual guidance symbols. For example, according to the location of the joint of the trainer in the video, the visual guidance on the trainee video will be added over, or in a position defined relative to, the coordinates of the corresponding joint of the trainee. The visual guidance may also be added over virtual joints that are created via interpolation between known joints from the computer vision model. Interpolation is done with a low-degree spline, taking anatomic prior knowledge into account, for example, that the shoulder-line is perpendicular to the outline of the side of the body. In some examples, the best approximation for objects that are only partially rigid (such as the body) is thereby achieved. In many practical scenarios, linear interpolation would also yield adequate solutions and can be used; due to its advantage of consuming fewer resources, it may be preferred on low-end edge devices. Reference is made to
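By way of non-limiting example, placing a virtual joint between two known joints can be sketched as follows. The sketch uses plain linear interpolation (the resource-saving variant mentioned above); the joint names, normalized coordinates, and midpoint ratio are illustrative assumptions only:

```python
# Illustrative sketch: create a "virtual" joint between two joints
# detected by a pose-estimation model, via linear interpolation.
# Joint names and coordinates are assumed, not part of any specific API.

def interpolate_joint(joint_a, joint_b, t=0.5):
    """Linearly interpolate a virtual joint at fraction t along the
    segment from joint_a to joint_b (each an (x, y) tuple)."""
    ax, ay = joint_a
    bx, by = joint_b
    return (ax + t * (bx - ax), ay + t * (by - ay))

# Example: approximate a mid-spine joint halfway between neck and pelvis.
neck, pelvis = (0.50, 0.20), (0.50, 0.60)
mid_spine = interpolate_joint(neck, pelvis)   # ≈ (0.5, 0.4)
```

A visual guidance symbol can then be rendered at the virtual joint's coordinates, exactly as it would be for a joint reported directly by the model.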
Reference is made back to
In some examples, the trainer video is divided into a hierarchy of the plurality of selectable segments, the segments representing a hierarchical learning flow with levels of hierarchy. A segment at a low level of the hierarchy can include one move, or a small number of moves. A segment at a higher level of the hierarchy can include (i) all the moves included in at least one lower level selectable segment of the hierarchy and, (ii) at least one move of the plurality of consecutive moves that is not included in the at least one lower level selectable segment of the hierarchy. The hierarchy may continue upward indefinitely. The hierarchy may correspond to a post-order depth-first traversal of a binary tree, as known in computer science. Dividing the motor skill and executing the learning phase, while tracking the hierarchy in the journey menu, may be advantageous, as it may ease the learning of the dance and result in faster and more efficient learning of the dance, and a high performance score of the trainee. It should be noted that the hierarchy may define an optional, optimal order of executing the learning phase, which will result in more efficient learning of the dance. As illustrated in
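By way of non-limiting example, the correspondence between the segment hierarchy and a post-order depth-first traversal of a binary tree can be sketched as follows. The move names and the even split of moves are illustrative assumptions; the traversal yields a learning order in which both sub-segments of a segment are always practiced before the combined segment:

```python
# Illustrative sketch: leaves of the binary tree are single moves;
# an internal node is the segment combining its children's moves.

def build_tree(moves):
    """Recursively split a list of moves into a binary segment tree."""
    if len(moves) == 1:
        return {"moves": moves, "children": []}
    mid = len(moves) // 2
    return {"moves": moves,
            "children": [build_tree(moves[:mid]), build_tree(moves[mid:])]}

def post_order(node, out):
    """Post-order traversal: visit both children before the parent."""
    for child in node["children"]:
        post_order(child, out)
    out.append(node["moves"])
    return out

order = post_order(build_tree(["step", "kick", "turn", "clap"]), [])
# Each combined segment appears only after both of its sub-segments.
```

With the four assumed moves, the traversal visits each single move, then the two-move segments, and finally the full sequence, matching the learning flow described above.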
Reference is made back to
It is noted that the teachings of the presently disclosed subject matter are not bound by the learning system described with reference to
Referring to
Turning to
In some examples, once the presentation phase is over, learning system 100 can execute the learning phase. The learning phase can commence by the trainee selecting one of the selectable segments displayed on monitor 150. For example, the trainee can select, from the displayed segments, one of the segments to start. Alternatively or additionally, PMC 120 can automatically select a first selectable segment to display. Since, as explained, the segments may be provided in a hierarchical manner, in which the first segments are shorter and include the first moves of the motor skill, automatically selecting the first segment by PMC 120 may in itself facilitate the motor learning of a motor skill by a trainee.
Once a segment is selected, it may be displayed to the trainee on the monitor 150. In some examples, the trainee's video can be displayed together with the trainer's video. Optionally, the trainee can define e.g. via a menu, the proportion and size of each displayed video on the viewport, e.g. the size of the trainer's video, and the size of the trainee's video or camera area. Irrespective of whether the trainee selects the segment, or PMC 120 automatically selects the segment, PMC 120 receives data indicative of a selected segment displayed to the trainee (block 420). In some examples, PMC 120 can mark each segment that was selected, e.g. by highlighting the selected segment, to ease tracking of the journey and the selected segments of the individual trainee. With reference to
Following display of the segment, the trainee 110 tries to mimic the trainer's move and performs the moves included in the displayed segment. In some examples, the trainee 110 performs the moves simultaneously with the display of the segment. Alternatively, the trainee 110 can watch the trainer's video and then perform the moves, optionally together with another display of the same segment.
In some examples, camera 140 captures a trainee video. PMC 120 receives the trainee video from camera 140 (block 430). In case PMC 120 operates on the trainee's device, PMC 120, e.g. using obtaining module 320, receives the trainee video directly from camera 140. In case PMC 120 operates on a server remote from camera 140, PMC 120, e.g. using obtaining module 320, can receive the trainee video e.g. by communicating with camera 140 and receiving data indicative of the trainee video. The trainee video comprises at least one trainee move.
In some examples, in order to improve the analysis of the performance of the trainee's moves in the captured trainee video, and the comparison to the trainer's body proportions, so as to facilitate providing the feedback in a more accurate manner, a process of calibration can be performed. The calibration process assists in measuring the individual's body proportions, e.g. the trainee's individual limb lengths. In some examples, calibration can be performed one time, e.g. at the beginning of the learning phase, and/or at the beginning of a segment. Yet, in some examples, the learning system 100 can be integrated in a mobile device, while the camera 140 can be the mobile device's camera. In such examples, during the learning, the trainee keeps moving towards the camera and back, e.g. in order to better view the feedback that is displayed on his performance of the previous segment. In order to better analyse the trainee's moves, it may be advantageous to perform calibration at the beginning of every segment. As opposed to known solutions which require calibration, and in which the calibration phase requires an individual to stand at a specific distance in a specific pose for the camera to capture the individual, according to certain embodiments of the presently disclosed subject matter the calibration can be performed by enabling a trainee to stand freely at a selected distance from the camera. PMC 120 can provide a visual calibration shape to be displayed on the trainee's device, wherein displaying the visual calibration shape facilitates calibration of the captured trainee video with a trainee's position. For example, the calibration can include the camera capturing the surroundings of the trainee, with the trainee, and displaying a visual shape, e.g. a rectangle, in relation to which the trainee should stand. For example, during calibration, the trainee may be required to stand inside the rectangle with a specified set of joints (e.g. it can be required that the trainee's upper body is inside the rectangle, or that the trainee's full body is inside the rectangle). Enabling calibration where the trainee stands freely gives the trainee flexibility, while it guarantees that analysis of the trainee's moves, as captured by the camera, is optimally processed on the visible trainee's joints.
In some examples, once the trainee performs the moves, the body pose of the trainee may be processed in conjunction with the rectangle shown, to notify the trainee in case the position needs to be changed, e.g. if the trainee is too close, too far away, or outside of the rectangle. The trainee can be notified so that the trainee can correct his/her position. Reference is made to
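By way of non-limiting example, processing the trainee's body pose in conjunction with the displayed rectangle can be sketched as follows. The joint names, the coordinates normalized to the camera frame, and the notification message are illustrative assumptions:

```python
# Illustrative sketch: verify that a required set of detected joints
# falls inside the displayed calibration rectangle, and report which
# joints need correcting otherwise.

def check_calibration(joints, rect, required=("head", "left_ankle", "right_ankle")):
    """joints: dict name -> (x, y); rect: (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = rect
    outside = [name for name in required
               if not (x_min <= joints[name][0] <= x_max
                       and y_min <= joints[name][1] <= y_max)]
    if not outside:
        return "calibrated"
    return f"adjust position: {outside} outside rectangle"

joints = {"head": (0.5, 0.1), "left_ankle": (0.45, 0.9), "right_ankle": (0.55, 0.9)}
print(check_calibration(joints, (0.2, 0.05, 0.8, 0.95)))  # prints "calibrated"
```

The same check, run continuously during the learning phase, can drive the "too close / too far / outside" notifications mentioned above.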
Once the trainee video is obtained, PMC 120, e.g. using analysis module 240, processes the at least one trainee move in the trainee video with respect to at least one corresponding move of the trainer, as included in the selected segment, to obtain a segment performance score (block 440). The segment performance score is indicative of a performance of the at least one trainee move in relation to the at least one corresponding move of the selected segment.
In some examples, processing the trainee moves to provide a segment performance score can be done by processing the trainee video and the trainer video to provide a move performance score (block 450). A move performance score is indicative of the performance of one trainee move of the at least one trainee move in relation to one trainer move of the plurality of consecutive moves. The processing is done with respect to at least one aspect of the performance, e.g. by performing one or more of analysing keyframe similarity (block 460), analysing keyframe timing (block 470) and analysing motion dynamics (block 480). Processing the trainee video and the trainer video to provide a move performance score is further described below with respect to
It should be noted that the move performance score should not be confused with a segment performance score. The move performance score is indicative of the performance of a single trainee move in relation to a single trainer move of the trainee's and the trainer's moves, where the performance can be evaluated with respect to various aspects of the performance.
In cases where each selected segment includes one move only, the result of processing of the trainee move with respect to a corresponding move included in a selected segment is a move performance score, which is identical to the segment performance score. The segment performance score is indicative of a performance of the trainee move in relation to the corresponding move of the selected segment. In cases where the trainee video and the selected segment include more than one move, processing the trainee moves with respect to corresponding moves included in the selected segment results in a number of move performance scores, each calculated for one move of the trainee's moves with respect to corresponding moves in the selected segment. The number of move performance scores can be fused into a segment performance score, e.g. by aggregating them. Aggregation of the number of move performance scores can be done in a similar manner to that described with respect to fusing the scores of various aspects in block 790 of
Analysis module 240 comprises similarity module 250, timing module 260, and motion dynamics module 270. Each of the modules 250, 260, and 270 is configured to analyse the trainee moves with respect to at least one aspect of the performance, and to provide an aspect score. For example, similarity module 250 is configured to analyse the similarity of the trainee move to that of the trainer move, and to provide a similarity score. Timing module 260 is configured to analyse the timing of the trainee move, and to provide a timing score. Motion dynamics module 270 is configured to analyse the style of the trainee move, when considering various motion dynamic features of the trainee move, and to provide a motion dynamics score. Following is a detailed description of analysing the aspects of performance, as performed by modules 250, 260, and 270, with reference to
PCT Application No. PCT/IL2021/050129, “Method of scoring a move of a user and system thereof”, filed on Feb. 3, 2021, includes some examples of analysis of the trainee's move in the trainee's video, with respect to similarity, timing and motion dynamics features, in order to provide a move performance score with respect to various aspects of the move, and is incorporated herein in its entirety. The content of PCT/IL2021/050129 is also added in the following description. However, it should be noted that other, known per se, methods can be used to process the trainee's moves with respect to the trainer's moves in order to provide a segment performance score.
Reference is now made to
In some examples, each of the trainer and trainee videos includes at least one move. For example, the trainer video may include a trainer move of putting the hand down. In some cases, in order to provide a move performance score for a trainee move, it may be advantageous to divide the trainer move, and the trainee move, into frames and keyframes. A frame, as known in the art, may include a shot in a video, with 2D or 3D representation of the skeleton of a person appearing in the shot, or other vector-based representation of a person, at a particular point in time of the video. A move of the person can be represented by a sequence of frames. In examples where a sequence of trainee images is received (instead of a trainee video), each image can be referred to as a frame. A keyframe should be expansively construed to cover any kind of a subset of a sequence of frames, typically defining a starting or ending point of a transition in a move. In some examples, keyframes can distinguish one frame sequence from another, and can be used to summarize the frame sequence, such that it is indicative of the move of the person in the sequence of frames.
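By way of non-limiting example, one simple way to select keyframes that summarize a frame sequence is to keep a frame whenever the pose has changed by more than a threshold since the last kept keyframe. This criterion, the threshold value, and the flat pose representation below are illustrative assumptions, not the only possible keyframe definitions:

```python
# Illustrative sketch: pick keyframes from a frame sequence by keeping
# a frame whenever the pose has moved more than a threshold since the
# last kept keyframe. Each frame is reduced to a flat tuple of joint
# coordinates.

def select_keyframes(frames, threshold=0.3):
    """frames: list of pose vectors (tuples of floats). Returns indices."""
    keyframes = [0]                      # always keep the starting pose
    for i in range(1, len(frames)):
        ref = frames[keyframes[-1]]
        dist = sum((a - b) ** 2 for a, b in zip(frames[i], ref)) ** 0.5
        if dist > threshold:             # transition detected -> keyframe
            keyframes.append(i)
    return keyframes

poses = [(0.0, 0.0), (0.05, 0.0), (0.5, 0.0), (0.5, 0.05), (1.0, 0.0)]
print(select_keyframes(poses))  # prints [0, 2, 4]
```

The kept frames mark the start and end points of the larger transitions, which is the role keyframes play in the scoring described below.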
As illustrated in
As explained, once the trainer and trainee videos are obtained, PMC 120, e.g. using analysis module 240, is configured to process the trainee move in the trainee video, based on the trainer move included in the trainer video, in order to provide a move performance score. Analysis module 240 comprises similarity module 250, timing module 260, and motion dynamics module 270. Each of the modules 250, 260 and 270 is configured to analyse the trainee move with respect to at least one aspect of the performance, and to provide an aspect score. For example, similarity module 250 is configured to analyse the similarity of the trainee move to the trainer move, and to provide a similarity score. Timing module 260 is configured to analyse the timing of the trainee move, and to provide a timing score. Motion dynamics module 270 is configured to analyse the style of the trainee move, considering various motion dynamic features of the trainee move, and to provide a motion dynamics score. Analysing the aspects of performance, as performed by modules 250, 260 and 270, is further described below with respect to
The calculated aspect scores can be transformed, e.g. by transformation module 280, giving rise to a move performance score. For example, a similarity score can be calculated for each trainee frame. Transformation module 280 is configured to aggregate the similarity scores of all trainee frames into a single move performance score. In some examples, if more than one aspect score is calculated, then the scores of each aspect for each frame can be fused, e.g. using a conditional aggregation, giving rise to the move performance score. Fusing several aspect scores is further described below with respect to
Referring to
The performance of a move of a trainee in an input frame sequence, e.g. a trainee video or a sequence of images, can be processed and scored in relation to a trainer move in a trainer video. As explained above, in order to process the trainee move in a more accurate manner, it may be advantageous to process the frames included in the move. In some examples, the obtained trainer video includes two or more trainer keyframes. The obtained trainee video comprises a trainee user move, where the trainee user move can comprise a plurality of trainee frames. As mentioned above, the description is provided for processing a video of a trainee, e.g. as captured by a camera. Those skilled in the art will readily appreciate that the teachings of the presently disclosed subject matter are, likewise, applicable to any other medium that is configured to provide a frame input, including a move of a user, such as a sequence of trainee images.
Analysis module 240 can then process the plurality of trainee frames to provide a move performance score (block 720). Block 720 corresponds to the stage of processing the trainee video and the trainer video to provide a move performance score (block 450 of
Analysis module 240 can select, for a trainer keyframe, a corresponding trainee frame of the plurality of trainee frames. The selected trainee frame constitutes a candidate trainee frame. In some examples, analysis module 240 can select, for a trainer keyframe, more than one corresponding trainee frame, constituting candidate trainee frames. In some examples, for each trainer keyframe, one or more corresponding trainee frames are selected and constitute candidates. A matching trainee frame to the trainer keyframe can then be selected from the candidate trainee frames.
Selection can be based on a selection criterion. For example, the selection criterion can be a time criterion. Selecting according to the time criterion can include selecting one or more trainee frames that appear in the trainee video within a time window around the time point of the trainer keyframe in the trainer video. The term “time window around a time point” should not be considered as limiting and should be interpreted in a broad manner. In some examples, the trainee time window includes a time interval comprising a predefined time before and/or after the time point at which the trainer keyframe appears in the video.
Reference is now made to
For each trainer keyframe KF1 and KF2, one or more trainee frames F3-F10 may be selected as candidates. For example, for trainer keyframe KF1 appearing at time point w1, a predefined +2/−2 time interval may be determined, and trainee frames F3-F7, appearing in a time window w3 that is around time point w1, can be selected as candidates. Yet, in some examples, the predefined time may be 0. In such examples, the trainee time window is identical to the trainer time point, resulting in selection of one candidate trainee frame for each trainer keyframe, namely the candidate appearing at the same time point in the trainee video as the trainer keyframe appears in the trainer video. For example, F5 may be selected for KF1, and F8 may be selected for KF2. It should be noted that in some examples, the trainee time window can include the entire trainee video, and the candidates for a trainer keyframe can be selected from the entire trainee video. However, selecting candidates from a time window that is shorter than the entire trainee video may optimize the process, and require less computational time for processing.
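By way of a non-limiting sketch, the time-window candidate selection described above can be expressed as follows. The frame timestamps, the one-second frame spacing, and the ±2 second window are assumptions made for illustration only, not part of the disclosed method:

```python
# Illustrative sketch: selecting candidate trainee frames whose timestamps
# fall within a predefined window around the time point of a trainer keyframe.

def select_candidates(trainee_frames, keyframe_time, window=2.0):
    """Return trainee frame ids within [keyframe_time - window, keyframe_time + window].

    trainee_frames: list of (frame_id, timestamp_in_seconds) tuples.
    A window of 0 selects only the frame at the keyframe's own time point.
    """
    return [fid for fid, t in trainee_frames
            if keyframe_time - window <= t <= keyframe_time + window]

# Trainee frames F3..F10 at a hypothetical one-second spacing.
frames = [(f"F{i}", float(i)) for i in range(3, 11)]

# Trainer keyframe KF1 at t=5 s: F3..F7 fall inside the +/-2 s window.
print(select_candidates(frames, 5.0))            # ['F3', 'F4', 'F5', 'F6', 'F7']
# Zero-width window: only the frame at the same time point is selected.
print(select_candidates(frames, 5.0, window=0))  # ['F5']
```

As noted above, widening the window trades additional computation for more candidates per keyframe.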
Referring back to
Once one or more aspect scores are calculated, the scores may be transformed to provide a move performance score (block 780). This last stage is further described below.
Following is a description of three exemplary aspects of the performance and calculating aspect scores for each of them as described in block 740 in
The similarity aspect may measure the extent to which the trainee move is similar and accurate with respect to the trainer move. In some examples, in order to evaluate similarity of moves, body parts in a trainer keyframe can be compared to body parts in the candidate trainee frame. Body parts in a pose can be defined by joint pairs, e.g. by their start and end joints, where a joint may be regarded, as known in computer vision terminology, as a structure in the human body, typically, but not exclusively, at which two parts of the skeleton are fitted together.
In order to calculate the similarity aspect score, the angular differences between body parts in a pose of the trainer in the trainer keyframe, and body parts in a pose of the trainee in the candidate trainee frames, can be computed. Reference is now made to
Referring now to
A trainer keyframe can include a pose. A trainer pose can be obtained (block 1010), e.g. by defining a set of several body parts appearing in the trainer keyframe. Each body part may be represented by a vector from the start joint to the end joint. A trainee pose from a candidate trainee frame can be obtained, e.g. using known per se techniques (block 1020), for example, by defining a set of several body parts appearing in the candidate trainee frame. Each body part may be represented by a vector from the start joint to the end joint, and may correspond to a respective body part in the trainer pose, defined by the same start and end joints. This enables comparison between the vectors. In
For at least one body part included in the trainer keyframe, and at least one corresponding body part included in the candidate trainee frames, similarity module 250 can compute the angular difference between the body parts (block 1030). For example, similarity module 250 can compute the angular difference between the vectors of the body parts, e.g. as illustrated in
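A minimal sketch of the angular-difference computation of block 1030, assuming 2D joint coordinates; the joint positions and the forearm example are hypothetical:

```python
import math

def body_part_vector(start_joint, end_joint):
    # A body part is represented as a vector from its start joint to its end joint.
    return (end_joint[0] - start_joint[0], end_joint[1] - start_joint[1])

def angular_difference(v_trainer, v_trainee):
    """Angle, in degrees, between corresponding trainer and trainee body-part vectors."""
    dot = v_trainer[0] * v_trainee[0] + v_trainer[1] * v_trainee[1]
    norms = math.hypot(*v_trainer) * math.hypot(*v_trainee)
    # Clamp to guard against floating-point drift outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norms))))

# Trainer's forearm (elbow -> wrist) points straight down; the trainee's
# forearm is rotated 30 degrees away from vertical.
trainer = body_part_vector((0.0, 1.0), (0.0, 0.0))
trainee = body_part_vector((0.0, 1.0), (0.5, 1.0 - math.sqrt(3) / 2))
print(round(angular_difference(trainer, trainee)))  # 30
```

A small angular difference indicates that the trainee's body part closely follows the trainer's; larger differences reduce the similarity aspect score, as described below.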
In some examples, using angular differences, such as angle E illustrated in
Another example of pre-processing relates to the dimension of the vector representing body parts. Based on information captured by a non-depth camera, a 2-dimensional vector of body parts can be formulated. In some examples, a depth plane extension can be predicted from the 2-dimensional information, to formulate a 3-dimensional vector representing the body parts. For example, this can be done using the principle of projection: the projection of a line segment with a known length appears shorter when the start and end points are not located at the same depth. Representing a body part as a 3-dimensional vector may be advantageous, as computing the angular differences between the body parts becomes more accurate, since it accounts for rotation in the depth plane as well.
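The projection principle can be illustrated by the following sketch; the known limb length is an assumption for illustration (in practice it might be obtained, e.g., from calibration or body-proportion priors):

```python
import math

def lift_to_3d(vec_2d, true_length):
    """Extend a 2D body-part vector with a depth component.

    The 2D projection of a segment of known length appears shorter when its
    endpoints lie at different depths; the missing depth component is
    recovered from the Pythagorean relation. The sign of the depth is
    ambiguous from a single view, so the positive root is returned.
    """
    proj_length = math.hypot(*vec_2d)
    if proj_length > true_length:
        raise ValueError("projection cannot exceed the true length")
    depth = math.sqrt(true_length ** 2 - proj_length ** 2)
    return (vec_2d[0], vec_2d[1], depth)

# A limb of (hypothetical) true length 5 projects to a 2D vector of
# length 4, implying a depth offset of 3 between its joints.
print(lift_to_3d((0.0, 4.0), 5.0))  # (0.0, 4.0, 3.0)
```

The resulting 3-dimensional vectors can then be compared by the same angular-difference computation, now accounting for rotation in the depth plane.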
Based on the computed angular differences, similarity module 250 can calculate a similarity aspect score for a frame (block 1050, which corresponds to blocks 740 and 750 in
In some examples, the aggregation can be weighted, to indicate a predefined importance of the body parts in the pose, such that the computed angular differences between less important body parts will contribute less to the calculation of the similarity aspect score. In order to indicate a predefined importance of body parts in a pose, a body part of the trainer can be associated with a respective weight. The weight can be indicative of the importance of the body part in the pose. In the example of putting the hand down, low weights may be associated with the body parts of the legs, average weights may be given to the hand which is not moving, and high weights may be associated with the body parts of the hand which should be put down. One or more body parts may be associated with a zero weight, such that they do not contribute to the similarity aspect score. The associated weights can be stored, e.g. in memory 220, and can be retrieved by similarity module 250. In cases where a body part is associated with a respective weight, similarity module 250 can compute the angular difference between body parts, and associate the computed angular difference with the respective weight of the body part. The similarity aspect score can be calculated according to the associated respective weight. For example, the aggregation of the separate angular differences can be performed according to the associated respective weight of each body part.
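The weighted aggregation may be sketched as follows; the body-part names, the weights, and the linear mapping from angular difference to a per-part similarity are illustrative assumptions:

```python
# Sketch of aggregating per-body-part angular differences into a single
# similarity aspect score for a frame, weighted by predefined importance.

def similarity_score(angular_diffs, weights):
    """Weighted aggregation: each angular difference (in degrees) is mapped
    to a per-part similarity in [0, 1], then combined by the parts' weights.
    Zero-weight parts do not contribute to the score."""
    total_weight = sum(weights.values())
    score = 0.0
    for part, angle in angular_diffs.items():
        part_similarity = max(0.0, 1.0 - angle / 180.0)
        score += weights[part] * part_similarity
    return score / total_weight

# "Putting the hand down": the moving arm dominates, the legs barely matter.
angles = {"right_arm": 18.0, "left_arm": 9.0, "legs": 90.0}
weights = {"right_arm": 0.7, "left_arm": 0.2, "legs": 0.1}
print(round(similarity_score(angles, weights), 3))  # 0.87
```

With these weights, the large 90-degree error in the legs barely lowers the frame's score, while even a small error in the heavily weighted arm would.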
Alternatively or additionally, in some cases, predefined variations of the pose of the trainer are allowed, such that, despite a high angular difference being computed between body parts of the trainer and body parts of the trainee, the high angular difference contributes less to the similarity aspect score, resulting in a higher similarity score.
Alternatively or additionally, aggregation of the similarity of the separate body parts can be calculated using summary statistics, such as minimum, average, and percentile. Yet, in some examples, the aggregation can also be learnt, using known per se machine learning methods, by mapping the distances of one or more body parts to an overall similarity score. For example, machine learning methods can include regression, neural networks, or statistical approaches.
The calculated similarity scores can be stored by analysis module 250, e.g. in memory 220.
In some cases, the calculated similarity scores of the candidate frames can be transformed, giving rise to the move performance score (block 1060, which corresponds to block 780 in
Reference is now made to
In some cases, the calculated aspect scores of the candidate frames can be transformed, giving rise to the move performance score. The transformation function is further described below with respect to block 780 of
In some examples, one or more additional similarity analyses can be performed, in order to identify one or more additional insights on the performance of the move, and to provide suitable feedback. The additional similarity analysis can be performed based on the same trainee input frames, with respect to a second, different, set of trainer keyframes. The second set of keyframes can be predefined based on the keyframes of the trainer, and may reflect typical mistakes or possible variations of the trainer keyframes, for example, a wrong limb, a mirrored move, or a swapping of two moves. The second set of keyframes may be stored in memory 220 and obtained by similarity module 250. Additional similarity scores, calculated based on the additional similarity analysis, can be indicative of the performance of the move by the trainee. In case the additional set of keyframes reflects typical mistakes (e.g. an alternative trainer keyframe shows a body pose including the hand up, instead of the hand down in the trainer keyframe), then, as opposed to the regular similarity analysis, high similarity scores in the additional similarity analysis are indicative of low performance of the move. In case the additional set of keyframes reflects possible variations, then high similarity scores in the modified similarity analysis are indicative of high performance of a variant of the move. In addition, usage of calculated modified similarity scores to provide a move performance score is further described below with respect to timing analysis.
Attention is now reverted to a description of the timing aspect, and calculating a timing score, in accordance with certain embodiments of the presently disclosed subject matter. The timing aspect may analyse the timing of the trainee move with respect to the trainer move, while assuming that the trainer timing is accurate, and that the trainee should follow the trainer's timing. The timing analysis may assist in indicating whether the trainee performs the move at the same speed as the trainer. In some examples, based on the timing score that is calculated for the timing aspect, it will be possible to provide the trainee with feedback that his move is too fast, or too slow. In some cases, in order to analyse the timing aspect, it is required to process a segment from the trainer video that includes several trainer keyframes, and to process a corresponding segment from the trainee video that includes corresponding candidate trainee frames. The trainer keyframes and trainee frames in the segments are examined with respect to a plurality of timing parameters. A timing score can be calculated based on the timing parameters.
It should be noted that the timing analysis is independent of the similarity analysis described above, and is related to calculating the timing score with respect to timing parameters. However, processing the timing parameters on frames assumes that some or all of the trainee frames have a matching score indicative of a likelihood of match between the candidate trainee frame to a trainer keyframe. In some examples, the matching score can be calculated based on the similarity analysis, however, this should not be considered as limiting, and those skilled in the art will readily appreciate that the teachings of the presently disclosed subject matter are, likewise, applicable to other matching scores, calculated by other known, per se, techniques.
It should also be noted that while the similarity analysis (or any other matching analysis technique) aims to match a candidate trainee frame to a trainer keyframe in a local manner, considering each trainer keyframe individually, the timing analysis aims to match trainee frames to trainer keyframes in a more holistic manner, reviewing the optimal match of trainee frames to keyframes while considering some or all of the frames in the trainee video. As described further below, in some cases, although the similarity or matching score for a particular trainee frame is high, when timing analysis is performed, the trainee frame may not be selected as a matching frame. The reason is that, when processing the entire trainee video and several trainer keyframes, the timing parameters dictate that another trainee frame should be selected as a matching frame. Referring back to
To generally illustrate the timing aspect analysis, an example of an analysis of one timing parameter, the out-of-sync timing parameter, is provided. The general description of the timing analysis, and further examples of timing parameters, are described with respect to
F7 is marked in grey, as it has the highest matching score to KF1 of the trainer among all candidates of KF1. F6 is marked in grey, as it has the highest matching score to KF2 of the trainer among all candidates of KF2. Selecting a matching candidate based on the matching scores only would have resulted in selection of F7 as matching KF1, and selection of F6 as matching KF2. However, if F7 and F6 are selected for matching KF1 and KF2, respectively, the result would be that F6, appearing before F7 in the trainee video, matches a trainer keyframe, KF2, that is later than the trainer keyframe that F7 matches, KF1. When considering the example of the move that includes putting the hand down, where KF1 represents when the hand is up, and KF2 represents when the hand is down, in practice, selection of F7 and F6 would mean that the trainee first put down his hand (KF2 of the trainer), and then raised his hand up (KF1 of the trainer), in a manner opposite to the trainer. The holistic approach of processing KF1 and KF2 together, while considering timing parameters, and applying an out-of-sync timing parameter, may result in selection of either one of F3, F4 or F5 as matching KF1, such that F6 can be selected to match KF2. Applying a timing analysis may therefore be advantageous when aiming to match a trainee frame to a trainer keyframe more accurately, in a holistic and optimal approach, while processing the entire sequence of trainer keyframes and trainee frames, in order to provide the move performance score.
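The contrast between local and holistic selection can be sketched as follows; the matching scores are hypothetical, chosen to reproduce the F6/F7 situation described above:

```python
from itertools import product

# Sketch: instead of picking the top-scoring candidate per keyframe
# independently, all candidate pairs are scored together, and out-of-order
# (out-of-sync) pairings are disallowed.

def best_in_sync_match(candidates_kf1, candidates_kf2):
    """candidates_*: dict mapping frame index -> matching score.
    Returns the (KF1 frame, KF2 frame) pair with the highest summed score,
    among pairs where the KF1 frame precedes the KF2 frame."""
    best, best_score = None, float("-inf")
    for f1, f2 in product(candidates_kf1, candidates_kf2):
        if f1 >= f2:            # out-of-sync or coinciding: not selectable
            continue
        score = candidates_kf1[f1] + candidates_kf2[f2]
        if score > best_score:
            best, best_score = (f1, f2), score
    return best, round(best_score, 3)

# F7 scores highest for KF1 and F6 highest for KF2, but F7 appears after F6;
# the in-sync optimum therefore pairs an earlier frame with F6.
kf1 = {3: 0.4, 4: 0.4, 5: 0.4, 6: 0.1, 7: 0.7}
kf2 = {6: 0.8, 7: 0.1, 8: 0.3, 9: 0.2, 10: 0.1}
print(best_in_sync_match(kf1, kf2))  # ((3, 6), 1.2)
```

Note that frames F3, F4 and F5 tie under these scores; additional timing parameters, discussed further below, can break such ties.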
Referring now to
As described above, the timing analysis assumes that candidate trainee frames have been selected for a trainer keyframe, and that each candidate has been processed to indicate a matching score to the trainer keyframe. Therefore, in some cases, timing analysis module 260 can obtain, for at least two candidate trainee frames, a respective matching score (block 1310). The matching scores are indicative of a likelihood of match between the candidate and a trainer keyframe. In some examples, the matching score can be a similarity aspect score, as calculated by the similarity module, as described above. However, those skilled in the art will readily appreciate that the teachings of the presently disclosed subject matter are, likewise, applicable to matching scores calculated by other known per se techniques. With reference to
Based on the obtained matching scores, timing analysis module 260 can calculate the timing aspect score (block 1320). A trainer time interval from the trainer video can be obtained. The trainer time interval includes at least two trainer keyframes. With reference to
In some examples, the trainer time interval can include the entire trainer video, and, accordingly, the entire trainee video. Yet, in some other examples, the trainee time interval can be determined based on trainer keyframes that were included in the trainer time interval. The trainee time interval is determined to include all the candidates that correspond to the trainer keyframes included in the trainer time interval. For example, with reference to
In some cases, timing analysis module 260 can calculate a timing score for the at least two successive candidate trainee frames, with respect to one or more timing parameters (block 1340). The out-of-sync example described above is one example of a timing parameter. In some examples, the candidate frames can be scored in relation to other candidate frames. Assume, for example, a first candidate for a first trainer keyframe, and second and third candidates for a second trainer keyframe. The first candidate is scored in relation to each of the second and third candidates.
Following is a non-exhaustive list of optional timing parameters. In some examples, a timing parameter may apply a constraint on a candidate trainee frame, e.g. a Boolean constraint, such that if a condition is met, a candidate cannot be selected as a matching frame. This can be achieved e.g. by providing a zero or the lowest optional timing score to a candidate, or by associating an n.a. (not applicable) indication to a candidate, such that no score is applicable for that candidate. In some other examples, a timing score is calculated for a candidate based on a timing parameter, and the timing score may later be aggregated with other aspect scores, e.g. a matching score. Some optional timing parameters include at least:
The timing offset score of F3 would be an array of scores, including a high offset score with respect to F6 and a low offset score with respect to F7. An exemplary matrix, including the scores of each frame with respect to other frames, is described below with respect to
The above examples should not be considered as limiting, and those skilled in the art will readily appreciate that the teachings of the presently disclosed subject matter are, likewise, applicable to other examples of timing parameters.
Based on one or more of the above parameters, a timing score can be calculated for a candidate frame. For example, the timing score can be the score calculated for the offset timing parameter, and can be the offset in absolute seconds, preserving the sign, or some transformed value based on the raw difference (e.g. based on a non-linear kernel). In cases where the timing score is based on more than one timing parameter, the timing score can be calculated e.g. by aggregating the timing scores of some or all of the parameters. In case one or more of the parameters is associated with a weight, the timing score can be calculated based on the associated weight.
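One possible sketch of such a calculation, using a Gaussian kernel as the assumed non-linear transform (the kernel and its width are illustrative choices, not mandated by the present description):

```python
import math

def timing_offset(frame_time, expected_time):
    """Signed offset in seconds; negative means the trainee frame is early."""
    return frame_time - expected_time

def timing_score(offset_seconds, sigma=1.0):
    """Map a raw offset to (0, 1]: 1.0 for perfect synchronization,
    decaying non-linearly as the trainee drifts ahead of or behind the
    trainer. sigma controls how quickly the score falls off."""
    return math.exp(-(offset_seconds ** 2) / (2 * sigma ** 2))

# A trainee frame at t=4 s against a keyframe expected at t=5 s
# (hypothetical times): 1 second early.
print(round(timing_score(timing_offset(4.0, 5.0)), 3))  # 0.607
# Perfectly synchronized frame.
print(timing_score(timing_offset(5.0, 5.0)))            # 1.0
```

The signed offset itself can also be retained for feedback purposes, to tell the trainee whether the move was too fast or too slow.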
In cases where, for each trainer keyframe, only one candidate frame is selected from the trainee video, the timing analysis may be skipped, or may be performed under the determination that synchronization exists between the trainer keyframes and the trainee frames, providing an equal timing score for each of the trainee frames. However, in cases where several candidates are selected for one trainer keyframe, it may be advantageous to select one of the candidates as a matching trainee frame for each trainer keyframe. Hence, in some examples, after timing scores are calculated, the timing scores and the matching scores of each candidate frame can be aggregated to provide an optimality score (block 1350). The optimality scores can be indicative of an optimal selection of a matching candidate for a trainer keyframe. In some examples, the timing score and the optimality score can be indicative of a score of a frame with reference to another frame. This is further described below with respect to
After calculating an optimality score, in some examples, the candidate having the highest optimality score can be selected as a matching trainee frame for a trainer keyframe (block 1360). In some cases, selecting a matching frame is performed where more than one aspect of performance is evaluated, e.g. when similarity analysis and timing analysis are processed. This selection is further illustrated in
In some examples, a threshold may be determined for calculating optimality scores and selecting a matching candidate. In such examples, calculating optimality scores in the timing analysis can be performed only for candidates having a matching score above a predefined threshold. A reason is that if no frame is associated with a similarity score above a certain threshold, then there is no point in calculating optimality scores and selecting a matching frame based on the optimality scores. However, it is still advantageous to select a matching frame for each keyframe, e.g. to indicate the error. Therefore, in such cases, the matching frame may be selected based on one or more timing constraints, e.g. based on the ‘proximity to expected time’ timing parameter and the ‘time offset’ from the keyframe.
In some examples, after selecting a respective matching candidate for keyframes, a move performance score can then be calculated based on the calculated timing score, the optimality score, the matching scores or a combination thereof (block 1370).
Reference is now made to
The illustration of
For example, consider trainer keyframes KF1 and KF2 only. In table 1200, F3 has a similarity score of 0.4 for similarity to trainer KF1. This similarity score is aggregated with the similarity score of each of the candidates F6-F10 of KF2, resulting in the following aggregated scores for F3:
As shown in the above rows, F3-F5 were not candidates of KF2; hence, no scores could be aggregated, and the aggregated score for each of the cells F3/F3, F3/F4 and F3/F5 is denoted by n.a. F6 was scored 0.8 in the similarity score for KF2; hence, the aggregated score for cell F3/F6 is 1.2 (0.4+0.8). F7 was scored 0.1 in the similarity score for KF2; hence, the aggregated score for cell F3/F7 is 0.5 (0.4+0.1).
As mentioned above, some of the timing parameters may apply a constraint on a candidate trainee frame, e.g. a Boolean constraint, such that if a condition is met, a candidate cannot be selected as a matching frame. As illustrated in table 1200, F7 was scored 0.7 in the similarity score for KF1, and F6 was scored 0.8 in the similarity score for KF2, which would have resulted in an aggregated score of 1.5 in cell F7/F6. However, the out-of-sync parameter is applied to F7/F6, in this case, a constraint that the order of appearance of the trainee frames in the sequence should match the order of appearance of the trainer keyframes in the sequence, resulting in no aggregated score in cell F7/F6. For similar reasons, cell F7/F7 does not include an aggregated score. F7 was scored as similar both to KF1 and to KF2; however, a coincidence constraint prevents a single frame from matching two keyframes, hence, cell F7/F7 does not include an aggregated score. It is to be noted that the aggregated scores in
The aggregated scores calculated based on the matching scores in the above F3 row are also the optimality scores for frame F3. As illustrated, matrix 1400 includes the optimality scores for all candidates.
The optimality scores for each of cells F3/F6, F4/F6 and F5/F6 equal 1.2. These scores are marked in grey, to indicate that they are the highest scores in the table. Each of these cells indicates that selecting one of F3/F4/F5 as matching KF1, and selecting F6 as matching KF2, would yield the highest score for the trainee in an overall view of the entire move. In the current example, the highest optimality scores yield three equally suitable matchings. In all three matchings, trainee frame F6 is matched with trainer keyframe KF2, but the optimality score could equally well match trainee frame F3, F4, or F5 to trainer keyframe KF1.
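The construction of such an aggregated-score table may be sketched as follows; the similarity scores follow the figures quoted above, while the remaining candidate scores are hypothetical:

```python
# Sketch of an aggregated-score matrix: rows are KF1 candidates, columns are
# KF2 candidates. Each cell holds the sum of the two similarity scores, with
# "n.a." where a frame was not a candidate of the keyframe, or where the
# out-of-sync / coincidence constraints apply.

kf1_scores = {"F3": 0.4, "F4": 0.4, "F5": 0.4, "F6": 0.1, "F7": 0.7}
kf2_scores = {"F6": 0.8, "F7": 0.1, "F8": 0.3, "F9": 0.2, "F10": 0.1}
all_frames = [f"F{i}" for i in range(3, 11)]

def frame_index(name):
    return int(name[1:])

def cell(f1, f2):
    if f1 not in kf1_scores or f2 not in kf2_scores:
        return "n.a."                       # not a candidate of that keyframe
    if frame_index(f1) == frame_index(f2):
        return "n.a."                       # coincidence constraint
    if frame_index(f1) > frame_index(f2):
        return "n.a."                       # out-of-sync constraint
    return round(kf1_scores[f1] + kf2_scores[f2], 2)

matrix = {f1: {f2: cell(f1, f2) for f2 in all_frames} for f1 in all_frames}
print(matrix["F3"]["F6"])  # 1.2
print(matrix["F3"]["F7"])  # 0.5
print(matrix["F7"]["F6"])  # n.a.  (out-of-sync)
print(matrix["F7"]["F7"])  # n.a.  (coincidence)
```

Selecting the cell with the highest value then corresponds to selecting the optimal pair of matching frames.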
In some other examples, additional constraints and/or timing parameters in the timing analysis may be applied to select one matching candidate. For example, in the case of equally optimal matches, the candidate closest in time to the expected keyframe time is selected (in this case, F5 is closest in time to trainer KF1, and hence will have a higher score than F4; F4, in turn, will have a higher score than F3). Additional constraints can be applied, as described above, on the minimum distance between consecutive selected candidate frames (assume a difference of at least 1 second), or on a similar time offset between keyframes and respective frames. Both of these constraints or timing parameters result in a higher optimality score for, and selection of, F4 over F5 for matching KF1.
The timing scores for the selected matching frames F4 and F6 can be based on one timing parameter, e.g. the difference between the expected and actual times of the keyframes (F4 appeared 1 second sooner than KF1, and F6 appeared 2 seconds sooner than KF2).
In case additional similarity analysis is performed with respect to a second set of keyframes reflecting typical mistakes and/or possible variations, then the calculated modified similarity scores can be used in the above example, together with the similarity threshold, for calculating optimality scores of the candidate frames, to effectively provide a wide range of mistakes of the move in a flexible way.
It should be noted that the above timing analysis was described with respect to one trainer time interval. In some examples, once a first trainer time interval has been processed, candidates are scored, and, optionally, a matching frame is selected for each trainer keyframe in the time interval, the process proceeds by shifting the time interval to process the next trainer keyframes. In some examples, the time interval is shifted based on the distance between the last keyframe in the time interval and the next successive trainer time interval. In some examples, selecting matching frames for each trainer keyframe, or for some of the trainer keyframes, results in a sequence of selected matching frames for the trainer keyframes. This sequence of frames, which is a subset of all frames in the trainee video or sequence of images, comprises the frames in which the trainee tried to imitate the moves of the trainer. Hence, selecting the matching frames, and arriving at the sequence of selected frames, makes it possible to provide more accurate feedback to the user, which will enable the user to improve his future moves.
In some examples, once a candidate frame is selected as a matching candidate, and a move performance score is calculated based on the matching frames, a more accurate and efficient feedback can be provided, as the feedback may rely on insights learned, and focus on the matching frame and its score, compared to the trainer keyframe. Accordingly, feedback on how the trainee can improve the performance of a future move, relying on the insights learned from that matching frame, can be provided, to facilitate the trainee to improve performance of a future move with respect to the trainer move.
Attention is now reverted to a description of the motion dynamics aspect. While keyframe matching based on similarity and timing aspects may indicate the correctness of the move and its timing, the motion dynamics aspect relates to the style of the move and movement transformation between two trainer keyframes in a move. It should be noted that although the motion dynamics analysis is now described after performing the timing analysis, it should not be considered as limiting, and those versed in the art would realise that motion dynamics analysis can be performed before the timing analysis. Scores calculated during the motion dynamics analysis can be used as matching scores obtained by the timing analysis module 260, as an input to the timing analysis. The motion dynamics scores can be combined with other matching scores, such as calculated similarity scores, or may be used independently as matching scores.
In some cases, in order to process the move in relation to the motion dynamics aspect, successive trainer keyframes in the trainer video, and trainee frames in the trainee video, are processed. Motion features can be extracted from two successive trainer keyframes, and can be analysed in the trainee frames. For example, the velocity of change of the joints in the move of the trainer can be compared to the velocity of change of the joints in the move of the trainee. Other examples of motion dynamic features appear below.
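For instance, a velocity-based motion feature might be compared as in the following sketch; the joint positions, frame timings, and the ratio-based similarity are illustrative assumptions:

```python
# Sketch of one motion dynamics feature: comparing the velocity of a joint's
# change between two successive frames of the trainer against the same joint
# in the trainee frames.

def joint_velocity(pos_a, pos_b, dt):
    """Average speed of a joint between two frames dt seconds apart."""
    dx = pos_b[0] - pos_a[0]
    dy = pos_b[1] - pos_a[1]
    return ((dx ** 2 + dy ** 2) ** 0.5) / dt

def velocity_similarity(v_trainer, v_trainee):
    """Ratio-based similarity in [0, 1]: 1.0 when the speeds match."""
    if v_trainer == v_trainee:
        return 1.0
    return min(v_trainer, v_trainee) / max(v_trainer, v_trainee)

# Trainer's wrist drops 1.0 units in 1 s; the trainee's drops only 0.8 units
# in the same second, i.e. the trainee's motion is somewhat slower.
v_trainer = joint_velocity((0.0, 1.0), (0.0, 0.0), dt=1.0)
v_trainee = joint_velocity((0.0, 1.0), (0.0, 0.2), dt=1.0)
print(round(velocity_similarity(v_trainer, v_trainee), 2))  # 0.8
```

Analogous comparisons can be made for other motion dynamic features, such as acceleration or motion magnitude, as listed further below.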
Referring now to
Based on the trainer time interval, motion dynamics module 270 can determine a corresponding trainee time interval in the trainee video (block 1510). Reference is made to
Yet, in some other examples, the trainee time interval can be determined based on trainer keyframes that were included in time interval w1. In case matching frames have already been selected for each keyframe before motion dynamics analysis is performed, then the trainee time interval can be determined based on the matching frames, and can include at least the respective matching trainee frames to the trainer keyframes included in the time interval w1. With reference to
Motion features can be extracted from the trainer keyframes included in the trainer time interval. The motion features can relate to one or more of the following groups of features, or a combination thereof, and can be indicative of movement transformation between two keyframes:
The above list should not be considered as limiting, and those skilled in the art will readily appreciate that the teachings of the presently disclosed subject matter are, likewise, applicable to other motion features.
Referring back to
Each of the trainer time interval and the corresponding trainee time interval may be associated with a window size. In some examples, the window size associated with the corresponding trainee time interval is different than the window size associated with the trainer time interval, as illustrated by time intervals w1 and w2 in
Referring back to
A similarity analysis is then performed on frames F3-F10, and a similarity score is computed for each of F3-F10, as illustrated by the similarity scores with respect to KF1 and KF2. As illustrated, F6 and F7 are candidates for both KF1 and KF2, and can yield different similarity scores for different trainer keyframes. It is also to be noted that, based on the similarity scores, F7 has the highest similarity score for KF1, and F6 has the highest similarity score for KF2.
Next, motion dynamics analysis is performed on F3-F10. In this example, the motion magnitude similarity feature is evaluated, in order to consider the peak of the motion of the trainee. Motion magnitude scores are listed in table 1700.
The scores of the dynamic motion analysis and the similarity analysis can be aggregated, referred to in table 1700 as ‘similarity score for KF1+motion magnitude similarity’ and ‘similarity score for KF2+motion magnitude similarity’.
In some examples, the aggregated scores illustrated in table 1700 can constitute matching scores for the timing analysis to be performed next. F4 and F6 have the highest matching scores for KF1 and KF2, respectively.
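The aggregation step above can be sketched as follows — the scores are illustrative placeholders, not the actual values of table 1700, and the plain sum is only one possible aggregation consistent with the "similarity score + motion magnitude similarity" columns described in the text.

```python
def aggregate_scores(similarity, motion_magnitude):
    """Combine per-frame similarity scores for one keyframe with per-frame
    motion magnitude similarity scores into matching scores (here a plain
    sum), and pick the best-matching trainee frame."""
    matching = {f: similarity[f] + motion_magnitude[f] for f in similarity}
    best = max(matching, key=matching.get)
    return matching, best
```

With illustrative scores, the trainee frame whose combined score is highest becomes the matching candidate for that keyframe.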
Next, timing analysis is performed for a more holistic processing of the frames. The timing analysis may add constraints on the selection of matching frames, or compute a low timing score for some frames, resulting in a different selection of matching frames when the timing aspect is further processed. For example, the timing analysis can include constraints on out-of-sync keyframes, offset parameters, and the like.
In the example of table 1700, the timing constraints that are applied (not shown) do not change the scores, and as such, F4 and F6 remain the frames having the highest scores, now constituting optimality scores for F4 and F6. These frames can be selected as matching KF1 and KF2, respectively. The timing scores of F4 and F6 can be based on timing offset parameters of −1 and −2, respectively.
The scores calculated for F4 and F6 for the various aspects can be fused to provide a move performance score and suitable feedback.
It should be noted that the above is merely an example of the order of performing the aspects analysis. A different order of execution can be determined, e.g. based on the type of the move that the trainee tries to imitate. For example, for freestyle dances, it may be advantageous to select a different order, such that first the motion activity level is evaluated to calculate a motion dynamics score, and then the timing analysis is performed based on the motion dynamics scores (constituting the matching scores for the timing analysis). Once the matching candidates are selected in the timing analysis, only then are keyframe similarity aspect scores calculated for the matching frames. A move performance score can then be calculated based on the calculated scores, and suitable feedback can be provided.
Referring back to
In some examples, the similarity, timing and motion dynamics analysis provide indication on different aspects of the performance of the move. Transforming the computed scores of the aspects into a move performance score, based on which feedback is provided, is advantageous, since the aspect scores may be translated to high-level concepts of accuracy, timing, and style. The feedback may then be focused on specific aspects according to the scores, such that it facilitates the trainee to improve his/her performance. Thus, the learning process of the trainee imitating a trainer may go through different qualitative stages.
In cases where only one aspect is evaluated, a move performance score can be calculated based on a transformation of the scores calculated for each feature or parameter in that aspect. For example, an average, a geometric mean, or a learned method considering the informativeness of the individual scores can be applied to transform the scores into a move performance score.
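The single-aspect transformations named above can be sketched as follows; the function name and the weighted variant's interface are illustrative assumptions, with the weights standing in for a learned measure of each score's informativeness.

```python
from statistics import fmean, geometric_mean

def move_performance_score(feature_scores, method="mean", weights=None):
    """Transform the per-feature scores of a single aspect into one
    move performance score."""
    if method == "mean":
        return fmean(feature_scores)
    if method == "geometric":
        return geometric_mean(feature_scores)
    if method == "weighted":
        # weights reflecting each score's informativeness, e.g. learned from data
        return sum(w * s for w, s in zip(weights, feature_scores)) / sum(weights)
    raise ValueError(f"unknown method: {method}")
```

The geometric mean penalises a single very low feature score more strongly than the arithmetic mean, which may be desirable when every feature must be performed adequately.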
In cases where trainee frames are processed in relation to more than one aspect, transforming the scores to provide a move performance score includes fusing the scores of the various aspects (block 790). In some examples, in order to fuse one or more aspect scores, the scores of the matching frames and/or transformations thereof can be aggregated. In some other examples, the aggregation can be conditional, such that the transformation function of one calculated aspect score is determined or weighted based on one or more conditions pertaining to another calculated aspect score of a second aspect. The conditional aggregation is advantageous for providing a more accurate move performance score, since, as explained further below, different weights may be given to different aspects, depending on the scores of the aspects. For example, if no trainee frame is scored with a high similarity score, the timing is not relevant, and hence the timing scores and motion dynamics scores may be weighted with zero. In some examples, one or more weights for one or more aspects can be predefined.
Alternatively or additionally, the fusion of the aspects scores may include creating a summarization function, which depends on the aspects scores, or a combination thereof. One example of combining the aspects scores includes a three parameter function, for example:
In another example the following function can be applied:
w1 * similarity aspect score + w2 * timing aspect score + w3 * motion dynamics aspect score
Therefore, not only a hard ‘if’ threshold can be used; a softer threshold, including a logistic function, can also modulate the effect of one aspect score on another.
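A logistic modulation of the kind described above can be sketched as follows: instead of a hard "if similarity is too low, zero out timing and dynamics" rule, the similarity score gates the other aspects smoothly. The gate parameters, weights, and function names are illustrative assumptions only.

```python
import math

def logistic(x, midpoint=0.5, steepness=10.0):
    """Soft gate in (0, 1): near 0 well below the midpoint, near 1 above it."""
    return 1.0 / (1.0 + math.exp(-steepness * (x - midpoint)))

def fuse_aspects(similarity, timing, dynamics, w=(1.0, 1.0, 1.0)):
    """Conditional fusion of aspect scores: the similarity score modulates
    the timing and motion dynamics contributions through a logistic gate,
    so poor similarity suppresses the other aspects without a hard cutoff."""
    gate = logistic(similarity)
    return w[0] * similarity + gate * (w[1] * timing + w[2] * dynamics)
```

With zero similarity, good timing and dynamics scores contribute almost nothing to the fused score; with high similarity, they contribute almost fully.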
The fusion functions can be predefined and applied when necessary, or can be learned, using machine learning methods known per se.
Reference is made back to block 440 in
In some examples, the feedback generated by feedback module 290 can be audio feedback, textual feedback, and/or visual feedback. For example, visual feedback can include written text and body part components. The feedback may include specific corrections that should be made in the performance of the move, such that the feedback facilitates the trainee to improve the performance of a future move with respect to the trainer move. For example, the feedback may include a guiding statement of raising the hand all the way up, or of being faster. In some examples, the feedback may indicate differences between the performed trainee move and the trainer move, as processed in relation to the aspects of performance, e.g. based on the move performance score. For example, the generated feedback may pertain to one or more aspects of the performance of a move, as analysed by similarity module 250, timing module 260, and motion dynamics module 270. For example, if, according to the similarity analysis of the trainee's move with the trainer's move, it arises that the trainee raised his left arm instead of the right arm, as in the dance in the trainer's video, then, based on the similarity analysis, suitable feedback can be generated. Such feedback can include text indicating to the trainee the use of the wrong arm. Yet, in some examples, the aspects of the moves can be summarized in one metric that consolidates the overall performance of the move. Suitable feedback can be provided based on that metric.
Some challenges of learning motor skills lie in the fact that the feedback on the trainee's move should be interpretable and acceptable by the trainee in an efficient manner. As such, even in cases where the similarity analysis yielded a distance function between the move of a trainee and that of the trainer, it is advantageous not only to indicate the distance function, but also to ensure that this function focuses on relevant aspects of the performed trainee move. For instance, people might do the same move (e.g. ball dribbling, arms wave, or playing the guitar) with a different style. This is likely to result in a distance function indicating a slightly different motion pattern between the moves, while still having a high similarity performance. Hence, in such examples, the feedback can pertain to motion dynamics aspects of the move. Another example is when the analysis detects a delay in execution of the moves — based on either alignment of keyframes between the trainee and the trainer or alignment of the videos, the feedback can include a still image or a short video of the trainee's execution, potentially with the trainer's execution showing the offset in time (i.e. by showing the videos side by side, the trainee can see that their execution is delayed compared to the expected execution of the trainer). Optionally, a short text (e.g. "Hurry up") or an icon (e.g. a turtle) can further be added, and can assist in providing this feedback.
In case the timing analysis reveals that a keyframe is not matched in a certain move, and returns (1) that the keyframes are not matched, (2) which body part is responsible for the non-match, and (3) the location of all joints and body parts in the move as detected by a computer vision algorithm, the generated feedback can be based on the semantic information from (1) and (2), and can include textual or audio feedback, e.g. “Pay attention to your left arm” or “Try again and correct the move of your left arm”. The feedback can also include visual feedback based on (3), which highlights the body part.
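Mapping the three timing-analysis outputs to feedback, as described above, can be sketched as follows. The data shapes and function name are hypothetical; only the use of outputs (1) and (2) for the text and (3) for the highlight is taken from the text.

```python
def generate_feedback(keyframe_matched, body_part, joint_locations):
    """Turn the timing analysis outputs — (1) whether the keyframe matched,
    (2) the body part responsible for a non-match, and (3) detected joint
    locations — into textual feedback plus a highlight for visual feedback."""
    if keyframe_matched:
        return None  # no corrective feedback needed for a matched keyframe
    text = f"Pay attention to your {body_part}"
    highlight = joint_locations.get(body_part)  # location used for the overlay
    return {"text": text, "highlight": highlight}
```

The returned highlight location could then drive a visual overlay on the trainee video, while the text is shown or spoken.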
Feedback can also be generated based on meeting pre-defined conditions with respect to timing analysis, keyframe matching, and/or motion dynamics analysis. For example, when a swing with the left arm is late (resulting from the timing analysis) and a robotic move is performed instead of a smooth move (resulting from the motion dynamics analysis), the feedback can include a visual/audio feedback of “move left arm smoothly and a bit sooner on your next try”. In some examples, the feedback can be customized by selecting feedback that corresponds to mistakes performed by the trainee.
In some examples, based on the similarity analysis, the feedback can include a body part of the trainee/trainer, e.g. a screenshot of the trainee/trainer video with the body part shown. Hence, where the trainee move is defined by a set of joints, as e.g. used by the similarity analysis, feedback module 290 can select, based on the at least one aspect of the performance, such as the similarity analysis, a joint included in a processed trainee move to cut out. The generated visual feedback can include at least a display of the selected joint. The relevant body part may be zoomed onto, focused by blurring the rest of the image, or cut out otherwise, to assist in highlighting the trainee's mistake. The center of the cutout may be defined based on a preset location on the screen, and/or based on the actual joint location, e.g. as detected by the machine learning model. The radius of the cutout may be a predetermined parameter, or may be based on the length of the body part, e.g. the length of the limb, or twice the length of the limb, where the start or the end point of the limb is the target joint that should be the center of the cutout. In some examples, the cutout may be a different geometrical shape and is not constrained to be a circle. The cutout joint may be selected automatically based on the comparison between the trainer and the trainee, or specified in a configuration beforehand. For example, when the cutout is automatically selected, it can be based on the location of the lowest matching score in the joint-by-joint comparison (e.g. in the similarity analysis). When the cutout is set up in advance, it can be selected in relation to a certain specific feedback configuration (e.g. a wrist may be the focus for a waving motion). Reference is made to
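The circular-cutout parameterisation described above can be sketched as follows, under the assumption of 2-D joint coordinates; the function names are hypothetical, and the radius factor of 2 corresponds to the "twice the length of the limb" example from the text.

```python
def cutout_circle(joint_xy, limb_length, radius_factor=2.0):
    """Circle to cut out around a target joint: centred on the detected
    joint location, with a radius derived from the limb length."""
    return {"center": tuple(joint_xy), "radius": radius_factor * limb_length}

def worst_joint(joint_scores):
    """Automatic cutout selection: the joint with the lowest matching score
    in the joint-by-joint comparison of the similarity analysis."""
    return min(joint_scores, key=joint_scores.get)
```

Alternatively, as noted in the text, the cutout joint can be specified in a configuration beforehand (e.g. the wrist for a waving motion), overriding the automatic selection.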
In some examples, visual feedback can include the trainee video or a manipulation thereof. In some examples, this video is only generated for successful moves or for successful segments. Hence, feedback module 290 can determine whether the segment performance score exceeds a pre-defined threshold, and, if in the affirmative, generate feedback comprising a manipulated version of the trainee video. For example, if the trainee managed to clap his hands on time, the trainee video with animated fireworks at the area and at the time of the clapping can be displayed to the trainee. In some examples, effects and animations can be added to the video as follows:
In some examples, the feedback can be generated by identifying and selecting feedbacks from a predefined list of candidate feedbacks. The list of predefined candidate feedbacks can include semantic feedback and can be stored in memory 220 in
Identifying and selecting one or more feedbacks from a list of predefined feedbacks can be done in accordance with one or more pre-set rules. Below are some non-limiting examples of rules:
In some examples, feedback module 290 can filter out at least one candidate feedback, e.g. based on a history of feedbacks provided to the trainee, and provide the remaining candidate feedback to the trainee, without the filtered out candidates. Tracking the history of feedback facilitates providing feedback that has a higher likelihood of acceptance and implementation by the trainee. This can be achieved e.g. by learning system 100 storing in memory 220 the feedbacks previously provided to the trainee. When the next feedback is to be provided to the trainee, feedback module 290 can retrieve the feedback history for the move for the trainee's previous performances. One implementation would filter out an already triggered feedback on a subsequent try (e.g. if the trainee received “left hand should be straight” in trial 1, and in trial 2 “left hand should be straight” and “left hand should point upward” are the candidate feedbacks, then the first is filtered out and the second is shown to the trainee). A more complex implementation can include feedback module 290 tracking how many trials ago certain feedback was provided, and weighting the probability of providing the feedback again as a function thereof (e.g. if a certain feedback was shown 3 trials ago, then there is an 80% chance that it is going to be shown, whereas if it was shown 1 trial ago, then there is a 20% chance that it is shown). Such functions may be predetermined, or alternatively learned from a trainee's history, or from a larger group of trainees' history data, e.g. using machine learning methods.
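The history-based probabilistic filtering above can be sketched as follows. The 20%/80% values come from the text's example; the intermediate value and the function names are illustrative assumptions, and the random source is injected so the behaviour can be made deterministic.

```python
def show_probability(trials_ago):
    """Probability of showing a feedback again as a function of how many
    trials ago it was last shown (endpoint values from the text's example)."""
    if trials_ago is None:
        return 1.0   # never shown before: always a valid candidate
    if trials_ago <= 1:
        return 0.2   # just shown: very likely filtered out
    if trials_ago >= 3:
        return 0.8   # shown a while ago: likely shown again
    return 0.5       # assumed interpolation between the stated points

def filter_feedback(candidates, history, rng):
    """Keep each candidate feedback with probability show_probability(age),
    where history maps feedback text to trials since it was last shown and
    rng() yields a uniform random number in [0, 1)."""
    return [fb for fb in candidates if rng() < show_probability(history.get(fb))]
```

With a fixed rng of 0.5, feedback shown on the previous trial is filtered out while never-shown feedback survives, matching the "left hand" example in the text.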
In some examples, feedback module 290 can associate a priority with each of the candidate feedbacks, based on pre-set priority rules, and can provide the one or more candidate feedbacks having the highest priority to the trainee's device. In some examples, the pre-set priority rules are selected from a group comprising: affected body parts of the trainee in the trainee move, specificity of the feedback, history of provided feedbacks, and affected part of the move. Affected body parts can include providing a higher priority to certain body parts based on the trainer move. Specificity may be defined in at least two ways. First, specificity can be defined to include the number of joints that should be checked for the feedback condition (left arm wrong<both arms wrong). In such a case, the fewer joints that have to be checked, the more specific the feedback that should be provided. Second, specificity may be additionally specified or overwritten by an editor during the setup of the dance. For example, assuming the following three feedbacks are detected:
Feedback module 290 can associate a priority with each of the three feedbacks based on specificity, yielding a higher priority to the “arms wrong” feedback compared to the “missing move” feedback, and an even higher priority to the “left arm wrong” feedback, which may result in solely providing the “left arm wrong” feedback to the trainee. The editor may nevertheless decide to assign the highest priority to a missing move feedback, which overrides this default behavior and triggers the “missing move” feedback to the trainee.
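The specificity-based prioritisation with editor overrides can be sketched as follows. Each feedback is paired with the number of joints its condition checks; the data shapes and the negative-override convention are illustrative assumptions.

```python
def prioritise_feedbacks(feedbacks, overrides=None):
    """Order candidate feedbacks so the most specific one — the feedback
    whose condition checks the fewest joints — comes first. An editor-set
    override (lower value = higher priority) beats the specificity default."""
    overrides = overrides or {}
    def sort_key(fb):
        name, joints_checked = fb
        return (overrides.get(name, 0), joints_checked)
    return sorted(feedbacks, key=sort_key)
```

With no overrides, "left arm wrong" (fewest joints checked) outranks "arms wrong" and "missing move"; an editor override of, say, -1 for "missing move" promotes it to the top, as in the example above.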
In some examples, a personalized learning experience of the trainee can be achieved by providing customized feedback. Feedback module 290 can customize one or more of the generated feedbacks, where the feedback includes at least the trainee video, and provide the customized feedback to the trainee's device.
For example, the customized feedback can include one or more visual cues. The visual cues are also referred to above as visual guidance. The visual cues can be added to the trainer or the trainee videos to highlight a body part or other details of the trainer's or the trainee's move during the presentation stage. Similarly to the visual guidance, a visual overlay can be incorporated on the trainer or the trainee video. The visual overlay can be displayed alongside the videos and/or over one or more of the videos, e.g. by superimposing it on the trainer's or the trainee's videos. The visual overlay can include one or more visual cues including symbols highlighting portions of a move, such as circles, directional arrows, springs, waves, balls, lightning, and others. Based at least on the processed trainee move, feedback module 290 can obtain at least one visual cue, e.g. by retrieving it from memory 220. Feedback module 290 can determine in real time a location on the received trainee video suitable for superimposing the visual cue, and customize the generated feedback by superimposing the obtained visual cue on the trainee video at the determined location. In some examples, feedback module 290 can determine a time duration for superimposing the visual cue, and superimpose the visual cue on the trainee video for the determined time duration. In some examples, more than one visual cue is superimposed on the trainee video.
In some examples, the feedback is provided in a manner that facilitates displaying the generated feedback, in real time, simultaneously to displaying of the selected segment, during the performance of the move or the segment, after the end of the segment, or after the dance. The feedback can include negative and/or positive feedback. In some examples, in order to facilitate learning, one negative feedback and one positive feedback can be shown to the trainee.
In some examples, a feedback can include one or more general feedbacks indicative of progress of the performance with respect to the motor skill, e.g. while considering the previous learning performances of the trainee. Additionally or alternatively, the feedback can include one or more feedbacks pertaining to performance of the selected segment, such as feedback on specific body parts or aspects of the moves in the segment. As illustrated in
In some cases, the motor learning of the motor skill by the trainee, as referred to above also as a journey flow in the learning phase, includes executing, in a repetitive manner: providing the segments to the trainee, selecting one segment, performing the selected segment, processing the performance of the trainee, and providing a feedback. Performing the segments of the motor skill and receiving feedback for each segment, facilitates the motor learning of the motor skill by the trainee. Hence, in some examples, the motor learning of the motor skill comprises executing in a repetitive manner the stages illustrated in
As explained above, in some examples, during learning, the trainee can move to the next segment either by reaching a passing score for the current segment, as calculated by PMC 120, in which case an automatic traversal is performed by PMC 120, or by freely selecting any segment that is displayed in the menu, i.e. executing a manual traversal. The next segment under automatic traversal can be the next segment in the journey, or the next segment in the journey without a passing score. This assists in achieving an expert-guided traversal based on the score of PMC 120, along with the flexibility of individual selection of segments. In some examples, the next segment that the trainee will learn when passing to the next segment after succeeding in the current segment can be a segment that the trainee has not yet passed, irrespective of whether this segment is the next segment in the order of segments according to the dance. Considering an example of a dance including 3 segments, of moves 1-2, 3-4 and 5-7, where a trainee succeeded in segment 3-4 in the past and is now trying segment 1-2, upon passing segment 1-2 the trainee will automatically move to the next unpassed segment, segment 5-7. Upon completion of the current segment, PMC 120 can determine a next segment of the plurality of selectable segments to be displayed to the trainee, select the segment, and provide data indicative of the next selected segment, e.g. to displaying module 232, thereby facilitating displaying the next selected segment upon completion of displaying the current selected segment. Additionally, the data can be reflected in the journey menu.
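The automatic traversal to the next unpassed segment can be sketched as follows; the function name and the wrap-around behaviour are illustrative assumptions, with only the "skip already-passed segments" rule taken from the example in the text.

```python
def next_segment(segments, passed, current):
    """After the trainee passes the current segment, pick the next segment
    in order that has not yet been passed, wrapping around the segment
    list; returns None once every segment has been passed."""
    done = set(passed) | {current}          # the current segment was just passed
    start = segments.index(current)
    ordered = segments[start + 1:] + segments[:start]  # remaining, in journey order
    for seg in ordered:
        if seg not in done:
            return seg
    return None
```

Using the dance from the text — segments 1-2, 3-4, 5-7 with 3-4 already passed — passing segment 1-2 traverses directly to segment 5-7.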
In some examples, the segment can be displayed along with a visual aid of a countdown to the start of the presentation of the trainer's video. In some examples, the countdown may be replaced by recognition of the trainer's target body pose in the trainee's body pose. For example, the trial of the current segment or move may start when the trainee reaches a standing straight pose, or when the trainee takes the starting position of the current segment (this may be displayed on the screen). As shown in
The above examples should not be considered as limiting, and those skilled in the art will readily appreciate that the teachings of the presently disclosed subject matter are, likewise, applicable to other examples of predefined conditions on how to select feedback for the trainee.
In embodiments of the presently disclosed subject matter, fewer, more and/or different stages than those shown in
It is to be understood that the invention is not limited in its application to the details set forth in the description contained herein or illustrated in the drawings. The invention is capable of other embodiments and of being practiced and carried out in various ways. Hence, it is to be understood that the phraseology and terminology employed herein are for the purpose of description and should not be regarded as limiting. As such, those skilled in the art will appreciate that the conception upon which this disclosure is based may readily be utilized as a basis for designing other structures, methods, and systems for carrying out the several purposes of the presently disclosed subject matter.
It will also be understood that the system according to the invention may be, at least partly, implemented on a suitably programmed computer. Likewise, the invention contemplates a computer program being readable by a computer for executing the method of the invention. The invention further contemplates a non-transitory computer-readable memory tangibly embodying a program of instructions executable by the computer for executing the method of the invention.
Those skilled in the art will readily appreciate that various modifications and changes can be applied to the embodiments of the invention as hereinbefore described without departing from its scope, defined in and by the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
PCT/IL2021/050129 | Feb 2021 | WO | international |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/IL2021/051187 | 10/1/2021 | WO |
Number | Date | Country | |
---|---|---|---|
63086360 | Oct 2020 | US | |
63165962 | Mar 2021 | US |