CONTEXTUALIZATION OF SUBJECT PHYSIOLOGICAL SIGNAL USING MACHINE LEARNING

Information

  • Patent Application
  • Publication Number
    20240374167
  • Date Filed
    May 10, 2024
  • Date Published
    November 14, 2024
Abstract
Implementations described herein disclose a method of monitoring motion of a patient. The method includes receiving, using a processor, a video stream, the video stream comprising a sequence of images for at least a portion of a patient; dividing the video stream into a plurality of temporal video sequences, each of the temporal video sequences having a plurality of frames spaced apart in time from each other; generating a matrix of depth difference frames; and determining a machine learning (ML) input feature matrix based on the matrix of depth difference frames.
Description
BACKGROUND

There is a need in the medical community to determine neonatal motion and/or activity for a number of reasons, including to monitor relative neonatal activity over time, which may correlate with disease state and/or over- or under-sedation, and to help mitigate the effect of motion on physiological signals and their derived parameters (e.g., respiratory rate from the transthoracic impedance (TTI) signal). A range of depth sensing technologies are available to determine various physiological and contextual parameters, including respiration rate, tidal volume, minute volume, effort to breathe, activity, presence in bed, etc., that may be useful in detecting neonatal motion and/or activity. Specifically, video (RGB) and depth-sensing cameras have enormous potential to provide non-contact methods for the determination of physiological parameters.


SUMMARY

Implementations described herein disclose a method of monitoring motion of a patient. The method includes receiving, using a processor, a video stream, the video stream comprising a sequence of images for at least a portion of a patient; dividing the video stream into a plurality of temporal video sequences, each of the temporal video sequences having a plurality of frames spaced apart in time from each other; generating a matrix of depth difference frames; and determining a machine learning (ML) input feature matrix based on the matrix of depth difference frames.


This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.


Other implementations are also described and recited herein.





BRIEF DESCRIPTIONS OF THE DRAWINGS

A further understanding of the nature and advantages of the present technology may be realized by reference to the figures, which are described in the remaining portion of the specification.



FIG. 1 illustrates a non-contact subject monitoring system that can be used for contextualization of subject physiological signals using machine learning.



FIG. 2 indicates an example of a depth video image of a neonatal patient.



FIGS. 3A and 3B illustrate parts of the processing steps undertaken to denoise the depth image to generate a depth difference matrix.



FIG. 4 illustrates various example input features.



FIG. 5 illustrates an example schematic of a random forest classifier.



FIG. 6 illustrates an example feature labeled with detected motion as identified by the classifier and manually detected motion.



FIG. 7 illustrates modifying the display of a physiological signal or parameter of the patient with information about motion of the patient.



FIG. 8 illustrates operations for using the output of the motion detector classifier to determine neonatal activity.



FIG. 9 illustrates alternative operations for generating an input feature matrix for training a motion detection classifier.



FIG. 10 shows a portable non-contact subject monitoring system that includes a non-contact detector and a computing device.



FIG. 11 shows a semi-portable non-contact subject monitoring system that includes a non-contact detector and a computing device.



FIG. 12 shows a non-portable non-contact subject monitoring system that includes a non-contact detector and a computing device.



FIG. 13 is a block diagram illustrating a system including a computing device, a server, and an image capture device.



FIG. 14 illustrates example operations for displaying a motion flag on a display of a physiological signal of the patient based on the motion of the patient as determined by the system disclosed herein.



FIG. 15 illustrates alternative example operations for filtering noise from a physiological signal of the patient based on the motion of the patient as determined by the system disclosed herein.



FIG. 16 illustrates alternative example operations for generating and displaying a no-motion flag on a display of a physiological signal of the patient based on the motion of the patient as determined by the system disclosed herein.



FIG. 17 illustrates example operations for modifying an alarm signal based on motion of the patient as determined by the system disclosed herein.





DETAILED DESCRIPTIONS

Neonatal care typically requires constant or near constant monitoring of various physiological parameters of neonatal patients. For example, when in the ICU, various physiological parameters such as ECG, oxygen saturation levels, respiratory rate, etc., of a neonatal patient are typically monitored on a continuous basis. Any abnormal changes in such physiological parameters may generate an alarm or require immediate attention of a caretaker. However, sometimes, changes to such physiological parameters may be simply the result of movement or motion by the neonatal patient. Therefore, it is useful if a display of a neonatal physiological parameter is modified to indicate the presence of motion of the neonate.


The technology disclosed herein provides a method of monitoring motion using a touchless monitoring system that involves a machine learning (ML) model. Specifically, the system disclosed herein may be used to monitor motion and activity as a physiological parameter from the touchless system, or it may be used to monitor motion so that its effect on the measurement of a physiological parameter may be mitigated. For the latter, the physiological parameter may be one from the touchless system, such as respiratory rate, or it may be from another device, such as oxygen saturation from a pulse oximeter, heart rate from an ECG, or respiration from any other device. Such contextualization of respiratory rate using the proposed motion flag may significantly enhance the interpretation of this parameter in the neonatal intensive care unit (NICU). Furthermore, activity monitoring using this method also has clinical significance, including detecting lethargy associated with over-sedation, hyperactivity related to under-sedation and/or neonatal pain, and deterioration en route to an advanced disease state such as sepsis. In an alternative implementation, such contextualization of physiological parameters may be used to improve the accuracy of a physiological parameter (including respiratory rate, heart rate, SpO2, etc.) which may otherwise provide erroneous readings due to neonatal motion.



FIG. 1 illustrates a non-contact subject monitoring system 100 and a patient 102, in this particular example an infant in a crib. It is noted that the systems and methods described herein are not limited to a crib, but may be used with a bassinette, an incubator, an isolette, or any other place where the patient 102 is. The system 100 includes a non-contact detector system 110 placed remote from the patient 102. In this embodiment, the detector system 110 includes a camera system 114, particularly, a camera that includes an infrared (IR) detection feature. The camera system 114 may be a depth sensing camera system, such as a Kinect camera from Microsoft Corp. (Redmond, Washington) or a RealSense™ D415, D435 or D455 camera from Intel Corp. (Santa Clara, California).


The camera system 114 may operate at a set frame rate, which is the number of image frames taken per second (or other time period). Example frame rates include 20, 30, 40, 50, or 60 frames per second, greater than 60 frames per second, or other values between those. Frame rates of 20-30 frames per second produce useful signals, though frame rates above 100 or 120 frames per second may be helpful in avoiding aliasing with light flicker (for artificial lights having frequencies around 50 or 60 Hz).


The camera system 114 is remote from the patient 102, in that it is spaced apart from and does not physically contact the patient 102. The camera system 114 may be positioned in close proximity to or on the crib. The camera system 114 has a field of view F that encompasses at least a portion of the patient 102. The field of view F is selected to encompass at least the upper torso of the subject. However, as it is common for young children and infants to move within the confines of their crib, bed or other sleeping area, the entire area potentially occupied by the patient 102 (e.g., the crib) may be the field of view F.


The camera system 114 includes a depth sensing camera that can detect a distance between the camera system 114 and objects in its field of view F. Such information can be used to determine that the patient 102 is within the field of view of the camera system 114 and determine a region of interest (ROI) to monitor on the subject. The ROI may be the entire field of view F or may be less than the entire field of view F. Once an ROI is identified, the distance to the desired feature is determined and the desired measurement(s) can be made.


The measurements (e.g., one or more of depth signal, RGB reflection, light intensity) are sent to a computing device 120 through a wired or wireless connection 121. The computing device 120 includes a display 122, a processor 124, and hardware memory 126 for storing software and computer instructions. Sequential image frames of the patient 102 are recorded by the video camera system 114 and sent to the computing device 120 for analysis by the processor 124. The display 122 may be remote from the computing device 120, such as a video screen positioned separately from the processor and memory. Other embodiments of the computing device 120 may have different, fewer, or additional components than shown in FIG. 1. In some embodiments, the computing device may be a server. In other embodiments, the computing device of FIG. 1 may be connected to a server. The captured images (e.g., still images or video) can be processed or analyzed at the computing device and/or at the server to create a topographical map or image to identify the patient 102 and any other objects within the ROI.


Also in some embodiments, the computing device 120 is communicatively connected to a monitoring device (not shown) that collects a physiological signal 130 from the patient 102. For example, such physiological signal may be the patient's oxygen saturation from a pulse oximeter, the patient's heart rate from an electrocardiogram (ECG) monitor, the patient's respiration rate, etc.


The memory 126 may be configured to store a physiological signal contextualization module 140 that stores various data and modules that may be used to monitor motion of the patient 102 and to ensure that the impact of the motion on measurement and display of the physiological signal 130 is mitigated. The physiological signal contextualization module 140 stores the data stream 142 of the depth images received from the detector system 110, and a depth image processor 144 processes the depth images to generate various machine learning (ML) input feature matrices. Such processing of the depth images to generate an ML input feature matrix is described in further detail in one or more of the figures disclosed herein and further discussed below.


The ML input feature matrices are used to train a classifier 146. For example, the classifier 146 can be a machine learning algorithm that automatically orders or categorizes data into one or more classes. In one implementation, the classifier 146 may be used to classify whether a stream of depth image data represents motion of the patient 102 or not. Example implementations of the classifier 146 include a perceptron classifier, a logistic regression classifier, a naïve Bayes classifier, etc.


Subsequently, the result of the classifier 146 may be input to a display modifier 148. For example, the display modifier 148 may modify a display of a physiological parameter about the patient 102. In one implementation, the display modifier 148 may modify the display of a respiratory rate of the patient 102, an ECG chart of the patient 102, etc. For example, the display modifier 148 may superimpose the display of such physiological parameter with an indication of whether the patient 102, such as an infant, was in motion or not. For example, the part of the chart of the physiological parameter may be highlighted when the patient 102 is determined to be in motion. Alternatively, the color of the chart of the physiological parameter may be changed based on whether the patient 102 is in motion or not.


In an alternative implementation, the result of the classifier 146 may be used to generate a motion flag that is added to the display of another physiological parameter, such as a display of the respiratory rate of the patient, an ECG chart of the patient, a PPG chart of the patient, etc. In one implementation, the result of the classifier 146 may be monitored to determine lack of motion by the patient for an extended period of time, such as more than a few minutes when the patient is awake or more than a few hours when the patient is asleep. Such extended lack of motion may indicate complications such as over-sedation, lethargy, etc. Subsequently, such detection of non-motion is used to generate a no-motion flag that may be added to the display of another physiological parameter, such as a display of the respiratory rate of the patient, an ECG chart of the patient, a PPG chart of the patient, etc.


Alternatively, the result of the classifier 146 may be used to generate a modifier signal that can be used to modify or filter the other physiological parameter, such as a display of the respiratory rate of the patient, an ECG chart of the patient, a PPG chart of the patient, etc. For example, a portion of the display of the ECG signal may be removed in the vicinity of the motion flag. In one implementation, such filtering or removal of a portion of the physiological signal may occur only when the output of the classifier 146 suggests that the patient's motion is above a threshold. Yet alternatively, the result of the classifier 146 or the motion flag may be used to generate or modify an alarm signal. In one implementation, the result of the classifier 146 is provided to the display module that generates the display of various physiological signals. Subsequently, the display module may incorporate the result of the classifier 146 to modify the display of a physiological signal and/or an alarm signal based on such physiological signal.



FIG. 2 indicates an example of a depth video image 200 of a neonatal patient. Specifically, a camera may generate an RGB image 202 and a corresponding depth image 204. The RGB image 202 indicates a neonatal patient 210 in a bassinette. The depth image 204 shows the depth of various components in the image, including the head of the neonate and the bassinette, at various depth levels in millimeters (mm). The camera may generate a stream of the depth images 204 at predetermined intervals, such as every second, every five seconds, etc., as selected by a user.



FIG. 3A indicates parts of the processing steps 300 undertaken to denoise the depth image 302 to generate a depth difference matrix. The depth image 302 represents a raw depth image of a stream of depth images generated by a depth camera. For example, each pixel in the depth image 302 may represent a value indicating the depth of that pixel between 0 and 1000 mm. The stream of depth images is divided into sets of images for a predetermined time span, such as a one-second time span, with each time span including a fixed number of depth frames. Subsequently, for each time span, at a first point in time a first temporal median frame fa 304a is computed over the first N frames 304 preceding the first point in time. Similarly, for each time span, at a second point in time a second temporal median frame fb 306a is computed over the last N frames 306 preceding the second point in time. Here the second point in time is subsequent to the first point in time. The value of N may be provided as an input parameter by a user; in the illustrated example, N is 10.
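
For illustration only, this temporal-median step can be sketched in Python as follows; the array shapes, the synthetic frame stream, and the function name temporal_median are assumptions made for this sketch and do not represent the disclosed implementation.

    import numpy as np

    def temporal_median(frames: np.ndarray) -> np.ndarray:
        """Per-pixel temporal median over a stack of depth frames.

        frames: array of shape (N, H, W) holding the N depth frames
        (in millimeters) preceding a given point in time.
        Returns a single (H, W) median depth frame.
        """
        return np.median(frames, axis=0)

    # Example: with N = 10 frames per window, compute the median frames
    # fa and fb for two points in time that are one window apart.
    N = 10
    stream = np.random.randint(0, 1000, size=(2 * N, 480, 640)).astype(float)
    f_a = temporal_median(stream[0:N])      # median over the first N frames
    f_b = temporal_median(stream[N:2 * N])  # median over the last N frames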


An example of a temporal median depth frame is shown at 308. In one implementation, the temporal median depth frame 308 may be further processed to focus on the neonatal patient's location within the frame. For example, depths which are either too near or too far from the camera are removed based on user-specified percentile depth limits. For example, depths greater than the 87th percentile and depths lower than the 3rd percentile are removed to obtain a modified median depth frame 310.
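
A minimal sketch of this percentile-based trimming follows, assuming the 3rd and 87th percentile limits mentioned above and assuming that removed pixels are marked with NaN so that later steps can ignore them; the function name and limits are illustrative only.

    import numpy as np

    def clip_by_percentile(depth_frame, low_pct=3.0, high_pct=87.0):
        """Remove depths that are too near or too far from the camera.

        Pixels below the low percentile or above the high percentile of
        the frame's depth distribution are set to NaN.
        """
        low, high = np.nanpercentile(depth_frame, [low_pct, high_pct])
        out = depth_frame.astype(float).copy()
        out[(out < low) | (out > high)] = np.nan
        return out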


Subsequently, as shown by operation 320, the depth difference frames are computed by finding the difference between the two temporal median averaged depth frames fb 306a and fa 304a to generate a depth difference frame 330 as shown in FIG. 3B. The depth difference frame 330 may be further denoised by applying spatial median filtering, such as 5×5 or 3×3 spatial median filtering, to get the denoised depth difference frame 332. Applying such a median filter of kernel size 5×5 removes small regions of unusually high depth differences by applying a moving 5×5 window over the frame and returning the median depth difference of all pixels within the window.
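
The difference and spatial-median-filtering steps might look like the following sketch; the use of an absolute difference, the treatment of removed (NaN) pixels as zero, and the 5×5 kernel size are assumptions made for illustration.

    import numpy as np
    from scipy.ndimage import median_filter

    def depth_difference(f_a, f_b, kernel_size=5):
        """Form a denoised depth difference frame from two temporal medians.

        The per-pixel difference f_b - f_a (in mm) is passed through a
        moving spatial median window (e.g., 5x5) to suppress small regions
        of unusually high depth differences.
        """
        diff = np.abs(f_b - f_a)             # per-pixel depth change in mm
        diff = np.nan_to_num(diff, nan=0.0)  # treat removed pixels as no motion
        return median_filter(diff, size=kernel_size)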


Alternatively, depth differences that do not correspond to neonatal motion, such as depth differences over 150 mm that are too large to be caused by gross movement of the neonate, are removed to generate an alternate denoised depth difference frame 334. In another implementation, area-based connected component removal is applied to identify and remove small pixel regions (connected areas of under 40 pixels) containing valid depth differences, as they are considered too small to be gross movements of the baby, to generate an alternate denoised depth difference frame 336.
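
These two denoising steps can be sketched as follows, assuming the 150 mm difference limit and the 40-pixel area limit described above; the connectivity used by the labeling step and the zeroing of rejected pixels are illustrative choices rather than the disclosed implementation.

    import numpy as np
    from scipy.ndimage import label

    def remove_implausible_motion(diff, max_diff_mm=150.0, min_area_px=40):
        """Suppress depth differences unlikely to be gross neonatal motion."""
        out = diff.copy()
        out[out > max_diff_mm] = 0.0            # too large to be patient motion

        labels, n_regions = label(out > 0)      # connected regions of nonzero difference
        for region_id in range(1, n_regions + 1):
            mask = labels == region_id
            if mask.sum() < min_area_px:        # region too small to be gross movement
                out[mask] = 0.0
        return out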


In alternative implementations, one or more of the denoising processes to generate the alternate denoised depth difference frames 334-336 are selectively applied. For example, depth differences higher than a threshold depth difference may be filtered out. Subsequently, the denoised depth difference frame can be used to derive input features over time where such input features are indicative of the amount of movement by the patient over various ranges and scales.


One or more such input features are disclosed in further detail in FIG. 4. The features disclosed herein may be used as input features for a machine learning algorithm to train a classifier to classify a series of depth frames into different classes. For example, such a classifier may classify the series of frames as representing motion or not representing motion. For example, 402 and 404 represent the spatial scale of motion and enable the model to detect smaller motions over a larger spatial area. Specifically, 402 represents the fraction of all non-null pixels where the depth difference is in the range of 1-3 mm, whereas 404 represents the number of pixels with a depth difference above 3 mm.


Similarly, the summed depth difference features 406 and 408 represent the magnitude of motion, allowing detection of larger motions over smaller spatial areas. Specifically, 406 represents a sum of depth differences for pixels where the depth difference is greater than 3 mm, whereas 408 represents a sum of depth differences for pixels where the depth difference is in the range of 1-3 mm.


The features 410 and 412 are averaged depth features that are intended to represent depth difference features with respect to distance from the camera, such that the machine learning model is able to learn motions of different scale and magnitude regardless of where the camera is placed. Specifically, 410 represents the mean depth of all pixels where the depth difference is greater than 3 mm, whereas 412 represents the mean depth of all pixels where the depth difference is in the range of 1-3 mm.
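
As a hedged sketch of how the six features 402-412 could be computed from one denoised depth difference frame and its corresponding median depth frame, the following assumes that "non-null" means pixels with a valid nonzero difference and uses the 1-3 mm and greater-than-3 mm ranges described above; the function name and dictionary keys are illustrative.

    import numpy as np

    def extract_features(diff, depth):
        """Per-frame motion features from a denoised depth difference frame.

        diff:  depth difference frame in mm (0 where no valid difference).
        depth: corresponding temporal median depth frame in mm.
        """
        valid = ~np.isnan(depth) & (diff > 0)      # non-null pixels with a difference
        small = valid & (diff >= 1) & (diff <= 3)  # small motions, 1-3 mm
        large = valid & (diff > 3)                 # larger motions, > 3 mm

        n_valid = max(int(valid.sum()), 1)         # avoid division by zero
        return {
            "frac_small_diff": small.sum() / n_valid,     # feature 402
            "n_large_diff": int(large.sum()),             # feature 404
            "sum_large_diff": float(diff[large].sum()),   # feature 406
            "sum_small_diff": float(diff[small].sum()),   # feature 408
            "mean_depth_large": float(depth[large].mean()) if large.any() else 0.0,  # 410
            "mean_depth_small": float(depth[small].mean()) if small.any() else 0.0,  # 412
        }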


In alternative implementations, other features may be derived from the depth video streams indicating other depth/depth difference ranges and/or measures indicative of motion variability. For example, to further represent motion variability, measures such as the mean or median of per-pixel standard deviation of depths within the frame can be included as additional features. In addition, denoising parameters described above, including the number of temporally averaged depth frames, percentile depth limits, spatial median filter kernel size and connected component area limits, can be specified as input parameters to the ML model. Additionally, the depth difference ranges used to extract depth/depth difference-based features can be altered to represent varying spatial scales and magnitudes of motion and can be included as additional features.


Any of the features disclosed above may be used as input features for machine learning models. For example, one or more of the input features may be used to train a decision tree classifier or random forest classifier to classify periods of motion. Alternatively, other classifiers may also be used in the method including, but not limited to: Support Vector Machine, kNN, AdaBoost, Boosted Trees, Linear Regression, Naïve Bayes, or a Neural Network classifier.



FIG. 5 illustrates an example schematic 500 of a random forest classifier. The classifier may be trained on labeled data of neonatal motion (derived from manual observation of the RGB video data set) and then tested on unseen validation data. The classifier in this example was trained and tested on the neonatal data set using a leave-one-out cross validation (LOOCV) approach.
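
A minimal scikit-learn sketch of this training and validation scheme, using leave-one-subject-out cross validation on synthetic data, is shown below; the feature count, window counts, and labels are placeholders and are not the data set described above.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

    # X: one row of features per one-second window, y: manually labeled
    # motion (1) / no-motion (0), groups: subject id so each fold leaves
    # out one neonate's entire recording.
    rng = np.random.default_rng(0)
    X = rng.random((600, 6))               # 600 windows x 6 features (synthetic)
    y = rng.integers(0, 2, size=600)       # synthetic motion labels
    groups = np.repeat(np.arange(10), 60)  # 10 subjects, 60 windows each

    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    scores = cross_val_score(clf, X, y, groups=groups, cv=LeaveOneGroupOut())
    print("per-subject validation accuracy:", np.round(scores, 3))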



FIG. 6 illustrates an example output class probability signal 600 labeled with detected motion 602 as identified by the classifier and manually detected motion 604. As illustrated here, the detected motion 602 as identified by the classifier generally corresponds with the manually detected motion 604.



FIG. 7 illustrates modifying the display of a physiological signal or parameter of the neonate with information about motion of the neonate. For example, 702 indicates motion superimposed on a flow signal and 704 indicates motion superimposed on a respiratory rate signal.



FIG. 8 illustrates operations 800 for using the output of the motion detector classifier to determine neonatal activity such as the number of movement episodes, characteristics of the duration of movement (e.g., total duration, average duration of a movement episode, longest period, standard deviation, the percent of time movement took place, etc.). A clinician may use these output statistics to determine how lethargic a baby is and to determine what therapeutic intervention may be necessary.


Specifically, an operation 802 generates a depth image data stream that is preprocessed at operation 804 to generate input features in the manner described above. An operation 806 classifies the stream of depth images into motion and no-motion regions. Subsequently, an operation 808 generates various statistics, which are displayed at an operation 810. The statistical output may be displayed on the screen, or it may be recorded for future use. Alternatively, it may be computed over various periods of time including the whole stay of the neonate, every day, every 8 hours, every 4 hours, 2 hours, 1 hour, 30 minutes, etc. In one implementation, the statistics may be updated as a rolling window over these periods and sent to a patient record system.
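
One way such statistics could be computed from the per-window classifier output is sketched below; the binary motion signal, the one-second window length, and the particular statistics returned are assumptions made for illustration.

    import numpy as np

    def motion_statistics(motion, window_seconds=1.0):
        """Summarize activity from a binary per-window motion signal."""
        padded = np.concatenate(([0], np.asarray(motion, dtype=int), [0]))
        edges = np.diff(padded)
        starts = np.where(edges == 1)[0]          # window indices where motion begins
        ends = np.where(edges == -1)[0]           # window indices where motion ends
        durations = (ends - starts) * window_seconds

        return {
            "n_episodes": int(len(durations)),
            "total_motion_s": float(durations.sum()),
            "mean_episode_s": float(durations.mean()) if len(durations) else 0.0,
            "longest_episode_s": float(durations.max()) if len(durations) else 0.0,
            "std_episode_s": float(durations.std()) if len(durations) else 0.0,
            "percent_time_moving": 100.0 * float(np.mean(motion)),
        }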


In yet another implementation, the motion detection classifier may also be used to improve determination of physiological signals. For example, a respiration rate algorithm may calculate the respiration rate as the average of a number of recently detected breaths. With the addition of the motion classification signal, the algorithm may “downweight” any breaths that are found during motion, leading to the calculation of a more robust respiration rate. In an alternative implementation, the respiration rate algorithm may hold its value for longer during a period of motion. Note that this could be done for many algorithms (e.g., for heart rate, SpO2, blood pressure, temperature, etc.) and for many devices, with respiration rate from a depth camera system being just an example.
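
As an illustrative sketch of the downweighting idea only, the weight value and the interval-averaging scheme below are assumptions; the disclosed system may weight or hold the rate differently.

    import numpy as np

    def respiration_rate(breath_intervals_s, breath_during_motion, motion_weight=0.25):
        """Average recent breath intervals into a rate, downweighting breaths
        that were detected while the patient was moving.

        breath_intervals_s:   interval preceding each detected breath (seconds).
        breath_during_motion: boolean flag per breath, True if motion was detected.
        """
        weights = np.where(breath_during_motion, motion_weight, 1.0)
        mean_interval = np.average(breath_intervals_s, weights=weights)
        return 60.0 / mean_interval  # breaths per minute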



FIG. 9 illustrates alternative operations 900 for generating an input feature matrix for training a motion detection classifier or to use a trained classifier to classify a series of depth images into motion or no-motion classes. Specifically, an operation 902 receives a stream of depth images. An operation 904 divides the depth images into a number of temporal video sequences, each of the temporal video sequences having N frames each being tdelta apart from each other. For example, tdelta may be one second. An operation 906 generates a matrix of depth difference frames. For example, generating the depth difference frames includes an operation 906a that, at time t=ta, determines a first temporal median fa of the previous N depth frames, where N may be 5, 10, 15 frames, etc. Subsequently, an operation 906b, at time tb=ta+tdelta, determines a second temporal median fb of the previous N depth frames.


An operation 906c generates the depth difference frame by subtracting the first temporal median fa from the second temporal median fb. An operation 908 determines a machine learning (ML) input feature matrix. The ML input feature matrix may be used to train an ML model such as a motion detection classifier. Subsequently, at operation 910, a real-time matrix of depth difference frames may be input into the trained ML model to identify an area of motion by the patient. An operation 912 may generate a motion flag based on the identified area of motion by the patient. Furthermore, the identified area of motion or the motion flag may be super-imposed on a time-series display of a physiological signal of the patient. For example, the identified area of motion or the motion flag may be super-imposed on a display of an ECG signal of the patient.



FIG. 10 shows a portable non-contact subject monitoring system 1000 that includes a non-contact detector 1010 and a computing device 1020. In this embodiment, the non-contact detector 1010 and the computing device 1020 are generally fixed in relation to each other and the system 1000 is readily moveable in relation to the subject to be monitored. The detector 1010 and the computing device 1020 are supported on a trolley or stand 1002, with the detector 1010 on an arm 1004 that is pivotable in relation to the stand 1002 as well as adjustable in height. The system 1000 can be readily moved and positioned where desired.


The detector 1010 includes a first camera 1014 and a second camera 1015, at least one of which includes an infrared (IR) camera feature. The detector 1010 also includes an IR projector 1016, which projects individual features (e.g., dots, crosses or Xs, lines, a featureless pattern, or a combination thereof, etc.).


The detector 1010 may be wired or wirelessly connected to the computing device 1020. The computing device 1020 includes a housing 1021 with a touch screen display 1022, a processor (not seen), and hardware memory (not seen) for storing software and computer instructions.



FIG. 11 shows a semi-portable non-contact subject monitoring system 1100 that includes a non-contact detector 1110 and a computing device 1120. In this embodiment, the non-contact detector 1110 is in a fixed relation to the subject to be monitored and the computing device 1120 is readily moveable in relation to the subject.


The detector 1110 is supported on an arm 1101 that is attached to a bed, in this embodiment, a hospital bed, although the detector 1110 and the arm 1101 can be attached to a crib, a bassinette, an incubator, an isolette, or other bed-type structure. In some embodiments, the arm 1101 is pivotable in relation to the bed as well as adjustable in height to provide for proper positioning of the detector 1110 in relation to the subject.


The detector 1110 may be wired or wirelessly connected to the computing device 1120, which is supported on a moveable trolley or stand 1102. The computing device 1120 includes a housing 1121 with a touch screen display 1122, a processor (not seen), and hardware memory (not seen) for storing software and computer instructions.



FIG. 12 shows a non-portable non-contact subject monitoring system 1200 that includes a non-contact detector 1210 and a computing device (not seen in FIG. 12). In this embodiment, at least the non-contact detector 1210 is generally fixed in a location, configured to have the subject to be monitored moved into the appropriate position to be monitored.


The detector 1210 is supported on a stand 1201 that is free standing, the stand having a base 1203, a frame 1205, and a gantry 1207. The gantry 1207 may have an adjustable height, e.g., movable vertically along the frame 1205, and may be pivotable, extendible and/or retractable in relation to the frame 1205. The stand 1201 is shaped and sized to allow a bed or bed-type structure to be moved (e.g., rolled) under the detector 1210.



FIG. 13 is a block diagram illustrating a system including a computing device 1300, a server 1325, and an image capture device 1385 (e.g., a camera, e.g., the camera system 114 or cameras 214, 215). In various embodiments, fewer, additional and/or different components may be used in the system.


The computing device 1300 includes a processor 1315 that is coupled to a memory 1305. The processor 1315 can store and recall data and applications in the memory 1305, including applications that process information and send commands/signals according to any of the methods disclosed herein. The processor 1315 may also display objects, applications, data, etc. on an interface/display 1310 and/or provide an audible alert via a speaker 1312. The processor 1315 may also or alternately receive inputs through the interface/display 1310. The processor 1315 is also coupled to a transceiver 1320. With this configuration, the processor 1315, and subsequently the computing device 1300, can communicate with other devices, such as the server 1325 through a connection 1370 and the image capture device 1385 through a connection 1380. For example, the computing device 1300 may send to the server 1325 information determined about a subject from images captured by the image capture device 1385, such as depth information of a subject or object in an image.


The server 1325 also includes a processor 1335 that is coupled to a memory 1330 and to a transceiver 1340. The processor 1335 can store and recall data and applications in the memory 1330. With this configuration, the processor 1335, and subsequently the server 1325, can communicate with other devices, such as the computing device 1300 through the connection 1370.


The computing device 1300 may be, e.g., the computing device 120 of FIG. 1 or the computing device 220 of FIG. 2. Accordingly, the computing device 1300 may be located remotely from the image capture device 1385, or it may be local and close to the image capture device 1385 (e.g., in the same room). The processor 1315 of the computing device 1300 may perform any or all of the various steps disclosed herein. In other embodiments, the steps may be performed on a processor 1335 of the server 1325. In some embodiments, the various steps and methods disclosed herein may be performed by both of the processors 1315 and 1335. In some embodiments, certain steps may be performed by the processor 1315 while others are performed by the processor 1335. In some embodiments, information determined by the processor 1315 may be sent to the server 1325 for storage and/or further processing.


The devices shown in the illustrative embodiment may be utilized in various ways. For example, either or both of the connections 1370, 1380 may be varied. For example, either or both the connections 1370, 1380 may be a hard-wired connection. A hard-wired connection may involve connecting the devices through a USB (universal serial bus) port, serial port, parallel port, or other type of wired connection to facilitate the transfer of data and information between a processor of a device and a second processor of a second device. In another example, one or both of the connections 1370, 1380 may be a dock where one device may plug into another device. As another example, one or both of the connections 1370, 1380 may be a wireless connection. These connections may be any sort of wireless connection, including, but not limited to, Bluetooth connectivity, Wi-Fi connectivity, infrared, visible light, radio frequency (RF) signals, or other wireless protocols/methods. For example, other possible modes of wireless communication may include near-field communications, such as passive radio-frequency identification (RFID) and active RFID technologies. RFID and similar near-field communications may allow the various devices to communicate in short range when they are placed proximate to one another. In yet another example, the various devices may connect through an internet (or other network) connection. That is, one or both of the connections 1370, 1380 may represent several different computing devices and network components that allow the various devices to communicate through the internet, either through a hard-wired or wireless connection. One or both of the connections 1370, 1380 may also be a combination of several modes of connection.


The configuration of the devices in FIG. 13 is merely one physical system on which the disclosed embodiments may be executed. Other configurations of the devices shown may exist to practice the disclosed embodiments. Further, configurations of additional or fewer devices than the ones shown in FIG. 13 may exist to practice the disclosed embodiments. Additionally, the devices shown in FIG. 13 may be combined to allow for fewer devices than shown or separated such that more than the three devices exist in a system. It will be appreciated that many various combinations of computing devices may execute the methods and systems disclosed herein. Examples of such computing devices may include other types of infrared cameras/detectors, night vision cameras/detectors, other types of cameras, radio frequency transmitters/receivers, smart phones, personal computers, servers, laptop computers, tablets, RFID enabled devices, or any combinations of such devices.


In contrast to tangible or non-transitory computer-readable storage media, intangible or transitory computer-readable communication signals may embody computer-executable instructions, data structures, program modules or other data resident in a modulated data signal, such as a carrier wave or other signal transport mechanism. In one implementation, the tangible or non-transitory computer-readable storage media may be implemented as a physical article of manufacture including one or more tangible computer readable storage media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, intangible communication signals include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media.



FIG. 14 illustrates operations 1400 for displaying a motion flag on a display of a physiological signal of the patient based on the motion of the patient as determined by the system disclosed herein. Specifically, the operations 1400 may generate and display a motion flag on the display of a physiological signal of the patient. For example, such physiological signal may be a breathing flow signal, an electrocardiogram (ECG) signal, a photoplethysmography (PPG) signal, etc.


Specifically, an operation 1402 receives an indication of the detected motion of the patient. For example, such an indication of the detected motion may be generated by a classifier such as the classifier 146 disclosed in FIG. 1. An operation 1404 generates a motion flag based on the indication of the detected motion. An operation 1406 receives a physiological signal of the patient, such as an ECG signal, a PPG signal, etc. Subsequently, an operation 1410 may add the motion flag to the display of a physiological signal, such as the ECG signal. Alternatively, the operation 1410 may merely send the motion flag to a device that displays the ECG signal. Subsequently, the display device may display the motion flag on the display of the ECG signal. Similarly, in case of the physiological signal being the PPG signal, operation 1410 adds the motion flag to the display of the PPG signal. In one implementation, the motion flag may be generated from a physiological signal that is different from the physiological signal on which it is displayed. For example, the ML model may use depth images and/or an ECG signal to generate the detected motion, and the motion flag generated based on the detected motion may be overlaid on a display of the PPG signal, etc.
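
A small plotting sketch of the overlay idea follows, using a synthetic waveform and synthetic per-second motion flags; how a particular monitor renders the flag (shading, color change, marker) is a design choice, and the code below shows only one illustrative option.

    import numpy as np
    import matplotlib.pyplot as plt

    t = np.linspace(0, 60, 60 * 50)          # 60 s of signal sampled at 50 Hz
    signal = np.sin(2 * np.pi * 1.2 * t)     # placeholder physiological waveform
    motion_flag = np.zeros(60, dtype=bool)
    motion_flag[20:28] = True                # motion detected during seconds 20-27

    fig, ax = plt.subplots()
    ax.plot(t, signal)
    for second, moving in enumerate(motion_flag):
        if moving:
            ax.axvspan(second, second + 1, alpha=0.3)  # shade each motion window
    ax.set_xlabel("time (s)")
    ax.set_title("physiological signal with motion flag overlay")
    plt.show()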



FIG. 15 illustrates alternative operations 1500 for filtering noise from a physiological signal of the patient based on the motion of the patient as determined by the system disclosed herein. An operation 1502 receives an indication of the detected motion of the patient. For example, such an indication of the detected motion may be generated by a classifier such as the classifier 146 disclosed in FIG. 1. An operation 1506 receives a physiological signal of the patient, such as an EEG signal, an ECG signal, a PPG signal, etc. Here the physiological signal received at 1506 may be different from the physiological signal that may be used by the ML model, together with the depth images, to generate the detected motion and the motion flag. For example, the operation 1502 may receive an indication of detected motion that is generated by an ML model using depth images together with a PPG signal, whereas the physiological signal received at operation 1506 may be the EEG signal. Subsequently, an operation 1510 may filter the EEG signal based on the value of the detected motion that was generated using depth images and/or a PPG signal. Such filtering may include adjusting the amplitude of the displayed EEG based on the amplitude of the detected motion of the patient, etc. In one implementation, an operation 1512 may remove noisy portions of the EEG signal based on the amplitude of the detected motion of the patient.
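
A minimal sketch of the removal option at operation 1512, assuming the motion amplitude has been resampled to the signal's time base and that removed samples are marked with NaN so they render as gaps; the threshold value is illustrative.

    import numpy as np

    def mask_signal_during_motion(signal, motion_amplitude, threshold=0.5):
        """Blank out samples of a physiological signal wherever the detected
        motion amplitude exceeds a threshold."""
        out = np.asarray(signal, dtype=float).copy()
        out[np.asarray(motion_amplitude) > threshold] = np.nan  # gaps in the display
        return out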



FIG. 16 illustrates alternative operations 1600 for generating and displaying a no-motion flag on a display of a physiological signal of the patient based on the motion of the patient as determined by the system disclosed herein. Specifically, the operations 1600 monitor the detected motion output from the classifier 146 to determine lack of motion by the patient for an extended period of time. For example, the operations 1600 may determine a no-motion state for the patient when no motion is detected for more than a few minutes when the patient is awake or for more than a few hours when the patient is asleep. Such extended lack of motion may indicate complications such as over-sedation, lethargy, etc. Subsequently, such detection of non-motion is used to generate a no-motion flag that may be added to the display of another physiological parameter, such as a display of the respiratory rate of the patient, an ECG chart of the patient, a PPG chart of the patient, etc.


An operation 1602 receives an indication of the detected motion of the patient. For example, such an indication of the detected motion may be generated by a classifier such as the classifier 146 disclosed in FIG. 1. Subsequently, an operation 1604 evaluates the output of the classifier to determine if there is an indication of no-motion for the patient. For example, the operation 1604 may determine such a no-motion condition if the amplitude of the detected motion signal is below a threshold level for an extended period of time. If so, an operation 1606 generates a no-motion flag. Here the no-motion flag may be indicative of an unusual condition for the patient and therefore, it may be useful to display it on top of a display of a physiological signal such as the EEG signal, etc. An operation 1608 receives a physiological signal of the patient, such as an ECG signal, a PPG signal, etc. Subsequently, at operation 1610, the no-motion flag that is determined based on the detected motion may be displayed on a display of the PPG signal.
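
The no-motion check at operation 1604 could be sketched as follows; the amplitude threshold and the minimum duration (expressed here in one-second windows) are placeholder values that would in practice depend on whether the patient is awake or asleep.

    import numpy as np

    def no_motion_detected(motion_amplitude, amplitude_threshold=0.1,
                           min_duration_windows=300):
        """Return True when detected motion stays below a threshold for at
        least min_duration_windows consecutive windows (e.g., 300 one-second
        windows = 5 minutes)."""
        quiet = np.asarray(motion_amplitude) < amplitude_threshold
        run = 0
        for q in quiet:
            run = run + 1 if q else 0       # length of the current quiet streak
            if run >= min_duration_windows:
                return True
        return False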



FIG. 17 illustrates operations 1700 for modifying an alarm signal based on the detected motion of the patient as determined by the system disclosed herein. An operation 1702 receives an indication of the detected motion of the patient. For example, such an indication of the detected motion may be generated by a classifier such as the classifier 146 disclosed in FIG. 1. Subsequently, an operation 1704 may determine if the motion is above a threshold level. If so, an operation 1706 may generate an alarm and an operation 1708 may display the alarm on the physiological signal of the patient or sound the alarm.


However, if the detected motion is not above the threshold, an operation 1710 may determine if the detected motion signal indicates a lack of normal or expected motion by the patient for over a threshold time period. If so, an operation 1712 may generate an alarm and an operation 1714 may display the alarm on the physiological signal of the patient or sound the alarm. However, if the detected motion signal does not indicate a lack of normal or expected motion by the patient for over a threshold time period, there is no need to display an alarm and therefore, no action is taken at operation 1722.


The implementations described herein are implemented as logical steps in one or more computer systems. The logical operations may be implemented (1) as a sequence of processor-implemented steps executing in one or more computer systems and (2) as interconnected machine or circuit modules within one or more computer systems. The implementation is a matter of choice, dependent on the performance requirements of the computer system being utilized. Accordingly, the logical operations making up the implementations described herein are referred to variously as operations, steps, objects, or modules. Furthermore, it should be understood that logical operations may be performed in any order, unless explicitly claimed otherwise or a specific order is inherently necessitated by the claim language.


The above specification, examples, and data provide a complete description of the structure and use of exemplary embodiments of the invention. Since many implementations of the invention can be made without departing from the spirit and scope of the invention, the invention resides in the claims hereinafter appended. Furthermore, structural features of the different embodiments may be combined in yet another implementation without departing from the recited claims.

Claims
  • 1. A method of monitoring motion, comprising: receiving, using a processor, a video stream, the video stream comprising a sequence of images for at least a portion of a patient; dividing the video stream into a plurality of temporal video sequences, each of the temporal video sequences having a plurality of frames; generating a matrix of depth difference frames; determining a machine learning (ML) input feature matrix based on the matrix of depth difference frames; and training an ML model using the ML input feature matrix.
  • 2. The method of claim 1, wherein generating each of the matrix of depth difference frames comprising: determining, at a first point in time, a first temporal median of a first plurality of frames preceding the first point in time, determining, at a second point in time, a second temporal median of a second plurality of frames preceding the second point in time, wherein the second point in time is subsequent to the first point in time; and generating a depth difference frame based on the first temporal median and the second temporal median.
  • 3. The method of claim 1, wherein determining the ML input feature matrix further comprises determining a time series comprising fraction of all non-null pixels within each of the matrix of depth difference frames.
  • 4. The method of claim 1, wherein determining the ML input feature matrix further comprises determining a time series comprising a number of pixels within each of the matrix of depth difference frames with a depth difference greater than a threshold depth difference.
  • 5. The method of claim 4, wherein the threshold depth difference is 3 mm.
  • 6. The method of claim 1, wherein determining the ML input feature matrix further comprises determining a time series comprising a sum of depth differences of pixels within each of the matrix of depth difference frames with a depth difference greater than a threshold depth difference.
  • 7. The method of claim 1, wherein determining the ML input feature matrix further comprises determining a time series comprising a sum of depth differences of pixels within each of the matrix of depth difference frames with a depth difference within a threshold depth difference range.
  • 8. The method of claim 1, further comprising denoising the matrix of depth difference frames by at least one of (a) performing spatial median filtering of the matrix of depth difference frames, (b) removing depth differences higher than a threshold depth difference, and (c) removing area based connected components from the depth difference frames.
  • 9. The method of claim 1, further comprising: inputting a real-time matrix of depth difference frames into the trained ML model to identify an area of motion by a patient; and super-imposing the area of motion by the patient with a time-series of a physiological signal of the patient.
  • 10. The method of claim 8, further comprising modifying a display of the physiological signal of the patient based on the identified area of motion by a patient.
  • 11. The method of claim 1, further comprising: training an ML model using the ML input feature matrix; inputting a real-time matrix of depth difference frames into the trained ML model to detect motion by a patient; generating a motion flag corresponding the detected motion; and displaying the motion flag with a display of a physiological signal of the patient.
  • 12. The method of claim 1, further comprising: training an ML model using the ML input feature matrix; inputting a real-time matrix of depth difference frames into the trained ML model to detect motion by a patient; analyzing the detected motion to determine a period of lack of motion; generating a no-motion flag based on the period of lack of motion; and displaying the no-motion flag with a display of a physiological signal of the patient.
  • 13. In a computing environment, a method performed at least in part on at least one processor, the method comprising: receiving, using a processor, a video stream, the video stream comprising a sequence of images for at least a portion of a patient; dividing the video stream into a plurality of temporal video sequences, each of the temporal video sequences having a plurality of frames; generating a matrix of depth difference frames; determining a machine learning (ML) input feature matrix based on the matrix of depth difference frames; and training a machine learning model using the ML input feature matrix.
  • 14. The method of claim 13, wherein generating each of the matrix of depth difference frames comprising: determining, at a first point in time, a first temporal median of a first plurality of frames preceding the first point in time, determining, at a second point in time, a second temporal median of a second plurality of frames preceding the second point in time, wherein the second point in time is subsequent to the first point in time, and generating a depth difference frame based on the first temporal median and the second temporal median.
  • 15. The method of claim 13, further comprising: inputting a real-time matrix of depth difference frames into the trained machine learning model to identify an area of motion by the patient; and super-imposing the area of motion by the neonatal patient with a time-series of a physiological signal of the patient.
  • 16. The method of claim 13, wherein determining the ML input feature matrix further comprises determining a time series comprising fraction of all non-null pixels within each of the matrix of depth difference frames.
  • 17. The method of claim 16, wherein determining the ML input feature matrix further comprises determining a time series comprising a number of pixels within each of the matrix of depth difference frames with a depth difference greater than a threshold depth difference.
  • 18. A physical article of manufacture including one or more tangible computer-readable storage media, encoding computer-executable instructions for executing on a computer system a computer process to provide a system for contextualizing patient physiological signals using machine learning, the computer process comprising: receiving, using a processor, a video stream, the video stream comprising a sequence of images for at least a portion of a patient; dividing the video stream into a plurality of temporal video sequences, each of the temporal video sequences having a plurality of frames; generating a matrix of depth difference frames, wherein generating each depth difference frame includes: determining, at a first point in time, a first temporal median of a first plurality of frames preceding the first point in time, determining, at a second point in time, a second temporal median of a second plurality of frames preceding the second point in time, wherein the second point in time is subsequent to the first point in time, and generating a depth difference frame based on the first temporal median and the second temporal median; determining a machine learning (ML) input feature matrix based on the matrix of depth difference frames; and training a machine learning model using the ML input feature matrix.
  • 19. The physical article of manufacture of claim 18, wherein the computer process further comprising: inputting a real-time matrix of depth difference frames into the trained machine learning model to identify an area of motion by a neonatal patient; and super-imposing the area of motion by the neonatal patient with a time-series of a physiological signal of the neonatal patient.
  • 20. The physical article of manufacture of claim 19, wherein the computer process further comprising modifying a display of the physiological signal of the neonatal patient based on the identified area of motion by a neonatal patient.
CROSS-REFERENCE TO RELATED APPLICATIONS

The present application claims benefit of priority to U.S. Provisional Patent Application No. 63/501,354, entitled “CONTEXTUALIZATION OF SUBJECT PHYSIOLOGICAL SIGNAL USING MACHINE LEARNING” and filed on May 10, 2023, which is specifically incorporated by reference herein for all that it discloses or teaches.

Provisional Applications (1)
Number Date Country
63501354 May 2023 US