There is a need in the medical community to determine neonatal motion and/or activity for a number of reasons, including monitoring relative neonatal activity over time, which may correlate with disease state and/or over- or under-sedation, and mitigating the effect of motion on physiological signals and their derived parameters (e.g., respiratory rate from the transthoracic impedance (TTI) signal). A range of depth-sensing technologies are available to determine various physiological and contextual parameters, including respiration rate, tidal volume, minute volume, effort to breathe, activity, presence in bed, etc., that may be useful in detecting neonatal motion and/or activity. In particular, video (RGB) and depth-sensing cameras have enormous potential to provide non-contact methods for the determination of physiological parameters.
Implementations described herein disclose a method of monitoring motion of a patient. The method includes receiving, using a processor, a video stream comprising a sequence of images of at least a portion of a patient; dividing the video stream into a plurality of temporal video sequences, each temporal video sequence having a plurality of frames spaced apart from each other in time; generating a matrix of depth difference frames; and determining a machine learning (ML) input feature matrix based on the matrix of depth difference frames.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Other implementations are also described and recited herein.
A further understanding of the nature and advantages of the present technology may be realized by reference to the figures, which are described in the remaining portion of the specification.
Neonatal care typically requires constant or near-constant monitoring of various physiological parameters of neonatal patients. For example, when in the ICU, various physiological parameters such as ECG, oxygen saturation levels, respiratory rate, etc., of a neonatal patient are typically monitored on a continuous basis. Any abnormal changes in such physiological parameters may generate an alarm or require the immediate attention of a caretaker. However, changes to such physiological parameters may sometimes be simply the result of movement or motion by the neonatal patient. Therefore, it is useful if a display of a neonatal physiological parameter is modified to indicate the presence of motion by the neonate.
The technology disclosed herein provides a method of monitoring motion using a touchless monitoring system that involves a machine learning (ML) model. Specifically, the system disclosed herein may be used to monitor motion and activity as a physiological parameter from the touchless system, or it may be used to monitor motion so that its effect on the measurement of a physiological parameter may be mitigated. For the latter, the physiological parameter may be one from the touchless system, such as respiratory rate, or one from another device, such as oxygen saturation from a pulse oximeter, heart rate from an ECG, or respiration from any other device. Such contextualization of respiratory rate using the proposed motion flag may significantly enhance the interpretation of this parameter in the neonatal intensive care unit (NICU). Furthermore, activity monitoring using this method also has clinical significance, including detecting lethargy associated with over-sedation, hyperactivity related to under-sedation and/or neonatal pain, and deterioration en route to an advanced disease state such as sepsis. In an alternative implementation, such contextualization of physiological parameters may be used to improve the accuracy of a physiological parameter (including respiratory rate, heart rate, SpO2, etc.) that may otherwise provide erroneous readings due to neonatal motion.
The camera system 114 may operate at a set frame rate, which is the number of image frames taken per second (or other time period). Example frame rates include 20, 30, 40, 50, or 60 frames per second, greater than 60 frames per second, or values in between. Frame rates of 20-30 frames per second produce useful signals, though frame rates above 100 or 120 frames per second may be helpful in avoiding aliasing with light flicker (for artificial lights driven at mains frequencies around 50 or 60 Hz).
The camera system 114 is remote from the patient 102, in that it is spaced apart from and does not physically contact the patient 102. The camera system 114 may be positioned in close proximity to or on the crib. The camera system 114 has a field of view F that encompasses at least a portion of the patient 102. The field of view F is selected to encompass at least the upper torso of the subject. However, as it is common for young children and infants to move within the confines of their crib, bed, or other sleeping area, the entire area potentially occupied by the patient 102 (e.g., the crib) may be the field of view F.
The camera system 114 includes a depth sensing camera that can detect a distance between the camera system 114 and objects in its field of view F. Such information can be used to determine that the patient 102 is within the field of view of the camera system 114 and determine a region of interest (ROI) to monitor on the subject. The ROI may be the entire field of view F or may be less than the entire field of view F. Once an ROI is identified, the distance to the desired feature is determined and the desired measurement(s) can be made.
The measurements (e.g., one or more of depth signal, RGB reflection, light intensity) are sent to a computing device 120 through a wired or wireless connection 121. The computing device 120 includes a display 122, a processor 124, and hardware memory 126 for storing software and computer instructions. Sequential image frames of the patient 102 are recorded by the video camera system 114 and sent to the computing device 120 for analysis by the processor 124. The display 122 may be remote from the computing device 120, such as a video screen positioned separately from the processor and memory. Other embodiments of the computing device 120 may have different, fewer, or additional components than shown in
In some embodiments, the computing device 120 is also communicatively connected to a monitoring device (not shown) that collects a physiological signal 130 from the patient 102. For example, such a physiological signal may be the patient's oxygen saturation from a pulse oximeter, the patient's heart rate from an electrocardiogram (ECG) monitor, the patient's respiration rate, etc.
The memory 126 may be configured to store a physiological signal contextualization module 140 that stores various data and modules that may be used to monitor motion of the patient 102 and to ensure that the impact of the motion on the measurement and display of the physiological signal 130 is mitigated. The physiological signal contextualization module 140 stores the data stream 142 of the depth images received from the detector system 110, and a depth image processor 144 processes the depth images to generate various machine learning (ML) input feature matrices. Such processing of the depth images to generate an ML input feature matrix is described in further detail in one or more of the figures disclosed herein and further discussed below.
The ML input feature matrices are used to train a classifier 146. For example, the classifier 146 can be a machine learning algorithm that automatically orders or categorizes data into one or more classes. In one implementation, the classifier 146 may be used to classify whether a stream of depth image data represents motion by the patient 102 or not. Example implementations of the classifier 146 include a perceptron classifier, a logistic regression classifier, a naïve Bayes classifier, etc.
Subsequently, the result of the classifier 146 may be input to a display modifier 148. For example, the display modifier 148 may modify a display of a physiological parameter of the patient 102. In one implementation, the display modifier 148 may modify a display of a respiratory rate of the patient 102, an ECG chart of the patient 102, etc. For example, the display modifier 148 may superimpose the display of such a physiological parameter with a display of whether the patient 102, such as an infant, was in motion or not. For example, the part of the chart of the physiological parameter may be highlighted when the patient 102 is determined to be in motion. Alternatively, the color of the chart of the physiological parameter may be changed based on whether the patient 102 is in motion or not.
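As an illustrative sketch of such highlighting, a motion classification could be overlaid on a physiological trace as a shaded region; the trace, sampling rate, and motion interval below are synthetic placeholders, not values from the disclosure:

```python
import matplotlib.pyplot as plt
import numpy as np

t = np.arange(600) / 10.0             # 60 s of samples at an assumed 10 Hz
rr = 40 + 3 * np.sin(t / 5)           # synthetic respiratory rate trace
motion = (t > 20) & (t < 30)          # synthetic per-sample motion classification

fig, ax = plt.subplots()
ax.plot(t, rr, label="respiratory rate")
idx = np.flatnonzero(motion)
if idx.size:                          # shade the region where motion was detected
    ax.axvspan(t[idx[0]], t[idx[-1]], color="orange", alpha=0.3, label="motion")
ax.set_xlabel("time (s)")
ax.set_ylabel("breaths/min")
ax.legend()
plt.show()
```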
In an alternative implementation, the result of the classifier 146 may be used to generate a motion flag that is added to the display of another physiological parameter, such as a display of the respiratory rate of the patient, an ECG chart of the patient, a PPG chart of the patient, etc. In one implementation, the result of the classifier 146 may be monitored to determine a lack of motion by the patient for an extended period of time, such as more than a few minutes when the patient is awake or more than a few hours when the patient is asleep. Such extended lack of motion may indicate complications such as over-sedation, lethargy, etc. Subsequently, such detection of non-motion is used to generate a no-motion flag that may be added to the display of another physiological parameter, such as a display of the respiratory rate of the patient, an ECG chart of the patient, a PPG chart of the patient, etc.
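A minimal sketch of such extended no-motion detection, assuming a per-frame motion classification history is available; the time limits themselves (minutes awake versus hours asleep) are clinical choices outside this sketch:

```python
import numpy as np

def no_motion_flag(motion_history, fps, limit_minutes):
    """Return True if no motion has been classified for at least limit_minutes."""
    limit = int(limit_minutes * 60 * fps)
    recent = motion_history[-limit:]
    return len(recent) >= limit and not np.any(recent)

# e.g. 30 minutes of no-motion classifications at 30 frames per second
history = np.zeros(30 * 60 * 30, dtype=bool)
print(no_motion_flag(history, fps=30, limit_minutes=10))  # True
```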
Alternatively, the result of the classifier 146 may be used to generate a modifier signal that can be used to modify or filter another physiological parameter, such as a display of the respiratory rate of the patient, an ECG chart of the patient, a PPG chart of the patient, etc. For example, a portion of the display of the ECG signal may be removed in the vicinity of the motion flag. In one implementation, such filtering or removal of a portion of the physiological signal may be performed only when the output of the classifier 146 suggests that the patient's motion is above a threshold. Yet alternatively, the result of the classifier 146 or the motion flag may be used to generate or modify an alarm signal. In one implementation, the result of the classifier 146 is provided to the display module that generates the display of various physiological signals. Subsequently, the display module may incorporate the result of the classifier 146 to modify the display of a physiological signal and/or an alarm signal based on such physiological signal.
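Removal of a signal portion "in the vicinity of the motion flag" might look like the following sketch, where the padding width is an arbitrary illustration rather than a value from the source:

```python
import numpy as np

def mask_near_motion(signal, motion_flags, pad_samples=50):
    """Blank (NaN) samples within pad_samples of any motion-flagged sample."""
    masked = signal.astype(float).copy()
    for i in np.flatnonzero(motion_flags):
        masked[max(0, i - pad_samples): i + pad_samples + 1] = np.nan
    return masked
```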
An example of a temporal median depth frame is shown by 308. In one implementation, the temporal median depth frame 308 may be further processed to focus on the neonatal patient's location within the frame. For example, depths that are either too near or too far from the camera are removed based on user-specified percentile depth limits. For example, depths greater than the 87th percentile and depths lower than the 3rd percentile are removed to obtain a modified median depth frame 310.
Subsequently, as shown by operation 320, the depth difference frame 330 is computed by finding the difference between the two temporal median averaged depth frames fb 306a and fa 304a, as shown in
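A minimal sketch of these preprocessing steps, assuming depth frames arrive as NumPy arrays in millimeters; the frame stacks below are synthetic and the percentile limits mirror the examples in the text:

```python
import numpy as np

def temporal_median_frame(frames):
    """Collapse a stack of depth frames (T x H x W) into one median frame."""
    return np.median(frames, axis=0)

def clip_depth_percentiles(frame, lo_pct=3, hi_pct=87):
    """Drop depths outside user-specified percentile limits (set them to NaN)."""
    lo, hi = np.nanpercentile(frame, [lo_pct, hi_pct])
    out = frame.astype(float).copy()
    out[(out < lo) | (out > hi)] = np.nan
    return out

rng = np.random.default_rng(0)
frames_a = rng.normal(800.0, 5.0, size=(10, 120, 160))   # synthetic depth stack (mm)
frames_b = frames_a + rng.normal(0.0, 2.0, size=frames_a.shape)

fa = clip_depth_percentiles(temporal_median_frame(frames_a))
fb = clip_depth_percentiles(temporal_median_frame(frames_b))
depth_difference = fb - fa                                # depth difference frame
```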
Alternatively, depth differences that do not correspond to neonatal motion, such as excessively large depth differences (e.g., over 150 mm), are removed to generate an alternate denoised depth difference frame 334. In another implementation, area-based connected component removal is applied to identify and remove small pixel regions (connected areas of under 40 pixels) containing valid depth differences, as they are considered too small to be gross movements of the baby, generating an alternate denoised depth difference frame 336.
In alternative implementations, one or more of the denoising processes to generate the alternate denoised depth difference frames 334-336 are selectively applied. For example, depth differences higher than a threshold depth difference may be filtered out. Subsequently, the denoised depth difference frame can be used to derive input features over time where such input features are indicative of the amount of movement by the patient over various ranges and scales.
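One way the selective denoising above might be realized, assuming SciPy for connected-component labeling; the 150 mm and 40-pixel limits mirror the examples in the text, and the function name is illustrative:

```python
import numpy as np
from scipy import ndimage

def denoise_depth_difference(diff, max_diff_mm=150.0, min_area_px=40):
    """Zero out implausibly large differences, then drop small pixel regions."""
    out = np.nan_to_num(diff, nan=0.0)
    out[np.abs(out) > max_diff_mm] = 0.0      # differences too large to be motion
    labels, n = ndimage.label(out != 0)       # label connected nonzero regions
    areas = ndimage.sum(out != 0, labels, index=np.arange(1, n + 1))
    for lab in np.arange(1, n + 1)[areas < min_area_px]:
        out[labels == lab] = 0.0              # too small to be gross movement
    return out
```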
One or more such input features are disclosed in further detail in
Similarly, the summed depth difference features 406 and 408 represent the magnitude of motion, allowing detection of larger motions over smaller spatial areas. Specifically, 406 represents the sum of depth differences for pixels where the depth difference is greater than 3 mm, whereas 408 represents the sum of depth differences for pixels where the depth difference is in the range of 1-3 mm.
The features 410 and 412 are averaged depth features that are intended to represent depth difference features with respect to distance from the camera, such that the machine learning model is able to learn motions of different scale and magnitude regardless of where the camera is placed. Specifically, 410 represents the mean depth of all pixels where the depth difference is greater than 3 mm, whereas 412 represents the mean depth of all pixels where the depth difference is in the range of 1-3 mm.
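The band-limited sum and mean features could be computed along the following lines; the 1-3 mm and >3 mm bands come from the text, while the function name and dictionary keys are illustrative (here `depth` would be a median depth frame such as fb):

```python
import numpy as np

def band_features(diff, depth):
    """Summed depth-difference and mean-depth features over two motion bands."""
    mag = np.abs(np.nan_to_num(diff, nan=0.0))
    large = mag > 3.0                     # depth difference greater than 3 mm
    small = (mag >= 1.0) & (mag <= 3.0)   # depth difference in the 1-3 mm range
    return {
        "sum_diff_large": mag[large].sum(),                                   # cf. 406
        "sum_diff_small": mag[small].sum(),                                   # cf. 408
        "mean_depth_large": float(np.nanmean(depth[large])) if large.any() else 0.0,  # cf. 410
        "mean_depth_small": float(np.nanmean(depth[small])) if small.any() else 0.0,  # cf. 412
    }
```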
In alternative implementations, other features may be derived from the depth video streams indicating other depth/depth difference ranges and/or measures indicative of motion variability. For example, to further represent motion variability, measures such as the mean or median of per-pixel standard deviation of depths within the frame can be included as additional features. In addition, denoising parameters described above, including the number of temporally averaged depth frames, percentile depth limits, spatial median filter kernel size and connected component area limits, can be specified as input parameters to the ML model. Additionally, the depth difference ranges used to extract depth/depth difference-based features can be altered to represent varying spatial scales and magnitudes of motion and can be included as additional features.
Any of the features disclosed above may be used as input features for machine learning models. For example, one or more of the input features may be used to train a decision tree classifier or random forest classifier to classify periods of motion. Alternatively, other classifiers may also be used in the method, including, but not limited to: support vector machine, kNN, AdaBoost, boosted trees, linear regression, naïve Bayes, or a neural network classifier.
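For example, a random forest could be fit to the ML input feature matrix with scikit-learn; the feature matrix X and motion labels y below are synthetic stand-ins for the features and annotations described above:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))        # one row of input features per video sequence
y = (X[:, 0] > 0).astype(int)        # 1 = motion, 0 = no motion (synthetic labels)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```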
Specifically, an operation 802 generates a depth image data stream that is preprocessed at operation 804 to generate input features in the manner described above. An operation 806 classifies the stream of depth images into motion and no-motion regions. Subsequently, an operation 808 generates various statistics, which are displayed at an operation 810. The statistical output may be displayed on the screen, or it may be recorded for future use. Alternatively, it may be computed over various periods of time, including the whole stay of the neonate, every day, every 8 hours, every 4 hours, 2 hours, 1 hour, 30 minutes, etc. In one implementation, the statistics may be updated as a rolling window over these periods and sent to a patient record system.
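Rolling motion statistics over such periods might be computed as in this sketch, given a per-frame motion classification at a known frame rate; the window length is one of the example periods above:

```python
import numpy as np

def percent_motion(motion_flags, fps, window_minutes=30):
    """Rolling fraction of frames classified as motion within each window."""
    win = int(window_minutes * 60 * fps)
    kernel = np.ones(win) / win
    return np.convolve(motion_flags.astype(float), kernel, mode="valid")
```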
In yet another implementation, the motion detection classifier may also be used to improve determination of physiological signals. For example, a respiration rate algorithm may calculate the respiration rate as the average of a number of recently detected breaths. With the addition of the motion classification signal, the algorithm may “downweight” any breaths that are found during motion—leading to the calculation of a more robust respiration rate. In an alternative implementation, the respiration rate algorithm may hold its value for longer during a period of motion. Note that this could be done for many algorithms (e.g. for heart rate, SpO2, blood pressure, temperature, etc.) and for many devices, with respiration rate from a depth camera system being just an example.
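A hedged sketch of the downweighting idea: a weighted average of recent per-breath rates in which breaths detected during motion contribute less. The 0.2 weight is an arbitrary illustration, not a value from the source:

```python
import numpy as np

def robust_respiration_rate(breath_rates, breath_in_motion, motion_weight=0.2):
    """Average recent per-breath rates, downweighting breaths found during motion."""
    weights = np.where(breath_in_motion, motion_weight, 1.0)
    return np.average(breath_rates, weights=weights)

# five recent breaths (breaths/min); the 80 was detected during patient motion
rates = np.array([42.0, 45.0, 80.0, 38.0, 44.0])
moving = np.array([False, False, True, False, False])
print(robust_respiration_rate(rates, moving))   # pulled far less toward the 80
```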
An operation 906c generates the depth difference frame by subtracting the first temporal median fa from the second temporal median fb. An operation 908 determines a machine learning (ML) input feature matrix. The ML input feature matrix may be used to train an ML model such as a motion detection classifier. Subsequently, at operation 910, a real-time matrix of depth difference frames may be input into the trained ML model to identify an area of motion by the patient. An operation 912 may generate a motion flag based on the identified area of motion by the patient. Furthermore, the identified area of motion or the motion flag may be superimposed on a time-series display of a physiological signal of the patient. For example, the identified area of motion or the motion flag may be superimposed on a display of an ECG signal of the patient.
The detector 1010 includes a first camera 1014 and a second camera 1015, at least one of which includes an infrared (IR) camera feature. The detector 1010 also includes an IR projector 1016, which projects individual features (e.g., dots, crosses or Xs, lines, a featureless pattern, or a combination thereof).
The detector 1010 may be connected to the computing device 1020 through a wired or wireless connection. The computing device 1020 includes a housing 1021 with a touch screen display 1022, a processor (not seen), and hardware memory (not seen) for storing software and computer instructions.
The detector 1110 is supported on an arm 1101 that is attached to a bed, in this embodiment, a hospital bed, although the detector 1110 and the arm 1101 can be attached to a crib, a bassinette, an incubator, an isolette, or other bed-type structure. In some embodiments, the arm 1101 is pivotable in relation to the bed as well as adjustable in height to provide for proper positioning of the detector 1110 in relation to the subject.
The detector 1110 may be connected through a wired or wireless connection to the computing device 1120, which is supported on a moveable trolley or stand 1102. The computing device 1120 includes a housing 1121 with a touch screen display 1122, a processor (not seen), and hardware memory (not seen) for storing software and computer instructions.
The detector 1210 is supported on a stand 1201 that is free standing, the stand having a base 1203, a frame 1205, and a gantry 1207. The gantry 1207 may have an adjustable height, e.g., movable vertically along the frame 1205, and may be pivotable, extendible and/or retractable in relation to the frame 1205. The stand 1201 is shaped and sized to allow a bed or bed-type structure to be moved (e.g., rolled) under the detector 1210.
The computing device 1300 includes a processor 1315 that is coupled to a memory 1305. The processor 1315 can store and recall data and applications in the memory 1305, including applications that process information and send commands/signals according to any of the methods disclosed herein. The processor 1315 may also display objects, applications, data, etc. on an interface/display 1310 and/or provide an audible alert via a speaker 1312. The processor 1315 may also or alternately receive inputs through the interface/display 1310. The processor 1315 is also coupled to a transceiver 1320. With this configuration, the processor 1315, and subsequently the computing device 1300, can communicate with other devices, such as the server 1325 through a connection 1370 and the image capture device 1385 through a connection 1380. For example, the computing device 1300 may send to the server 1325 information determined about a subject from images captured by the image capture device 1385, such as depth information of a subject or object in an image.
The server 1325 also includes a processor 1335 that is coupled to a memory 1330 and to a transceiver 1340. The processor 1335 can store and recall data and applications in the memory 1330. With this configuration, the processor 1335, and subsequently the server 1325, can communicate with other devices, such as the computing device 1300 through the connection 1370.
The computing device 1300 may be, e.g., the computing device 120 of
The devices shown in the illustrative embodiment may be utilized in various ways. For example, either or both of the connections 1370, 1380 may be varied. For example, either or both the connections 1370, 1380 may be a hard-wired connection. A hard-wired connection may involve connecting the devices through a USB (universal serial bus) port, serial port, parallel port, or other type of wired connection to facilitate the transfer of data and information between a processor of a device and a second processor of a second device. In another example, one or both of the connections 1370, 1380 may be a dock where one device may plug into another device. As another example, one or both of the connections 1370, 1380 may be a wireless connection. These connections may be any sort of wireless connection, including, but not limited to, Bluetooth connectivity, Wi-Fi connectivity, infrared, visible light, radio frequency (RF) signals, or other wireless protocols/methods. For example, other possible modes of wireless communication may include near-field communications, such as passive radio-frequency identification (RFID) and active RFID technologies. RFID and similar near-field communications may allow the various devices to communicate in short range when they are placed proximate to one another. In yet another example, the various devices may connect through an internet (or other network) connection. That is, one or both of the connections 1370, 1380 may represent several different computing devices and network components that allow the various devices to communicate through the internet, either through a hard-wired or wireless connection. One or both of the connections 1370, 1380 may also be a combination of several modes of connection.
The configuration of the devices in
In contrast to tangible or non-transitory computer-readable storage media, intangible or transitory computer-readable communication signals may embody computer-executable instructions, data structures, program modules or other data resident in a modulated data signal, such as a carrier wave or other signal transport mechanism. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, intangible communication signals include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. In one implementation, the tangible or non-transitory computer-readable storage media may be implemented as a physical article of manufacture including one or more tangible computer-readable storage media.
Specifically, an operation 1402 receives an indication of the detected motion of the patient. For example, such an indication of the detected motion may be generated by a classifier such as the classifier 146 disclosed in
An operation 1602 receives an indication of the detected motion of the patient. For example, such an indication of the detected motion may be generated by a classifier such as the classifier 146 disclosed in
However, if the detected motion is not above the threshold, an operation 1710 may determine whether the detected motion signal indicates a lack of normal or expected motion by the patient for over a threshold time period. If so, an operation 1712 may generate an alarm and an operation 1714 may display the alarm on the physiological signal of the patient or sound the alarm. However, if the detected motion signal does not indicate a lack of normal or expected motion by the patient for over a threshold time period, there is no need to display an alarm and, therefore, no action is taken at operation 1722.
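The alarm branching described here could reduce to a small decision function; the threshold values are caller-supplied illustrations, and the assumption that the above-threshold branch raises its own alarm is inferred from the surrounding text:

```python
def alarm_decision(motion_level, motion_threshold,
                   seconds_without_motion, no_motion_limit_s):
    """Choose a high-motion alarm, a no-motion alarm, or no action."""
    if motion_level > motion_threshold:
        return "motion-alarm"              # above-threshold motion branch
    if seconds_without_motion > no_motion_limit_s:
        return "no-motion-alarm"           # cf. operations 1710-1714
    return None                            # cf. no action at operation 1722

print(alarm_decision(0.1, 0.5, 7200, 3600))   # -> no-motion-alarm
```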
The implementations described herein are implemented as logical steps in one or more computer systems. The logical operations may be implemented (1) as a sequence of processor-implemented steps executing in one or more computer systems and (2) as interconnected machine or circuit modules within one or more computer systems. The implementation is a matter of choice, dependent on the performance requirements of the computer system being utilized. Accordingly, the logical operations making up the implementations described herein are referred to variously as operations, steps, objects, or modules. Furthermore, it should be understood that logical operations may be performed in any order, unless explicitly claimed otherwise or a specific order is inherently necessitated by the claim language.
The above specification, examples, and data provide a complete description of the structure and use of exemplary embodiments of the invention. Since many implementations of the invention can be made without departing from the spirit and scope of the invention, the invention resides in the claims hereinafter appended. Furthermore, structural features of the different embodiments may be combined in yet another implementation without departing from the recited claims.
The present application claims benefit of priority to U.S. Provisional Patent Application No. 63/501,354, entitled “CONTEXTUALIZATION OF SUBJECT PHYSIOLOGICAL SIGNAL USING MACHINE LEARNING” and filed on May 10, 2023, which is specifically incorporated by reference herein for all that it discloses or teaches.