Telemedicine enables practitioners and patients (including disabled patients who have difficulty traveling to in-person consultations) to interact at any time from anywhere in the world, reducing the time and cost of transportation, reducing the risk of infection by allowing patients to receive care remotely, reducing patient wait times, and enabling practitioners to spend more of their time providing care to patients. Accordingly, telemedicine has the potential to improve the efficiency of medical consultations for patients seeking medical care, practitioners evaluating the effectiveness of a specific treatment (e.g., as part of a clinical trial), etc.
Telemedicine also provides a platform for capturing and digitizing relevant information and adding that data to the electronic health records of the patient, for example, using voice recognition and natural language processing to assist the provider in documenting the consultation and even recognizing the patient pointing to a region of interest and selecting a keyword identifying that region of interest.1
However, telemedicine has a number of drawbacks. Practitioners using existing telemedicine systems must rely on two-dimensional images and audio that are often low resolution, filtered, and compressed. With traditional telemedicine systems, the practitioner's view of the patient is limited and outside of the practitioner's control. Meanwhile, practitioners also cannot control other aspects of the patient environment, such as lighting, equipment, environmental distractions, etc. Finally, practitioners often find it difficult to simultaneously conduct the telemedicine consultation and document the exam without diminishing the quality of communication. In particular, maintaining eye contact, mental focus, and effective listening requires a large amount of the doctor's attention.
From the patient standpoint, many patients report that telehealth consultations yield lower quality of care compared to in-person visits, that health care providers are not able to conduct a physical exam at all, that they have difficulty seeing or hearing health care providers, that they feel less personally connected to the health care providers during telehealth visits, and/or that they have privacy concerns about telemedicine. The drawbacks of telemedicine are particularly acute for elderly or cognitively impaired patients, who may have difficulty using—or may not have access to—the computing devices required to connect to existing telemedicine systems.
Accordingly, there is a need for an improved system to enhance the usability and quality of telehealth communication. In particular, there is a need for a system that enables patients, particularly elderly and cognitively-impaired patients, to easily participate in telehealth sessions. Additionally, there is a need for a telehealth platform that noninvasively captures (and digitizes) information indicative of the physical, emotive, cognitive, and/or social state of the patient.
Disclosed is a cyber-physical system (e.g., a practitioner system and a patient system) for conducting a telehealth session between a practitioner and a patient. In some embodiments, the patient system includes a hardware control box that enables patients, including elderly or cognitively-impaired patients, to easily initiate the telehealth session (e.g., with a single click of a hardware button or by saying a simple voice command) without using any software application, touch display, keyboard, or mouse. In those embodiments, the system is particularly well suited for conducting a computer-assisted cognitive impairment assessment, for example by outputting questions for the patient, providing functionality for the patient to easily answer those questions using the hardware buttons on the control box, time stamping the questions and patient responses, and calculating variables indicative of the cognitive state of the patient based on the time-stamped questions and responses.
In some embodiments, the patient system includes environmental sensors, enabling the practitioner to view and assess environmental conditions (e.g., temperature, humidity, airborne particles, etc.) that may be affecting the health conditions of the patient.
In some embodiments, the system analyzes sensor data captured by the patient system (e.g., thermal images captured by a thermal imaging camera, eye tracking data captured by an eye tracker, three-dimensional images captured by a depth camera, etc.) and calculates state variables indicative of the physical, emotive, cognitive, or social state of the patient. For example, to calculate state variables (e.g., the Myasthenia Gravis core examination metrics), a computer vision module may perform computer vision analysis on patient video data and/or an audio analysis module may perform audio analysis on patient audio data.
In some embodiments, the state variables calculated by the system, together with the electronic health records of the patient and subjective assessments of the practitioner, form a “digital twin”—a mathematical representation of the physical, emotive, cognitive, and/or social state of the patient. In some of those embodiments, the digital twin may be used as an input of a heuristic computer reasoning system, which uses artificial intelligence to support clinical diagnosis and decision-making. For example, the heuristic computer reasoning engine may detect deviations from previously-determined state variables or identify potentially relevant diagnostic explorations.
The patient system includes a patient camera for capturing images of the patient. In some embodiments, the patient camera may be inside a camera enclosure that prevents the patient from seeing the patient camera (while still allowing the patient camera to capture images of the patient), to prevent the patient camera from distracting the patient and allow the patient to focus on the dialog with the practitioner. In some embodiments, the patient camera is a remotely-controllable pan-tilt-zoom (PTZ) camera that can be controlled remotely (e.g., by the practitioner or automatically by the system) to capture images of a region of interest that is relevant to the examination being performed. In some of those embodiments, the computer vision module may use the digital twin of the patient to recognize a region of interest in the patient video data captured by the pan-tilt-zoom camera and output control signals to the pan-tilt-zoom camera to zoom in on the region of interest.
Aspects of exemplary embodiments may be better understood with reference to the accompanying drawings. The components in the drawings are not necessarily to scale, emphasis instead being placed upon illustrating the principles of exemplary embodiments.
Reference to the drawings illustrating various views of exemplary embodiments is now made. In the drawings and the description of the drawings herein, certain terminology is used for convenience only and is not to be taken as limiting the embodiments of the present invention. Furthermore, in the drawings and the description below, like numerals indicate like elements throughout.
In the embodiment of
As described in detail below, the cyber-physical system 100 generates objective metrics indicative of the physical, emotive, cognitive, and/or social state of the patient 101. (Additionally, the cyber-physical system 100 may also provide functionality for the practitioner 102 to provide subjective assessments of the physical, emotive, cognitive, and/or social state of the patient 101.) Together with the electronic health records 184 of the patient 101, those objective metrics and/or subjective assessments are used to form a digital representation of the patient 101 (referred to as a digital twin 800, which is described in detail below with reference to
In the embodiment of
As shown in
In the embodiment of
In the embodiment of
Unlike keyboards and other generic user input devices, which enable users to provide input data for processing by any software application running on a computing system, the control box 300 may be a dedicated hardware device for users to provide input data (e.g., via the microcomputer 310) that is processed solely by the telehealth software described herein. While the control box 300 may be configured to perform multiple telehealth functions as described below (e.g., providing functionality for the patient 101 to initiate a telehealth session, capturing patient audio data via the patient microphone 350, outputting practitioner audio data via the speaker 360, providing functionality for the patient 101 to provide responses using the buttons 410 and 420, etc.), in some embodiments the control box 300 can be described as a single purpose hardware device, meaning the control box 300 is solely for use by the telehealth software described herein.
In some embodiments, the patient system 200 does not include any user input device (e.g., a keyboard, a mouse, etc.) other than the control box 300, enabling patients 101 (including elderly and/or cognitively-impaired patients 101) to easily initiate and participate in telehealth sessions as described below. In other embodiments, the patient system 200 includes the control box 300 in addition to one or more generic, multi-purpose user input devices (e.g., a keyboard, a mouse, etc.).
In the embodiment of
The communications modules 320 and 520 of the control box 300 and the patient computing system 500 may be any device suitably configured to send data from the control box 300 to the patient computing system 500 via a wired connection, a wireless connection (e.g., Bluetooth), a local area network 178, etc.
The presence of the patient camera 240 may distract the patient 101 and prevent the patient 101 from focusing on the interaction with the practitioner 102. In particular, in embodiments where the patient camera 240 is a remotely-controllable pan-tilt-zoom (PTZ) camera, any movement of the patient camera 240 may be particularly distracting. Therefore, in the embodiment of
By hiding the patient camera 240 in a “black box,” the camera enclosure 600 ensures that the patient camera 240 is as minimally invasive as possible, enabling the patient 101 to focus on the dialogue with the practitioner 102.
In the embodiment of
In order to perform the computer vision analysis described below (e.g., by the patient computing system 500), the patient video data 744 may be captured and/or analyzed at a higher resolution (and/or a higher frame rate, etc.) than is typically used for commercial video conferencing. Similarly, to perform the audio analysis described below, the patient audio data 743 may be captured and/or analyzed at a higher sampling rate, with a larger bit depth, etc., than is typical for commercial video conferencing software. Accordingly, while the patient video data 744 and the patient audio data 743 transmitted to the practitioner system 120 via the communications networks 170 may be compressed, the computer vision and audio analysis described below may be performed (e.g., by the patient computing system 500) using the uncompressed patient video data 744 and/or patient audio data 743.
In the embodiment of
More specifically, the sensor data classification module 720 may be configured to reduce or eliminate noise in the sensor data 740 and run lower-level artificial intelligence algorithms to identify specific patterns in the sensor data 740 and/or classify the sensor data 740 (e.g., as belonging to one of a number of predetermined ranges). In the embodiments of
As described in more detail below with reference to
In a clinical setting, for instance, the signal analysis module 725 may identify physical state variables 820 indicative of the physiological condition of the patient 101 (e.g., body temperature, pulse oxygenation, blood pressure, heart rate, etc.) based on physiological data 748 received from one or more physiological sensors 580 (e.g., a thermometer, a pulse oximeter, a blood pressure monitor, an electrocardiogram, data transferred from a wearable health monitor, etc.). Additionally, to provide functionality to identify physical state variables 820 in settings where physiological sensors 580 would be inconvenient or are unavailable, the sensor data classification module 720 may be configured to directly or indirectly identify physical state variables 820 in a non-invasive manner by performing computer vision and/or signal processing using other sensor data 740. For example, the thermal images 742 may be used to track heart beats2 and/or measure breathing rates.3
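As one hedged illustration of such indirect, noninvasive measurement, the sketch below estimates a breathing rate from the mean value of a nostril region sampled across successive thermal images 742 (exhaled air periodically warms that region). The ROI extraction, the frequency band, and the FFT-peak method are our illustrative choices under the cited approach, not details specified by the disclosure:

```python
import numpy as np

def breathing_rate_bpm(nostril_means, fps):
    """Estimate breathing rate from the mean value of a nostril ROI sampled
    once per thermal frame. `nostril_means` is a 1-D series of mean pixel
    values; `fps` is the thermal camera frame rate."""
    x = np.asarray(nostril_means, dtype=float)
    x = x - x.mean()                            # remove the DC component
    spec = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)
    band = (freqs >= 0.1) & (freqs <= 0.7)      # ~6 to 42 breaths per minute
    peak = freqs[band][np.argmax(spec[band])]   # dominant respiratory frequency
    return peak * 60.0                          # breaths per minute
```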
In some embodiments, for instance, the cyber-physical system 100 may be configured to enable the practitioner 102 to conduct a computer-assisted cognitive impairment test of the patient 101,4 such as the Automated Neuropsychological Assessment Metrics (ANAM) adapted to elderly people, the Cambridge Neuropsychological Test Automated Battery (CANTAB), MindPulse: Attention and Execution/Inhibition/Feedback to Difficulty, NeuroTrax, etc. To conduct the cognitive impairment test, test questions may be displayed to the patient 101 via the patient display 230. Meanwhile, because cognitive impairment tests require only yes or no answers, the cyber-physical system 100 enables the patient 101 to easily answer those test questions using the buttons 410 and 420 of the control box 300.
In addition to recording the test questions and the patient responses 741 to those questions, the sensor data classification module 720 uses the timer 728 to record time stamps indicative of when each test question was displayed on the patient display 230 and when each patient response 741 was provided. That time series is typically the only data produced when conducting a typical cognitive impairment test. However, the cyber-physical system 100 may be configured to use a number of input channels to provide a larger spectrum of information useful to the analysis and interpretation of the cognitive impairment test results. In some of those embodiments, for instance, physiological sensors 580 may be used to identify the physiological condition of the patient 101 (for instance, a breath sensor 340 may record physiological data 748 related to alcohol consumption or digestive issues). Additionally, the thermal images 742 may provide the input to one or more algorithms that identify indicators of stress, pain, cognitive load, and potentially vital signs.5 Similarly, the eye tracking data 745 may be used to identify evidence of the behavior and level of attention of the patient.6 Additionally or alternatively, the computer vision module 724 may analyze the patient video data 744 and use various algorithms to classify facial expressions7 and body language8 (e.g., to support an interpretation of the neurologic condition of the patient 101). Finally, the audio analysis module 723 may perform a multispectral analysis of the patient audio data 743, for example to detect stress and/or deception.9
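A minimal sketch of this time-stamping loop is shown below. The `display` and `buttons` objects are hypothetical stand-ins for driver interfaces to the patient display 230 and the control box buttons 410 and 420; the disclosure does not specify this interface:

```python
import time

def run_cognitive_test(questions, display, buttons):
    """Present yes/no questions on the patient display 230 and record
    time-stamped responses from the control box buttons 410 and 420.
    `display` and `buttons` are hypothetical driver objects."""
    records = []
    for question in questions:
        display.show(question)
        shown_at = time.monotonic()            # time stamp: question displayed
        answer = buttons.wait_for_press()      # blocks until "yes"/"no" pressed
        answered_at = time.monotonic()         # time stamp: response provided
        records.append({
            "question": question,
            "response": answer,
            "latency_s": answered_at - shown_at,  # one variable indicative of cognitive state
        })
    return records
```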
In some embodiments, the cyber-physical system 100 may enable the practitioner 102 to conduct a neurological examination of the patient 101. In those embodiments, the sensor data classification module 720 may be configured to compute the Myasthenia Gravis (MG) core examination metrics, for example by using the computer vision module 724 to identify and track facial and body movements of the patient 101 in the patient video data 744 and/or using the audio analysis module 723 to analyze the patient audio data 743 as outlined below and described in the inventors' forthcoming paper.
During the neurological examination, for example, the practitioner 102 may ask the patient 101 to perform an arm strength exercise and an exercise in which the patient must pass from a standing position to a seated position. In those instances, the computer vision module 724 may identify and track the movement of body landmarks 701 (e.g., as shown in
Similarly, the practitioner 102 may ask the patient 101 to perform a cheek puff exercise and a tongue-to-cheek exercise. In those instances, the computer vision module 724 may identify the face and/or eyes of the patient 101 in the patient video data 744 and identify and track facial landmarks 702 (e.g., as shown in
To assess diplopia, the computer vision module 724 may track eye motion to verify the quality of the exercise, identify the duration of each phase, and register the time stamp at which the patient reports that double vision occurs.14
To assess ptosis as shown in
To determine whether the patient 101 can perform the cheek puff exercise, the sensor data classification module 720 identifies cheek deformation by measuring the polygon 708 delimited by points (3), (15), (13), and (5) of
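For illustration, the area of the polygon 708 can be computed from the landmark coordinates with the standard shoelace formula, with the deformation metric being the change of that area over the exercise. This is a sketch under the assumption that the landmarks are available as (x, y) pixel coordinates; the function name is ours:

```python
def polygon_area(points):
    """Shoelace area of the cheek polygon 708, given its vertices as
    (x, y) pixel coordinates in landmark order, e.g., points (3), (15),
    (13), and (5) of the facial landmarks 702."""
    n = len(points)
    twice_area = sum(
        points[i][0] * points[(i + 1) % n][1]
        - points[(i + 1) % n][0] * points[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0
```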
In some embodiments, the cyber-physical system 100 may include a depth camera. In those embodiments, the sensor data classification module 720 may use the three-dimensional image data of the patient 101 to identify the local curvature of the cheek.17 However, lower cost depth cameras that use infrared and/or stereo images to measure depth (e.g., the Intel Realsense D435) may not be accurate enough to measure cheek deformation (particularly for patients 101 who have difficulty pushing their cheek with their tongues). Meanwhile, more accurate depth cameras that use time-of-flight technology to measure depth18 may be too expensive to include in many embodiments. Accordingly, in most embodiments, the computer vision module 724 uses the patient video data 744 to track mouth deformation and/or change in illumination of the cheek to measure cheek deformation and reproducibility.
Similar to what a medical doctor grades during a telehealth consultation, for example, the computer vision module 724 may determine when the cheek deformation starts, when the cheek deformation ends, and whether the cheek deformation gets weaker over time during the examination. For instance, because the local skin appearance changes as it gets dilated,19 the computer vision module 724 may identify and track a region of interest 703 between the mouth location and the external boundary of the cheek (where the deformation is expected to be significant) and calculate the average pixel value of the blue dimension of the RGB code over time during the exercise. Additionally or alternatively, because the average pixel value method may depend on skin color and may not be sufficient (e.g., in certain lighting conditions), the computer vision module 724 may take advantage of the fact that cheek deformation impacts mouth geometry and identify cheek puffs by tracking the movement of facial landmarks 702. For example, the computer vision module 724 may identify a cheek puff by determining whether the lips of the patient 101 take on a more rounded shape. Similarly, the computer vision module 724 may identify a tongue-to-cheek push by determining whether the upper lip is deformed.
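The average-pixel-value method can be sketched as follows with OpenCV. The fixed region of interest is a simplification of ours (in practice the region of interest 703 would be located per frame from the facial landmarks 702), and the function name is illustrative:

```python
import cv2
import numpy as np

def mean_blue_series(video_path, roi):
    """Average blue-channel value inside a cheek region of interest 703,
    one sample per frame. `roi` is an (x, y, w, h) box between the mouth
    and the outer cheek boundary."""
    x, y, w, h = roi
    cap = cv2.VideoCapture(video_path)
    series = []
    while True:
        ok, frame = cap.read()                 # OpenCV frames are BGR
        if not ok:
            break
        patch = frame[y:y + h, x:x + w, 0]     # channel 0 = blue
        series.append(float(np.mean(patch)))
    cap.release()
    return series                              # varies as the skin dilates
```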
During the arm strength assessment, the computer vision module 724 may track the upper body motion of the patient 101 and the steady position of both arms using a standard deep learning technique,20 which provides a precise metric of the angle of the arm versus body, stability of the horizontal arm position, and duration.
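A hedged sketch of such a measurement is shown below using MediaPipe Pose, one off-the-shelf deep learning pose tracker; the disclosure only calls for "a standard deep learning technique," so the library choice and the angle definition (shoulder-to-wrist versus shoulder-to-hip vectors) are our assumptions:

```python
import math
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose

def arm_torso_angles(video_path):
    """Per-frame angle between the left arm (shoulder-to-wrist) and the
    torso (shoulder-to-hip) during the arm strength exercise."""
    angles = []
    cap = cv2.VideoCapture(video_path)
    with mp_pose.Pose() as pose:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            result = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if result.pose_landmarks is None:
                continue
            lm = result.pose_landmarks.landmark
            sh = lm[mp_pose.PoseLandmark.LEFT_SHOULDER]
            wr = lm[mp_pose.PoseLandmark.LEFT_WRIST]
            hp = lm[mp_pose.PoseLandmark.LEFT_HIP]
            arm = (wr.x - sh.x, wr.y - sh.y)       # shoulder -> wrist
            torso = (hp.x - sh.x, hp.y - sh.y)     # shoulder -> hip
            cos = ((arm[0] * torso[0] + arm[1] * torso[1])
                   / (math.hypot(*arm) * math.hypot(*torso)))
            angles.append(math.degrees(math.acos(max(-1.0, min(1.0, cos)))))
    cap.release()
    return angles
```

The stability of the horizontal arm position can then be summarized as the variance of this angle series, and the duration as the number of frames during which the angle stays near 90 degrees.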
Additionally, the practitioner 102 may ask the patient 101 to count to 50 and count to the highest number possible in a single breath. In those instances, the sensor data classification module 720 may determine the highest number the patient 101 can count to in a single breath (e.g., less than 20, between 20 and 24, between 25 and 29, or more than 30), for example, using a speech recognition algorithm, and/or whether the patient 101 experiences shortness of breath (e.g., shortness of breath with exertion, shortness of breath at rest, or ventilator dependence). In some embodiments, the sensor data classification module 720 may determine whether the patient 101 experiences shortness of breath using three-dimensional images of the patient 101 captured by a depth camera and/or by analyzing the thermal images 742 to track the extent of the plume of warm air that is exhaled by the patient 101.21 Additionally or alternatively, the audio analysis module 723 may extract features from the patient audio data 743 indicative of shortness of breath, such as:
In addition to the standard MG scores described above, the sensor data classification module 720 may also be configured to calculate state variables 810 indicative of the dynamic of neuromuscular weakness of the patient 101 during the time interval of each exercise, which are essential to create a digital twin 800 of neuromuscular weakness. For instance, an essential model can assimilate the core examination data for each of the following muscle groups: left and right eyes, tongue to left cheek and right cheek, pulmonary diaphragm, left arm and right arm, left leg and right leg. Because each fatigue exercise corresponds to the activation by the central nervous system of one of those muscle groups for a specific duration, the sensor data classification module 720 may calculate a time dependent curve that represents the physical response of that activation. A simple three compartment model of muscle fatigue can be expressed as follows:
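(The equation block itself does not survive in this text. A reconstruction consistent with the loop cycling $M_{uc} \to M_A \to M_F \to M_{uc}$ suggested by the surrounding description is given below; the exact form is our assumption, not the original typesetting.)

$$\frac{dM_A}{dt} = B\,M_{uc} - F\,M_A, \qquad \frac{dM_F}{dt} = F\,M_A - R\,M_F, \qquad \frac{dM_{uc}}{dt} = R\,M_F - B\,M_{uc}, \qquad M_A + M_F + M_{uc} = M_0$$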
where $t$ corresponds to the time scale of the physical exercise, $M_0$ is the total available motor units (decomposed into the group of activated muscles $M_A$, already fatigued muscles $M_F$, and muscles at rest $M_{uc}$), $B$ is the activation rate, $F$ is the fatigue rate, and $R$ is the recovery rate. The model of muscle fatigue above is inspired by Jing et al.,26 except that loop cycling is used between the three compartments. Additionally, while there is always a residual positive amount of muscle activated in the model of Jing et al., the model of muscle fatigue above leads to a limit state that is zero for the available motor units of muscles at rest $M_{uc}$, which seems more realistic from the biological point of view. Because the system of differential equations is linear with constant coefficients (i.e., the activation rate $B$, the fatigue rate $F$, and the recovery rate $R$), it is straightforward to write down the explicit solution and check the asymptotic behavior as in Jing et al.
The model of muscle fatigue may be modified to take into account the potential defective activation of the muscle fibers due to a generic auto-immune factor. For example, an autoimmune response $Q$ may be modulated in each muscle group by a generic vascularization factor that takes into account the local distribution of the autoimmune factor. The model may be stochastic, meaning the activation comes with a probability-of-activation component $p(N)$, where $N$ is the order of magnitude of the number of muscle fibers. For instance, a phenomenological model of muscle fatigue for each muscle group having an index $j$ may be expressed as follows:
where $Q_0$ is a generic autoimmune factor that is common to all muscle groups and, for each muscle group having an index $j$, $Q_j$ is the autoimmune factor, $V_j$ represents the impact of vascularization on that muscle group, $N_j$ is the number of muscle fibers, $M_{A,j}$ is the available motor units of activated muscles, $M_{F,j}$ is the available motor units of already fatigued muscles, $M_{uc,j}$ is the available motor units of muscles at rest, $B_j$ is the activation rate, $F_j$ is the fatigue rate, $R_j$ is the recovery rate, $M_{A,j}+M_{F,j}+M_{uc,j}=1$ after normalization, and $p(N_j)$ is a probability distribution that approaches a bell curve as the number of fibers $N_j \to \infty$. The phenomenological model above is designed in such a way that:
The digitalization of state variables 810 indicative of the dynamic of neuromuscular weakness of the patient 101 enables the system 100 (e.g., the server 180) to build a large unbiased database 182 of data about the patient 101. The quality of the dataset supports the classification of the treatment of patients 101 as a function of the severity of the score in each of the above categories, as well as the fitting of a stochastic dynamic system:
where $\vec{S}(T)$ is the state variable 810 describing the MG patient condition, $\vec{C}(T)$ is the control variable corresponding to drug treatment, and $T$ is the long time scale of the patient disease (as opposed to the short time scale $t$ of the core physical examination). Overall, the digital twin 800 is then multiscale in time. To be more specific, the vector $\vec{S}(T)$ contains at minimum the baseline autoimmune factor $Q_0$ that is common to all muscle groups and may include gene regulation factors (as in the model of vascular adaptation described in Casarin et al.27). The control variable $\vec{C}(T)$ may focus on the drug treatment and may include comorbidity factors as well.
The cyber-physical system 100 provides unique advantages compared to traditional telehealth systems. Referring back briefly to
As shown in
Accordingly, once the telehealth connection is established, the cyber-physical system 100 enables the practitioner 102 to get the best view of the patient 101, zoom in and out on the regions of interest 703 important to the diagnosis, orient the patient display 230 so the patient 101 is well positioned to view the practitioner 102, and control the sound volume of the patient speaker 260 and/or 360, the sensitivity of the patient microphone 350, and the brightness of the lighting in the patient environment 110. Accordingly, the practitioner 102 benefits from a much better view of the region of interest than with an ordinary telehealth system. For example, it would be much more difficult to ask an elderly patient 101 to hold a camera toward the region of interest to get the same quality of view.
As shown in
Traditional telemedicine systems can introduce significant variability in the data acquisition process (e.g., patient audio data 743 recorded at an inconsistent volume, patient video data 744 recorded in inconsistent lighting conditions). In order to calculate accurate state variables 810, it is important to reduce that variability, particularly when capturing sensor data 740 from the same patient 101 over multiple telehealth sessions. Accordingly, the cyber-physical system 100 may output control signals 716 to reduce variability in the data acquisition process. For example, the lighting calibration module 768 may determine the brightness of the patient video data 744 and output control signals 716 to the lighting system 114 to adjust the brightness in the patient environment 110.
As described above, the control box 300 may include a microphone 350 in order to better capture the voice of the patient 101. Additionally, the audio calibration module 762 may form a feedback loop to calibrate the sound volume of the patient speaker 360 and/or the sensitivity of the patient microphone 350. For example, the beeper 370 may output a consistent tone (e.g., via the patient speaker 360), which may be captured by the audio calibration module 762 via the patient microphone 350. The audio calibration module 762 may then calculate the volume (for example, using algorithms defined in the ITU-R BS.1770-4 and EBU R 128 standards) and adjust the sound volume of the patient speaker 360 and/or the sensitivity of the patient microphone 350.
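As a hedged illustration of the measurement step, the sketch below computes the loudness of the captured calibration tone with pyloudnorm, an open-source implementation of the ITU-R BS.1770 meter; the disclosure names the standards but not this library, and the -23 LUFS target (the EBU R 128 reference level) is our assumed set point:

```python
import soundfile as sf
import pyloudnorm as pyln

TARGET_LUFS = -23.0  # EBU R 128 reference level (our assumed target)

def gain_correction_db(calibration_wav):
    """Measure the loudness of the calibration tone captured via the patient
    microphone 350 and return the dB correction toward the target level."""
    data, rate = sf.read(calibration_wav)
    meter = pyln.Meter(rate)                    # ITU-R BS.1770 K-weighted meter
    loudness = meter.integrated_loudness(data)  # in LUFS
    return TARGET_LUFS - loudness               # positive => raise gain/volume
```

The returned correction could then drive the feedback loop that adjusts the sound volume of the patient speaker 360 or the sensitivity of the patient microphone 350.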
The patient tracking module 764 may use the patient video data 744 to track the location of the patient 101 and output control signals 716 to the patient camera 240 (to capture images of the patient 101) and/or to the display base 234 to rotate and/or tilt the patient display 230 towards the patient 101. Additionally or alternatively, the patient tracking module 764 may adjust the pan, tilt, and/or zoom of the patient camera 240 to automatically provide a view selected by the practitioner 102 (e.g., centered on the face of the patient 101, capturing the upper body of the patient 101, a view for a dialogue with the patient 101 and a nurse or family member, etc.), or to provide a focused view of interest based on sensor interpretation of vital signs or body language in autopilot mode.28
In preferred embodiments, the patient tracking module 764 automatically adjusts the pan, tilt, and/or zoom of the patient camera 240 to capture each region of interest 703 relevant to each assessment being performed. As shown in
Additionally, to limit any undesired impact on the emotional and social state of the patient 101 caused by the telehealth session, in some embodiments the cyber-physical system 100 may monitor the emotive state variables 840 and/or social state variables 880 of the patient 101 and, in response to changes in those variables, adjust the view output by the patient display 230, the sounds output via the patient speakers 260 and/or 360, and/or the lights output by the lighting system 114 and/or the buttons 410 and 420 (e.g., according to preferences specified by the practitioner 102) to minimize those changes in the emotive state variables 840 and/or social state variables 880 of the patient 101.
As shown in
In the counting to 50 and single breath counting exercises described above, for example, the patient 101 may output different airflow depending on how quickly and loudly the patient 101 is counting. Accordingly, the timer 728 may be used to provide a visual aid 718 (e.g., via the patient display 230) to guide the patient 101 to count with a consistent rhythm of about one number counted per second. Additionally, to ensure that patient audio data 743 is captured at a consistent volume as described above, the audio calibration module 762 may analyze the patient audio data 743 and provide a visual aid 718 to the patient 101 (e.g., in real time) instructing the patient 101 to speak at a higher or lower volume.
Additionally, digitalization of the ptosis, diplopia, cheek puff, tongue-to-cheek, arm strength, and stand-to-sit assessments depends heavily on controlling the framing of the regions of interest 703 (and the distance from the patient camera 240 to the region of interest 703). Therefore, the patient video data 744 may be output to the patient 101 (and/or the practitioner 102) with a landmark 719 (e.g., a silhouette showing the desired size of the patient 101) so the practitioner 102 can make sure the patient 101 is properly centered and distanced from the patient camera 240.
Additionally, in some embodiments, the cyber-physical system 100 provides the practitioner 102 with real-time environmental data 747 indicative of the environmental conditions (e.g., temperature, humidity, and airborne particle rates) in the patient environment 110 so that the practitioner 102 may assess the parameters that may affect the health conditions of the patient.29 For example, senior patients often do not realize that they get dehydrated if the temperature is high and the humidity is low, or that they risk pneumonia if the temperature is low and the humidity is high. Similarly, high particle counts in the air due to a lack of room ventilation are related to a greater risk of airborne disease. All those factors, for instance, may contribute to an abnormally low cognitive performance. (The environmental data 747 may be used as inputs of the digital twin 800 of the patient 101 as controlled variables because they impact the evolution of the principal state variables 810.) As described below with reference to
The cyber-physical system 100 even provides features to address the main drawback of telehealth consultations: the inability of practitioners 102 to physically interact with the patient 101. When examining a patient 101 complaining of abdominal pain, for instance, the thermal imaging camera 250 may detect inflammation that manifests as an intense sub-surface vascularization flow.30 Because the practitioner 102 cannot touch the patient 101 to localize pain points and abdomen stiffness, a practitioner 102 using a traditional telemedicine platform is typically reduced to asking the patient 101 to press on their abdomen at specific locations and get feedback on pain or discomfort. However, the cyber-physical system 100 offers possibilities to mitigate that difficulty. First, a laser pointer 550 may be mounted on top of the patient display 230, which can be activated to show the patient precisely where the doctor's region of interest is located. The patient camera 240 system can automatically track the region of interest by using either the laser pointer 550 landmark or a thermal map of the abdomen generated using the thermal images 742. Second, using the computer vision module 724 and the thermal images 742, the practitioner 102 can get an indirect assessment of abdominal stiffness and local inflammation. Finally, as mentioned above, the pain level can be analyzed from facial expressions and correlated to the patient feedback registered in the doctor's medical notes.
In some embodiments, the cyber-physical system 100 can also be used in conjunction with a patient surveillance system and quickly and automatically establish telehealth communication with medical staff when needed. For example, a patient 101 (e.g., having Alzheimer's disease) in the clinical setting of
In each utilization scenario described above, the input provided to the system is rich enough to allow the construction of a digital twin 800 of the patient, much as one can be constructed from a genome,33 but with a scalable database that integrates multiple modalities related to the anatomy, behavior, and environmental conditions of the patients 101.
In principle, a digital twin is “a virtual representation that serves as the real-time digital counterpart of a physical object or process” and was first introduced for manufacturing processes.34 A digital twin 800 of a patient 101 may be far more complex and may require strong ethical principles.35 In the cyber-physical system 100 described herein, the digital twin 800 is a model that can behave as a user of a telehealth system. The digital twin 800 of the patient 101 is an agent-based model with state variables 810 that represent the anatomic and metabolic variables of the patient 101 as well as a behavior state related to stress, cognitive load, memory components, etc. Statistical rules describe how the dynamical system transitions from one set of state variable values 810 to another in time. State variables 810 are preferably discrete numbers, and statistical rules are parametrized. Such an agent-based model has hundreds of unknown parameters that can be interpreted as the “gene set” of the user with respect to the model (similar to the inventors' work in system biology36). As described below, those unknown parameters are calibrated using telehealth data. Meanwhile, the model is calibrated dynamically and keeps evolving with the patient observations accumulated at each telehealth session.
As shown in
The digital twin 800 of the patient 101 is specific to the disease management and describes the dynamic of the variables (e.g., physical state variables 820, emotive state variables 840, cognitive state variables 860, and/or social state variables 880) under the stimuli of the cognitive test or medical examination run by the practitioner 102.
The physical state variables 820 are models specific to the disease. In the Myasthenia Gravis (MG) example above, for instance, the physical state variables 820 may include eye motion, upper body motion, and facial motion focusing on the lips when talking or on the cheek when the patient uses their tongue. As described above, the sensor data classification module 720 may directly or indirectly recover many physical state variables 820 through the sensor data 740 using computer vision and signal processing. As described below with reference to
The emotive state variables 840 may include indicators of happiness, sadness, fear, surprise, disgust, anger, or a neutral state. As described above, the sensor data classification module 720 may identify many emotive state variables 840 through facial expression and eye tracking.
The cognitive state variables 860 include the ability to process a specific task and/or the ability to memorize specific data. Typically, the telehealth session concentrates on a specific cognitive variable to establish a diagnosis and monitor the progression of a disease.
As described above, the cyber-physical system 100 may be used to conduct a cognitive assessment and the sensor data classification module 720 may identify many cognitive state variables 860 through sensor data 740. As such, the cyber-physical system 100 can be used in controlled consultations (such as the consultations for Myasthenia Gravis patients 101 described above) and can also be beneficial in routine consultations for a patient 101 without chronic disease.
The social state variables 880 measure some aspect of aggressive behavior, mutualistic behavior, cooperative behavior, altruistic behavior, and parental behavior.
Emotive state variables 840, cognitive state variables 860, and social state variables 880 are correlated with noninvasive measurements during human computer interaction37 and may be identified using the sensor data 740 captured by the sensor array of the cyber-physical system 100.38
The digital twin 800 can also be used to support the heuristic computer reasoning engine 890 in delivering telehealth assistance.
Based on the digital twin 800, the heuristic computer reasoning engine 890 runs an artificial intelligence algorithm (on-site or in the cloud) that can reconcile all these data sets and produce a high-quality analytical report that supports the provider's diagnostic analysis and decisions. The heuristic computer reasoning engine 890 continually consults the digital twin 800 of the patient 101 to assist the practitioner 102 with the workflow process—starting with the medical history and updates and following with the analytical processing of cognitive and other test results in the context of all of the sensor data 740 described above—to support a high quality clinical practice.
The heuristic computer reasoning engine 890 may be, for example, the HERALD system described in PCT Pub. No. 2022/217263, which describes a broad spectrum of applications that have been tested in clinical conditions to improve workflow efficiency and safety and is hereby incorporated by reference. The HERALD system 890 provides the architecture of a heuristic computer reasoning system that assists in optimizing workflow efficiency by exercising the digital process analogue of a “thought experiment.” The heuristic computer reasoning engine 890 comprises a “learning” algorithm developed as a system. For example, HERALD 890 can assist a system operator in charge of a medical facility with complex workflow efficiency issues and multiple human factor contingencies to gain a higher level of certainty in processes and increased efficiency in resource utilization, both staff and equipment. The same architecture can be a heuristic computer reasoning system 890 that coaches an individual to improve their behavior according to some specific objective, such as work quality, quality of health, or enhanced productivity.
Once the heuristic computer reasoning system 890 is in place, it allows the construction of a sentient computer system that generates its own most relevant questions and provides optimum decision-making support to an operator based on continuous monitoring of working environments through multiple inputs and sensors. Not only does the present system allow for a more statistically based, informed decision-making tool, but also the system monitors collected data with a continually adaptive system in order to more accurately predict potential future outcomes and events. As more data is collected and as data sets expand, the inherent predictive value of the system increases through enhanced accuracy and reliability. This value is manifested in increasingly efficient workflow operations which may be best reflected through resultant changes in the principal's behavior and overall satisfaction.
The cyber-physical system 100 uses the same three components as the HERALD system: sensing (a set of low-level artificial intelligence algorithms running on sensor output that acquire multimodal input from the sensor array mounted on the display and from patient input), communication (a medium-level layer of artificial intelligence that communicates with the medical doctor running the telehealth session by text messages, graphic user interface, or voice recognition), and an evolving model of the problem domain for the patient consultation to support heuristic reasoning, using a customized digital twin 800 of the medical doctor workflow that assimilates data from the sensors and communications to best describe the clinical practice.
Once this agent-based model is in place (or any dynamical system that mimics the patient's reactions to external stimuli, such as environmental conditions, medical conditions including those induced by drugs, questions posed by the provider, etc.), one can exercise the heuristic computer reasoning schema 890 to support the telehealth session of the provider. As shown in
As shown in
Traditionally, the clinical decisions of a provider during a telehealth or in-person visit operate on a decision tree.39 This is a standard practice used by providers to determine the best course of action. The simplicity of the decision tree graph makes the construction of the clinical decision algorithm amenable to artificial intelligence techniques such as support vector machines,40 random forests,41 and deep learning.42 The simplicity and ease of use of clinical decision trees make the method very popular. However, decision trees operate on discrete sets, ignoring the continuity in time and state variable space of the patient condition and eventually erasing any nuance in clinical decision. Decision trees also have the tendency to suppress the critical thinking needed in precision medicine and leave the practitioner 102 with little choice under the assumption that the clinical decision tree is the standard of care.
By contrast, as shown in
As shown in
The digital twin 800 provides a rigorous mathematical framework that allows the heuristic computer reasoning engine 890 to use artificial intelligence to systematically test out clinical options by generating and simulating “what if?” considerations. As shown in
By reducing the prospective and retrospective analysis to mathematical formulations executed using the digital twin 800 as described below, the heuristic computer reasoning engine 890 can suggest answers to prefactual questions, suggest answers to counterfactual questions, suggest answers to semi-factual questions, suggest answers to predictive questions, perform hindcasting, perform retrodiction, perform backcasting, etc.
The following mathematical framework may be used to implement the heuristic computer reasoning engine 890:
where $\omega_i \ge 0$, $i = 1, \ldots, N$.
where $\kappa(S_0)$ denotes a real number that depends only on the state value $S_0$.
where $\kappa(C_0)$ denotes a real number that depends only on the control value $C_0$.
The heuristic computer reasoning engine 890 can suggest answers to prefactual questions (i.e., “What will be the outcome if event X occurs?”). The abstract formulation of event X is a sudden change of the state variable $\vec{S}$, denoted $\Delta\vec{S}$. The mathematical formulation of this question is
The algorithm to answer that problem is based on a forward digital twin run:
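A minimal sketch of such a forward run is given below, assuming the digital twin's transition dynamics have been calibrated and can be written as $d\vec{S}/dt = f(\vec{S}, \vec{C})$; the explicit Euler scheme and all function names are our illustrative choices, not the disclosure's:

```python
import numpy as np

def forward_run(f, s0, controls, dt):
    """Integrate the digital-twin state model ds/dt = f(s, c) forward in
    time with explicit Euler steps, one control value per step."""
    s = np.asarray(s0, dtype=float)
    trajectory = [s.copy()]
    for c in controls:
        s = s + dt * f(s, c)          # one Euler step of the twin dynamics
        trajectory.append(s.copy())
    return np.array(trajectory)

def prefactual_outcome(f, s0, delta_s, controls, dt):
    """Answer "what will be the outcome if event X occurs?" by comparing a
    baseline run with a run started from the perturbed state s0 + delta_s."""
    baseline = forward_run(f, s0, controls, dt)
    perturbed = forward_run(f, np.asarray(s0, dtype=float) + delta_s, controls, dt)
    return perturbed - baseline       # projected effect of event X over time
```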
Accordingly, the heuristic computer reasoning engine 890 can suggest answers to prefactual questions, for example:
The heuristic computer reasoning engine 890 can suggest answers to counterfactual questions (i.e., “What might have happened if X had happened instead of Y?”). The mathematical formulation of this question is the same as above, provided that $\Delta\vec{S}$ denotes the change in the state variable when switching from event X to event Y. Accordingly, the heuristic computer reasoning engine 890 can suggest answers to counterfactual questions, for example:
The heuristic computer reasoning engine 890 can suggest answers to semi-factual questions (i.e., “Even though X occurs instead of Y, would Z still occur?”). For instance, assume that event Z corresponds to the $j$-th component of the projected value in the objective space. Using $\Delta\vec{S}$ defined as above, the mathematical formulation of this question would be:
The heuristic computer reasoning engine 890 answers that question by substituting the digital twin simulation for the observation. Accordingly, the heuristic computer reasoning engine 890 can suggest answers to semi-factual questions, for example:
The heuristic computer reasoning engine 890 can suggest answers to predictive questions (i.e., “Can we provide forecasting from stage Z?”). To formulate that question more rigorously, we need to specify how far ahead in time the prediction should extend, i.e., set $\Delta T$, and how accurate the prediction should be to be valuable. For instance, let $\vec{\epsilon}$ be the tolerance for an admissible prediction value in the objective value space. The mathematical formulation of the question is:
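(The inequality itself does not survive in this text. A plausible reconstruction from the surrounding definitions is given below; the projection $\Pi$ onto the objective value space is our hypothetical notation, not the original's.)

$$\left|\,\Pi\!\left(\vec{S}_{\mathrm{obs}}(t+\Delta T)\right)_j - \Pi\!\left(\vec{S}_{\mathrm{twin}}(t+\Delta T)\right)_j\right| \le \vec{\epsilon}_j, \qquad j = 1, \ldots, N$$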
If this inequality is satisfied, the prediction is correct for each component of the objective function. On the one hand, the heuristic computer reasoning engine 890 checks that inequality by comparing observation with simulation, starting from some specific state value in the region of interest. On the other hand, the heuristic computer reasoning engine 890 uses the continuity of the error estimate with respect to the state variable, together with past observations, to answer that question for any state values close enough to $\vec{S}(t)$. Accordingly, the heuristic computer reasoning engine 890 can suggest answers to predictive questions, for example:
The heuristic computer reasoning engine 890 can perform hindcasting (i.e., “Can we provide forecasting from stage Z with new event X?”). That question is no different from the previous one except that it applies to $\vec{S}(t) + \Delta\vec{S}$, where $\Delta\vec{S}$ stands for new event X:
The heuristic computer reasoning engine 890 uses the same learning process from past experience described above to handle that problem. Accordingly, the heuristic computer reasoning engine 890 can perform hindcasting, for example:
The heuristic computer reasoning engine 890 can perform retrodiction (i.e., past observations, events and data are used as evidence to infer the process(es) that produced them). The heuristic computer reasoning engine 890 starts from a past observation
that has been tracking the state variable evolution in time for a given control variable. The heuristic computer reasoning engine 890 first verifies that the model has been predictive:
The heuristic computer reasoning engine 890 assumes that the error estimate is continuous with respect to the state variable. The mathematical formulation of retrodiction can be done in many different ways depending on the level of causality the heuristic computer reasoning engine 890 is looking for. In its simplest form, the heuristic computer reasoning engine 890 looks for the variable component that changes the outcome significantly, i.e.:
Find $j \in \{1, \ldots, N\}$ such that $\left|\left(\varepsilon(\Delta t)\left(\vec{S}(t-\Delta t)+\Delta\vec{S}(t-\Delta t),\,\vec{C}(t-\Delta t)\right)\right)_j\right| \gg \vec{\epsilon}_j$.
That problem is amenable to standard optimization techniques. A more sophisticated analysis would involve a nonlinear sensitivity analysis on all potential events or combination of events represented by $\Delta\vec{S}$ in a neighborhood of $\vec{S}$ to be defined. Accordingly, the heuristic computer reasoning engine 890 can perform retrodiction, for example:
The heuristic computer reasoning engine 890 can perform backcasting (i.e., moving backwards in time, step-by-step, in as many stages as are considered necessary, from the future to the present to reveal the mechanism through which that particular specified future could be attained from the present). The mathematical formulation of that question can be derived from the above one. For example, assuming that
one can move one more step back to identify an event that would change the outcome:
Find $j \in \{1, \ldots, N\}$ such that $\left|\left(\varepsilon(\Delta t)\left(\vec{S}(t-\Delta t)+\Delta\vec{S}(t-\Delta t),\,\vec{C}(t-\Delta t)\right)\right)_j\right| \gg \vec{\epsilon}_j$.
or repeat the process backward in time until such an event is found. That would assume that the validity of the prediction holds for that many time steps backward, i.e.
where n is the number of back steps Δt involved. Accordingly, the heuristic computer reasoning engine 890 can perform backcasting, for example:
As shown in
As described above, construction of a digital twin 800 model of the telehealth patient 101 supports heuristic computer reasoning 890 in the specific medical problem domain as an adjunct to the telehealth workflow and reporting that the practitioner 102 is handling. Meanwhile, similar to the modern cockpit of a fighter jet (which can assist the pilot in focusing on the objective, gathering lateral information that may have escaped the pilot's attention, and supporting flight conditions semi-automatically in order to lower the pilot's cognitive load), the practitioner user interface 900 provides a “smart and enhanced cockpit” capability that filters out information that is not needed and could otherwise overload the practitioner 102.
The server 180, the physician system 120, the microcomputer 310 of the control box 300, and the compact computer 510 of the patient computing system 500 may be any hardware computing device capable of performing the functions described herein. Accordingly, each of those computing devices includes non-transitory computer readable storage media for storing data and instructions and at least one hardware computer processing device for executing those instructions. The computer processing device can be, for instance, a computer, personal computer (PC), server or mainframe computer, or more generally a computing device, processor, application specific integrated circuit (ASIC), or controller. The processing device can be provided with, or be in communication with, one or more of a wide variety of components or subsystems including, for example, a co-processor, register, data processing devices and subsystems, wired or wireless communication links, user-actuated (e.g., voice or touch actuated) input devices (such as touch screen, keyboard, mouse) for user control or input, monitors for displaying information to the user, and/or storage device(s) such as memory, RAM, ROM, DVD, CD-ROM, analog or digital memory, database, computer-readable media, and/or hard drive/disks. All or parts of the system, processes, and/or data utilized in the system of the disclosure can be stored on or read from the storage device(s). The storage device(s) can have stored thereon machine executable instructions for performing the processes of the disclosure. The processing device can execute software that can be stored on the storage device. Unless indicated otherwise, the process is preferably implemented automatically by the processor substantially in real time without delay.
The processing device can also be connected to or in communication with the Internet, such as by a wireless card or Ethernet card. The processing device can interact with a website to execute the operation of the disclosure, such as to present output, reports and other information to a user via a user display, solicit user feedback via a user input device, and/or receive input from a user via the user input device. For instance, the patient system 200 can be part of a mobile smartphone running an application (such as a browser or customized application) that is executed by the processing device and communicates with the user and/or third parties via the Internet via a wired or wireless communication path.
The system and method of the disclosure can also be implemented by or on a non-transitory computer readable medium, such as any tangible medium that can store, encode or carry non-transitory instructions for execution by the computer and cause the computer to perform any one or more of the operations of the disclosure described herein, or that is capable of storing, encoding, or carrying data structures utilized by or associated with instructions. For example, the database 182 is stored in non-transitory computer readable storage media that is internal to the server 180 or accessible by the server 180 via a wired connection, a wireless connection, a local area network, etc.
The heuristic computer reasoning engine 890 may be realized as software instructions stored and executed by the server 180.
In some embodiments, the sensor data classification module 720 may be realized as software instructions stored and executed by the server 180, which receives the sensor data 740 captured by the patient computing system 500 and data (e.g., input by the physician 102 via the physician user interface 900) from the physician computing system 120. In preferred embodiments, however, the sensor data classification module 720 may be realized as software instructions stored and executed by the patient system 200 (e.g., by the compact computer 510 of the patient computing system 500). In those embodiments, the patient system 200 may classify the sensor data 740 (e.g., as belonging to one of a number of predetermined ranges and/or including any of a number of predetermined patterns) using algorithms (e.g., lower-level artificial intelligence algorithms) specified by and received from the server 180.
Analyzing the sensor data 740 at the patient computing system 500 provides a number of benefits. For instance, the sensor data classification module 720 can accurately time stamp the sensor data 740 without being affected by any time lags caused by network connectivity issues. Additionally, analyzing the sensor data 740 at the patient computing system 500 enables the sensor data classification module 720 to analyze the sensor data 740 at its highest available resolution (e.g., without compression) and eliminates the need to transmit that high resolution sensor data 740 via the communications networks 170. Meanwhile, by analyzing the sensor data 740 at the patient computing system 500 and transmitting state variables 810 to the server 180 (e.g., in encrypted form), the cyber-physical system 100 may address patient privacy concerns and ensure compliance with regulations regarding the protection of sensitive patient health information, such as the Health Insurance Portability and Accountability Act of 1996 (HIPAA).
While preferred embodiments have been described above, those skilled in the art who have reviewed the present disclosure will readily appreciate that other embodiments can be realized within the scope of the invention. Accordingly, the present invention should be construed as limited only by any appended claims.
www.github.com/opencv/opencv/blob/master/data/haarcascades/haarcascade_eye.xml
This application is a continuation application of International Application No. PCT/US2023/061783, filed Feb. 1, 2023, which claims priority to U.S. Prov. Pat. Appl. No. 63/305,420, filed Feb. 1, 2022, which is hereby incorporated by reference.
This invention was made with government support under Grant No. U54 NS115054 awarded by NIH. The U.S. government has certain rights in the invention.
Related U.S. Application Data:
Provisional application No. 63/305,420, filed Feb. 2022 (US).
Parent application PCT/US2023/061783, filed Feb. 2023 (WO); child application Ser. No. 18/791,292 (US).