The present disclosure relates to a computer-implemented method configured for automatic real-time detection of clinical deterioration events in a patient. The disclosure further relates to a system for carrying out the disclosed method.
Clinical practice relies on manual recording of physiological patient data every 12 hours to manually calculate an Early Warning Score (EWS) or similar “risk scores” for detection of patients in need. Selected physiological patient data (blood pressure, respiratory rate, heart/pulse rate, and body temperature) are traditionally described as vital signs. The term vital sign is, however, historically based on those biomarkers for physical status that it has been possible to measure (starting with pulse rate and temperature centuries ago). Since peripheral monitoring of blood pressure and oxygen saturation became reliable and commercially available, these are now commonly also included as vital signs. Some hospitals also include parameters such as mental status, pain, urine output, blood glucose and end-tidal CO2 as vital signs and in the EWS. A problem with the EWS approach and similar “track-and-trigger” systems is that a number of critical events occur between the fixed measurements without being detected. The EWS has a mandatory requirement for more frequent registration depending on the severity of the latest measured values, but the EWS has not demonstrated any effect on complications or survival despite intense resource allocation. For example, it has been shown that manual routine EWS measurements detect only 5% of the serious cases of severe oxygen desaturation when compared to continuous 24/7 automatic measurements. In layman's terms, the main cause of the low and late detection rate despite large resource allocation is two-fold: First, manual and infrequent collection of patient data is insufficient to detect clinical deterioration prior to the event. Second, manual clinical interpretation of threshold values for individual parameters is not sensitive or sophisticated enough to capture events in a timely manner.
Trends and combinations of physiological parameters of vital signs are too complex to be assessed manually, and such assessment is vulnerable to the experience or alertness of the individual assessor (e.g. nurse, physician). Nowadays it is furthermore possible to measure a variety of other biomarkers for physical status, including advanced analyses of heart rate, heart rhythm and peripheral perfusion to describe circulatory function. It is therefore very plausible that new biomarkers for physical status will be added to the traditional list of vital signs introduced above. In the sections below, the term ‘vital signs’ will refer to this open-ended list of biomarkers for the assessment of the physical status of a patient.
In general, clinical decision support systems (CDSS) are focused on using knowledge management in such a way as to provide clinical advice for patient care based on multiple items of patient data. A problem encountered within many clinical support systems is that alarm generation is based on simple threshold alerts, which consequently results in too many alarms, many of them false, whereby the medical staff (nurses, doctors, etc.) becomes exhausted. This phenomenon is known as alarm fatigue, a well-known and widely recognized challenge related to the monitoring of patients, where alarms will typically be muted, threshold values disregarded, or alarms simply ignored.
Therefore, there is a need for a system and method that provides an alternative to simple threshold monitoring for predicting adverse patient events. Specifically, there is a need for a clinical decision support system wherein the alarm generation is based on intelligent algorithms such that true alarms are maximized and false alarms are reduced, while still providing vital sign alarms that are clinically actionable.
Currently, the monitoring of post-operative patients in the hospital relies on intermittent bedside monitoring and simple models of Early Warning Scores (EWS). There is a need for a system that facilitates continuous and predictive monitoring and thereby improves the monitoring of patients.
The present disclosure addresses the above-mentioned challenges by providing a system and method for automatic and continuous detection of clinical deterioration events in a patient. An advantage of the presently disclosed system and method is that it provides continuous real-time monitoring of a patient (ideally 24/7), wherein an alarm is generated in case the patient has a deterioration event. The method comprises the step of executing one or more computer-implemented clinically validated automatic deterioration event subroutines, wherein each subroutine is configured to determine a specific clinical deterioration event in the patient.
In particular, the present disclosure relates to a computer-implemented method configured for automatic real-time detection of clinical deterioration events in a patient, the method comprising the steps of:
Optionally and/or alternatively, the one or more computer-implemented clinically validated deterioration event subroutines can be executed based on forecasted vital signs, as for example shown in example 5, where the vital sign parameters heart rate and respiration rate are forecasted based on modelling, e.g. machine learning, in particular Multivariate Auto-Regressive (MAR) models, of the validated vital sign parameters HR and RR. Any vital sign parameter can be forecasted based on this approach, i.e. forecasted for at least 5, 10, 15, 30, 45 or even 60 minutes. Each subroutine based on forecasted vital signs can then be configured to receive one or more of the forecasted vital sign parameters and determine a specific forecasted clinical deterioration event in the patient, said clinical deterioration event selected from the group of clinical deterioration events listed above. In that way, alarms can be generated based on forecasted data, i.e. before the deterioration event actually happens, and/or it can be predicted whether the deterioration event is likely to occur in the near future.
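As a minimal sketch of how such forecasting might be implemented, a first-order multivariate auto-regressive model can be fitted to per-minute HR and RR samples by ordinary least squares and rolled forward; the function names and the choice of model order are illustrative assumptions, not taken from example 5:

```python
import numpy as np

def fit_var1(history):
    """Fit a first-order multivariate auto-regressive model
    X[t+1] = A @ X[t] + b by ordinary least squares.
    history: (T, k) array of validated vital signs, e.g. columns [HR, RR]."""
    X, Y = history[:-1], history[1:]
    # Augment with a constant column to absorb the intercept b.
    Xa = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(Xa, Y, rcond=None)
    A, b = coef[:-1].T, coef[-1]
    return A, b

def forecast(history, steps, A, b):
    """Roll the fitted model forward `steps` samples (e.g. minutes)."""
    x = history[-1].copy()
    out = []
    for _ in range(steps):
        x = A @ x + b
        out.append(x.copy())
    return np.array(out)
```

The forecasted samples can then be fed to the same deterioration event subroutines as the validated samples, producing alarms ahead of the event.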
Preferably, all vital signs are received continuously. The data may be transmitted with different sampling frequencies and it may be transmitted blockwise (block sampling). As also described in Example 3 herein, blood pressure (both systolic and diastolic) can be estimated from other measured vital signs, for example HR, RR, SpO2 and PR, such that blood pressure can be measured both cufflessly and non-invasively, and possibly be validated because it can be based on validated vital signs data, thereby possibly replacing the validated systolic blood pressure measurement used in the subroutines as described herein.
Each clinically validated deterioration event subroutine is associated with one or more criteria, which determine whether an alarm should be generated. The criteria comprise thresholds and time duration(s), which have been clinically determined and evaluated by medical doctors, such that the number of alarms generated is reduced and more important alarms are generated. Hence, the presently disclosed system and method provides much more predictive value than existing systems, since it provides alarms for events that require clinical action from the medical staff, while simultaneously greatly reducing the number of false alarms. This has been achieved by engineering a plurality of deterioration event subroutines, also referred to as predictive computer algorithms.
The present disclosure further relates to a system for automatic detection of a clinical deterioration event in a patient, said system comprising:
The system described herein is configured for executing the presently disclosed method, thereby providing automatic detection of a clinical deterioration event in a patient. The disclosed system and its functionality are shown in
The disclosure further relates to a computer program having instructions thereon which, when executed by a computing device or system, causes the computing device or system to execute the method disclosed herein, thereby providing automatic real-time detection of clinical deterioration events in a patient.
Accordingly, the presently disclosed system and method provides continuous 24/7 monitoring of patients, wherein clinical deterioration events in the patient are automatically detected and reported through intelligent alarm generation. Specifically, this is achieved by executing a plurality of deterioration event subroutines, which receive input from one or more sensors associated with the patient, wherein said subroutines are configured to provide an alarm in case of a clinical deterioration event in the patient.
Accordingly, the presently disclosed system and method provides a significant improvement to existing clinical support systems, which typically rely heavily on simple thresholds for generating alarms.
The present disclosure further relates to a system comprising a non-transitory, computer-readable storage device for storing instructions that, when executed by a processor, perform the method disclosed herein. The system may comprise a mobile device comprising a processor and a memory and being adapted to perform the method, but it can also be a stationary system, a system operating from a centralized location, and/or a remote system, involving e.g. cloud computing. The invention further relates to a computer program having instructions which, when executed by a computing device or system, cause the computing device or system to perform the described method. 'Computer program' in this context shall be construed broadly and include e.g. programs to be run on a PC or software designed to run on smartphones, tablet computers or other mobile devices. Computer programs and mobile applications include software that is free and software that has to be bought, and also include software that is distributed over distribution software platforms such as Apple App Store, Google Play and Windows Phone Store.
The present disclosure relates to a computer-implemented method configured for automatic real-time detection of clinical deterioration events in a patient.
In a preferred embodiment, the first step of the method is receiving a plurality of different vital sign data from a plurality of sensors worn by the patient. The vital sign data may be selected from the group of: electrocardiogram (ECG), photoplethysmogram (PPG), heart rate (HR), respiration rate (RR), blood pressure (e.g. systolic blood pressure, SBP), heart rhythm, ischemic electrocardiographic response, peripheral temperature, peripheral skin conductance, 3D body position and acceleration, pulse rate (PR), peripheral perfusion index, peripheral oxygen saturation (SpO2) (e.g. derived from PPG), and subcutaneous glucose concentration. The vital sign data may be received continuously or at predefined time intervals, such as every minute. The vital signs may have fixed sampling frequencies and some may have block-wise sampling. As an example, ECG may be delivered as the first 10 seconds of every minute, i.e. the next package of 10 seconds of samples is then sent after 50 seconds, and so on.
In a preferred embodiment, the next step of the method is analyzing the vital sign data to identify artefacts in the data. Artefacts and noise are preferably handled for each vital sign whenever needed. Artefacts should be understood as erroneous data displaying unphysical values arising from external factors which influence the measurement (of the sensors) such that the measurement is disturbed or altered from its true value. An example is motion-related artefacts: if the patient makes sudden movements, this may influence some of the measured vital sign data. Another example is an incorrectly placed sensor, which may be unable to measure the intended vital sign. As an example, if the pulse oximeter measures a negative value or more than 100% SpO2, such data points will be considered artefacts and consequently removed. Therefore, a next step of the method is preferably to discard one or more data samples (i.e. sets of data points) associated with the identified artefacts in the vital sign data in order to obtain validated patient vital sign parameters. Artefacts and noise are estimated by an overall approach that looks for abnormal deviations as a function of time, amplitude and frequency content.
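The artefact rejection described above might, as a minimal sketch, combine a physiological range check with a maximum-jump check on consecutive samples; the function name and all thresholds below are illustrative assumptions, not fixed by the disclosure:

```python
def validate_vital(samples, lo, hi, max_jump):
    """Generic artefact filter: keep samples within the physiological
    range [lo, hi] and discard abrupt jumps from the previous valid
    sample (e.g. motion-related artefacts)."""
    valid, prev = [], None
    for s in samples:
        if not (lo <= s <= hi):
            continue                      # unphysical value -> artefact
        if prev is not None and abs(s - prev) > max_jump:
            continue                      # abnormal amplitude deviation
        valid.append(s)
        prev = s
    return valid
```

For example, an SpO2 stream filtered with `validate_vital(spo2, 0, 100, 10)` drops negative readings, readings above 100%, and implausibly large per-sample jumps.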
In a preferred embodiment, the method further comprises the step of executing an ECG preprocessing subroutine configured to assess the quality of the ECG data from an ECG sensor worn by the patient. Some vital sign data, such as RR interval (RRI), PP interval (PPI), HR, heart rhythm and RR are estimated/calculated based on R peak detection in the ECG data, i.e. validated heart rate and validated heart rhythm are typically based on ECG data. The RR interval (RRI) and PP interval (PPI) represent cardiac beat-to-beat interval extracted from ECG and PPG signals, respectively. R peak is understood to have its common meaning, i.e. the maximum amplitude in the R wave in the QRS complex of an electrocardiogram. The system associated with the disclosed method is preferably configured to provide such vital sign data automatically, i.e. automatically perform the R peak detection in the ECG data. However, if the ECG data is noisy or distorted, the detection of the R peak(s) can be faulty, leading to erroneous estimations of RRI, HR, and RR. Therefore, the system is preferably further configured to stream parts of ECG data. The purpose of the ECG preprocessing subroutine is to evaluate whether the streamed parts of ECG data (also referred to as ECG samples) have an acceptable quality. The output of the ECG preprocessing subroutine is a plurality of parameters (goodForHR, goodForAF, goodForRR, goodForMorph), which can obtain a value of either 1 or 0. A value of 1 indicates that the ECG sample is good enough to be used for deriving vital sign data (HR, RR, and/or RRI), which can be used as input to the clinically validated deterioration event subroutines. In that case, the vital sign data is referred to as validated vital sign data. Conversely, a value of 0 indicates that the concerned ECG sample and/or derived vital sign data should be discarded and not applied in the deterioration event subroutines. 
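A minimal sketch of deriving RRI and HR from R-peak detection in an ECG sample is shown below; the peak criteria are illustrative assumptions, and a production detector (e.g. Pan-Tompkins) would add band-pass filtering and adaptive thresholds:

```python
import numpy as np
from scipy.signal import find_peaks

def rri_and_hr(ecg, fs):
    """Detect R peaks in a (pre-validated) ECG segment sampled at fs Hz
    and derive beat-to-beat RR intervals (RRI, ms) and mean HR (bpm)."""
    # Require peaks above a data-driven amplitude and at least 250 ms
    # apart (a physiological upper limit of roughly 240 bpm).
    peaks, _ = find_peaks(ecg, height=0.5 * np.max(ecg),
                          distance=int(0.25 * fs))
    rri_ms = np.diff(peaks) / fs * 1000.0
    hr_bpm = 60000.0 / np.mean(rri_ms) if len(rri_ms) else float("nan")
    return rri_ms, hr_bpm
```

If the segment's quality flag (goodForHR) is 0, the derived values would be discarded rather than passed to the deterioration event subroutines.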
The parameters goodForHR and goodForRR indicate that the concerned ECG sample is of good enough quality (in case of a value of 1) to be used to estimate the heart rate (HR) and heart rhythm, and the respiratory rate (RR), of the patient based on the ECG sample, respectively. Similarly, the parameter goodForAF means that the concerned ECG sample is of good enough quality to be used in the atrial fibrillation (AF) subroutine. The parameter goodForMorph indicates that the ECG sample is good enough (in terms of noise, e.g. quantified by signal-to-noise ratio) for calculating other values from the ECG morphology. In one embodiment, the ECG preprocessing subroutine comprises the steps of:
In a preferred embodiment, the method further comprises the step of executing a SpO2 preprocessing subroutine configured to assess the quality of the SpO2 data from a pulse oximeter worn by the patient in order to obtain validated SpO2 data. The system associated with the presently disclosed method is preferably configured to receive SpO2 data from the pulse oximeter with a given sampling frequency, e.g. 1 Hz. A given time segment (e.g. of length 1 minute) of data, comprising a number of SpO2 values, may be represented by an average SpO2 sample (i.e. one value representing the oxygen level for a one-minute interval) and transferred, e.g. every minute, to the one or more servers storing the computer program for executing the disclosed method. In one embodiment, the SpO2 preprocessing subroutine comprises the steps of:
In a preferred embodiment, the next step of the method is executing one or more computer-implemented clinically validated deterioration event subroutines. Preferably, each subroutine is configured to receive one or more of the validated vital sign parameters and determine a specific clinical deterioration event in the patient. The clinical deterioration event may be selected from the group of: bradypnea/apnea, tachypnea, hypoventilation, desaturation, sinus tachycardia, bradycardia, hypotension, circulatory collapse, hypertension, atrial fibrillation, ventricular extrasystoles, ventricular tachycardia/-fibrillation (VT/VF), asystole, cardiac ischemia, low perfusion index, and acute stress. Each deterioration event subroutine is described in further detail in the following.
The disclosed method may comprise a bradypnea subroutine configured to determine bradypnea/apnea. According to one embodiment, the bradypnea subroutine comprises the steps of:
The one or more bradypnea thresholds may be selected from the group of: HR>10 bpm, HR>15 bpm, HR>20 bpm, HR>25 bpm, RR≤3 bpm, RR≤5 bpm, RR≤10 bpm, RR≤15 bpm, and/or combinations thereof. In a preferred embodiment, the bradypnea subroutine comprises the bradypnea thresholds HR>20 and RR≤5. The predefined time duration may be selected from the group of: ≥1 min, ≥2 min, ≥3 min, ≥5 min, or ≥10 min. In a preferred embodiment, the bradypnea subroutine provides an alarm in case HR>20 and RR≤5 for more than 1 minute.
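The threshold-plus-duration pattern shared by these subroutines can be sketched as follows, here instantiated with the preferred bradypnea criterion (HR>20 bpm and RR≤5 bpm sustained for at least 1 minute); the helper names and the per-minute sampling are illustrative assumptions:

```python
def duration_alarm(samples, condition, min_minutes):
    """Return True when `condition` holds continuously for at least
    `min_minutes`. `samples` is a sequence of (timestamp_minutes,
    vitals_dict) pairs of validated vital sign parameters."""
    run_start = None
    for t, vitals in samples:
        if condition(vitals):
            if run_start is None:
                run_start = t
            if t - run_start >= min_minutes:
                return True
        else:
            run_start = None              # the run of exceedances is broken
    return False

# Preferred bradypnea criterion from the text.
bradypnea = lambda v: v["HR"] > 20 and v["RR"] <= 5
```

The other subroutines differ mainly in the `condition` lambda and the duration, e.g. `lambda v: v["RR"] >= 24` with 5 minutes for the preferred tachypnea criterion.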
The disclosed method may comprise a tachypnea subroutine configured to determine tachypnea. According to one embodiment, the tachypnea subroutine comprises the steps of:
The predefined tachypnea threshold may be selected from the group of: RR≥20 bpm, RR≥24 bpm, RR≥28 bpm, and/or combinations thereof. In a preferred embodiment, the tachypnea subroutine comprises the tachypnea threshold: RR≥24 bpm. The predefined time duration may be selected from the group of: ≥1 min, ≥2 min, ≥3 min, ≥5 min, ≥10 min. A time duration of ≥5 min is preferred. In a preferred embodiment, the tachypnea subroutine provides an alarm in case RR≥24 bpm for more than 5 minutes.
The disclosed method may comprise a hypoventilation subroutine configured to determine hypoventilation. According to one embodiment, the hypoventilation subroutine comprises the steps of:
The hypoventilation thresholds may be selected from the group of: RR<15 bpm, RR<13 bpm, RR<11 bpm, RR<9 bpm, SpO2<92%, SpO2<90%, SpO2<88%, SpO2<86%, and/or combinations thereof. In a preferred embodiment, the hypoventilation thresholds comprise RR<11 bpm and SpO2<88%. The predefined time duration may be selected from the group of: ≥1 min, ≥2 min, ≥3 min, ≥5 min, or ≥10 min. A time duration of ≥5 min is preferred. In a preferred embodiment, the hypoventilation subroutine provides an alarm in case RR<11 bpm and SpO2<88% for more than 5 minutes.
The disclosed method may comprise a desaturation subroutine configured to determine desaturation. According to one embodiment, the desaturation subroutine comprises the steps of:
The predefined SpO2 thresholds may comprise any of: SpO2<92%, SpO2<88%, SpO2<85%, SpO2<80%, and/or combinations thereof. The predefined time duration may be selected from the group of: ≥1 min, ≥5 min, ≥10 min, ≥30 min, or ≥60 min. In a preferred embodiment, the desaturation subroutine provides an alarm in case:
The disclosed method may comprise a sinus tachycardia subroutine configured to determine sinus tachycardia, said subroutine comprising the steps of:
The one or more predefined sinus tachycardia thresholds may be selected from the group of: HR≥100 bpm, HR≥111 bpm, HR≥120 bpm, or HR>130 bpm. The predefined time duration may be selected from the group of: ≥5 min, ≥10 min, ≥30 min, >60 min, or ≥80 min. In a preferred embodiment, the sinus tachycardia subroutine provides an alarm in case HR>130 bpm for t≥30 min, or in case HR≥111 bpm for t≥60 min.
The disclosed method may comprise a bradycardia subroutine configured to determine bradycardia, said subroutine comprising the steps of:
The bradycardia thresholds/ranges may be selected from the group of: HR<40 bpm, HR<30 bpm, HR<25 bpm, 25 bpm≤HR≤45 bpm, or 30 bpm≤HR≤40 bpm. The predefined time duration may be selected from the group of: ≥1 min, ≥2 min, ≥3 min, ≥5 min, or ≥10 min. In a preferred embodiment, the bradycardia subroutine provides an alarm in case HR<30 bpm for t≥1 min, or in case 30 bpm≤HR≤40 bpm for t≥5 min. Preferably, the alarm is only provided if the parameter goodForHR is equal to 1.
The disclosed method may comprise a hypotension subroutine configured to determine hypotension, said subroutine comprising the steps of:
The hypotension thresholds may be selected from the group of: SBP<91 mmHg, SBP<80 mmHg, SBP<70 mmHg, SBP<60 mmHg, and/or combinations thereof. In a preferred embodiment, the hypotension subroutine provides an alarm in case SBP<91 mmHg for two consecutive measurements, or in case SBP<70 mmHg.
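The preferred hypotension criterion above (SBP<91 mmHg for two consecutive measurements, or SBP<70 mmHg immediately) can be sketched as a simple scan over the intermittent SBP readings; the function name is an illustrative assumption:

```python
def hypotension_alarm(sbp_readings):
    """Alarm on two consecutive SBP readings below 91 mmHg, or
    immediately on a single reading below 70 mmHg (preferred
    thresholds from the text; readings in mmHg, oldest first)."""
    prev_low = False
    for sbp in sbp_readings:
        if sbp < 70:
            return True                   # severe hypotension: alarm at once
        low = sbp < 91
        if low and prev_low:
            return True                   # two consecutive low readings
        prev_low = low
    return False
```

Since SBP is measured intermittently (e.g. every 15 or 30 minutes), the consecutive-measurement rule plays the role that the time duration plays in the continuously sampled subroutines.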
The disclosed method may comprise a circulatory collapse subroutine configured to determine circulatory collapse, said subroutine comprising the steps of:
The predefined SBP threshold(s) may be selected from the group of: SBP<110 mmHg, SBP<100 mmHg, and/or SBP<90 mmHg. The predefined HR thresholds may be selected from the group of: HR>110 bpm, HR>120 bpm, HR>130 bpm, HR<60 bpm, HR<50 bpm, HR<40 bpm, and/or combinations thereof. The predefined time duration may be selected from the group of: ≥1 min, ≥5 min, ≥10 min, ≥30 min, or ≥60 min. In a preferred embodiment, the circulatory collapse subroutine provides an alarm in case:
The disclosed method may comprise an asystole subroutine configured to determine asystole, said subroutine comprising the steps of:
The predefined time durations t1 and t2 may be more than 10 seconds, or more than 15 seconds, or more than 20 seconds, or more than 25 seconds, or more than 30 seconds.
The disclosed method may comprise a hypertension subroutine configured to determine hypertension, said subroutine comprising the steps of:
The hypertension threshold(s) may be selected from the group of: SBP≥180 mmHg, SBP≥190 mmHg, SBP≥200 mmHg, SBP≥210 mmHg, SBP≥220 mmHg, and/or combinations thereof. The predefined time duration may be selected from the group of: ≥1 min, ≥5 min, ≥10 min, ≥30 min, or >60 min. In a preferred embodiment, the hypertension subroutine provides an alarm in case SBP≥180 mmHg for t≥60 min, or in case SBP≥220 mmHg for at least one measurement.
The disclosed method may comprise an atrial fibrillation subroutine configured to determine atrial fibrillation, said subroutine comprising the steps of:
The predefined RRI threshold(s) may be selected from the group of: RRI<300, RRI<200, RRI<150, RRI>2500, RRI>3000, or RRI>3500. The second RRI threshold may be that the size of the RRI array storing the RRI values is greater than 15, or greater than 20, or greater than 25, or greater than 30. The atrial fibrillation subroutine preferably comprises the step of computing the normalized difference of the validated RRI values and storing the computed normalized difference values in a stored set of NDR values. The atrial fibrillation subroutine may further comprise the step of removing the NDR values that fall outside predefined percentiles of the values. Said predefined percentiles may comprise the 10th percentile and the 90th percentile, such that NDR values that are below the 10th percentile and/or above the 90th percentile are removed from the stored set of NDR values. The atrial fibrillation subroutine may further comprise the step of providing the stored set of NDR values to a Support Vector Machine (SVM) model configured to determine the presence of atrial fibrillation. In one embodiment, an SVM model was separately trained for binary classification (of atrial fibrillation) using a radial basis function (RBF) kernel. The misclassification costs were set to be proportional to the number of the training samples for each class. The feature NDR was extracted from a plurality of RRI samples, each sample comprising RRI values from a timespan of one minute. These RRI samples were fed into the SVM model for AF detection.
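A sketch of the NDR feature extraction and RBF-kernel SVM described above, using scikit-learn; summarizing each one-minute RRI sample as fixed-length statistics is an illustrative assumption (the disclosure does not fix the summary), and `class_weight="balanced"` approximates the proportional misclassification costs mentioned above:

```python
import numpy as np
from sklearn.svm import SVC

def ndr_features(rri_ms):
    """Normalized successive RRI differences (NDR), trimmed to the
    10th-90th percentile band, summarized as [mean, std]."""
    ndr = np.abs(np.diff(rri_ms)) / rri_ms[:-1]
    lo, hi = np.percentile(ndr, [10, 90])
    ndr = ndr[(ndr >= lo) & (ndr <= hi)]
    return [np.mean(ndr), np.std(ndr)]

# RBF-kernel SVM for binary AF / non-AF classification.
clf = SVC(kernel="rbf", class_weight="balanced")
```

In use, `clf.fit` would be called on NDR features from labelled one-minute RRI samples, and `clf.predict` on features from incoming samples; highly irregular beat-to-beat intervals yield large NDR values and are classified as AF.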
The disclosed method may comprise yet another atrial fibrillation subroutine configured to determine atrial fibrillation, said subroutine comprising the steps of:
The semi-supervised learning model may have been trained on less than 50% labelled data, preferably less than 40% labelled data, more preferably less than 30% labelled data, even more preferably less than 20% labelled data, most preferably less than 10% labelled data.
A reference subroutine for ventricular fibrillation (VF) detection can be found in Ibtehaz et al., “VFPred: A fusion of signal processing and machine learning techniques in detecting ventricular fibrillation from ECG signals”, Biomedical Signal Processing and Control 49 (2019), pp. 349-359. The VFPred algorithm can detect VF, which contains the classes VF and non-VF, and can be expanded with the classes VT/VF and non-VT/VF such that both VF and ventricular tachycardia (VT) can be detected.
Conventional bedside monitoring systems have demonstrated the difficulty of long-term monitoring of post-operative patients, because the majority of them are ambulatory. With the presently disclosed approach, employing wearable sensors and advanced data analytics, those patients will benefit greatly from continuous and predictive monitoring. A Serious Adverse Event (SAE) is any untoward medical occurrence or effect at any dose, any undesirable or unintentional effect that:
Examples of SAEs are pneumonia, wound infection, anastomosis leakage, pneumothorax, bleeding, myocardial infarction, pulmonary embolism, delirium, syncope, stroke, transient ischaemic attack, respiratory failure, atelectasis, pleural effusion, heart failure, deep vein thrombosis, non-fatal cardiac arrest, troponin elevation, atrial fibrillation, atrial flutter, ventricular tachycardia, other supraventricular tachyarrhythmias, second-degree atrio-ventricular block, third-degree atrio-ventricular block, urinary tract infection, sepsis, septic shock, surgical site infection, major bleeding, drain, acute renal failure, hypoglycemia, diabetic ketoacidosis, intestinal obstruction, fracture, opioid intoxication, re-operation, and death.
SAEs like atrial fibrillation, atrial flutter, ventricular tachycardia, other supraventricular tachyarrhythmias, second-degree atrio-ventricular block, and third-degree atrio-ventricular block are examples of deterioration events that can both be termed clinical deterioration events, and thereby be detected according to the presently disclosed approach, and be termed SAEs, because they also fall within the definition of an SAE as stated above.
Example 2 discloses detection of serious adverse events (SAE) based on machine learning, where a support vector machine model has been trained on validated data. The feature input to the model was extracted from time series of the four vital sign parameters HR, RR, SpO2 and sysBP, from which clinical deterioration events were extracted as trends in the data time series. However, the model could equally well have been trained on features selected from one or more of the specific clinical deterioration events disclosed herein. That is, once the model is trained as described in example 2, the input to the prediction of SAE will be vital sign data and detection of one or more clinical deterioration events as disclosed herein.
The approach disclosed in example 2 also applies to clinical deterioration events detected in accordance with the presently disclosed approach, i.e. clinical deterioration events and/or SAEs can be detected, and thereby also possibly predicted and preferably prevented, with detection of clinical deterioration events as disclosed herein, i.e. by application of machine learning and continuous vital sign monitoring of (post-operative) patients.
As disclosed in example 4, nighttime monitoring of patients can improve prediction of SAEs: in particular, patients having an increased heart rate and breathing rate, as well as a slightly lower oxygen saturation, during sleep in the nighttime (e.g. from midnight to 6 AM), compared to their normal vital sign parameters, have an increased risk of developing an SAE during the following day. This can be improved by combining the monitoring with a sleep stage detector, for example based on EEG measurements, such that it is known when the patient sleeps and only sleep vital sign data is used in the nighttime analysis. The observation of an abnormal nighttime period of a patient may trigger an alarm, or a pre-alarm, such that the patient is surveyed more closely the following day and/or one or more of the subroutine thresholds are adjusted such that an alarm is generated earlier.
The presently disclosed method is configured for providing an alarm when at least one deterioration event has been detected by one of the described deterioration event subroutines. Each subroutine receives one or more validated vital sign parameters (such as RR, HR, SpO2, and SBP) and provides an alarm in case the monitored parameter(s) exceed one or more predefined thresholds for a predefined time duration as explained in further detail in relation to each subroutine. The preferred values of the different thresholds and durations for alarm generation associated with the different subroutines are summarized in the table below.
The present disclosure further relates to a system for automatic detection of a clinical deterioration event in a patient, said system comprising:
The sensors to be worn by the patient are preferably selected from the group of: electrocardiography (ECG) sensors, pulse oximeters, oscillometric blood pressure monitors, peripheral skin conductance sensors, 3D accelerometers, peripheral thermometers, and continuous glucose monitors. The sensors are preferably wireless wearable sensors configured for wireless communication with one or more gateways or servers. Data from the sensors may be streamed at a predefined streaming interval in order to save battery consumption and data storage. The streaming interval may be different from sensor to sensor. As an example, the streaming interval for the ECG sensor may be every two minutes, every minute, or every 30 seconds. Furthermore, different time intervals of data from each sensor may be selected for streaming. For example, ECG data may be collected continuously, whereas only 10 seconds of the ECG data may be selected to be streamed each minute. Preferably, the heart rate and temperature of the patient are received continuously. The respiratory rate is preferably received as a 10 second average, which may be streamed continuously or at a predefined interval such as every 10 seconds. The peripheral oxygen saturation and perfusion index are preferably measured (and streamed) every second, and the blood pressure is preferably measured (and streamed) every 15 or every 30 minutes.
A patient gateway should be understood herein as an electronic device configured for communication with one or more sensors and/or servers. An example of a patient gateway is a tablet computer. The gateway is preferably located near the patient, e.g. at the bedside of the patient, such that the wireless signals from the sensors can reach the gateway. The system preferably comprises a patient gateway for each patient. The wireless communication between the sensors and the patient gateway(s) may be any suitable wireless standard such as Bluetooth, Bluetooth Low Energy (BLE), Ultra Wideband (UWB), Wi-Fi, IEEE 802.11ah (Wi-Fi HaLow), GSM, 4G, 5G, or other similar technologies.
A server should be understood as a computer or computer program that provides services (e.g. computation) for other programs or devices. The servers of the presently disclosed system are preferably cloud servers, i.e. located remotely from the rest of the system and accessible through the internet. The presently disclosed subroutines preferably form part of a computer program stored on one or more servers, such as cloud servers. In a preferred embodiment, the computer program comprising the one or more subroutines is stored on the first server. The first server is preferably configured for communication with the patient gateway. Preferably, the communication is encrypted and may be wired or wireless. The wireless communication between the patient gateway and the first server may be any suitable wireless standard as mentioned in relation to the sensors and the patient gateway(s).
The system may further comprise a second server. Preferably, the second server is configured to provide an alarm (e.g. in the form of a push notification) to a remote device (such as a computer, a smartphone or a tablet computer) in case the system has detected a clinical deterioration event or medical complication.
Atrial fibrillation (AF) is the most common cardiac arrhythmia and is associated with a sixfold higher risk of stroke and a twofold higher risk of death. According to the National Health Service (NHS), AF is the most common heart rhythm disturbance, affecting more than 1 million people in the United Kingdom alone. Atrial fibrillation is classified as a tachyarrhythmia, where the electrical impulse is not initiated in the sinus node, but instead in fibrillatory waves in the atria. Atrial fibrillation may also be characterized as an irregular rhythm with loss of the P-waves in the ECG signal. Preliminary studies have shown that atrial fibrillation is common in post-operative cancer patients. With the presently disclosed approach, ECG is available from continuous bedside monitoring, thereby providing the possibility of autonomous analysis of the ECG and thereby the possibility to detect atrial fibrillation, as demonstrated in this example.
Normally, deep neural networks are trained fully supervised and thus require a large amount of labelled data. Vast amounts of medical data exist, but only a small amount of it has been labelled. This can be utilized in semi-supervised learning, where an unsupervised model is jointly trained on large amounts of unlabelled data together with a supervised model that is trained on a smaller amount of labelled data. The neural network used in this example is therefore trained in a semi-supervised way where both labelled and unlabelled data are used. This allows the neural network to learn features from a larger dataset, where the segments are not necessarily labelled. The model is built as a convolutional neural network, utilizing the ResNet architecture.
The input to the model is a 10 second segment from a single lead ECG. The classification model used after completed training includes the encoder (cf.
The data used in this project came from the publicly available MIT-BIH Atrial Fibrillation database (AFDB). The AFDB includes 25 records from different subjects, each of 10 hours length (two of which contain only the locations of the QRS complexes and no waveform). The remaining 23 records contain the ECG signal obtained from two leads. Each signal was digitized using a sampling frequency of 250 Hz and a 12-bit resolution in the ±10 mV range. Unaudited annotations of the QRS complexes are available along with manual annotations of the rhythm into the following subcategories: Atrial Fibrillation, Atrial Flutter, AV-Junctional Rhythm and Sinus Rhythm (SR).
Each ECG record was split into 10 second non-overlapping segments to avoid parts of the same segment being present in both the labelled and unlabelled dataset. The label was given based on the annotation files available with the data and was divided into AF vs. Non-AF. In conditions where multiple labels were present in the same segment, the label present for the majority of the segment was used for the entire segment. For both the training and test set, the data was stratified by down-sampling of the majority class. The dataset was split into a training set containing 90% of the segments and a test set containing the remaining 10%. To remove the DC-offset and any baseline wandering before normalization, a high-pass filter with a cut-off frequency of 0.5 Hz and a filter order of 5 was applied. All segments were downsampled to 100 Hz.
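The segmentation, filtering and downsampling steps above can be sketched as follows; this is a minimal illustration assuming NumPy and SciPy, and the function and constant names are hypothetical:

```python
import numpy as np
from scipy.signal import butter, filtfilt, resample

FS_IN, FS_OUT, SEG_S = 250, 100, 10  # AFDB sampling rate, target rate, segment length

def preprocess(record: np.ndarray) -> np.ndarray:
    """Split a single-lead ECG record into 10 s non-overlapping segments,
    apply a 5th-order high-pass Butterworth filter (0.5 Hz cut-off) to remove
    DC offset and baseline wander, then downsample each segment to 100 Hz."""
    seg_len = FS_IN * SEG_S
    n_segs = len(record) // seg_len
    segments = record[: n_segs * seg_len].reshape(n_segs, seg_len)
    b, a = butter(5, 0.5, btype="highpass", fs=FS_IN)
    filtered = filtfilt(b, a, segments, axis=1)
    # Resample each 2,500-sample segment to 1,000 samples (100 Hz)
    return resample(filtered, FS_OUT * SEG_S, axis=1)
```

Normalization, as noted above, would be applied after this filtering step.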
The variational autoencoder is an unsupervised generative model that consists of two neural networks: an inference model, the encoder, and a generative model, the decoder. The encoder maps the input sample into a lower dimensional latent variable, which the decoder maps into a reconstruction of the input sample. The variational autoencoder builds upon probability theory and Bayes' rule. In the variational autoencoder the inference model is defined as qϕ(z|x) and the generative model as pθ(x|z). By including the label variable, y, into the model, a semi-supervised generative probabilistic model can be achieved. In this model the inference model, Q, is defined as qϕ(z|x, y) qϕ(y|x), with each term defined as:
and the generative model, P, is defined as p(z) pθ(x|z, y), with each term defined as:
where qϕ and pθ are neural networks with parameters ϕ and θ, respectively. The inference and generative models are shown in
The Gaussian distribution qϕ(z|x, y) is achieved by splitting the last layer of the model into two channels representing the mean, μϕ, and the log variance, log σϕ², of the distribution, from which z is sampled using the reparameterization trick. The reconstruction loss pθ(x|z, y) is defined as a Gaussian distribution with μθ being the reconstruction and σθ² = 2.
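The reparameterization trick mentioned above can be sketched in a few lines. This is an illustrative NumPy version; in practice the sampling runs inside the autodiff framework training the network so that gradients flow through μϕ and log σϕ²:

```python
import numpy as np

def reparameterize(mu: np.ndarray, log_var: np.ndarray, rng=None) -> np.ndarray:
    """Sample z = mu + sigma * eps with eps ~ N(0, I).
    Because the randomness is isolated in eps, the sample remains a
    differentiable function of the encoder outputs mu and log_var."""
    rng = np.random.default_rng() if rng is None else rng
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps
```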
The objective of optimizing the parameters, θ and ϕ, is to maximize the log-likelihood log p(x). This is achieved by using Jensen's inequality to obtain the evidence lower bound, which can be optimized. For the unlabeled case the lower bound is given as
and the labeled case the lower bound is defined as
In the lower bounds, the contribution of z and y in the unlabeled case, and of z in the labeled case, is marginalized out. For the unlabeled case y is treated as a latent variable and is sampled by summing over the two classes, and for z the integral is approximated by sampling from the Gaussian distribution in the latent space. In the case of labeled data, optimization for the labels y is done using binary cross-entropy.
Besides the lower bounds defined in equations (5) and (6), an extra loss was introduced where the standard deviations of the input signal and the reconstructions were subtracted and the absolute value was taken of the difference. This was introduced to help the decoder to make better reconstructions. For the classifier, binary cross-entropy loss was used.
To further help the training of the DGM, two warmups were introduced, each defined as a delay followed by a linear ramp up to a maximum value: one for the KL divergence, with a 25 epoch delay and a max weight of 0.1 at 100 epochs, and a second for the classification loss, with no delay and a max weight of 0.5 at 40 epochs. These were introduced so as not to constrain the generative part of the network too much in the beginning, before pushing towards classification and a standard normal distribution for z.
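The two warm-up schedules can be sketched as a simple delayed linear ramp; the implementation below is an assumed shape consistent with the delays and maximum weights stated above:

```python
def warmup_weight(epoch: int, delay: int, ramp_end: int, max_weight: float) -> float:
    """Linear warm-up: zero until `delay`, then a linear ramp reaching
    `max_weight` at `ramp_end`, and constant afterwards."""
    if epoch <= delay:
        return 0.0
    if epoch >= ramp_end:
        return max_weight
    return max_weight * (epoch - delay) / (ramp_end - delay)

# Schedules from this example: KL weight delayed 25 epochs, max 0.1 at epoch 100;
# classification weight with no delay, max 0.5 at epoch 40.
kl_weight = lambda e: warmup_weight(e, delay=25, ramp_end=100, max_weight=0.1)
cls_weight = lambda e: warmup_weight(e, delay=0, ramp_end=40, max_weight=0.5)
```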
The deep generative model (DGM) can be divided into three parts: the encoder, the classifier and the decoder. The encoder was built with a residual network (ResNet) architecture consisting of four blocks, each containing three convolutional layers and a residual connection. ResNet has shown superiority in other image classification tasks when compared to classic convolutional networks. In order to increase the receptive field of the network, dilation of 2, 4 and 8 was applied to the three layers within each block, respectively. Max-pooling was done at the end of each block using a kernel size of 3 and a stride of 3, thus decreasing the signal size by a factor of 3 per block. The kernel size and stride were 3 and 1, respectively, for all convolutional layers, and the number of output channels was fixed per block to 32, 32, 64, and 64 for the four blocks, respectively. Two fully connected layers of size 1,000 and 500 were applied after the blocks. The decoder and the classifier were constructed as simple fully connected neural networks. The decoder consisted of an input layer, four hidden layers each with 4,096 nodes, and an output layer. The classifier consisted of three layers with 500, 200 and 200 nodes, respectively, and a binary softmax function as output. All layers except for the output layers used the Rectified Linear Unit as activation function and had batch normalization and dropout (p=0.3). A diagram of the model is shown in
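Assuming the stride-1 convolutions preserve signal length (e.g. via 'same' padding, an assumption not stated explicitly above), the length of the encoder output can be computed from the four stride-3 max-pooling steps alone:

```python
def encoder_output_length(n_samples: int, n_blocks: int = 4,
                          pool_kernel: int = 3, pool_stride: int = 3) -> int:
    """Signal length after the ResNet blocks: the stride-1 convolutions are
    assumed to preserve length, so each block's max-pooling (kernel 3,
    stride 3) divides the length by roughly 3."""
    length = n_samples
    for _ in range(n_blocks):
        length = (length - pool_kernel) // pool_stride + 1
    return length

# A 10 s segment at 100 Hz (1,000 samples) shrinks to 12 samples per channel.
print(encoder_output_length(1000))
```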
In order to demonstrate the potential of using the semi-supervised approach, the proposed DGM was tested against a conventional convolutional neural network (CNN), identical to the encoder+classifier of the DGM. The setup was constructed using different proportions of unlabeled and labeled data, where the labeled data was used to train both the supervised part of the DGM and the CNN, and the unlabeled part of the data was used only to train the unsupervised part of the DGM. In this way a “titration curve” style setup was obtained, mimicking cases where different amounts of labeled data could be obtained. The models were trained in setups using 1%, 5%, 10% and 50% of the data as labeled and the remaining as unlabeled. It was ensured that for each setup, the data in the training and test set was the same for both the DGM and the CNN. Furthermore, the random seed was fixed such that as much as possible was kept alike between the runs. A total of 111,894 segments were available in the training set after balancing the classes. The test set consisted of 12,434 segments that were also balanced. Each training phase of the DGM consisted of 50 epochs, where labeled data was cycled to correspond with the amount of unlabeled data. As the amount of data per epoch is smaller when training the CNN, which would lead to fewer updates of the weights if it was only permitted to train for 50 epochs, the CNNs were allowed to train for more epochs, until convergence.
The results of the training of the DGM and the CNN using different amounts of labeled data is shown in the table below.
The best result is obtained by the DGM in the semi-supervised approach using 50% of the data labeled. The input segments and corresponding reconstructions of chosen samples are shown in
In general, monitoring post-operative patients is important for preventing serious adverse events (SAE), which increase morbidity and mortality, but currently monitoring of post-operative patients relies on intermittent bedside monitoring. The presently disclosed approach facilitates continuous and predictive monitoring and therefore improves the management of patients. This example demonstrates machine-learning-based prediction of SAE in post-operative patients based on vital signs acquired by wearable sensors, showing that SAEs can be predicted with an AUROC as high as 93% by monitoring only four common vital signs. Using descriptive statistics extracted from trends as features and an SVM-based machine learning technique, as in this example, reduces algorithm complexity and thereby consumes less battery power, which is very important for wearable systems.
This example demonstrates classification of “SAE” versus “no SAE” within 2 hours (prediction window) based on the last 10 hours of recordings (observation window). First, the trends of the time series of vital signs were extracted with a moving average in order to remove noise. Then descriptive statistics were calculated from the trend of each modality and concatenated into a feature vector. Finally, a machine learning model based on a support vector machine was employed for prediction of SAE.
During the study the vital signs of heart rate, respiration rate, and blood oxygen saturation, were continuously acquired by wearable devices and blood pressure was measured intermittently from 453 post-operative patients. Data acquisition was managed by the Isansys patient status engine.
The study took place at Rigshospitalet and Bispebjerg Hospital in Copenhagen, Denmark from February 2018 to August 2020. 453 post-operative patients (278 males, 175 females) were included in the study. The average age was 71 years (range: 60-93) and the average amount of monitoring hours was 79 hours (range: 0.73-168.8). Patients in the study had a wide range of clinical SAEs ranging from neurologic, respiratory, circulatory, infectious and other complications. Information about SAEs were registered by medical doctors.
The vital signs HR, RR and SpO2 were acquired continuously by the wearable sensors and BP was measured intermittently. The acquisition of vital signs was managed by the Isansys patient status engine (PSE) (Isansys Lifecare Ltd). The Isansys Lifetouch was attached to the patient's chest for acquiring single lead ECG with a sampling frequency of 1000 Hz, from which HR in beats per minute and RR in breaths per minute were derived. A pulse oximeter (Nonin Model 3150 WristOx2) was attached to the finger for the acquisition of the photoplethysmogram (PPG) with a sampling frequency of 75 Hz, from which SpO2 as a percentage was derived. The wearable sensors' data and derived values were first transmitted via Bluetooth to a gateway of the PSE, which was located near the bed of the patient, and then to a hospital server for storing data in a patient database via Wi-Fi every minute. Systolic blood pressure (sysBP) in mmHg was measured intermittently using the Meditech BlueBP-05. These sysBP measurements were entered into the gateway by medical staff and then automatically transmitted to the patient database. HR, RR, SpO2 and sysBP were synchronized through their timestamps.
Prediction of SAE can be seen as a classification problem aiming to classify “SAE” versus “no SAE” over a time period (prediction window), e.g. a few hours, based on the most recent recordings (observation window). The prediction window was chosen to be two hours and the observation window was chosen to be ten hours, as shown in
1) Extraction of SAE class and control class: The SAE class was identified based on the SAEs' timestamps. To account for class imbalance, the SAE class was oversampled. SAE class samples were extracted as eight-hour time series of vital signs with overlap, from two hours before to twelve hours before the SAE timestamp. Four samples were extracted for each SAE as illustrated in
2) Feature extraction: Selection of discriminative features is normally important for the prediction of SAE. An SAE is often preceded by one or more clinical deterioration events, which can be extracted from vital signs, as demonstrated in the presently disclosed approach. In this example the trends of the time series of HR, RR, SpO2 and sysBP were extracted by using a moving average with a sliding window of 60 minutes. In this example the trends were assumed to represent the deterioration. Then four descriptive statistics (maximum, minimum, mean, and standard deviation) were calculated from the trend of each modality as features. The features from each modality were concatenated into one feature vector. The size of each feature vector was sixteen.
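The trend extraction and 16-element feature vector described above can be sketched as follows; this is a minimal NumPy illustration and the function names are hypothetical:

```python
import numpy as np

def trend(signal: np.ndarray, window: int = 60) -> np.ndarray:
    """Moving-average trend with a sliding window (60 one-minute samples)."""
    kernel = np.ones(window) / window
    return np.convolve(signal, kernel, mode="valid")

def feature_vector(hr, rr, spo2, sysbp) -> np.ndarray:
    """Maximum, minimum, mean and standard deviation of each modality's
    trend, concatenated into one 16-element feature vector."""
    feats = []
    for modality in (hr, rr, spo2, sysbp):
        t = trend(np.asarray(modality, dtype=float))
        feats.extend([t.max(), t.min(), t.mean(), t.std()])
    return np.array(feats)
```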
3) SVM classification: The SVM model used in this example is a supervised machine learning algorithm for solving classification and regression problems. It has shown good generalization properties in many applications. The basic idea is to construct an optimal hyperplane for linearly separable patterns. The optimal hyperplane is the one that has the maximal margin between the two classes. For non-linearly separable patterns one solution is to transform the original data into a higher or indefinite dimensional space and then find a separating hyperplane in the transformed space by using a kernel function. Given a training set (xi, yi), i=1, . . . , N, where xi∈Rn and yi∈{±1}, xi is a data point and yi indicates the class to which the point xi belongs. The output of the classifier is defined as f(x) = wᵀφ(x) + b,
where the function φ maps xi into a higher dimensional space, w is the weight vector and b is the bias of the hyperplane. The standard SVM requires the solution of the following optimization problem: minimize ½wᵀw + c Σi ζi over w, b and ζ,
subject to yi(wᵀφ(xi) + b) ≥ 1 − ζi and ζi ≥ 0 for i = 1, . . . , N,
where ζi is a slack variable and c is a penalty parameter. They are introduced if the training data cannot be separated without error. As a consequence, training samples can lie at a small distance ζi on the wrong side of the hyperplane. In practice, there is a trade-off between a low training error and a large margin. This trade-off is controlled by the penalty parameter c. A Gaussian kernel, k(xi, xj) = exp(−‖xi − xj‖²/(2σ²)), was chosen for the non-linear SVM classifier in this study,
where σ is the width of the Gaussian kernel. Tuning of σ is important for optimizing classifier performance. Threefold cross-validation was applied to estimate the classification performance. The misclassification cost (nSAE + ncontrol)/nSAE was given to SAE data samples, whereas (nSAE + ncontrol)/ncontrol was given to control data samples. Here, nSAE and ncontrol represent the number of data samples belonging to the SAE class and the control class, respectively. The dataset was randomly partitioned into three subsets. One subset (a testing set) was used to validate the classifier trained on the remaining two subsets (a training set). This process was repeated three times such that each subset was validated once. During training, the training set was further divided into subsets for optimizing the Gaussian kernel parameter σ and the box constraint (inner cross-validation). The parameters, box constraint and σ, were searched among positive values on a log-scale in the range [10−3, 103]. The optimal box constraint and σ were then applied to build the classifier for the testing set. The performance of the classifier was evaluated in terms of sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and the area under the receiver operating characteristic curve (AUROC).
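A cost-weighted Gaussian-kernel SVM of this kind can be sketched with scikit-learn's SVC as a stand-in implementation. Note that sklearn parameterizes the kernel width as gamma = 1/(2σ²), and that the class-weight dictionary below implements the (nSAE + ncontrol)/nclass misclassification costs:

```python
import numpy as np
from sklearn.svm import SVC

def train_weighted_rbf_svm(X, y, box_constraint=1.0, sigma=1.0):
    """RBF-kernel SVM with class-dependent misclassification costs.
    The weight for each class c is (n_SAE + n_control) / n_c, so errors
    on the minority (SAE) class are penalized more heavily."""
    n = len(y)
    weights = {c: n / np.sum(y == c) for c in np.unique(y)}
    clf = SVC(C=box_constraint, kernel="rbf",
              gamma=1.0 / (2.0 * sigma ** 2), class_weight=weights)
    return clf.fit(X, y)
```

In the study above, the box constraint and σ would be tuned by an inner cross-validation over a log-scale grid before evaluating on the held-out fold.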
The performance of the classifier with threefold cross-validation is summarized in Table I. The accuracy, sensitivity, specificity, PPV, NPV and AUROC are relatively close among the three tests. The classifier achieved an average accuracy of 89%, sensitivity of 80%, specificity of 93%, PPV of 82%, NPV of 92% and AUROC of 93%. Additionally,
This example demonstrates a new non-invasive way of estimating the blood pressure (BP) of patients without the need for the normal cuff. It is based on measured vital signs, as also disclosed herein, and application of artificial intelligence, in particular a trained machine learning model. The new BP estimation does not require the usual strict synchronization between wearable devices. As vital signs can be acquired continuously and in real time, the presently disclosed BP estimation can also be provided in real time, for example by the application of a trained machine learning model.
Blood pressure (BP) is a key hemodynamic variable for the evaluation and diagnosis of conditions such as stroke and cardiovascular disease. BP can vary dramatically from beat to beat and minute to minute. It is crucial to monitor BP continuously in post-operative patients. Currently BP is often monitored continuously with an invasive arterial catheter, for example in critically ill patients in the ICU. This approach carries a risk of infection and requires a clinical procedure. Outside the ICU, BP is measured by a cuff-based device, however only intermittently. The inflation/deflation often causes discomfort or pain for the patients and disturbs their rest. Cuffless BP estimation is therefore favored. Many cuffless BP estimations are based on features which require synchronization between the electrocardiogram (ECG) and photoplethysmogram (PPG). Since ECG and PPG are recorded by two different devices, synchronization between ECG and PPG often causes problems. In this study, we propose a new way to estimate BP based on the vital signs heart rate (HR), respiration rate (RR), blood oxygen saturation (SpO2) and pulse rate (PR). These vital signs are calculated independently and are not sensitive to the synchronization.
498 post-operative patients participated in the study. After major abdominal cancer surgery, they were re-admitted to the general ward where their vital signs were monitored for up to four days with the approach described herein. Severe adverse events, resulting from a wide range of complications, were collected for up to 30 days. Two wearable devices were attached to the patients for acquiring vital signs continuously. One was the Isansys Lifetouch at the chest for acquiring single lead ECG, from which HR and RR were derived. The other was a pulse oximeter at the finger for acquiring PPG, from which SpO2 and PR were derived. Systolic blood pressure (SBP) and diastolic blood pressure (DBP) were measured intermittently by the Meditech BlueBP-05. The wireless acquisition and transmission of vital signs was managed by the Isansys patient status engine (PSE) (Isansys Lifecare Ltd). HR, RR, SpO2, PR, SBP and DBP were synchronized through their timestamps.
Random forest with 200 trees was applied for estimation of DBP and SBP, but other models can be used. First, 3 hours' time series of HR, RR, SpO2 and PR before BP measurements were extracted, from which descriptive statistics such as mean, standard deviation and range were calculated as features. Then, the regression model was trained with first day's data of each patient. The trained model was tested by the following days' data. The mean absolute error (MAE) and standard deviation (STD) of the error were used for evaluating estimation performance.
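A minimal sketch of this regression setup, assuming scikit-learn's RandomForestRegressor as the random forest implementation (the feature and function names are illustrative):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def bp_features(hr, rr, spo2, pr) -> np.ndarray:
    """Descriptive statistics (mean, standard deviation, range) of the 3 h
    windows of each vital sign preceding a cuff BP measurement -> 12 features."""
    feats = []
    for modality in (hr, rr, spo2, pr):
        m = np.asarray(modality, dtype=float)
        feats.extend([m.mean(), m.std(), m.max() - m.min()])
    return np.array(feats)

def fit_bp_model(X_day1, y_day1) -> RandomForestRegressor:
    """Train a 200-tree random forest on each patient's first-day data;
    the trained model is then tested on the following days' data."""
    return RandomForestRegressor(n_estimators=200, random_state=0).fit(X_day1, y_day1)
```

Separate models would be fitted for SBP and DBP, and evaluated via the mean absolute error and standard deviation of the error as described above.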
Estimation performance is shown in Table I. According to the Association for the Advancement of Medical Instrumentation (AAMI) standard, the MAE should be less than or equal to 5 mmHg and the STD should be less than or equal to 8 mmHg for both DBP and SBP. In this example the STD of DBP met the standard and the MAE was close to the standard. The STD of SBP is close to the standard, while the MAE is higher.
The period directly following surgery is critical for patients as they are vulnerable to infections and other types of complications, i.e. severe adverse events (SAE). Impending complications might alter the circadian rhythm and, therefore, be detectable during the night before. This example provides a prediction model that can classify nighttime vital signs depending on whether they precede a serious adverse event or come from a patient that does not have a complication at all, based on data from 450 post-operative patients. The prediction model is compared to random classifiers to demonstrate the applicability.
Circadian clocks, which are autonomous molecular mechanisms, are found in all mammalian cells and regulate body functions such as hormone secretion, immune response and the sleep/wake phases. These normal changes in cardiovascular function can be accompanied by adverse events; for example, the onset of myocardial infarctions or sudden cardiac death has been found to be elevated in the early morning compared to nighttime. Antiarrhythmic mechanisms, such as increased heart rate variability, can be constrained by disease and, therefore, protection might not be optimal. As demonstrated herein, serious adverse events might be preceded by changes in vital signs during the night, stemming from deactivation of the patients' parasympathetic nervous system; this might cause an increase in heart rate, respiratory rate and blood pressure. Because heart rate and respiration rate reliably reach their nadir during sleep, nighttime offers an opportunity to observe the physiological baseline and make a comparison between patients that will have a complication and patients that will not.
Vital signs were obtained according to the procedure as described herein. In particular, systolic (SBP) and diastolic (DBP) blood pressure values in mmHg were automatically recorded using the Meditech BlueBP-05. Heart rate (HR) and respiratory rate (RR) were obtained once per minute from the Isansys Lifetouch single lead ECG, sampled at 1000 Hz at the chest of the patient. The pulse rate measurements were taken at the patient's arm. The photoplethysmogram (PPG) was measured at a rate of 75 Hz using a pulse oximeter (Nonin Model 3150 WristOx2), from which the SpO2 values were determined.
The problem was modelled as a binary classification task. The nights, which were defined as ranging from midnight to 6 AM, were extracted from the continuous measurements and either labelled as 1, if they preceded a SAE or 0, if they did not.
The procedures described in the previous sections provide one 360 minutes × 6 modalities vector for each night. For each modality with at least one recorded datapoint, missing values are filled by performing first a forward-fill and then a backward-fill. In order to smooth the data and correct for measurement errors, the moving average is computed using the nearest 10 values. For each night, 9 features are calculated per modality: mean, median, standard deviation (STD), maximum, minimum, kurtosis, skewness, and the 10th and 90th percentiles. The mean, median and standard deviation can be useful to detect anomalous vital sign values such as an elevated heart rate or an unstable respiration rate. The maximum and minimum measured during the night can indicate unusual events such as hypoxemia episodes in the case of SpO2. Kurtosis measures the weight of a distribution's tails relative to the center; skewness evaluates its asymmetry. The 10th and 90th percentiles provide information about the distribution. In addition, the static variables age, gender, height, weight, whether the patient smokes, number of packs smoked and units of alcohol consumed per week are added. This results in a total of (9*6)+7=61 features. Missing features, which arise if not a single value was recorded for the respective modality during the whole night, are filled by mean imputation.
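The per-night feature computation can be sketched with pandas; this is an illustrative version, and the column names and the exact rolling-window settings are assumptions:

```python
import pandas as pd

def night_features(night: pd.DataFrame) -> pd.Series:
    """Features for one night (360 rows x modality columns): fill gaps
    forward then backward, smooth with a 10-point moving average, then
    compute the 9 descriptive statistics per modality."""
    filled = night.ffill().bfill()
    smooth = filled.rolling(10, min_periods=1, center=True).mean()
    feats = {}
    for col in smooth.columns:
        s = smooth[col]
        feats.update({
            f"{col}_mean": s.mean(), f"{col}_median": s.median(),
            f"{col}_std": s.std(), f"{col}_max": s.max(), f"{col}_min": s.min(),
            f"{col}_kurtosis": s.kurt(), f"{col}_skewness": s.skew(),
            f"{col}_p10": s.quantile(0.1), f"{col}_p90": s.quantile(0.9),
        })
    return pd.Series(feats)
```

The static patient variables would then be appended to this series to form the full 61-element feature vector.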
After feature extraction, 5-fold cross-validation is used to split the data into a training set and a test set. Each fold is used once for testing while the four remaining folds constitute the training set. This procedure is performed 10 times and the average of the evaluation metrics is computed. To correct for the imbalance in the dataset, the Synthetic Minority Oversampling Technique (SMOTE) is applied to the training set but not the test set. The implementation used in this example was provided by the imbalanced-learn library and brings both classes to equal size. XGBoost was chosen as the classification algorithm, because XGBoost has been shown to deliver state-of-the-art performance while running faster than most other solutions. In short, the algorithm works via gradient tree boosting. Given a dataset D={(xi, yi)} (|D|=n, xi∈ℝm, yi∈ℝ), with n examples and m features, the output is predicted by summing over K functions
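The evaluation protocol, with oversampling applied to the training folds only, can be sketched as follows. To keep the sketch self-contained, it substitutes a plain duplication-based oversampler for SMOTE and scikit-learn's GradientBoostingClassifier for XGBoost; the cross-validation structure is the point, not the specific classifier:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import StratifiedKFold

def oversample(X, y, rng):
    """Stand-in for SMOTE: duplicate samples until both classes have equal
    size. (SMOTE instead interpolates new synthetic minority samples.)"""
    classes, counts = np.unique(y, return_counts=True)
    n_max = counts.max()
    idx = np.concatenate([
        rng.choice(np.flatnonzero(y == c), n_max, replace=True) for c in classes
    ])
    return X[idx], y[idx]

def cross_validate(X, y, n_splits=5, seed=0):
    """Stratified k-fold CV with oversampling applied to the training folds
    only, never to the test fold, to avoid leaking resampled data."""
    rng = np.random.default_rng(seed)
    scores = []
    for tr, te in StratifiedKFold(n_splits, shuffle=True, random_state=seed).split(X, y):
        X_tr, y_tr = oversample(X[tr], y[tr], rng)
        clf = GradientBoostingClassifier(random_state=seed).fit(X_tr, y_tr)
        scores.append(clf.score(X[te], y[te]))
    return float(np.mean(scores))
```

Applying the resampling inside each fold, after the split, is what keeps the test fold representative of the true class balance.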
In the formulation above
and F represents the space of regression trees, q is the structure of the trees and T the number of leaves per tree. From this representation, the regularized objective function can be derived:
where ŷi is the prediction, yi the true outcome and l the loss function. To prevent the model from overfitting, a regularization function can be included as a second term. Because all the trees cannot be learned at once, the model parameters are learned in an additive fashion. In the formula only the functions ft that optimize the model are chosen.
This prediction approach can be compared to two implementations of a random classifier as provided by scikit-learn's DummyClassifier class. The first, uniform version simply represents a coin flip and chooses the classes with equal probability. The second, stratified version chooses the classes with the same probability as presented to the classifier in the labelled training output sets, so the majority class is chosen more frequently.
To assess the effectiveness of this approach, standard performance measures were calculated. The accuracy shows which percentage of the test data was predicted correctly; recall, precision and F1-score provide information about the type of errors. Additionally, the ROC-AUC value is computed by integrating over the area below the curve.
After the data was filtered as described in the previous section, 184 nights preceding SAEs and 475 nights of patients without SAEs remained. In the table below, the mean values per night for both classes are shown, with the standard deviation of the means in brackets. On average, on nights preceding SAEs, patients have a higher heart and breathing rate as well as a slightly lower oxygen saturation compared to the patients without complications. However, it is also evident that due to the large standard deviation a simple threshold-based classifier is not sufficient to solve this task, which reaffirms the presently disclosed machine learning based approach.
Comparing the average percentage of missing data in the table below, it can be seen that the SAE group has many more values missing than the Non-SAE group. A reason for that could be that very sick patients take off their measurement devices more frequently, especially the oxygen saturation sensor at the finger.
The table below presents the performance of the classifier compared to the two random baseline models. The model of this example achieved an F1-score of 0.49, a precision of 0.58, an accuracy of 0.75 and a ROC-AUC score of 0.65, all better than baseline. However, this classifier underperforms on the recall metric, which is due to the wide standard deviations as presented earlier.
The present example has some limitations: All the data used came from a single cohort for both training and validation. A model trained on data from various institutions will generalize better and be more valuable in clinical contexts. Additionally, the fact that the SAEs used as outcome measures in this study have different causes and severities, and might not lead to changes in the vital signs at all, could explain the heterogeneous results. There were also notable differences between the nights which preceded an SAE and the nights which did not in terms of missing data. Nights before a critical event had a higher percentage of data missing for all modalities. Another issue is that it was assumed that patients were sleeping based on the time of day. If a patient is awake during the night, their vital signs could be altered and make prediction more difficult. Combining the algorithm in this example with a sleep stage detector, for example based on EEG measurements, could substantially improve its predictive capabilities. In spite of these limitations this example further illustrates that monitoring of vital signs is an important tool for prediction of SAEs, and that nighttime monitoring can further improve the prediction of SAEs, possibly even hours before the SAE arises, and the presently disclosed approach provides a significant step in the understanding of disease progression during sleep.
The data for this project was acquired at Rigshospitalet and Bispebjerg Hospital in Copenhagen, Denmark from February 2018 to August 2020. The 450 patients (275 males, 175 females) had a mean age of 71 years (range 60-93). On average there were 80 hours of data recorded (range 12-169). A serious adverse event (SAE) is defined according to the guidelines as any medical occurrence that results in death, results in subject hospitalization, results in persistent or significant disability or incapacity of the subject, is associated with a congenital anomaly or birth defect, or qualifies as an “other important medically significant event or condition”. These events were recorded by attending clinicians and entered into a database. Written informed consent was obtained from all patients participating in the study.
As disclosed herein, continuous monitoring of vital signs improves the foundation for data analysis with respect to standard care. The present example relates to prediction of vital signs, i.e. not only providing an alarm when deterioration has occurred, but actually predicting whether it is likely to occur in the near future.
The present example employs Multivariate Auto-Regressive (MAR) models to create a forecast projection of vital signs parameters based on past measurements. Forecasting vital signs could help identify deviation of the normal physiology that is likely to occur in the near future.
Consider a set of variables y=y1, . . . , yN, where each element yt=[yt1, . . . , ytm] is the response at time t, N is the signal length and m the number of modalities in the signal. The response at time t, as defined by the MAR model, is given by yt = α + β1 yt−1 + β2 yt−2 + . . . + βK yt−K,
where α is a vector of m elements and βk is a matrix of size [m, m] from the array β=[β1, . . . , βK]. Thus, in the auto-regressive model the value of yt is given as a linear combination of the previous K elements of y, the intercept α and the weights in β.
Due to the nature of the vital signs signals, the physiological expectation of the temporal evolution in the signals is that homeostasis will cause the value to return to some patient-specific baseline value. It can be advantageous to construct a model that includes the ‘pull’ towards a baseline value. This can be achieved by creating the MAR model centred around the intercept parameter. A popular implementation is to centre the model around the mean of the signal, μ_y, where the response y_t, computed in the equation above, instead comes from

y_t = μ_y + Σ_{k=1}^{K} β_k (y_{t−k} − μ_y) + ε_t,   ε_t ~ N(0, Σ)
As the value of μy, when computed from the time series available, does not necessarily reflect the true baseline, this can be fixed globally or as a parameter fitted in the model.
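A sketch of the mean-centred variant follows, with the baseline and weights chosen purely for illustration; with small weights the response is pulled back towards the baseline, mimicking homeostasis:

```python
import numpy as np

def mar_response_centered(y_past, mu, beta):
    """Mean-centred MAR: y_t = mu + sum_k beta_k @ (y_{t-k} - mu)."""
    return mu + sum(b @ (y_past[-(k + 1)] - mu) for k, b in enumerate(beta))

mu = np.array([70.0, 12.0])                # assumed HR/RR baselines
beta = np.full((2, 2, 2), 0.1)             # small illustrative lag weights
history = np.array([[90.0, 20.0],          # elevated past observations
                    [85.0, 18.0]])
y_next = mar_response_centered(history, mu, beta)  # lies between history and mu
```

Setting all lag weights to zero collapses the forecast onto the baseline μ, which makes the ‘pull’ of the construction explicit.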
In this example, data from an observational study with 500 postoperative cancer patients monitored for up to 4 days after major abdominal surgery was used. The data were obtained at Rigshospitalet and Bispebjerg Hospital in Copenhagen, Denmark from February 2018 to August 2020. Patients were monitored with a single lead ECG patch (Lifetouch Blue), a wrist-worn pulse oximeter (Nonin WristOx2), and a cuff-based blood pressure monitor (TM-2441). From the sensors the following modalities were available: heart rate (HR) (1/60 Hz), respiration rate (RR) (1/60 Hz), peripheral oxygen saturation (SpO2) (1/60 Hz) and systolic and diastolic blood pressure (measured every 30 minutes). All data were transmitted to a central server by the Isansys Patient Status Engine. A subset of the measurements was selected from the cohort to perform the inference of the parameters in the model and evaluate the predictive accuracy. Only measurements of HR and RR values were used, and the extracted data were verified to contain no missing values for HR or RR in the selected time periods. To fit the model, the subset consisted of 150 minutes of simultaneous HR and RR measurements from eight different patients chosen at random. This gave a total of 20 hours of data for inference. The time series used are shown in
The MAR model was constructed as a pooled model. A pooled model defines a model where the same parameters are fitted across several different data sources, in this case different patients' vital signs signals. This results in a single set of model parameters used for all future patients. In the case of the MAR model this means that the parameters α, β and Σ are kept equal for all patients, P. The probabilistic graphical model of the implemented pooled MAR model is shown in the graph in
For the model, the priors for the parameters were kept uninformative and given by normal distributions. As the intercept, α, is used as a global baseline, the prior means were chosen to reflect common baseline values for the heart rate and respiration rate: for heart rate the mean was set to 70 and for respiration rate it was set to 12. All parameters in β had priors set to follow a standard normal distribution. The following summarizes the model

y_t^(p) ~ N(α + Σ_{k=1}^{K} β_k (y_{t−k}^(p) − α), Σ),   p = 1, . . . , P

with the priors for α and β being

α ~ N([70, 12]^T, σ_α² I),   β_k,ij ~ N(0, 1)
The lag-parameter, K, was set to K=20, reflecting the past 20 minutes of vital signs data.
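To make the priors concrete, the following sketch draws one set of parameters from priors matching the stated means (α centred on 70 and 12, standard-normal β) and simulates a prior-predictive hour of data. The α prior scale, the noise scale σ and the damping of β are assumptions made purely so the illustration stays numerically stable:

```python
import numpy as np

rng = np.random.default_rng(1)
K, m = 20, 2                                   # K = 20 lags, m = 2 modalities (HR, RR)

# Draws from the stated priors; scales below are illustrative assumptions.
alpha = rng.normal([70.0, 12.0], 1.0)          # intercept / baseline prior draw
beta = rng.normal(0.0, 1.0, size=(K, m, m))    # standard-normal prior draws
beta *= 0.01                                   # damped for a stable illustration
sigma = np.array([2.0, 1.0])                   # assumed noise scale (bpm, brpm)

# Prior-predictive simulation of one hour of 1/60 Hz data.
y = np.tile(alpha, (K, 1))                     # start the lag history at baseline
for _ in range(60):
    mean = alpha + sum(beta[k] @ (y[-(k + 1)] - alpha) for k in range(K))
    y = np.vstack([y, mean + rng.normal(0.0, sigma)])
```

Prior-predictive draws like this are a common sanity check that the chosen priors produce physiologically plausible trajectories before any inference is run.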
The objective of fitting a model is to establish the parameters of the model, θ, to fit the target distribution p(θ|y). This is done by either exact or approximate inference, depending on the dimensionality of the problem at hand. Due to the computational complexity of exact inference, the current problem would be intractable in an exact approach.
Instead, approximate inference in the form of Markov Chain Monte Carlo (MCMC) sampling is used. MCMC sampling is a general method based on iteratively drawing samples of θ from approximate distributions and updating these to continuously improve the approximation of the target distribution. The idea, as in Bayesian simulation, is that the collection of simulated draws from p(θ|y) will summarize the posterior density. Hence MCMC sampling is useful for sampling from Bayesian posterior distributions where it is intractable to infer θ exactly from p(θ|y). Due to the random initialization of the sampling algorithm, the samples will have a transition period from initialization to the posterior distribution. To account for this, a warm-up period is defined and the samples from this period are rejected. To ensure that the sampling has stabilized at the posterior distribution, sampling from multiple independent chains was done, such that convergence could be quantified by use of the diagnostic measure R̂, which compares the within-chain variance and the between-chains variance. The idea is that while the individual chains have not mixed, and thus not approached the target distribution, the variance of all chains combined should be larger than that of the chains individually. As the individual chains converge, R̂ → 1, and Vehtari et al. recommend R̂ < 1.01 before using the samples [10]. In the used setup, each model was fitted using 4 chains with 2000 iterations each. Each chain was given a warm-up period of 1000 iterations, leaving 1000 for sampling per chain. This provided 4000 posterior samples of the parameters.
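The between/within-chain comparison behind the R̂ diagnostic can be sketched as follows (a basic version of the diagnostic, without the rank-normalization and chain-splitting refinements of Vehtari et al.):

```python
import numpy as np

def rhat(chains):
    """Basic R-hat diagnostic for one scalar parameter.

    chains: shape (n_chains, n_samples) of post-warmup draws. Compares
    between-chain variance B with within-chain variance W; as the
    chains mix, the pooled-to-within variance ratio approaches 1.
    """
    c, n = chains.shape
    B = n * chains.mean(axis=1).var(ddof=1)    # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()      # mean within-chain variance
    var_hat = (n - 1) / n * W + B / n          # pooled variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(2)
mixed = rng.normal(0.0, 1.0, size=(4, 1000))   # 4 well-mixed "chains"
stuck = mixed + np.array([[0.0], [0.0], [0.0], [5.0]])  # one chain off target
```

For the well-mixed chains the diagnostic lands near 1, while the shifted chain inflates the between-chain variance and pushes R̂ well above the 1.01 threshold.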
To evaluate the model's predictive accuracy, the model was applied to data from 5 unseen patients. A window matching the lag parameter, K=20, was provided to the model to create a forecast of 15 minutes. The forecast segment was compared to the true values within the window. For this, the root mean squared error (RMSE) was used to quantify the accuracy of the expected value in the forecast window with respect to the original signal. The window was then moved 10 minutes forward and the process repeated for the entirety of the time series. The setup for the first two steps is shown in
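The sliding-window evaluation above can be sketched generically; the forecast function below is a stand-in persistence forecast (repeating the last observation), not the fitted MAR model, and the signal is synthetic:

```python
import numpy as np

def evaluate_forecasts(signal, forecast_fn, lag=20, horizon=15, step=10):
    """Slide over `signal` (shape (N, m)): feed the model the last `lag`
    samples, forecast `horizon` samples ahead, and score the forecast
    against the true values with RMSE per modality."""
    rmses = []
    t = lag
    while t + horizon <= len(signal):
        history = signal[t - lag:t]
        forecast = forecast_fn(history, horizon)       # shape (horizon, m)
        truth = signal[t:t + horizon]
        rmses.append(np.sqrt(((forecast - truth) ** 2).mean(axis=0)))
        t += step                                      # move the window forward
    return np.array(rmses)                             # per-window, per-modality RMSE

# Illustrative stand-in: persistence forecast on a synthetic random-walk signal.
persistence = lambda hist, h: np.tile(hist[-1], (h, 1))
sig = np.cumsum(np.random.default_rng(3).normal(size=(200, 2)), axis=0)
window_rmse = evaluate_forecasts(sig, persistence)
```

Averaging `window_rmse` over windows gives the per-modality summary figures reported in the example.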
The results of the evaluation of the predictive accuracy of the forecasts are presented in the table below. The parameters, α, β and Σ, of the model showed proper convergence with all values of R̂ < 1.01. The average RMSE for HR across all patients was 11.4 bpm, with the lowest and highest being 0.4 bpm and 32.1 bpm, respectively. For RR the average RMSE was 3.3 brpm, with the lowest and highest being 0.9 brpm and 7.4 brpm, respectively. For HR the results in the table below show a large difference between patients, where the lowest average RMSE for one patient was 4.7 bpm and the highest 20.5 bpm. The resulting responses of the MAR model are visualized in
Predicting future deviations in vital signs, such as heart rate and respiration rate, is challenging, as the nature of the signals may imply rapid changes not known in advance. Sudden activation of the patient will lead to changes in their vital signs that will not be possible to predict before the activation occurs. The difficulty of capturing this can be seen in
The model proposed in this example demonstrates promising results when applied to different patients. The range in the subset used for evaluation shows that both at low and high values of HR and RR the model still provides a good forecast. Though, the variation occurring over multiple days and under different circumstances has only barely been assessed, and there will most likely be rare events that have not been represented in the evaluation. In this example, the model was implemented in a pooled construction, which has advantages in a clinical setting. As the pooled model relies on a single set of parameters to span all patients, there is no requirement to perform inference of the parameters for each patient, which is demanding in computational power when done in an iterative Bayesian approach. This can also be a disadvantage of the pooled model compared to other constructions, such as the separate or hierarchical model, where patient-specific variations can be built into the model. These could be advantageous if the pooled model has difficulties fitting the diversity in data that different patients present. However, as there is no clear patient-specific deviation, the use of these models must be weighed against the increased computational requirements.
Another aspect of the natural representation of vital signs, not included in this example, is the heteroscedasticity assumed to be present. The current model assumes the data to be homoscedastic within each modality, i.e. that the data have the same variance across patients and temporal locations. It becomes clear from the plots in
The construction of a model that creates a forecast leads to the question of how to use the forecast. As the nature of the signals entails rapid changes, the notion that it will be possible to predict far into the future does not resemble reality. Instead, it could be advantageous to use the forecasts as a baseline prediction and evaluate deviation from this based on the true values in a practical setup. Quantifying rapid changes from the forecast values could be a way to use the model to detect deviations in a real-time setting.
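Using the forecast as a baseline and flagging deviations from it can be sketched as a simple residual threshold; the z = 3 threshold, the flat 70 bpm baseline and the 3 bpm forecast standard deviation below are illustrative assumptions, not values from the example:

```python
import numpy as np

def flag_deviations(observed, forecast, sigma, z=3.0):
    """Flag samples whose absolute deviation from the forecast baseline
    exceeds z forecast standard deviations, e.g. for real-time alerting."""
    residual = np.abs(np.asarray(observed, float) - np.asarray(forecast, float))
    return residual > z * np.asarray(sigma, float)

# Illustrative HR samples against a flat 70 bpm forecast baseline,
# assuming a forecast standard deviation of 3 bpm.
flags = flag_deviations([71.0, 69.0, 85.0, 70.0], [70.0] * 4, sigma=3.0)
```

Only the 85 bpm sample deviates by more than 3 standard deviations (9 bpm) from the baseline, so only that sample would be flagged.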
The present example shows that it is possible to predict/forecast time series of the vital signs HR and RR based on previous measurements, for example by employing a pooled MAR model. Though there were large deviations in the predictive accuracy in the forecast window between patients, a fairly low RMSE of 11.4 bpm for HR and 3.3 brpm for RR was achieved on average, see for example
Number | Date | Country | Kind |
---|---|---|---|
21184712.4 | Jul 2021 | EP | regional |
21205557.8 | Oct 2021 | EP | regional |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2022/069262 | 7/11/2022 | WO |