This application is based upon and claims the benefit of priority of the prior Indian Patent Application number 202311057599, filed on Aug. 28, 2023, the entire contents of which are incorporated herein by reference.
The present invention relates to sepsis diagnosis/prediction, and in particular to a computer-implemented method, a computer program, and an information processing apparatus.
Sepsis may be defined as a body's extreme response to an infection. It is a life-threatening medical emergency. Sepsis occurs when an infection already present triggers a chain reaction throughout the body. Sepsis can lead to tissue damage, organ failure, and death.
Sepsis diagnosis/prediction is in high demand as an accurate prediction may save human lives and hospital resources. In particular, early sepsis diagnosis/prediction is desirable. Sepsis diagnosis/prediction may for example be carried out in an ICU (intensive care unit) setting.
In light of the above, a method for sepsis prediction is desired.
According to an embodiment of a first aspect there is disclosed herein a computer-implemented method comprising: performing a prediction process, the prediction process comprising: based on input data comprising values of physiological measurement variables of/related to a patient (and) over a (first) time period, computing (first) correlations in the input data, the computing comprising computing (short range temporal) correlations between values of (different) physiological measurement variables at (different) consecutive time steps (and between values of the same physiological measurement variable at (different) consecutive time steps)) (using an attention-based mechanism) and computing (spatial) correlations between values of different physiological measurement variables at a same time step (using a self-attention mechanism); generating first updated node embeddings based on the input data and the (first) correlations, each node corresponding to a (value of a) physiological measurement variable at a time step; using a recurrent neural network, RNN, updating the first updated node embedding based on/with (second) correlations between the first updated node embeddings to generate (temporally) updated embeddings; and based on the (temporally) updated embeddings and using a neural network, NN, generating a prediction/diagnosis indicating whether the patient ((currently) has or) will have sepsis (at a future point in time).
Features relating to any aspect/embodiment may be applied to any other aspect/embodiment.
Reference will now be made, by way of example, to the accompanying drawings, in which:
The spatial features block 23 dynamically computes edge weights based on the static adjacency matrix of road networks and the absolute difference between the high dimensional representation (which may be referred to as node values) E. The vectors “e” are generated based on the edge weights and the input data (i.e. using a weighted sum).
In a temporal convolution block 24 (specifically a temporal convolution network 241), a dilated convolution takes place on data frame vectors (e) of different time stamps (i.e. e1, e2, etc.) to obtain a time-based aggregation of data. Based on the output of the temporal convolution network 241, a binary prediction is made.
The methodology of comparative method 1 is not suitable or useful for sepsis prediction based on physiological parameters. This is because, for example, unlike the input data in comparative method 1 (traffic speed), physiological parameters data of different physiological variables do not share similar attributes (e.g. HR (heart rate) and O2Sat (O2 saturation) do not share similar attributes—at least, not sufficiently for the methodology of comparative method 1 to be useful), hence the absolute difference between them does not hold any useful meaning in capturing edge strength (weight). Furthermore, in comparative method 1 the initial edges (weights) are known (i.e. via the static adjacency matrix of road networks). The edges are merely updated at each time step using the previous traffic speed data using a graph convolution network. There is no such known adjacency matrix available in the case of sepsis prediction based on physiological parameters data.
A problem with applying the methodology of comparative method 2 to a sepsis prediction based on time series physiological parameters data is that useful temporal relationships between different variables are not captured by the spatial correlation generation nor by the time-aggregation of the attention weights between nodes. Another problem is that edge weights should not be computed based on previous layer edge weights because each layer represents different information corresponding to different physiological parameters.
At step 1, (raw) data (indicated by “x” in
At step 2, data imputation and preprocessing is performed. For example, data imputation is a method for retaining the majority of a dataset's data and information by substituting missing data with a different value. Data preprocessing comprises, for example, cleaning, transforming, and integrating of data in order to make it ready for analysis. The imputation and preprocessing will of course depend on the specific data which is input. The output of step 2 (the preprocessed data) is indicated by “x” in
At step 3, a high dimensional feature representation of the input numerical features/vitals (Physiological parameters) is generated. The high dimensional feature representations may be referred to as high dimensional encodings or multi-dimensional encodings and are indicated as “E” in
At step 4 the processing of a correlation block (or “spatial with short-range temporal correlation block”) takes place. Here, first correlations are generated between the values of the physiological measurement variables as described later. In short, the spatial pattern (cross-correlation between features at the same time step, e.g. time t) is captured by this block, as well as the cross-correlation between features of the time t and t−1 (i.e. short-range temporal correlations across two consecutive time steps). As described later, the calculation of correlations in such a way extracts the spatial as well as short term temporal information which is more informative. This block generates first updated node embeddings (“e”) based on the high dimensional encodings and the first correlations.
At step 5, a recurrent neural network (RNN), or an RNN-based network, comprising at least one gated recurrent unit (GRU) generates long-term temporal patterns of the features, i.e. second correlations between the first updated node embeddings. As described later, the long-term temporal patterns of the features are captured using the GRU which improves the sepsis prediction. The RNN generates temporally updated embeddings (“h”) based on the first updated node embeddings and the second correlations.
At step 6, a neural network (NN) is used to generate a prediction of sepsis or no sepsis (i.e. as a binary classification task), e.g. at time t+T. The NN comprises fully connected layers. The binary prediction is indicated by “Yt+T” in
As illustrated, the input data has dimension i×m, where i is the number of time steps and m is the number of features. The high dimensional encoder 42 generates the high dimensional encodings or “high dimensional features” of the input data. The high dimensional encodings are represented as “E” in
The encodings are input to the correlation block 43 (as the input data) so that first correlations are computed. Computing the first correlations comprises computing short range temporal correlations between values of different physiological measurement variables at different time steps and between values of the same physiological measurement variable at different time steps using an attention-based mechanism (i.e. a modified attention mechanism) and computing spatial correlations between values of different physiological measurement variables at a same time step using a self-attention mechanism.
The correlation block 43 uses a key-query-value (KQV) scheme/mechanism to generate the short range (or short-term) temporal correlations (in the short range temporal correlation block 432) and the spatial correlations (in the spatial cross-correlation block 433). First, the encodings are multiplied with weight vectors (W, not shown) and a non-linear transformation is performed to generate initial node embeddings E′ (as illustrated by the linear layers in the feature embeddings block 431). A different weight vector W is used for each feature in
The first correlations of a given node (i.e. a given variable at time t) are determined in two stages. In the first stage (in block 432), short range temporal correlations are determined and in the second stage (in block 433) spatial correlations are determined.
To generate the short-term temporal correlations, data (encodings) of consecutive pairs of time steps are considered. Considering, for example, the consecutive time steps t=0 and t=1, for the key and value (vectors K, V) data of time step t=1 is used, while for the query (vector Q) data of time step t=0 is used, and correlation weights (beta), which may be referred to as short range temporal correlation weights, are learned. Specifically, a modified attention mechanism is used to generate the short range temporal correlations. To generate the spatial correlations, data (intermediate node embeddings s) of each time step are considered separately, and the correlation weights (alpha), which may be referred to as spatial correlation weights, are determined using the key-query-value mechanism. Specifically, a self-attention mechanism is used to generate the spatial correlations. The output of the correlation block 43 is the first updated node embeddings “e”.
In the first stage, intermediate node embeddings (“s”) are generated, which may be referred to as short range temporal embeddings. This is done by determining the correlations between nodes of time t−1 and time t using a modified attention mechanism (as described later in detail), which are referred to as short range temporal correlations, and updating the initial node embeddings with the correlations to generate the intermediate node embeddings (as in
In the second stage, the intermediate node embeddings generated in the first stage are used as input to determine the first updated node embeddings (which may be referred to as spatial node embeddings) of each node (i.e. for each feature) using a self-attention mechanism (described later in detail). That is, spatial correlations are determined between the intermediate node embeddings of the same time step and used to update the intermediate node embeddings to generate the first updated node embeddings (as shown in
Effectively, the initial node embeddings E′ are updated with the first correlations (spatial and short range temporal) to generate first updated node embeddings e.
As illustrated, a weighted sum is used in the computation of the intermediate node embeddings and the first updated node embeddings (i.e. an initial node embedding at time t=1 is updated based on node embeddings at time t=0 weighted by the relevant short range temporal correlation weight (beta), and an intermediate node embedding at time t=1 is updated based on the other intermediate node embeddings at time t=1 weighted by the relevant spatial correlation weight (alpha)).
The first correlation computation and update processing is carried out L times, where L is a positive integer of one or more. That is, the processing may be repeated. This is indicated by the “L-layers” text in
The initial, intermediate, and first updated node embeddings may be referred to as initial, intermediate, and first updated: feature embeddings; embeddings; embedding vectors; or feature embedding vectors.
At the bottom of
Returning to
These first updated node embeddings are concatenated and input to the RNN 45. This step may be considered inputting the first updated node embeddings into the RNN 45.
The RNN 45 processes the first updated node embeddings to learn long-range temporal correlations between them and updates the input with these correlations, which may be referred to as second correlations. The outputs of the RNN 45 are referred to as temporally updated embeddings.
The temporally updated embeddings are input to the NN 46. The temporally updated embeddings include information of the first correlations and the second correlations and thus information about patterns across the different features and different time steps. The NN 46 has been trained to generate a sepsis/non-sepsis prediction based on such data as a binary classification task. The NN 46 makes the prediction.
A specific implementation of the first correlation computation will be described.
“Key” vectors/matrices: K = W_K e_{t−1}, where e_{t−1} represents the key features at time step t−1 (i.e. the embeddings “e” at t−1, that is, the first updated node embeddings at time t−1), and W_K represents a weight vector for performing the linear transformation (multiplying the embedding by the weights).
“Query” vectors/matrices: Q = W_Q E′_t, where E′_t represents the query features at time step t (i.e. the embeddings “E′” at t, that is, the initial node embeddings at time t), and W_Q represents a weight vector for performing the linear transformation.
“Value” vectors/matrices: V = W_V e_{t−1}, where e_{t−1} represents the value features at time step t−1 (i.e. the embeddings “e” at t−1, that is, the first updated node embeddings at time t−1), and W_V represents a weight vector for performing the linear transformation.
The weights W_K, W_Q, and W_V for the key, query, and value may differ from each other (i.e. they are not necessarily the same).
The attention weights β represent the short-range temporal correlation weights between keys and queries, which are learned/trainable.
β = softmax(Q·K^T / sqrt(d)), where K^T is the transpose of K and d is the dimension of the query and key vectors. The correlations use the dot product between Q and K.
Final short-range temporal embeddings (intermediate node embeddings) s_t of nodes are calculated as: s_t = z_t + Q, where z_t = βV is the weighted sum of the values V, and Q is the query vector.
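The short-range temporal stage defined by the key, query, and value expressions above can be sketched in NumPy as follows. This is a minimal illustration and not the disclosed implementation: the function name `short_range_temporal_update`, the array shapes, the row-per-node layout (so that weights multiply from the right), and the use of one shared weight matrix for all nodes are assumptions made for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    ex = np.exp(x)
    return ex / ex.sum(axis=axis, keepdims=True)

def short_range_temporal_update(E_t, e_prev, W_Q, W_K, W_V):
    """Hypothetical helper: one short-range temporal update for time step t.

    E_t    : (m, d_in) initial node embeddings at time t (one row per variable)
    e_prev : (m, d_in) first updated node embeddings at time t-1
    W_*    : (d_in, d) weights (assumed shared across nodes in this sketch)
    """
    Q = E_t @ W_Q          # queries come from time t
    K = e_prev @ W_K       # keys come from time t-1
    V = e_prev @ W_V       # values come from time t-1
    d = Q.shape[-1]
    beta = softmax(Q @ K.T / np.sqrt(d), axis=-1)  # (m, m) temporal weights
    z_t = beta @ V         # weighted sum of the values
    s_t = z_t + Q          # the "+Q" term keeps the node's own information
    return s_t, beta

# Toy data: m = 5 variables, input width 8, attention width 4 (all assumed).
rng = np.random.default_rng(0)
m, d_in, d = 5, 8, 4
E_t = rng.normal(size=(m, d_in))
e_prev = rng.normal(size=(m, d_in))
W_Q, W_K, W_V = (rng.normal(size=(d_in, d)) for _ in range(3))
s_t, beta = short_range_temporal_update(E_t, e_prev, W_Q, W_K, W_V)
```

Each row of `beta` sums to one, so every intermediate node embedding is a convex combination of the values at time t−1 plus the node's own query.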
These intermediate node embeddings s include information regarding the correlation of the physiological measurement variables (nodes) at time t with physiological measurement variables (nodes) at time t−1. They may also be considered to include information of self-correlation (that is, due to the “+Q” term), which may for example be considered useful as the intermediate node embeddings are used to compute the spatial correlations. This “self-correlation” is illustrated in
Referring back to
“Key” vectors/matrices: K = W_K s_t, where s_t represents the short-range temporal node embeddings (intermediate node embeddings) at time step t, and W_K represents a weight for the linear layers of the key vector.
“Query” vectors/matrices: Q = W_Q s_t, where s_t represents the short-range temporal node embeddings (intermediate node embeddings) at time step t, and W_Q represents a weight for the linear layers of the query vector.
“Value” vectors/matrices: V = W_V s_t, where s_t represents the short-range temporal node embeddings (intermediate node embeddings) at time step t, and W_V represents a weight for the linear layers of the value vector.
The weights W_K, W_Q, and W_V for the key, query, and value may differ from each other (i.e. they are not necessarily the same), and may also differ from the weights W_K, W_Q, and W_V of the short-range temporal correlation computation.
For computing the spatial correlation at time t, a self-attention mechanism is used, and the correlation weights between physiological parameters at time t (nodes at time t) are learned. These weights represent spatial correlations (because they are computed within the same time step) and are denoted by α and defined by the equation:
First updated node embeddings at time t (e_t) are computed by performing the weighted sum of values V, that is, e_t = αV. These node embeddings (e_t) include information regarding the correlations between the physiological measurement variables (nodes) at the same time step t. They also include the information of the short-range correlations of nodes between time step t and previous time step t−1.
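The spatial stage can be sketched in the same way. In this illustration the spatial weights α are assumed to take the same scaled dot-product softmax form as the short-range temporal weights β; the function name `spatial_update`, the shapes, and the shared weight matrices are likewise assumptions, not details confirmed by the description.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    ex = np.exp(x)
    return ex / ex.sum(axis=axis, keepdims=True)

def spatial_update(s_t, W_Q, W_K, W_V):
    """Hypothetical helper: self-attention over the nodes of one time step.

    s_t : (m, d_in) intermediate node embeddings at time t
    W_* : (d_in, d) weights, separate from those of the temporal stage
    """
    Q = s_t @ W_Q
    K = s_t @ W_K
    V = s_t @ W_V
    d = Q.shape[-1]
    # Assumed form of alpha: scaled dot-product attention within time step t.
    alpha = softmax(Q @ K.T / np.sqrt(d), axis=-1)  # (m, m) spatial weights
    e_t = alpha @ V        # weighted sum of values; no "+Q" term in this stage
    return e_t, alpha

# Toy data: m = 5 variables, embedding width 4 (assumed sizes).
rng = np.random.default_rng(1)
m, d_in, d = 5, 4, 4
s_t = rng.normal(size=(m, d_in))
W_Q, W_K, W_V = (rng.normal(size=(d_in, d)) for _ in range(3))
e_t, alpha = spatial_update(s_t, W_Q, W_K, W_V)
```

Because query, key, and value are all derived from the same time step, `alpha[i, j]` can be read as the strength of the relation between variable i and variable j at time t.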
As previously described, the processing above is repeated (with, for a given time step t, the first updated node embedding e_t of the previous iteration being used in place of the initial node embeddings E′_t of the following iteration, as indicated in the equations above). That is, the modified attention processing followed by the self-attention processing is repeated. This processing is performed for each time step. When considering the first time step there is no previous time step. In some implementations, the earliest data processed using the modified attention and self-attention processing is the data of the second time step so that the updated node embeddings for every time step include both spatial and short range temporal correlation information.
Returning to the processing of the short range temporal correlations (e.g.
The key (K), query (Q), and value (V) vectors may be represented as follows:
The intermediate node embeddings may be represented as follows:
The above equation can be expanded as follows:
The generation of the first updated node embeddings may be similarly represented, and in such a case the time step for all quantities would stay the same (e.g. t=2) and there is no “+Q” term.
It will be appreciated that the correlation computations described above may be considered to comprise computing correlations between the values represented by the node embeddings.
The input data in
Input data may comprise (additionally or alternatively) biomarkers (molecules) that are released into the blood or other body fluids in response to infection or inflammation such as procalcitonin (PCT), C-reactive protein (CRP), and interleukin-6 (IL-6).
In general, input data may comprise values of any of the above-mentioned physiological measurement variables.
Step S51 comprises: based on input data comprising values of physiological measurement variables of a patient over a time period, computing first correlations in the input data, comprising computing short range temporal correlations between values of different physiological measurement variables at different time steps and between values of the same physiological measurement variable at different time steps using an attention-based mechanism and computing spatial correlations between values of different physiological measurement variables at a same time step using a self-attention mechanism.
Step S52 comprises generating first updated node embeddings based on the input data and the first correlations.
Step S53 comprises: using a recurrent neural network, RNN, updating the first updated node embeddings based on second correlations between the embedding vectors to generate temporally updated embeddings.
Step S54 comprises: based on the temporally updated embeddings and using a neural network, NN, generating a prediction/diagnosis indicating whether or not the patient has (or will have, i.e. at a predetermined time in the future) sepsis.
Steps S51 and S52 are based on the high dimensional encodings/features as the input data. The prediction process may comprise generating the high dimensional encodings/features, for example: for each physiological measurement variable, performing a data binning method on the values concerned and generating, as the high dimensional encodings/features, a feature vector for each value.
Any of the steps S51-S54 may comprise processing described above with reference to
Step S14 comprises loading data windows and labels as training data. The labels comprise (as indicated in step S19) a “ground truth” classification of the patient having sepsis at time t+T. Step S15 comprises generating a node embedding (E′) for each data point. Step S15 may be considered to comprise multiplying the high dimensional features/encoding with weight vectors. Step S16 comprises generating the cross-correlational embeddings (i.e. generating the first correlations and updating the node embeddings based thereon). Step S17 comprises generating temporal embeddings (i.e. generating the second correlations based on the first updated node embeddings and updating the first updated node embeddings based thereon to generate the temporally updated embeddings).
Step S18 comprises classifying the training data as sepsis or non-sepsis, i.e. generating the prediction indicating whether the patient will have or not have sepsis at time t+T (some predefined time in the future). Step S19 comprises comparing the prediction to the label indicating a “ground truth” sepsis/non-sepsis classification and computing the loss therebetween. The prediction indicated by the label may be referred to as a training prediction. The training process in an implementation comprises using cross-entropy loss or focal loss. Weighted cross-entropy loss or logarithmic loss may be used. What is used may depend on the amount of class imbalance in the dataset, for example.
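The loss options named above can be illustrated with a short NumPy sketch. The function names, the `pos_weight` parameter used for the weighted variant, and the focal loss settings `gamma=2.0` and `alpha=0.25` are assumptions chosen for illustration; the description does not specify any of these values.

```python
import numpy as np

def binary_cross_entropy(p, y, pos_weight=1.0, eps=1e-12):
    """Mean (optionally weighted) cross-entropy for binary labels.

    p : predicted probability of the sepsis class, shape (n,)
    y : ground-truth labels in {0, 1}, shape (n,)
    pos_weight : assumed knob that up-weights the (rarer) sepsis class
    """
    p = np.clip(p, eps, 1.0 - eps)
    per_sample = -(pos_weight * y * np.log(p) + (1 - y) * np.log(1.0 - p))
    return float(np.mean(per_sample))

def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Mean focal loss: down-weights examples that are already well classified."""
    p = np.clip(p, eps, 1.0 - eps)
    p_t = np.where(y == 1, p, 1.0 - p)            # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))

# Toy predictions for three data windows (values are illustrative only).
p = np.array([0.9, 0.2, 0.6])
y = np.array([1, 0, 1])
ce = binary_cross_entropy(p, y)
wce = binary_cross_entropy(p, y, pos_weight=5.0)
fl = focal_loss(p, y)
```

With heavy class imbalance, the weighted cross-entropy or the focal loss would typically be preferred over the plain cross-entropy, consistent with the remark above that the choice may depend on the amount of imbalance.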
Although not shown in
Step S20 comprises determining whether the loss is converged. If yes then the method ends. If no then the method proceeds to step S14 and more training data is loaded. The training process is iterated in this way. For example, it may be determined whether the loss is below a threshold or has been below a threshold for a predetermined number of iterations. Alternatively or additionally, step S20 may comprise determining whether the maximum number of epochs has been reached. This may be determined in addition to the loss convergence—i.e. the method may end if either determination is true (“yes”), that is, if either the maximum number of epochs has been reached or if the loss is determined to be converging/have converged.
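The stopping decision of step S20 might be expressed as in the following sketch. The function name `should_stop` and the default values for `threshold`, `patience`, and `max_epochs` are hypothetical; the description only states that a threshold, a predetermined number of iterations, and a maximum number of epochs may be used.

```python
def should_stop(loss_history, epoch, threshold=0.05, patience=3, max_epochs=100):
    """Return True when training should end (hypothetical helper).

    Training ends if the maximum number of epochs has been reached, or if the
    loss has stayed below `threshold` for the last `patience` iterations.
    """
    if epoch >= max_epochs:
        return True
    recent = loss_history[-patience:]
    return len(recent) == patience and all(loss < threshold for loss in recent)

# Example: the loss has been below the threshold for the last three iterations.
converged = should_stop([0.5, 0.04, 0.03, 0.02], epoch=4)
```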
The training process may be considered not to comprise any of steps S11-S14, and any of these steps may be considered already carried out as part of a pre-training process. A method of training a system for sepsis prediction may comprise using input data related to different patients per iteration. Any of the steps S15-S18 may comprise any of the corresponding processing described above with reference to
The test process comprises steps S31-S38. Steps S31-S38 correspond with steps S11-S18, except test data is used rather than training data. The test process ends with a prediction/diagnosis/classification of sepsis/non-sepsis in step S38. The considerations described with respect to steps S11-S18 similarly apply to steps S31-S38, respectively.
An implementation of the neural network is described below but is not essential. A neural network can be composed of 2 fully connected layers and 1 fully connected classification layer, which means it is a type of neural network that has 3 layers of neurons. The first 2 layers are fully connected, meaning that every neuron in one layer is connected to every neuron in the next layer. The third layer is a fully connected classification layer, meaning that it has a number of output neurons equal to the number of classes in the problem (in this case it is 2). The input to the neural network is a vector of real numbers (in this case input dimension is [1×128]). Each layer can be represented mathematically as follows:
z=W*x+b
where z is the output of the layer, x is the input to the layer, W is the weight matrix of the layer, and b is its bias vector.
The output z is then passed through a non-linear activation function f. As an activation function, there are a number of options such as ReLU, Leaky ReLU etc. These may be used by the intermediate layer. The most common activation function for classification tasks (last layer) is the softmax function or sigmoid function. The softmax function takes a vector of real numbers as input and outputs a vector of probabilities. The probabilities represent the probability of each label being assigned to the input data point. The label with the highest probability is then chosen as the output label. It is noted that 2 or more fully connected layers may be used for the NN.
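A forward pass through such a network can be sketched as follows, assuming the 128-dimensional input mentioned above, ReLU in the intermediate layers, and softmax in the classification layer. The hidden widths (64 and 32), the random initialisation, and the function names are assumptions made for this illustration; the weights here are untrained, so the output probabilities carry no clinical meaning.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    """Numerically stable softmax for a 1-D vector."""
    ex = np.exp(x - x.max())
    return ex / ex.sum()

def init_params(rng, sizes=(128, 64, 32, 2)):
    """One (W, b) pair per layer; hidden sizes 64 and 32 are assumed."""
    return [(rng.normal(scale=0.1, size=(n_out, n_in)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Two fully connected layers with ReLU, then a softmax classification layer."""
    a = x
    for W, b in params[:-1]:
        a = relu(W @ a + b)        # z = W*x + b followed by the activation f
    W, b = params[-1]
    return softmax(W @ a + b)      # probabilities for the two classes

rng = np.random.default_rng(0)
params = init_params(rng)
h = rng.normal(size=128)           # stand-in for a temporally updated embedding
probs = forward(params, h)
label = int(np.argmax(probs))      # class with the highest probability
```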
An implementation of a method of training used to train the NN is as follows. The architecture that may be used herein is an end-to-end architecture which takes the temporally updated embeddings (h) as input and produces a vector of probabilities as output from the classification layer, wherein the loss is calculated using a loss function (e.g. cross-entropy loss, focal loss etc.). The loss is then used to train the model using an optimization algorithm (e.g. stochastic gradient descent, Adam, Adagrad etc.), which tries to find the values of the model's parameters that minimize the loss.
The use of vital signs data as the physiological measurement variables is not essential. A GRU RNN is not essential; an RNN-based network that does not necessarily comprise a GRU is used in some implementations.
A data binning method may be carried out as follows, using a value of temperature as an example. The value is assigned to a bin, for example bin number 6 (the bins represent intervals of temperature values, e.g. bin 6 may represent the interval 36-36.3 degrees Celsius). Then a high dimensional encoding for that value is an array comprising 5 zeroes (representing bins 1-5), a normalized (between 0 and 1) value corresponding to the actual temperature value, and then a number of 1s corresponding to how many other bins exist above bin 6. For example if there are 10 bins the high dimensional encoding may look like: “0 0 0 0 0 y 1 1 1 1”, where y is the normalized temperature value. Data binning is merely one way of generating high dimensional encodings.
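The binning example above can be reproduced with a short sketch. The function name `bin_encode`, the choice of ten equal-width bins spanning 34.5 to 37.5 degrees Celsius (so that bin 6 covers 36 to 36.3), and the min-max normalisation over the full range are assumptions; the description does not state how the normalised value y is computed.

```python
import numpy as np

def bin_encode(value, edges):
    """Hypothetical helper: encode one measurement as a vector with one entry per bin.

    Bins below the value's bin are 0, the value's own bin holds the
    normalised value, and bins above it are 1.
    """
    n_bins = len(edges) - 1
    # Index of the bin containing `value`, clipped to the valid range.
    idx = int(np.clip(np.searchsorted(edges, value, side="right") - 1, 0, n_bins - 1))
    # Assumed normalisation: min-max over the full range of the bins.
    y = float(np.clip((value - edges[0]) / (edges[-1] - edges[0]), 0.0, 1.0))
    enc = np.ones(n_bins)
    enc[:idx] = 0.0
    enc[idx] = y
    return enc

edges = np.linspace(34.5, 37.5, 11)   # 10 bins of width 0.3 (assumed range)
enc = bin_encode(36.1, edges)         # 36.1 falls in bin 6 (index 5)
```

For this input the encoding has the pattern "0 0 0 0 0 y 1 1 1 1" described above, with y equal to the normalised temperature.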
The second correlations may be referred to as long range temporal correlations. The output of an RNN may comprise the output of the last GRU block.
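How a GRU could carry information across time steps, with the output of the last GRU block taken as the temporally updated embedding, is sketched below. The gate equations follow one common GRU formulation (conventions differ between libraries) and are not taken from the description; the parameter names, the hidden size of 128, and the toy dimensions are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_gru(rng, n_in, hidden):
    """Random parameters for the update (z), reset (r) and candidate (h) paths."""
    p = {}
    for g in "zrh":
        p["W" + g] = rng.normal(scale=0.1, size=(hidden, n_in))
        p["U" + g] = rng.normal(scale=0.1, size=(hidden, hidden))
        p["b" + g] = np.zeros(hidden)
    return p

def gru_step(x, h, p):
    """One GRU block: mix the previous state h with a candidate state."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h + p["bz"])              # update gate
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h + p["br"])              # reset gate
    h_tilde = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h) + p["bh"])  # candidate
    return (1.0 - z) * h + z * h_tilde

def run_gru(xs, p, hidden):
    """Process the time steps in order and return the last hidden state."""
    h = np.zeros(hidden)
    for x in xs:
        h = gru_step(x, h, p)
    return h

# Toy data: 6 time steps, 5 variables, embedding width 4 (assumed sizes).
rng = np.random.default_rng(0)
steps, m, d, hidden = 6, 5, 4, 128
e = rng.normal(size=(steps, m, d))      # first updated node embeddings
xs = e.reshape(steps, m * d)            # concatenate the nodes of each time step
params = init_gru(rng, m * d, hidden)
h_last = run_gru(xs, params, hidden)    # output of the last GRU block
```

Because the recurrence processes one time step at a time, the number of time steps does not need to be fixed in advance.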
Any of the above prediction processes may be repeated so that an iteration is carried out every time period, e.g. every hour or half-hour, or using irregular intervals of time. Up-to-date input data (vital signs data) may be used each iteration. That is, each successive iteration may use more (and more recent) input data. This may be considered monitoring the patient for sepsis risk. A prediction/classification of sepsis is a diagnosis of sepsis based on physiological measurements of the patient which may be obtained using at least one sensor. Graphs/networks may be stored in the form of linked data nodes.
As described above, a training process/method is disclosed herein which comprises performing any prediction process described above for multiple iterations and performing weight adjustment based on the loss each iteration.
According to an aspect there is disclosed herein a method to predict sepsis, before actual sepsis onset, in an ICU setting, using physiological data of the patient recorded at regular or irregular time intervals (e.g. using hourly recorded physiological data of the patient). Improved accuracy compared to other methods is achieved by modelling spatial correlation/spatial and temporal structure of patients' physiological parameters/vitals with a graph-based network and an RNN. Sepsis prediction is in high demand as it saves human lives and hospital resources. Finding the correlation between vitals in spatial and temporal directions simultaneously is difficult and until now not fully explored.
An objective of aspects disclosed herein may be considered an early sepsis prediction before sepsis onset in an ICU setting, and this is achieved by exploiting the cross-correlation and temporal structure of the patients' physiological data.
Limitations of existing sepsis prediction models include the fact that they do not consider the correlational structure of patients' various physiological and other body parameters during the prediction of sepsis, nor the fact that their cross-correlation changes as the disease progresses with time. Furthermore, the spatial cross-correlation patterns in comparative method 1 do not include the cross-correlation patterns between features of different (consecutive) time steps (short range temporal correlations). In addition, the number of timestamps (in sepsis prediction based on physiological parameters, e.g. vital signs) is not necessarily fixed but the convolutions employed in comparative method 1 need a fixed input size. Aspects disclosed herein employ a memory learning mechanism, not merely a simple aggregation over time as in comparative method 1. Moreover, more recent time-frames are more important (in sepsis prediction), which is not captured in comparative method 1.
The spatial cross-correlation patterns in comparative method 2 do not include temporal cross-correlation patterns between features of different (consecutive) time steps (short range temporal correlations). If use of comparative method 2 was attempted for physiological parameters-based sepsis prediction, each layer would have different vital nodes information combinations so edge weights cannot be computed based on previous layer edge weight. Comparative method 2 merely employs time-aggregation of attention weights between nodes, whereas in contrast aspects disclosed herein model attention between nodes at every time-frame and retain the latest evolving relation between nodes. Hence an RNN is used to obtain second (long range temporal) correlations.
Aspects disclosed herein are capable of predicting sepsis a few hours before sepsis onset by using only a patient's physiological time series data. Improved sepsis prediction accuracy with only vitals data is achieved. Aspects may help the building of multi-variate time-series models for sepsis prediction in an ICU setting. General methods to predict sepsis including modelling via RNNs, time domain CNNs, and classic machine learning methods do not consider interdependencies between physiological parameters, for example.
Aspects disclosed herein may include graph-based spatio (cross-correlation)-temporal network and learning strategy for sepsis prediction; early sepsis prediction using physiological parameters or only vital signs; inclusion of the short-range temporal cross-correlation embedding in estimation of spatial cross-correlation embedding; unification of the cross-correlation structure with temporal structure to exploit the spatio-temporal aspect of the data.
Where processing is described as being performed by/using specific architecture, this architecture is not essential and other architecture may be used to perform the same processing.
The computing device 10 comprises a processor 993 and memory 994. Optionally, the computing device also includes a network interface 997 for communication with other such computing devices, for example with other computing devices of invention embodiments. Optionally, the computing device also includes one or more input mechanisms such as keyboard and mouse 996, and a display unit such as one or more monitors 995. These elements may facilitate user interaction. The components are connectable to one another via a bus 992.
The memory 994 may include a computer readable medium, which term may refer to a single medium or multiple media (e.g., a centralized or distributed database and/or associated caches and servers) configured to carry computer-executable instructions. Computer-executable instructions may include, for example, instructions and data accessible by and causing a computer (e.g., one or more processors) to perform one or more functions or operations. For example, the computer-executable instructions may include those instructions for implementing a method disclosed herein, or any method steps disclosed herein, e.g. any of steps S51-S54, S11-S20, and S31-S38 and/or any processes/processing described above. Thus, the term “computer-readable storage medium” may also include any medium that is capable of storing, encoding or carrying a set of instructions for execution by the machine and that cause the machine to perform any one or more of the method steps of the present disclosure. The term “computer-readable storage medium” may accordingly be taken to include, but not be limited to, solid-state memories, optical media and magnetic media. By way of example, and not limitation, such computer-readable media may include non-transitory computer-readable storage media, including Random Access Memory (RAM), Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Compact Disc Read-Only Memory (CD-ROM) or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory devices (e.g., solid state memory devices).
The processor 993 is configured to control the computing device and execute processing operations, for example executing computer program code stored in the memory 994 to implement any of the method steps described herein. The memory 994 stores data being read and written by the processor 993 and may store weights and/or input data and/or equations and/or training data and/or labels and/or nodes and weights of networks and/or other data, described above, and/or programs for executing any of the method steps/processes described above. As referred to herein, a processor may include one or more general-purpose processing devices such as a microprocessor, central processing unit, or the like. The processor may include a complex instruction set computing (CISC) microprocessor, reduced instruction set computing (RISC) microprocessor, very long instruction word (VLIW) microprocessor, or a processor implementing other instruction sets or processors implementing a combination of instruction sets. The processor may also include one or more special-purpose processing devices such as an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), a digital signal processor (DSP), network processor, or the like. In one or more embodiments, a processor is configured to execute instructions for performing the operations and steps discussed herein. The processor 993 may be considered to comprise any of the modules described above. Any operations described as being implemented by a module may be implemented as a method by a computer and e.g. by the processor 993.
The display unit 995 may display a representation of data stored by the computing device, such as a prediction and/or data and/or a representation of networks and/or GUI windows and/or interactive representations enabling a user to interact with the apparatus 10 by e.g. drag and drop or selection interaction, and/or any other output described above, and may also display a cursor and dialog boxes and screens enabling interaction between a user and the programs and data stored on the computing device. The input mechanisms 996 may enable a user to input data and instructions to the computing device, such as enabling a user to input any user input described above.
The network interface (network I/F) 997 may be connected to a network, such as the Internet, and is connectable to other such computing devices via the network. The network I/F 997 may control data input/output from/to other apparatus via the network. Other peripheral devices such as a microphone, speakers, printer, power supply unit, fan, case, scanner, trackball, etc. may be included in the computing device.
Methods embodying the present invention may be carried out on a computing device/apparatus 10 such as that illustrated in
A method embodying the present invention may be carried out by a plurality of computing devices operating in cooperation with one another. One or more of the plurality of computing devices may be a data storage server storing at least a portion of the data.
The invention may be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or in combinations of them. The invention may be implemented as a computer program or computer program product, i.e., a computer program tangibly embodied in a non-transitory information carrier, e.g., in a machine-readable storage device, or in a propagated signal, for execution by, or to control the operation of, one or more hardware modules.
A computer program may be in the form of a stand-alone program, a computer program portion or more than one computer program and may be written in any form of programming language, including compiled or interpreted languages, and it may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a data processing environment. A computer program may be deployed to be executed on one module or on multiple modules at one site or distributed across multiple sites and interconnected by a communication network.
Method steps of the invention may be performed by one or more programmable processors executing a computer program to perform functions of the invention by operating on input data and generating output. Apparatus of the invention may be implemented as programmed hardware or as special purpose logic circuitry, including e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).
Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both. The essential elements of a computer are a processor for executing instructions coupled to one or more memory devices for storing instructions and data.
The above-described embodiments of the present invention may advantageously be used independently of any other of the embodiments or in any feasible combination with one or more others of the embodiments.
All examples and conditional language provided herein are intended for the pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventor to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority or inferiority of the invention. Although one or more embodiments of the present invention have been described in detail, it should be understood that various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
The disclosure extends to the following statements:
S1. A computer-implemented method comprising:
S2. The computer-implemented method according to statement S1, comprising performing the prediction process for the patient a plurality of instances, using input data covering a longer and/or more recent time period at each consecutive instance.
S3. The computer-implemented method according to statement S1 or S2, comprising performing the prediction process each hour for a given number of hours.
S4. The computer-implemented method according to any of the preceding statements, wherein the physiological measurement variables comprise at least two of: heart rate; oxygen saturation; temperature; blood pressure (comprising any of systolic blood pressure, mean arterial pressure, and diastolic blood pressure); respiration rate; and end-tidal carbon dioxide.
S5. The computer-implemented method according to any of the preceding statements, wherein the physiological measurement variables comprise at least two of: heart rate; oxygen saturation; temperature; blood pressure (comprising any of systolic blood pressure, mean arterial pressure, and diastolic blood pressure); respiration rate; end-tidal carbon dioxide; blood sugar; Base Excess, bicarbonate, HCO3, level; fibrinogen level; platelets level; Fraction of Inspired Oxygen level; pH level; Partial Pressure of Carbon Dioxide; oxygen saturation of arterial blood; Aspartate aminotransferase level; Blood urea nitrogen; Alkaline phosphatase level; Calcium level; Chloride level; Creatinine level; Direct Bilirubin level; Glucose level; Lactate level; Magnesium level; Phosphate level; Potassium level; Total Bilirubin level; Troponin I level; Haematocrit level; Haemoglobin level; Partial thromboplastin level; and White blood cell level.
S6. The computer-implemented method according to any of the preceding statements, wherein the values of at least one of the physiological measurement variables are obtained using at least one sensor (attached to the patient's body).
S7. The computer-implemented method according to any of the preceding statements, wherein generating the temporally updated embeddings comprises using the RNN to compute/determine/find/recognize the second correlations between the first updated node embeddings.
S8. The computer-implemented method according to any of the preceding statements, wherein the prediction process comprises: generating multi-dimensional feature encodings based on the input data; and generating initial node embeddings by multiplying the multi-dimensional feature encodings with weight vectors, wherein the computing the first correlations comprises computing the first correlations based on the initial node embeddings.
S9. The computer-implemented method according to statement S8, wherein generating the multi-dimensional feature encodings comprises, for each physiological measurement variable, performing a data binning method on the values concerned (and generating, as the multi-dimensional feature encodings, a feature vector for each value).
S10. The computer-implemented method according to any of the preceding statements, wherein the prediction process comprises generating an initial node embedding for each value of the input data, and wherein the computation of the first correlations comprises computing the first correlations based on the initial node embeddings.
S11. The computer-implemented method according to statement S10, wherein generating the initial node embeddings comprises, for each physiological measurement variable, performing a data binning method on the values concerned to generate multi-dimensional feature encodings and multiplying the multi-dimensional feature encodings by weight vectors.
S12. The computer-implemented method according to any of the preceding statements, wherein computing the short range temporal correlations comprises computing correlations between values of the same and different physiological measurement variables at two consecutive time steps.
S13. The computer-implemented method according to any of the preceding statements, wherein computing the short range temporal correlations for a value of a physiological measurement variable at a time step comprises computing correlations between that value and the value of each physiological measurement variable at the previous time step.
S14. The computer-implemented method according to any of the preceding statements, wherein computing the short range temporal correlations comprises computing correlations between a value at one of the consecutive time steps and the value of each physiological measurement variable at another of the consecutive time steps.
S15. The computer-implemented method according to any of the preceding statements, wherein computing the short range temporal correlations comprises: computing a correlation between a value of a first one of the physiological measurement variables at a primary time step and a value of a second one of the physiological measurement variables at a secondary time step; computing a correlation between a value of the first one of the physiological measurement variables at the secondary time step and a value of the second one of the physiological measurement variables at the primary time step; computing a correlation between the value of the first one of the physiological measurement variables at the primary time step and the value of the first one of the physiological measurement variables at the secondary time step; and computing a correlation between the value of the second one of the physiological measurement variables at the primary time step and the value of the second one of the physiological measurement variables at the secondary time step, and the primary and secondary time steps are consecutive time steps.
S16. The computer-implemented method according to any of the preceding statements, wherein computing the short range temporal correlations for a pair of consecutive time steps comprises computing said correlations between values of the physiological measurement variables at the two consecutive time steps, and computing the first correlations comprises computing said short range temporal correlations for each pair of consecutive time steps.
S17. The computer-implemented method according to any of the preceding statements, wherein computing the short range temporal correlations for a pair of consecutive primary and secondary time steps comprises: computing a correlation between a value of a first one of the physiological measurement variables at the primary time step and a value of a second one of the physiological measurement variables at the secondary time step; computing a correlation between a value of the first one of the physiological measurement variables at the secondary time step and a value of the second one of the physiological measurement variables at the primary time step; computing a correlation between the value of the first one of the physiological measurement variables at the primary time step and the value of the first one of the physiological measurement variables at the secondary time step; and computing a correlation between the value of the second one of the physiological measurement variables at the primary time step and the value of the second one of the physiological measurement variables at the secondary time step, and computing the first correlations comprises computing said short range temporal correlations for each pair of consecutive time steps.
S18. The computer-implemented method according to any of statements S8-S17, wherein generating the first correlations comprises using the initial node embeddings corresponding to the values, respectively.
S19. The computer-implemented method according to any of statements S8-S18, wherein computing the short range temporal correlations for the values corresponding to the second time step in the order of time steps comprises computing correlations between initial node embeddings corresponding to the second time step and initial node embeddings corresponding to the first time step (in the order of time steps).
S20. The computer-implemented method according to any of statements S8-S19, wherein computing the short range temporal correlations for the values corresponding to the second time step in the order of time steps comprises computing correlations between each of the initial node embeddings corresponding to the second time step and each of the initial node embeddings corresponding to the first time step (in the order of time steps).
S21. The computer-implemented method according to any of statements S8-S20, wherein computing the short range temporal correlations for the values corresponding to the second time step in the order of time steps comprises: computing a correlation between an initial node embedding corresponding to a first one of the physiological measurement variables and corresponding to the/a first time step (preceding the second time step) and an initial node embedding corresponding to a second one of the physiological measurement variables and corresponding to the second time step; and computing a correlation between an initial node embedding corresponding to the first one of the physiological measurement variables and corresponding to the second time step and an initial node embedding corresponding to the second one of the physiological measurement variables and corresponding to the first time step; computing a correlation between the initial node embedding corresponding to the first one of the physiological measurement variables and corresponding to the first time step and the initial node embedding corresponding to the first one of the physiological measurement variables and corresponding to the second time step; and computing a correlation between the initial node embedding corresponding to the second one of the physiological measurement variables and corresponding to the first time step and the initial node embedding corresponding to the second one of the physiological measurement variables and corresponding to the second time step.
S22. The computer-implemented method according to any of statements S8-S21, wherein computing the short range temporal correlations for the values corresponding to each of the third and subsequent time steps in the order of time steps comprises computing correlations between each of the initial node embeddings corresponding to the time step concerned and each of the first updated node embeddings corresponding to the preceding time step.
S23. The computer-implemented method according to any of statements S8-S22, wherein computing the short range temporal correlations for the values corresponding to the second time step in the order of time steps comprises computing correlations between each of the initial node embeddings corresponding to the second time step and each of the initial node embeddings corresponding to the first time step (in the order of time steps), and computing the short range temporal correlations for the values corresponding to each of the third and subsequent time steps in the order of time steps comprises computing correlations between each of the initial node embeddings corresponding to the time step concerned and each of the first updated node embeddings corresponding to the preceding time step.
S24. The computer-implemented method according to any of statements S8-S23, wherein the prediction process comprises updating the initial node embeddings based on the short range temporal correlations to generate intermediate node embeddings.
S25. The computer-implemented method according to any of statements S8-S24, wherein the prediction process comprises updating each initial node embedding with/based on its short range temporal correlations with the (initial or updated) node embeddings of the preceding time step to generate intermediate node embeddings.
S26. The computer-implemented method according to statement S25, wherein updating each initial node embedding with/based on its short range temporal correlations to generate intermediate node embeddings comprises using a weighted sum.
S27. The computer-implemented method according to any of statements S24-S26, wherein updating an initial node embedding to generate an intermediate node embedding comprises adding a contribution to the intermediate node embedding based on each (initial or updated) node embedding of the previous time step weighted by its (short range temporal) correlation with the initial node embedding concerned.
S28. The computer-implemented method according to any of statements S24-S27, wherein generating the spatial correlations comprises computing correlations between the intermediate node embeddings corresponding to a same time step (and corresponding to different physiological measurement variables).
S29. The computer-implemented method according to any of statements S24-S28, wherein generating the first updated node embeddings comprises updating each intermediate node embedding with/based on its (spatial) correlation with each other intermediate node embedding corresponding to the same time step.
S30. The computer-implemented method according to any of statements S8-S29, wherein computing the short range temporal correlations for the values corresponding to the second time step in the order of time steps comprises computing correlations between each of the initial node embeddings corresponding to the second time step and each of the initial node embeddings corresponding to the first time step (in the order of time steps); and computing the short range temporal correlations for the values corresponding to each of the third and subsequent time steps in the order of time steps comprises computing correlations between each of the initial node embeddings corresponding to the time step concerned and each of the first updated node embeddings corresponding to the preceding time step.
S31. The computer-implemented method according to any of statements S8-S30, wherein computing the short range temporal correlations comprises computing the short range temporal correlations for the values corresponding to a second time step and for the values corresponding to a third time step and for the values corresponding to subsequent time steps, the second time step occurring after a first time step and before the third time step; the computing the short range temporal correlations for the values corresponding to the second time step comprises computing correlations between each of the initial node embeddings corresponding to the second time step and each of the initial node embeddings corresponding to the first time step; and the computing the short range temporal correlations for the values corresponding to each of the third and subsequent time steps comprises computing correlations between each of the initial node embeddings corresponding to the time step concerned and each of the first updated node embeddings corresponding to the said time step which precedes the time step concerned.
S32. The computer-implemented method according to any of statements S24-S31, wherein the generating the first updated node embeddings comprises updating each intermediate node embedding with/based on its (spatial) correlation with each other intermediate node embedding corresponding to the same time step.
S33. The computer-implemented method according to any of statements S24-S32, wherein updating each intermediate node embedding to generate the first updated node embeddings comprises using a weighted sum.
S34. The computer-implemented method according to any of statements S24-S33, wherein updating an intermediate node embedding to generate a first updated node embedding comprises adding a contribution to the first updated node embedding based on each other intermediate node embedding of the same time step weighted by its (spatial) correlation with the intermediate node embedding concerned.
S35. The computer-implemented method according to any of the preceding statements, wherein generating the spatial and short range temporal correlations comprises computing dot-products between the node embeddings concerned.
S36. The computer-implemented method according to any of the preceding statements, wherein generating the spatial correlations comprises using a self-attention mechanism/network.
S37. The computer-implemented method according to any of the preceding statements, wherein generating the short range temporal correlations comprises using an attention-based mechanism/network.
S38. The computer-implemented method according to any of the preceding statements, wherein generating the spatial correlations comprises using a key-query-value self-attention mechanism/network.
S39. The computer-implemented method according to any of the preceding statements, wherein generating the short range temporal correlations comprises using a key-query-value attention-based mechanism/network.
S40. The computer-implemented method according to any of the preceding statements, wherein the prediction process comprises repeatedly generating said first correlations and updating the node embeddings concerned, comprising, for each subsequent iteration, starting with the first updated node embeddings of the previous iteration in place of the initial node embeddings.
S41. The computer-implemented method according to any of the preceding statements, wherein the RNN is configured (has been trained) to learn long term temporal correlations between the first updated embedding vectors.
S42. The computer-implemented method according to any of the preceding statements, wherein the RNN comprises at least one gated recurrent unit, GRU.
S43. The computer-implemented method according to any of the preceding statements, wherein the NN has been trained to output a sepsis prediction.
S44. The computer-implemented method according to any of the preceding statements, wherein the NN is configured (has been trained) to generate the prediction in the form of a binary classification task.
S45. The computer-implemented method according to any of the preceding statements, wherein the NN comprises at least one or a plurality of (fully-connected) layers.
S46. The computer-implemented method according to any of the preceding statements, wherein the computer-implemented method comprises performing a training process, the training process comprising: performing the prediction process using training data corresponding to a training patient as the input data; adjusting at least one (or every) network weight used in the attention-based mechanism, the self-attention mechanism, the RNN, and the NN based on a difference between the generated prediction/diagnosis and a training prediction/diagnosis corresponding to the training data (as a ground truth prediction/diagnosis/output) (to bring the generated prediction to or towards the training prediction).
S47. The computer-implemented method according to statement S46, wherein the computer-implemented method comprises performing the training process for a plurality of iterations using different training data for each iteration.
S48. The computer-implemented method according to statement S46 or S47, wherein the computer-implemented method comprises performing/iterating the training process until (an iteration in which) the difference between the generated prediction and the training prediction concerned converges or is below an error threshold.
S49. The computer-implemented method according to any of statements S46-S48, wherein the computer-implemented method comprises performing/iterating the training process until the difference between the generated prediction and the training prediction concerned has been below an error threshold for a predefined number of successive iterations.
S50. The computer-implemented method according to any of statements S46-S49, wherein the computer-implemented method comprises performing/iterating the training process a predefined number of iterations.
S51. The computer-implemented method according to any of statements S46-S50, wherein the computer-implemented method comprises, after performing the training process, performing the prediction process using target input data of a target patient to generate a target prediction/diagnosis.
S52. The computer-implemented method according to any of statements S46-S51, wherein the computer-implemented method comprises performing the prediction process for the target patient a plurality of instances, using target input data covering a longer (and more recent) time period at each consecutive instance.
S53. The computer-implemented method according to any of the preceding statements, wherein the computer-implemented method comprises performing the prediction process each hour for a given number of hours.
S54. According to an embodiment of a second aspect there is disclosed herein a computer program which, when run on a computer, causes the computer to carry out a method comprising: performing a prediction process, the prediction process comprising: based on input data comprising values of physiological measurement variables of/related to a patient (and) over a (first) time period, computing (first) correlations in the input data, the computing comprising computing (short range temporal) correlations between values of (different) physiological measurement variables at (different) consecutive time steps (and between values of the same physiological measurement variable at (different) consecutive time steps) (using an attention-based mechanism) and computing (spatial) correlations between values of different physiological measurement variables at a same time step (using a self-attention mechanism); generating first updated node embeddings based on the input data and the (first) correlations, each node corresponding to a (value of a) physiological measurement variable at a time step; using a recurrent neural network, RNN, updating the first updated node embeddings based on/with (second) correlations between the first updated node embeddings to generate (temporally) updated embeddings; and based on the (temporally) updated embeddings and using a neural network, NN, generating a prediction/diagnosis indicating whether the patient ((currently) has or) will have sepsis (at a future point in time).
S55. According to an embodiment of a third aspect there is disclosed herein an information processing apparatus comprising a memory and a processor connected to the memory, wherein the processor is configured to: perform a prediction process, the prediction process comprising: based on input data comprising values of physiological measurement variables of/related to a patient (and) over a (first) time period, computing (first) correlations in the input data, the computing comprising computing (short range temporal) correlations between values of (different) physiological measurement variables at (different) consecutive time steps (and between values of the same physiological measurement variable at (different) consecutive time steps) (using an attention-based mechanism) and computing (spatial) correlations between values of different physiological measurement variables at a same time step (using a self-attention mechanism); generating first updated node embeddings based on the input data and the (first) correlations, each node corresponding to a (value of a) physiological measurement variable at a time step; using a recurrent neural network, RNN, updating the first updated node embeddings based on/with (second) correlations between the first updated node embeddings to generate (temporally) updated embeddings; and based on the (temporally) updated embeddings and using a neural network, NN, generating a prediction/diagnosis indicating whether the patient ((currently) has or) will have sepsis (at a future point in time).
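By way of illustration only, and without limiting any of the statements above, the generation of the initial node embeddings, the first correlations, and the first updated node embeddings (statements S8, S9, S19-S29, and S35) may be sketched in Python as follows. The sketch rests on several assumptions that are not taken from the embodiments: equal-width one-hot binning, random untrained weight vectors, softmax-normalised dot products as the correlations, a residual addition of each weighted sum, and no temporal update at the first time step. The value ranges, the readings, and all function and variable names are hypothetical. The RNN and NN stages are omitted.

```python
import math
import random

random.seed(0)  # reproducible toy weights (untrained, for illustration only)


def softmax(scores):
    """Normalise raw correlation scores so that they sum to one."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product, used here as the correlation between two embeddings."""
    return sum(x * y for x, y in zip(a, b))


def one_hot_bin(value, lo, hi, n_bins):
    """Equal-width binning of one value into a one-hot feature encoding."""
    idx = int((value - lo) / (hi - lo) * n_bins)
    idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values
    return [1.0 if i == idx else 0.0 for i in range(n_bins)]


def embed(encoding, weights):
    """Initial node embedding: feature encoding multiplied by weight vectors."""
    return [dot(encoding, row) for row in weights]


def attend(query, context):
    """Weighted sum of context embeddings, weighted by softmaxed dot products."""
    weights = softmax([dot(query, c) for c in context])
    return [sum(w * c[d] for w, c in zip(weights, context))
            for d in range(len(query))]


def first_updated_embeddings(initial):
    """initial[t][v] is the initial embedding of variable v at time step t."""
    updated = []
    for t, nodes in enumerate(initial):
        if t == 0:
            intermediate = nodes  # assumption: first step has no predecessor
        else:
            # Second step attends to initial embeddings of the first step;
            # later steps attend to first updated embeddings of the previous step.
            prev = initial[0] if t == 1 else updated[t - 1]
            intermediate = [[a + b for a, b in zip(n, attend(n, prev))]
                            for n in nodes]
        step = []
        for v, n in enumerate(intermediate):
            # Spatial update: each node attends to the other nodes of its step.
            others = intermediate[:v] + intermediate[v + 1:]
            if others:
                n = [a + b for a, b in zip(n, attend(n, others))]
            step.append(n)
        updated.append(step)
    return updated


# Hypothetical ranges and hourly readings: heart rate, temperature, respiration rate.
RANGES = [(30.0, 200.0), (33.0, 42.0), (5.0, 50.0)]
READINGS = [[82.0, 36.9, 16.0], [95.0, 37.8, 20.0], [110.0, 38.6, 24.0]]
N_BINS, DIM = 8, 4
WEIGHTS = [[[random.uniform(-1, 1) for _ in range(N_BINS)] for _ in range(DIM)]
           for _ in RANGES]

initial = [[embed(one_hot_bin(x, *RANGES[v], N_BINS), WEIGHTS[v])
            for v, x in enumerate(row)] for row in READINGS]
updated = first_updated_embeddings(initial)
print(len(updated), len(updated[0]), len(updated[0][0]))
```

In this sketch each node keeps its own embedding and adds the attention-weighted contribution of the nodes it attends to; an implementation could equally replace the embedding with the weighted sum alone, or use learned key, query, and value projections as in statements S38 and S39.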
| Number | Date | Country | Kind |
|---|---|---|---|
| 202311057599 | Aug 2023 | IN | national |