The present disclosure refers to a method and a state machine system for determining an operation status for a sensor.
U.S. Publication No. 2014/0182350 A1 discloses a method for determining the end of life of a CGM (continuous glucose monitoring) sensor including evaluating a plurality of risk factors using an end of life function to determine an end of life status of the sensor and providing an output related to the end of life status of the sensor. The plurality of risk factors are selected from a list including a number of days the sensor has been in use, whether there has been a decrease in signal sensitivity, whether there is a predetermined noise pattern, whether there is a predetermined oxygen concentration pattern, and an error between reference BG (blood glucose) values and EGV sensor values.
EP 2 335 584 A2 relates to a method for self-diagnostic test and setting a suspended mode of operation of the continuous analyte sensor in response to a result of the self-diagnostic test.
In U.S. Publication No. 2015/164386 A1, electrochemical impedance spectroscopy (EIS) is used in conjunction with continuous glucose monitors and continuous glucose monitoring (CGM) to enable in-vivo sensor calibration, gross (sensor) failure analysis, and intelligent sensor diagnostics and fault detection. An equivalent circuit model is defined, and circuit elements are used to characterize sensor behavior.
U.S. Publication No. 2010/323431 A1 discloses a control circuit and method for controlling a bi-stable display having bi-stable segments each capable of transitioning between an on state and an off state via application of a voltage. The voltage is provided to a display driver from a charge pump, and supplied to individual ones of the bi-stable segments via outputs from the display driver in accordance with display instructions provided by a system controller. Both a bi-stable segment voltage level of at least one of the outputs of the display driver and a charge pump voltage level of the voltage are detected and compared to a valid bi-stable segment voltage level and a valid charge pump voltage level, respectively. A malfunction signal may be provided to the system controller if either of the detected voltage levels is not valid.
The present disclosure teaches a sensor system that is a state machine (“sensor system” and “state machine” may be used interchangeably herein) and a method for detecting an operation status for a sensor which allows potential operation status problems to be predicted more reliably.
According to an aspect, a method for detecting an operation status for a sensor is provided. In a state machine, the method comprises: receiving continuous monitoring data related to an operation of a sensor, providing a trained learning algorithm for detecting an operation status for the sensor which signifies a sensor function, wherein the learning algorithm is trained according to a training data set comprising historical data, detecting an operation status for the sensor by analyzing the continuous monitoring data with the trained learning algorithm, and providing output data indicating the detected operation status for the sensor.
According to a further aspect, a state machine system is provided. The state machine system has one or more processors configured for data processing and for performing a method for detecting an operation status for a sensor, the method comprising: receiving continuous monitoring data related to an operation of a sensor, providing a trained learning algorithm for detecting an operation status for the sensor which signifies a sensor function, wherein the learning algorithm is trained according to a training data set comprising historical data, detecting an operation status for the sensor by analyzing the continuous monitoring data with the trained learning algorithm, and providing output data indicating the detected operation status for the sensor.
According to the technologies proposed, a process of machine learning is applied for detecting operation status of the sensor. Thereby, a predictive method is implemented for determining the operation status of the sensor by using a trained learning algorithm trained according to a training data set and applied for analyzing continuous monitoring data related to the operation of the sensor.
For example, abnormalities and/or malfunctions with regard to the operation of the sensor may be predicted, thereby avoiding potential problems in the operation of the sensor.
The learning algorithm is trained according to the training data set comprising historical data. The term “historical data” as used in the present application refers to data collected, detected and/or measured prior to the process of determining the operation status. The historical data may have been detected or collected prior to starting collection of the continuous monitoring data received for operation status detection.
The training data set may be collected, detected and/or measured by the same sensor and/or by some different sensor. The sensor different from the sensor for which the operation status is detected may be of the same sensor type.
The training data set may comprise training data indicative of a sensor status to be detected or predicted. For example, the training data set may be indicative of one or more of the following: a manufacturing fault status, a malfunction status, a glycemic indicating status, and an anamnestic indicating status.
The detecting may comprise at least one of detecting a manufacturing fault status for the sensor indicative of a fault in a process for manufacturing the sensor, detecting a malfunction status for the sensor indicative of a malfunction of the sensor, detecting an anomaly status for the sensor indicative of an anomaly in operation of the sensor, detecting a glycemic indicating status for the sensor indicative of a glycemic index for a patient for whom the continuous monitoring data are provided; and detecting an anamnestic indicating status for the sensor indicative of an anamnestic patient status for the patient for whom the continuous monitoring data are provided. The detecting of the manufacturing fault status for the sensor may be performed after manufacturing the sensor. Alternatively or in addition, the detecting of the manufacturing fault status may be applied to an intermediate sensor product (not finalized sensor) while the manufacturing process is still running. Similarly, the detecting of the malfunction status for the sensor may be part of or related to the manufacturing process. Alternatively, by the technology proposed, a malfunction status for the sensor may be predicted after the manufacturing process has been finalized, for example in case of applying the sensor for measurement. The detecting of the anomaly status for the sensor may be done in a measurement process, for example in real time while detection of measurement signals by the sensor is going on. Similarly one of the detecting of the glycemic indicating status and the detecting of the anamnestic indicating status may be performed while a measurement process is running. Alternatively, such detecting may be applied after a measurement process has been finished.
A glycemic index may be determined for the patient, for example, in response to detecting the glycemic indicating status for the sensor. The glycemic index is a number associated with a particular type of food that indicates the food's effect on a person's blood glucose (also called blood sugar) level. A value of one hundred may represent the standard, an equivalent amount of pure glucose. In addition or as an alternative, other glycemic parameters may be determined, such parameters including the rate of change of the blood glucose level, its acceleration, and event patterns due to, for example, movement of the patient, a meal, or mechanical stress on the sensor. With regard to the anamnestic indicating status, anamnestic data may be determined, such as HbA1c or demographic data like the age and/or sex of the patient.
Providing the trained learning algorithm may comprise providing at least one learning algorithm selected from the following group: K-nearest neighbor, support vector machines, Naive Bayes, decision trees such as random forest, logistic regression such as multinomial logistic regression, neural network, and Bayes network. Of preferred interest may be one of Naive Bayes, random forest, and multinomial logistic regression. In a preferred embodiment the random forest algorithm may be applied, for which correlation and interactions between parameters are analyzed or automatically incorporated.
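As an illustration of the preferred embodiment, the following minimal sketch trains a random forest classifier on synthetic data; the feature columns, numeric values, and class labels are illustrative assumptions, not taken from the disclosure, and scikit-learn is assumed as the library:

```python
# Illustrative sketch: random forest classification of sensor operation status.
# Feature columns and all numeric values are assumptions for demonstration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
# Each row: [working-electrode current (nA), counter-electrode voltage (V), temperature (°C)]
X_normal = rng.normal([50.0, 0.6, 37.0], [5.0, 0.05, 0.5], size=(100, 3))
X_fault = rng.normal([5.0, 0.9, 37.0], [5.0, 0.05, 0.5], size=(100, 3))
X = np.vstack([X_normal, X_fault])
y = np.array([0] * 100 + [1] * 100)  # 0 = normal operation, 1 = malfunction status

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)
status = clf.predict([[52.0, 0.6, 37.1]])[0]  # classify new monitoring data
```

Correlations and interactions between the parameters are handled implicitly by the ensemble of decision trees, which is one reason the random forest is singled out above.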
In an embodiment, the method comprises training the learning algorithm according to the training data set which comprises the historical data.
The method may further comprise training a learning algorithm according to the training data set comprising the historical data.
The training may comprise training the learning algorithm according to the training data set comprising at least one of in vivo historical training data and in vitro historical training data.
The training may comprise training the learning algorithm according to the training data set comprising continuous monitoring historical data.
The training may comprise training the learning algorithm according to the training data set comprising test data from the following group: manufacturing test data, patient test data, personalized patient test data, population test data comprising multiple patient data sets. The training data set may be derived from one or more of such different test data for optimizing the training data set with regard to one or more operation status of the sensor.
The training may comprise training the learning algorithm according to the training data set comprising training data indicative of one or more sensor-related parameters from the following group: current values of the sensor, particularly in the case of a continuous monitoring sensor current values of a working electrode; voltage values of the sensor, particularly in the case of a continuous monitoring sensor voltage values of a counter electrode, or voltage values between the reference electrode and the working electrode; temperature of an environment of the sensor during measurement; sensitivity of the sensor; offset of the sensor; and calibration status of the sensor. In dependence on the operation status which is to be detected, one or more of the sensor-related parameters may be selected. With regard to the calibration status of the sensor, for example, it may indicate when a last calibration has been performed.
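A minimal sketch of assembling the listed sensor-related parameters into an ordered feature vector for the learning algorithm; the key names and example values are illustrative assumptions:

```python
# Illustrative sketch: building a feature vector from sensor-related parameters.
# The parameter keys and example values are assumptions.
def make_feature_vector(sample):
    """Order the selected sensor-related parameters consistently."""
    keys = ("we_current_nA", "ce_voltage_V", "ref_we_voltage_V",
            "temperature_C", "sensitivity", "offset")
    return [sample[k] for k in keys]

sample = {"we_current_nA": 48.2, "ce_voltage_V": 0.61, "ref_we_voltage_V": 0.50,
          "temperature_C": 36.8, "sensitivity": 1.02, "offset": 0.4}
vec = make_feature_vector(sample)
```

A fixed parameter order matters because the trained learning algorithm expects the same ordering at training time and at detection time.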
The one or more sensor-related parameters may include at least one of non-correlated sensor-related parameters and correlated sensor-related parameters. Two or more sensor-related parameters may be correlated. In such a case, the correlated sensor-related parameters may be selected for detecting the operation status by taking into account all of the correlated sensor-related parameters. In contrast, in the case of non-correlated sensor-related parameters, a single one of the non-correlated sensor-related parameters may be selected for detecting an operation status. The non-correlated sensor-related parameters may independently allow for detection of the operation status.
The method may further comprise validating the trained learning algorithm according to a validation data set comprising measured continuous monitoring data and/or simulated continuous monitoring data indicative, for the sensor, of at least one of: manufacturing fault status, malfunction status, glycemic indicating status, and anamnestic indicating status.
The method may further comprise at least one of receiving continuous monitoring data comprising compressed monitoring data, and training the learning algorithm according to the training data set comprising compressed training data, wherein the compressed monitoring data and/or the compressed training data are determined by at least one of a linear regression method and a smoothing method. The compressed data may be the result of reducing the dimension of the monitoring data or training data. With regard to the smoothing method, kernel smoothing or spline smoothing models or time series analysis known as such may be applied. In the different stages of compression, the monitoring data/training data may comprise data (measurement signals) per second, data per minute and/or statistic data including characteristic values such as sensor parameters, variance, noise or rate-of-change.
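The dimension reduction described above can be sketched as compressing a window of per-second current samples into a few characteristic values via a linear fit; this is a sketch under assumed data, not the disclosed compression scheme itself:

```python
# Illustrative sketch: compressing per-second samples into a few characteristic
# values (slope and intercept of a linear fit, plus variance).
from statistics import pvariance

def compress_window(samples):
    n = len(samples)
    mean_x = (n - 1) / 2                      # sample indices 0..n-1
    mean_y = sum(samples) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
             / sum((x - mean_x) ** 2 for x in range(n)))
    intercept = mean_y - slope * mean_x
    return {"slope": slope, "intercept": intercept, "variance": pvariance(samples)}

features = compress_window([10.0, 12.0, 14.0, 16.0])  # four per-second samples
```

Instead of keeping every raw sample, only the fitted coefficients and a variance statistic would be carried forward as training or monitoring data.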
Continuous monitoring data may be provided by the sensor that is a fully or partially implanted sensor for continuous glucose monitoring (CGM). In general, in the context of CGM, an analyte value or level indicative of a glucose value or level in the blood may be determined. The analyte value may be measured in an interstitial fluid. The measurement may be performed subcutaneously or in vivo. CGM may be implemented as a nearly real-time or quasi-continuous monitoring procedure frequently or automatically providing/updating analyte values without user interaction. In an alternative embodiment, analyte may be measured with a biosensor in a contact lens through the eye fluid or with a biosensor on the skin via transdermal measurement in sudor. A CGM sensor may stay in place for several days to weeks and then must be replaced.
With regard to the state machine system, the alternative embodiments described above may apply mutatis mutandis.
The above-mentioned aspects of exemplary embodiments will become more apparent and will be better understood by reference to the following description of the embodiments taken in conjunction with the accompanying drawings, wherein:
The embodiments described below are not intended to be exhaustive or to limit the invention to the precise forms disclosed in the following detailed description. Rather, the embodiments are chosen and described so that others skilled in the art may appreciate and understand the principles and practices of this disclosure.
In a further embodiment, additional functional elements (e.g., hardware, sensors, etc.) 7 may be provided in the sensor system 1.
Continuous monitoring data related to an operation of a sensor 7 is received in the one or more processors 2 via the input interface 4. Sensor 7 may be connected to input interface 4 of state machine system 1 via a wire. Alternatively or additionally, a wireless connection, such as Bluetooth, Wi-Fi or other wireless technology, may be provided.
In the embodiment shown, sensor 7 comprises a sensing element 8 and sensor electronics 9. In this embodiment, sensing element 8 and sensor electronics 9 are provided in the same housing of sensor 7. Alternatively, sensing element 8 and sensor electronics 9 may be provided separately and may be connected using a wire and/or wirelessly.
In one embodiment, continuous monitoring data may be provided by a sensor 7 that is a fully or partially implanted sensor for continuous glucose monitoring (CGM). In general, in the context of CGM, an analyte value or level indicative of a glucose value or level in the blood may be determined. The analyte value may be measured in an interstitial fluid. The measurement may be performed subcutaneously or in vivo. CGM may be implemented as a nearly real-time or quasi-continuous monitoring procedure frequently or automatically providing/updating analyte values without user interaction. In an alternative embodiment, analyte may be measured with a biosensor in a contact lens through the eye fluid or with a biosensor on the skin via transdermal measurement in sudor.
A CGM sensor may stay in place for several days to weeks and then must be replaced. A transmitter may be used to send information about an analyte value or level indicative of the glucose level via wireless and/or wired data transmission from the sensor to a receiver such as sensor electronics 9 or input interface 4.
Via the output interface 5, output data indicating the detected operation status for the sensor 7 is provided to one or more output devices 10. Any suitable output device may serve as output device 10. For example, output device 10 may comprise a display device. Alternatively or additionally, output device 10 may comprise an alert generator, a data network and/or one or more further processing devices (processors) and/or one or more signaling devices (transmitters and/or receivers) in communication with another system such as, e.g., an insulin pump. In another embodiment (not shown), more than one output device 10 is provided.
The one or more output devices 10 may be connected to output interface 5 of sensor system 1 via a wire. Alternatively or additionally, a wireless connection, such as Bluetooth, Wi-Fi or other wireless technology, may be provided.
In an alternative embodiment, the output device 10, or one of the more than one output devices 10, is integrated in state machine system 1. Non-limiting examples of typical actions of the output device 10 in response to the detected operation status for the sensor 7 would be halting operation of the sensor, producing an error signal such as a haptic, audible or visual signal, calibrating the sensor, correcting a sensor signal, and/or halting insulin delivery.
In an embodiment, one or more further input devices 11 are connected to the input interface 4. Such further input devices 11 may include one or more further sensors to collect training data and/or validation data for use with the learning algorithm. Further input devices 11 may also include, in addition or as an alternative, sensors for acquiring different types of data. An example of such a different type of data is temperature data. Sensor data of such different type of data may be additionally analyzed for detecting an operation status for the sensor 7. In addition or as an alternative, sensor data of such different type of data may be used as training data and/or validation data. Alternatively or additionally, the one or more further input devices 11 may include a data network, external data storage device, user input device, such as a keyboard, mouse or the like, one or more further processing devices and/or any other device suitable to provide relevant data to sensor system 1.
In step 20, continuous monitoring data related to an operation of a sensor 7 is received in an input interface 4 of a state machine system 1.
Continuous monitoring data may be indicative of one or more sensor-related parameter. Such sensor-related parameters may include current values of a working electrode of the sensor, voltage values of a counter electrode of the sensor, voltage values between the reference electrode and the working electrode, temperature of an environment of the sensor during measurement, sensitivity of the sensor, offset, and/or calibration status of the sensor. Sensor-related parameters may include non-correlated sensor-related parameters, correlated sensor parameters or a combination thereof.
In one embodiment, continuous monitoring data may comprise compressed monitoring data. In this case, compressed monitoring data is determined by at least one of a linear regression method and a smoothing method.
In step 21, a trained learning algorithm is provided. The learning algorithm is trained according to a training data set comprising historical data. The trained learning algorithm may be provided in the memory 3 of the sensor system 1. Alternatively, the trained learning algorithm may be provided in the one or more processors 2 from the memory 3. In an alternative embodiment, the trained learning algorithm is provided via the input interface 4. For example, the trained learning algorithm may be received from an external storage device. In further embodiments, the trained learning algorithm may be provided in one or more additional functional elements (also referred to as sensors) 7 or may be provided in the one or more processors 2 from one or more additional functional elements 7.
The order of steps 20 and 21 may be reversed in different embodiments. In a particular embodiment, the trained learning algorithm is provided before sensor 7 is put into operation. As a further alternative, steps 20 and 21 may be performed, in whole or in part, at the same time.
In step 22, using the one or more processors 2, the continuous monitoring data is analyzed with the trained learning algorithm. In embodiments in which the trained learning algorithm is not provided in the processor 2, the processor 2 may access the trained learning algorithm to analyze the continuous monitoring data. By analyzing the continuous monitoring data, an operation status for the sensor 7 is detected.
The operation status detected for the sensor in step 22 may be one of several different states. For example, a manufacturing fault status for the sensor indicative of a fault in a process for manufacturing the sensor, a malfunction status for the sensor indicative of a malfunction of the sensor, an anomaly status for the sensor indicative of an anomaly in operation of the sensor, a glycemic indicating status for the sensor indicative of a glycemic index for a patient for whom the continuous monitoring data are provided, and/or an anamnestic indicating status for the sensor indicative of an anamnestic patient status for the patient for whom the continuous monitoring data are provided may be detected.
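For illustration, the different statuses listed above may be encoded as classifier outputs and decoded before step 23; the numeric codes below are assumptions:

```python
# Illustrative sketch: mapping a classifier's numeric output onto the
# operation statuses named above. The numeric codes are assumptions.
STATUS_LABELS = {
    0: "normal operation",
    1: "manufacturing fault status",
    2: "malfunction status",
    3: "anomaly status",
    4: "glycemic indicating status",
    5: "anamnestic indicating status",
}

def decode_status(prediction):
    return STATUS_LABELS.get(prediction, "unknown status")

label = decode_status(2)
```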
Subsequently, in step 23, output data indicating the detected operation status for the sensor is provided at output interface 5.
In an embodiment, the method for detecting an operation status for a sensor may further comprise training a learning algorithm according to a training data set comprising historical data.
Still referring to
Historical training data may comprise in vivo historical training data being indicative of sensor-related parameters acquired while sensor 7 is in operation on a living subject. Alternatively or additionally, historical training data may comprise in vitro historical training data being indicative of sensor-related parameters acquired while sensor 7 is not in operation on a living subject.
The training data set provided in step 24 may comprise continuous monitoring historical data.
The training data set may comprise manufacturing test data, patient test data, personalized patient test data and/or population test data comprising multiple patient datasets.
Training data may be indicative of one or more sensor-related parameter. Such sensor-related parameters may include current values of a working electrode of the sensor, voltage values of a counter electrode of the sensor, voltage values between the reference electrode and the working electrode, temperature of an environment of the sensor during measurement, sensitivity of the sensor, offset, and/or calibration status of the sensor. Sensor-related parameters may include non-correlated sensor-related parameters, correlated sensor parameters or a combination thereof.
In one embodiment, the training data set may comprise compressed training data. In this case, compressed training data is determined by at least one of a linear regression method and a smoothing method.
In step 25, the learning algorithm is trained according to the training data set provided in step 24.
The learning algorithm may be selected from suitable algorithms. Such learning algorithms include: K-nearest neighbor, support vector machines, Naive Bayes, decision trees such as random forest, logistic regression such as multinomial logistic regression, neural network, and Bayes network. A learning algorithm may be selected based on suitability for use with the continuous monitoring data analyzed in step 22.
Training of the learning algorithm in step 25 may take place in state machine system 1. In this case, in step 24, the training data set may be provided in the memory 3 of the state machine system 1. Alternatively, the training data set may be provided in the one or more processors 2 from the memory 3. In an alternative embodiment, the training data set is provided via the input interface 4. For example, the training data set may be received from an external storage device. In further embodiments, the training data set may be provided in one or more additional functional elements 7 or may be provided in the one or more processors 2 and/or the memory 3 from one or more additional functional elements 7.
In an alternative embodiment, training of the learning algorithm in step 25 may take place outside sensor system 1. In this embodiment, in step 24, the training data set is provided in any suitable way that enables training of the learning algorithm.
A further embodiment may include step 26 in which the trained learning algorithm is validated according to a validation data set. The validation data set comprises measured continuous monitoring data and/or simulated continuous monitoring data. This data is indicative, for the sensor, of at least one of: manufacturing fault status, malfunction status, glycemic indicating status, and anamnestic indicating status.
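Step 26 can be sketched as computing a classification metric on the held-out validation data; the threshold rule below stands in for the trained learning algorithm and is purely an assumption for demonstration:

```python
# Illustrative sketch: validating a trained model on a validation data set.
# The threshold classifier and all numeric values are assumptions.
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Assumed toy model: flag malfunction (1) when the current drops below 20 nA.
predict = lambda current_nA: 1 if current_nA < 20.0 else 0

validation_currents = [55.0, 18.0, 12.0, 3.0, 60.0]
validation_labels = [0, 0, 1, 1, 0]
acc = accuracy(validation_labels, [predict(c) for c in validation_currents])
```

If the metric on the validation data set falls below a required level, the training of step 25 may be repeated with adjusted parameters.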
Validating of the trained learning algorithm in step 26 may take place in state machine system 1. In this case, the validation data set may be provided in the memory 3 of the sensor system 1. Alternatively, the validation data set may be provided in the one or more processors 2 from the memory 3. In an alternative embodiment, the validation data set is provided via the input interface 4. For example, the validation data set may be received from an external storage device. In further embodiments, the validation data set may be provided in one or more additional functional elements 7 or may be provided in the one or more processors 2 and/or the memory 3 from one or more additional functional elements 7.
In an alternative embodiment, validation of the trained learning algorithm in step 26 may take place outside state machine system 1. In this embodiment, the validation data set is provided in any suitable way that enables validating the learning algorithm.
In one embodiment, the validation data set may comprise compressed validation data. In this case, compressed validation data is determined by at least one of a linear regression method and a smoothing method.
In the following, additional aspects are described.
Measurements for collecting continuous monitoring data are performed with a plurality of continuous glucose monitoring sensors.
Based on an established sequence of working steps in the field of data mining (see Shmueli et al., Data Mining for Business Analytics: Concepts, Techniques, and Applications with XLMiner, 3rd Ed., New York: John Wiley & Sons, 2016), which is to serve as support for the development of a model, the following steps, all or in part, may be realized:
1. Draw up the problem
2. Obtain data
3. Analyze and clean data
4. Reduce the dimensions, if necessary
5. Specify the problem (classification, clustering, prediction)
6. Split the data into training, validation, and test data sets
7. Select the data mining technique (regression, neuronal network, etc.)
8. Evaluate different versions of the algorithm (different variables)
9. Interpret the results
10. Incorporate model into the existing system
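The working steps above can be condensed into a minimal processing chain; the splitting fractions and the toy model below are assumptions for demonstration:

```python
# Illustrative sketch: clean (step 3) -> split (step 6) -> fit (step 7) -> evaluate (step 9).
def run_workflow(records, fit, evaluate, train_frac=0.6, valid_frac=0.2):
    cleaned = [r for r in records if r is not None]          # step 3: clean data
    n_train = int(len(cleaned) * train_frac)
    n_valid = int(len(cleaned) * valid_frac)
    train = cleaned[:n_train]
    valid = cleaned[n_train:n_train + n_valid]
    test = cleaned[n_train + n_valid:]
    model = fit(train)                                       # step 7: technique
    return evaluate(model, valid), evaluate(model, test)     # step 9: results

valid_err, test_err = run_workflow(
    [1.0, None, 2.0, 3.0, 4.0, 5.0, 6.0],
    fit=lambda tr: sum(tr) / len(tr),                 # toy "model": the mean
    evaluate=lambda m, d: sum(d) / len(d) - m)        # toy error measure
```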
In the following, a process for data collection is described, which may be applied in an alternative embodiment.
At test sites, the current value of a working electrode of the sensor, the voltage value of the counter electrode of the sensor, and the voltage values between the reference electrode and the working electrode may be recorded every second for each channel. The temperature of the solution in which the sensors are located may be detected every minute. These parameters may be stored in a data file, for example, in an Extensible Markup Language (XML) file. A data processing program, such as, by way of non-limiting example, CoMo, then captures the data file and formats it for use in a statistical analysis package, e.g., as an experiment in the form of an SAS data set. At the lowest stage, this experiment consists of data referring to one second. As shown in
To start, data from the highest compression stage, the basic statistics, may be used because access to more complex data may be reserved to cases in which the classification using simpler data provides insufficient results. In addition, the classification of time-resolved data, as they are present in the minute and second stage, would require a different programming language, such as Python.
A plurality of test series, such as 16 test series, were identified and distributed to the test sites, resulting, when multiplied by the plurality of channels, in 256 data entries in one example.
For the error identification of each sensor, the graphic illustration according to
Once all channels have been analyzed and identified, the test series may be exported to a memory. In a last step, the test series may be read from this memory and stored as reference.
The entire data set was divided into three parts, a training data set, a validation data set and a test data set representing continuous monitoring data.
In an alternative embodiment, two types of errors, representing an operation status of the sensor, are to be identified by the models. These are a fluidics error and a maxed out current error. A channel without errors, as shown in
In this embodiment, the fluidics error is the focus of error detection. Therefore, data from a period of time with a high volume of these defects is chosen. One difficulty associated with this error type is the large variety of manifestations in which it may occur. However, as illustrated in
The maxed out current error can appear when the sensor is inserted into the channel at the beginning of the test. The sensor at the test site is marked with the error type when a current above a threshold value is detected. It is then possible for a member of the staff at the test site to insert the sensor into the channel anew, thus fixing the error. Alternatively, the sensor may ultimately be marked as being faulty.
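Detection of the maxed out current error can be sketched as a simple threshold check over the recorded current trace; the threshold value is an assumption, not taken from the disclosure:

```python
# Illustrative sketch: flagging a maxed out current error at sensor insertion.
MAX_CURRENT_NA = 200.0  # assumed threshold, not taken from the disclosure

def maxed_out(current_trace_nA):
    """Return True if any recorded current exceeds the threshold."""
    return any(c > MAX_CURRENT_NA for c in current_trace_nA)
```

A channel flagged this way could then be re-inserted by staff or ultimately marked as faulty, as described above.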
In order to be able to mark the data in a meaningful manner, the individual errors may be provided with different error codes according to table 1.
In an alternative embodiment, the strength of the linear relationship between the variables may be determined by means of the correlation coefficient, which can have values between −1 and 1. In the case of a value of 1, a high positive linear correlation is present. When looking at
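The correlation coefficient can be computed as follows; the sample values are assumptions for demonstration:

```python
# Illustrative sketch: Pearson correlation coefficient between two parameters.
def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

r = pearson([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])  # perfectly linear
```

A value near 1 or −1 suggests the two sensor-related parameters should be treated as correlated parameters rather than selected independently.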
As indicated above, there may be variables, such as the current, which may be measured directly at the test site. In an embodiment, when compressing the data, a linear model as well as a spline model are used, which estimate various parameters. Due to the fact that the data set, which is to be used later, includes compressed data, integrated models are considered.
The analysis of the normal distribution condition, which, according to DIN 53804-1, can be carried out graphically by means of quantile-quantile (Q-Q) plots, may be of interest for the descriptive statistics regarding the measured values representing sensor-related parameters. The X-axis of a Q-Q plot is defined by the theoretical quantiles, and the Y-axis is defined by the empirical quantiles. A normally distributed parameter results in a straight line in the Q-Q plot. In addition, there are various normal distribution tests, such as the chi-square test or the Shapiro-Wilk test. These hypothesis tests define the null hypothesis as the presence of the normal distribution; the alternative hypothesis, in contrast, assumes that a normal distribution is not present. These test methods are highly sensitive with respect to deviations. In an embodiment, normal distribution may therefore be analyzed by means of a Q-Q plot for each parameter.
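The quantile pairs underlying such a Q-Q plot can be computed with the standard library; the plotting-position convention (i + 0.5)/n used below is one common choice and an assumption here:

```python
# Illustrative sketch: data behind a Q-Q plot against the standard normal.
from statistics import NormalDist

def qq_pairs(samples):
    xs = sorted(samples)
    n = len(xs)
    std_normal = NormalDist()
    # (theoretical quantile, empirical quantile) pairs
    return [(std_normal.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(xs)]

pairs = qq_pairs([2.0, 0.0, 1.0, 3.0])
```

A roughly straight line through these pairs supports the normal distribution condition for the parameter.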
Measured values may include the sensor current for different glucose concentrations. These may be determined as certain time period medians and may, additionally or alternatively, be averaged. Measured values may further include the sensitivity of the sensor. Additionally or alternatively, measured values may include parameters characteristic of the graphs that describe measured values, such as the sensor current. These may, for example, include a drift and/or a curvature. In addition or as an alternative, values may include statistical values regarding other measured values. Measured values may be approximated employing different models, such as a linear model and/or a spline model. All or any of the measured values and parameters may be determined at different glucose concentrations and/or for different time periods.
In an alternative embodiment, several modeling methods for a learning algorithm are chosen (see, for example, Domingos, A Few Useful Things to Know About Machine Learning, Commun. ACM 55.10, pp. 78-87, DOI: 10.1145/2347736.2347755, 2012) and are analyzed with regard to their advantages as well as their disadvantages. In addition, the methods may be analyzed with regard to their compatibility with the problem, in order to be able to make a method selection. In the following, exemplary methods are described (Sammut et al., Encyclopedia of Machine Learning, 1st ed., Springer Publishing Company, Incorporated, 2011). Table 2 summarizes advantages and disadvantages of the methods.
The goal of this method is to classify an object into a class into which similar objects of the training set have already been classified, whereby the class which appears most frequently is output as the result. In order to determine the proximity of the objects, a similarity measure, such as, for example, the Euclidean distance, is used. This method is well suited for considerably larger data sets, which are not present in the present example; this is also why this model is not taken into the comparative consideration.
In this method, a hyperplane is calculated which classifies objects into classes. For calculating the hyperplane, the margin around the class boundaries is to be maximized, which is why the Support Vector Machine is one of the ‘large margin classifiers’. An important assumption of this method is the linear separability of the data, which, however, can be addressed by expanding to higher-dimensional vector spaces by means of the kernel trick. Large data quantities, which in some embodiments are not present, are required for a classification with little overfitting.
The naive assumption is that the present variables are statistically independent of one another. This assumption does not hold in most cases. In many cases, Naive Bayes nonetheless reaches good results in that a high rate of correct classifications is attained, even if the attributes correlate slightly. Naive Bayes is characterized by a simple mode of operation and may thus be adopted into the model selection.
In logistic regression, a probability is calculated in order to analyze to what extent the manifestation of a dependent variable can be attributed to the values of independent variables.
Artificial neural networks are based on the biological structure of neurons in the brain. A simple neural network consists of neurons arranged in three layers: the input layer, the hidden layer and the output layer. Between the layers, the neurons are connected to one another via weights, which are optimized step by step in the training phase. Neural networks are currently used heavily in many areas and thus offer a large spectrum of model variations. There is a plurality of hyperparameters which must be determined from empirical values for the optimization of such networks. In some embodiments, for reasons of time efficiency, these hyperparameters are not determined.
Decision trees are ordered, layered trees which are characterized by their simple and easily comprehensible appearance. Nodes located close to the root are more significant for the classification than nodes located close to the leaves. In one embodiment, due to the fact that decision trees often suffer from overfitting, the methodology of the random forest is chosen for the model selection. This method consists of a plurality of decision trees, whereby each tree represents a subset of the variables.
A Bayes network is a directed graph which represents multivariate probability distributions. The nodes of the network correspond to random variables and the edges show the relationships between them. A possible application is in diagnostics, to illustrate the cause of symptoms of a disease. For developing a Bayes network, it is essential to be able to describe the dependencies between the variables in as much detail as possible. For the errors addressed in some embodiments, the generation of such a graph is not feasible.
In an alternative embodiment, models are initially considered theoretically and are analyzed with regard to their assumptions, whereupon the first implementation takes place, which may then be optimized by means of various methods.
In the first step, a binary problem with a linear model may be used, which includes three variables of the total quantity. The learning algorithms represented by the models may subsequently be trained with all classes and parameters, based on the actual problem. Finally, the model characteristics may be adapted to the data at hand by means of hyperparameters such as, for example, the number of decision trees in the case of Random Forest. This process is illustrated, using the example of the Random Forest model, in
This model, which may be used in an embodiment, is based on Bayes' theorem and may serve as a simple and quick method for classifying data. In such an embodiment, it is a precondition that the data present are statistically independent of one another and normally distributed. Because the method can determine the relative frequencies of the data in only a single pass, it is considered a simple as well as fast method.
According to Bayes' theorem, the following formula serves to calculate conditional probabilities:
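The formula referred to above, omitted in this text, is in standard notation Bayes' theorem (with y denoting the class and x the observed attributes):

```latex
P(y \mid x) = \frac{P(x \mid y)\, P(y)}{P(x)}
```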
When assuming that the attributes are present independently from one another, the Naive Bayes classifier can be defined as follows:
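Under this independence assumption, the classifier referred to above takes, in standard notation, the form of the maximum a posteriori rule over the n attributes xi:

```latex
\hat{y} = \underset{y}{\arg\max}\; P(y) \prod_{i=1}^{n} P(x_i \mid y)
```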
This function always predicts the most likely class y for an attribute xi with the help of the maximum a posteriori rule. The latter behaves similarly to the maximum likelihood method, but with knowledge of the a priori term. When metric data is present in the data set, a distribution function is required in order to calculate the conditional likelihoods P(xi|y). In an embodiment, Naive Bayes may fall back on the normal distribution (Berthold et al., Guide to Intelligent Data Analysis: How to Intelligently Make Sense of Real Data, 1st ed., Springer Publishing Company, Incorporated, 2010). Although a normal distribution is not present in the case of many CGM variables, Naive Bayes may be used because it can attain a high rate of correct classifications in spite of slight deviations from normal distribution.
P(xi|y) = N(xi; μ, σ²)
The mean μ and the variance σ² are calculated for each attribute xi and each class y.
Due to the fact that a smaller data set is sufficient for a good prediction in the case of this model, only four measurements may be used as input in one embodiment. In one embodiment, for a first consideration, a subset of the available parameters, consisting of A2, I90 and D, may be chosen.
Naive Bayes may be used to determine the probability of an error under the condition that I90 appears in one class.
In one embodiment, no statement is to be made about the type of error. So that a new labeling of the data does not need to take place, four test sites may be chosen which contain only fluidic errors. In this case, the error code 0 may be identified as no error and 1 may be identified as an error in general. Table 3 illustrates an excerpt of the input data set of one embodiment for Naive Bayes.
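A minimal sketch of such a binary Naive Bayes classification (illustrative Python using only the standard library; the feature vectors stand in for parameters such as A2, I90 and D, and all values are hypothetical):

```python
from statistics import NormalDist, mean, stdev

class GaussianNaiveBayes:
    """Minimal Gaussian Naive Bayes for the binary error question.

    Assumes, as in the text, that each feature is conditionally
    independent given the class and approximately normally distributed.
    """

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.prior = {}                # a priori class probabilities
        self.dist = {}                 # (class, feature index) -> NormalDist
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            self.prior[c] = len(rows) / len(X)
            for j in range(len(X[0])):
                col = [r[j] for r in rows]
                # mean and standard deviation per attribute and class
                self.dist[(c, j)] = NormalDist(mean(col), stdev(col))
        return self

    def predict(self, x):
        # maximum a posteriori: argmax_y P(y) * prod_i P(x_i | y)
        best, best_p = None, -1.0
        for c in self.classes:
            p = self.prior[c]
            for j, v in enumerate(x):
                p *= self.dist[(c, j)].pdf(v)
            if p > best_p:
                best, best_p = c, p
        return best
```

Class 0 corresponds to "no error" and class 1 to "error in general", mirroring the coding described above.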
As illustrated in Table 4, the model output may include the calculated a priori values for the classes. In a next step, the average value as well as the standard deviation of each variable for class 0 (no error) and for class 1 (error) may be calculated. They may serve to determine the distribution function of the variable based on the normal distribution.
The quality of the model may be evaluated by means of various parameters of the output. As illustrated in Table 5, in one embodiment, from this output, the accuracy, the sensitivity and the specificity may be of predominant significance.
In one embodiment, the accuracy allows for a first impression about the results of the models and may thus be used for assessing the quality.
In certain embodiments, in order to be able to assess the significance of the accuracy, the Kappa value may be used. The Kappa value is a statistical measure for the correspondence of two quality parameters, in this embodiment of the observed accuracy with the expected accuracy. After the observed accuracy and the expected accuracy are calculated, the Kappa value can be determined as follows:
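With p_o denoting the observed accuracy and p_e the expected accuracy, the standard form of this calculation is:

```latex
\kappa = \frac{p_o - p_e}{1 - p_e}
```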
Different approaches exist for the interpretation of the Kappa value. One such approach (Landis et al., The Measurement of Observer Agreement for Categorical Data, Biometrics 33, pp. 159-174, 1977) is summarized in Table 6:
In an embodiment, the positive predictive value, negative predictive value, the sensitivity and the specificity may be determined.
The positive predictive value specifies the percentage of the values which have been correctly classified as faulty among all of the results which have been classified as faulty (corresponding to the second row of the four-field table).
Accordingly, the negative predictive value specifies the percentage of the values which have been correctly classified as error-free among all of the results which have been classified as error-free (corresponding to the second row of the four-field table).
The sensitivity specifies the percentage of the objects which have been correctly classified as positive among the actually positive measurements:
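In the usual notation, with TP the number of true positives and FN the number of false negatives, this reads:

```latex
\text{Sensitivity} = \frac{TP}{TP + FN}
```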
The specificity specifies the percentage of the objects which have been correctly classified as negative among the measurements which are in fact negative.
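The quality measures discussed above can be sketched as follows (illustrative Python; the counts in the test are hypothetical, with "positive" meaning "classified as faulty"):

```python
def four_field_metrics(tp, fp, fn, tn):
    """Quality measures derived from a four-field (confusion) table.

    tp/fp/fn/tn: counts of true/false positives and negatives.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value
    npv = tn / (tn + fn)           # negative predictive value
    # Cohen's kappa: observed accuracy vs. accuracy expected by chance
    p_o = accuracy
    p_e = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (p_o - p_e) / (1 - p_e)
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "ppv": ppv, "npv": npv,
            "kappa": kappa}
```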
In an embodiment, the prediction of the binary model with the variables A2, D and I90 as well as the holistic model can be illustrated via a four-field table. In the embodiment illustrated in Table 7, the binary model has the greatest difficulties with the rate of false negatives, which is reflected in a sensitivity of
In an alternative embodiment, after Naive Bayes has been discussed in the context of a binary question, all error types and variables may then be examined in a second stage. The implementation may be based on all of the available data. If the accuracy as well as the Kappa value behave similarly in both model versions, this may reinforce the thesis that Naive Bayes can already reach good results with less data.
A logistic regression may be implemented as known as such (Backhaus et al., Multivariate Analysemethoden: Eine anwendungsorientierte Einführung, Springer, Berlin Heidelberg, 2015). Logistic regression may be used to determine a connection between the manifestation of an independent variable and a dependent variable. Normally, the binary dependent variable Y is coded as 0 or 1, i.e., 1: an error is present, 0: no error is present. A possible application of logistic regression in the context of CGM is determining whether current value, spline and sensitivity are connected to the manifestation of an error.
In an embodiment, logistic regression may be implemented using a generalized linear model (see, for example, Dobson, An Introduction to Generalized Linear Models, Second Edition. Chapman & Hall/CRC Texts in Statistical Science, Taylor & Francis, 2010). This may be advantageous as linear models are easily interpreted.
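As an illustrative sketch (plain Python with simple gradient descent rather than the iteratively reweighted least squares a statistics package would use; all data are hypothetical stand-ins for variables such as I90):

```python
import math

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit a logistic regression by stochastic gradient descent.

    y is the binary dependent variable coded as 0 (no error) or
    1 (error present), as in the text.
    """
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))   # predicted P(error)
            err = p - target                  # gradient of log-loss
            b -= lr * err
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w, b

def predict_proba(w, b, x):
    """Predicted probability that an error is present."""
    z = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))
```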
Table 8 shows a comparison of a simplified model of one embodiment using variables I90, A2 and D to a model using all variables. In this embodiment, accuracy for the model using all variables lies about 7% above accuracy for the simplified model, suggesting that the simplified model does not use the variables relevant for classification.
The relevant parameters may be identified using ‘backwards elimination’ (Sheather, A Modern Approach to Regression with R, Springer Science & Business Media, 2009) and the Akaike information criterion (Aho K et al., Model selection for ecologists: the worldviews of AIC and BIC, Ecology, 95: 631-636, 2014). These may be examined regarding the prediction error of the logistic regression.
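One elimination step of this procedure can be sketched as follows (illustrative Python; the candidate models, their log-likelihoods and parameter counts below are hypothetical):

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L); lower is better."""
    return 2 * n_params - 2 * log_likelihood

def best_candidate(models):
    """Backwards-elimination step: choose the candidate with lowest AIC.

    `models` maps a description of the dropped variable to a
    (log_likelihood, n_params) pair from the refitted regression.
    """
    return min(models, key=lambda name: aic(*models[name]))
```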
In an embodiment, sensitivity and specificity may be determined using a Receiver-Operating-Characteristic-Curve (ROC). In this case, an ideal curve rises vertically at the start, signifying a rate of error of 0%, with the rate of false positives only rising later. A curve along the diagonal hints at a random process.
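The points of such a ROC curve can be sketched as follows (illustrative Python; tied scores are not treated specially, and the scores in the test are hypothetical):

```python
def roc_points(scores, labels):
    """Compute (false positive rate, true positive rate) pairs.

    One threshold is placed just below each of the sorted scores;
    labels are 1 for 'error present' and 0 otherwise. A curve hugging
    the left and top edges indicates a good classifier; the diagonal
    indicates a random process.
    """
    pairs = sorted(zip(scores, labels), reverse=True)
    pos = sum(labels)
    neg = len(labels) - pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, label in pairs:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points
```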
In a multinomial logistic regression, the dependent variable Y may have more than two different values, making binary logistic regression a special case of multinomial logistic regression.
Random forest follows the principle of bagging, according to which the combination of a plurality of classification methods increases the accuracy of classification by training several classifiers with different samples of the data. In an embodiment, a random forest algorithm as known as such (Breiman, Random Forests, Mach. Learn. 45.1, pp. 5-32, DOI: 10.1023/A:1010933404324, 2001) may be used.
In such an embodiment, when a new element is fed to the decision trees, each tree determines a class as its result. In the next step, the resulting class is determined as the class proposed by the majority of trees.
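The majority vote described above can be sketched as follows (illustrative Python; the tree objects are hypothetical stand-ins for trained decision trees):

```python
from collections import Counter

def forest_predict(trees, x):
    """Majority vote over the class proposed by each decision tree.

    `trees` is a list of callables, each mapping a feature vector to a
    class label; the class named most often wins.
    """
    votes = [tree(x) for tree in trees]
    return Counter(votes).most_common(1)[0][0]
```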
Random forest may be optimized using, for example, the number of trees and/or the number of nodes in a tree. In
For this embodiment, the Kappa value suggests a trend according to which the accuracy of the multinomial logistic regression is less significant compared to the other models.
This assumption is confirmed by the prediction of the trained models for the test data set of this embodiment, which is illustrated in the four-field tables summarized in Table 9. The measurements of the test data set were chosen randomly in order to simulate an actual data input. Although a maxed-out current error is not present in the test data set, the multinomial logistic regression erroneously predicts this error type. However, the model has the greatest problems with the fluidics error, of which not a single case was classified correctly.
For this embodiment, the multinomial logistic regression thus achieves an accuracy of 66%, which is lower than Naive Bayes with 80% and random forest with 88% of correctly classified cases. A first possible cause could be the correlations between the parameters, which can lead to distorted estimates and to increased standard errors. However, Naive Bayes also requires that the parameters do not correlate, and this model reaches significantly better results for the embodiment shown. The reason could be that Naive Bayes can already reach a high accuracy with very small data quantities. With larger data quantities for the training of the models, the accuracy of Naive Bayes could strongly increase in spite of correlations of the parameters. However, the second assumption of the multinomial logistic regression, the ‘independence of irrelevant alternatives’, could be violated as well. This assumption specifies that the odds ratio of two error types is independent of all other response categories. It may be assumed, for example, that the selection of the result class “fluidics error” or “no error” is not influenced by the presence of “other errors.”
In an embodiment, the random forest provides the highest rate of correctly classified cases with 86%, whereby a plurality of the incorrectly classified cases are predicted as ‘no error’ even though a fluidics error is present. One reason why random forest represents the most successful model with regard to prediction in this embodiment could be that the tree structure makes it possible to arrange the parameters with respect to their interactions. On the other hand, random forest could be optimized without much effort, compared to the multinomial logistic regression and Naive Bayes, via the number of trees. This may be made possible by means of a graphic of the error over the number of decision trees, which shows the number of decision trees at which the error converges.
As an alternative to compressed data, uncompressed data may be used. For data exhibiting time resolution, a prediction may be achieved using neural networks such as recurrent networks. Recurrent neural networks have the advantage that no assumptions have to be made prior to the creation of the model.
While exemplary embodiments have been disclosed hereinabove, the present invention is not limited to the disclosed embodiments. Instead, this application is intended to cover any variations, uses, or adaptations of this disclosure using its general principles. Further, this application is intended to cover such departures from the present disclosure as come within known or customary practice in the art to which this invention pertains and which fall within the limits of the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
17178771.6 | Jun 2017 | EP | regional |
This application is a continuation of PCT/EP2018/067654, filed Jun. 29, 2018, which claims priority to EP 17 178 771.6, filed Jun. 29, 2017, both of which are hereby incorporated herein by reference in their entireties.
Number | Date | Country | |
---|---|---|---|
Parent | PCT/EP2018/067654 | Jun 2018 | US |
Child | 16724893 | US |