The disclosure relates to interpretation of a medical dataset, and more particularly to using a neural network model to interpret a 12-lead ECG dataset.
Machine-learning technology has been widely used in image interpretation, speech recognition, item matching, and the presentation of relevant results during a search. Deep learning, which is one type of machine learning, involves an artificial neural network (ANN) with multilayer representation learning. In automatic interpretation of cardiovascular images, deep learning has been developed to interpret the results of electrocardiography (ECG), echocardiography, coronary computed tomography, and single-photon emission computed tomography for the evaluation of myocardial perfusion.
Electrocardiography produces an electrocardiogram, which is a graph of voltage versus time of the electrical activity of the heart recorded using electrodes placed on the skin, and is widely used to detect various heart diseases, including rhythm disorders, conduction abnormalities, and myocardial ischemia or infarction.
In an article by Mathews, S. M., Kambhamettu, C. and Barner, K. E., “A novel application of deep learning for single-lead ECG classification,” Comput Biol Med. 99, 53-62 (2018), a machine-learning based method was proposed to detect and classify different types of cardiac arrhythmias using a single-lead ECG dataset. In an article by Hannun, A. Y. et al., “Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network,” Nat Med. 25, 65-69 (2019), a cardiologist-level arrhythmia detection system was developed for diagnosing twelve different types of cardiac rhythms based on a single-lead ECG dataset.
However, the studies in the abovementioned articles are both based on single-lead ECG records, which provide limited information and thereby limit diagnostic accuracy.
Therefore, an object of the disclosure is to provide a method that can alleviate at least one of the drawbacks of the prior art.
According to the disclosure, a method for building a heart rhythm classification model that is used to classify a heart rhythm is provided. The method is implemented by a computer device, and includes steps of: A) providing a neural network model that includes first to Mth bidirectional long short-term memory (LSTM) layers, M being a positive integer greater than one, wherein each of the first to Mth bidirectional LSTM layers includes a first set of LSTM neurons for forward data input and a second set of LSTM neurons for reverse data input, and wherein, for an mth bidirectional LSTM layer where m is an arbitrary one of integers from two to M, each of the LSTM neurons is connected to all of the LSTM neurons of an (m−1)th bidirectional LSTM layer for receiving outputs thereof; B) receiving a plurality of 12-lead electrocardiogram (ECG) datasets, wherein each of the 12-lead ECG datasets is acquired by performing 12-lead electrocardiography on a respective person, corresponds to one of a plurality of predetermined class labels that respectively correspond to a plurality of predetermined heart conditions, and includes first to Nth data points that are ordered according to a time sequence in which the first to Nth data points are obtained, wherein N is a positive integer; C) for each of the 12-lead ECG datasets, using the neural network model to generate a classification result that indicates, for each of the predetermined heart conditions, whether the 12-lead ECG dataset corresponds to the predetermined heart condition, wherein step C) includes: feeding the 12-lead ECG dataset into the first set of LSTM neurons of the first bidirectional LSTM layer in a forward sequence from the first data point to the Nth data point of the 12-lead ECG dataset, and feeding the 12-lead ECG dataset into the second set of LSTM neurons of the first bidirectional LSTM layer in a reverse sequence from the Nth data point to the first data point of the 12-lead ECG dataset; D) acquiring classification accuracies respectively for the predetermined heart conditions based on the classification results respectively obtained for the 12-lead ECG datasets and the predetermined class labels; E) when any one of the classification accuracies acquired in step D) is lower than a respective predetermined first threshold, adjusting parameters of the neural network model, and repeating steps C) and D) using the neural network model thus adjusted; and F) making the neural network model serve as the heart rhythm classification model when each of the classification accuracies acquired in step D) is equal to or higher than the respective predetermined first threshold.
Other features and advantages of the disclosure will become apparent in the following detailed description of the embodiment(s) with reference to the accompanying drawings, of which:
Before the disclosure is described in greater detail, it should be noted that where considered appropriate, reference numerals or terminal portions of reference numerals have been repeated among the figures to indicate corresponding or analogous elements, which may optionally have similar characteristics.
For the first bidirectional LSTM layer, the inputted 12-lead ECG dataset serves as the input dataset for the LSTM neurons thereof. The 12-lead ECG dataset includes a plurality of ECG data points (i.e., input data points of the input dataset for the first bidirectional LSTM layer) that are ordered according to the time sequence in which the ECG data points are measured. For an mth bidirectional LSTM layer where m is an arbitrary one of integers from two to M (including two and M), each of the LSTM neurons is connected to all of the LSTM neurons of an (m−1)th bidirectional LSTM layer for receiving the output datasets thereof. Accordingly, the output datasets of the LSTM neurons of the (m−1)th bidirectional LSTM layer serve as the input datasets for each of the LSTM neurons of the mth bidirectional LSTM layer.
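The layer-to-layer connectivity described above may be illustrated by the following minimal sketch, written in plain Python. It is not the implementation of the disclosure. The gate equations are the commonly used LSTM formulation, and the class and function names (`LSTMNeuronSet`, `BidirectionalLayer`, `build_stack`, `run_stack`), the random weight initialization, the number of neurons per set, and the re-alignment of the reverse outputs to time order before they are passed to the next layer are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMNeuronSet:
    """One set of LSTM neurons that reads a sequence in the order given."""

    def __init__(self, input_size, hidden, rng):
        cols = input_size + hidden + 1  # inputs, previous outputs, bias term
        # One weight matrix per gate: input (i), forget (f), candidate (g), output (o).
        self.w = {
            gate: [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(hidden)]
            for gate in "ifgo"
        }
        self.hidden = hidden

    def run(self, sequence):
        h = [0.0] * self.hidden  # outputs of the neurons at the previous step
        c = [0.0] * self.hidden  # cell states of the neurons
        outputs = []
        for x in sequence:
            z = list(x) + h + [1.0]
            pre = {
                gate: [sum(w * v for w, v in zip(row, z)) for row in self.w[gate]]
                for gate in "ifgo"
            }
            c = [
                sigmoid(pf) * c_prev + sigmoid(pi) * math.tanh(pg)
                for pi, pf, pg, c_prev in zip(pre["i"], pre["f"], pre["g"], c)
            ]
            h = [sigmoid(po) * math.tanh(c_t) for po, c_t in zip(pre["o"], c)]
            outputs.append(h)
        return outputs


class BidirectionalLayer:
    """A first set fed in forward order and a second set fed in reverse order."""

    def __init__(self, input_size, hidden, rng):
        self.first_set = LSTMNeuronSet(input_size, hidden, rng)
        self.second_set = LSTMNeuronSet(input_size, hidden, rng)

    def run(self, sequence):
        fwd = self.first_set.run(sequence)
        # Feed last-to-first, then re-align to time order (an assumption).
        rev = self.second_set.run(sequence[::-1])[::-1]
        # Every neuron of the next layer sees the outputs of both sets.
        return [f + r for f, r in zip(fwd, rev)]


def build_stack(num_layers, hidden, rng, input_size=12):
    layers, size = [], input_size
    for _ in range(num_layers):
        layers.append(BidirectionalLayer(size, hidden, rng))
        size = 2 * hidden  # the next layer receives both sets' outputs
    return layers


def run_stack(layers, sequence):
    for layer in layers:
        sequence = layer.run(sequence)
    return sequence
```

With M = 2 layers and four neurons per set, `run_stack(build_stack(2, 4, random.Random(0)), ecg_points)` would return one list of eight output values per time step, i.e., one output dataset per LSTM neuron of the Mth layer.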
The embodiment of the method for building the heart rhythm classification model is implemented by a computer device that stores the neural network model or that is communicatively connected to a system (e.g., a cloud system) storing the neural network model which is to be executed by the computer device. Referring to
In step 31, the computer device receives a plurality of 12-lead electrocardiogram (ECG) datasets. Each of the 12-lead ECG datasets is acquired by performing 12-lead electrocardiography on a respective person for a predetermined period of time, such as 10 seconds, and is provided with a predetermined class label that corresponds to at least one of the predetermined heart conditions. For each of the 12-lead ECG datasets, the predetermined class label is given via a consensus of multiple board-certified electrophysiologists, and serves as a gold standard for verifying a classification result of the neural network model 1. Each ECG data point may be a vector that indicates magnitudes of the heart's electrical potential measured respectively from twelve leads at a corresponding time point, where the twelve leads include the three standard limb leads (I, II and III), three augmented limb leads (aVR, aVL and aVF) and six precordial leads (V1, V2, V3, V4, V5 and V6).
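As a purely illustrative aid, one possible in-memory representation of a labeled 12-lead ECG dataset is sketched below in Python. The class name `EcgDataset`, its field names, and the encoding of the class label as a set of heart-condition indices are hypothetical and are not taken from the disclosure; only the twelve lead names and the requirement that the label correspond to at least one heart condition come from the description above.

```python
from dataclasses import dataclass

# The twelve leads, in the order listed in the description.
LEADS = ("I", "II", "III", "aVR", "aVL", "aVF",
         "V1", "V2", "V3", "V4", "V5", "V6")


@dataclass
class EcgDataset:
    points: list      # N data points in time order, each holding 12 lead values
    label: frozenset  # indices of the heart conditions named by the class label

    def __post_init__(self):
        for n, point in enumerate(self.points):
            if len(point) != len(LEADS):
                raise ValueError(
                    f"data point {n} has {len(point)} values, expected {len(LEADS)}"
                )
        if not self.label:
            raise ValueError("class label must name at least one heart condition")

    def lead(self, name):
        """Return the time series measured from a single lead."""
        idx = LEADS.index(name)
        return [point[idx] for point in self.points]
```

For example, `EcgDataset(points, frozenset({0})).lead("V1")` would return the V1 trace of a recording labeled with the condition at index 0.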
Referring to
In each of the bidirectional LSTM layers, bidirectional data processing is performed, which means that the computer device makes the LSTM neurons in the first set receive the corresponding input dataset(s) in a forward sequence that is the same as the sequence the input data points of each of the input dataset(s) are ordered (i.e., from the first input data point to the last input data point in a single input dataset), and makes the LSTM neurons in the second set receive the corresponding input dataset(s) in a reverse sequence that is opposite to the sequence the input data points of each of the input dataset(s) are ordered (i.e., from the last input data point to the first input data point in a single input dataset). Referring to
In the max pooling layer 15, the computer device acquires a feature value for each of the LSTM neurons in the first set and the second set of the Mth bidirectional LSTM layer. In this embodiment, the computer device performs max pooling to acquire, for each of the LSTM neurons of the Mth bidirectional LSTM layer, the greatest one of the output data points of the output dataset of the LSTM neuron to be the feature value.
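The max pooling described above may be sketched as follows. The function name `max_pool_over_time` and the layout of the input (one list per time step, holding one output value per LSTM neuron) are assumptions made for illustration.

```python
def max_pool_over_time(outputs):
    """Return, for each LSTM neuron, the greatest of its output data points.

    `outputs` holds one list per time step; position j of each list is the
    output data point of neuron j at that time step.
    """
    if not outputs:
        raise ValueError("at least one time step is required")
    # zip(*outputs) regroups the values so that each tuple is one neuron's
    # output dataset over time.
    return [max(neuron_outputs) for neuron_outputs in zip(*outputs)]
```

For instance, `max_pool_over_time([[0.1, -0.5], [0.3, -0.2], [0.2, -0.9]])` yields one feature value per neuron, here `[0.3, -0.2]`.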
The fully connected layer 16 receives the feature values from the max pooling layer 15, and includes a plurality of fully connected neurons that respectively correspond to the predetermined heart conditions. Therefore, the fully connected layer 16 includes thirteen fully connected neurons that respectively correspond to the thirteen predetermined heart conditions in this embodiment. Each of the fully connected neurons includes a plurality of weights. In each of the fully connected neurons of the fully connected layer 16, the computer device performs an inner product operation on the feature values and the weights of the fully connected neuron to obtain a fully connected feature value that is related to the corresponding one of the predetermined heart conditions. It is noted that, in each of the fully connected neurons, the computer device may further perform a linear operation on the result of the inner product operation to obtain the fully connected feature value, but this disclosure is not limited in this respect.
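A minimal sketch of the operation performed in the fully connected layer is given below. The function name `fully_connected` is hypothetical, and the optional linear operation is interpreted here as the addition of a per-neuron bias, which is an assumption; the disclosure does not specify the form of that operation.

```python
def fully_connected(features, weights, biases=None):
    """Compute one fully connected feature value per heart condition.

    `weights[k]` holds the weights of the neuron for condition k, one weight
    per feature value. `biases[k]`, if given, is added to the inner product
    (assumed form of the optional linear operation).
    """
    if biases is None:
        biases = [0.0] * len(weights)
    values = []
    for neuron_weights, bias in zip(weights, biases):
        if len(neuron_weights) != len(features):
            raise ValueError("each neuron needs one weight per feature value")
        inner = sum(w * f for w, f in zip(neuron_weights, features))
        values.append(inner + bias)
    return values
```

With two feature values and two neurons, `fully_connected([1.0, 2.0], [[0.5, 0.25], [-1.0, 1.0]], [0.0, 0.5])` evaluates to `[1.0, 1.5]`.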
The sigmoid operation block 17 includes multiple sigmoid units that correspond to the predetermined heart conditions. Each of the sigmoid units is connected to a respective one of the fully connected neurons for receiving the fully connected feature value generated thereby. In each of the sigmoid units, the computer device applies a sigmoid function to the fully connected feature value that is obtained for the corresponding one of the fully connected neurons to obtain an index value that ranges between 0 and 1 for the corresponding one of the predetermined heart conditions. The sigmoid function is generally expressed by:
where K is a number of inputs. In this case, each of the sigmoid units is connected to only one of the fully connected neurons, so K=1, and the sigmoid function can be modified to be:
$$S(x) = \frac{1}{1 + e^{-x}}$$
where x is the fully connected feature value generated by the corresponding one of the fully connected neurons.
Then, the computer device generates, based on the index values obtained for the predetermined heart conditions, the classification result that indicates, for each of the predetermined heart conditions, whether the 12-lead ECG dataset corresponds to the predetermined heart condition. In practice, a plurality of comparison thresholds may be provided for the predetermined heart conditions, respectively. For each of the predetermined heart conditions, the computer device compares the corresponding one of the index values with the corresponding one of the comparison thresholds, determines that the 12-lead ECG dataset corresponds to the predetermined heart condition when the corresponding one of the index values is greater than the corresponding one of the comparison thresholds, and determines that the 12-lead ECG dataset does not correspond to the predetermined heart condition otherwise. By default, each of the comparison thresholds may be set to 0.5. In some cases, each of the comparison thresholds may be adjusted as desired, so the comparison thresholds may be different for different predetermined heart conditions.
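The conversion of fully connected feature values into a classification result may be sketched as follows. The function names `sigmoid` and `classify` are hypothetical; the sigmoid used is the standard single-input logistic function, written in a numerically stable form, and the default threshold of 0.5 follows the description above.

```python
import math


def sigmoid(x):
    """Standard logistic function, arranged to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def classify(fc_values, thresholds=None):
    """Return one True/False decision per predetermined heart condition.

    A condition is indicated only when its index value is strictly greater
    than its comparison threshold.
    """
    if thresholds is None:
        thresholds = [0.5] * len(fc_values)
    if len(thresholds) != len(fc_values):
        raise ValueError("one comparison threshold is needed per heart condition")
    return [sigmoid(x) > t for x, t in zip(fc_values, thresholds)]
```

Because each heart condition has its own sigmoid unit and threshold, several conditions may be indicated for the same 12-lead ECG dataset, or none at all.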
After generating the classification result for each of the 12-lead ECG datasets, in step 33, the computer device acquires classification accuracies respectively for the predetermined heart conditions based on the classification results respectively obtained for the 12-lead ECG datasets and the predetermined class labels. In detail, the computer device determines, for each of the predetermined heart conditions, whether the classification result generated for each of the 12-lead ECG datasets accurately indicates a correspondence between the 12-lead ECG dataset and the predetermined heart condition by comparing the classification result and the predetermined class label provided for the 12-lead ECG dataset, so as to determine the classification accuracies respectively for the predetermined heart conditions.
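One way to compute the classification accuracies of step 33 is sketched below. The disclosure does not give a formula, so this sketch assumes that the accuracy for a heart condition is the fraction of 12-lead ECG datasets whose classification result agrees with the predetermined class label for that condition; the function name `per_condition_accuracy` is hypothetical.

```python
def per_condition_accuracy(results, labels):
    """Return one accuracy per heart condition.

    `results[d][k]` is the classification result (True/False) of dataset d
    for condition k, and `labels[d][k]` is the corresponding value of the
    predetermined class label.
    """
    if not results or len(results) != len(labels):
        raise ValueError("results and labels must cover the same datasets")
    num_conditions = len(results[0])
    correct = [0] * num_conditions
    for result, label in zip(results, labels):
        for k in range(num_conditions):
            if result[k] == label[k]:
                correct[k] += 1
    return [count / len(results) for count in correct]
```

Other per-condition measures (for example, sensitivity or F1 score) could be substituted without changing the surrounding flow.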
In step 34, the computer device determines, for each of the predetermined heart conditions, whether the corresponding one of the classification accuracies is equal to or higher than a respective predetermined first threshold. Since the difficulty of identifying different heart conditions may vary, the predetermined heart conditions may correspond to different predetermined first thresholds, which can be defined by the model developer as desired. The flow goes to step 35 when every single one of the classification accuracies is equal to or higher than the respective predetermined first threshold, and goes to step 36 when any one of the classification accuracies is lower than the respective predetermined first threshold.
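The decision made in step 34 may be sketched as follows. The function names `failing_conditions` and `next_step` are hypothetical, and the return values 35 and 36 simply mirror the step numbers used in this description.

```python
def failing_conditions(accuracies, first_thresholds):
    """Return the indices of heart conditions whose accuracy is below threshold."""
    if len(accuracies) != len(first_thresholds):
        raise ValueError("one first threshold is needed per heart condition")
    return [
        k
        for k, (accuracy, threshold) in enumerate(zip(accuracies, first_thresholds))
        if accuracy < threshold
    ]


def next_step(accuracies, first_thresholds):
    """Return 35 when every accuracy meets its threshold, and 36 otherwise."""
    return 36 if failing_conditions(accuracies, first_thresholds) else 35
```

An accuracy exactly equal to its threshold counts as meeting it, so `next_step([0.95, 0.90], [0.90, 0.90])` returns 35, whereas `next_step([0.95, 0.85], [0.90, 0.90])` returns 36.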
In step 35, the computer device makes the neural network model serve as the heart rhythm classification model, which can be used to diagnose a heart condition of a person by inputting a 12-lead ECG dataset measured from the person into the heart rhythm classification model.
In step 36, the computer device adjusts parameters (i.e., weights of each neuron) of the neural network model 1, and the flow goes back to step 32 to repeat operations of steps 32 and 33 using the neural network model 1 thus adjusted. Adjustment can be made to parameters/weights of the bidirectional LSTM layers and/or the fully connected layer 16 based on conventional algorithms such as gradient descent, or on the experience of the model developer. In one embodiment, the computer device adjusts the weights of the fully connected neurons based on the classification accuracies. In detail, for one of the predetermined heart conditions of which the corresponding classification accuracy is lower than a predetermined second threshold, the adjustment made to the weights of one of the fully connected neurons that corresponds to said one of the predetermined heart conditions is greater than the adjustment made to the weights of one of the fully connected neurons that corresponds to one of the predetermined heart conditions of which the corresponding classification accuracy is equal to or higher than the predetermined second threshold. Taking the confusion matrices shown in
In summary, the embodiment of the method for building a heart rhythm classification model according to this disclosure uses 12-lead ECG datasets to classify heart rhythms. 12-lead ECG datasets contain more information than single-lead ECG datasets, so the classification accuracies can be enhanced. The embodiment further uses the bidirectional LSTM layers to make the neural network model or the heart rhythm classification model analyze 12-lead ECG datasets from two different aspects, so the model may obtain more features from the 12-lead ECG datasets, thereby further enhancing the classification accuracies for the predetermined heart conditions.
In the description above, for the purposes of explanation, numerous specific details have been set forth in order to provide a thorough understanding of the embodiment(s). It will be apparent, however, to one skilled in the art, that one or more other embodiments may be practiced without some of these specific details. It should also be appreciated that reference throughout this specification to “one embodiment,” “an embodiment,” an embodiment with an indication of an ordinal number and so forth means that a particular feature, structure, or characteristic may be included in the practice of the disclosure. It should be further appreciated that in the description, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of various inventive aspects, and that one or more features or specific details from one embodiment may be practiced together with one or more features or specific details from another embodiment, where appropriate, in the practice of the disclosure.
While the disclosure has been described in connection with what is (are) considered the exemplary embodiment(s), it is understood that this disclosure is not limited to the disclosed embodiment(s) but is intended to cover various arrangements included within the spirit and scope of the broadest interpretation so as to encompass all such modifications and equivalent arrangements.
This application claims priority of U.S. Provisional Patent Application No. 62/962,802, filed on Jan. 17, 2020.
Number | Date | Country
---|---|---
62962802 | Jan 2020 | US