ELECTROCARDIOGRAM (ECG) SIGNAL CLASSIFICATION METHOD BASED ON CONTRASTIVE LEARNING AND MULTI-SCALE FEATURE EXTRACTION

Description

CROSS-REFERENCE TO THE RELATED APPLICATIONS

This application is based upon and claims priority to Chinese Patent Application No. 202311523667.5, filed on Nov. 16, 2023, the entire contents of which are incorporated herein by reference.

TECHNICAL FIELD

The present disclosure relates to the technical field of electrocardiogram (ECG) signal classification, and in particular to an ECG signal classification method based on contrastive learning and multi-scale feature extraction.

BACKGROUND

The heart's electrical activity is represented by electrical signals. The electrical signals are acquired by electrodes placed in specific positions on the body and are presented in the form of an electrocardiogram (ECG). Because it is non-invasive and available in real time, the ECG is widely used to classify the heart's electrical signals. An ECG signal classification method based on contrastive learning can acquire useful features from the characteristics of the data itself, without requiring pre-labeled data for pre-training. However, traditional network models still have shortcomings in ECG signal classification. For example, traditional network models cannot sufficiently process features from different channels and are unable to adaptively learn the correlation among channels. In addition, traditional network models are limited by a small receptive field and cannot fully capture the contextual information of the ECG signal.

SUMMARY

In order to overcome the above technical shortcomings, the present disclosure provides an electrocardiogram (ECG) signal classification method based on contrastive learning and multi-scale feature extraction, which can improve the performance and generalization ability of an ECG signal classification task.

The technical solution used in the present disclosure to resolve the technical problem thereof is as follows:

The ECG signal classification method based on contrastive learning and multi-scale feature extraction includes the following steps:

- a) dividing an ECG signal dataset into K batches, each batch including T signals forming an ECG signal set X, X={X₁, X₂, . . . , X_i, . . . , X_T}, where X_i denotes an i-th ECG signal, i∈{1, . . . ,T};
- b) performing data augmentation on the i-th ECG signal X_i to acquire a sample X_i^M1;
- c) constructing a squeeze-and-excitation—residual networks with next-generation aggregated transformations—context-aware network (SE-ResNeXt-CAN) network model, including a shallow feature extraction module, a first squeeze-and-excitation residual module (SERM), a second SERM, a first context-aware residual module (CARM), and a second CARM;
- d) inputting the sample X_i^M1 into the shallow feature extraction module of the SE-ResNeXt-CAN network model to acquire a feature ƒ¹;
- e) inputting the feature ƒ¹ into the first SERM of the SE-ResNeXt-CAN network model to acquire a feature ƒ²;
- f) inputting the feature ƒ² into the second SERM of the SE-ResNeXt-CAN network model to acquire a feature ƒ³;
- g) inputting the feature ƒ³ into the first CARM of the SE-ResNeXt-CAN network model to acquire a feature ƒ⁴;
- h) inputting the feature ƒ⁴ into the second CARM of the SE-ResNeXt-CAN network model to acquire a feature ƒ⁵;
- i) constructing a first multilayer perceptron, including a flattening layer, a first fully connected layer, and a second fully connected layer in sequence; and inputting the feature ƒ⁵ into the first multilayer perceptron to acquire a feature h_i;
- j) training the SE-ResNeXt-CAN network model to acquire an optimized SE-ResNeXt-CAN network model;
- k) dividing a new ECG signal dataset into K batches, each batch including T signals forming an ECG signal set Y, Y={Y₁, Y₂, . . . , Y_i, . . . , Y_T}, where Y_i denotes an i-th ECG signal, i∈{1, . . . ,T}; and
- n) inputting the i-th ECG signal Y_i into the optimized SE-ResNeXt-CAN network model to acquire a feature ƒ^5′; constructing a second multilayer perceptron, including a flattening layer, a first fully connected layer, a rectified linear unit (Relu) activation function layer, and a second fully connected layer in sequence; inputting the feature ƒ^5′ into the second multilayer perceptron to acquire a feature ƒ^5″; and inputting the feature ƒ^5″ into a softmax activation function to acquire a probability distribution Z_i of the i-th ECG signal Y_i, where the probability distribution Z_i is a classification result of the ECG signal.

Further, in the step a), the ECG signal dataset is a PTB-XL dataset, and the T signals in each batch are sampled at a rate of 500 Hz, with duration of 10 s.

Further, step b) includes the following steps:

- b-1) generating, by an np.random.normal function in Python, Gaussian noise with a mean of 0, a variance of 0.01, and the same size as the i-th ECG signal X_i; and
- b-2) adding the Gaussian noise to the i-th ECG signal X_i to acquire the sample X_i^M1.

Further, the step d) includes:

- d-1) constructing the shallow feature extraction module of the SE-ResNeXt-CAN network model, including a convolutional layer, a Batch_Norm layer, a Relu activation function, and a Dropout layer in sequence; and inputting the sample X_i^M1 into the shallow feature extraction module to acquire the feature ƒ¹.

Further, the step e) includes the following steps:

- e-1) constructing the first SERM of the SE-ResNeXt-CAN network model, including a residual module (RM) and an excitation and convolution module;
- e-2) constructing the RM of the first SERM, including a first branch and a second branch; constructing the first branch of the RM, including a first convolutional layer, a second convolutional layer, a first batch normalization (BN) layer, a Relu activation function, a third convolutional layer, a second BN layer, and a Dropout layer in sequence; inputting the feature ƒ¹ into the first branch of the RM to acquire a feature ƒ₁²; constructing the second branch of the RM, including a max pooling layer and a convolutional layer in sequence; inputting the feature ƒ¹ into the second branch of the RM to acquire a feature ƒ₂²; and adding the feature ƒ₁² and the feature ƒ₂² to acquire a feature rm¹;
- e-3) constructing the excitation and convolution module, including a first convolutional layer, a first SE module, a second convolutional layer, a second SE module, a third convolutional layer, and a third SE module in sequence; and inputting the feature ƒ¹ into the excitation and convolution module to acquire a feature ƒ₃²; and
- e-4) adding the feature rm¹ and the feature ƒ₃² to acquire the feature ƒ².

Further, the step f) includes:

- f-1) constructing the second SERM of the SE-ResNeXt-CAN network model, including a RM and an excitation and convolution module;
- f-2) constructing the RM of the second SERM, including a first branch and a second branch; constructing the first branch of the RM, including a first convolutional layer, a second convolutional layer, a first batch normalization (BN) layer, a Relu activation function, a third convolutional layer, a second BN layer, and a Dropout layer in sequence; inputting the feature ƒ² into the first branch of the RM to acquire a feature ƒ₁³; constructing the second branch of the RM, including a max pooling layer and a convolutional layer in sequence; inputting the feature ƒ² into the second branch of the RM to acquire a feature ƒ₂³; and adding the feature ƒ₁³ and the feature ƒ₂³ to acquire a feature rm²;
- f-3) constructing the excitation and convolution module, including a first convolutional layer, a first SE module, a second convolutional layer, a second SE module, a third convolutional layer, and a third SE module in sequence; and inputting the feature ƒ² into the excitation and convolution module to acquire a feature ƒ₃³; and
- f-4) adding the feature rm² and the feature ƒ₃³ to acquire the feature ƒ³.

Further, the step g) includes:

- g-1) constructing the first CARM of the SE-ResNeXt-CAN network model, including a RM, a dilated convolution module, and a channel attention module;
- g-2) constructing the RM of the first CARM, including a first branch and a second branch; constructing the first branch of the RM, including a first convolutional layer, a BN layer, a first Relu activation function, a second convolutional layer, a second Relu activation function, and a Dropout layer in sequence; inputting the feature ƒ³ into the first branch of the RM to acquire a feature ƒ₁⁴; constructing the second branch of the RM, including a convolutional layer; inputting the feature ƒ³ into the second branch of the RM to acquire a feature ƒ₂⁴; and adding the feature ƒ₁⁴ and the feature ƒ₂⁴ to acquire a feature rm³;
- g-3) constructing the dilated convolution module of the first CARM, including a first dilated convolution layer, a first convolutional layer, a second dilated convolution layer, a second convolutional layer, and a Relu activation function in sequence; inputting the feature ƒ³ into the dilated convolution module to acquire a feature ƒ₃⁴; and adding the feature rm³ and the feature ƒ₃⁴ to acquire a feature pcdc¹;
- g-4) constructing the channel attention module of the first CARM, including a convolutional layer, an average pooling layer, a first fully connected layer, and a second fully connected layer in sequence; and inputting the feature ƒ³ into the channel attention module to acquire a feature ƒ₄⁴; and
- g-5) multiplying the feature pcdc¹ by the feature ƒ₄⁴ to acquire the feature ƒ⁴.

Further, the step h) includes:

- h-1) constructing the second CARM of the SE-ResNeXt-CAN network model, including a RM, a dilated convolution module, and a channel attention module;
- h-2) constructing the RM of the second CARM, including a first branch and a second branch; constructing the first branch of the RM, including a first convolutional layer, a BN layer, a first Relu activation function, a second convolutional layer, a second Relu activation function, and a Dropout layer in sequence; inputting the feature ƒ⁴ into the first branch of the RM to acquire a feature ƒ₁⁵; constructing the second branch of the RM, including a convolutional layer; inputting the feature ƒ⁴ into the second branch of the RM to acquire a feature ƒ₂⁵; and adding the feature ƒ₁⁵ and the feature ƒ₂⁵ to acquire a feature rm⁴;
- h-3) constructing the dilated convolution module of the second CARM, including a first dilated convolution layer, a first convolutional layer, a second dilated convolution layer, a second convolutional layer, and a Relu activation function in sequence; inputting the feature ƒ⁴ into the dilated convolution module to acquire a feature ƒ₃⁵; and adding the feature rm⁴ and the feature ƒ₃⁵ to acquire a feature pcdc²;
- h-4) constructing the channel attention module of the second CARM, including a convolutional layer, an average pooling layer, a first fully connected layer, and a second fully connected layer in sequence; and inputting the feature ƒ⁴ into the channel attention module to acquire a feature ƒ₄⁵; and
- h-5) multiplying the feature pcdc² by the feature ƒ₄⁵ to acquire the feature ƒ⁵.

Further, the step j) includes: training, by an adaptive moment estimation (Adam) optimizer, the SE-ResNeXt-CAN network model through an NT-Xent loss function to acquire the optimized SE-ResNeXt-CAN network model.

Further, in the step k), the new ECG signal dataset is a China Physiological Signal Challenge (CPSC) dataset, and the T signals in each batch are sampled at a rate of 500 Hz, with duration of 10 s.

The present disclosure has the following beneficial effects. The SE-ResNeXt-CAN network model includes five modules. The first module is a shallow feature extraction module, which is configured to extract a shallow feature. The second module is a first SERM, including a RM and an excitation and convolution module in parallel. The third module is a second SERM, including a RM and an excitation and convolution module in parallel. The SE module adaptively learns the correlation between channels and assigns different weights to features of different channels, thereby improving the effectiveness and discriminability of feature representation and enhancing the expression ability of the model. The SE module and convolution are combined in series, which can reduce the total number of parameters in the network. Compared with parallel combination, the serial combination can utilize the parameters of the model efficiently while maintaining certain performance. The fourth module is a first CARM, including a dilated convolution module, a RM, and a channel attention module in parallel. The fifth module is a second CARM, including a dilated convolution module, a RM, and a channel attention module in parallel. The dilated convolution module can learn features of different receptive fields, thereby improving the diversity and richness of feature extraction. The parallel operation can simultaneously acquire feature information at multiple scales, thereby capturing feature representations at different levels. The RM can maintain the integrity and consistency of the feature and prevent problems such as gradient vanishing. The channel attention module adaptively learns the correlation between channels and assigns different weights to different channels. The design improves the effectiveness and discriminability of feature representation, enabling the model to focus on important feature channels, thereby enhancing the expression ability of the model. 
In the present disclosure, through the combination of these modules, the ECG signal classification method based on contrastive learning and multi-scale feature extraction can well capture key features of the ECG signal, thereby improving classification accuracy and robustness.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a flowchart of a method according to the present disclosure;

FIG. 2 is a structural diagram of a squeeze-and-excitation residual module (SERM) according to the present disclosure; and

FIG. 3 is a structural diagram of a context-aware residual module (CARM) according to the present disclosure.

FIG. 4 shows a raw ECG signal.

FIG. 5 shows an ECG signal after a random augmentation.

FIGS. 6A-6E show confusion matrices for each model.

FIG. 7 shows pre-training epoch ablation experiment results on PTB-XL dataset (AUROC).

FIG. 8 shows pre-training epoch ablation experiment results on CPSC dataset (AUROC).

DETAILED DESCRIPTION OF THE EMBODIMENTS

The present disclosure is further described below with reference to FIG. 1, FIG. 2, and FIG. 3.

An electrocardiogram (ECG) signal classification method based on contrastive learning and multi-scale feature extraction includes the following steps.

a) An ECG signal dataset is divided into K batches, each batch including T signals forming ECG signal set X, X={X₁, X₂, . . . , X_i, . . . , X_T}, where X_i denotes an i-th ECG signal, i∈{1, . . . ,T}.

b) Data augmentation is performed on the i-th ECG signal X_i to acquire sample X_i^M1.

c) A squeeze-and-excitation—residual networks with next-generation aggregated transformations—context-aware network (SE-ResNeXt-CAN) network model is constructed, including a shallow feature extraction module, a first squeeze-and-excitation residual module (SERM), a second SERM, a first context-aware residual module (CARM), and a second CARM. The combination of these five modules improves the feature extraction ability and enables hierarchical feature representation, thereby enhancing the performance and generalization ability of the model and improving the performance of downstream classification tasks.

d) The sample X_i^M1 is input into the shallow feature extraction module of the SE-ResNeXt-CAN network model to acquire feature ƒ¹.

e) The feature ƒ¹ is input into the first SERM of the SE-ResNeXt-CAN network model to acquire feature ƒ².

f) The feature ƒ² is input into the second SERM of the SE-ResNeXt-CAN network model to acquire feature ƒ³.

g) The feature ƒ³ is input into the first CARM of the SE-ResNeXt-CAN network model to acquire feature ƒ⁴.

h) The feature ƒ⁴ is input into the second CARM of the SE-ResNeXt-CAN network model to acquire feature ƒ⁵.

i) A first multilayer perceptron is constructed, including a flattening layer, a first fully connected layer, and a second fully connected layer in sequence; and the feature ƒ⁵ is input into the first multilayer perceptron to acquire feature h_i. The first fully connected layer in the first multilayer perceptron includes 64 neurons, and the second fully connected layer in the first multilayer perceptron includes 32 neurons. The feature h_i is in a dimension of 32.

j) The SE-ResNeXt-CAN network model is trained to acquire an optimized SE-ResNeXt-CAN network model.

k) A new ECG signal dataset is divided into K batches, each batch including T signals forming ECG signal set Y, Y={Y₁, Y₂, . . . , Y_i, . . . , Y_T}, where Y_i denotes an i-th ECG signal, i∈{1, . . . ,T}.

n) The i-th ECG signal Y_i is input into the optimized SE-ResNeXt-CAN network model to acquire feature ƒ^5′, where the feature ƒ^5′ includes 512 channels, and is in a dimension of 40. A second multilayer perceptron is constructed, including a flattening layer, a first fully connected layer, a rectified linear unit (Relu) activation function layer, and a second fully connected layer in sequence. The feature ƒ^5′ is input into the second multilayer perceptron to acquire feature ƒ^5″. The feature ƒ^5″ is input into a softmax activation function to acquire probability distribution Z_i of the i-th ECG signal Y_i, where the probability distribution Z_i is a classification result of the ECG signal. The first fully connected layer in the second multilayer perceptron includes 16 neurons, and the second fully connected layer in the second multilayer perceptron includes 9 neurons.
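The classification head of step n) can be sketched roughly as follows. This is only an illustration of the flatten, fully connected (16), Relu, fully connected (9), softmax sequence using NumPy. The function names `softmax` and `classify` are hypothetical, and the randomly initialized weights merely stand in for trained parameters, which the disclosure does not list.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D vector of logits."""
    e = np.exp(z - z.max())  # subtracting the max avoids overflow
    return e / e.sum()


def classify(f5, w1, b1, w2, b2):
    """Flatten -> FC(16) -> ReLU -> FC(9) -> softmax."""
    flat = f5.reshape(-1)                       # 512 x 40 -> 20480 values
    hidden = np.maximum(w1 @ flat + b1, 0.0)    # first FC layer + ReLU
    logits = w2 @ hidden + b2                   # second FC layer, 9 classes
    return softmax(logits)


rng = np.random.default_rng(0)
f5 = rng.standard_normal((512, 40))             # stand-in for the feature f^5'
# Random weights are placeholders for trained parameters (assumption).
w1 = rng.standard_normal((16, 512 * 40)) * 0.01
b1 = np.zeros(16)
w2 = rng.standard_normal((9, 16)) * 0.01
b2 = np.zeros(9)

z_i = classify(f5, w1, b1, w2, b2)              # probability distribution Z_i
print(z_i.shape, int(z_i.argmax()))             # predicted class = argmax
```

The index of the largest entry of Z_i would be taken as the predicted class in this sketch.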

The SE-ResNeXt-CAN network model includes a shallow feature extraction module, a first SERM, a second SERM, a first CARM, and a second CARM. The SERM includes a RM and an excitation and convolution module in parallel. The RM can easily learn identity mapping, which accelerates the convergence speed of the network. The SE module models the global information of the feature map, effectively mining the channel correlation between features and improving the feature expression ability. The SE module is combined with the convolutional layer to weight importance of the feature while extracting the feature, further enhancing the feature expression ability and improving model performance. The CARM includes a dilated convolution module, a RM, and a channel attention module in parallel. The dilated convolution module expands the receptive field and can extract rich context information. The RM helps network training and optimizes the deep structure. The channel attention module can improve the feature expression ability and generalization ability of the network.

In an embodiment of the present disclosure, in the step a), the ECG signal dataset is a PTB-XL dataset, and the T signals in each batch are sampled at a rate of 500 Hz, with duration of 10 s.
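As a rough illustration of step a), the sketch below computes the number of samples per lead implied by a 500 Hz, 10 s recording and splits signal indices into batches of T signals. The helper names `samples_per_signal` and `make_batches`, the toy dataset size, and T = 4 are assumptions for illustration; the disclosure does not fix K or T, nor does it say how a trailing partial batch is handled.

```python
def samples_per_signal(rate_hz: int, duration_s: int) -> int:
    """Number of samples per lead for a fixed-rate recording."""
    return rate_hz * duration_s


def make_batches(num_signals: int, batch_size: int) -> list[list[int]]:
    """Split signal indices into consecutive batches of batch_size signals.

    A trailing partial batch is dropped so that every batch holds exactly
    T signals (an assumption; the source does not specify this).
    """
    num_batches = num_signals // batch_size  # K
    return [
        list(range(k * batch_size, (k + 1) * batch_size))
        for k in range(num_batches)
    ]


length = samples_per_signal(rate_hz=500, duration_s=10)
batches = make_batches(num_signals=10, batch_size=4)  # toy values
print(length, len(batches), batches[0])  # 5000 2 [0, 1, 2, 3]
```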

In an embodiment of the present disclosure, the step b) is as follows.

b-1) Gaussian noise with a mean of 0, a variance of 0.01, and the same size as the i-th ECG signal X_i is generated by an np.random.normal function in Python.

b-2) The Gaussian noise is added to the i-th ECG signal X_i to acquire the sample X_i^M1.
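Steps b-1) and b-2) might be written along the following lines. Note that `np.random.normal` takes the standard deviation as its `scale` argument, so a variance of 0.01 corresponds to `scale=0.1`. The function name `add_gaussian_noise`, the all-zero test signal, and the 12-lead shape are assumptions made for this sketch.

```python
import numpy as np


def add_gaussian_noise(signal, variance=0.01):
    """Return the signal plus zero-mean Gaussian noise of the same shape."""
    std = np.sqrt(variance)  # np.random.normal expects a standard deviation
    noise = np.random.normal(loc=0.0, scale=std, size=signal.shape)
    return signal + noise


np.random.seed(0)                 # fixed seed only to make the demo repeatable
x_i = np.zeros((12, 5000))        # hypothetical 12-lead, 10 s, 500 Hz signal
x_i_m1 = add_gaussian_noise(x_i)  # augmented sample X_i^M1
print(x_i_m1.shape, float(x_i_m1.std()))
```

Because the demo signal is all zeros, the printed standard deviation reflects the noise alone and should be close to 0.1.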

The step d) is as follows.

d-1) The shallow feature extraction module of the SE-ResNeXt-CAN network model is constructed, including a convolutional layer, a Batch_Norm layer, a Relu activation function, and a Dropout layer in sequence. The sample X_i^M1 is input into the shallow feature extraction module to acquire the feature ƒ¹. In the shallow feature extraction module, the convolutional layer includes 32 channels and a convolutional kernel with a size of 3, a stride of 2, and a padding of 1. In the Dropout layer, the probability is 0.5. The feature ƒ¹ includes 32 channels, and is in a dimension of 2,500.

In an embodiment of the present disclosure, the step e) is as follows.

e-1) The first SERM of the SE-ResNeXt-CAN network model is constructed, including a RM and an excitation and convolution module.

e-2) The RM of the first SERM is constructed, including a first branch and a second branch. The first branch of the RM is constructed, including a first convolutional layer, a second convolutional layer, a first batch normalization (BN) layer, a Relu activation function, a third convolutional layer, a second BN layer, and a Dropout layer in sequence. The feature ƒ¹ is input into the first branch of the RM to acquire feature ƒ₁². The second branch of the RM is constructed, including a max pooling layer and a convolutional layer in sequence. The feature ƒ¹ is input into the second branch of the RM to acquire feature ƒ₂². The feature ƒ₁² and the feature ƒ₂² are added to acquire feature rm¹. In the first branch, the first convolutional layer includes 64 channels and a convolutional kernel with a size of 3, a stride of 2, and a padding of 1. The second convolutional layer includes 64 channels and a convolutional kernel with a size of 3, a stride of 2, and a padding of 1. The third convolutional layer includes 64 channels and a convolutional kernel with a size of 3, a stride of 1, and a padding of 1. In the first branch, the Dropout layer has a probability of 0.5. In the second branch, the convolutional layer includes 64 channels and a convolutional kernel with a size of 3, a stride of 2, and a padding of 1. In the second branch, the max pooling layer has a pooling window of 2.

e-3) The excitation and convolution module is constructed, including a first convolutional layer, a first SE module, a second convolutional layer, a second SE module, a third convolutional layer, and a third SE module in sequence. The feature ƒ¹ is input into the excitation and convolution module to acquire feature ƒ₃². In the excitation and convolution module, the first convolutional layer includes 64 channels and a convolutional kernel with a size of 3, a stride of 2, and a padding of 1. In the excitation and convolution module, the second convolutional layer includes 64 channels and a convolutional kernel with a size of 3, a stride of 2, and a padding of 1. In the excitation and convolution module, the third convolutional layer includes 64 channels and a convolutional kernel with a size of 3, a stride of 1, and a padding of 1.

e-4) The feature rm¹ and the feature ƒ₃² are added to acquire the feature ƒ². The feature ƒ² includes 64 channels, and is in a dimension of 625.
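The disclosure does not spell out the internals of the SE module used in step e-3). The sketch below follows the commonly used squeeze-and-excitation design (global average pooling, two fully connected layers, sigmoid gating) as an assumption. The function name `se_module`, the reduction ratio r = 16, the omission of biases, and the random weights are all illustrative choices, not values taken from the source.

```python
import numpy as np


def se_module(x, w1, w2):
    """Squeeze-and-excitation over a (channels, length) feature map.

    w1 has shape (channels // r, channels) and w2 has shape
    (channels, channels // r); biases are omitted for brevity.
    """
    squeezed = x.mean(axis=1)                         # squeeze: one value per channel
    hidden = np.maximum(w1 @ squeezed, 0.0)           # FC + ReLU (bottleneck)
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # FC + sigmoid, values in (0, 1)
    return x * weights[:, None]                       # rescale each channel


rng = np.random.default_rng(0)
channels, length, r = 64, 625, 16                     # r = 16 is an assumption
x = rng.standard_normal((channels, length))
w1 = rng.standard_normal((channels // r, channels)) * 0.1
w2 = rng.standard_normal((channels, channels // r)) * 0.1

y = se_module(x, w1, w2)
print(y.shape)  # the SE module leaves the feature map shape unchanged
```

Because each channel weight lies strictly between 0 and 1, the module can only attenuate channels relative to one another, which is how it expresses per-channel importance.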

In an embodiment of the present disclosure, the step f) is as follows.

f-1) The second SERM of the SE-ResNeXt-CAN network model is constructed, including a RM and an excitation and convolution module.

f-2) The RM of the second SERM is constructed, including a first branch and a second branch. The first branch of the RM is constructed, including a first convolutional layer, a second convolutional layer, a first batch normalization (BN) layer, a Relu activation function, a third convolutional layer, a second BN layer, and a Dropout layer in sequence. The feature ƒ² is input into the first branch of the RM to acquire a feature ƒ₁³. The second branch of the RM is constructed, including a max pooling layer and a convolutional layer in sequence. The feature ƒ² is input into the second branch of the RM to acquire feature ƒ₂³. The feature ƒ₁³ and the feature ƒ₂³ are added to acquire feature rm². In the first branch, the first convolutional layer includes 128 channels and a convolutional kernel with a size of 3, a stride of 2, and a padding of 1. The second convolutional layer includes 128 channels and a convolutional kernel with a size of 3, a stride of 2, and a padding of 1. The third convolutional layer includes 128 channels and a convolutional kernel with a size of 3, a stride of 1, and a padding of 1. In the first branch, the Dropout layer has a probability of 0.5. In the second branch, the convolutional layer includes 128 channels and a convolutional kernel with a size of 3, a stride of 2, and a padding of 2. In the second branch, the max pooling layer has a pooling window of 2.

f-3) The excitation and convolution module is constructed, including a first convolutional layer, a first SE module, a second convolutional layer, a second SE module, a third convolutional layer, and a third SE module in sequence. The feature ƒ² is input into the excitation and convolution module to acquire feature ƒ₃³. In the excitation and convolution module, the first convolutional layer includes 128 channels and a convolutional kernel with a size of 3, a stride of 2, and a padding of 1. In the excitation and convolution module, the second convolutional layer includes 128 channels and a convolutional kernel with a size of 3, a stride of 2, and a padding of 1. In the excitation and convolution module, the third convolutional layer includes 128 channels and a convolutional kernel with a size of 3, a stride of 1, and a padding of 1.

f-4) The feature rm² and the feature ƒ₃³ are added to acquire the feature ƒ³. The feature ƒ³ includes 128 channels, and is in a dimension of 157.
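The feature lengths stated for ƒ¹, ƒ², and ƒ³ can be checked with the usual output-length formula for a one-dimensional convolution. The sketch below is a small calculator, not part of the disclosed method. It assumes the floor convention used by common deep learning frameworks, a max pooling stride equal to the pooling window, and a 5,000-sample input (500 Hz for 10 s). The helper names are hypothetical.

```python
def conv_len(n, kernel, stride=1, padding=0, dilation=1):
    """Output length of a 1-D convolution (floor convention)."""
    return (n + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def pool_len(n, window):
    """Output length of pooling with stride equal to the window (assumption)."""
    return n // window


def serm_lengths(n, second_branch_padding):
    """Lengths produced by the two RM branches of a SERM for input length n."""
    first = conv_len(n, 3, stride=2, padding=1)       # first conv
    first = conv_len(first, 3, stride=2, padding=1)   # second conv
    first = conv_len(first, 3, stride=1, padding=1)   # third conv keeps length
    second = conv_len(pool_len(n, 2), 3, stride=2,
                      padding=second_branch_padding)  # max pool, then conv
    return first, second


f1_len = conv_len(500 * 10, 3, stride=2, padding=1)            # shallow module
f2_branches = serm_lengths(f1_len, second_branch_padding=1)    # first SERM
f3_branches = serm_lengths(f2_branches[0], second_branch_padding=2)  # second SERM
print(f1_len, f2_branches, f3_branches)  # 2500 (625, 625) (157, 157)
```

Under these assumptions the padding of 2 in the second branch of the second SERM is what makes the two branches agree: with a padding of 1, the pooled length of 312 would yield 156 rather than 157, and the branch outputs could not be added. The excitation and convolution module uses the same strides as the first branch, so its output length matches as well.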

In an embodiment of the present disclosure, the step g) is as follows.

g-1) The first CARM of the SE-ResNeXt-CAN network model is constructed, including a RM, a dilated convolution module, and a channel attention module.

g-2) The RM of the first CARM is constructed, including a first branch and a second branch. The first branch of the RM is constructed, including a first convolutional layer, a BN layer, a first Relu activation function, a second convolutional layer, a second Relu activation function, and a Dropout layer in sequence. The feature ƒ³ is input into the first branch of the RM to acquire feature ƒ₁⁴. The second branch of the RM is constructed, including a convolutional layer. The feature ƒ³ is input into the second branch of the RM to acquire feature ƒ₂⁴. The feature ƒ₁⁴ and the feature ƒ₂⁴ are added to acquire feature rm³. In the first branch, the first convolutional layer includes 256 channels and a convolutional kernel with a size of 3, a stride of 2, and a padding of 1. In the first branch, the second convolutional layer includes 256 channels and a convolutional kernel with a size of 3, a stride of 1, and a padding of 1. In the second branch, the convolutional layer includes 256 channels and a convolutional kernel with a size of 3, a stride of 2, and a padding of 1.

g-3) The dilated convolution module of the first CARM is constructed, including a first dilated convolution layer, a first convolutional layer, a second dilated convolution layer, a second convolutional layer, and a Relu activation function in sequence. The feature ƒ³ is input into the dilated convolution module to acquire feature ƒ₃⁴. The feature rm³ and the feature ƒ₃⁴ are added to acquire feature pcdc¹. In the dilated convolution module, the first dilated convolution layer includes 256 channels and a convolutional kernel with a size of 5, a dilation rate of 2, a stride of 1, and a padding of 4. In the dilated convolution module, the second dilated convolution layer includes 256 channels and a convolutional kernel with a size of 5, a dilation rate of 2, a stride of 1, and a padding of 4. In the dilated convolution module, the first convolutional layer includes 256 channels and a convolutional kernel with a size of 3, a stride of 2, and a padding of 1. In the dilated convolution module, the second convolutional layer includes 256 channels and a convolutional kernel with a size of 3, a stride of 1, and a padding of 1.

g-4) The channel attention module of the first CARM is constructed, including a convolutional layer, an average pooling layer, a first fully connected layer, and a second fully connected layer in sequence. The feature ƒ³ is input into the channel attention module to acquire feature ƒ₄⁴. In the channel attention module, the convolutional layer includes 256 channels and a convolutional kernel with a size of 1 and a stride of 1. In the channel attention module, the average pooling layer has a pooling window of 157. In the channel attention module, the first fully connected layer includes 64 neurons, and the second fully connected layer includes 1 neuron.

g-5) The feature pcdc¹ is multiplied by the feature ƒ₄⁴ to acquire the feature ƒ⁴. The feature ƒ⁴ includes 256 channels, and is in a dimension of 79.

In an embodiment of the present disclosure, the step h) is as follows.

h-1) The second CARM of the SE-ResNeXt-CAN network model is constructed, including a RM, a dilated convolution module, and a channel attention module.

h-2) The RM of the second CARM is constructed, including a first branch and a second branch. The first branch of the RM is constructed, including a first convolutional layer, a BN layer, a first Relu activation function, a second convolutional layer, a second Relu activation function, and a Dropout layer in sequence. The feature ƒ⁴ is input into the first branch of the RM to acquire feature ƒ₁⁵. The second branch of the RM is constructed, including a convolutional layer. The feature ƒ⁴ is input into the second branch of the RM to acquire feature ƒ₂⁵. The feature ƒ₁⁵ and the feature ƒ₂⁵ are added to acquire feature rm⁴. In the first branch, the first convolutional layer includes 512 channels and a convolutional kernel with a size of 3, a stride of 2, and a padding of 1. In the first branch, the second convolutional layer includes 512 channels and a convolutional kernel with a size of 3, a stride of 1, and a padding of 1. In the second branch, the convolutional layer includes 512 channels and a convolutional kernel with a size of 3, a stride of 2, and a padding of 1.

h-3) The dilated convolution module of the second CARM is constructed, including a first dilated convolution layer, a first convolutional layer, a second dilated convolution layer, a second convolutional layer, and a Relu activation function in sequence. The feature ƒ⁴ is input into the dilated convolution module to acquire feature ƒ₃⁵. The feature rm⁴ and the feature ƒ₃⁵ are added to acquire feature pcdc². In the dilated convolution module, the first dilated convolution layer includes 512 channels and a convolutional kernel with a size of 5, a dilation rate of 2, a stride of 1, and a padding of 4. In the dilated convolution module, the second dilated convolution layer includes 512 channels and a convolutional kernel with a size of 5, a dilation rate of 2, a stride of 1, and a padding of 4. In the dilated convolution module, the first convolutional layer includes 512 channels and a convolutional kernel with a size of 3, a stride of 2, and a padding of 1. In the dilated convolution module, the second convolutional layer includes 512 channels and a convolutional kernel with a size of 3, a stride of 1, and a padding of 1.

h-4) The channel attention module of the second CARM is constructed, including a convolutional layer, an average pooling layer, a first fully connected layer, and a second fully connected layer in sequence. The feature ƒ⁴ is input into the channel attention module to acquire feature ƒ₄⁵. In the channel attention module, the convolutional layer includes 512 channels and a convolutional kernel with a size of 1 and a stride of 1. In the channel attention module, the average pooling layer has a pooling window of 79. In the channel attention module, the first fully connected layer includes 128 neurons, and the second fully connected layer includes 1 neuron.

h-5) The feature pcdc² is multiplied by the feature ƒ₄⁵ to acquire the feature ƒ⁵. The feature ƒ⁵ includes 512 channels, and is in a dimension of 40.
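The lengths stated for the two CARMs (157 to 79, then 79 to 40) can be reproduced with the same kind of length arithmetic. In the sketch below, a dilated kernel of size 5 with a dilation rate of 2 spans 2·(5−1)+1 = 9 samples, so a padding of 4 leaves the length unchanged. The helper names are hypothetical, and the average pooling stride is assumed to equal its window.

```python
def conv_len(n, kernel, stride=1, padding=0, dilation=1):
    """Output length of a 1-D convolution (floor convention)."""
    return (n + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def carm_lengths(n, pool_window):
    """Lengths of the RM branches, the dilated path, and the pooled
    channel attention path of a CARM for input length n."""
    rm_first = conv_len(conv_len(n, 3, 2, 1), 3, 1, 1)     # stride-2 conv, stride-1 conv
    rm_second = conv_len(n, 3, 2, 1)                       # single stride-2 conv
    d = conv_len(n, 5, stride=1, padding=4, dilation=2)    # dilated: length preserved
    d = conv_len(d, 3, stride=2, padding=1)                # halves the length
    d = conv_len(d, 5, stride=1, padding=4, dilation=2)    # dilated: length preserved
    d = conv_len(d, 3, stride=1, padding=1)                # length preserved
    pooled = conv_len(n, 1) // pool_window                 # 1x1 conv, then average pool
    return rm_first, rm_second, d, pooled


effective_kernel = 2 * (5 - 1) + 1        # span of the dilated kernel: 9
first_carm = carm_lengths(157, pool_window=157)
second_carm = carm_lengths(79, pool_window=79)
print(effective_kernel, first_carm, second_carm)
# 9 (79, 79, 79, 1) (40, 40, 40, 1)
```

Under these assumptions the channel attention path collapses to a single value per channel, which is consistent with it acting as a per-channel weight that is multiplied onto pcdc¹ or pcdc².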

In an embodiment of the present disclosure, in the step j), the SE-ResNeXt-CAN network model is trained by an adaptive moment estimation (Adam) optimizer through an NT-Xent loss function to acquire the optimized SE-ResNeXt-CAN network model.
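For orientation, a NumPy sketch of the NT-Xent (normalized temperature-scaled cross entropy) loss is given below. It follows the standard formulation, in which each signal contributes two views that form a positive pair and all other samples in the batch serve as negatives. The disclosure names only the loss and the Adam optimizer, so the second view, the temperature of 0.5, the batch size, and the function name `nt_xent` are assumptions; the optimizer step itself is not shown.

```python
import numpy as np


def nt_xent(h_a, h_b, temperature=0.5):
    """NT-Xent loss; row i of h_a and row i of h_b form a positive pair."""
    t = h_a.shape[0]
    h = np.concatenate([h_a, h_b], axis=0)               # (2T, D)
    h = h / np.linalg.norm(h, axis=1, keepdims=True)     # unit norm -> cosine similarity
    sim = (h @ h.T) / temperature                        # (2T, 2T) scaled similarities
    np.fill_diagonal(sim, -np.inf)                       # a sample is not its own negative
    pos = np.concatenate([np.arange(t, 2 * t), np.arange(t)])  # positive index per row
    sim = sim - sim.max(axis=1, keepdims=True)           # stabilize the exponentials
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(2 * t), pos].mean())


rng = np.random.default_rng(0)
h_a = rng.standard_normal((8, 32))                       # T = 8 features h_i of dimension 32
aligned = nt_xent(h_a, h_a + 0.01 * rng.standard_normal((8, 32)))
unrelated = nt_xent(h_a, rng.standard_normal((8, 32)))
print(aligned < unrelated)  # matching views give the lower loss
```

The loss is small when the two views of the same signal map to nearby features and large when they do not, which is the signal the optimizer would minimize during pre-training.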

In an embodiment of the present disclosure, in the step k), the new ECG signal dataset is a China Physiological Signal Challenge (CPSC) dataset, and the T signals in each batch are sampled at a rate of 500 Hz, with duration of 10 s. There are 9 classes in the CPSC dataset.

(1) Contrastive learning relies on data augmentation. FIG. 4 shows a raw ECG signal, and FIG. 5 shows an ECG signal after a random augmentation. This comparative display is helpful to the understanding of the impact of data augmentation on ECG signal processing.

(2) In order to evaluate the performance of the proposed model when it is transferred from one task to another, a comparative experiment on transferability evaluation was conducted on two datasets. The experimental results are shown in Table 1. Table 1 shows Transferability Evaluation Results (AUROC).

(3) In order to evaluate the accuracy of the model in prediction, a comparative experiment of linear evaluation was conducted on two datasets. The experimental results are shown in Table 2. Table 2 shows Linear Evaluation Results (AUROC).

(4) In order to comprehensively evaluate the performance of the classification models, confusion matrices for the different models are provided in FIGS. 6A-6E. Because the values are floating-point numbers, the results shown may contain small errors.

(5) In order to verify the influence of the number of pre-training epochs on the experimental results, ablation experiments were conducted on the number of pre-training epochs. The experimental results are shown in FIGS. 7 and 8.
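
Item (1) above notes that contrastive learning relies on data augmentation, and the claims specify Gaussian noise with a mean of 0 and a variance of 0.01, generated by np.random.normal. One detail worth making explicit is that np.random.normal takes the standard deviation as its `scale` argument, so a variance of 0.01 corresponds to a scale of 0.1. The sketch below uses only the Python standard library so that it runs without NumPy. The function name `add_gaussian_noise` and the `seed` parameter are hypothetical.

```python
import math
import random

NOISE_MEAN = 0.0
NOISE_VARIANCE = 0.01
NOISE_STD = math.sqrt(NOISE_VARIANCE)  # standard deviation, approximately 0.1


def add_gaussian_noise(signal, seed=None):
    """Return a copy of `signal` with zero-mean Gaussian noise added.

    The noise has the same length as the signal. `seed` is a hypothetical
    convenience parameter for reproducibility.
    """
    rng = random.Random(seed)
    return [x + rng.gauss(NOISE_MEAN, NOISE_STD) for x in signal]


if __name__ == "__main__":
    raw = [0.0] * 5000
    augmented = add_gaussian_noise(raw, seed=0)
    print(len(augmented))
```

With NumPy, the corresponding call would be `np.random.normal(0.0, 0.1, size=len(signal))`, with the result added to the signal element by element.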

Finally, it should be noted that the above descriptions are only preferred embodiments of the present disclosure, and are not intended to limit the present disclosure. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments, or equivalently substitute some technical features thereof. Any modification, equivalent substitution, improvement, etc. within the spirit and principles of the present disclosure shall fall within the scope of protection of the present disclosure.

Claims

1. An electrocardiogram (ECG) signal classification method based on contrastive learning and multi-scale feature extraction, comprising the following steps:
a) dividing an ECG signal dataset into K batches, each of the K batches comprising T signals forming an ECG signal set X, X={X1, X2, . . . , Xi, . . . , XT}, wherein Xi denotes an i-th ECG signal, i∈{1, . . . ,T};
b) performing data augmentation on the i-th ECG signal Xi to acquire a sample XiM1;
c) constructing a squeeze-and-excitation—residual network with next-generation aggregated transformations—context-aware network (SE-ResNeXt-CAN) network model, comprising a shallow feature extraction module, a first squeeze-and-excitation residual module (SERM), a second SERM, a first context-aware residual module (CARM), and a second CARM, wherein the residual network is constructed such that the first SERM receives an output of the shallow feature extraction module as an input, the second SERM receives an output of the first SERM as an input, the first CARM receives an output of the second SERM as an input, and the second CARM receives an output of the first CARM as an input;
d) inputting the sample XiM1 into the shallow feature extraction module of the SE-ResNeXt-CAN network model to acquire a feature ƒ1;
e) inputting the feature ƒ1 into the first SERM of the SE-ResNeXt-CAN network model to acquire a feature ƒ2;
f) inputting the feature ƒ2 into the second SERM of the SE-ResNeXt-CAN network model to acquire a feature ƒ3;
g) inputting the feature ƒ3 into the first CARM of the SE-ResNeXt-CAN network model to acquire a feature ƒ4;
h) inputting the feature ƒ4 into the second CARM of the SE-ResNeXt-CAN network model to acquire a feature ƒ5;
i) constructing a first multilayer perceptron, comprising a flattening layer, a first fully connected layer, and a second fully connected layer in sequence; and inputting the feature ƒ5 into the first multilayer perceptron to acquire a feature hi;
j) training the SE-ResNeXt-CAN network model to acquire an optimized SE-ResNeXt-CAN network model;
k) dividing a new ECG signal dataset into K batches, each of the K batches comprising T signals forming an ECG signal set Y, Y={Y1, Y2, . . . , Yi, . . . , YN}, wherein Yi denotes an i-th ECG signal, i∈{1, . . . ,T}; and
n) inputting the i-th ECG signal Yi into the optimized SE-ResNeXt-CAN network model to acquire a feature ƒ5′; constructing a second multilayer perceptron, comprising a flattening layer, a first fully connected layer, a rectified linear unit (Relu) activation function layer, and a second fully connected layer in sequence; inputting the feature ƒ5′ into the second multilayer perceptron to acquire a feature ƒ5″; and inputting the feature ƒ5″ into a softmax activation function to acquire a probability distribution Zi of the i-th ECG signal Yi, wherein the probability distribution Zi is a classification result of the ECG signal;
wherein the step d) comprises:
d-1) constructing the shallow feature extraction module of the SE-ResNeXt-CAN network model, comprising a convolutional layer, a Batch Norm layer, a Relu activation function, and a Dropout layer in sequence; and inputting the sample XiM1 into the shallow feature extraction module to acquire the feature ƒ1;
wherein the step e) comprises:
e-1) constructing the first SERM of the SE-ResNeXt-CAN network model, comprising a residual module (RM) and an excitation and convolution module;
e-2) constructing the RM of the first SERM, comprising a first branch and a second branch; constructing the first branch of the RM, comprising a first convolutional layer, a second convolutional layer, a first batch normalization (BN) layer, a Relu activation function, a third convolutional layer, a second BN layer, and a Dropout layer in sequence; inputting the feature ƒ1 into the first branch of the RM to acquire a feature ƒ12; constructing the second branch of the RM, comprising a max pooling layer and a convolutional layer in sequence; inputting the feature ƒ1 into the second branch of the RM to acquire a feature ƒ22; and adding the feature ƒ12 and the feature ƒ22 to acquire a feature rm1;
e-3) constructing the excitation and convolution module, comprising a first convolutional layer, a first SE module, a second convolutional layer, a second SE module, a third convolutional layer, and a third SE module in sequence; and inputting the feature ƒ1 into the excitation and convolution module to acquire a feature ƒ32; and
e-4) adding the feature rm1 and the feature ƒ32 to acquire the feature ƒ2;
wherein the step f) comprises:
f-1) constructing the second SERM of the SE-ResNeXt-CAN network model, comprising a RM and an excitation and convolution module;
f-2) constructing the RM of the second SERM, comprising a first branch and a second branch; constructing the first branch of the RM, comprising a first convolutional layer, a second convolutional layer, a first BN layer, a Relu activation function, a third convolutional layer, a second BN layer, and a Dropout layer in sequence; inputting the feature ƒ2 into the first branch of the RM to acquire a feature ƒ13; constructing the second branch of the RM, comprising a max pooling layer and a convolutional layer in sequence; inputting the feature ƒ2 into the second branch of the RM to acquire a feature ƒ23; and adding the feature ƒ13 and the feature ƒ23 to acquire a feature rm2;
f-3) constructing the excitation and convolution module, comprising a first convolutional layer, a first SE module, a second convolutional layer, a second SE module, a third convolutional layer, and a third SE module in sequence; and inputting the feature ƒ2 into the excitation and convolution module to acquire a feature ƒ33; and
f-4) adding the feature rm2 and the feature ƒ33 to acquire the feature ƒ3;
wherein the step g) comprises:
g-1) constructing the first CARM of the SE-ResNeXt-CAN network model, comprising a RM, a dilated convolution module, and a channel attention module;
g-2) constructing the RM of the first CARM, comprising a first branch and a second branch; constructing the first branch of the RM, comprising a first convolutional layer, a BN layer, a first Relu activation function, a second convolutional layer, a second Relu activation function, and a Dropout layer in sequence; inputting the feature ƒ3 into the first branch of the RM to acquire a feature ƒ14; constructing the second branch of the RM, comprising a convolutional layer; inputting the feature ƒ3 into the second branch of the RM to acquire a feature ƒ24; adding the feature ƒ14 and the feature ƒ24 to acquire a feature rm3;
g-3) constructing the dilated convolution module of the first CARM, comprising a first dilated convolution layer, a first convolutional layer, a second dilated convolution layer, a second convolutional layer, and a Relu activation function in sequence; inputting the feature ƒ3 into the dilated convolution module to acquire a feature ƒ34; and adding the feature rm3 and the feature ƒ34 to acquire a feature pcdc1;
g-4) constructing the channel attention module of the first CARM, comprising a convolutional layer, an average pooling layer, a first fully connected layer, and a second fully connected layer in sequence; and inputting the feature ƒ3 into the channel attention module to acquire a feature ƒ44; and
g-5) multiplying the feature pcdc1 by the feature ƒ44 to acquire the feature ƒ4;
wherein the step h) comprises:
h-1) constructing the second CARM of the SE-ResNeXt-CAN network model, comprising a RM, a dilated convolution module, and a channel attention module;
h-2) constructing the RM of the second CARM, comprising a first branch and a second branch; constructing the first branch of the RM, comprising a first convolutional layer, a BN layer, a first Relu activation function, a second convolutional layer, a second Relu activation function, and a Dropout layer in sequence; inputting the feature ƒ4 into the first branch of the RM to acquire a feature ƒ15; constructing the second branch of the RM, comprising a convolutional layer; inputting the feature ƒ4 into the second branch of the RM to acquire a feature ƒ25; adding the feature ƒ15 and the feature ƒ25 to acquire a feature rm4;
h-3) constructing the dilated convolution module of the second CARM, comprising a first dilated convolution layer, a first convolutional layer, a second dilated convolution layer, a second convolutional layer, and a Relu activation function in sequence; inputting the feature ƒ4 into the dilated convolution module to acquire a feature ƒ35; and adding the feature rm4 and the feature ƒ35 to acquire a feature pcdc2;
h-4) constructing the channel attention module of the second CARM, comprising a convolutional layer, an average pooling layer, a first fully connected layer, and a second fully connected layer in sequence; and inputting the feature ƒ4 into the channel attention module to acquire a feature ƒ45; and
h-5) multiplying the feature pcdc2 by the feature ƒ45 to acquire the feature ƒ5.
2. The ECG signal classification method based on the contrastive learning and the multi-scale feature extraction according to claim 1, wherein in the step a), the ECG signal dataset is a PTB-XL dataset, and the T signals in each of the K batches are sampled at a rate of 500 Hz, with a duration of 10 s.
3. The ECG signal classification method based on the contrastive learning and the multi-scale feature extraction according to claim 1, wherein the step b) comprises: b-1) generating, by an np.random.normal function in Python, Gaussian noise with a mean of 0, a variance of 0.01, and the same size as the i-th ECG signal Xi; and b-2) adding the Gaussian noise to the i-th ECG signal Xi to acquire the sample XiM1.
4. (canceled)
5. (canceled)
6. (canceled)
7. (canceled)
8. (canceled)
9. The ECG signal classification method based on the contrastive learning and the multi-scale feature extraction according to claim 1, wherein the step j) comprises: training, by an adaptive moment estimation (Adam) optimizer, the SE-ResNeXt-CAN network model through an NT-Xent loss function to acquire the optimized SE-ResNeXt-CAN network model.
10. The ECG signal classification method based on the contrastive learning and the multi-scale feature extraction according to claim 1, wherein in the step k), the new ECG signal dataset is a China Physiological Signal Challenge (CPSC) dataset, and the signals in each of the batches are sampled at a rate of 500 Hz, with a duration of 10 s.

Priority Claims (1)

Number	Date	Country	Kind
2023115236675	Nov 2023	CN	national
