The present disclosure relates to the technical field of medical image detection, and in particular, to a medical image auxiliary detection method using a Convolutional Block Attention Module (CBAM) mechanism-based residual network.
The key to controlling the epidemic is early detection, early isolation, and early treatment. It is crucial to assist doctors in quickly identifying COVID-19 patients. Currently, the main testing methods include nucleic acid testing, antigen testing, and antibody testing. Using medical images for detection has advantages such as convenience, high sensitivity, and repeatability. Chest medical imaging for COVID-19 diagnosis includes two major technologies: chest X-rays and computed tomography (CT) scan images. These imaging technologies provide important evidence for doctors in making diagnoses. Both chest X-rays and CT scan images of the lungs play crucial roles in early screening and diagnosis of lesions. However, the large number of patients and the rapid evolution of the disease pose a significant challenge to radiologists because of the substantial number of images generated during follow-up examinations. Especially in severely affected areas, rapidly screening and diagnosing a large number of suspected COVID-19 patients presents a huge challenge to radiologists. In recent years, numerous studies have focused on automatic identification and assisted diagnosis based on medical images. The recognition of medical images has become a hot topic and an entry point for deep learning as it extends from computer science into medicine. The recognition and detection of medical images based on deep learning not only alleviates the strain on medical resources but also helps avoid errors and missed diagnoses caused by human factors. Particularly during disease outbreaks, using computers to assist doctors in making diagnoses based on medical images significantly improves diagnostic efficiency and reduces the risk of infection for healthcare workers and the general public. Therefore, introducing artificial intelligence to assist in the detection of medical images offers benefits such as facilitating patient treatment, alleviating pressure on medical resources, and enhancing detection accuracy.
In conclusion, there have been related research reports on using chest X-ray imaging for COVID-19 detection. However, in the medical diagnostic environment, the large amount of data and the highly contagious nature of COVID-19 impose stricter requirements on identification speed and accuracy. For large-scale medical lung imaging systems, more efficient and precise image classification and visualization methods are still lacking.
Accordingly, it is necessary to provide a medical image auxiliary detection method using a CBAM mechanism-based residual network, to address the issue of low efficiency in medical image diagnosis.
To achieve the above objective, the present disclosure provides the following technical solutions:
A medical image auxiliary detection method using a CBAM mechanism-based residual network, including the following steps:
Preferably, the medical image is a chest X-ray (CXR) image of the lungs.
Preferably, step S1 specifically includes the following steps:
Preferably, step S2 specifically includes the following steps:
Preferably, step S3 specifically includes the following steps:
Preferably, a process of the channel attention mechanism is described as follows:
Global average pooling (AvgPool) and global maximum pooling (MaxPool) are performed on a width and a height of a network feature map; channel attention weights are obtained through a multi-layer perceptron (MLP); the obtained weights are summed element-wise; finally, the weights are normalized using a Sigmoid function, and are then multiplied channel-wise with the original feature map, with a formula as follows:
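In the standard CBAM formulation, which matches the process described above, this channel attention step can be written as follows, where $F$ is the input feature map, $W_0$ and $W_1$ are the shared MLP weights, $\sigma$ is the Sigmoid function, and $\otimes$ denotes channel-wise multiplication:

$$M_c(F)=\sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F))+\mathrm{MLP}(\mathrm{MaxPool}(F))\big)=\sigma\big(W_1(W_0(F_{avg}^c))+W_1(W_0(F_{max}^c))\big),\qquad F'=M_c(F)\otimes F$$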
Preferably, a process of the spatial attention mechanism is described as follows:
With input from the channel attention mechanism, global maximum pooling (MaxPool) and global average pooling (AvgPool) are performed on the feature map based on channels; then, the dimensionality is reduced to 1D through convolution operations, and attention features are generated through a Sigmoid function, with a formula as follows:
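Likewise, the standard CBAM spatial attention expression matching the above description is as follows, where $f^{7\times 7}$ denotes a convolution with a $7\times 7$ kernel (the kernel size used in the original CBAM design, assumed here) and $[\,;\,]$ denotes concatenation along the channel dimension:

$$M_s(F')=\sigma\big(f^{7\times 7}([\mathrm{AvgPool}(F');\mathrm{MaxPool}(F')])\big),\qquad F''=M_s(F')\otimes F'$$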
Preferably, said inserting the constructed CBAM attention mechanism into the Res2Net residual network structure specifically includes:
inserting the constructed CBAM attention mechanism into the last layer of each residual block of the Res2Net network.
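As an illustration, a simplified PyTorch-style sketch of this placement is shown below. The block body is a plain two-convolution residual unit rather than a full Res2Net bottleneck (the multi-scale splitting is omitted for brevity), and the cbam argument stands for any channel-plus-spatial attention module such as the one constructed in step S3; all names and hyperparameters are illustrative assumptions.

```python
# Hypothetical sketch: CBAM applied as the last operation inside a residual block.
# The two-convolution body is a simplification of the actual Res2Net bottleneck.
import torch
import torch.nn as nn


class ResidualBlockWithCBAM(nn.Module):
    def __init__(self, channels, cbam=None):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        # CBAM (channel + spatial attention) inserted as the last layer of the block.
        self.cbam = cbam if cbam is not None else nn.Identity()

    def forward(self, x):
        out = self.cbam(self.body(x))
        return torch.relu(out + x)  # residual (skip) connection


if __name__ == "__main__":
    block = ResidualBlockWithCBAM(64)  # nn.Identity stands in for the CBAM module here
    print(block(torch.randn(1, 64, 56, 56)).shape)  # -> torch.Size([1, 64, 56, 56])
```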
Preferably, said constructing the HS-block multi-level separable module specifically includes:
dividing the feature map into groups by channels, and performing cross-combination and convolution on different groups, which facilitates the extraction of abstract information.
Preferably, step S4 specifically includes:
extracting features from the model based on a Grad-CAM++ algorithm, plotting a heatmap, and overlaying the heatmap on an original image with 0.3 opacity.
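As an illustration, a minimal OpenCV-based sketch of the overlay step is shown below; the function name, the assumption that the input image is an 8-bit BGR array (e.g., loaded with cv2.imread), and the assumption that the heatmap is a 2-D array in [0, 1] produced by a Grad-CAM++ style method are illustrative, not taken from the disclosure.

```python
# Hypothetical sketch of overlaying a class-activation heatmap on the original image
# with 0.3 opacity, using OpenCV.
import cv2
import numpy as np


def overlay_heatmap(image_bgr: np.ndarray, heatmap: np.ndarray, alpha: float = 0.3) -> np.ndarray:
    # Resize the heatmap to the image size and map it to a colour scale.
    heatmap = cv2.resize(heatmap, (image_bgr.shape[1], image_bgr.shape[0]))
    heatmap = np.uint8(255 * np.clip(heatmap, 0.0, 1.0))
    colored = cv2.applyColorMap(heatmap, cv2.COLORMAP_JET)
    # Blend: heatmap at `alpha` opacity over the original image.
    return cv2.addWeighted(colored, alpha, image_bgr, 1.0 - alpha, 0)
```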
Preferably, the Grad-CAM++ algorithm is specifically as follows:
A score for a specific class in a feature map is derived from a dot product of weights and the feature map, with a formula as follows: $Y^c=\sum_k w_k^c \cdot \sum_i \sum_j A_{i,j}^k$, where $c$ represents a class, $(i, j)$ represents a position of a feature value in the feature map, $k$ represents a channel, $Y^c$ represents the contribution (score) for the class $c$, and $A$ represents the feature map; a corresponding heatmap formula is as follows: $L_{i,j}^c=\sum_k w_k^c \cdot A_{i,j}^k$, where $A_{i,j}^k$ represents a value at the position $(i, j)$ in the feature map of the channel $k$, $w_k^c$ represents a fully connected weight for the class $c$ regarding the channel $k$, and $L_{i,j}^c$ represents a contribution of the position $(i, j)$ in the feature map to the class $c$; the calculation of the weights uses gradients and a ReLU activation function for improvement, with a formula as follows: $w_k^c=\sum_i \sum_j \alpha_{i,j}^{kc} \cdot \mathrm{relu}\!\left(\frac{\partial Y^c}{\partial A_{i,j}^k}\right)$,
where $\alpha_{i,j}^{kc}$ represents a weighted coefficient of the pixel gradient for the class $c$ and the feature map $A^k$, $\mathrm{relu}(\cdot)$ represents the ReLU activation function, $A_{i,j}^k$ represents the value at the position $(i, j)$ in the feature map $A^k$, and $Y^c$ represents a differentiable class score that is a function of $A^k$.
Compared with the prior art, the present disclosure achieves the following beneficial effects:
The present disclosure provides an auxiliary detection method for COVID-19 lung images using a CBAM mechanism-based residual network. It implements a high-precision COVID-19 X-ray assisted diagnostic algorithm, optimizes traditional manual case screening solutions, and integrates the attention mechanism with the HS-block module to enhance inference accuracy, thereby meeting the demand for recognizing a large number of images in medical diagnosis.
To describe embodiments of the present disclosure or technical solutions in the prior art more clearly, the accompanying drawings required in the embodiments are briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of the present disclosure. Those of ordinary skill in the art can also obtain other accompanying drawings according to these accompanying drawings without creative efforts.
The technical solutions of the embodiments of the present disclosure are clearly and completely described below with reference to the drawings in the embodiments of the present disclosure. Apparently, the described embodiments are merely a part rather than all of the embodiments of the present disclosure. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present disclosure without creative efforts shall fall within the protection scope of the present disclosure.
To make the above objectives, features, and advantages of the present disclosure clearer and more comprehensible, the present disclosure will be further described in detail below with reference to the accompanying drawings and the specific embodiments.
Data used in the present disclosure comes from eight datasets hosted on three open-source websites: Kaggle, RSNA, and GitHub, as shown in the table below:
The present disclosure provides a medical image auxiliary detection method using a CBAM mechanism-based residual network. As shown in
The present disclosure uses a computer device to execute the foregoing steps. The computer device includes a memory and a processor, where the memory stores a computer program, and the computer program is executed by the processor to perform the steps of the medical image auxiliary detection method using a CBAM mechanism-based residual network.
Each step is described in detail below.
Specifically, step S1 includes the following steps:
Specifically, step S2 includes the following steps:
Specifically, step S3 includes the following steps:
Specifically, a process of the channel attention mechanism is described as follows:
Global average pooling (AvgPool) and global maximum pooling (MaxPool) are performed on a width and a height of a network feature map; channel attention weights are obtained through a multi-layer perceptron (MLP); the obtained weights are summed element-wise; finally, the weights are normalized using a Sigmoid function, and are then multiplied channel-wise with the original feature map, with a formula as follows:
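In line with the standard CBAM formulation given earlier, a hypothetical PyTorch sketch of this channel attention step is shown below; the reduction ratio (16) and the class name are illustrative assumptions rather than parameters stated in the disclosure.

```python
# Hypothetical PyTorch sketch of the channel attention step described above.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        # Shared MLP, implemented with 1x1 convolutions, applied to both pooled vectors.
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )

    def forward(self, x):
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))  # global average pooling branch
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))   # global max pooling branch
        weights = torch.sigmoid(avg + mx)                         # element-wise sum, then Sigmoid
        return weights * x                                        # channel-wise multiplication
```

Applied to a feature map of shape (N, C, H, W), the module returns a tensor of the same shape with each channel rescaled by its learned attention weight.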
A process of the spatial attention mechanism is described as follows:
With input from the channel attention mechanism, global maximum pooling (MaxPool) and global average pooling (AvgPool) are performed on the feature map based on channels; then, the dimensionality is reduced to 1D through convolution operations, and attention features are generated through a Sigmoid function (the block diagram is as shown in
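A hypothetical PyTorch sketch of this spatial attention step is shown below; the 7x7 convolution kernel follows the original CBAM design and is an assumption here, not a parameter stated in the disclosure.

```python
# Hypothetical PyTorch sketch of the spatial attention step described above.
import torch
import torch.nn as nn


class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x):
        # Channel-wise average and max pooling, concatenated into a 2-channel map.
        pooled = torch.cat([x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1)
        attention = torch.sigmoid(self.conv(pooled))  # single-channel spatial attention map
        return attention * x                          # element-wise reweighting of the input
```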
The step of constructing the HS-block multi-level separable module specifically includes:
dividing the feature map into groups by channels, and performing cross-combination and convolution on different groups, which facilitates the extraction of abstract information, where the corresponding structure is as shown in
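The disclosure specifies only the grouping, cross-combination, and convolution of channels; the sketch below is one common hierarchical-split realization of that idea, with the number of splits (4) and the 3x3 kernels as illustrative assumptions. Half of each group's convolved output is carried into the next group, so the total channel count is preserved.

```python
# Hypothetical sketch of a hierarchical-split (HS) style block: channels are divided
# into groups, each group is convolved, and half of each group's output is carried
# into the next group (cross-combination).
import torch
import torch.nn as nn


class HSBlock(nn.Module):
    def __init__(self, channels, splits=4):
        super().__init__()
        assert channels % splits == 0, "channels must be divisible by the number of splits"
        self.splits = splits
        w = channels // splits
        self.convs = nn.ModuleList()
        in_ch = w
        for _ in range(splits - 1):
            self.convs.append(nn.Sequential(
                nn.Conv2d(in_ch, in_ch, 3, padding=1, bias=False),
                nn.BatchNorm2d(in_ch),
                nn.ReLU(inplace=True),
            ))
            in_ch = w + in_ch // 2  # next group receives its own channels plus the carried half

    def forward(self, x):
        groups = list(torch.chunk(x, self.splits, dim=1))
        outputs = [groups[0]]  # the first group is passed through unchanged
        carry = None
        for i, conv in enumerate(self.convs):
            g = groups[i + 1] if carry is None else torch.cat([groups[i + 1], carry], dim=1)
            g = conv(g)
            if i < len(self.convs) - 1:
                half = g.size(1) // 2
                outputs.append(g[:, : g.size(1) - half])  # kept half goes to the output
                carry = g[:, g.size(1) - half:]           # carried half feeds the next group
            else:
                outputs.append(g)                         # the last group keeps its full output
        return torch.cat(outputs, dim=1)                  # total channel count is preserved


if __name__ == "__main__":
    print(HSBlock(64)(torch.randn(1, 64, 32, 32)).shape)  # -> torch.Size([1, 64, 32, 32])
```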
Specifically, step S4 includes: extracting features from the model based on the Grad-CAM++ algorithm, plotting a heatmap, and overlaying the heatmap on an original image with 0.3 opacity.
The Grad-CAM++ algorithm is specifically as follows:
A score for a specific class in a feature map is derived from a dot product of weights and the feature map, with a formula as follows: $Y^c=\sum_k w_k^c \cdot \sum_i \sum_j A_{i,j}^k$; a corresponding heatmap formula is as follows: $L_{i,j}^c=\sum_k w_k^c \cdot A_{i,j}^k$; and the calculation of the weights uses gradients and a ReLU activation function for improvement, with a formula as follows: $w_k^c=\sum_i \sum_j \alpha_{i,j}^{kc} \cdot \mathrm{relu}\!\left(\frac{\partial Y^c}{\partial A_{i,j}^k}\right)$, where the symbols are as defined above.
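As an illustration, a minimal NumPy sketch of how these formulas are commonly evaluated is given below. Here feature_maps ($A^k$) and grads ($\partial Y^c/\partial A^k$) are assumed to have already been extracted from the network for the target class, and the closed-form estimate of $\alpha$ is the one commonly used in Grad-CAM++ implementations, not necessarily the exact variant of the disclosure.

```python
# Hypothetical NumPy sketch of the Grad-CAM++ weighting scheme described above.
# feature_maps: A^k with shape (K, H, W); grads: dY^c/dA^k with the same shape.
import numpy as np


def grad_cam_pp(feature_maps: np.ndarray, grads: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    grads_2 = grads ** 2
    grads_3 = grads ** 3
    # Commonly used closed-form estimate of the alpha coefficients (per channel and position).
    denom = 2.0 * grads_2 + np.sum(feature_maps, axis=(1, 2), keepdims=True) * grads_3
    alpha = grads_2 / np.where(denom != 0.0, denom, eps)
    # w_k^c = sum_{i,j} alpha_{i,j}^{kc} * relu(dY^c / dA_{i,j}^k)
    weights = np.sum(alpha * np.maximum(grads, 0.0), axis=(1, 2))
    # L_{i,j}^c = sum_k w_k^c * A_{i,j}^k, with a ReLU and normalization for visualization.
    cam = np.maximum(np.tensordot(weights, feature_maps, axes=1), 0.0)
    return cam / (cam.max() + eps)
```

The returned map can then be resized and blended over the original image at 0.3 opacity as described in step S4.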
The present disclosure evaluates the detection performance of the current model using accuracy (Acc), recall, balanced F-score (F1 Score), sensitivity, specificity, and AUC. Accuracy indicates the correctness of predictions, where TP represents that a positive sample is predicted as a positive sample, TN represents that a negative sample is predicted as a negative sample, FP represents that a negative sample is predicted as a positive sample, and FN represents that a positive sample is predicted as a negative sample. Specificity indicates a proportion of correctly classified cases among all negative cases, measuring a recognition capability of the classifier for negative cases. The balanced F-score is defined as the harmonic mean of precision and recall. Recall represents a proportion of positive cases that the model can correctly predict among all actual positive cases. Sensitivity indicates a proportion of correctly classified cases among all positive cases, measuring a recognition capability of the classifier for positive cases. AUC equals the area under the ROC curve, which is plotted with the false positive rate on the horizontal axis and the true positive rate on the vertical axis.
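For reference, with the TP, TN, FP, and FN counts defined above, these metrics take the following standard forms:

$$\mathrm{Acc}=\frac{TP+TN}{TP+TN+FP+FN},\qquad \mathrm{Recall}=\mathrm{Sensitivity}=\frac{TP}{TP+FN},\qquad \mathrm{Specificity}=\frac{TN}{TN+FP},$$
$$\mathrm{Precision}=\frac{TP}{TP+FP},\qquad F_1=\frac{2\cdot\mathrm{Precision}\cdot\mathrm{Recall}}{\mathrm{Precision}+\mathrm{Recall}},\qquad \mathrm{AUC}=\int_0^1 \mathrm{TPR}\,d(\mathrm{FPR}).$$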
The changes in accuracy and loss during the training process are shown in
Each embodiment in the description is described in a progressive mode, each embodiment focuses on differences from other embodiments, and references can be made to each other for the same and similar parts between embodiments.
Specific examples are used herein for illustration of the principles and embodiments of the present disclosure. The description of the foregoing embodiments is used to help understand the method of the present disclosure and the core principles thereof. In addition, those of ordinary skill in the art can make various modifications in terms of specific embodiments and scope of application in accordance with the teachings of the present disclosure. In conclusion, the content of the description shall not be construed as limitations to the present disclosure.
Number | Date | Country | Kind |
---|---|---|---
202210868339.8 | Jul 2022 | CN | national |
The present disclosure is a national stage application of International Patent Application No. PCT/CN2022/139162, filed on Dec. 25, 2022, which claims the benefit and priority of Chinese Patent Application No. 202210868339.8, filed with the China National Intellectual Property Administration (CNIPA) on Jul. 22, 2022, and entitled “MEDICAL IMAGE AUXILIARY DETECTION METHOD USING CBAM MECHANISM-BASED RESIDUAL NETWORK”, which is incorporated herein by reference in its entirety.
Filing Document | Filing Date | Country | Kind |
---|---|---|---
PCT/CN2022/139162 | 12/15/2022 | WO |