Anomaly Detection in Chest X-rays Based on Dual-Attention Mechanism and Multi-Scale Feature Fusion
Abstract
1. Introduction
2. Related Work
2.1. Chest Anomaly Detection of Specific Diseases
2.2. Chest Anomaly Detection of Multiple Diseases
3. Methods
3.1. Feature Extraction Based on Dual-Attention Mechanism Residual Network
3.1.1. Split Attention Module
3.1.2. Spatial Context Attention Module
3.2. Multi-Scale Feature Fusion Framework
- (1) Consistent supervision: The same supervision signal was applied to the multi-scale features before fusion. Specifically, each candidate region produced by the deep network was mapped to each of the multi-scale feature levels, and the corresponding feature map was obtained. Classification and regression were then performed directly on these feature maps to compute an auxiliary (backup) supervision loss, which was added to the loss of the network itself. This strategy encouraged the different feature maps to learn similar semantic information, which improved the discriminative ability of the model.
- (2) Residual feature augmentation: Ratio-invariant adaptive pooling was used to capture diverse context features and reduce information loss. In addition, spatial context information was extracted to reduce the information loss in the channels of M5 and to improve the performance of the feature pyramid. Specifically, adaptive pooling with fixed ratios was applied to the C5 feature map of scale S, yielding feature maps at multiple scales. The channel dimension of each map was then reduced to 256 through a 1 × 1 convolution. Finally, the maps were up-sampled back to scale S by bilinear interpolation. However, interpolation can cause aliasing, i.e., discontinuous gray levels and jagged edges in regions where the intensity changes sharply. To address this, an adaptive spatial fusion module was introduced to learn the fusion weights. During training, the features of the different scales were fused with these adaptive weights instead of being added directly.
- (3) Soft RoI selection: In the original FPN model, the features of each RoI were extracted from one specific feature level. However, the features that were ignored at the other levels also contained information useful for classification or regression. In Aug-FPN, a soft RoI selection strategy was used, in which adaptive weights measured the importance of the features from different levels for each RoI. The final RoI features were generated from these adaptive weights rather than by hard selection methods such as RoI assignment or max operations. More specifically, the features of each RoI were first collected from all pyramid levels. An adaptive spatial fusion module then generated a spatial weight map for each level of RoI features. Finally, the RoI features were aggregated by weighted fusion.
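
The adaptive-weight fusion shared by residual feature augmentation and soft RoI selection can be sketched as follows. This is a minimal, dependency-free illustration rather than the authors' implementation. It assumes single-channel feature maps that have already been resized to a common H × W, and it takes the per-level weight logits as plain inputs, whereas in the network they would be predicted by a learned module. The names `softmax` and `adaptive_spatial_fusion` are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_spatial_fusion(level_feats, level_logits):
    """Fuse same-sized feature maps with per-pixel, per-level weights.

    level_feats:  list of L maps, each an H x W nested list of floats.
    level_logits: list of L maps of unnormalised weights (same shape).
                  Assumption: a real model predicts these with a learned
                  layer; here they are supplied directly.
    """
    height, width = len(level_feats[0]), len(level_feats[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Normalise the weights across levels at this location.
            weights = softmax([logits[y][x] for logits in level_logits])
            fused[y][x] = sum(
                w * feats[y][x] for w, feats in zip(weights, level_feats)
            )
    return fused


# Two 1 x 2 "levels": equal logits average them, while a dominant logit
# approaches hard selection of a single level.
feats = [[[1.0, 4.0]], [[3.0, 8.0]]]
equal = adaptive_spatial_fusion(feats, [[[0.0, 0.0]], [[0.0, 0.0]]])
skewed = adaptive_spatial_fusion(feats, [[[0.0, 0.0]], [[50.0, 50.0]]])
print(equal)
print(skewed)
```

With equal logits the fusion reduces to a plain average, and with one dominant logit it approaches the hard selection that the soft strategy is meant to replace, so the learned weights interpolate between the two behaviours.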
3.3. Loss Function
4. Experiments and Analysis
4.1. Experimental Datasets and Pre-Processing
4.2. Experimental Setup and Evaluation
4.3. Comparison with the Latest Methods
4.4. Ablation Study
5. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest

| Type of Abnormal Chest Image | Proportion | Type of Abnormal Chest Image | Proportion | 
|---|---|---|---|
| Aortic Enlargement, AE | 10.54% | Other Lesion, OL | 3.24% | 
| Cardiomegaly, CM | 7.99% | Infiltration, IF | 1.83% | 
| Pleural Thickening, PT | 7.12% | Interstitial Lung Disease, ILD | 1.47% | 
| Pulmonary Fibrosis, PF | 6.85% | Calcification, CC | 1.41% | 
| Nodule/Mass, NM | 3.79% | Consolidation, CS | 0.81% | 
| Lung Opacity, LO | 3.65% | Atelectasis, AL | 0.41% | 
| Pleural Effusion, PE | 3.64% | Pneumo-Thorax, PM | 0.33% | 
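
The proportions above span more than an order of magnitude. As a rough illustration of how skewed the category distribution is, the snippet below computes the ratio between the most and least frequent categories. The values are copied from the table; the variable names are hypothetical, and the ratio is only a simple descriptive statistic, not a quantity reported by the authors.

```python
# Proportions (%) copied from the table above.
proportions = {
    "Aortic Enlargement": 10.54,
    "Cardiomegaly": 7.99,
    "Pleural Thickening": 7.12,
    "Pulmonary Fibrosis": 6.85,
    "Nodule/Mass": 3.79,
    "Lung Opacity": 3.65,
    "Pleural Effusion": 3.64,
    "Other Lesion": 3.24,
    "Infiltration": 1.83,
    "Interstitial Lung Disease": 1.47,
    "Calcification": 1.41,
    "Consolidation": 0.81,
    "Atelectasis": 0.41,
    "Pneumo-Thorax": 0.33,
}

most_common = max(proportions, key=proportions.get)
least_common = min(proportions, key=proportions.get)
imbalance_ratio = proportions[most_common] / proportions[least_common]

print(f"{most_common} vs {least_common}: {imbalance_ratio:.1f}x")
```
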
| Abnormal Categories | Baseline | TridentNet | Libra R-CNN | Sparse R-CNN | Swin Transformer | Ours | 
|---|---|---|---|---|---|---|
| Aortic Enlargement | 0.899 | 0.891 | 0.876 | 0.917 | 0.907 | 0.908 | 
| Atelectasis | 0.058 | 0.094 | 0.171 | 0.076 | 0.136 | 0.176 | 
| Calcification | 0.089 | 0.092 | 0.090 | 0.112 | 0.192 | 0.131 | 
| Cardiomegaly | 0.910 | 0.904 | 0.919 | 0.932 | 0.923 | 0.933 | 
| Consolidation | 0.371 | 0.335 | 0.262 | 0.111 | 0.218 | 0.453 | 
| Interstitial Lung Disease | 0.289 | 0.305 | 0.293 | 0.207 | 0.251 | 0.294 | 
| Infiltration | 0.332 | 0.239 | 0.291 | 0.173 | 0.285 | 0.350 | 
| Lung Opacity | 0.270 | 0.192 | 0.184 | 0.106 | 0.197 | 0.269 | 
| Nodule/Mass | 0.286 | 0.156 | 0.172 | 0.130 | 0.270 | 0.274 | 
| Pleural Effusion | 0.385 | 0.362 | 0.443 | 0.331 | 0.385 | 0.425 | 
| Pleural Thickening | 0.232 | 0.162 | 0.170 | 0.194 | 0.224 | 0.238 | 
| Pneumo-Thorax | 0.145 | 0.328 | 0.099 | 0.028 | 0.374 | 0.255 | 
| Pulmonary Fibrosis | 0.273 | 0.230 | 0.271 | 0.143 | 0.286 | 0.261 | 
| Other Lesion | 0.081 | 0.106 | 0.049 | 0.026 | 0.061 | 0.097 | 
| mAP | 0.330 | 0.314 | 0.306 | 0.249 | 0.336 | 0.362 | 
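
The mAP row appears to be the unweighted mean of the 14 per-category values in each column. The short check below reproduces the reported figures for the "Baseline" and "Ours" columns under that assumption. The lists are copied from the table in row order, and the variable names are hypothetical.

```python
from statistics import mean

# Per-category values copied from the table, in row order.
ap_baseline = [0.899, 0.058, 0.089, 0.910, 0.371, 0.289, 0.332,
               0.270, 0.286, 0.385, 0.232, 0.145, 0.273, 0.081]
ap_ours = [0.908, 0.176, 0.131, 0.933, 0.453, 0.294, 0.350,
           0.269, 0.274, 0.425, 0.238, 0.255, 0.261, 0.097]

# Assumption: mAP is the plain (unweighted) mean over categories.
map_baseline = round(mean(ap_baseline), 3)
map_ours = round(mean(ap_ours), 3)

print(map_baseline, map_ours)
```
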
| Metric | Baseline | TridentNet | Libra R-CNN | Sparse R-CNN | Swin Transformer | Ours |
|---|---|---|---|---|---|---|
| GFLOPS | 260.88 | 822.23 | 261.93 | 176.27 | 267.00 | 263.55 | 
| Parameter Memory (M) | 70.85 | 32.89 | 71.12 | 107.85 | 47.44 | 74.03 | 
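
To put the complexity figures in perspective, the relative overhead of the proposed model over the baseline can be computed directly from the two rows above. This is a small arithmetic sketch; the dictionary keys and the helper `relative_change` are hypothetical names.

```python
# Values copied from the "Baseline" and "Ours" columns above.
baseline = {"gflops": 260.88, "params_m": 70.85}
ours = {"gflops": 263.55, "params_m": 74.03}


def relative_change(new, old):
    """Percentage change of `new` relative to `old`."""
    return (new - old) / old * 100.0


overhead = {key: relative_change(ours[key], baseline[key]) for key in baseline}

for key, pct in overhead.items():
    print(f"{key}: {pct:+.2f}%")
```

Under these numbers the added cost is roughly 1% in GFLOPS and about 4.5% in parameter memory.
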
| Abnormal Categories | L1 (Baseline) | L2 | L3 | L4 | L5 (Ours) | 
|---|---|---|---|---|---|
| Aortic enlargement | 0.899 | 0.906 | 0.896 | 0.895 | 0.908 | 
| Atelectasis | 0.058 | 0.112 | 0.130 | 0.210 | 0.176 | 
| Calcification | 0.089 | 0.134 | 0.129 | 0.095 | 0.131 | 
| Cardiomegaly | 0.910 | 0.933 | 0.935 | 0.917 | 0.933 | 
| Consolidation | 0.371 | 0.310 | 0.376 | 0.319 | 0.453 | 
| Interstitial lung disease | 0.289 | 0.306 | 0.307 | 0.249 | 0.294 | 
| Infiltration | 0.332 | 0.315 | 0.342 | 0.336 | 0.350 | 
| Lung opacity | 0.270 | 0.270 | 0.247 | 0.252 | 0.269 | 
| Nodule/mass | 0.286 | 0.287 | 0.284 | 0.287 | 0.274 | 
| Pleural effusion | 0.385 | 0.385 | 0.406 | 0.389 | 0.425 | 
| Pleural thickening | 0.232 | 0.241 | 0.214 | 0.223 | 0.238 | 
| Pneumo-thorax | 0.145 | 0.247 | 0.116 | 0.226 | 0.255 | 
| Pulmonary fibrosis | 0.273 | 0.296 | 0.288 | 0.241 | 0.261 | 
| Other lesion | 0.081 | 0.111 | 0.026 | 0.087 | 0.097 | 
| mAP | 0.330 | 0.347 | 0.341 | 0.338 | 0.362 | 

Main errors: Cls, Loc, Both, Dupe, Bkg, Miss. Special errors: FP, FN.

| Method | Cls | Loc | Both | Dupe | Bkg | Miss | FP | FN |
|---|---|---|---|---|---|---|---|---|
| Baseline | 7.07 | 10.93 | 1.85 | 0.17 | 4.09 | 4.87 | 28.24 | 14.41 |
| Ours | 5.60 | 10.19 | 1.72 | 0.22 | 3.35 | 6.49 | 25.96 | 17.25 |
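
The per-type differences between the two rows of the error table can be tabulated as follows. The sketch assumes, as is usual for TIDE-style error analysis, that a lower value means a smaller error contribution. The values are copied from the table, and the variable names are hypothetical.

```python
error_types = ["Cls", "Loc", "Both", "Dupe", "Bkg", "Miss", "FP", "FN"]
baseline = [7.07, 10.93, 1.85, 0.17, 4.09, 4.87, 28.24, 14.41]
ours = [5.60, 10.19, 1.72, 0.22, 3.35, 6.49, 25.96, 17.25]

# Negative delta: the error contribution shrank relative to the baseline.
delta = {
    name: round(o - b, 2)
    for name, b, o in zip(error_types, baseline, ours)
}
improved = [name for name, d in delta.items() if d < 0]
worsened = [name for name, d in delta.items() if d > 0]

print(delta)
print("improved:", improved)
print("worsened:", worsened)
```

Read this way, the classification, localization, background, and false-positive errors decrease, while the missed-detection and false-negative errors increase.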
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Liu, D.; Lu, S.; Zhang, L.; Liu, Y. Anomaly Detection in Chest X-rays Based on Dual-Attention Mechanism and Multi-Scale Feature Fusion. Symmetry 2023, 15, 668. https://doi.org/10.3390/sym15030668