Enhanced WGAN Model for Diagnosing Laryngeal Carcinoma
Simple Summary
Abstract
1. Introduction
2. Related Work
2.1. AI Technology for Laryngeal Cancer Diagnosis
2.2. WGAN
3. Proposed Model
3.1. Advanced U-Net
3.1.1. Entire Structure
3.1.2. Downsampling
3.1.3. Upsampling
3.1.4. Auto Encoder and Dimension Reduction
3.2. WGAN
3.2.1. WGAN
3.2.2. Autoencoder
3.2.3. Loss Function
- Label smoothing [44]: Label smoothing softens the hard one-hot targets used with cross entropy, alleviating the abrupt changes caused by discontinuous target functions. In addition, the MSE, which measures a distance rather than a probability as cross entropy does, is used as a loss function.
- SSIM [45]: One of the methods used to quantify the degree of structural similarity between two images. The SSIM of the output images is used together with the MSE as a loss function so that the outputs match the ground truth structurally.
- nn.BCEWithLogitsLoss(): A loss function provided by PyTorch that combines the binary cross entropy (BCE) with a Sigmoid layer, so the Sigmoid does not need to be applied to the network output separately. By using this loss function, the output can be treated as a scalar value after training. This scalar value, called validity, can be interpreted as a metric of the latent vector.
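The three ingredients above can be sketched in a few lines of NumPy (this is our illustration, not the authors' code): the numerically stable formula that `nn.BCEWithLogitsLoss()` implements, plain MSE, and a simplified global-statistics SSIM (the standard SSIM [45] uses a sliding Gaussian window, omitted here). The weighting of the terms in `total_loss` is a placeholder assumption, as the paper does not state the exact combination.

```python
import numpy as np

def bce_with_logits(x, t):
    # Numerically stable form used by nn.BCEWithLogitsLoss:
    # max(x, 0) - x*t + log(1 + exp(-|x|))
    return np.mean(np.maximum(x, 0) - x * t + np.log1p(np.exp(-np.abs(x))))

def mse(a, b):
    # Mean squared error: mean of squared pixel distances.
    return np.mean((a - b) ** 2)

def ssim_global(a, b, c1=0.01**2, c2=0.03**2):
    # Simplified SSIM over global image statistics (the standard
    # SSIM slides an 11x11 Gaussian window; this is a sketch only).
    mu_a, mu_b = a.mean(), b.mean()
    va, vb = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / \
           ((mu_a**2 + mu_b**2 + c1) * (va + vb + c2))

def total_loss(pred, target, logits, validity_target, w=(1.0, 1.0, 1.0)):
    # Hypothetical equal weighting of the three terms (assumption).
    return (w[0] * mse(pred, target)
            + w[1] * (1.0 - ssim_global(pred, target))
            + w[2] * bce_with_logits(logits, validity_target))
```

For identical images the MSE term is 0 and the SSIM term is 1 (so `1 - ssim` vanishes), leaving only the adversarial validity term.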
3.2.4. Generator and Discriminator
4. Experiments
4.1. Datasets
4.2. System Setup
4.3. Evaluation Methods
4.4. Observation
4.5. Experimental Results
Cls | Count | F1 | TP | TN | FP | FN | ACC | IoU | bbIoU | mAUC | Precision | Recall |
---|---|---|---|---|---|---|---|---|---|---|---|---|
6 | 300 | 0.77 | 2417.41 | 94,777.63 | 432.32 | 676.65 | 0.99 | 0.65 | 0.38 | 0.39 | 0.85 | 0.78 |
5 | 300 | 0.74 | 2670.96 | 94,389.35 | 479.80 | 763.89 | 0.99 | 0.62 | 0.46 | 0.37 | 0.85 | 0.78 |
4 | 300 | 0.72 | 871.88 | 97,029.50 | 166.04 | 236.59 | 1.00 | 0.55 | 0.42 | 0.35 | 0.84 | 0.79 |
3 | 300 | 0.74 | 977.20 | 96,889.34 | 215.41 | 222.06 | 1.00 | 0.59 | 0.39 | 0.39 | 0.82 | 0.81 |
2 | 300 | 0.64 | 409.78 | 97,535.78 | 140.10 | 218.34 | 1.00 | 0.33 | 0.28 | 0.33 | 0.75 | 0.65 |
1 | 300 | 0.65 | 396.02 | 97,564.11 | 123.58 | 220.30 | 1.00 | 0.35 | 0.29 | 0.32 | 0.76 | 0.64 |
0 | 300 | 0.54 | 125.57 | 97,775.43 | 68.55 | 334.45 | 1.00 | 0.14 | 0.19 | 0.20 | 0.65 | 0.27 |
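The per-class scores in the table are derived from the confusion counts via the standard formulas, sketched below (our helper, not the paper's code; the table reports counts averaged per image, so values recomputed this way need not match the rounded rows exactly):

```python
def metrics(tp, tn, fp, fn):
    """Standard pixel-level scores from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # also called sensitivity
    f1 = 2 * precision * recall / (precision + recall)
    acc = (tp + tn) / (tp + tn + fp + fn)
    iou = tp / (tp + fp + fn)        # Jaccard index
    return {"precision": precision, "recall": recall,
            "f1": f1, "acc": acc, "iou": iou}
```

For example, `metrics(8, 88, 2, 2)` yields precision 0.8, recall 0.8, F1 0.8, accuracy 0.96, and IoU 8/12 ≈ 0.67 — the high TN counts above are why ACC saturates near 1.0 even when IoU is modest.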
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Sung, H.; Ferlay, J.; Siegel, R.L.; Laversanne, M.; Soerjomataram, I.; Jemal, A.; Bray, F. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 2021, 71, 209–249.
- Ren, J.; Jing, X.; Wang, J.; Ren, X.; Xu, Y.; Yang, Q.; Ma, L.; Sun, Y.; Xu, W.; Yang, N.; et al. Automatic recognition of laryngoscopic images using a deep-learning technique. Laryngoscope 2020, 130, E686–E693.
- Mundt, M.; Pliushch, I.; Majumder, S.; Ramesh, V. Open set recognition through deep neural network uncertainty: Does out-of-distribution detection require generative classifiers? In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, Seoul, Republic of Korea, 27–28 October 2019.
- Scheirer, W.J.; de Rezende Rocha, A.; Sapkota, A.; Boult, T.E. Toward Open Set Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 1757–1772.
- Geng, C.; Huang, S.J.; Chen, S. Recent advances in open set recognition: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 3614–3631.
- Sufyan, M.; Shokat, Z.; Ashfaq, U.A. Artificial intelligence in cancer diagnosis and therapy: Current status and future perspective. Comput. Biol. Med. 2023, 165, 107356.
- Zhang, B.; Shi, H.; Wang, H. Machine learning and AI in cancer prognosis, prediction, and treatment selection: A critical approach. J. Multidiscip. Healthc. 2023, 16, 1779–1791.
- Bi, W.L.; Hosny, A.; Schabath, M.B.; Giger, M.L.; Birkbak, N.J.; Mehrtash, A.; Allison, T.; Arnaout, O.; Abbosh, C.; Dunn, I.F.; et al. Artificial intelligence in cancer imaging: Clinical challenges and applications. CA Cancer J. Clin. 2019, 69, 127–157.
- Yang, R.; Yu, Y. Artificial convolutional neural network in object detection and semantic segmentation for medical imaging analysis. Front. Oncol. 2021, 11, 638182.
- Kaur, C.; Garg, U. Artificial intelligence techniques for cancer detection in medical image processing: A review. Mater. Today Proc. 2023, 81, 806–809.
- Semantic Segmentation | Papers with Code. 2021. Available online: https://paperswithcode.com/task/semantic-segmentation (accessed on 16 December 2021).
- Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, 5–9 October 2015; pp. 234–241.
- Zhou, Z.; Rahman Siddiquee, M.M.; Tajbakhsh, N.; Liang, J. Unet++: A nested u-net architecture for medical image segmentation. In Proceedings of the Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support: 4th International Workshop, DLMIA 2018, and 8th International Workshop, ML-CDS 2018, Held in Conjunction with MICCAI 2018, Granada, Spain, 20 September 2018; pp. 3–11.
- Obid, R.; Redlich, M.; Tomeh, C. The treatment of laryngeal cancer. Oral Maxillofac. Surg. Clin. 2019, 31, 1–11.
- Megwalu, U.C.; Sikora, A.G. Survival outcomes in advanced laryngeal cancer. JAMA Otolaryngol. Head Neck Surg. 2014, 140, 855–860.
- Mohamed, N.; Almutairi, R.L.; Abdelrahim, S.; Alharbi, R.; Alhomayani, F.M.; Elamin Elnaim, B.M.; Elhag, A.A.; Dhakal, R. Automated laryngeal cancer detection and classification using dwarf mongoose optimization algorithm with deep learning. Cancers 2023, 16, 181.
- Joseph, J.S.; Vidyarthi, A.; Singh, V.P. An improved approach for initial stage detection of laryngeal cancer using effective hybrid features and ensemble learning method. Multimed. Tools Appl. 2024, 83, 17897–17919.
- Bensoussan, Y.; Vanstrum, E.B.; Johns, M.M., III; Rameau, A. Artificial intelligence and laryngeal cancer: From screening to prognosis: A state of the art review. Otolaryngol. Head Neck Surg. 2023, 168, 319–329.
- Azam, M.A.; Sampieri, C.; Ioppi, A.; Africano, S.; Vallin, A.; Mocellin, D.; Fragale, M.; Guastini, L.; Moccia, S.; Piazza, C.; et al. Deep learning applied to white light and narrow band imaging videolaryngoscopy: Toward real-time laryngeal cancer detection. Laryngoscope 2022, 132, 1798–1806.
- Esmaeili, N.; Sharaf, E.; Gomes Ataide, E.J.; Illanes, A.; Boese, A.; Davaris, N.; Arens, C.; Navab, N.; Friebe, M. Deep convolution neural network for laryngeal cancer classification on contact endoscopy-narrow band imaging. Sensors 2021, 21, 8157.
- Sahoo, P.K.; Mishra, S.; Panigrahi, R.; Bhoi, A.K.; Barsocchi, P. An improvised deep-learning-based mask R-CNN model for laryngeal cancer detection using CT images. Sensors 2022, 22, 8834.
- U-Net—Wikipedia. Available online: https://en.wikipedia.org/wiki/U-Net (accessed on 23 August 2023).
- Convolutional Neural Network—Wikipedia. Available online: https://en.wikipedia.org/wiki/Convolutional_neural_network (accessed on 23 August 2023).
- Arjovsky, M.; Chintala, S.; Bottou, L. Wasserstein GAN. arXiv 2017, arXiv:1701.07875.
- Gulrajani, I.; Ahmed, F.; Arjovsky, M.; Dumoulin, V.; Courville, A.C. Improved training of Wasserstein GANs. In Proceedings of the Advances in Neural Information Processing Systems 30 (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017; Volume 30.
- Kazeminia, S.; Baur, C.; Kuijper, A.; van Ginneken, B.; Navab, N.; Albarqouni, S.; Mukhopadhyay, A. GANs for medical image analysis. Artif. Intell. Med. 2020, 109, 101938.
- Das, S. 6 GAN Architectures You Really Should Know. 2021. Available online: https://neptune.ai/blog/6-gan-architectures (accessed on 13 August 2021).
- Mao, Q.; Lee, H.Y.; Tseng, H.Y.; Ma, S.; Yang, M.H. Mode seeking generative adversarial networks for diverse image synthesis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 1429–1437.
- Ledig, C.; Theis, L.; Huszár, F.; Caballero, J.; Cunningham, A.; Acosta, A.; Aitken, A.; Tejani, A.; Totz, J.; Wang, Z.; et al. Photo-realistic single image super-resolution using a generative adversarial network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4681–4690.
- Wu, W.; Li, D.; Du, J.; Gao, X.; Gu, W.; Zhao, F.; Feng, X.; Yan, H. An intelligent diagnosis method of brain MRI tumor segmentation using deep convolutional neural network and SVM algorithm. Comput. Math. Methods Med. 2020, 2020, 6789306.
- Jiang, B.; Li, N.; Shi, X.; Zhang, S.; Li, J.; de Bock, G.H.; Vliegenthart, R.; Xie, X. Deep learning reconstruction shows better lung nodule detection for ultra–low-dose chest CT. Radiology 2022, 303, 202–212.
- Yousef, R.; Khan, S.; Gupta, G.; Siddiqui, T.; Albahlal, B.M.; Alajlan, S.A.; Haq, M.A. U-Net-based models towards optimal MR brain image segmentation. Diagnostics 2023, 13, 1624.
- Halupka, K.J.; Antony, B.J.; Lee, M.H.; Lucy, K.A.; Rai, R.S.; Ishikawa, H.; Wollstein, G.; Schuman, J.S.; Garnavi, R. Retinal optical coherence tomography image enhancement via deep learning. Biomed. Opt. Express 2018, 9, 6205–6221.
- Kamnitsas, K.; Bai, W.; Ferrante, E.; McDonagh, S.; Sinclair, M.; Pawlowski, N.; Rajchl, M.; Lee, M.; Kainz, B.; Rueckert, D.; et al. Ensembles of multiple models and architectures for robust brain tumour segmentation. In Proceedings of the Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries: Third International Workshop, BrainLes 2017, Held in Conjunction with MICCAI 2017, Quebec City, QC, Canada, 14 September 2017; pp. 450–462.
- Fu, H.; Xu, Y.; Lin, S.; Kee Wong, D.W.; Liu, J. Deepvessel: Retinal vessel segmentation via deep learning and conditional random field. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention–MICCAI 2016: 19th International Conference, Athens, Greece, 17–21 October 2016; pp. 132–139.
- Pytorch Implementation of U-Net, R2U-Net, Attention U-Net—GitHub. 2021. Available online: https://github.com/LeeJunHyun/Image_Segmentation (accessed on 13 December 2021).
- Sahoo, S. Residual Blocks—Building Blocks of ResNet. 2018. Available online: https://towardsdatascience.com/residual-blocks-building-blocks-of-resnet-fd90ca15d6ec (accessed on 13 December 2021).
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Lotlikar, R.; Kothari, R. Multilayer perceptron based dimensionality reduction. In Proceedings of the IJCNN’99 International Joint Conference on Neural Networks (Cat. No. 99CH36339), Washington, DC, USA, 10–16 July 1999; Volume 3, pp. 1691–1695.
- Getting Started with Distributed Data Parallel—PyTorch. 2021. Available online: https://pytorch.org/tutorials/intermediate/ddp_tutorial.html (accessed on 22 November 2021).
- Optimization—PyTorch Lightning. Available online: https://lightning.ai/docs/pytorch/1.6.0/ (accessed on 13 December 2021).
- LightningDataModule—PyTorch Lightning 1.6.0dev Documentation. 2021. Available online: https://lightning.ai/docs/pytorch/1.6.0/common/lightning_module.html (accessed on 13 December 2021).
- An Efficient Network for Multi-Label Brain Tumor Segmentation. Available online: https://openaccess.thecvf.com/content_ECCV_2018/papers/Xuan_Chen_Focus_Segment_and_ECCV_2018_paper.pdf (accessed on 22 November 2021).
- Müller, R.; Kornblith, S.; Hinton, G.E. When does label smoothing help? Adv. Neural Inf. Process. Syst. 2019, 32.
- Po-Hsun-Su/pytorch-ssim—GitHub. 2021. Available online: https://github.com/Po-Hsun-Su/pytorch-ssim (accessed on 13 December 2021).
- Torchvision 0.11.0 Documentation—PyTorch. 2021. Available online: https://pytorch.org/vision/ (accessed on 13 December 2021).
- TensorBoard | TensorFlow. 2021. Available online: https://www.tensorflow.org/tensorboard (accessed on 13 December 2021).
- scikit-learn: Machine Learning in Python—scikit-learn 1.0.1. Available online: https://scikit-learn.org/ (accessed on 13 December 2021).
- OpenCV-Python Tutorials. Available online: https://docs.opencv.org/4.x/d6/d00/tutorial_py_root.html (accessed on 13 December 2021).
- Intersection over Union (IoU) for Object Detection—PyImageSearch. 2016. Available online: https://www.pyimagesearch.com/2016/11/07/intersection-over-union-iou-for-object-detection/ (accessed on 13 December 2021).
- How to Calculate & Use the AUC Score—Towards Data Science. Available online: https://towardsdatascience.com/how-to-calculate-use-the-auc-score-1fc85c9a8430 (accessed on 13 December 2021).
Lesion | (H, S, V) | (R, G, B) | Usage | Color Code |
---|---|---|---|---|
R-TVC | (105°, 100%, 100%) | (64, 255, 0) | polygon color | #40FF00 |
L-TVC | (180°, 100%, 100%) | (0, 255, 255) | polygon color | #00FFFF |
R-FVC | (210°, 100%, 100%) | (0, 128, 255) | polygon color | #0080FF |
L-FVC | (255°, 100%, 100%) | (64, 0, 255) | polygon color | #4000FF |
Subglottis | (60°, 100%, 100%) | (255, 255, 0) | polygon color | #FFFF00 |
Benign Tumor | (0°, 100%, 100%) | (255, 0, 0) | polygon color | #FF0000 |
Cancer | (300°, 100%, 100%) | (255, 0, 255) | polygon color | #FF00FF |
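The (H, S, V) and (R, G, B) columns above are consistent under the standard HSV-to-RGB conversion, which can be checked with Python's stdlib `colorsys` (this check is ours, not part of the paper):

```python
import colorsys

def hsv_deg_to_rgb255(h_deg, s_pct=100, v_pct=100):
    # colorsys expects H, S, V in [0, 1]; scale the result to 0-255.
    r, g, b = colorsys.hsv_to_rgb(h_deg / 360.0, s_pct / 100.0, v_pct / 100.0)
    return tuple(round(c * 255) for c in (r, g, b))

def rgb_to_hex(rgb):
    # Hex color code of the form used for the polygon annotations.
    return "#{:02X}{:02X}{:02X}".format(*rgb)
```

For example, `hsv_deg_to_rgb255(105)` gives `(64, 255, 0)`, i.e. `#40FF00`, the R-TVC polygon color.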
Name | Class | F1 (300) | F1 (1035) | IoU(pix) (300) | IoU(BB) (300) | IoU(pix) (1035) | IoU(BB) (1035) |
---|---|---|---|---|---|---|---|
R-TVC | 2 | 0.64466 | 0.65400 | 0.33159 | 0.28494 | 0.41680 | 0.41015 |
L-TVC | 3 | 0.74483 | 0.72868 | 0.59073 | 0.39312 | 0.56017 | 0.54482 |
R-FVC | 4 | 0.72005 | 0.68070 | 0.55318 | 0.41790 | 0.51202 | 0.53456 |
L-FVC | 5 | 0.73936 | 0.72070 | 0.61976 | 0.46260 | 0.59198 | 0.55988 |
Subglottis | 1 | 0.65393 | 0.65011 | 0.34636 | 0.29183 | 0.44008 | 0.43077 |
Benign Tumor | 0 | 0.54029 | 0.50486 | 0.14321 | 0.18924 | 0.15398 | 0.22279 |
Cancer | 6 | 0.77319 | 0.75490 | 0.64823 | 0.37732 | 0.60632 | 0.48714 |
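The IoU(BB) columns use the usual bounding-box intersection over union; a minimal sketch (our illustration, assuming corner-format boxes `(x1, y1, x2, y2)`):

```python
def bbox_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))  # overlap width
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))  # overlap height
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

Identical boxes score 1.0, disjoint boxes 0.0; a box offset by half its width against itself overlaps in a quarter of its area, giving 25/175 ≈ 0.14.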
Cls | Count | F1 | TP | TN | FP | FN | ACC | IoU | bbIoU | mAUC | Precision | Recall |
---|---|---|---|---|---|---|---|---|---|---|---|---|
6 | 1035 | 0.75 | 2814.15 | 93,923.13 | 697.20 | 869.52 | 0.98 | 0.61 | 0.49 | 0.38 | 0.80 | 0.76 |
5 | 1035 | 0.72 | 3140.11 | 93,426.73 | 831.94 | 905.22 | 0.98 | 0.59 | 0.56 | 0.37 | 0.79 | 0.78 |
4 | 1035 | 0.68 | 861.58 | 96,943.25 | 228.44 | 270.73 | 0.99 | 0.51 | 0.53 | 0.33 | 0.79 | 0.76 |
3 | 1035 | 0.73 | 935.46 | 96,909.83 | 235.43 | 223.27 | 1.00 | 0.56 | 0.54 | 0.36 | 0.80 | 0.81 |
2 | 1035 | 0.65 | 586.78 | 97,251.42 | 245.84 | 219.95 | 1.00 | 0.42 | 0.41 | 0.33 | 0.70 | 0.73 |
1 | 1035 | 0.65 | 578.17 | 97,275.17 | 229.28 | 221.37 | 1.00 | 0.44 | 0.43 | 0.32 | 0.72 | 0.72 |
0 | 1035 | 0.50 | 128.96 | 97,857.93 | 70.05 | 247.06 | 1.00 | 0.15 | 0.22 | 0.17 | 0.65 | 0.34 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Kim, S.; Chang, Y.; An, S.; Kim, D.; Cho, J.; Oh, K.; Baek, S.; Choi, B.K. Enhanced WGAN Model for Diagnosing Laryngeal Carcinoma. Cancers 2024, 16, 3482. https://doi.org/10.3390/cancers16203482