Deep Learning Techniques to Diagnose Lung Cancer

Simple Summary This study investigates the latest achievements, challenges, and future research directions of deep learning techniques for lung cancer and pulmonary nodule detection. Hopefully, these research findings will help scientists, investigators, and clinicians develop new and effective medical imaging tools to improve lung nodule diagnosis accuracy, sensitivity, and specificity. Abstract Medical imaging tools are essential in early-stage lung cancer diagnostics and the monitoring of lung cancer during treatment. Various medical imaging modalities, such as chest X-ray, magnetic resonance imaging, positron emission tomography, computed tomography, and molecular imaging techniques, have been extensively studied for lung cancer detection. These techniques have some limitations, including not classifying cancer images automatically, which is unsuitable for patients with other pathologies. It is urgently necessary to develop a sensitive and accurate approach to the early diagnosis of lung cancer. Deep learning is one of the fastest-growing topics in medical imaging, with rapidly emerging applications spanning medical image-based and textural data modalities. With the help of deep learning-based medical imaging tools, clinicians can detect and classify lung nodules more accurately and quickly. This paper presents the recent development of deep learning-based imaging techniques for early lung cancer detection.


Introduction
Lung cancer is among the most frequently diagnosed cancers and the leading cause of cancer death, with the highest morbidity and mortality in the United States [1]. In 2018, GLOBOCAN estimated approximately 2.09 million new cases and 1.76 million lung cancer-related deaths [2]. Lung cancer cases and deaths have increased significantly worldwide [2]. Approximately 85-88% of lung cancer cases are non-small cell lung carcinoma (NSCLC), and about 12-15% are small cell lung cancer (SCLC) [3]. Because of the invasiveness and heterogeneity of lung cancer, early diagnosis and intervention are crucial to increasing the overall 5-year survival rate [4].
Over the past two decades, various medical imaging techniques, such as chest X-ray, positron emission tomography (PET), magnetic resonance imaging (MRI), computed tomography (CT), low-dose CT (LDCT), and chest radiography (CRG), have been extensively investigated for lung nodule detection. Although CT is the gold-standard imaging tool for lung nodule detection, it can only detect apparent lung cancer, suffers from high false-positive rates, and exposes patients to harmful X-ray radiation [5]. LDCT has been proposed to reduce the radiation dose in lung cancer detection [6]. However, cancer-related deaths remained concentrated among subjects undergoing LDCT. 2-deoxy-18F-fluorodeoxyglucose (18F-FDG) PET has been developed to improve lung cancer detection performance [7]. 18F-FDG PET produces semi-quantitative parameters of tumor glucose metabolism, which are helpful in the diagnosis of NSCLC [8], although its use in patients with NSCLC requires further evaluation. 18F-FDG PET has been applied to diagnose solitary pulmonary nodules [25] and is crucial in selecting patients with advanced NSCLC for radical radiotherapy. PET-assisted radiotherapy offers more accuracy [26] and changes the management of about 32% of patients with stage IIIA lung cancer [27]. 18F-FDG PET also provides a significant response assessment, associated with longer progression times and overall survival rates, in patients with NSCLC undergoing induction chemotherapy.
MRI is the most potent lung imaging tool without ionizing radiation, but it provides insufficient information and suffers from high costs and long acquisition times. It failed to detect about 10% of small lung nodules (4-8 mm in diameter) [28]. MRI with ultra-short echo time (UTE) can improve signal intensity and reduce lung susceptibility artifacts, and is sensitive for detecting small lung nodules (4-8 mm) [29]. MRI achieves a higher lung nodule detection rate than LDCT. MRI with different pulse sequences also improves lung nodule detection sensitivity. The authors investigated T1-weighted and T2-weighted MRI to detect small lung nodules [30,31]. Compared to 3T MRI, 1.5T MRI identifies ground glass opacities much more easily [32]. Ground glass opacities were successfully detected in 75% of subjects with lung fibrosis who received 1.5T MRI with an SSFP sequence [33]. MRI with T2-weighted fast spin echo provides similar or even better performance for ground glass infiltrate detection in immunocompromised subjects [34].
Several research groups have recently investigated the feasibility of using MIT for lung disease detection [35,36]. However, due to the lack of measurement systems, expensive computational electromagnetic models, low image resolution, and some other challenges, MIT technology still has a long way to go before it can be widely used as a commercial imaging tool in clinical conditions.
Medical imaging approaches play an essential role in early-stage lung cancer detection and improve the survival rate. However, these techniques have some limitations, including high false positives, and cannot detect lesions automatically. Several CAD systems have been developed for lung cancer detection [37,38]. As shown in Figure 1, a CAD-based lung nodule detection system [14] usually consists of three main phases: data collection and pre-processing, training, and testing. There are two types of CAD systems: the detection system identifies specific anomalies according to regions of interest, and the diagnostic system analyses lesion information, such as type, severity, stage, and progression [14]. The figure is reused from reference [14]; no special permission is required to reuse all or part of articles published by MDPI, including figures and tables, for articles published under an open-access Creative Commons CC BY license.

Deep Learning-Based Imaging Techniques
A deep learning-based CAD system has been reported as a promising tool for the automatic diagnosis of lung disease in medical imaging with significant accuracy [34][35][36]. A deep learning model is a neural network with multiple levels of data representation. Deep learning approaches can be grouped into unsupervised, reinforcement, and supervised learning.
Unsupervised learning does not require user guidance; it analyzes the data and groups inputs by their inherent similarities. Semi-supervised learning is a mixed model that combines the two settings, using both labeled and unlabeled data; with the help of the unlabeled data, the learned decision boundary becomes much more accurate. Auto-Encoders (AE), Restricted Boltzmann Machines (RBM), and Generative Adversarial Networks (GAN) are good at clustering and nonlinear dimensionality reduction. Fully supervised training usually requires a large amount of labeled data, which increases cost, time, and difficulty; researchers have therefore applied deep clustering to reduce labeling effort and build more robust models [39,40].
Convolutional neural networks (CNN), deep convolutional neural networks (DCNN), and recurrent neural networks (RNN) are the most widely used supervised learning algorithms for medical images. The CNN architecture is one of the most widely used supervised deep learning approaches for lesion segmentation and classification because it requires little preprocessing. CNN architectures have recently been applied to medical images for image segmentation (such as Mask R-CNN [41]) and classification (such as AlexNet [42] and VGGNet [43]). DCNN architectures usually contain more layers with complex nonlinear relationships and have been used for classification and regression with reasonable accuracy [44][45][46]. The RNN architecture is a higher-order neural network that feeds the network output back as input [47]. An RNN based on the Elman network, with feedback links from the hidden layer to the input layer, has the potential to capture and exploit cross-slice variations and thus incorporate volumetric patterns of nodules. However, RNNs suffer from the vanishing gradient problem.
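The core building block shared by all the CNN variants above is the sliding-window convolution. As a toy illustration only (not code from any cited work), a valid-mode 2D convolution over a small image patch can be sketched in pure Python:

```python
def conv2d(image, kernel):
    """Valid-mode 2D convolution (strictly, cross-correlation, as most deep
    learning frameworks implement it): slide the kernel over the image and
    take the sum of elementwise products at each position."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            s = sum(image[i + m][j + n] * kernel[m][n]
                    for m in range(kh) for n in range(kw))
            row.append(s)
        out.append(row)
    return out

# A Sobel-style vertical-edge kernel responds strongly at the 0->1 boundary
# in this synthetic 4x4 patch (values are illustrative, not CT intensities).
patch = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
sobel_x = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
response = conv2d(patch, sobel_x)  # 2x2 feature map
```

In a trained CNN, the kernel weights are learned from data rather than fixed as here; stacking many such filtered "feature maps" with nonlinearities is what lets the network learn nodule-discriminative features with little preprocessing.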
The reinforcement learning technique gained prominence with Google DeepMind's work in 2013 [48]. Since then, reinforcement learning approaches have been extensively investigated to improve lung cancer detection accuracy, sensitivity, and specificity. Semi-supervised learning approaches, such as deep reinforcement learning and generative adversarial networks, combine labeled and unlabeled datasets.
Supervised learning involves a learning algorithm in which labels are assigned to the input data during training. Various supervised deep learning approaches have been applied to CT images to identify abnormalities with anatomical localization. These approaches have some drawbacks, such as the large amount of labeled data required during training, the assumption of fixed network weights upon training completion, and the inability to improve after training. Thus, few-shot learning (FSL) models have been developed to reduce data requirements during training.

Lung Cancer Prediction Using Deep Learning
This section presents recent achievements in lung cancer and nodule prediction using deep learning techniques. The processing pipeline includes image pre-processing, lung nodule segmentation, detection, and classification. The pre-processed images are fed into a deep learning algorithm with a specific architecture, trained, and then tested on the image datasets. Image noise affects the precision of the final classifier. Several noise reduction approaches, such as the median filter [48], Wiener filter [49], and non-local means filter [50], have been developed for pre-processing to improve accuracy and generalization performance. After denoising, a normalization method, such as min-max normalization, is required to rescale the images and reduce the complexity of the image datasets.
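The two pre-processing steps named above can be sketched in a few lines. This is a minimal pure-Python illustration of a 3x3 median filter and min-max normalization, assuming images as nested lists of scalars; production pipelines would use a library such as SciPy or OpenCV instead:

```python
def median_filter(img, k=3):
    """k x k median filter with edge replication: a common denoising step
    that suppresses salt-and-pepper noise while preserving edges."""
    h, w = len(img), len(img[0])
    r = k // 2
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # Gather the neighborhood, clamping indices at the borders.
            window = [img[min(max(i + di, 0), h - 1)][min(max(j + dj, 0), w - 1)]
                      for di in range(-r, r + 1) for dj in range(-r, r + 1)]
            window.sort()
            out[i][j] = window[len(window) // 2]
    return out

def min_max_normalize(img):
    """Rescale all intensities linearly into [0, 1]."""
    flat = [v for row in img for v in row]
    lo, hi = min(flat), max(flat)
    return [[(v - lo) / (hi - lo) for v in row] for row in img]

# An isolated noise spike (255) is removed by the median filter.
noisy = [[10, 10, 10],
         [10, 255, 10],
         [10, 10, 10]]
denoised = median_filter(noisy)
```

Normalization after denoising ensures every image enters the network on the same intensity scale, which stabilizes training across scans acquired with different protocols.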
Accuracy measures the overall agreement between predictions and ground truth. Sensitivity is a helpful evaluation metric when FN is high, and precision is an effective metric when FP is high. The F1_score is applied when the class distribution is uneven. The ROC curve can be used to tune detection sensitivity, and the area under the receiver operating characteristic curve (AUC) has been used to evaluate the proposed deep learning models. Larger values of accuracy, precision, sensitivity, specificity, AUC, DSC, and JS, and smaller values of Error, UR, OR, and MHD, indicate better performance of a deep learning-based algorithm.
These performance metrics can be computed using the following standard equations [51,52]: Accuracy = (TP + TN)/(TP + TN + FP + FN); Sensitivity = TP/(TP + FN); Specificity = TN/(TN + FP); Precision = TP/(TP + FP); F1_score = 2 × Precision × Sensitivity/(Precision + Sensitivity); DSC = 2|A ∩ B|/(|A| + |B|); IoU = |A ∩ B|/|A ∪ B|. Here, TP (true positive) denotes the number of correct positives; TN (true negative) the number of correct negatives; FP (false positive) the number of incorrect positives; and FN (false negative) the number of incorrect negatives. B is the predicted target object region, A denotes the ground truth region, and N_a is the number of pixels in A. IoU, the ratio of the intersection to the union of the ground truth and predicted areas, is a standard metric for object detection and semantic segmentation problems.
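The standard confusion-matrix and overlap metrics described above translate directly into code; a minimal sketch (illustrative helper names, not from the cited works):

```python
def classification_metrics(tp, tn, fp, fn):
    """Confusion-matrix metrics used throughout the surveyed studies."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # recall; informative when FN is costly
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)     # informative when FP is costly
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "precision": precision, "f1": f1}

def overlap_metrics(a, b):
    """DSC and IoU between two pixel index sets:
    ground truth A and predicted region B."""
    a, b = set(a), set(b)
    inter = len(a & b)
    dsc = 2 * inter / (len(a) + len(b))
    iou = inter / len(a | b)
    return dsc, iou

# Example: 80 TP, 90 TN, 10 FP, 20 FN from a hypothetical nodule classifier.
m = classification_metrics(80, 90, 10, 20)
```

Note that DSC and IoU are monotonically related (DSC = 2·IoU/(1 + IoU)), so segmentation papers reporting either one can be compared after conversion.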

Datasets
Lung image datasets play an essential role in evaluating the performance of deep learning-based algorithms for lung nodule classification and detection. Table 1 shows publicly available lung images and clinical datasets for assessing nodule classification and detection performance.

Lung Image Segmentation
Image segmentation aims to recognize the voxel information and external contour of the region of interest. In medical imaging, segmentation is mainly used to segment organs or lesions to quantitatively analyze relevant clinical parameters and provide further guidance for follow-up diagnosis and treatment. For example, target delineation is crucial for surgical image navigation and tumor radiotherapy guidance.
Lung segmentation plays a crucial role in medical image-based lesion detection, including thorax extraction (which removes artifacts) and lung extraction (which identifies the left and right lungs). Several thresholding techniques, such as the fixed threshold [69], iterative threshold [70], Otsu threshold [71], and adaptive threshold [72,73], have been investigated for lung segmentation. A few research groups have investigated segmentation methods based on region and 3D region growing [74,75]. Kass et al. [76] first introduced the active contour model, and Lan et al. [77] applied it to lung segmentation. These techniques are largely manual or hand-tuned and have many disadvantages: they are relatively slow, prone to human error, and hampered by the scarcity of ground truth and by class imbalance.
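Of the thresholding methods listed, Otsu's method [71] is the most commonly reimplemented: it exhaustively searches for the intensity threshold that maximizes the between-class variance of the resulting foreground/background split. A self-contained pure-Python sketch (libraries such as scikit-image provide an equivalent `threshold_otsu`):

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t maximizing between-class variance
    w0 * w1 * (mu0 - mu1)^2 over a flat list of integer intensities."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(i * hist[i] for i in range(levels))
    best_t, best_var = 0, -1.0
    w0 = 0      # background pixel count so far
    sum0 = 0.0  # background intensity sum so far
    for t in range(levels):
        w0 += hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += t * hist[t]
        mu0 = sum0 / w0
        mu1 = (total_sum - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# A strongly bimodal "image" (dark lung field vs. bright tissue):
# the threshold lands at the edge of the dark mode.
t = otsu_threshold([10] * 50 + [200] * 50)
```

Pixels above `t` would be labeled foreground. The weakness the text notes is visible here: the method assumes a clean bimodal histogram, which juxtapleural nodules and imaging artifacts routinely violate, motivating the learned segmentation models discussed next.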
Several deep learning approaches have been investigated for lung segmentation. Wang et al. [78] developed a multi-view CNN (MV-CNN) for lung nodule segmentation, with an average DSC of 77.67% and an average ASD of 0.24 on the LIDC-IDRI dataset. Unlike a conventional CNN, the MV-CNN integrates multiple input views for lung nodule identification. However, it is difficult for the MV-CNN to process 3D CT scans. Thus, a 3D CNN was developed to process volumetric patterns of cancerous nodules [79]. Sun et al. [80] designed a two-stage CAD system to automatically segment lung nodules and reduce FPs: the first stage identifies and segments the nodules, and the second stage reduces FPs. The system was tested on the LIDC-IDRI dataset and evaluated by four experienced radiologists, obtaining an average F1_score of 0.8501 for lung nodule segmentation.
In 2020, Cao et al. [81] developed a dual-branch residual network (DB-ResNet) that simultaneously captures the multi-view and multi-scale features of nodules. The proposed DB-ResNet was evaluated on the LIDC-IDRI dataset and achieved a DSC of 82.74%. Compared to trained radiologists, DB-ResNet provides a higher DSC.
In 2021, Banu et al. [82] proposed an attention-aware weight excitation U-Net (AWEU-Net) architecture in CT images for lung nodule segmentation. The architecture contains two stages: lung nodule detection based on fine-tuned Faster R-CNN and lung nodule segmentation based on the U-Net with position attention-aware weight excitation (PAWE) and channel attention-aware weight excitation (CAWE). The AWEU-Net obtained DSC of 89.79% and 90.35%, IoU of 82.34%, and 83.21% for the LUNA16 and LIDC-IDRI datasets, respectively.
Dutta [83] developed a dense recurrent residual CNN (Dense R2Unet) based on the U-Net and dense interconnections. The proposed method was tested on a lung segmentation dataset, and the results showed that the Dense R2UNet offers better segmentation performance than U-Net and ResUNet.

Lung Nodule Detection
Lung nodule detection is challenging because nodule shape, texture, and size vary greatly, and some non-nodules that often appear in the lungs, such as blood vessels and fibrosis, have a similar appearance to lung nodules. The processing includes two main steps: lung nodule detection and false-positive nodule reduction. Over the past few decades, researchers worldwide have extensively investigated machine learning and deep learning-based approaches for lung nodule detection. Chang et al. [106] applied the support vector machine (SVM) for nodule classification in ultrasound images. Nithila et al. [107] developed a lung nodule detection model based on heuristic search and particle clustering algorithms for network optimization. In 2005, Zhang et al. [108] developed a discrete-time cellular neural network (DTCNN) to detect small (2-10 mm) juxtapleural and non-pleural nodules in CT images. The method obtained a sensitivity of 81.25% at 8.29 FPs per scan for juxtapleural nodule detection and a sensitivity of 83.9% at 3.47 FPs per scan for non-pleural nodule detection.
Hwang et al. [109] investigated the relationship between CT reconstruction and commercial CAD performance for lung nodule detection. They studied LDCT images with three reconstruction kernels (B, C, and L) from 36 human subjects; sensitivities of 82%, 88%, and 82% were obtained for kernels B, C, and L, respectively. Experimental results showed that CAD sensitivity could be elevated by combining data from two different kernels without additional radiation exposure. Young et al. [110] studied the effect of reduced CT dose on the performance of a CAD-based nodule detection model. The CAD system was evaluated on the NLST dataset and obtained sensitivities of 35%, 20%, and 42.5% at the initial dose, 50% dose, and 25% dose, respectively. Tajbakhsh et al. [111] compared massive training ANN (MTANN) and CNN for lung nodule detection and classification; MTANN and CNN obtained AUCs of 0.8806 and 0.7755, respectively, so MTANN performed better than CNN on this task.
Liu et al. [112] developed a cascade CNN for lung nodule detection. The transfer learning model was applied to train the network to detect nodules, and a non-nodule filter was introduced to the detection network to reduce false positives (FP). The proposed architecture effectively reduces FP in the lung nodule detection system. Li et al. [65] developed a lung nodule detection method based on a faster R-CNN network and an FP reduction model in thoracic MR images. In this study, a faster R-CNN was developed to detect lung nodules, and an FP reduction model was developed to reduce FP. The method was tested on the FAHGMU dataset and obtained a sensitivity of 85.2%, with 3.47 FP per scan. Cao et al. [113] developed a two-stage CNN (TSCNN) model for lung nodule detection. In the first stage, a U-Net based on ResDense was applied to detect lung nodules. A 3D CNN-based ensemble learning architecture was proposed in the second stage to reduce false-positive nodules. The proposed model was compared with three existing models, including 3DDP-DenseNet, 3DDP-SeResNet, and 3DMBInceptionNet.
Several 3D CNN models have been developed for lung nodule detection [114][115][116]. Perez et al. [117] developed a 3D CNN to automatically detect lung cancer and tested the model on the LIDC-IDRI dataset. The experimental results showed that the proposed method provides a recall of 99.6% and an AUC of 0.913. Vipparla et al. [118] proposed a multi-patched 3D CNN with a hybrid fusion architecture for lung nodule detection with reduced FP. The method was tested on the LUNA16 dataset and achieved a competition performance metric (CPM) of 0.931. Dutande et al. [119] developed a 2D-3D cascaded CNN architecture and compared it with existing lung nodule detection and segmentation methods. The results showed that the 2D-3D cascaded CNN architecture obtained a DSC of 0.80 for nodule segmentation and a sensitivity of 90.01% for nodule detection. Luo et al. [120] developed a 3D sphere representation-based center-point matching detection network (SCPM-Net) consisting of sphere representation and center-point matching components.
The SCPM-Net was tested on the LUNA16 dataset and achieved an average sensitivity of 89.2% at 7 FPs per image for lung nodule detection. Franck et al. [121] investigated the effects on the performance of deep learning image reconstruction (DLIR) techniques on lung nodule detection in chest CT images. In this study, up to 6 artificial nodules were located within the lung phantom. Images were generated using 50% ASIR-V and DLIR with low (DL-L), medium (DL-M), and high (DL-H) strengths. No statistically significant difference was obtained between these methods (p = 0.987, average AUC: 0.555, 0.561, 0.557, and 0.558 for ASIR-V, DL-L, DL-M, and DL-H). Table 3 shows recently developed lung nodule detection approaches using deep learning techniques. Among these approaches, the co-learning feature fusion CNN obtained the best accuracy of 99.29%, which is higher than other lung nodule detection approaches. Several networks, including 3D Faster R-CNN with U-Net-like encoder, YOLOv2, YOLOv3, VGG-16, DTCNN-ELM, U-Net++, MIXCAPS, and ProCAN, obtained good accuracy (>90%) of lung nodule detection.
A comparative study showed that, for pulmonary nodule classification, CNN achieved a sensitivity and specificity of 73.40% and 73.30%, while DBN achieved 82.20% and 78.70%, respectively [165]. Another comparative study showed that, for nodule classification, CNN achieved a sensitivity and specificity of 76.64% and 89.50%, while ResNet achieved 81.97% and 89.38%, respectively [171]. The combined application of CNN and RNN achieved an accuracy, sensitivity, and specificity of 94.78%, 94.66%, and 95.14%, respectively, in classifying pulmonary nodules [172].
In 2019, Zhang et al. [174] used an ensemble learner of multiple deep CNN in CT images and obtained a classification accuracy of 84% for the LIDC-IDRI dataset. The proposed classifier achieved better performance than other algorithms, such as SVM, multilayer perceptron, and random forests.
Sahu et al. [175] proposed a lightweight multi-section CNN with a classification accuracy of 93.18% for the LIDC-IDRI dataset to improve accuracy. The proposed architecture could be applied to select the representative cross sections determining malignancy that facilitate the interpretation of the results.
Ali et al. [176] developed a system based on transferable texture CNN that consists of nine layers to extract features automatically and classify lung nodules. The proposed method achieved an accuracy of 96.69% ± 0.72%, with an error of 3.30% ± 0.72% and a recall of 97.19% ± 0.57%, respectively.
Marques et al. [177] developed a multi-task CNN to classify malignancy nodules with an AUC of 0.783. Thamilarasi et al. [178] proposed an automatic lung nodule classifier based on CNN with an accuracy of 86.67% for the JSRT dataset. Kawathekar et al. [179] developed a lung nodule classifier using a machine-learning technique with an accuracy of 94% and an F1_score of 92% for the LNDb dataset.
More recently, Radford et al. [180] proposed the deep convolutional GAN (DCGAN), Chuquicusma et al. [181] applied DCGAN to generate realistic lung nodules, and Zhao et al. [182] applied a Forward and Backward GAN (F&BGAN) to classify lung nodules. The F&BGAN was evaluated on the LIDC-IDRI dataset and obtained a best accuracy of 95.24%, a sensitivity of 98.67%, a specificity of 92.47%, and an AUC of 0.98. Table 4 shows recently developed traditional and deep learning-based techniques for classifying lung nodules. Among these methods, CNN variants obtained an accuracy range of 83.4-99.6%, a specificity range of 73.3-95.17%, a sensitivity range of 73.3-96.85%, and an AUC range of 0.7755-0.9936. Several methods achieved high classification accuracy (>95%), including F&BGAN, Inception_ResNet_V2, ResNet152V2, ResNet152V2+GRU, CSO-CADLCC, ProCAN, Net121, ResNet50, DITNN, and an optimal DBN with an opposition-based pity beetle algorithm. DCNN systems obtained a sensitivity of 89.3% [183] and an accuracy of 97.3% [184]. One classifier, developed based on the VGG19 and CNN models, achieved accuracy, sensitivity, specificity, recall, F1_score, AUC, and MCC above 98%.

Forte et al. [209] recently conducted a systematic review and meta-analysis of the diagnostic accuracy of current deep learning approaches for lung cancer diagnosis. The pooled sensitivity and specificity of deep learning approaches for lung cancer detection were 93% and 68%, respectively. The results showed that AI plays an important role in medical imaging, but many research challenges remain.

Challenges and Future Research Directions
This study extensively surveys papers published between 2014 and 2022. Tables 2-4 demonstrate that deep learning-based lung imaging systems have achieved high efficiency and state-of-the-art performance for lung nodule segmentation, detection, and classification using existing medical images. Compared with reinforcement and unsupervised learning techniques, supervised deep learning techniques (such as CNN, Faster R-CNN, Mask R-CNN, and U-Net) are the most popular methods used to develop convolutional networks for lung cancer detection and false-positive reduction.
Previous studies have shown that CT is the most widely used imaging tool in CAD systems for lung cancer diagnosis. Compared to 2D CNNs, 3D CNN architectures are more promising for obtaining representative features of malignant nodules. To date, however, only a few works on 3D CNNs for lung cancer diagnosis have been reported.
Deep learning techniques have achieved good performance in segmentation and classification. However, they still face many unsolved problems in lung cancer detection. First, clinicians have not fully embraced deep learning techniques in everyday clinical practice, partly due to the lack of standardized medical image acquisition protocols; unifying acquisition protocols could mitigate this problem.
Second, deep learning techniques usually require massive numbers of medical images annotated by experienced radiologists to complete training tasks. However, collecting an enormous annotated image dataset is costly and time-consuming, even for experienced radiologists. Several methods have been applied to overcome the scarcity of annotated data. For example, transfer learning is a possible way to address training with small samples. Another possible method is computer synthesis of images, for example with generative adversarial networks. Inadequate data will inevitably affect the accuracy and stability of predictions. Therefore, improving prediction accuracy using weak supervision, transfer learning, and multi-task learning with small labeled datasets is one of the future research directions.
Third, the clinical application of deep learning requires high interpretability, but current deep learning techniques cannot effectively explain the learned features. Many researchers have applied visualization and parameter analysis methods to explain deep learning models. However, these methods still fall short of the interpretable imaging markers that clinical practice requires. Therefore, interpretable deep learning will remain a hot spot in the medical imaging field.
Fourth, improving the robustness of the prediction model is a challenging task. Most deep learning techniques work well only on a single dataset, yet images of the same disease may vary significantly due to different acquisition parameters, equipment, time, and other factors. This leads to poor robustness and generalization in existing deep learning models. Thus, improving model structures and training methods, for example by incorporating ideas from brain cognition, to enhance the generalization ability of deep learning is one of the key future directions.
Finally, much of the current literature translates poorly into clinical practice because non-medical investigators often lack the experience to choose the most relevant clinical outcomes. Most deep learning techniques were developed by non-medical professionals with little or no oversight from radiologists, who, in practice, will use these resources when they become more widely available. As a result, some performance metrics, such as accuracy, AUC, and precision, which have little meaningful clinical application, continue to be used and are often the only summary outcomes reported by some studies. Instead, investigators should strive to report more clinically relevant parameters, such as sensitivity and specificity, because they are independent of disease prevalence and can be more easily translated into practice.
In the future, investigators should pay more attention to the following research directions: (1) developing new convolutional networks and loss functions to improve performance; (2) weakly supervised learning, using the large amounts of incomplete, inaccurate, and ambiguously annotated data in existing medical records to train models; (3) bringing prior clinical knowledge into model training; (4) closer collaboration among radiologists, computer scientists, and engineers to develop more realistic and sensitive models and add more meaning to the research field; (5) moving from single-disease identification to comprehensive disease identification. In clinical examination, only a few cases require solving one well-defined problem; for example, clinicians who detect pulmonary nodules in LDCT also check for other abnormalities, such as emphysema. Solving multiple problems with one network should not reduce performance on specific tasks. In addition, deep learning can be explored in areas where the medical mechanism is not precisely understood; large-scale lung image analysis using deep learning, for instance, is expected to make diagnosing lung diseases more objective.

Conclusions
This paper reviewed recent achievements in deep learning-based approaches for lung nodule segmentation, detection, and classification. CNN is one of the most widely used deep learning techniques for lung disease detection and classification, and CT image datasets are the most frequently used imaging datasets for training networks. The review was based on recent publications (published in 2014 and later). Experimental and clinical trial results demonstrate that deep learning techniques can be superior to trained radiologists. Deep learning is expected to effectively improve lung nodule segmentation, detection, and classification, and with this powerful tool, radiologists can interpret images more accurately. Deep learning algorithms have shown great potential in a series of tasks in the radiology department and have solved many medical problems. However, they still face many difficulties, including large-scale clinical validation, patient privacy protection, and legal accountability. Despite these limitations, given current trends and the rapid development of the medical industry, deep learning is expected to meet a growing demand for accurate diagnosis and treatment in the medical field.