Review

A Survey of Deep Learning for Lung Disease Detection on Medical Images: State-of-the-Art, Taxonomy, Issues and Future Directions

by Stefanus Tao Hwa Kieu 1, Abdullah Bade 1, Mohd Hanafi Ahmad Hijazi 2,* and Hoshang Kolivand 3
1 Faculty of Science and Natural Resources, Universiti Malaysia Sabah, Kota Kinabalu 88400, Sabah, Malaysia
2 Faculty of Computing and Informatics, Universiti Malaysia Sabah, Kota Kinabalu 88400, Sabah, Malaysia
3 School of Computer Science and Mathematics, Liverpool John Moores University, Liverpool L3 3AF, UK
* Author to whom correspondence should be addressed.
J. Imaging 2020, 6(12), 131; https://doi.org/10.3390/jimaging6120131
Submission received: 24 October 2020 / Revised: 25 November 2020 / Accepted: 25 November 2020 / Published: 1 December 2020
(This article belongs to the Special Issue Deep Learning in Medical Image Analysis)

Abstract

Recent developments in deep learning support the identification and classification of lung diseases in medical images. Hence, numerous works on the detection of lung disease using deep learning can be found in the literature. This paper presents a survey of deep learning for lung disease detection in medical images. Only one survey paper on deep learning for lung disease detection has been published in the last five years. However, that survey lacks a taxonomy and an analysis of the trends in recent work. The objectives of this paper are to present a taxonomy of the state-of-the-art deep learning based lung disease detection systems, visualise the trends of recent work in the domain and identify the remaining issues and potential future directions in this domain. Ninety-eight articles published from 2016 to 2020 were considered in this survey. The taxonomy consists of seven attributes that are common in the surveyed articles: image types, features, data augmentation, types of deep learning algorithms, transfer learning, the ensemble of classifiers and types of lung diseases. The presented taxonomy could be used by other researchers to plan their research contributions and activities. The potential future directions suggested could further improve the efficiency and increase the number of deep learning aided lung disease detection applications.

1. Introduction

Lung diseases, also known as respiratory diseases, are diseases of the airways and the other structures of the lungs [1]. Examples of lung disease are pneumonia, tuberculosis and Coronavirus Disease 2019 (COVID-19). According to the Forum of International Respiratory Societies [2], about 334 million people suffer from asthma, and, each year, tuberculosis kills 1.4 million people, 1.6 million people die from lung cancer, while pneumonia also kills millions of people. The COVID-19 pandemic impacted the whole world [3], infecting millions of people and burdening healthcare systems [4]. It is clear that lung diseases are one of the leading causes of death and disability in the world. Early detection plays a key role in increasing the chances of recovery and improving long-term survival rates [5,6]. Traditionally, lung disease can be detected via skin test, blood test, sputum sample test [7], chest X-ray examination and computed tomography (CT) scan examination [8]. Recently, deep learning has shown great potential when applied to medical images for disease detection, including lung disease.
Deep learning is a subfield of machine learning relating to algorithms inspired by the function and structure of the brain. Recent developments in machine learning, particularly deep learning, support the identification, quantification and classification of patterns in medical images [9]. These developments were made possible by the ability of deep learning to learn features directly from data, instead of relying on hand-designed features based on domain-specific knowledge. Deep learning is quickly becoming state of the art, leading to improved performance in numerous medical applications. Consequently, these advancements assist clinicians in detecting and classifying certain medical conditions efficiently [10].
Numerous works on the detection of lung disease using deep learning can be found in the literature. To the best of our knowledge, however, only one survey paper has been published in the last five years to analyse the state-of-the-art work on this topic [11]. In that paper, the history of deep learning and its applications in pulmonary imaging are presented. Major applications of deep learning techniques on several lung diseases, namely pulmonary nodule diseases, pulmonary embolism, pneumonia, and interstitial lung disease, are also described. In addition, the analysis of several common deep learning network structures used in medical image processing is presented. However, their survey is lacking in the presentation of taxonomy and analysis of the trend of recent work. A taxonomy shows relationships between previous work and categorises them based on the identified attributes that could improve reader understanding of the topic. Analysis of trend, on the other hand, provides an overview of the research direction of the topic of interest identified from the previous work. In this paper, a taxonomy of deep learning applications on lung diseases and a trend analysis on the topic are presented. The remaining issues and possible future direction are also described.
The aims of this paper are as follows: (1) produce a taxonomy of the state-of-the-art deep learning based lung disease detection systems; (2) visualise the trends of recent work in the domain; and (3) identify the remaining issues and describe potential future directions in this domain. This paper is organised as follows. Section 2 presents the methodology of conducting this survey. Section 3 describes the general processes of using deep learning to detect lung disease in medical images. Section 4 presents the taxonomy, with detailed explanations of each subtopic within the taxonomy. The analysis of trends, research gaps and future directions of lung disease detection using deep learning is presented in Section 5. Section 6 describes the limitations of the survey. Section 7 concludes this paper.

2. Methodology

In this section, the methodology used to conduct the survey of recent lung disease detection using deep learning is described. Figure 1 shows the flowchart of the methodology used.
First, a suitable database of articles was identified as the main source of reference. The Scopus database was selected as it is one of the largest databases of peer-reviewed scientific articles. However, several significant articles indexed by Google Scholar but not Scopus are also included, based on the number of citations that they have received. Some preprint articles on COVID-19 are also included as the disease has only recently emerged. To ensure that this survey covers only state-of-the-art works, only articles published recently (2016–2020) are considered, although several older but significant articles are included too. Relevant keywords were used to search for all possible deep learning aided lung disease detection articles. The keywords used were “deep learning”, “detection”, “classification”, “CNN”, “lung disease”, “Tuberculosis”, “pneumonia”, “lung cancer”, “COVID-19” and “Coronavirus”. Studies were limited to articles written in English. At the end of this phase, we identified 366 articles.
Second, screening was performed to select only the relevant works. During the screening, only the title and abstract were assessed. The main selection criterion was that deep learning algorithms were applied to detect the relevant diseases. Articles considered not relevant were excluded. Based on the screening performed, 98 of the 366 articles were shortlisted.
Last, an eligibility inspection was conducted on all the screened articles. The same criteria as in the screening phase were used, but they were applied to the full text of the articles. All 98 screened articles passed this phase and were included in this survey. Out of the eligible articles, 90 were published from 2018 onwards. This signifies that lung disease detection using deep learning is still a very active field. Figure 1 shows the numbers of studies identified, screened, assessed for eligibility and included in this survey.

3. The Basic Process to Apply Deep Learning for Lung Disease Detection

In this section, the process by which deep learning is applied to identify lung diseases from medical images is described. The process consists of four phases: image acquisition, preprocessing, training and classification. Lung disease detection generally deals with classifying an image as showing healthy lungs or disease-infected lungs. The lung disease classifier, sometimes known as a model, is obtained via training. Training is the process in which a neural network learns to recognise a class of images. Using deep learning, it is possible to train a model that can classify images into their respective class labels. Therefore, to apply deep learning for lung disease detection, images of lungs with the disease to be classified are first gathered and preprocessed. The neural network is then trained until it is able to recognise the diseases. Finally, new images are classified. Here, images that the model has not seen before are shown to the model, and the model predicts the class of those images. The overview of the process is illustrated in Figure 2.

3.1. Image Acquisition Phase

The first step is to acquire images. To produce a classification model, the computer needs to learn by example, and it needs to view many images to recognise an object. Other types of data, such as time series data and voice data, can also be used to train deep learning models. In the context of the work surveyed in this paper, the relevant data required to detect lung disease are images. Images that could be used include chest X-rays, CT scans, sputum smear microscopy images and histopathology images. The output of this step is a set of images that will later be used to train the model.

3.2. Preprocessing Phase

The second step is preprocessing. Here, the image could be enhanced or modified to improve image quality. Contrast Limited Adaptive Histogram Equalisation (CLAHE) could be performed to increase the contrast of the images [12]. Image modification such as lung segmentation [13] and bone elimination [14] could be used to identify the region of interest (ROI), whereby the detection of the lung disease can then be performed on the ROI. Edge detection could also be used to provide an alternate data representation [15]. Data augmentation could be applied to the images to increase the amount of available data. Feature extraction could also be conducted so that the deep learning model could identify important features to identify a certain object or class. The output of this step is a set of images that have been enhanced in quality, or from which unwanted objects have been removed, and that will later be used in training.
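As a rough illustration of contrast enhancement, the sketch below applies plain global histogram equalisation to a tiny greyscale image stored as nested lists. This is a simpler relative of CLAHE, not CLAHE itself, which additionally works on local tiles and clips the histogram. The function name `equalise_histogram` and the 8-bit intensity range are assumptions made for this example.

```python
def equalise_histogram(image, levels=256):
    """Global histogram equalisation for a greyscale image (rows of ints)."""
    pixels = [p for row in image for p in row]
    total = len(pixels)

    # Count how often each intensity occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution function (CDF) of the intensities.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    # Smallest non-zero CDF value, so that the darkest pixel maps to 0.
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: there is no contrast to stretch.
        return [row[:] for row in image]

    scale = (levels - 1) / (total - cdf_min)
    return [[round((cdf[p] - cdf_min) * scale) for p in row] for row in image]


# A low-contrast patch whose intensities all lie between 100 and 104.
low_contrast = [
    [100, 101, 102],
    [101, 102, 103],
    [102, 103, 104],
]
enhanced = equalise_histogram(low_contrast)
```

After equalisation the same five grey levels are spread across the full 0 to 255 range, which is the effect that contrast enhancement aims for before training.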

3.3. Training Phase

In the third step, namely training, three aspects could be considered. These aspects are the selection of the deep learning algorithm, the usage of transfer learning and the usage of an ensemble. There are numerous deep learning algorithms, for example deep belief network (DBN), multilayer perceptron neural network (MPNN), recurrent neural network (RNN) and convolutional neural network (CNN). Different algorithms have different learning styles, and different types of data work better with certain algorithms. CNN works particularly well with images. The deep learning algorithm should be chosen based on the nature of the data at hand. Transfer learning refers to the transfer of knowledge from one model to another. Ensemble refers to the usage of more than one model during classification. Transfer learning and ensemble are techniques used to reduce training time, improve classification accuracy and reduce overfitting [16]. Further details concerning these two aspects can be found in Section 4.5 and Section 4.6, respectively. The output of this step is the models generated from the data learned.

3.4. Classification Phase

In the fourth and final step, which is classification, the trained model predicts which class an image belongs to. For example, if a model was trained to differentiate X-ray images of healthy lungs and tuberculosis-infected lungs, it should be able to correctly classify new images (images that the model has never seen before) as healthy lungs or tuberculosis-infected lungs. The model gives a probability score for the image. The probability score represents how likely it is that the image belongs to a certain class. At the end of this step, the image is classified based on the probability score given to it by the model.
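The step from model output to class label can be sketched as follows. The example assumes that the model emits one raw score per class and that a softmax converts these scores into probabilities; the text above only states that a probability score is produced, so the softmax, the function names and the example scores are illustrative assumptions.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps the exponentials numerically stable.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores, labels):
    """Return the most probable label together with its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


labels = ["healthy", "tuberculosis"]
# Hypothetical raw scores produced by a trained model for one new image.
label, prob = classify([0.3, 2.1], labels)
```

Here the image is assigned to the class with the highest probability; a different decision threshold could be chosen when, for instance, missing a diseased case is costlier than a false alarm.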

4. The Taxonomy of State-Of-The-Art Work on Lung Disease Detection Using Deep Learning

In this section, a taxonomy of the recent work on lung disease detection using deep learning is presented, which is the first contribution of this paper. The taxonomy is built to summarise and provide a clearer picture of the key concepts and focus of the existing work. Seven attributes were identified for inclusion in the taxonomy. These attributes were chosen as they are prominent and can be found in all the articles being surveyed. The seven attributes included in the taxonomy are image types, features, data augmentation, types of deep learning algorithms, transfer learning, the ensemble of classifiers and types of lung diseases. Section 4.1, Section 4.2, Section 4.3, Section 4.4, Section 4.5, Section 4.6 and Section 4.7 describe each attribute in detail and review the relevant works. Section 4.8 describes the datasets used by the works surveyed. Figure 3 shows the taxonomy of state-of-the-art lung disease detection using deep learning.

4.1. Image Type

In the papers surveyed, four types of images were used to train the model: chest X-rays, CT scans, sputum smear microscopy images and histopathology images. These images are described in detail in Section 4.1.1, Section 4.1.2, Section 4.1.3 and Section 4.1.4. It should be noted that other imaging techniques exist, such as positron emission tomography (PET) and magnetic resonance imaging (MRI) scans. Both PET and MRI scans could also be used to diagnose health conditions and evaluate the effectiveness of ongoing treatment. However, none of the papers surveyed used PET or MRI scans.

4.1.1. Chest X-rays

An X-ray is a diagnostic test that helps clinicians identify and treat medical problems [17]. The most widely performed medical X-ray procedure is the chest X-ray, which produces images of the blood vessels, lungs, airways, heart, spine and chest bones. Traditionally, medical X-ray images were exposed to photographic films, which require processing before they can be viewed. To overcome this problem, digital X-rays are used [18]. Figure 4 shows several examples of chest X-rays with different lung conditions taken from various datasets.
Among the papers surveyed, the majority of them used chest X-rays. For example, X-rays were used for tuberculosis detection [19], pneumonia detection [20], lung cancer detection [14] and COVID-19 detection [21].

4.1.2. CT Scans

A CT scan is a form of radiography that uses computer processing to create sectional images at various planes of depth from images taken around the patient’s body from different angles [22]. The image slices can be shown individually, or they can be stacked to produce a 3D image of the patient, showing the tissues, organs, skeleton and any abnormalities present [23]. CT scan images deliver more detailed information than X-rays. Figure 5 shows examples of CT scan images taken from numerous datasets. CT scans have been used to detect lung disease in numerous works found in the literature, for example for tuberculosis detection [24], lung cancer detection [25] and COVID-19 detection [26].

4.1.3. Sputum Smear Microscopy Images

Sputum is a dense fluid formed in the lungs and airways leading to the lungs. To perform a sputum smear examination, a very thin layer of the sputum sample is placed on a glass slide [27]. Among the papers surveyed, only five used sputum smear microscopy images [28,29,30,31,32]. Figure 6 shows examples of sputum smear microscopy images.

4.1.4. Histopathology Images

Histopathology is the study of the symptoms of a disease through microscopic examination of a biopsy or surgical specimen using glass slides. The sections are dyed with one or more stains to visualise the different components of the tissue [33]. Figure 7 shows a few examples of histopathology images. Among all the papers surveyed, only Coudray et al. [34] used histopathology images.

4.2. Features

In computer vision, features are significant information extracted from images in the form of numerical values that could be used to solve a specific problem [35]. Features might be in the form of specific structures in the image such as points, edges, colour, sizes, shapes or objects. Logically, the types of images affect the quality of the features.
Feature transformation is a process that creates new features using the existing features. These new features may not have the same representation as the original features, but they may have more discriminatory power in a different space than the original space. The purpose of feature transformation is to provide a more useful feature for the machine learning algorithm for object identification. The features used in the surveyed papers include: Gabor, GIST, Local binary patterns (LBP), Tamura texture descriptor, colour and edge direction descriptor (CEDD) [36], Hu moments, colour layout descriptor (CLD), edge histogram descriptor (EHD) [37], primitive length, edge frequency, autocorrelation, shape features, size, orientation, bounding box, eccentricity, extent, centroid, scale-invariant feature transform (SIFT), regional properties area and speeded up robust features (SURF) [38]. Other feature representations in terms of histograms include pyramid histogram of oriented gradients (PHOG), histogram of oriented gradients (HOG) [39], intensity histograms (IH), shape descriptor histograms (SD), gradient magnitude histograms (GM), curvature descriptor histograms (CD) and fuzzy colour and texture histogram (FCTH). Some studies also performed lung segmentation before training their models (e.g., [13,14,36]).
From the literature, a majority of the works surveyed used features that are automatically extracted from CNN. CNN can automatically learn and extract features, discarding the need for manual feature generation [40].

4.3. Data Augmentation

In deep learning, it is very important to have a large training dataset, as more images generally help improve training accuracy. Even a weak algorithm with a large amount of data can be more accurate than a strong algorithm with a modest amount of data [41]. A related obstacle is imbalanced classes. In binary classification training, if one class has many more samples than the other, the resulting model would be biased [6]. Deep learning algorithms perform best when the number of samples in each class is equal or balanced.
One way to increase the training dataset without obtaining new images is to use image augmentation. Image augmentation creates variations of the original images. This is achieved by performing different methods of processing, such as rotations, flips, translations, zooms and adding noise [42]. Figure 8 shows various examples of images after image augmentation.
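A minimal sketch of such transformations on a greyscale image held as nested lists is given below. The helper names, the noise range and the choice of four variants per image are assumptions made for illustration; in practice an image library would usually perform these operations.

```python
import random


def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Mirror the image top to bottom."""
    return [row[:] for row in image[::-1]]


def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def add_noise(image, amount, rng, low=0, high=255):
    """Add uniform integer noise in [-amount, amount], clipped to [low, high]."""
    return [
        [min(high, max(low, p + rng.randint(-amount, amount))) for p in row]
        for row in image
    ]


def augment(image, rng):
    """Return several variants of one image; the original is left untouched."""
    return [
        flip_horizontal(image),
        flip_vertical(image),
        rotate_90_clockwise(image),
        add_noise(image, 5, rng),
    ]


original = [[1, 2, 3], [4, 5, 6]]
variants = augment(original, random.Random(0))
```

Each variant keeps the label of the original image, so one labelled image yields several training samples. Whether a given transformation is appropriate depends on the modality; a flip, for example, changes the side on which the heart appears in a chest X-ray.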
Data augmentation can also help increase the amount of relevant data in the dataset. For example, consider a car dataset with two labels, X and Y. One subset of the dataset contains images of cars of label X, but all the cars are facing left. The other subset contains images of cars of label Y, but all the cars are facing right. After training, a test image of a label Y car facing left is fed into the model, and the model labels the car as X. The prediction is wrong because the neural network searches for the most obvious features that distinguish one class from another, which in this case is the direction the car faces. To prevent this, a simple solution is to flip the images in the existing dataset horizontally such that they face the other side. Through augmentation, we may introduce relevant features and patterns, essentially boosting overall performance.
Data augmentation also helps prevent overfitting. Overfitting refers to a case where a network learns a very high variance function, such as one that models the training data perfectly. Data augmentation addresses the issue of overfitting by exposing the model to more diverse data [43]. This diversity in data reduces variance and improves the generalisation of the model.
However, data augmentation cannot overcome all biases present in a small dataset [43]. Other disadvantages of data augmentation include additional training time, transformation computing costs and additional memory costs.

4.4. Types of Deep Learning Algorithm

The most common deep learning algorithm, CNN, is especially useful for finding patterns in images. Similar to the neural networks of the human brain, CNNs consist of neurons with trainable biases and weights. Each neuron receives several inputs. Then, a weighted sum over the inputs is computed. The weighted sum is then passed to an activation function, and an output is produced. The difference between CNN and other neural networks is that CNN has convolution layers. Figure 9 shows an example of a CNN architecture [44]. A CNN consists of multiple layers, and the four main types of layers are the convolutional layer, the rectified linear unit (ReLU) layer, the pooling layer and the fully-connected layer. The convolutional layer performs an operation called a “convolution”. Convolution is a linear operation involving the multiplication of a set of weights with the input. The set of weights is called a kernel or a filter. The input data are larger than the filter. A filter-sized section of the input is multiplied elementwise with the filter, and the products are summed, which amounts to a dot product and results in a single value. The ReLU layer applies an elementwise activation function, max(0, x), to the output of the previous layer. The pooling layer gradually reduces the spatial size of the representation to lessen the number of parameters and computations in the network, thus controlling overfitting. More details of CNN can be found in [44,45].
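The core operations of these layers can be sketched in a few lines of plain Python. The example below assumes a stride of 1, no padding and a 2 × 2 max pooling window; the filter values are hand-picked for illustration, whereas in a real CNN they are learned during training.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding).

    As in most CNN implementations, the kernel is not flipped, so each
    output value is the dot product of the kernel with one image patch.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for a in range(kh):
                for b in range(kw):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        output.append(row)
    return output


def relu(feature_map):
    """Elementwise max(0, x)."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool2x2(feature_map):
    """Keep the largest value of each non-overlapping 2 x 2 block."""
    pooled = []
    for i in range(0, len(feature_map) - 1, 2):
        row = []
        for j in range(0, len(feature_map[0]) - 1, 2):
            row.append(max(
                feature_map[i][j], feature_map[i][j + 1],
                feature_map[i + 1][j], feature_map[i + 1][j + 1],
            ))
        pooled.append(row)
    return pooled


# A 5 x 5 image with a dark-to-bright vertical edge.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# A hand-picked filter that responds to dark-to-bright transitions.
kernel = [[-1, 1], [-1, 1]]

feature_map = relu(convolve2d(image, kernel))
pooled = max_pool2x2(feature_map)
```

The 5 × 5 input yields a 4 × 4 feature map that is non-zero only where the edge lies, and pooling halves each spatial dimension while keeping that response.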
CNN generally has two components when learning, which are feature extraction and classification. In the feature extraction stage, convolution is applied to the input data using a filter or kernel, and a feature map is generated. In the classification stage, the CNN computes the probability that the image belongs to a particular class or label. CNN is especially useful for image classification and recognition as it automatically learns features without needing manual feature extraction [40]. CNN can also be retrained and applied to a different domain using transfer learning [46]. Transfer learning has been shown to produce better classification results [19].
Another deep learning algorithm is DBN. DBN can be defined as a stack of restricted Boltzmann machines (RBM) [47]. Each layer of the DBN, except for the first and final layers, has two functions: it serves as the hidden layer for the nodes that come before it, and as the input layer for the nodes that come after it. To train a DBN, the first RBM is trained to reproduce its input as accurately as possible. Then, the hidden layer of the first RBM is treated as the visible layer for the second one, and the second RBM is trained using the outputs from the first RBM. This process is repeated until every layer of the network is trained. After this initial training, the DBN has created a model that can detect patterns in the data. DBN can be used to recognise objects in images, video sequences and motion-capture data. More details of DBN can be found in [31,48].
One more example of an approach used in the papers surveyed is the bag of words (BOW) model. BOW is a method to extract features from text for use in modelling. In BOW, the number of appearances of each word in a document is counted, the frequency of each word is examined to identify the keywords of the document, and a frequency histogram is made. This concept is similar to the bag of visual words (BOVW), sometimes referred to as bag-of-features. In BOVW, image features are considered as the “words”. Image features are unique patterns found in an image. The general idea of BOVW is to represent an image as a set of features, where each feature contains keypoints and descriptors. Keypoints are the most noticeable points in an image, such that, even if the image is rotated, shrunk or enlarged, its keypoints are always the same. A descriptor is the description of the keypoint. Keypoints and descriptors are used to construct vocabularies and represent each image as a frequency histogram of features. From the frequency histogram, one can find other similar images or predict the class of the image. Lopes and Valiati proposed Bag of CNN features to classify tuberculosis [19].
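The histogram construction in BOVW can be sketched as follows. The example assumes that a vocabulary of “visual words” already exists (in practice it is commonly obtained by clustering descriptors from the training images, which is an assumption not detailed above) and that descriptors are short numeric vectors; both the vocabulary and the descriptors here are made up for illustration.

```python
def nearest_word(descriptor, vocabulary):
    """Index of the visual word closest to the descriptor (squared Euclidean)."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(
        range(len(vocabulary)),
        key=lambda k: squared_distance(descriptor, vocabulary[k]),
    )


def bovw_histogram(descriptors, vocabulary):
    """Normalised frequency histogram of visual words for one image."""
    counts = [0] * len(vocabulary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, vocabulary)] += 1
    total = sum(counts)
    # An image without descriptors yields an all-zero histogram.
    return [c / total for c in counts] if total else counts


# Hypothetical vocabulary of three visual words in a 2-D descriptor space.
vocabulary = [(0, 0), (10, 10), (0, 10)]
# Hypothetical descriptors extracted at the keypoints of one image.
descriptors = [(1, 0), (9, 9), (0, 1), (1, 9)]

histogram = bovw_histogram(descriptors, vocabulary)
```

The resulting fixed-length histogram can then be compared with those of other images or passed to a classifier, regardless of how many keypoints each image contains.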

4.5. Transfer Learning

Transfer learning emerged as a popular method in computer vision because it allows accurate models to be built [49]. With transfer learning, a model learned from a domain can be re-used on a different domain. Transfer learning can be performed with or without a pre-trained model.
A pre-trained model is a model developed to solve a similar task. Instead of creating a model from scratch to solve a similar task, the model trained on another problem is used as a starting point. Even though a pre-trained model is trained on a task which is different from the current task, the features learned are, in most cases, found to be useful for the new task. The objective of training a deep learning model is to find the correct weights for the network through numerous forward and backward iterations. By using pre-trained models that have been previously trained on large datasets, the weights and architecture obtained can be applied to the current problem. One of the advantages of a pre-trained model is the reduced cost of training for the new model [50]. This is because pre-trained weights are used, and the model only has to learn the weights of the last few layers.
Many CNN architectures are pre-trained on ImageNet [51]. The images were gathered from the internet and labelled by human labellers using Amazon’s Mechanical Turk crowd-sourcing tool. The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) uses a subset of ImageNet with approximately 1000 images in each of 1000 classes. Altogether, there are approximately 1.2 million training images, 50,000 validation images and 150,000 testing images.
Transfer learning can be used in two ways: (i) fine-tuning; or (ii) using CNN as a feature extractor. In fine-tuning, the weights of the pre-trained CNN model are preserved on some of the layers and tuned in the others [52]. Usually, the weights of the initial layers of the model are frozen while only the higher layers are retrained. This is because the features obtained from the first layers are generic (e.g., edge detectors or colour blob detectors) and applicable to other tasks. The top-level layers of the pre-trained models are retrained so that the model learns high-level features specific to the new dataset. This method is typically recommended if the training dataset is large and very similar to the original dataset that the pre-trained model was trained on. Alternatively, the CNN can be used as a feature extractor. This is done by removing the last fully-connected layer (the one which outputs the probabilities for being in each of the 1000 classes from ImageNet) and then using the rest of the network as a fixed feature extractor for the new dataset [53]. A classifier is then trained on the extracted features. For tasks where only a small dataset is available, it is usually recommended to take advantage of features learned by a model trained on a larger dataset in the same domain.
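The feature-extractor variant can be illustrated without any deep learning framework. In the sketch below, `frozen_features` is a hypothetical stand-in for a pre-trained CNN with its last layer removed: it is never updated, and only a small logistic regression classifier is trained on its outputs. The two hand-written features, the toy images and the hyperparameters are all assumptions chosen to keep the example small.

```python
import math


def frozen_features(image):
    """Stand-in for a pre-trained network without its last layer (never updated)."""
    flat = [p for row in image for p in row]
    mean = sum(flat) / len(flat)
    # Mean absolute difference between horizontally neighbouring pixels.
    diffs = [abs(row[j + 1] - row[j]) for row in image for j in range(len(row) - 1)]
    return [mean, sum(diffs) / len(diffs)]


def train_head(features, labels, lr=0.5, epochs=500):
    """Train only the new classifier (logistic regression) on frozen features."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            error = 1.0 / (1.0 + math.exp(-z)) - y
            w = [wi - lr * error * xi for wi, xi in zip(w, x)]
            b -= lr * error
    return w, b


def predict(image, w, b):
    """Probability of class 1 for a new image."""
    x = frozen_features(image)
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


# Toy data: class 0 images are smooth, class 1 images are high in contrast.
smooth = [[[0.5, 0.5], [0.5, 0.5]], [[0.4, 0.4], [0.4, 0.4]], [[0.6, 0.6], [0.6, 0.6]]]
textured = [[[0, 1], [1, 0]], [[1, 0], [0, 1]], [[1, 0], [1, 0]]]
images = smooth + textured
labels = [0, 0, 0, 1, 1, 1]

w, b = train_head([frozen_features(img) for img in images], labels)
p_smooth = predict([[0.45, 0.45], [0.45, 0.45]], w, b)
p_textured = predict([[0.1, 0.9], [0.9, 0.1]], w, b)
```

Fine-tuning differs in that some of the extractor's own weights would also be updated, typically those of the higher layers.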
There are several issues that need to be considered when using transfer learning: (i) ensuring that the pre-trained model selected has been trained on a similar dataset as the new target dataset; and (ii) using a lower learning rate for CNN weights that are being fine-tuned, because the CNN weights are expected to be relatively good, and we do not wish to distort them too quickly and too much [53].

4.6. Ensemble of Classifiers

When more than one classifier is combined to make a prediction, this is known as ensemble classification [16]. An ensemble decreases the variance of predictions and can therefore make predictions that are more accurate than those of an individual model. From work found in the literature, the ensemble techniques used include majority voting, probability score averaging and stacking.
In majority voting, every model makes a prediction for each test instance, or, in other words, votes for a class label, and the final prediction is the label that received the most votes [54]. An alternate version of majority voting is weighted majority voting, in which the votes of certain models are deemed more important than others. For example, majority voting was used by Chouhan et al. [55].
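Both voting schemes can be sketched briefly. The labels and weights below are made up for illustration, and ties are resolved in favour of the label encountered first, which is an arbitrary choice of this sketch.

```python
from collections import Counter, defaultdict


def majority_vote(predictions):
    """Return the label predicted by the most models.

    Ties go to the label that appears first in `predictions`.
    """
    return Counter(predictions).most_common(1)[0][0]


def weighted_majority_vote(predictions, weights):
    """Return the label with the largest total weight of votes."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Hypothetical predictions of three models for one chest X-ray.
votes = ["pneumonia", "normal", "pneumonia"]
plain = majority_vote(votes)
# The second model is trusted far more than the other two.
weighted = weighted_majority_vote(votes, [0.2, 0.7, 0.1])
```

With equal votes the two models predicting pneumonia win, whereas the weighted scheme follows the single, more trusted model.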
In probability score averaging, the prediction scores of each model are added up and divided by the number of models involved [56]. An alternate version of this is weighted averaging, where the prediction score of each model is multiplied by the weight, and then their average is calculated. Examples of works which used probability score averaging are found in [15,57].
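A minimal sketch of both averaging schemes follows. It assumes that each model outputs one probability per class, and the weighted version divides by the sum of the weights so that the result remains a valid probability distribution; that normalisation and the example values are assumptions of this sketch.

```python
def average_scores(model_scores):
    """Average the per-class probabilities of several models."""
    n_models = len(model_scores)
    return [sum(column) / n_models for column in zip(*model_scores)]


def weighted_average_scores(model_scores, weights):
    """Weighted average of per-class probabilities, normalised by total weight."""
    total_weight = sum(weights)
    return [
        sum(w * s for w, s in zip(weights, column)) / total_weight
        for column in zip(*model_scores)
    ]


# Hypothetical [P(normal), P(tuberculosis)] outputs of three models.
scores = [
    [0.6, 0.4],
    [0.8, 0.2],
    [0.3, 0.7],
]
averaged = average_scores(scores)
weighted = weighted_average_scores(scores, [1, 1, 3])

# The ensemble predicts the class with the highest combined score.
predicted_class = max(range(len(averaged)), key=averaged.__getitem__)
weighted_class = max(range(len(weighted)), key=weighted.__getitem__)
```

In this example plain averaging favours the first class, while giving the third model three times the weight shifts the decision to the second class.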
In stacking ensemble, an algorithm receives the outputs of weaker models as input and tries to learn how to best combine the input predictions to provide a better output prediction [58]. For example, stacking ensemble was used by Rajaraman et al. [12].
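Stacking can be sketched with a logistic regression meta-learner that takes the base models' probability outputs as its input. The choice of meta-learner, the toy outputs of two hypothetical base models and the hyperparameters are assumptions made for illustration; the works surveyed may use different combiners.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_meta_learner(base_outputs, labels, lr=0.5, epochs=1000):
    """Fit logistic regression on base model outputs with gradient descent."""
    w = [0.0] * len(base_outputs[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(base_outputs, labels):
            error = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            w = [wi - lr * error * xi for wi, xi in zip(w, x)]
            b -= lr * error
    return w, b


def stacked_predict(outputs, w, b):
    """Combine the base models' outputs for one image into one probability."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, outputs)) + b)


# Each row holds the disease probabilities given by two hypothetical base
# models for one held-out image. Model A is informative, model B is not.
base_outputs = [
    [0.9, 0.4], [0.8, 0.6], [0.2, 0.5],
    [0.1, 0.4], [0.7, 0.3], [0.3, 0.7],
]
labels = [1, 1, 0, 0, 1, 0]

w, b = train_meta_learner(base_outputs, labels)
p_disease = stacked_predict([0.85, 0.5], w, b)
p_healthy = stacked_predict([0.15, 0.45], w, b)
```

The meta-learner ends up relying mainly on the informative model. To avoid overfitting, the base outputs used for training it should come from images the base models were not trained on.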

4.7. Type of Disease

In this section, the deep learning techniques applied for detecting tuberculosis, pneumonia, lung cancer and COVID-19 are discussed in greater detail in Section 4.7.1, Section 4.7.2, Section 4.7.3 and Section 4.7.4, respectively. The first three diseases were considered as they are the most common lung-related causes of critical illness and death worldwide [2], while COVID-19 is an ongoing pandemic [3]. We also found that most of the existing work was directed at detecting these specific lung-related diseases.

4.7.1. Tuberculosis

Tuberculosis is a disease caused by Mycobacterium tuberculosis bacteria. According to the World Health Organisation, tuberculosis is among the ten most common causes of death in the world [59]. Tuberculosis infected 10 million people and killed 1.6 million in 2017. Early detection of tuberculosis is essential to increase the chances of recovery [5].
Two studies used Computer-Aided Detection for Tuberculosis (CAD4TB) for tuberculosis detection [60,61]. CAD4TB is a tool developed by Delft Imaging Systems in cooperation with the Radboud University Nijmegen and the Lung Institute in Cape Town. CAD4TB works by obtaining the patient’s chest X-ray, analysing the image via CAD4TB cloud server or CAD4TB box computer, generating a heat map of the patient’s lung and displaying an abnormality score from 0 to 100. Murphy et al. [60] showed that CAD4TB v6 is an accurate system, reaching the level of expert human readers. A technique for automated tuberculosis screening by combining X-ray-based computer-aided detection (CAD) and clinical information was introduced by Melendez et al. [61]. They combined automatic chest X-ray scoring by CAD with clinical information. This combination improved accuracies and specificities compared to the use of either type of information alone.
In the literature, several works use CNN to classify tuberculosis. A method that incorporated demographic information, such as age, gender and weight, to improve CNN’s performance was presented by Heo et al. [62]. Results indicate that CNN, including the demographic variables, has a higher area under the receiver operating characteristic curve (AUC) score and greater sensitivity than CNN based on chest X-ray images only. A simple convolutional neural network for tuberculosis detection was proposed by Pasa et al. [63]. The proposed approach is found to be more efficient than previous models while retaining their accuracy: it significantly reduced the memory and computational requirements without sacrificing the classification performance. Another CNN-based model has been presented to classify different categories of tuberculosis [64]. A CNN model is trained on the region-based global and local features to generate new features. A support vector machine (SVM) classifier was then applied for tuberculosis manifestations recognition. CNN has also been used to classify tuberculosis in [65,66,67]. Ul Abideen et al. [68] used a Bayesian-based CNN that exploits the model uncertainty and Bayesian confidence to improve the accuracy of tuberculosis identification. In other work, a deep CNN algorithm named deep learning-based automatic detection (DLAD), which contains 27 layers with 12 residual connections, was developed for tuberculosis classification [69]. DLAD shows outstanding performance in tuberculosis detection when applied on chest X-rays, obtaining results better than physicians and thoracic radiologists.
Lopes and Valiati proposed Bag of CNN features to classify tuberculosis [19], where feature extraction is performed by ResNet, VggNet and GoogLeNet. Each chest X-ray is separated into subregions whose size is equal to the input layer of the networks. Each subregion is regarded as a “feature”, while each X-ray is a “bag”.
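The subregion splitting described above can be sketched as follows. This is only an illustration: it assumes non-overlapping square tiles, a hypothetical network input size of 224 pixels, and that any remainder at the right and bottom borders is discarded. The source does not state whether the subregions in [19] overlap.

```python
def split_into_subregions(image_height, image_width, patch_size):
    """Return (top, left, bottom, right) boxes that tile the image.

    Tiles do not overlap; pixels that do not fill a whole tile are dropped.
    """
    boxes = []
    for top in range(0, image_height - patch_size + 1, patch_size):
        for left in range(0, image_width - patch_size + 1, patch_size):
            boxes.append((top, left, top + patch_size, left + patch_size))
    return boxes


# Hypothetical chest X-ray of 896 x 672 pixels and a 224 x 224 network input.
boxes = split_into_subregions(896, 672, 224)
```

Each box would then be cropped and passed through the CNN, so that one X-ray (the "bag") yields one feature vector per subregion.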
Several works that utilised transfer learning are described in this paragraph. Hwang et al. obtained an accuracy of 90.3% and AUC of 0.964 using transfer learning from ImageNet and training on a dataset of 10848 chest X-rays [70]. Pre-trained GoogLeNet and AlexNet were used to perform pulmonary tuberculosis classification by Lakhani and Sundaram [57], who concluded that higher accuracy was achieved when using the pre-trained model. Their pre-trained AlexNet achieved an AUC of 0.98 and their pre-trained GoogLeNet achieved an AUC of 0.97. Lopes and Valiati used pre-trained GoogLeNet, ResNet and VggNet architectures as feature extractors and the SVM classifier to classify tuberculosis [19]. They achieved AUC of 0.900–0.912. Fine-tuned ResNet-50, ResNet-101, ResNet-512, VGG16, VGG19 and AlexNet were used by Islam et al. to classify tuberculosis. These models achieved an AUC of 0.85–0.91 [71]. Instead of using networks pre-trained on ImageNet, pre-training can be performed on other datasets, such as the NIH-14 dataset [72]. This dataset contains an assortment of diseases (which does not include tuberculosis) and is from the same modality as that of the data under consideration for tuberculosis. Experiments show that the features learned from the NIH dataset are useful for identifying tuberculosis. A study performed data augmentation and then compared the performances of three different pre-trained models to classify tuberculosis [73]. The results show that suitable data augmentation methods were able to raise the accuracies of CNNs. Transfer learning was also used by Abbas and Abdelsamea [74], Karnkawinpong and Limpiyakorn [75] and Liu et al. [76]. A coarse-to-fine transfer learning was applied by Yadav et al. [77]. First, the datasets are split according to the resolution and quality of the images. Then, transfer learning is applied to the low-resolution dataset first, followed by the high-resolution dataset. In this case, the model was first trained on the low-resolution NIH dataset, and then trained on the high-resolution Shenzhen and Montgomery datasets. Sahlol et al. [78] used a CNN as a fixed feature extractor and Artificial Ecosystem-Based Optimisation to select the optimal subset of relevant features. KNN was used as the classifier.
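The last pipeline in this paragraph (fixed CNN features, feature selection, then KNN) can be sketched in a few lines. The sketch assumes the feature vectors have already been extracted by a pre-trained CNN and that the indices of the selected features are given. In [78] these indices are chosen by Artificial Ecosystem-Based Optimisation, which is not reproduced here. All names and values are hypothetical.

```python
import math
from collections import Counter


def select_features(vector, selected_indices):
    """Keep only the feature dimensions chosen by the selection step."""
    return [vector[i] for i in selected_indices]


def knn_predict(train_features, train_labels, query, k=3):
    """Classify `query` by majority label among its k nearest neighbours."""
    distances = []
    for features, label in zip(train_features, train_labels):
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(features, query)))
        distances.append((dist, label))
    distances.sort()
    nearest = [label for _, label in distances[:k]]
    return Counter(nearest).most_common(1)[0][0]


# Hypothetical 3-dimensional CNN features; dimension 2 is treated as irrelevant.
raw_features = [
    (0.10, 0.20, 0.9), (0.20, 0.10, 0.1), (0.15, 0.15, 0.5),
    (0.90, 0.80, 0.2), (0.80, 0.90, 0.8), (0.85, 0.95, 0.4),
]
labels = ["normal", "normal", "normal",
          "tuberculosis", "tuberculosis", "tuberculosis"]
selected = [0, 1]  # assumed output of the feature selection step

train = [select_features(v, selected) for v in raw_features]
prediction = knn_predict(train, labels, select_features((0.8, 0.8, 0.0), selected))
```

Because the CNN weights stay fixed, only the selection step and the classifier have to be fitted on the tuberculosis data.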
Several works that utilised ensemble are described in this paragraph. An ensemble method using the weighted averages of the probability scores for the AlexNet and GoogLeNet algorithms was used by Lakhani and Sundaram [57]. In [79], ensemble by weighted averages of probability scores is used. An ensemble of six CNNs was developed by Islam et al. [71]. The ensemble models were generated by calculating the simple averaging of the probability predictions given by every single model. Another ensemble classifier was created by combining the classifier from the Simple CNN Feature Extraction and a classifier from Bag of CNN features proposals [19]. Three classifiers were trained, using the features from ResNet, GoogLeNet and VggNet, respectively. The Simple Features Ensemble combines all three classifiers, and the output is obtained through a simple soft-voting scheme. A stacking ensemble for tuberculosis detection was proposed by Rajaraman et al. [12]. An ensemble generated via a feature-level fusion of neural network models was also used to classify tuberculosis [80]. Three models were employed: the DenseNet, ResNet and Inception-ResNet. As such, the ensemble was called RID network. Features were extracted using the RID network, and SVM was used as a classifier. Tuberculosis classification was also executed using another ensemble of three regular architectures: ResNet, AlexNet and GoogLeNet [79]. Each architecture was trained from scratch, and different optimal hyper-parameter values were used. The sensitivity, specificity and accuracy of the ensemble were higher than when each of the regular architectures was used independently. The authors of [15,81] performed a probability score averaging ensemble of CNNs trained on features extracted from different types of images: the enhanced chest X-ray images and the edge-detected images of the chest X-ray. Rajaraman and Antani [82] studied and compared various ensemble methods that include majority voting and stacking. Results show that stacking ensemble achieved the highest classification accuracy.
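The simpler combination rules mentioned in this paragraph (weighted averaging of probability scores, soft voting and majority voting) can be sketched as below. The probabilities and weights are made-up values for illustration and do not come from any of the cited studies.

```python
def weighted_average(probabilities, weights):
    """Weighted average of the positive-class probabilities of several models."""
    return sum(p * w for p, w in zip(probabilities, weights)) / sum(weights)


def soft_vote(class_probabilities):
    """Average per-class probabilities across models.

    Returns the index of the winning class and the averaged probabilities.
    """
    n_models = len(class_probabilities)
    n_classes = len(class_probabilities[0])
    averaged = [
        sum(model[c] for model in class_probabilities) / n_models
        for c in range(n_classes)
    ]
    return averaged.index(max(averaged)), averaged


def majority_vote(predicted_labels):
    """Return the label predicted by the largest number of models."""
    return max(set(predicted_labels), key=predicted_labels.count)


# Two models, with the first trusted more than the second.
score = weighted_average([0.9, 0.6], [0.7, 0.3])

# Three models, two classes (index 0 = normal, index 1 = tuberculosis).
winner, averaged = soft_vote([[0.6, 0.4], [0.3, 0.7], [0.8, 0.2]])

# Hard labels from three models.
label = majority_vote([1, 0, 1])
```

Note that soft voting and majority voting can disagree on the same models, since soft voting takes the confidence of each prediction into account while majority voting only counts labels.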
Other techniques used to classify tuberculosis images include k-Nearest Neighbour (kNN), sequential minimal optimisation and simple linear regression [38]. A Multiple-Instance Learning-based approach was also attempted [83]. The advantage of this method is the lower labelling detail required during optimisation. In addition, the minimal supervision required allows easy retraining of a previously optimised system. One tuberculosis detection system uses ViDi Systems for image analysis of chest X-rays [84]. ViDi is an industrial-grade deep learning image analysis software developed by COGNEX. ViDi has shown feasible performance in the detection of tuberculosis. The authors of [36] introduced a fully automatic frontal chest screening system that is capable of detecting tuberculosis-infected lungs. This method begins with the segmentation of the lung. Then, features are extracted from the segmented images. Examples of features include shape and curvature histograms. Finally, a classifier was used to detect the disease.
For tuberculosis detection works based on CT scans, a method called AE-CNN was proposed [85]. An AE-CNN block was formed by combining the feature extraction of CNN and the unsupervised features of AutoEncoder. The model then analyses the region of interest within the image to perform the classification of tuberculosis. A research study explores the use of CT pulmonary images to diagnose and classify tuberculosis at five levels of severity to track treatment effectiveness [24]. The tuberculosis abnormalities only occupy limited regions in the CT image, and the dataset is quite small. Therefore, depth-ResNet was proposed. Depth-ResNet is a 3D block-based ResNet combined with the injection of depth information at each layer. In an attempt to automate the detection of tuberculosis-related lung deformities without sacrificing accuracy, advanced AI algorithms were studied to draw clinically actionable hypotheses [86]. This approach involves thorough image processing, subsequently performing feature extraction using TensorFlow and 3D CNN to further augment the metadata with the features extracted from the image data, and finally performing six class binary classification using the random forest. Another attempt for this problem was proposed by Zunair et al. [87]. They proposed a 16-layer 3D convolutional neural network with a slice selection. The goal is to estimate the tuberculosis severity based on the CT image. An integrated method based on optical flow and a characterisation method called Activity Description Vector (ADV) was presented for the classification of chest CT scan images affected by different types of tuberculosis [88]. The important point of this technique is the interpretation of the set of cross-sectional chest images produced by CT scan, not as a volume but as a series of video images. This technique can extract movement descriptors capable of classifying tuberculosis affections by analysing deformations or movements generated in these video series. The idea of optical flow refers to the approximation of displacements of intensity patterns. In short, the ADV vector describes the activity in image series by counting for each region of the image the movements made in four directions of the 2D space.
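The counting step behind the ADV can be sketched as follows. This is a simplified illustration rather than the exact descriptor of [88]: it assumes that an optical flow field is already available as one (dx, dy) displacement per pixel, that the image is divided into a regular grid of regions, and that displacements below a hypothetical magnitude threshold are ignored.

```python
def activity_description_vector(flow, grid_rows, grid_cols, min_magnitude=0.5):
    """Count, for each grid region, movements to the right, left, up and down.

    `flow` is a 2D list of (dx, dy) displacements, one per pixel, in image
    coordinates (dy > 0 points down).
    """
    height, width = len(flow), len(flow[0])
    counts = [
        [{"right": 0, "left": 0, "up": 0, "down": 0} for _ in range(grid_cols)]
        for _ in range(grid_rows)
    ]
    for y in range(height):
        for x in range(width):
            dx, dy = flow[y][x]
            # Map the pixel to the grid region that contains it.
            region = counts[y * grid_rows // height][x * grid_cols // width]
            if dx > min_magnitude:
                region["right"] += 1
            elif dx < -min_magnitude:
                region["left"] += 1
            if dy > min_magnitude:
                region["down"] += 1
            elif dy < -min_magnitude:
                region["up"] += 1
    return counts


# Hypothetical 2 x 4 flow field between two consecutive CT slices.
flow = [
    [(1, 0), (1, 0), (0, -1), (0, 0)],
    [(-1, 0), (1, 1), (0, -1), (0, -1)],
]
adv = activity_description_vector(flow, grid_rows=1, grid_cols=2)
```

Accumulating such counts over all consecutive slice pairs gives one fixed-length descriptor per CT scan, which can then be passed to a classifier.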
For tuberculosis detection works based on sputum microscopy images, CNN was used for the detection and localisation of drug-sensitive tuberculosis bacilli in sputum microscopy images [29]. This method automatically localises bacilli in each view-field (a patch of the whole slide). A study found that, when training a CNN on three different image versions, namely RGB, R-G and grayscale, the best performance was achieved when using R-G images [28]. Image binarisation can also be used for preprocessing before the data are fed into a CNN [30]. Image binarisation is a segmentation method to classify the foreground and background of the microscopic sputum smear images. The segmented foreground consists of single bacilli, touching bacilli and other artefacts. A trained CNN is then given the foreground objects, and the CNN classifies the objects into bacilli and non-bacilli. Another tuberculosis detection system automatically acquires all view-fields using a motorised microscopic stage [32]. After that, the data are delivered to the recognition system. A customised Inception V3 DeepNet model is used to learn from the pre-trained weights of Inception V3. Afterwards, the data are classified using SVM. DBN was also used to detect tuberculosis bacilli present in the stained microscopic images of sputum [31]. For segmentation, the Channel Area Thresholding algorithm is used. Location-oriented histogram and speeded-up robust features (SURF) algorithm were used to extract the intensity-based local bacilli features. DBN is then used to classify the bacilli objects. Table 1 shows the summary of papers for tuberculosis detection using deep learning.
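Two of the preprocessing ideas above, the R-G image version from [28] and image binarisation from [30], can be sketched as below. The two steps come from different studies and are combined here only for illustration. Clamping negative differences to zero and using a fixed global threshold are assumptions, since the source does not specify how either operation was implemented.

```python
def red_minus_green(rgb_image):
    """Build an R-G image: red channel minus green channel, clamped at zero."""
    return [[max(r - g, 0) for (r, g, b) in row] for row in rgb_image]


def binarise(gray_image, threshold):
    """Mark pixels above the threshold as foreground (1), others as background (0)."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in gray_image]


# Hypothetical 2 x 2 patch of a stained sputum smear image (R, G, B values).
patch = [
    [(200, 50, 60), (90, 100, 110)],
    [(180, 170, 160), (250, 40, 90)],
]
rg = red_minus_green(patch)
mask = binarise(rg, threshold=100)
```

The connected foreground regions of such a mask would be the candidate objects passed on to the CNN for classification into bacilli and non-bacilli.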

4.7.2. Pneumonia

Pneumonia is a lung infection that causes pus and fluid to fill the alveoli in one or both lungs, thus making breathing difficult [89]. Symptoms include severe shortness of breath, chest pain, chills, cough, fever or fatigue. Community-acquired pneumonia is still a recurrent cause of morbidity and mortality [90]. Most of the studies used transfer learning and data augmentation. Tobias et al. [91] used CNN in a straightforward way. Stephen et al. [92] trained their CNN from scratch while using rescale, rotation, width shift, height shift, shear, zoom and horizontal flip as their augmentation techniques. A pre-trained CNN was utilised by the authors of [20,55,93,94,95,96,97] for pneumonia detection, while the latter four also applied data augmentation on their training datasets. For data augmentation, random horizontal flipping was used by Rajpurkar et al. [96]; shifting, zooming, flipping and 40-degree rotation were used by Ayan and Ünver [20]; Chouhan et al. [55] used noise addition, random horizontal flip, random resized crop and image intensity adjustment; and Rahman et al. [97] used rotation, scaling and translation. Hashmi et al. [98] used CNN with transfer learning, data augmentation and ensemble by weighted averaging.
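A few of the augmentation operations listed above can be sketched without any imaging library. The example below covers horizontal flipping and width and height shifts on an image stored as a nested list. Rotation by an arbitrary angle, zoom and shear need interpolation and are left out. The function names and the zero fill value are assumptions made for illustration.

```python
def horizontal_flip(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def shift_width(image, offset, fill=0):
    """Shift right by `offset` pixels (left if negative), padding with `fill`."""
    width = len(image[0])
    if offset >= 0:
        return [[fill] * offset + row[:width - offset] for row in image]
    return [row[-offset:] + [fill] * (-offset) for row in image]


def shift_height(image, offset, fill=0):
    """Shift down by `offset` rows (up if negative), padding with `fill`."""
    height, width = len(image), len(image[0])
    blank = [[fill] * width for _ in range(abs(offset))]
    if offset >= 0:
        return blank + [row[:] for row in image[:height - offset]]
    return [row[:] for row in image[-offset:]] + blank


# Hypothetical 2 x 3 grayscale image.
image = [[1, 2, 3], [4, 5, 6]]
augmented = [horizontal_flip(image), shift_width(image, 1), shift_height(image, 1)]
```

Each transformed copy keeps the label of the original image, which is how augmentation enlarges the training set without new annotations.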
In a unique study, Acharya and Satapathy [99] used Deep Siamese CNN architecture. Deep Siamese network uses the symmetric structure of the two input images for classification. Thus, the X-ray images were separated into two parts, namely the left half and the right half. Each half was then fed into the network to compare the symmetric structure together with the amount of the infection that is spread across these two regions. Training the model for both left and right parts of the X-ray images makes the classification process more robust. Elshennawy and Ibrahim [100] used CNN and Long Short-Term Memory (LSTM)-CNN for pneumonia detection. The key advantage of the LSTM is that it can model both long- and short-term dependencies and mitigates the vanishing gradient problem through gated memory cells that retain information over long sequences. Emhamed et al. [101] studied and compared seven different algorithms: Decision Tree, Random Forest, KNN, AdaBoost, Gradient Boost, XGBoost and CNN. Their results show CNN obtained the highest accuracy for pneumonia classification, followed by Random Forest and XGBoost.
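The input preparation for the Siamese approach described above can be sketched as follows. The sketch only splits an image into its left and right halves. Mirroring the right half so that both halves share the same orientation, and dropping the central column for odd widths, are assumptions made here and are not stated in the source.

```python
def split_left_right(image):
    """Split an image into a left half and a mirrored right half.

    For an odd width the central column is dropped so both halves match in size.
    Assumes the image is at least two pixels wide.
    """
    mid = len(image[0]) // 2
    left = [row[:mid] for row in image]
    right = [row[-mid:][::-1] for row in image]  # mirrored right half
    return left, right


# Hypothetical 2 x 4 chest X-ray.
xray = [[1, 2, 3, 4], [5, 6, 7, 8]]
left_half, right_half = split_left_right(xray)
```

The two halves would then be fed to the two branches of the Siamese network, which share weights and whose outputs are compared.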
In addition, Kumar et al. [102] attempted not only pneumonia classification, but also ROI identification. Pneumonia was detected by looking at lung opacity, and Mask-RCNN based model was used to identify lung opacity that is likely to depict pneumonia. They also performed ensemble by combining confidence scores and bounding boxes. In addition to pneumonia detection, Hurt et al. [103] proposed an approach that provides a probabilistic map on the chest X-ray images to assist in the diagnosis of pneumonia. Table 2 shows the summary of papers for pneumonia detection using deep learning.

4.7.3. Lung Cancer

One key characteristic of lung cancer is the presence of pulmonary nodules, solid clumps of tissue that appear in and around the lungs [104]. These nodules can be seen in CT scan images and can be malignant (cancerous) in nature or benign (not cancerous) [23].
As early as 2015, Hua et al. [105] used models of DBN and CNN to perform nodule classification in CT scans. They showed that, using deep learning, it is possible to seamlessly extract features for lung nodule classification into malignant or benign without computing the morphology and texture features. Rao et al. [25] and Kurniawan et al. [106] used CNN in a straightforward way to detect lung cancer in CT scans. Song et al. [23] compared the classification performance of CNN, deep neural network and stacked autoencoder (a multilayer sparse autoencoder of a neural network) and concluded that CNN has the highest accuracy among them. Ciompi et al. [107] used multi-stream multi-scale CNNs to classify lung nodules into six different classes: solid, non-solid, part-solid, calcified, perifissural and spiculated nodules. Specifically, they presented a multi-stream multi-scale architecture, in which CNN concurrently handles multiple triplets of 2D views of a nodule at multiple scales and then calculates the probability for the nodule in each of the six classes. Yu et al. [14] performed bone elimination and lung segmentation before training with CNN. Shakeel et al. [108] performed image denoising and enhanced the quality of the images, and then segmented the lungs by using the improved profuse clustering technique. Afterwards, a neural network is trained to detect lung cancer. The approach of Ardila et al. [13] consists of four components: lung segmentation, cancer region of interest detection model, full-volume model and cancer risk prediction model. After lung segmentation, the region of interest detection model proposes the most nodule-like regions, while the full-volume model was trained to predict cancer probability. The outputs of these two models were combined to generate the final prediction. Chen et al. [109] performed nodule enhancement and nodule segmentation before performing nodule detection.
For the works that employed transfer learning, Hosny et al. [110] and Xu et al. [111] both used CNN with data augmentation. For augmentations, both studies used flipping, translation and rotation. The authors of [112] leveraged the LUNA16 dataset to train a nodule detector and then refined that detector with the KDSB17 dataset to provide global features. Combining that and local features from a separate nodule classifier, they were able to detect lung cancer with high accuracy. The authors of [113] used transfer learning by training the model multiple times. It commenced using the more general images from the ImageNet dataset, followed by detecting nodules from chest X-rays in the ChestX-ray14 dataset, and finally detecting lung cancer nodules from the JSRT dataset. The work in [34] is the only surveyed study that performs lung cancer detection on histopathology images. Adenocarcinoma (LUAD) and squamous cell carcinoma (LUSC) are the most frequent subtypes of lung cancer, and visual examination by an experienced pathologist is needed to differentiate them. In this work, CNN was trained on histopathology slide images to automatically and accurately classify them into LUAD, LUSC or normal lung tissue. Xu et al. [114] used a CNN-long short-term memory network (LSTM) to detect lesions on chest X-ray images. Long short-term memory is an extension of RNN. This CNN-LSTM network offers probable clinical relationships between lesions to help the model attain better predictions. Table 3 shows the summary of papers for lung cancer detection using deep learning.

4.7.4. COVID-19

COVID-19 is an infectious disease caused by a recently discovered coronavirus [115]. Senior citizens are at high risk of developing severe illness, along with those who have pre-existing medical conditions such as cardiovascular disease, chronic respiratory disease, cancer and diabetes [116].
A straightforward approach to detect COVID-19 using CNN with transfer learning and data augmentation was used by Salman et al. [21]. For transfer learning, they used InceptionV3 as a fixed feature extractor. Other works that implemented a similar transfer learning approach for COVID-19 detection can be found in [117,118,119,120,121,122].
The authors of [123,124] performed three-class classification using CNN with transfer learning, classifying X-ray images into normal, COVID-19 and viral pneumonia cases. Chowdhury et al. [125] utilised CNN with transfer learning and data augmentation to classify X-ray images into normal, COVID-19 and viral pneumonia cases. The augmentation techniques used were rotation, scaling and translation. Wang et al. [126] trained a CNN from scratch with data augmentation to perform three-class classification. The augmentation techniques used were translation, rotation, horizontal flip and intensity shift. Other work performing three-class classification can be found in [4,127,128,129,130]. Studies that employ data augmentation to increase the amount of data available can be found in [131,132]. In addition to COVID-19 detection on X-ray images, Alazab et al. [131] also predicted the number of COVID-19 confirmations, recoveries and deaths in Jordan and Australia.
For works utilising ensemble, Ouyang et al. [133] implemented weighted averaging ensemble. Mahmud et al. [134] implemented stacking ensemble, whereby the images were classified into four categories: normal, COVID-19, viral pneumonia and bacterial pneumonia.
Shi et al. [135] utilised VB-Net for image segmentation and feature extraction and used a modified random decision forests method for classification. Several handcrafted features were also calculated and used to train the random forest model. More information about random forest can be found in [136].
A system that receives thoracic CT images and points out suspected COVID-19 cases was proposed by Gozes et al. [26]. The system analyses CT images using two distinct subsystems. Subsystem A performed the 3D analysis of the case volume for nodules and focal opacities, while Subsystem B performed the 2D analysis of each slice of the case to detect and localise larger-sized diffuse opacities. In Subsystem A, nodules and small opacities detection were conducted using commercial software. Besides the detection of abnormalities, the software also provided measurements and localisation. For Subsystem B, lung segmentation was first performed, and then COVID-19 related abnormalities detection was conducted using CNN with transfer learning and data augmentation. If an image was classified as positive, a localisation map was generated using the Grad-CAM technique. To provide a complete review of the case, Subsystems A and B were combined. The final outputs include per slice localisation of opacities (2D), 3D volumetric presentations of the opacities throughout the lungs and a Corona score, which is a volumetric measurement of the opacities burden.
The authors of [137] focused on a location-attention classification mechanism. First, the CT images were preprocessed. Second, a 3D CNN model was employed to segment several candidate image patches. Third, an image classification model was trained and employed to categorise all image patches into one of three classes: COVID-19, Influenza-A-viral-pneumonia and irrelevant-to-infection. A location-attention mechanism was embedded in the image classification model to differentiate the structure and appearance of different infections. Finally, the overall analysis report for a single CT sample was generated using the Noisy-or Bayesian function. The results show that the proposed approach could more accurately detect COVID-19 cases than without the location-attention model. Several other studies modified the CNN for COVID-19 detection. In [138], a multi-objective differential evolution-based CNN was utilised. Sedik et al. [139] implemented CNN and LSTM with data augmentation, while Ahsan et al. [140] employed an MLP-CNN based model. The authors of [141] employed a capsule network-based framework with transfer learning. Table 4 shows the summary of papers for COVID-19 detection using deep learning.
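The final aggregation step of [137] is described only as a "Noisy-or Bayesian function". A minimal sketch of the standard noisy-OR rule is given below, under the assumption that each candidate patch yields an independent probability of showing the infection. Whether [137] uses exactly this form is not confirmed by the text.

```python
def noisy_or(patch_probabilities):
    """Probability that at least one patch is positive, assuming independence.

    Implements 1 - prod(1 - p_i) over all patch probabilities p_i.
    """
    none_positive = 1.0
    for p in patch_probabilities:
        none_positive *= 1.0 - p
    return 1.0 - none_positive


# Hypothetical per-patch COVID-19 probabilities for one CT sample.
sample_score = noisy_or([0.5, 0.5])
```

Under this rule a single confident patch is enough to push the sample-level score close to one, while many weakly suspicious patches also accumulate evidence.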

4.8. Dataset

The datasets used by the surveyed works are reported in this section. Table 5, Table 6, Table 7 and Table 8 show the summary of datasets used for tuberculosis, pneumonia, lung cancer and COVID-19 detection, respectively. This is done to provide readers with relevant information on the datasets. Note that only public datasets are included in the tables because they are available to the public, whereas private datasets are inaccessible without permission.
According to Table 5, among the twelve datasets used for tuberculosis detection works, five of them do not contain tuberculosis medical images: JSRT dataset, Indiana dataset, NIH-14 dataset, LDOCTCXR and RSNA pneumonia dataset. JSRT dataset contains lung cancer images, while the Indiana and NIH-14 datasets contain multiple different diseases. LDOCTCXR and RSNA pneumonia datasets both contain pneumonia and normal lung images. These five datasets were used for transfer learning in several studies. Models were first trained to identify abnormalities in chest X-ray, and then they were trained to identify tuberculosis. The India, Montgomery and Shenzhen datasets contain X-ray images of tuberculosis; ImageCLEF 2018 and ImageCLEF 2019 datasets contain CT images of tuberculosis; and the Belarus dataset contains both X-ray and CT images of tuberculosis. Two of the datasets contain sputum smear microscopy images of tuberculosis: the TBimages dataset and ZiehlNeelsen Sputum smear Microscopy image DataBase.
For detection works related to pneumonia, only four public datasets are available, as shown in Table 6. All four datasets contain X-ray images only. Even though the number of datasets is low, the number of images within these datasets is high. Future studies utilising these datasets should have sufficient data.
According to Table 7, among the ten datasets used for lung cancer detection works, only one contains histopathology images, which is the NCI Genomic Data Commons dataset. The NIH-14 dataset contains X-ray images, while the JSRT dataset contains a mix of X-ray and CT images. The rest of the datasets all contain CT images.
Table 8 shows that there are thirteen public datasets related to COVID-19. With the rise of the COVID-19 pandemic, multiple datasets have been made available to the public. Many of these datasets still have a rising number of images. Therefore, the number of images within the datasets might be different from the number reported in this paper. Take note that some of the images might be contained in multiple datasets. Therefore, future studies should check for duplicate images.
Table 9 summarises the works surveyed based on the taxonomy. This allows readers to quickly refer to the articles according to their attributes of interest. The analysis of the distribution of works based on the identified attributes of the taxonomy is given in the following section.

5. Analysis of Trend, Issues and Future Directions of Lung Disease Detection Using Deep Learning

In this section, the broad analysis of the existing work is presented, which is the last contribution outlined in this paper. The analysis of the trend of each attribute identified in the foregoing section is described, whereby the aim is to show the progress of the works and the direction the researchers are heading over the last five years. The shown trend could be useful to suggest the future direction of the work in this domain. Section 5.1 presents the analysis of the trend of the articles considered. The issues and potential future work to address the identified issues are described in Section 5.2.

5.1. An Analysis of the Trend of Lung Disease Detection in Recent Years

This subsection presents the analysis of lung disease detection works in recent years for each attribute of the taxonomy described in the foregoing section.

5.1.1. Trend Analysis of the Image Type Used

Figure 10a shows that the usage of X-ray images increases linearly over the years. The usage of CT images also increases over the years, with a slight dip in 2018. The sputum smear microscopy and histopathology images are combined into one as ‘Others’ due to the low number of previous works using them to detect lung diseases. The usage of other image types slowly increases until 2018, and then drops. This indicates that deep learning aided lung disease detection works are heading towards the direction of using X-ray images and CT images.
Figure 10b shows that the majority of the studies used X-ray images at 71%, while CT images followed second with 23%. Such observation could be due to the availability, accessibility and mobility of X-ray machines over the CT scanner. Due to the COVID-19 pandemic that has spread to all types of geographical locations, it is anticipated that the X-ray images will still be the dominant choice of medical images used to detect lung-related diseases over CT images. CT images may remain the second choice because they provide more detailed information than X-rays.

5.1.2. Trend Analysis of the Features Used

From the perspective of features used for lung disease detection in recent years, as shown in Figure 11a, the usage of CNN extracted features is steadily increasing, while the usage of other features and the combination of CNN extracted features plus other features remain low. This is because CNN allows automated feature extraction, discarding the need for manual feature generation [40]. The usage of other features was less preferred because most recent works showed the superiority of CNN extracted features in detecting lung diseases. Figure 11b shows the distribution of work by type of features used. CNN extracted features were used in 79% of the works. The combination of CNN extracted features plus some other features was used in 13% of the recent works, while the remaining works utilised other types of features.

5.1.3. Trend Analysis of the Usage of Data Augmentation

Figure 12a shows the trend of the usage of data augmentation. Although implementing data augmentation increases the complexity of the data pre-processing, the number of works employing data augmentation increases steadily over the years. Such a trend signifies that more researchers have realised how beneficial data augmentation is for training lung disease detection models.
Figure 12b shows the distribution of data augmentation usage in deep learning aided lung disease detection. Only about one-third of the studies used data augmentation. While it is reported that data augmentation improved the classification accuracy, the majority of works did not use data augmentation. One reason for this might be that data augmentation is not that simple to implement. As mentioned in Section 4.3, the disadvantages of data augmentation include additional memory costs, transformation computing costs and training time.

5.1.4. Trend Analysis of the Types of Deep Learning Algorithm Used

Figure 13a shows the trend of the usage of deep learning algorithms in lung disease detection works in recent years. As shown in Figure 13, CNN was the most preferred deep learning algorithm for the last five years. Future works will likely follow this trend, whereby more work may prefer CNN for lung disease detection over other deep learning algorithms.
Figure 13b visualises the analysis of the usage of CNN in deep learning aided lung disease detection in recent years. The majority of the papers surveyed used CNN. This is because CNN is robust and can achieve high classification accuracy. Many of the works surveyed indicate that CNN has superior performance [74]. Other benefits of using CNN include automatic feature extraction and utilising the advantages of transfer learning, which is further analysed in the following subsection.

5.1.5. Trend Analysis of the Usage of Transfer Learning

Figure 14a shows the trend of the usage of transfer learning. As time goes on, more works employed transfer learning. With transfer learning, there is no need to define a new model. Transfer learning also allows features learned while training on an old task to be reused for the new task, often increasing the classification accuracy. This could be due to the model used being more generalised as it has been trained with a greater number of images.
Figure 14b shows the usage of transfer learning among the works which used CNN. According to the figure, 57% of the recent works utilised transfer learning. Even though the number of works utilising transfer learning increased over the years, as shown in Figure 14a, the percentage of works using transfer learning is just 57%. For example, in 2020, out of 44 studies that used CNN, 28 implemented transfer learning. This suggests that works in this domain are moving towards the direction of using transfer learning, but not at a high pace. Transfer learning remains a strong approach to lung disease detection, with respect to the detection performance. Hence, the distribution of work may be skewed towards transfer learning in the near future.

5.1.6. Trend Analysis of the Usage of Ensemble

Based on Figure 15a, ensemble classifiers appear to have been applied only to COVID-19, pneumonia and tuberculosis detection. The usage of ensembles is slowly growing in popularity for pneumonia and COVID-19 detection. Although ensembles are less popular, the works that deployed an ensemble classifier reported better detection performance than without an ensemble.
Figure 15b shows the distribution of the usage of ensembles in deep learning aided lung disease detection. Only 15% of the studies used an ensemble, which suggests that ensemble classifiers are still under-explored for lung disease detection. Only three types of ensemble techniques were found in the papers surveyed: majority voting, probability score averaging and stacking. The difficulty of implementing an ensemble may be the cause of such low adoption. An ensemble can only improve performance if the errors of the base classifiers have a low correlation. When similar data are used, which may occur when both the size and the number of available datasets are limited, the errors of the base classifiers tend to be highly correlated.
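The two simplest combination rules named above, majority voting and probability score averaging, can be sketched as follows. This is a minimal illustration and not the implementation of any surveyed work: the function names, the three hypothetical base classifiers and their scores are assumptions made for the example. The last function gives the accuracy of a three-member majority vote under the idealised assumption that the members err independently, which illustrates why low error correlation matters.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label predicted by most base classifiers.

    Ties are broken in favour of the label that appears first in `labels`.
    """
    if not labels:
        raise ValueError("at least one prediction is required")
    counts = Counter(labels)
    best = max(counts.values())
    for label in labels:
        if counts[label] == best:
            return label


def average_probabilities(prob_rows):
    """Average per-class probability scores across base classifiers."""
    n = len(prob_rows)
    return [sum(column) / n for column in zip(*prob_rows)]


def independent_majority_accuracy(p):
    """Accuracy of a 3-member majority vote when each member is correct with
    probability p and the members' errors are independent (an idealisation)."""
    return p ** 3 + 3 * p ** 2 * (1 - p)


# Hypothetical scores from three base classifiers for one image,
# ordered as (normal, pneumonia, COVID-19).
CLASSES = ["normal", "pneumonia", "COVID-19"]
scores = [
    [0.10, 0.60, 0.30],
    [0.20, 0.30, 0.50],
    [0.05, 0.55, 0.40],
]

# Majority voting: each classifier casts one vote for its top class.
votes = [CLASSES[row.index(max(row))] for row in scores]
voted_label = majority_vote(votes)

# Probability score averaging: average the scores, then take the top class.
mean_scores = average_probabilities(scores)
averaged_label = CLASSES[mean_scores.index(max(mean_scores))]

print(votes, voted_label)
print([round(s, 3) for s in mean_scores], averaged_label)
print(round(independent_majority_accuracy(0.8), 3))
```

Under the independence assumption, three classifiers that are each 80% accurate would reach about 89.6% by majority vote. If their errors coincided completely, the vote would stay at 80%, which is the situation the paragraph above warns about.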

5.1.7. Trend Analysis of the Type of Lung Disease Detected Using Deep Learning

Based on the trend shown in Figure 16a, the total number of lung disease detection works using deep learning increased steadily over the years, with most work related to tuberculosis detection. As more lung disease medical image datasets became public, researchers gained access to more data, and more extensive studies were conducted. Towards 2020, works on COVID-19 detection emerged while work on detecting other diseases decreased sharply. This signifies that using deep learning to detect lung disease is still an active field of study. It also shows that much effort was directed towards easing the burden of detecting COVID-19 with the existing manual screening test, which was to be expected.
Figure 16b shows the distribution of the diseases detected using deep learning in recent years. The majority of works were directed at tuberculosis detection, followed by COVID-19, lung cancer and pneumonia. Works on tuberculosis are numerous because the majority of tuberculosis-infected people are from resource-poor regions with poor healthcare infrastructure [61]. Therefore, tuberculosis detection using deep learning provides the opportunity to accelerate tuberculosis diagnosis among these communities. Works on COVID-19 detection are the second most numerous because researchers all over the world are trying to reduce the burden of detecting COVID-19, and thus many works have been published even though COVID-19 is a relatively new disease.

5.2. Issues and Future Direction of Lung Disease Detection Using Deep Learning

This subsection presents the remaining issues and the corresponding future directions of lung disease detection using deep learning, which are the final contributions of this paper. The state-of-the-art lung disease detection field suffers from several issues that can be found in the papers considered. Some of the proposed future works are designed to deal with these issues. Details of the issues and potential future works are presented in Section 5.2.1 and Section 5.2.2, respectively.

5.2.1. Issues

This section presents the issues of lung disease detection using deep learning found in the literature. Four main issues were identified: (i) data imbalance; (ii) handling of huge image size; (iii) limited available datasets; and (iv) high correlation of errors when using ensemble techniques.
(i) Data imbalance: During classification training, if one class has far more samples than another, the resulting model will be biased towards the larger class. Ideally, each class would have the same number of images, but that is often not the case. For example, when performing a multiclass classification of COVID-19, pneumonia and normal lungs, the number of images for pneumonia far exceeds the number of images for COVID-19 [126].
(ii) Handling of huge image size: Most researchers reduced the original image size during training to reduce computational cost. Training with the original image size is extremely computationally expensive, and training a deep, complex model is time-consuming even with the aid of the most powerful GPU hardware.
(iii) Limited available datasets: Ideally, thousands of images of each class should be obtained for training in order to produce a more accurate classifier. However, because few datasets are available, the amount of training data is often less than ideal. This causes researchers to search for alternatives to produce a good classifier.
(iv) High correlation of errors when using ensemble techniques: An ensemble of classifiers performs best when the errors of its members are varied. The base classifiers used should therefore have a very low correlation, which in turn ensures that their errors differ. In other words, the base classifiers are expected to complement each other to produce better classification results. Most of the studies surveyed only combine classifiers that were trained on similar features, which causes the correlation of the errors of the base classifiers to be high.
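One common way to counter the data imbalance described in issue (i), offered here as an illustration rather than as a method reported by the surveyed works, is to weight each class inversely to its frequency when computing the training loss. The sketch below uses the weight n_samples / (n_classes × class_count), the same heuristic as the "balanced" class-weight option in scikit-learn. The function name and the image counts are hypothetical and chosen only to show the effect.

```python
def balanced_class_weights(class_counts):
    """Return a weight per class equal to n_samples / (n_classes * count).

    Rare classes receive weights above 1 and frequent classes below 1, so
    that every class contributes equally to a weighted loss in total.
    """
    if not class_counts:
        raise ValueError("class_counts must not be empty")
    if any(count <= 0 for count in class_counts.values()):
        raise ValueError("every class needs at least one sample")
    n_samples = sum(class_counts.values())
    n_classes = len(class_counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in class_counts.items()
    }


# Hypothetical image counts for a three-class chest X-ray dataset.
counts = {"normal": 1500, "pneumonia": 4000, "COVID-19": 500}
weights = balanced_class_weights(counts)

for label, weight in weights.items():
    print(f"{label}: {weight:.3f}")
```

With these assumed counts, the rare COVID-19 class receives the largest weight and the abundant pneumonia class the smallest. Such weights would typically be passed to the loss function of the classifier. Whether this is preferable to oversampling or data augmentation depends on the dataset.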

5.2.2. Potential Future Works

This section presents the possible future works that should be considered to improve the performance of lung disease detection using deep learning.
(i) Make datasets available to the public: Some researchers used private hospital datasets. To obtain larger datasets, efforts such as the de-identification of confidential patient information can be undertaken so that the data can be made public. With more data available, the produced classifiers would be more accurate, because more data brings more diversity. This decreases the generalisation error, as a model trained on more examples becomes more general. Medical data are hard to come by; therefore, if the datasets were made public, more data would be available for researchers.
(ii) Usage of cloud computing: Training on cloud computing resources might overcome the problem of handling huge image sizes. On a local mid-range computer, training with large images will be slow. A high-end computer might speed up the process a little, but training might still be infeasible. By training the deep learning model in the cloud, however, multiple GPUs can be used at a reasonable cost. This allows computationally expensive training to be conducted faster and more cheaply.
(iii) Usage of a greater variety of features: Most researchers use features automatically extracted by CNN. Some other features, such as SIFT, GIST, Gabor, LBP and HOG, were studied. However, many other features are yet to be explored, for example quadtree and image histogram. Efforts can be directed to studying different types of features. This can address the issue of the high correlation of errors when using ensemble techniques. With more features comes more variation, and combining many variations often gives better results [41]. Feature engineering allows more information to be extracted from the existing data in the form of new features. These features might better describe the variance in the training data, thus improving model accuracy.
(iv) Usage of ensemble learning: Ensemble techniques show great potential and often improve detection accuracy. An ensemble of several features might provide better detection results. An ensemble of different deep learning techniques could also be considered, because ensembles perform better if the errors of the base classifiers have a low correlation.
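As an illustration of one hand-crafted feature named in item (iii), the sketch below computes a normalised intensity histogram for a greyscale image. It is a minimal example under stated assumptions: the image is represented as a list of rows of 8-bit pixel values, and the function name, the bin count and the tiny sample patch are hypothetical.

```python
def intensity_histogram(image, bins=8, max_value=255):
    """Return a normalised histogram of pixel intensities.

    `image` is a list of rows, each a list of integers in [0, max_value].
    The result sums to 1, so images of different sizes remain comparable.
    """
    hist = [0] * bins
    total = 0
    for row in image:
        for pixel in row:
            if not 0 <= pixel <= max_value:
                raise ValueError(f"pixel value {pixel} outside [0, {max_value}]")
            # Map the intensity to a bin index of equal width.
            index = min(pixel * bins // (max_value + 1), bins - 1)
            hist[index] += 1
            total += 1
    if total == 0:
        raise ValueError("image contains no pixels")
    return [count / total for count in hist]


# Hypothetical 3x4 greyscale patch standing in for a chest X-ray region.
patch = [
    [0, 10, 200, 255],
    [30, 31, 32, 33],
    [128, 129, 250, 64],
]
features = intensity_histogram(patch, bins=4)
print(features)
```

A vector like this could be concatenated with CNN features or given to a separate base classifier in an ensemble. Whether it actually lowers the correlation of errors between base classifiers would have to be checked empirically.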

6. Limitation of the Survey

A limitation of this survey is that the primary sources considered were works indexed in the Scopus database, for the reason described in Section 2. Exceptions were made for COVID-19 related works, as most of these articles were still at the preprint stage when this survey was conducted. Concerning the publication years considered, the latest publications included were those published prior to October 2020. Therefore, the findings put forward in this survey do not consider the contributions of works that are not Scopus-indexed or that were published from October 2020 onwards.

7. Conclusions

Over time, more works on lung disease detection using deep learning have been published. However, there was a lack of systematic surveys on the current state of research and application. This paper therefore offers an extensive survey of lung disease detection using deep learning, specifically on tuberculosis, pneumonia, lung cancer and COVID-19, covering works published from 2016 to September 2020. In total, 98 articles on this topic were considered in producing this survey.
To summarise and organise the key concepts and focus of the existing work on lung disease detection using deep learning, a taxonomy of state-of-the-art deep learning aided lung disease detection was constructed from the works considered. Analyses of the trends in recent works on this topic, based on the attributes identified in the taxonomy, are also presented. From the analyses of the distribution of works, the usage of both CNN and transfer learning is high. Concerning the trend of the surveyed work, all the identified attributes in the taxonomy showed, on average, a linear increase over the years, with the exception of the ensemble attribute. The remaining issues and future directions of lung disease detection using deep learning were subsequently established and described. Four issues of lung disease detection using deep learning were identified: data imbalance, handling of huge image size, limited available datasets and high correlation of errors when using ensemble techniques. Four potential future works for lung disease detection using deep learning are suggested to resolve the identified issues: making datasets available to the public, usage of cloud computing, usage of more features and usage of ensemble learning.
To conclude, investigating how deep learning has been employed in lung disease detection is important for keeping future research on the right track, thereby improving the performance of disease detection systems. The presented taxonomy could be used by other researchers to plan their research contributions and activities. The potential future directions suggested could further improve the efficiency and increase the number of deep learning aided lung disease detection applications.

Author Contributions

All authors contributed to the study conceptualisation and design. Material preparation and analysis were performed by S.T.H.K. and M.H.A.H. The first draft of the manuscript was written by S.T.H.K., supervised by M.H.A.H., A.B. and H.K. All authors provided critical feedback and helped shape the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Universiti Malaysia Sabah (UMS) grant number SDK0191-2020.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

References

  1. Bousquet, J. Global Surveillance, Prevention and Control of Chronic Respiratory Diseases; World Health Organization: Geneva, Switzerland, 2007; pp. 12–36. [Google Scholar]
  2. Forum of International Respiratory Societies. The Global Impact of Respiratory Disease, 2nd ed.; European Respiratory Society: Sheffield, UK, 2017; pp. 5–42. [Google Scholar]
  3. World Health Organization. Coronavirus Disease 2019 (COVID-19) Situation Report; Technical Report March; World Health Organization: Geneva, Switzerland, 2020. [Google Scholar]
  4. Rahaman, M.M.; Li, C.; Yao, Y.; Kulwa, F.; Rahman, M.A.; Wang, Q.; Qi, S.; Kong, F.; Zhu, X.; Zhao, X. Identification of COVID-19 samples from chest X-Ray images using deep learning: A comparison of transfer learning approaches. J. X-Ray Sci. Technol. 2020, 28, 821–839. [Google Scholar] [CrossRef]
  5. Yahiaoui, A.; Er, O.; Yumusak, N. A new method of automatic recognition for tuberculosis disease diagnosis using support vector machines. Biomed. Res. 2017, 28, 4208–4212. [Google Scholar]
  6. Hu, Z.; Tang, J.; Wang, Z.; Zhang, K.; Zhang, L.; Sun, Q. Deep learning for image-based cancer detection and diagnosis-A survey. Pattern Recognit. 2018, 83, 134–149. [Google Scholar] [CrossRef]
  7. American Thoracic Society. Diagnostic Standards and Classification of Tuberculosis in Adults and Children. Am. J. Respir. Crit. Care Med. 2000, 161, 1376–1395. [Google Scholar] [CrossRef]
  8. Setio, A.A.A.; Traverso, A.; de Bel, T.; Berens, M.S.; van den Bogaard, C.; Cerello, P.; Chen, H.; Dou, Q.; Fantacci, M.E.; Geurts, B.; et al. Validation, comparison, and combination of algorithms for automatic detection of pulmonary nodules in computed tomography images: The LUNA16 challenge. Med. Image Anal. 2017, 42, 1–13. [Google Scholar] [CrossRef] [Green Version]
  9. Shen, D.; Wu, G.; Suk, H.I. Deep Learning in Medical Image Analysis. Annu. Rev. Biomed. Eng. 2017, 19, 221–248. [Google Scholar] [CrossRef] [Green Version]
  10. Wu, C.; Luo, C.; Xiong, N.; Zhang, W.; Kim, T.H. A Greedy Deep Learning Method for Medical Disease Analysis. IEEE Access 2018, 6, 20021–20030. [Google Scholar] [CrossRef]
  11. Ma, J.; Song, Y.; Tian, X.; Hua, Y.; Zhang, R.; Wu, J. Survey on deep learning for pulmonary medical imaging. Front. Med. 2019, 14, 450–469. [Google Scholar] [CrossRef] [Green Version]
  12. Rajaraman, S.; Candemir, S.; Xue, Z.; Alderson, P.O.; Kohli, M.; Abuya, J.; Thoma, G.R.; Antani, S. A novel stacked generalization of models for improved TB detection in chest radiographs. In Proceedings of the 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Honolulu, HI, USA, 17–21 July 2018; pp. 718–721. [Google Scholar] [CrossRef]
  13. Ardila, D.; Kiraly, A.P.; Bharadwaj, S.; Choi, B.; Reicher, J.J.; Peng, L.; Tse, D.; Etemadi, M.; Ye, W.; Corrado, G.; et al. End-to-end lung cancer screening with three-dimensional deep learning on low-dose chest computed tomography. Nat. Med. 2019, 25, 954–961. [Google Scholar] [CrossRef]
  14. Gordienko, Y.; Gang, P.; Hui, J.; Zeng, W.; Kochura, Y.; Alienin, O.; Rokovyi, O.; Stirenko, S. Deep Learning with Lung Segmentation and Bone Shadow Exclusion Techniques for Chest X-Ray Analysis of Lung Cancer. Adv. Intell. Syst. Comput. 2019, 638–647. [Google Scholar] [CrossRef]
  15. Kieu, S.T.H.; Hijazi, M.H.A.; Bade, A.; Yaakob, R.; Jeffree, S. Ensemble deep learning for tuberculosis detection using chest X-Ray and canny edge detected images. IAES Int. J. Artif. Intell. 2019, 8, 429–435. [Google Scholar] [CrossRef]
  16. Dietterich, T.G. Ensemble Methods in Machine Learning. Int. Workshop Mult. Classif. Syst. 2000, 1–15. [Google Scholar] [CrossRef] [Green Version]
  17. Webb, A. Introduction To Biomedical Imaging; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2003. [Google Scholar] [CrossRef]
  18. Kwan-Hoong, N.; Madan M, R. X ray imaging goes digital. Br. Med J. 2006, 333, 765–766. [Google Scholar] [CrossRef] [Green Version]
  19. Lopes, U.K.; Valiati, J.F. Pre-trained convolutional neural networks as feature extractors for tuberculosis detection. Comput. Biol. Med. 2017, 89, 135–143. [Google Scholar] [CrossRef] [PubMed]
  20. Ayan, E.; Ünver, H.M. Diagnosis of Pneumonia from Chest X-Ray Images using Deep Learning. Sci. Meet. Electr.-Electron. Biomed. Eng. Comput. Sci 2019, 1–5. [Google Scholar] [CrossRef]
  21. Salman, F.M.; Abu-naser, S.S.; Alajrami, E.; Abu-nasser, B.S.; Ashqar, B.A.M. COVID-19 Detection using Artificial Intelligence. Int. J. Acad. Eng. Res. 2020, 4, 18–25. [Google Scholar]
  22. Herman, G.T. Fundamentals of Computerized Tomography; Springer: London, UK, 2009; Volume 224. [Google Scholar] [CrossRef]
  23. Song, Q.Z.; Zhao, L.; Luo, X.K.; Dou, X.C. Using Deep Learning for Classification of Lung Nodules on Computed Tomography Images. J. Healthc. Eng. 2017, 2017. [Google Scholar] [CrossRef] [Green Version]
  24. Gao, X.W.; James-reynolds, C.; Currie, E. Analysis of tuberculosis severity levels from CT pulmonary images based on enhanced residual deep learning architecture. Neurocomputing 2019, 392, 233–244. [Google Scholar] [CrossRef]
  25. Rao, P.; Pereira, N.A.; Srinivasan, R. Convolutional neural networks for lung cancer screening in computed tomography (CT) scans. In Proceedings of the 2016 2nd International Conference on Contemporary Computing and Informatics, IC3I 2016, Noida, India, 14–17 December 2016; pp. 489–493. [Google Scholar] [CrossRef]
  26. Gozes, O.; Frid, M.; Greenspan, H.; Patrick, D. Rapid AI Development Cycle for the Coronavirus (COVID-19) Pandemic: Initial Results for Automated Detection & Patient Monitoring using Deep Learning CT Image Analysis Article. arXiv 2020, arXiv:2003.05037. [Google Scholar]
  27. Shah, M.I.; Mishra, S.; Yadav, V.K.; Chauhan, A.; Sarkar, M.; Sharma, S.K.; Rout, C. Ziehl–Neelsen sputum smear microscopy image database: A resource to facilitate automated bacilli detection for tuberculosis diagnosis. J. Med. Imaging 2017, 4, 027503. [Google Scholar] [CrossRef]
  28. López, Y.P.; Filho, C.F.F.C.; Aguilera, L.M.R.; Costa, M.G.F. Automatic classification of light field smear microscopy patches using Convolutional Neural Networks for identifying Mycobacterium Tuberculosis. In Proceedings of the 2017 CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies (CHILECON), Pucon, Chile, 18–20 October 2017. [Google Scholar]
  29. Kant, S.; Srivastava, M.M. Towards Automated Tuberculosis detection using Deep Learning. In Proceedings of the 2018 IEEE Symposium Series on Computational Intelligence (SSCI), Bengaluru, India, 18–21 November 2018; pp. 1250–1253. [Google Scholar] [CrossRef] [Green Version]
  30. Oomman, R.; Kalmady, K.S.; Rajan, J.; Sabu, M.K. Automatic detection of tuberculosis bacilli from microscopic sputum smear images using deep learning methods. Integr. Med. Res. 2018, 38, 691–699. [Google Scholar] [CrossRef]
  31. Mithra, K.S.; Emmanuel, W.R.S. Automated identification of mycobacterium bacillus from sputum images for tuberculosis diagnosis. Signal Image Video Process. 2019. [Google Scholar] [CrossRef]
  32. Samuel, R.D.J.; Kanna, B.R. Tuberculosis (TB) detection system using deep neural networks. Neural Comput. Appl. 2019, 31, 1533–1545. [Google Scholar] [CrossRef]
  33. Gurcan, M.N.; Boucheron, L.E.; Can, A.; Madabhushi, A.; Rajpoot, N.M.; Yener, B. Histopathological Image Analysis: A Review. IEEE Rev. Biomed. Eng. 2009, 2, 147–171. [Google Scholar] [CrossRef] [Green Version]
  34. Coudray, N.; Ocampo, P.S.; Sakellaropoulos, T.; Narula, N.; Snuderl, M.; Fenyö, D.; Moreira, A.L.; Razavian, N.; Tsirigos, A. Classification and mutation prediction from non–small cell lung cancer histopathology images using deep learning. Nat. Med. 2018, 24, 1559–1567. [Google Scholar] [CrossRef]
  35. O’Mahony, N.; Campbell, S.; Carvalho, A.; Harapanahalli, S.; Hernandez, G.V.; Krpalkova, L.; Riordan, D.; Walsh, J. Deep Learning vs. Traditional Computer Vision. Adv. Intell. Syst. Comput. 2020, 128–144. [Google Scholar] [CrossRef] [Green Version]
  36. Vajda, S.; Karargyris, A.; Jaeger, S.; Santosh, K.C.; Candemir, S.; Xue, Z.; Antani, S.; Thoma, G. Feature Selection for Automatic Tuberculosis Screening in Frontal Chest Radiographs. J. Med Syst. 2018, 42. [Google Scholar] [CrossRef]
  37. Jaeger, S.; Karargyris, A.; Candemir, S.; Folio, L.; Siegelman, J.; Callaghan, F.; Xue, Z.; Palaniappan, K.; Singh, R.K.; Antani, S.; et al. Automatic tuberculosis screening using chest radiographs. IEEE Trans. Med. Imaging 2014, 33, 233–245. [Google Scholar] [CrossRef]
  38. Antony, B.; Nizar Banu, P.K. Lung tuberculosis detection using x-ray images. Int. J. Appl. Eng. Res. 2017, 12, 15196–15201. [Google Scholar]
  39. Chauhan, A.; Chauhan, D.; Rout, C. Role of gist and PHOG features in computer-aided diagnosis of tuberculosis without segmentation. PLoS ONE 2014, 9, e112980. [Google Scholar] [CrossRef]
  40. Al-Ajlan, A.; Allali, A.E. CNN—MGP: Convolutional Neural Networks for Metagenomics Gene Prediction. Interdiscip. Sci. Comput. Life Sci. 2019, 11, 628–635. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  41. Domingos, P. A Few Useful Things to Know About Machine Learning. Commun. ACM 2012, 55, 78–87. [Google Scholar] [CrossRef] [Green Version]
  42. Mikołajczyk, A.; Grochowski, M. Data augmentation for improving deep learning in image classification problem. In Proceedings of the 2018 International Interdisciplinary PhD Workshop, Swinoujscie, Poland, 9–12 May 2018; pp. 117–122. [Google Scholar] [CrossRef]
  43. Shorten, C.; Khoshgoftaar, T.M. A survey on Image Data Augmentation for Deep Learning. J. Big Data 2019, 6. [Google Scholar] [CrossRef]
  44. O’Shea, K.; Nash, R. An Introduction to Convolutional Neural Networks. arXiv 2015, arXiv:1511.08458v2. [Google Scholar]
  45. Ker, J.; Wang, L. Deep Learning Applications in Medical Image Analysis. IEEE Access 2018, 6, 9375–9389. [Google Scholar] [CrossRef]
  46. Pan, S.J.; Yang, Q. A Survey on Transfer Learning. IEEE Trans. Knowl. Data Eng. 2010, 22, 1345–1359. [Google Scholar] [CrossRef]
  47. Lanbouri, Z.; Achchab, S. A hybrid Deep belief network approach for Financial distress prediction. In Proceedings of the 2015 10th International Conference on Intelligent Systems: Theories and Applications (SITA), Rabat, Morocco, 20–21 October 2015; pp. 1–6. [Google Scholar] [CrossRef]
  48. Hinton, G.E.; Osindero, S. A fast learning algorithm for deep belief nets. Neural Comput. 2006, 18, 1527–1554. [Google Scholar] [CrossRef]
  49. Cao, X.; Wipf, D.; Wen, F.; Duan, G.; Sun, J. A practical transfer learning algorithm for face verification. In Proceedings of the IEEE International Conference on Computer Vision, Sydney, Australia, 1–8 December 2013; pp. 3208–3215. [Google Scholar] [CrossRef]
  50. Wang, C.; Chen, D.; Hao, L.; Liu, X.; Zeng, Y.; Chen, J.; Zhang, G. Pulmonary Image Classification Based on Inception-v3 Transfer Learning Model. IEEE Access 2019, 7, 146533–146541. [Google Scholar] [CrossRef]
  51. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. Adv. Neural Inf. Process. Syst. 2012. [Google Scholar] [CrossRef]
  52. Tajbakhsh, N.; Shin, J.Y.; Gurudu, S.R.; Hurst, R.T.; Kendall, C.B.; Gotway, M.B.; Liang, J. Convolutional Neural Networks for Medical Image Analysis: Full Training or Fine Tuning? IEEE Trans. Med. Imaging 2016, 35, 1299–1312. [Google Scholar] [CrossRef] [Green Version]
  53. Nogueira, K.; Penatti, O.A.; dos Santos, J.A. Towards better exploiting convolutional neural networks for remote sensing scene classification. Pattern Recognit. 2017, 61, 539–556. [Google Scholar] [CrossRef] [Green Version]
  54. Kabari, L.G.; Onwuka, U. Comparison of Bagging and Voting Ensemble Machine Learning Algorithm as a Classifier. Int. J. Adv. Res. Comput. Sci. Softw. Eng. 2019, 9, 1–6. [Google Scholar]
  55. Chouhan, V.; Singh, S.K.; Khamparia, A.; Gupta, D.; Albuquerque, V.H.C.D. A Novel Transfer Learning Based Approach for Pneumonia Detection in Chest X-ray Images. Appl. Sci. 2020, 10, 559. [Google Scholar] [CrossRef] [Green Version]
  56. Lincoln, W.P.; Skrzypek, J. Synergy of Clustering Multiple Back Propagation Networks. Adv. Neural Inf. Process. Syst. 1990, 2, 650–659. [Google Scholar]
  57. Lakhani, P.; Sundaram, B. Deep Learning at Chest Radiography: Automated Classification of Pulmonary Tuberculosis by Using Convolutional Neural Networks. Radiology 2017, 284, 574–582. [Google Scholar] [CrossRef]
  58. Divina, F.; Gilson, A.; Goméz-Vela, F.; Torres, M.G.; Torres, J.F. Stacking Ensemble Learning for Short-Term Electricity Consumption Forecasting. Energies 2018, 11, 949. [Google Scholar] [CrossRef] [Green Version]
  59. World Health Organisation. Global Health TB Report; World Health Organisation: Geneva, Switzerland, 2018; p. 277. [Google Scholar]
  60. Murphy, K.; Habib, S.S.; Zaidi, S.M.A.; Khowaja, S.; Khan, A.; Melendez, J.; Scholten, E.T.; Amad, F.; Schalekamp, S.; Verhagen, M.; et al. Computer aided detection of tuberculosis on chest radiographs: An evaluation of the CAD4TB v6 system. Sci. Rep. 2019, 10, 1–11. [Google Scholar] [CrossRef]
  61. Melendez, J.; Sánchez, C.I.; Philipsen, R.H.; Maduskar, P.; Dawson, R.; Theron, G.; Dheda, K.; Van Ginneken, B. An automated tuberculosis screening strategy combining X-ray-based computer-aided detection and clinical information. Sci. Rep. 2016, 6, 1–8. [Google Scholar] [CrossRef]
  62. Heo, S.J.; Kim, Y.; Yun, S.; Lim, S.S.; Kim, J.; Nam, C.M.; Park, E.C.; Jung, I.; Yoon, J.H. Deep Learning Algorithms with Demographic Information Help to Detect Tuberculosis in Chest Radiographs in Annual Workers’ Health Examination Data. Int. J. Environ. Res. Public Health 2019, 16, 250. [Google Scholar] [CrossRef] [Green Version]
  63. Pasa, F.; Golkov, V.; Pfeiffer, F.; Cremers, D.; Pfeiffer, D. Efficient Deep Network Architectures for Fast Chest X-Ray Tuberculosis Screening and Visualization. Sci. Rep. 2019, 9, 2–10. [Google Scholar] [CrossRef] [Green Version]
  64. Cao, Y.; Liu, C.; Liu, B.; Brunette, M.J.; Zhang, N.; Sun, T.; Zhang, P.; Peinado, J.; Garavito, E.S.; Garcia, L.L.; et al. Improving Tuberculosis Diagnostics Using Deep Learning and Mobile Health Technologies among Resource-Poor and Marginalized Communities. In Proceedings of the 2016 IEEE 1st International Conference on Connected Health: Applications, Systems and Engineering Technologies, CHASE, Washington, DC, USA, 27–29 June 2016; pp. 274–281. [Google Scholar] [CrossRef]
  65. Liu, J.; Liu, Y.; Wang, C.; Li, A.; Meng, B. An Original Neural Network for Pulmonary Tuberculosis Diagnosis in Radiographs. In Lecture Notes in Computer Science, Proceedings of the International Conference on Artificial Neural Networks, Rhodes, Greece, 4–7 October 2018; Springer: Berlin/Heidelberg, Germany, 2018; pp. 158–166. [Google Scholar] [CrossRef]
  66. Stirenko, S.; Kochura, Y.; Alienin, O. Chest X-Ray Analysis of Tuberculosis by Deep Learning with Segmentation and Augmentation. In Proceedings of the 2018 IEEE 38th International Conference on Electronics and Nanotechnology (ELNANO), Kiev, Ukraine, 24–26 April 2018; pp. 422–428. [Google Scholar]
  67. Andika, L.A.; Pratiwi, H.; Sulistijowati Handajani, S. Convolutional neural network modeling for classification of pulmonary tuberculosis disease. J. Phys. Conf. Ser. 2020, 1490. [Google Scholar] [CrossRef]
  68. Ul Abideen, Z.; Ghafoor, M.; Munir, K.; Saqib, M.; Ullah, A.; Zia, T.; Tariq, S.A.; Ahmed, G.; Zahra, A. Uncertainty assisted robust tuberculosis identification with bayesian convolutional neural networks. IEEE Access 2020, 8, 22812–22825. [Google Scholar] [CrossRef] [PubMed]
  69. Hwang, E.J.; Park, S.; Jin, K.N.; Kim, J.I.; Choi, S.Y.; Lee, J.H.; Goo, J.M.; Aum, J.; Yim, J.J.; Park, C.M. Development and Validation of a Deep Learning—based Automatic Detection Algorithm for Active Pulmonary Tuberculosis on Chest Radiographs. Clin. Infect. Dis. 2019, 69, 739–747. [Google Scholar] [CrossRef]
  70. Hwang, S.; Kim, H.E.; Jeong, J.; Kim, H.J. A Novel Approach for Tuberculosis Screening Based on Deep Convolutional Neural Networks. Med. Imaging 2016, 9785, 1–8. [Google Scholar] [CrossRef]
  71. Islam, M.T.; Aowal, M.A.; Minhaz, A.T.; Ashraf, K. Abnormality Detection and Localization in Chest X-Rays using Deep Convolutional Neural Networks. arXiv 2017, arXiv:1705.09850v3. [Google Scholar]
  72. Nguyen, Q.H.; Nguyen, B.P.; Dao, S.D.; Unnikrishnan, B.; Dhingra, R.; Ravichandran, S.R.; Satpathy, S.; Raja, P.N.; Chua, M.C.H. Deep Learning Models for Tuberculosis Detection from Chest X-ray Images. In Proceedings of the 2019 26th International Conference on Telecommunications (ICT), Hanoi, Vietnam, 8–10 April 2019; pp. 381–385. [Google Scholar] [CrossRef]
  73. Kieu, T.; Ho, K.; Gwak, J.; Prakash, O. Utilizing Pretrained Deep Learning Models for Automated Pulmonary Tuberculosis Detection Using Chest Radiography. Intell. Inf. Database Syst. 2019, 4, 395–403. [Google Scholar] [CrossRef]
  74. Abbas, A.; Abdelsamea, M.M. Learning Transformations for Automated Classification of Manifestation of Tuberculosis using Convolutional Neural Network. In Proceedings of the 2018 13th International Conference on Computer Engineering and Systems (ICCES), Cairo, Egypt, 18–19 December 2018; IEEE: New York, NY, USA, 2018; pp. 122–126. [Google Scholar]
  75. Karnkawinpong, T.; Limpiyakorn, Y. Classification of pulmonary tuberculosis lesion with convolutional neural networks. J. Phys. Conf. Ser. 2018, 1195. [Google Scholar] [CrossRef]
  76. Liu, C.; Cao, Y.; Alcantara, M.; Liu, B.; Brunette, M.; Peinado, J.; Curioso, W. TX-CNN: Detecting Tuberculosis in Chest X-Ray Images Using Convolutional Neural Network. In Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 17–20 September 2017. [Google Scholar]
  77. Yadav, O.; Passi, K.; Jain, C.K. Using Deep Learning to Classify X-ray Images of Potential Tuberculosis Patients. In Proceedings of the 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Madrid, Spain, 3–6 December 2018; IEEE: New York, NY, USA, 2018; pp. 2368–2375. [Google Scholar]
  78. Sahlol, A.T.; Elaziz, M.A.; Jamal, A.T.; Damaševičius, R.; Hassan, O.F. A novel method for detection of tuberculosis in chest radiographs using artificial ecosystem-based optimisation of deep neural network features. Symmetry 2020, 12, 1146. [Google Scholar] [CrossRef]
  79. Hooda, R.; Mittal, A.; Sofat, S. Automated TB classification using ensemble of deep architectures. Multimed. Tools Appl. 2019, 78, 31515–31532. [Google Scholar] [CrossRef]
  80. Rashid, R.; Khawaja, S.G.; Akram, M.U.; Khan, A.M. Hybrid RID Network for Efficient Diagnosis of Tuberculosis from Chest X-rays. In Proceedings of the 2018 9th Cairo International Biomedical Engineering Conference (CIBEC), Cairo, Egypt, 20–22 December 2018; IEEE: New York, NY, USA, 2018; pp. 167–170. [Google Scholar]
  81. Kieu, S.T.H.; Hijazi, M.H.A.; Bade, A.; Saffree Jeffree, M. Tuberculosis detection using deep learning and contrast-enhanced canny edge detected x-ray images. IAES Int. J. Artif. Intell. 2020, 9. [Google Scholar] [CrossRef]
  82. Rajaraman, S.; Antani, S.K. Modality-Specific Deep Learning Model Ensembles Toward Improving TB Detection in Chest Radiographs. IEEE Access 2020, 8, 27318–27326. [Google Scholar] [CrossRef] [PubMed]
  83. Melendez, J.; Ginneken, B.V.; Maduskar, P.; Philipsen, R.H.H.M.; Reither, K.; Breuninger, M.; Adetifa, I.M.O.; Maane, R.; Ayles, H.; Sánchez, C.I. A Novel Multiple-Instance Learning-Based Approach to Computer-Aided Detection of Tuberculosis on Chest X-Rays. IEEE Trans. Med. Imaging 2014, 34, 179–192. [Google Scholar] [CrossRef] [PubMed]
  84. Becker, A.S.; Bluthgen, C.; van Phi, V.D.; Sekaggya-Wiltshire, C.; Castelnuovo, B.; Kambugu, A.; Fehr, J.; Frauenfelder, T. Detection of tuberculosis patterns in digital photographs of chest X-ray images using Deep Learning: Feasibility study. Int. J. Tuberc. Lung Dis. 2018, 22, 328–335. [Google Scholar] [CrossRef] [PubMed]
  85. Li, L.; Huang, H.; Jin, X. AE-CNN Classification of Pulmonary Tuberculosis Based on CT images. In Proceedings of the 2018 9th International Conference on Information Technology in Medicine and Education (ITME), Hangzhou, China, 19–21 October 2018; IEEE: New York, NY, USA, 2018; pp. 39–42. [Google Scholar] [CrossRef]
  86. Pattnaik, A.; Kanodia, S.; Chowdhury, R.; Mohanty, S. Predicting Tuberculosis Related Lung Deformities from CT Scan Images Using 3D CNN; CEUR-WS: Lugano, Switzerland, 2019; pp. 9–12. [Google Scholar]
  87. Zunair, H.; Rahman, A.; Mohammed, N. Estimating Severity from CT Scans of Tuberculosis Patients using 3D Convolutional Nets and Slice Selection; CEUR-WS: Lugano, Switzerland, 2019; pp. 9–12. [Google Scholar]
  88. Llopis, F.; Fuster-Guillo, A.; Azorin-Lopez, J.; Llopis, I. Using improved optical flow model to detect Tuberculosis; CEUR-WS: Lugano, Switzerland, 2019; pp. 9–12. [Google Scholar]
  89. Wardlaw, T.; Johansson, E.W.; Hodge, M. Pneumonia: The Forgotten Killer of Children; United Nations Children’s Fund (UNICEF): New York, NY, USA, 2006; p. 44. [Google Scholar]
  90. Wunderink, R.G.; Waterer, G. Advances in the causes and management of community acquired pneumonia in adults. BMJ 2017, 1–13. [Google Scholar] [CrossRef]
  91. Tobias, R.R.; De Jesus, L.C.M.; Mital, M.E.G.; Lauguico, S.C.; Guillermo, M.A.; Sybingco, E.; Bandala, A.A.; Dadios, E.P. CNN-based Deep Learning Model for Chest X-ray Health Classification Using TensorFlow. In Proceedings of the 2020 RIVF International Conference on Computing and Communication Technologies, RIVF 2020, Ho Chi Minh, Vietnam, 14–15 October 2020. [Google Scholar]
  92. Stephen, O.; Sain, M.; Maduh, U.J.; Jeong, D.U. An Efficient Deep Learning Approach to Pneumonia Classification in Healthcare. J. Healthc. Eng. 2019, 2019. [Google Scholar] [CrossRef] [Green Version]
  93. Kermany, D.S.; Goldbaum, M.; Cai, W.; Lewis, M.A. Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning. Cell 2018, 172, 1122–1131.e9. [Google Scholar] [CrossRef] [PubMed]
  94. Young, J.C.; Suryadibrata, A. Applicability of Various Pre-Trained Deep Convolutional Neural Networks for Pneumonia Classification based on X-Ray Images. Int. J. Adv. Trends Comput. Sci. Eng. 2020, 9, 2649–2654. [Google Scholar] [CrossRef]
  95. Moujahid, H.; Cherradi, B.; Gannour, O.E.; Bahatti, L.; Terrada, O.; Hamida, S. Convolutional Neural Network Based Classification of Patients with Pneumonia using X-ray Lung Images. Adv. Sci. Technol. Eng. Syst. 2020, 5, 167–175. [Google Scholar] [CrossRef]
  96. Rajpurkar, P.; Irvin, J.; Zhu, K.; Yang, B.; Mehta, H.; Duan, T.; Ding, D.; Bagul, A.; Ball, R.L.; Langlotz, C.; et al. CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning. arXiv 2017, arXiv:1711.05225v3. [Google Scholar]
  97. Rahman, T.; Chowdhury, M.E.H.; Khandakar, A.; Islam, K.R.; Islam, K.F.; Mahbub, Z.B.; Kadir, M.A.; Kashem, S. Transfer Learning with Deep Convolutional Neural Network (CNN) for Pneumonia Detection Using Chest X-ray. Appl. Sci. 2020, 10, 3233. [Google Scholar] [CrossRef]
  98. Hashmi, M.; Katiyar, S.; Keskar, A.; Bokde, N.; Geem, Z. Efficient Pneumonia Detection in Chest Xray Images Using Deep Transfer Learning. Diagnostics 2020, 10, 417. [Google Scholar] [CrossRef] [PubMed]
  99. Acharya, A.K.; Satapathy, R. A Deep Learning Based Approach towards the Automatic Diagnosis of Pneumonia from Chest Radio-Graphs. Biomed. Pharmacol. J. 2020, 13, 449–455. [Google Scholar] [CrossRef]
  100. Elshennawy, N.M.; Ibrahim, D.M. Deep-Pneumonia Framework Using Deep Learning Models Based on Chest X-Ray Images. Diagnostics 2020, 10, 649. [Google Scholar] [CrossRef] [PubMed]
  101. Emhamed, R.; Mamlook, A.; Chen, S. Investigation of the performance of Machine Learning Classifiers for Pneumonia Detection in Chest X-ray Images. In Proceedings of the 2020 IEEE International Conference on Electro Information Technology (EIT), Chicago, IL, USA, 31 July–1 August 2020; pp. 98–104. [Google Scholar]
  102. Kumar, A.; Tiwari, P.; Kumar, S.; Gupta, D.; Khanna, A. Identifying pneumonia in chest X-rays: A deep learning approach. Measurement 2019, 145, 511–518. [Google Scholar] [CrossRef]
  103. Hurt, B.; Yen, A.; Kligerman, S.; Hsiao, A. Augmenting Interpretation of Chest Radiographs with Deep Learning Probability Maps. J. Thorac. Imaging 2020, 35, 285–293. [Google Scholar] [CrossRef]
  104. Borczuk, A.C. Benign tumors and tumorlike conditions of the lung. Arch. Pathol. Lab. Med. 2008, 132, 1133–1148. [Google Scholar] [CrossRef]
  105. Hua, K.L.; Hsu, C.H.; Hidayati, S.C.; Cheng, W.H.; Chen, Y.J. Computer-aided classification of lung nodules on computed tomography images via deep learning technique. OncoTargets Ther. 2015, 8, 2015–2022. [Google Scholar] [CrossRef] [Green Version]
  106. Kurniawan, E.; Prajitno, P.; Soejoko, D.S. Computer-Aided Detection of Mediastinal Lymph Nodes using Simple Architectural Convolutional Neural Network. J. Phys. Conf. Ser. 2020, 1505. [Google Scholar] [CrossRef]
  107. Ciompi, F.; Chung, K.; Van Riel, S.J.; Setio, A.A.A.; Gerke, P.K.; Jacobs, C.; Th Scholten, E.; Schaefer-Prokop, C.; Wille, M.M.; Marchianò, A.; et al. Towards automatic pulmonary nodule management in lung cancer screening with deep learning. Sci. Rep. 2017, 7, 1–11. [Google Scholar] [CrossRef]
  108. Shakeel, P.M.; Burhanuddin, M.A.; Desa, M.I. Lung cancer detection from CT image using improved profuse clustering and deep learning instantaneously trained neural networks. Meas. J. Int. Meas. Confed. 2019, 145, 702–712. [Google Scholar] [CrossRef]
  109. Chen, S.; Han, Y.; Lin, J.; Zhao, X.; Kong, P. Pulmonary nodule detection on chest radiographs using balanced convolutional neural network and classic candidate detection. Artif. Intell. Med. 2020, 107, 101881. [Google Scholar] [CrossRef] [PubMed]
  110. Hosny, A.; Parmar, C.; Coroller, T.P.; Grossmann, P.; Zeleznik, R.; Kumar, A.; Bussink, J.; Gillies, R.J.; Mak, R.H.; Aerts, H.J. Deep learning for lung cancer prognostication: A retrospective multi-cohort radiomics study. PLoS Med. 2018, 15, 1–25. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  111. Xu, Y.; Hosny, A.; Zeleznik, R.; Parmar, C.; Coroller, T.; Franco, I.; Mak, R.H.; Aerts, H.J. Deep learning predicts lung cancer treatment response from serial medical imaging. Clin. Cancer Res. 2019, 25, 3266–3275. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  112. Kuan, K.; Ravaut, M.; Manek, G.; Chen, H.; Lin, J.; Nazir, B.; Chen, C.; Howe, T.C.; Zeng, Z.; Chandrasekhar, V. Deep Learning for Lung Cancer Detection: Tackling the Kaggle Data Science Bowl 2017 Challenge. arXiv 2017, arXiv:1705.09435. [Google Scholar]
  113. Ausawalaithong, W.; Thirach, A.; Marukatat, S.; Wilaiprasitporn, T. Automatic Lung Cancer Prediction from Chest X-ray Images Using the Deep Learning Approach. In Proceedings of the 2018 11th Biomedical Engineering International Conference (BMEiCON), Chiang Mai, Thailand, 21–24 November 2018. [Google Scholar]
  114. Xu, S.; Guo, J.; Zhang, G.; Bie, R. Automated detection of multiple lesions on chest X-ray images: Classification using a neural network technique with association-specific contexts. Appl. Sci. 2020, 10, 1742. [Google Scholar] [CrossRef] [Green Version]
  115. Huang, C.; Wang, Y.; Li, X.; Ren, L.; Zhao, J.; Hu, Y.; Zhang, L.; Fan, G.; Xu, J.; Gu, X. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet 2020, 395, 497–506. [Google Scholar] [CrossRef] [Green Version]
  116. Velavan, T.P.; Meyer, C.G. The COVID-19 epidemic. Trop. Med. Int. Health 2020, 25, 278–280. [Google Scholar] [CrossRef] [Green Version]
  117. Shibly, K.H.; Dey, S.K.; Islam, M.T.U.; Rahman, M.M. COVID faster R–CNN: A novel framework to Diagnose Novel Coronavirus Disease (COVID-19) in X-Ray images. Inform. Med. Unlocked 2020, 20, 100405. [Google Scholar] [CrossRef]
  118. Alsharman, N.; Jawarneh, I. GoogleNet CNN neural network towards chest CT-coronavirus medical image classification. J. Comput. Sci. 2020, 16, 620–625. [Google Scholar] [CrossRef]
  119. Zhu, J.; Shen, B.; Abbasi, A.; Hoshmand-Kochi, M.; Li, H.; Duong, T.Q. Deep transfer learning artificial intelligence accurately stages COVID-19 lung disease severity on portable chest radiographs. PLoS ONE 2020, 15, e0236621. [Google Scholar] [CrossRef]
  120. Sethi, R.; Mehrotra, M.; Sethi, D. Deep Learning based Diagnosis Recommendation for COVID-19 using Chest X-Rays Images. In Proceedings of the 2020 Second International Conference on Inventive Research in Computing Applications (ICIRCA), Coimbatore, India, 15–17 July 2020. [Google Scholar]
  121. Das, D.; Santosh, K.C.; Pal, U. Truncated inception net: COVID-19 outbreak screening using chest X-rays. Phys. Eng. Sci. Med. 2020, 43, 915–925. [Google Scholar] [CrossRef] [PubMed]
  122. Panwar, H.; Gupta, P.K.; Siddiqui, M.K.; Morales-Menendez, R.; Singh, V. Application of deep learning for fast detection of COVID-19 in X-Rays using nCOVnet. Chaos Solitons Fractals 2020, 138, 109944. [Google Scholar] [CrossRef] [PubMed]
  123. Narin, A.; Kaya, C.; Pamuk, Z. Automatic Detection of Coronavirus Disease (COVID-19) Using X-ray Images and Deep Convolutional Neural Networks. arXiv 2020, arXiv:2003.10849. [Google Scholar]
  124. Apostolopoulos, I.D.; Mpesiana, T.A. Covid-19: Automatic detection from X-ray images utilizing transfer learning with convolutional neural networks. Phys. Eng. Sci. Med. 2020, 1–6. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  125. Chowdhury, M.E.H.; Rahman, T.; Khandakar, A.; Mazhar, R.; Kadir, M.A.; Reaz, M.B.I.; Mahbub, Z.B.; Islam, K.R.; Salman, M.; Iqbal, A.; et al. Can AI help in screening Viral and COVID-19 pneumonia? arXiv 2020, arXiv:2003.13145. [Google Scholar] [CrossRef]
  126. Wang, L.; Lin, Z.Q.; Wong, A. COVID-Net: A Tailored Deep Convolutional Neural Network Design for Detection of COVID-19 Cases from Chest X-Ray Images. Sci. Rep. 2020, 10, 1–12. [Google Scholar] [CrossRef]
  127. Sethy, P.K.; Behera, S.K.; Ratha, P.K.; Biswas, P. Detection of coronavirus disease (COVID-19) based on deep features and support vector machine. Int. J. Math. Eng. Manag. Sci. 2020, 5, 643–651. [Google Scholar] [CrossRef]
  128. Islam, M.Z.; Islam, M.M.; Asraf, A. A combined deep CNN-LSTM network for the detection of novel coronavirus (COVID-19) using X-ray images. Inform. Med. Unlocked 2020, 20, 100412. [Google Scholar] [CrossRef]
  129. Shorfuzzaman, M.; Masud, M. On the detection of covid-19 from chest x-ray images using cnn-based transfer learning. Comput. Mater. Contin. 2020, 64, 1359–1381. [Google Scholar] [CrossRef]
  130. Li, L.; Qin, L.; Xu, Z.; Yin, Y.; Wang, X.; Kong, B.; Bai, J.; Lu, Y.; Fang, Z.; Song, Q.; et al. Using Artificial Intelligence to Detect COVID-19 and Community-acquired Pneumonia Based on Pulmonary CT: Evaluation of the Diagnostic Accuracy. Radiology 2020, 296, 65–71. [Google Scholar] [CrossRef]
  131. Alazab, M.; Awajan, A.; Mesleh, A.; Abraham, A.; Jatana, V.; Alhyari, S. COVID-19 prediction and detection using deep learning. Int. J. Comput. Inf. Syst. Ind. Manag. Appl. 2020, 12, 168–181. [Google Scholar]
  132. Waheed, A.; Goyal, M.; Gupta, D.; Khanna, A.; Al-Turjman, F.; Pinheiro, P.R. CovidGAN: Data Augmentation Using Auxiliary Classifier GAN for Improved Covid-19 Detection. IEEE Access 2020, 8, 91916–91923. [Google Scholar] [CrossRef]
  133. Ouyang, X.; Huo, J.; Xia, L.; Shan, F.; Liu, J.; Mo, Z.; Yan, F.; Ding, Z.; Yang, Q.; Song, B.; et al. Dual-Sampling Attention Network for Diagnosis of COVID-19 from Community Acquired Pneumonia. IEEE Trans. Med. Imaging 2020, 39, 2595–2605. [Google Scholar] [CrossRef]
  134. Mahmud, T.; Rahman, M.A.; Fattah, S.A. CovXNet: A multi-dilation convolutional neural network for automatic COVID-19 and other pneumonia detection from chest X-ray images with transferable multi-receptive feature optimization. Comput. Biol. Med. 2020, 122, 103869. [Google Scholar] [CrossRef] [PubMed]
  135. Shi, F.; Xia, L.; Shan, F.; Wu, D.; Wei, Y.; Yuan, H.; Jiang, H. Large-Scale Screening of COVID-19 from Community Acquired Pneumonia using Infection Size-Aware Classification. arXiv 2020, arXiv:2003.09860. [Google Scholar]
  136. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef] [Green Version]
  137. Xu, X.; Jiang, X.; Ma, C.; Du, P.; Li, X.; Ly, S.; Yu, L.; Chen, Y.; Su, J.; Lang, G.; et al. A Deep Learning System to Screen Novel Coronavirus Disease 2019 Pneumonia. Engineering 2020. [Google Scholar] [CrossRef]
  138. Singh, D.; Kumar, V.; Kaur, M. Classification of COVID-19 patients from chest CT images using multi-objective differential evolution–based convolutional neural networks. Eur. J. Clin. Microbiol. Infect. Dis. 2020, 39, 1379–1389. [Google Scholar] [CrossRef]
  139. Sedik, A.; Iliyasu, A.M.; El-Rahiem, B.A.; Abdel Samea, M.E.; Abdel-Raheem, A.; Hammad, M.; Peng, J.; Abd El-Samie, F.E.; Abd El-Latif, A.A. Deploying machine and deep learning models for efficient data-augmented detection of COVID-19 infections. Viruses 2020, 12, 769. [Google Scholar] [CrossRef]
  140. Ahsan, M.M.; Alam, T.E.; Trafalis, T.; Huebner, P. Deep MLP-CNN model using mixed-data to distinguish between COVID-19 and Non-COVID-19 patients. Symmetry 2020, 12, 1526. [Google Scholar] [CrossRef]
  141. Afshar, P.; Heidarian, S.; Naderkhani, F.; Oikonomou, A.; Plataniotis, K.N.; Mohammadi, A. COVID-CAPS: A capsule network-based framework for identification of COVID-19 cases from X-ray images. Pattern Recognit. Lett. 2020, 138, 638–643. [Google Scholar] [CrossRef] [PubMed]
  142. Rosenthal, A.; Gabrielian, A.; Engle, E.; Hurt, D.E.; Alexandru, S.; Crudu, V.; Sergueev, E.; Kirichenko, V.; Lapitskii, V.; Snezhko, E.; et al. The TB Portals: An Open-Access, Web-Based Platform for Global Drug-Resistant-Tuberculosis Data Sharing and Analysis. J. Clin. Microbiol. 2017, 55, 3267–3282. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  143. Cid, Y.D.; Liauchuk, V.; Klimuk, D.; Tarasau, A. Overview of ImageCLEFtuberculosis 2019 - Automatic CT-based Report Generation and Tuberculosis Severity Assessment; CEUR-WS: Lugano, Switzerland, 2019; pp. 9–12. [Google Scholar]
  144. Demner-Fushman, D.; Kohli, M.D.; Rosenman, M.B.; Shooshan, S.E.; Rodriguez, L.; Antani, S.; Thoma, G.R.; McDonald, C.J. Preparing a collection of radiology examinations for distribution and retrieval. J. Am. Med. Inform. Assoc. 2016, 23, 304–310. [Google Scholar] [CrossRef]
  145. Shiraishi, J.; Katsuragawa, S.; Ikezoe, J.; Matsumoto, T.; Kobayashi, T.; Komatsu, K.I.; Matsui, M.; Fujita, H.; Kodera, Y.; Doi, K. Development of a digital image database for chest radiographs with and without a lung nodule: Receiver operating characteristic analysis of radiologists’ detection of pulmonary nodules. Am. J. Roentgenol. 2000, 174, 71–74. [Google Scholar] [CrossRef] [PubMed]
  146. Jaeger, S.; Candemir, S.; Antani, S.; Wáng, Y.x.J.; Lu, P.x.; Thoma, G. Two public chest X-ray datasets for computer-aided screening of pulmonary diseases. Quant. Imaging Med. Surg. 2014, 4, 475–477. [Google Scholar] [CrossRef]
  147. Wang, X.; Peng, Y.; Lu, L.; Lu, Z.; Bagheri, M.; Summers, R.M. ChestX-ray8: Hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 3462–3471. [Google Scholar]
  148. Costa, M.G.; Filho, C.F.; Kimura, A.; Levy, P.C.; Xavier, C.M.; Fujimoto, L.B. A sputum smear microscopy image database for automatic bacilli detection in conventional microscopy. In Proceedings of the 2014 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC, Chicago, IL, USA, 26–30 August 2014; pp. 2841–2844. [Google Scholar] [CrossRef]
  149. Armato, S.G.; McLennan, G.; Bidaut, L.; McNitt-Gray, M.F.; Meyer, C.R.; Reeves, A.P.; Zhao, B.; Aberle, D.R.; Henschke, C.I.; Hoffman, E.A.; et al. The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): A completed reference database of lung nodules on CT scans. Med. Phys. 2011, 38, 915–931. [Google Scholar] [CrossRef]
  150. Grossman, R.L.; Heath, A.P.; Ferretti, V.; Varmus, H.E.; Lowy, D.R.; Kibbe, W.A.; Staudt, L.M. Toward a Shared Vision for Cancer Genomic Data. N. Engl. J. Med. 2016, 375, 1109–1112. [Google Scholar] [CrossRef]
  151. Cohen, J.P.; Morrison, P.; Dao, L.; Roth, K.; Duong, T.Q.; Ghassemi, M. COVID-19 Image Data Collection: Prospective Predictions Are the Future. arXiv 2020, arXiv:2006.11988. [Google Scholar]
Figure 1. Flow diagram of the methodology used to conduct this survey.
Figure 2. Overview of using deep learning for lung disease detection.
Figure 3. Taxonomy of lung disease detection using deep learning.
Figure 4. Examples of chest X-ray images.
Figure 5. Examples of CT scan images.
Figure 6. Examples of sputum smear microscopy images.
Figure 7. Examples of histopathology images.
Figure 8. Examples of image augmentation: (a) original; (b) 45° rotation; (c) 90° rotation; (d) horizontal flip; (e) vertical flip; (f) positive x and y translation; (g) negative x and y translation; (h) salt and pepper noise; and (i) speckle noise.
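The augmentations listed in the Figure 8 caption can be sketched with NumPy as below. This is a minimal illustration and not the implementation of any surveyed paper. The function names, the noise levels, the 2-pixel shift and the assumption that images are 2-D float arrays scaled to [0, 1] are all hypothetical choices. The 45° rotation is left out because it needs an interpolating routine, which plain NumPy does not provide.

```python
import numpy as np


def translate(img, dx, dy):
    """Shift a 2-D image by (dx, dy) pixels; vacated pixels are filled with zeros."""
    h, w = img.shape
    out = np.zeros_like(img)
    # Part of the source image that remains visible after the shift.
    src = img[max(0, -dy):min(h, h - dy), max(0, -dx):min(w, w - dx)]
    top, left = max(0, dy), max(0, dx)
    out[top:top + src.shape[0], left:left + src.shape[1]] = src
    return out


def salt_and_pepper(img, amount, rng):
    """Set roughly `amount` of the pixels to 0 (pepper) or 1 (salt)."""
    out = img.copy()
    u = rng.random(img.shape)
    out[u < amount / 2] = 0.0
    out[u > 1 - amount / 2] = 1.0
    return out


def speckle(img, sigma, rng):
    """Multiplicative Gaussian noise: img + img * n, clipped back to [0, 1]."""
    noisy = img + img * rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0.0, 1.0)


def augment(img, rng):
    """Return one augmented copy per transformation (assumed parameter values)."""
    return {
        "rot90": np.rot90(img),                # 90 degree rotation
        "hflip": np.fliplr(img),               # horizontal flip
        "vflip": np.flipud(img),               # vertical flip
        "shift_pos": translate(img, 2, 2),     # positive x and y translation
        "shift_neg": translate(img, -2, -2),   # negative x and y translation
        "salt_pepper": salt_and_pepper(img, 0.05, rng),
        "speckle": speckle(img, 0.1, rng),
    }
```

Each call to `augment` turns one training image into seven additional samples, which is how augmentation enlarges a small medical dataset without collecting new scans.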
Figure 9. Example of a CNN structure.
Figure 10. (a) The trend of the usage of image types in lung disease detection works in recent years; and (b) the distribution of the image type used in deep learning aided lung disease detection in recent years.
Figure 11. (a) The trend of the usage of features in lung disease detection works in recent years; and (b) the distribution of usage of data augmentation in deep learning aided lung disease detection in recent years.
Figure 12. (a) The trend of the usage of data augmentation in lung disease detection works in recent years; and (b) the distribution of usage of data augmentation in deep learning aided lung disease detection in recent years.
Figure 13. (a) The trend of the usage of deep learning algorithms in lung disease detection works in recent years; and (b) the distribution of the usage of CNN in deep learning aided lung disease detection in recent years.
Figure 14. (a) The trend of the usage of transfer learning in lung disease detection works in recent years; and (b) the usage of transfer learning in lung disease detection works using CNN.
Figure 15. (a) The trend of the usage of ensemble classifier in lung disease detection works in recent years; and (b) the distribution of the usage of the ensemble in deep learning aided lung disease detection in recent years.
Figure 16. (a) The trend of the deep learning aided lung disease detection works in recent years; and (b) the distribution of the diseases detected using deep learning in recent years.
Table 1. Summary of papers for tuberculosis detection using deep learning.
Reference | Deep Learning Technique | Features | Dataset
[74] | CNN with transfer learning and data augmentation | Features extracted from CNN | Montgomery
[38] | K-nearest neighbour, Simple Linear Regression and Sequential Minimal Optimisation (SMO) Classification | Area, major axis, minor axis, eccentricity, mean, kurtosis, skewness and entropy | Shenzhen
[84] | ViDi | Features extracted from CNN | Unspecified
[64] | CNN | Gabor, LBP, SIFT, PHOG and features extracted from CNN | Private dataset
[24] | CNN | Features extracted from CNN | ImageCLEF 2018 dataset
[62] | CNN with transfer learning, with demographic information | Features extracted from CNN + demographic information | Private dataset
[79] | CNN with data augmentation, and ensemble by weighted averages of probability scores | Features extracted from CNN | Montgomery, Shenzhen, Belarus, JSRT
[70] | CNN with transfer learning and data augmentation | Features extracted from CNN | Private dataset, Montgomery, Shenzhen
[69] | CNN | Features extracted from CNN | Private datasets, Montgomery, Shenzhen
[71] | CNN with transfer learning and ensemble by simple linear probabilities averaging | Features extracted from CNN + rule-based features | Indiana, JSRT, Shenzhen
[29] | CNN | HoG features | ZiehlNeelsen Sputum smear Microscopy image DataBase
[75] | CNN and shuffle sampling | Features extracted from CNN | Private datasets
[81] | CNN with transfer learning and ensemble by averaging | CNN extracted features from edge images | Montgomery, Shenzhen
[57] | CNN with transfer learning, data augmentation and ensemble by weighted probability scores average | Features extracted from CNN | Private dataset, Montgomery, Shenzhen, Belarus
[85] | AutoEncoder-CNN | Features extracted from CNN | Private dataset
[76] | CNN with transfer learning and shuffle sampling | Features extracted from CNN | Private dataset
[65] | End-to-end CNN | Features extracted from CNN | Montgomery, Shenzhen
[88] | Optical flow model | Activity Description Vector on optical flow of video sequences | ImageCLEF 2019 dataset
[28] | CNN | Colours | TBimages dataset
[83] | Modified maximum pattern margin support vector machine (modified miSVM) | First four moments of the intensity distributions | Private datasets
[61] | CAD4TB with clinical information | Features extracted from CNN + clinical features | Private dataset
[31] | DBN | LoH + SURF features | ZiehlNeelsen Sputum smear Microscopy image DataBase
[60] | CAD4TB | Features extracted from CNN | Private dataset
[72] | CNN with transfer learning and data augmentation | Features extracted from CNN | Montgomery, Shenzhen, NIH-14 dataset
[30] | CNN | Features extracted from CNN | TBimages dataset
[63] | CNN from scratch and data augmentation | Features extracted from CNN | Montgomery, Shenzhen, Belarus
[86] | 3D CNN | Features extracted from CNN + lung volume + patient attribute metadata | ImageCLEF 2019 dataset
[12] | CNN with transfer learning and ensemble by stacking | Local and global feature descriptors + features extracted from CNN | Private dataset, Montgomery, Shenzhen, India
[80] | CNN with transfer learning and feature level ensemble | Features extracted from CNN | Shenzhen
[15] | CNN with transfer learning and ensemble by averaging | CNN extracted features from edge images | Montgomery, Shenzhen
[32] | CNN with transfer learning | Features extracted from CNN | ZiehlNeelsen Sputum smear Microscopy image DataBase
[66] | CNN with data augmentation | Features extracted from CNN | Shenzhen
[73] | CNN with transfer learning and data augmentation | Features extracted from CNN | NIH-14, Montgomery, Shenzhen
[19] | CNN with transfer learning, Bag of CNN Features and ensemble by a simple soft-voting scheme | Features extracted from CNN + BOW | Private dataset, Montgomery, Shenzhen
[36] | Neural network | Shape, curvature descriptor histograms, eigenvalues of Hessian matrix | Montgomery, Shenzhen
[77] | CNN with transfer learning and data augmentation | Features extracted from CNN | Montgomery, Shenzhen, NIH-14
[87] | 3D CNN | Features extracted from CNN | ImageCLEF 2019 dataset
[78] | CNN and Artificial Ecosystem-based Optimisation algorithm | Features extracted from CNN | Shenzhen
[67] | CNN | Features extracted from CNN | Shenzhen
[68] | Bayesian based CNN | Features extracted from CNN | Montgomery, Shenzhen
[82] | CNN with transfer learning, and ensemble by majority voting, simple averaging, weighted averaging, and stacking | Features extracted from CNN | Montgomery, Shenzhen, LDOCTCXR, 2018 RSNA pneumonia challenge dataset, Indiana dataset
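Several rows of Table 1 combine CNNs by averaging probability scores or by majority voting. The sketch below shows, under stated assumptions, what those two combination rules compute. It is not taken from any surveyed paper. The function names and the layout of `probs` as (n_models, n_samples, n_classes) are hypothetical, and ties in the vote are resolved towards the lowest class index.

```python
import numpy as np


def weighted_average(probs, weights):
    """Soft ensemble: weighted mean of per-model class probabilities.

    probs has shape (n_models, n_samples, n_classes); weights has one entry per model.
    """
    probs = np.asarray(probs, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                          # normalise so the weights sum to 1
    return np.tensordot(w, probs, axes=1)    # shape (n_samples, n_classes)


def majority_vote(probs):
    """Hard ensemble: each model votes for its top class; the most voted class wins."""
    probs = np.asarray(probs, dtype=float)
    votes = probs.argmax(axis=2)             # shape (n_models, n_samples)
    n_classes = probs.shape[2]
    # counts[c, s] = number of models that voted class c for sample s
    counts = np.stack([(votes == c).sum(axis=0) for c in range(n_classes)])
    return counts.argmax(axis=0)             # shape (n_samples,)


# Three hypothetical models scoring two images as (normal, tuberculosis).
example_probs = np.array([
    [[0.9, 0.1], [0.4, 0.6]],
    [[0.6, 0.4], [0.3, 0.7]],
    [[0.2, 0.8], [0.6, 0.4]],
])
soft_labels = weighted_average(example_probs, [1, 1, 1]).argmax(axis=1)
hard_labels = majority_vote(example_probs)
```

Equal weights reduce the first rule to simple averaging. Unequal weights would typically be chosen from each model's validation performance.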
Table 2. Summary of papers for pneumonia detection using deep learning.
Reference | Deep Learning Technique | Features | Dataset
[99] | Deep Siamese based neural network | CNN extracted features from the left half and right half of the lungs | Unspecified Kaggle dataset
[20] | CNN with transfer learning and data augmentation | Features extracted from CNN | LDOCTCXR
[55] | CNN with transfer learning, data augmentation and ensemble by majority voting | Features extracted from CNN | LDOCTCXR
[93] | CNN with transfer learning | Features extracted from CNN | LDOCTCXR
[102] | CNN with transfer learning, data augmentation and ensemble by combining confidence scores and bounding boxes | Features extracted from CNN | Radiological Society of North America (RSNA) pneumonia dataset
[96] | CNN with transfer learning and data augmentation | Features extracted from CNN | NIH Chest X-ray Dataset
[92] | CNN from scratch and data augmentation | Features extracted from CNN | LDOCTCXR
[95] | CNN with transfer learning | Features extracted from CNN | LDOCTCXR
[91] | CNN | Features extracted from CNN | Mooney’s Kaggle dataset
[100] | CNN and LSTM-CNN, with transfer learning and data augmentation | Features extracted from CNN | Mooney’s Kaggle dataset
[103] | CNN with probabilistic map of pneumonia | Features extracted from CNN | 2018 RSNA pneumonia challenge dataset
[101] | Decision Tree, Random Forest, K-nearest neighbour, AdaBoost, Gradient Boost, XGBoost, CNN | Multiple features | Mooney’s Kaggle dataset
[98] | CNN with transfer learning, data augmentation and ensemble by weighted averaging | Features extracted from CNN | LDOCTCXR
[97] | CNN with transfer learning and data augmentation | Features extracted from CNN | Mooney’s Kaggle dataset
[94] | CNN with transfer learning | Features extracted from CNN | Private dataset
Table 3. Summary of papers for lung cancer detection using deep learning.
Reference | Deep Learning Technique | Features | Dataset
[13] | CNN | Features extracted from CNN | LUNA, LIDC, NLST
[113] | CNN with transfer learning | Features extracted from CNN | JSRT Dataset, NIH-14 dataset
[107] | Multi-stream multi-scale convolutional networks | Features extracted from CNN | MILD dataset, DLCST dataset
[34] | CNN with transfer learning | Features extracted from CNN | NCI Genomic Data Commons
[110] | CNN with transfer learning and data augmentation | Features extracted from CNN | NSCLC-Radiomics, NSCLC-Radiomics-Genomics, RIDER Collections and several private datasets
[105] | CNN and DBN | Features extracted from CNN and DBN | LIDC-IDRI
[112] | CNN with transfer learning | Features extracted from CNN | Kaggle Data Science Bowl 2017 dataset, Lung Nodule Analysis 2016 (LUNA16) dataset
[25] | CNN | Features extracted from CNN | LIDC-IDRI
[108] | CNN | Features extracted from CNN | LIDC-IDRI
[23] | CNN with data augmentation | Features extracted from CNN | LIDC-IDRI database
[111] | CNN with transfer learning and data augmentation | Features extracted from CNN | Private dataset
[14] | Bone elimination and lung segmentation before training with CNN | Features extracted using CNN from bone eliminated lung images and segmented lung images | JSRT dataset
[114] | CNN-long short-term memory network | Features extracted from CNN | NIH-14 dataset
[109] | CNN with transfer learning and data augmentation | Features extracted from CNN | JSRT database
[106] | CNN with data augmentation | Features extracted from CNN | Cancer Imaging Archive
Table 4. Summary of papers for COVID-19 detection using deep learning.
Reference | Deep Learning Technique | Features | Dataset
[137] | CNN with transfer learning and location-attention classification mechanism | Features extracted from CNN | Private dataset
[125] | CNN with transfer learning and data augmentation | Features extracted from CNN | SIRM database, Cohen’s Github dataset, Chowdhury’s Kaggle dataset
[26] | RADLogics Inc., CNN with transfer learning and data augmentation | Features extracted from RADLogics Inc and CNN | Chainz Dataset, a dataset from a hospital in Wenzhou, China, dataset from El-Camino Hospital (CA) and Lung image database consortium (LIDC)
[123] | CNN with transfer learning | Features extracted from CNN | Cohen’s Github dataset and LDOCTCXR
[21] | CNN with transfer learning and data augmentation | Features extracted from CNN | Cohen’s Github dataset and unspecified Kaggle dataset
[135] | VB-Net and modified random decision forests method | 96 handcrafted image features | Dataset obtained from Tongji Hospital of Huazhong University of Science and Technology, Shanghai Public Health Clinical Center of Fudan University, and China-Japan Union Hospital of Jilin University
[126] | CNN from scratch and data augmentation | Features extracted from CNN | COVIDx Dataset
[127] | CNN with transfer learning | Features extracted from CNN | Cohen’s Github dataset, Andrew’s Kaggle dataset, LDOCTCXR
[117] | CNN with transfer learning | Features extracted from CNN | Cohen’s Github dataset, RSNA pneumonia dataset, COVIDx
[131] | CNN with transfer learning and data augmentation | Features extracted from CNN | Sajid’s Kaggle dataset
[4] | CNN with transfer learning and data augmentation | Features extracted from CNN | Cohen’s Github dataset, Mooney’s Kaggle dataset
[118] | CNN with transfer learning | Features extracted from CNN | COVID-CT-Dataset
[128] | CNN as feature extractor and long short-term memory (LSTM) network as classifier | Features extracted from CNN | GitHub, Radiopaedia, The Cancer Imaging Archive, SIRM, Kaggle repository, NIH dataset, Mendeley dataset
[132] | CNN with transfer learning and synthetic data generation and augmentation | Features extracted from CNN | Cohen’s Github, Chowdhury’s Kaggle dataset, COVID-19 Chest X-ray Dataset Initiative
[129] | CNN with transfer learning, data augmentation and ensemble by majority voting | Features extracted from CNN | Cohen’s Github, LDOCTCXR
[134] | CNN with transfer learning and stacking ensemble | Features extracted from CNN | Private dataset, LDOCTCXR
[130] | CNN | Features extracted from CNN | Private dataset
[138] | Multi-objective differential evolution-based CNN | Features extracted from CNN | Unspecified
[119] | CNN with transfer learning | Features extracted from CNN | Cohen’s Github
[139] | CNN and ConvLSTM with data augmentation | Features extracted from CNN | Cohen’s Github, COVID-CT-Dataset
[120] | CNN with transfer learning | Features extracted from CNN | Cohen’s Github
[133] | CNN with ensemble by weighted averaging | Features extracted from CNN | Private hospital datasets
[121] | CNN with transfer learning | Features extracted from CNN | Cohen’s Github, Mooney’s Kaggle dataset, Shenzhen and Montgomery datasets
[140] | MLP-CNN based model | Features extracted from CNN | Cohen’s Github
[122] | CNN with transfer learning | Features extracted from CNN | Cohen’s Github, unspecified Kaggle dataset
[141] | Capsule Network-based framework with transfer learning | Features extracted from CNN | Cohen’s Github, Mooney’s Kaggle dataset
Table 5. Summary of datasets used for tuberculosis detection.
Name | Disease | Image Type | Reference | Number of Images | Link
Belarus dataset | Tuberculosis | X-ray and CT | [142] | 1299 | http://tuberculosis.by
ImageCLEF 2018 dataset | Tuberculosis | CT | - | 2287 | https://www.imageclef.org/2018/tuberculosis
ImageCLEF 2019 dataset | Tuberculosis | CT | [143] | 335 | https://www.imageclef.org/2019/medical/tuberculosis
India | Tuberculosis | X-ray | [39] | 78 tuberculosis and 78 normal | https://sourceforge.net/projects/tbxpredict/
Indiana Dataset | Multiple diseases with annotations | X-ray | [144] | 7284 | https://openi.nlm.nih.gov
JSRT dataset | Lung nodules and normal | X-ray and CT | [145] | 154 nodule and 93 non-nodule | http://db.jsrt.or.jp/eng.php
Montgomery and Shenzhen datasets | Tuberculosis and normal | X-ray | [146] | 394 tuberculosis and 384 normal | https://lhncbc.nlm.nih.gov/publication/pub9931
NIH-14 dataset | Pneumonia and 13 other diseases | X-ray | [147] | 112,120 | https://www.kaggle.com/nih-chest-xrays/data
TBimages dataset | Tuberculosis | Sputum smear microscopy image | [148] | 1320 | http://www.tbimages.ufam.edu.br/
ZiehlNeelsen Sputum smear Microscopy image DataBase | Tuberculosis | Sputum smear microscopy image | [27] | 620 tuberculosis and 622 normal | http://14.139.240.55/znsm/
Large Dataset of Labeled Optical Coherence Tomography (OCT) and Chest X-Ray Images (LDOCTCXR) | Pneumonia and normal | X-ray | [93] | 3883 pneumonia and 1349 normal | https://data.mendeley.com/datasets/rscbjbr9sj/3
Radiological Society of North America (RSNA) pneumonia dataset | Pneumonia and normal | X-ray | - | 5528 | https://www.kaggle.com/c/rsna-pneumonia-detection-challenge/data
Table 6. Summary of datasets used for pneumonia detection.
| Name | Disease | Image Type | Reference | Number of Images | Link |
| --- | --- | --- | --- | --- | --- |
| LDOCTCXR | Pneumonia and normal | X-ray | [93] | 3883 pneumonia and 1349 normal | https://data.mendeley.com/datasets/rscbjbr9sj/3 |
| NIH Chest X-ray Dataset | Pneumonia and 13 other diseases | X-ray | [147] | 112,120 | https://www.kaggle.com/nih-chest-xrays/data |
| Radiological Society of North America (RSNA) pneumonia dataset | Pneumonia and normal | X-ray | | 5528 | https://www.kaggle.com/c/rsna-pneumonia-detection-challenge/data |
| Mooney’s Kaggle dataset | Pneumonia and normal | X-ray | | 5863 | https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia |
Table 7. Summary of datasets used for lung cancer detection.
| Name | Disease | Image Type | Reference | Number of Images | Link |
| --- | --- | --- | --- | --- | --- |
| JSRT dataset | Lung nodules and normal lungs | X-ray and CT | [145] | 154 nodule and 93 non-nodule | http://db.jsrt.or.jp/eng.php |
| Kaggle Data Science Bowl 2017 dataset | Lung Cancer | CT scans | | 601 | https://www.kaggle.com/c/data-science-bowl-2017/overview |
| LIDC-IDRI | Lung Cancer | CT | [149] | 1018 | https://wiki.cancerimagingarchive.net/display/Public/LIDC-IDRI |
| Lung Nodule Analysis 2016 (LUNA16) dataset | Location and size of lung nodules | CT scans | [8] | 888 | https://luna16.grand-challenge.org/download/ |
| NCI Genomic Data Commons | Lung Cancer | Histopathology images | [150] | More than 575,000 | https://portal.gdc.cancer.gov/ |
| NIH-14 dataset | 14 lung diseases | X-ray | [147] | 112,120 | https://www.kaggle.com/nih-chest-xrays/data |
| NLST | Lung Cancer | CT | | Approximately 200,000 | https://biometry.nci.nih.gov/cdas/learn/nlst/images/ |
| NSCLC-Radiomics | Lung Cancer | CT | | 422 | https://wiki.cancerimagingarchive.net/display/Public/NSCLC-Radiomics |
| NSCLC-Radiomics-Genomics | Lung Cancer | CT | | 89 | https://wiki.cancerimagingarchive.net/display/Public/NSCLC-Radiomics-Genomics |
| RIDER Collections | Lung Cancer | CT | | Approximately 280,000 | https://wiki.cancerimagingarchive.net/display/Public/RIDER+Collections |
Table 8. Summary of datasets used for COVID-19 detection.
| Name | Disease | Image Type | Reference | Number of Images | Link |
| --- | --- | --- | --- | --- | --- |
| Andrew’s Kaggle dataset | COVID-19 | X-ray and CT | | 79 | https://www.kaggle.com/andrewmvd/convid19-x-rays |
| Chainz Dataset | COVID-19 and normal | CT | | 50 COVID-19, 51 normal | www.ChainZ.cn |
| Chowdhury’s Kaggle dataset | COVID-19, normal and pneumonia | X-ray | [125] | 219 COVID-19, 1341 normal and 1345 pneumonia | https://www.kaggle.com/tawsifurrahman/covid19-radiography-database |
| Cohen’s Github dataset | COVID-19 | X-ray and CT | [151] | 123 | https://github.com/ieee8023/covid-chestxray-dataset |
| COVIDx Dataset | COVID-19, normal and pneumonia | X-ray | [126] | 573 COVID-19, 8066 normal and 5559 pneumonia | https://github.com/lindawangg/COVID-Net/blob/master/docs/COVIDx.md |
| Italian Society Of Medical And Interventional Radiology (SIRM) COVID-19 Database | COVID-19 | X-ray and CT | | 68 | https://www.sirm.org/category/senza-categoria/covid-19/ |
| LDOCTCXR | Pneumonia and normal | X-ray | [93] | 3883 pneumonia and 1349 normal | https://data.mendeley.com/datasets/rscbjbr9sj/3 |
| Lung image database consortium (LIDC) | Lung Cancer | CT | [149] | 1018 | https://wiki.cancerimagingarchive.net/display/Public/LIDC-IDRI |
| Sajid’s Kaggle dataset | COVID-19 and normal | X-ray | | 28 normal, 70 COVID-19 | https://www.kaggle.com/nabeelsajid917/covid-19-x-ray-10000-images |
| Mooney’s Kaggle dataset | Pneumonia and normal | X-ray | | 5863 | https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia |
| COVID-CT Dataset | COVID-19 and normal | CT | | 349 COVID-19 and 463 non-COVID-19 | https://github.com/UCSD-AI4H/COVID-CT |
| Mendeley Augmented COVID-19 X-ray Images Dataset | COVID-19 and normal | X-ray | | 912 | https://data.mendeley.com/datasets/2fxz4px6d8/4 |
| COVID-19 Chest X-Ray Dataset Initiative | COVID-19 | X-ray | | 55 | https://github.com/agchung/Figure1-COVID-chestxray-dataset |
Table 9. Summary of the works surveyed based on the taxonomy.
| Attributes | Subattributes | References |
| --- | --- | --- |
| Image types | X-ray | [4,12,14,15,19,20,21,24,36,38,55,57,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,91,92,93,94,95,96,97,98,99,100,101,102,103,109,113,114,117,119,120,121,122,123,124,125,126,127,128,129,131,132,134,139,140,141] |
| | CT Scans | [13,23,25,26,86,87,88,105,106,107,108,110,111,112,118,130,133,135,137,138,139] |
| | Sputum Smear Microscopy Images | [28,29,30,31,32] |
| | Histopathology images | [34] |
| Features | Extracted from CNN | [4,12,13,14,15,19,20,21,23,24,25,26,30,32,34,55,57,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,84,85,86,87,91,92,93,94,95,96,97,98,99,100,101,102,103,105,106,107,108,109,110,111,112,113,114,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,137,138,139,140,141] |
| | Others | [12,15,26,28,29,31,36,38,61,62,64,71,81,83,86,88,105,135] |
| Data augmentation | Yes | [4,20,21,23,26,55,57,63,66,70,73,74,77,79,92,96,97,98,100,102,106,109,110,111,114,122,125,126,128,129,131,132,139] |
| Types of deep learning algorithm | CNN | [4,12,13,14,15,19,20,21,23,24,25,26,28,29,30,32,34,55,57,60,61,62,63,64,65,66,67,68,69,72,74,76,77,78,79,80,81,82,84,85,86,91,92,93,94,95,96,97,98,99,100,101,102,103,105,106,107,108,109,110,111,112,113,114,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,137,138,139,140,141] |
| | Non-CNN | [19,26,31,36,38,83,88,105,135] |
| Transfer learning | Fixed feature extractor | [12,15,19,21,62,70,76,78,80,81,93,94,96,100,102,117,127,128,137] |
| | Fine-tuning CNN | [4,20,26,32,34,55,57,71,72,73,74,76,77,79,82,95,97,98,102,109,110,111,112,113,118,119,120,121,122,123,124,125,129,131,132,134,141] |
| Ensemble | Majority voting | [19,55,82,129] |
| | Probability score averaging | [15,57,71,79,81,82,98,102,133] |
| | Stacking | [12,82,134] |
| | Other | [80] |
| Disease types | Tuberculosis | [12,15,19,24,28,29,30,31,32,36,38,57,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88] |
| | Pneumonia | [20,55,91,92,93,94,95,96,97,98,99,100,101,102,103] |
| | Lung cancer | [13,14,23,25,34,105,106,107,108,109,110,111,112,113,114] |
| | COVID-19 | [4,21,26,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,137,138,139,140,141] |
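
To make two of the ensemble subattributes in Table 9 concrete, the following is a minimal sketch of majority voting and probability score averaging. It is an illustration only and is not taken from any of the surveyed works: the function names, the two-class setup (0 = normal, 1 = pneumonia) and the probability values are all hypothetical.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label predicted by the most classifiers.

    Ties are resolved in favour of the label encountered first.
    """
    return Counter(labels).most_common(1)[0][0]


def average_probabilities(prob_rows):
    """Average per-class probability scores across classifiers.

    Returns the index of the class with the highest mean score,
    together with the list of mean scores.
    """
    n_models = len(prob_rows)
    n_classes = len(prob_rows[0])
    mean = [sum(row[c] for row in prob_rows) / n_models for c in range(n_classes)]
    best = max(range(n_classes), key=lambda c: mean[c])
    return best, mean


# Hypothetical class probabilities from three classifiers for one image
# (class 0 = normal, class 1 = pneumonia); values are made up for illustration.
probs = [[0.60, 0.40], [0.55, 0.45], [0.10, 0.90]]

# Majority voting uses only each classifier's hard decision.
hard_labels = [max(range(len(p)), key=lambda c: p[c]) for p in probs]
vote_label = majority_vote(hard_labels)

# Probability score averaging keeps each classifier's confidence.
avg_label, avg_scores = average_probabilities(probs)

print("hard labels:", hard_labels)
print("majority vote:", vote_label)
print("averaged scores:", avg_scores, "->", avg_label)
```

With these made-up numbers the two rules disagree: two weakly confident classifiers outvote the third under majority voting, whereas averaging lets the single highly confident classifier dominate. Which behaviour is preferable depends on how well calibrated the individual classifiers are.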
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

MDPI and ACS Style

Kieu, S.T.H.; Bade, A.; Hijazi, M.H.A.; Kolivand, H. A Survey of Deep Learning for Lung Disease Detection on Medical Images: State-of-the-Art, Taxonomy, Issues and Future Directions. J. Imaging 2020, 6, 131. https://doi.org/10.3390/jimaging6120131
