
Cancer Diagnosis Using Deep Learning: A Bibliographic Review

Department of Information Engineering, Electronics and Telecommunications (DIET), Sapienza University of Rome, Via Eudossiana 18, 00184 Rome, Italy
Department of Mechanical and Aerospace Engineering (DIMA), Sapienza University of Rome, Via Eudossiana 18, 00184 Rome, Italy
Department of Basic and Applied Science for Engineering (SBAI), Sapienza University of Rome, Via Antonio Scarpa 14/16, 00161 Rome, Italy
Author to whom correspondence should be addressed.
Cancers 2019, 11(9), 1235;
Submission received: 3 June 2019 / Revised: 30 June 2019 / Accepted: 14 August 2019 / Published: 23 August 2019


In this paper, we first describe the basics of cancer diagnosis: the steps of the diagnostic pipeline, followed by the classification methods typically used by doctors, which gives readers a historical view of cancer classification techniques. These methods include the Asymmetry, Border, Color and Diameter (ABCD) method, the seven-point detection method, the Menzies method, and pattern analysis. They are used regularly by doctors for cancer diagnosis, although they are not considered very efficient. For the benefit of all audiences, the basic evaluation criteria are also discussed. The criteria include the receiver operating characteristic (ROC) curve, area under the ROC curve (AUC), F1 score, accuracy, specificity, sensitivity, precision, Dice coefficient, average accuracy, and Jaccard index. Because the earlier methods are considered inefficient, better and smarter methods for cancer diagnosis are needed. Artificial intelligence is gaining attention as a way to build better diagnostic tools. In particular, deep neural networks can be successfully used for intelligent image analysis. The basic framework of how machine learning operates on medical imaging is provided in this study, i.e., pre-processing, image segmentation and post-processing. The second part of this manuscript describes the different deep learning techniques, such as convolutional neural networks (CNNs), generative adversarial networks (GANs), deep autoencoders (DANs), restricted Boltzmann machines (RBMs), stacked autoencoders (SAE), convolutional autoencoders (CAE), recurrent neural networks (RNNs), long short-term memory (LSTM), multi-scale convolutional neural networks (M-CNN), and multi-instance learning convolutional neural networks (MIL-CNN). For each technique, we provide Python codes, to allow interested readers to experiment with the cited algorithms on their own diagnostic problems.
The third part of this manuscript compiles the deep learning models that have been successfully applied to different types of cancer. Considering the length of the manuscript, we restrict ourselves to the discussion of breast cancer, lung cancer, brain cancer, and skin cancer. The purpose of this bibliographic review is to give researchers who choose to implement deep learning and artificial neural networks for cancer diagnosis a from-scratch knowledge of the state-of-the-art achievements.

1. Introduction

Cancer is a leading cause of death worldwide [1]. Both researchers and doctors are facing the challenges of fighting cancer [2]. According to the American Cancer Society, 96,480 deaths are expected due to skin cancer, 142,670 from lung cancer, 42,260 from breast cancer, 31,620 from prostate cancer, and 17,760 deaths from brain cancer in 2019 (American Cancer Society, new cancer release report 2019) [3]. Early detection of cancer is the top priority for saving lives. Typically, visual examination and manual techniques are used for these types of cancer diagnosis. This manual interpretation of medical images is time-consuming and highly prone to mistakes.
For this reason, in the early 1980s [4], computer-aided diagnosis (CAD) systems were introduced to assist doctors and improve the efficiency of medical image interpretation. Feature extraction is the key step in adopting machine learning. Different methods of feature extraction for different types of cancer have been investigated in [5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]. However, these methods based on feature extraction have weaknesses. To overcome these weaknesses and to enhance the performance, representation learning has been proposed in [22,23]. Deep learning has the advantage of generating high-level feature representations directly from raw images. In addition, graphics processing units (GPUs) are used to parallelize feature extraction and image recognition. For example, convolutional neural networks have been able to detect cancer with promising performance [24].
To test these algorithms, there are publicly available datasets. These include INbreast and BreakHis for breast cancer testing; the Digital Database for Screening Mammography (DDSM) for mass detection; MITOS-ATYPIA for mitosis detection; the Japanese Society of Radiological Technology (JSRT) database, the Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI), and the Danish Lung Cancer Screening Trial (DLCST) for lung nodule classification; the multimodal Brain Tumor Segmentation challenge (BraTS) for brain cancer identification; and the Dermatology Information System (DermIS) as well as data released to the public by the International Skin Imaging Collaboration (ISIC) for skin cancers.

2. Steps of Cancer Diagnosis

2.1. Pre-Processing

Raw images contain noise, so the first step in the detection procedure is pre-processing, i.e., improving the quality of an image for later use by removing unwanted image information, referred to as image noise. Classification inaccuracies may occur if this issue is not handled properly; for example, an image containing hairs along with the lesion may be misclassified. In skin images, pre-processing is also required because of the low contrast between the skin lesion and the surrounding healthy skin, irregular borders, and skin artifacts such as hairs, skin lines, and black frames. Many filters can be applied to remove Gaussian noise, speckle noise, Poisson noise, and salt-and-pepper noise, including the median filter, mean filter, adaptive median filter, Gaussian filter, and adaptive Wiener filter.
Image noise is removed or reduced by pre-processing tasks such as contrast adjustment, vignetting effect removal, color correction, image smoothing, hair removal, normalization, and localization. The right combination of pre-processing tasks improves accuracy. Some of the pre-processing techniques are black frame removal, automatic color equalization, hair removal, DullRazor, the Karhunen–Loève transform [25], Gaussian filter, pseudo-random filter, non-skin masking, color space transform, and contrast enhancement. MRI images of brain cancer are first converted into greyscale and then undergo contrast adjustment using a smoothing operation [26]. Skull stripping is also performed on brain MRI images using a brain extraction tool (BET), which separates brain tissue from the other parts of the skull [27]. Computed tomography (CT) images, acquired with X-ray machines for lung cancer diagnosis, are pre-processed by first converting them into grayscale images, followed by normalization and noise reduction. These images are then converted into binary images, after which the unwanted part is removed [28]. Pre-processing in breast cancer consists in particular of the delineation of tumors from the background, breast border extraction, and pectoral muscle suppression. Mammograms, which are used for breast cancer diagnosis, contain many kinds of noise, such as high-intensity rectangular labels, low-intensity labels, and tape artifacts. Thus, mammogram labeling, orientation, and segmentation are handled during pre-processing [29]. For prostate cancer diagnosis, transrectal ultrasound (TRUS) images are obtained, which have inherent noise and low resolution. The pre-processing module used to suppress noise and artifacts consists of: (a) tree-structured nonlinear filtering (TSF); (b) directional wavelet transform (DWT); and (c) tree-structured wavelet transform (TSWT) [30].
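As an illustration of the filtering step, the following minimal sketch applies a 3 × 3 median filter to a small grey-level image corrupted by salt-and-pepper noise. The function name `median_filter`, the toy image, and the edge handling (clamping coordinates to the border) are assumptions made for this example and are not taken from the cited works.

```python
from statistics import median

def median_filter(image, size=3):
    """Apply a size x size median filter to a 2D list of grey levels.

    Border pixels are handled by clamping coordinates to the image edge
    (an assumption of this sketch; other padding schemes are possible).
    """
    height, width = len(image), len(image[0])
    half = size // 2
    output = [[0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            window = []
            for dr in range(-half, half + 1):
                for dc in range(-half, half + 1):
                    r = min(max(row + dr, 0), height - 1)
                    c = min(max(col + dc, 0), width - 1)
                    window.append(image[r][c])
            output[row][col] = median(window)
    return output

# Toy image: uniform background with one "salt" (255) and one "pepper" (0) pixel.
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 0, 10],
    [10, 10, 10, 10],
]
cleaned = median_filter(noisy)  # both outliers are replaced by the value 10
```

The median is preferred over the mean here because a single extreme pixel does not shift the median of its neighborhood, whereas it would bias an average.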

2.2. Image Segmentation

Segmentation is the division of the input image into regions from which the information needed for further processing can be extracted. It is basically the separation of a region of interest (ROI) from the background of the image. The ROI is the part of the image that we want to use. In cancer images, we need the lesion in order to extract features from the diseased part. Segmentation can be divided into four main classes: (i) threshold-based segmentation; (ii) region-based segmentation; (iii) pixel-based segmentation; and (iv) model-based segmentation. Threshold-based segmentation includes Otsu's method, maximum entropy, local and global thresholding, and histogram-based thresholding. Watershed segmentation and seeded region growing are examples of region-based segmentation. Fuzzy c-means clustering, artificial neural networks, and the Markov random field method are some of the pixel-based segmentation methods. Model-based segmentation uses parametric deformable models, e.g., level sets. There are many other methods for image segmentation: histogram thresholding, adaptive thresholding, gradient vector flow, distributed and localized region identification, clustering and statistical region growing [31], bootstrap learning [32], active contours, supervised learning, edge detection, probabilistic modeling, sparse coding [33], contextual hypergraph [34], cooperative neural network segmentation, principal component transform, and region fused band and narrow band graph partition [35], among others. Hybrid models that combine two or more of these methods have been used to improve the accuracy of the system.
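To make threshold-based segmentation concrete, the sketch below implements Otsu's method on a flat list of grey levels: it picks the threshold that maximizes the between-class variance of background and foreground. The function name `otsu_threshold` and the toy pixel values are illustrative assumptions; a real pipeline would operate on full images.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximizes between-class variance.

    Pixels with value <= t form the background class, the rest the foreground.
    """
    histogram = [0] * levels
    for value in pixels:
        histogram[value] += 1
    total = len(pixels)
    total_sum = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for t in range(levels):
        weight_bg += histogram[t]
        if weight_bg == 0:
            continue  # no background pixels yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break  # every pixel is already in the background class
        sum_bg += t * histogram[t]
        mean_bg = sum_bg / weight_bg
        mean_fg = (total_sum - sum_bg) / weight_fg
        variance = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, t
    return best_threshold

# Toy data: four dark "skin" pixels and four bright "lesion" pixels.
pixels = [20, 22, 21, 19, 200, 205, 198, 202]
threshold = otsu_threshold(pixels)
mask = [1 if p > threshold else 0 for p in pixels]  # 1 marks the ROI
```

On this toy input the two intensity groups are well separated, so the resulting mask isolates the four bright pixels as the region of interest.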

2.3. Post-Processing

After pre-processing and image segmentation comes post-processing, where the task is to extract features. The most common post-processing methods are opening and closing operations, island removal, region merging, border expansion, and smoothing. Some techniques used for feature extraction are principal component analysis (PCA), wavelet packet transform (WPT) [36,37], grey-level co-occurrence matrix (GLCM) [38], Fourier power spectrum (FPS) [39], Gaussian derivative kernels [40], and decision boundary features [41]. The basic steps of cancer diagnosis are summarized in Table 1.

2.4. ABCD-Rule

ABCD-rule analysis [42] refers to the asymmetry (A), border (B), color (C) and diameter (D) of the lesion image. (A) Asymmetry: The lesion is bisected by two perpendicular axes, positioned so as to give the lowest possible asymmetry score. The score is 2 if the lesion is asymmetric with respect to both axes, 1 if it is asymmetric with respect to one axis, and 0 if there is no asymmetry. (B) Border: The lesion is divided into eight segments, and each is checked for sharp and abrupt changes at the border; a sharp cut-off scores 1 and a gradual one scores 0. (C) Color: The colors present in the lesion are counted; the relevant shades are black and brown, and sometimes white, red or pink. (D) Diameter: The diameter of the lesion is measured. A diameter larger than 6 mm is considered suspicious for melanoma. Figure 1 shows the block diagram of all the methods described in this section.

2.5. Seven-Point Checklist Method

There are two types of criteria based on which classification is done. These are major and minor criteria. The major criteria have three points, and each point has a score value of 2, whereas minor criteria have four points each with a score value of 1. If the score value is at least 3, the classification result would be malignant melanoma [43].

2.5.1. Major Criteria

Blue-white veil: Blue blotches covered by a white haze, with no defined structure.
Atypical pigment network: A network of reticular lines distributed asymmetrically within the lesion, with heterogeneous color and thickness.
Atypical vascular pattern: Irregular linear, globular, or dotted vessels.

2.5.2. Minor Criteria

Irregular globules/dots: Dots and globules with irregular shape, color, and distribution; dots are smaller than 0.1 mm, whereas globules are larger than 0.1 mm.
Irregular blotches: Areas of different colors (white, black, or brown) with no defined shape, distribution, or structure.
Irregular streaks: When a melanoma starts growing radially, it forms radial streaks and pseudopods, which are located at the edges of the lesion area.
Regression structures: Scar-like areas of depigmentation, typically white in color.
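The scoring rule of the seven-point checklist (2 points per major criterion, 1 point per minor criterion, malignant melanoma at a total of 3 or more) can be sketched as follows. The identifier names and the representation of findings as a set of strings are assumptions made for illustration only.

```python
# Criterion names are hypothetical labels chosen for this sketch.
MAJOR_CRITERIA = (
    "blue_white_veil",
    "atypical_pigment_network",
    "atypical_vascular_pattern",
)
MINOR_CRITERIA = (
    "irregular_globules_dots",
    "irregular_blotches",
    "irregular_streaks",
    "regression_structures",
)

def seven_point_score(findings):
    """Sum 2 points for each major and 1 point for each minor criterion present."""
    major = sum(2 for name in MAJOR_CRITERIA if name in findings)
    minor = sum(1 for name in MINOR_CRITERIA if name in findings)
    return major + minor

def is_malignant(findings, threshold=3):
    """Apply the checklist threshold: a score of at least 3 indicates melanoma."""
    return seven_point_score(findings) >= threshold

example = {"blue_white_veil", "irregular_streaks"}
score = seven_point_score(example)  # 2 (major) + 1 (minor) = 3
```

A lesion showing only two minor criteria scores 2 and stays below the threshold, whereas one major plus one minor criterion already reaches it.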

2.6. Menzies Method

There are a few positive features (Pos.F) and negative features (Neg.F). A lesion is classified as melanoma only if both negative features are absent and one or more positive features are present; the presence of either negative feature indicates a benign lesion [43]. These are summarized in Table 2.

2.7. Pattern Analysis

This method is based on finding patterns, which can be local or global. Global patterns can be homogeneous, globular, starburst, reticular, parallel, multi-component, or cobblestone. Local patterns can be irregular streaks, inadequate pigmentation, pigment network, regression structures, globules, black dots, vascular structures or blue-white veil. The basis of this method is the qualitative assessment of the individual dermoscopic criteria.

3. Artificial Neural Networks

Neural networks are capable of performing complex computations because of the nonlinear processing of their neurons. An artificial neural network is shown in Figure 2. As the artificial neural network has the power of prediction, it can be used for medical images. In a general artificial neural network, training images are presented to the network. The network is trained with the back-propagation algorithm: the input first flows in the forward direction, the generated output is compared with the desired output, and an error signal is generated if the two do not match. This error propagates in the backward direction, and the weights are adjusted to reduce it. This process is repeated until the error becomes sufficiently small. The neural network has a layered structure with a number of interconnected nodes, each with an activation function. Common activation functions are the hyperbolic tangent function, sigmoid function, piecewise linear function, and threshold function. Input patterns are presented to the network through an input layer, which connects to the hidden layer, which in turn connects to the output layer. Below, some of the ANNs are explained in detail and the links to their corresponding codes are provided in Table 3.
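The forward pass, error signal, backward pass, and weight update described above can be sketched with a very small network. The example below is only an illustration under stated assumptions: it uses the XOR toy problem instead of images, one hidden layer of three sigmoid units, full-batch gradient descent, and a fixed number of epochs as the stopping rule; none of these choices come from the cited works.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_xor(hidden=3, epochs=2000, lr=0.5, seed=0):
    """Train a 2-hidden-1 sigmoid network with full-batch gradient descent.

    Returns the mean squared error recorded at every epoch.
    """
    rng = random.Random(seed)
    data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
    w_in = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(hidden)]
    b_in = [0.0] * hidden
    w_out = [rng.uniform(-1, 1) for _ in range(hidden)]
    b_out = 0.0
    losses = []
    for _ in range(epochs):
        grad_w_in = [[0.0, 0.0] for _ in range(hidden)]
        grad_b_in = [0.0] * hidden
        grad_w_out = [0.0] * hidden
        grad_b_out = 0.0
        loss = 0.0
        for inputs, target in data:
            # Forward pass: input -> hidden -> output.
            h = [sigmoid(sum(w * x for w, x in zip(w_in[j], inputs)) + b_in[j])
                 for j in range(hidden)]
            out = sigmoid(sum(w * v for w, v in zip(w_out, h)) + b_out)
            error = out - target
            loss += error ** 2
            # Backward pass: propagate the error signal towards the input.
            delta_out = error * out * (1.0 - out)
            for j in range(hidden):
                delta_h = delta_out * w_out[j] * h[j] * (1.0 - h[j])
                grad_w_out[j] += delta_out * h[j]
                grad_b_in[j] += delta_h
                for i in range(2):
                    grad_w_in[j][i] += delta_h * inputs[i]
            grad_b_out += delta_out
        losses.append(loss / len(data))
        # Weight update: move against the accumulated gradient.
        for j in range(hidden):
            w_out[j] -= lr * grad_w_out[j]
            b_in[j] -= lr * grad_b_in[j]
            for i in range(2):
                w_in[j][i] -= lr * grad_w_in[j][i]
        b_out -= lr * grad_b_out
    return losses

losses = train_xor()
```

Inspecting `losses` shows the error shrinking over the epochs; in practice training stops when the error is small enough or stops improving, rather than when it reaches exactly zero.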

3.1. Convolutional Neural Networks

CNN is a feed-forward neural network as shown in Figure 3. Here, the signal is processed directly without any loops or cycles. This can be represented as
$$G(X) = g_N(g_{N-1}(\ldots(g_1(X))))$$
where $N$ represents the number of hidden layers, $X$ is the input signal, and $g_N$ denotes the function corresponding to layer $N$. A basic CNN model has a convolutional layer, which consists of a function $g$ with multiple convolutional kernels $(h_1, \ldots, h_{k-1}, h_k)$. Every $h_k$ denotes a linear function in the $k$th kernel, represented as follows:
$$h_k(x, y) = \sum_{s=-m}^{m} \sum_{t=-n}^{n} \sum_{v=-d}^{w} V_k(s, t, v)\, X(x - s, y - t, z - v)$$
where $(x, y, z)$ represents the pixel position of input $X$, $m$ represents height, $n$ denotes width, $w$ is the depth of the filter, and $V_k$ represents the weights of the $k$th kernel.
The basic purpose of pooling in a CNN is subsampling, i.e., it summarizes a neighborhood of nearby pixels and replaces them in the output with a single summarized value. Pooling reduces the dimensionality and makes the representation approximately invariant to small translations and rotations of the input. There are many pooling functions [44]; one of the most famous is max pooling, in which the output is the maximum value of a rectangular pixel neighborhood. In average pooling, the output is the average of the rectangular neighborhood. Another type uses a weighted average based on the distance from the central pixel.
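Max pooling and average pooling can be compared on a small feature map. The sketch below assumes non-overlapping 2 × 2 windows and a plain 2D list as input; the function name `pool2d` and the sample values are hypothetical.

```python
def pool2d(feature_map, size=2, mode="max"):
    """Pool non-overlapping size x size windows of a 2D list."""
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for top in range(0, height - size + 1, size):
        row = []
        for left in range(0, width - size + 1, size):
            window = [feature_map[top + i][left + j]
                      for i in range(size) for j in range(size)]
            if mode == "max":
                row.append(max(window))
            else:
                row.append(sum(window) / len(window))
        pooled.append(row)
    return pooled

feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]
max_pooled = pool2d(feature_map, mode="max")      # [[4, 2], [2, 8]]
avg_pooled = pool2d(feature_map, mode="average")  # [[2.5, 1.0], [0.75, 6.5]]
```

Both variants halve each spatial dimension; max pooling keeps the strongest response in each window, while average pooling smooths it.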
Atrous convolution is given by the following equation:
$$y[i] = \sum_{k=1}^{K} x[i + r \cdot k]\, w[k]$$
where $x[i]$ is the 1D input signal, $w[k]$ is a filter of length $K$, and $r$ is the atrous rate, i.e., the stride with which the input signal is sampled; $y[i]$ is the output of the atrous convolution. The filter $w$ is applied to the input $x$ at each output location $i$.
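The atrous convolution formula can be checked numerically with a few lines of code. In this sketch the filter taps are indexed from 0 rather than 1, and only output positions where the whole dilated filter fits inside the signal are computed; both are assumptions of the example, as are the function name and the sample signal.

```python
def atrous_conv1d(x, w, rate=1):
    """Compute y[i] = sum_k x[i + rate * k] * w[k] for every valid i.

    The filter taps are indexed from 0 here (the text indexes them from 1),
    and only positions where the whole dilated filter fits are returned.
    """
    span = rate * (len(w) - 1)
    return [
        sum(x[i + rate * k] * w[k] for k in range(len(w)))
        for i in range(len(x) - span)
    ]

signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]
dense = atrous_conv1d(signal, kernel, rate=1)    # [-2, -2, -2, -2, -2]
dilated = atrous_conv1d(signal, kernel, rate=2)  # [-4, -4, -4]
```

With rate 2 the same three-tap filter covers five input samples, which is how atrous convolution enlarges the receptive field without adding parameters.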
Deep residual learning is used to counter the degradation problem, which arises when deeper networks start to converge: as the depth increases, accuracy saturates and then degrades. The residual network explicitly lets the stacked layers fit a residual mapping rather than the desired underlying mapping. According to the experimental results, residual networks are easier to optimize, and they gain accuracy from considerably increased depth. Skip connections help information traverse deep neural networks. When passing through many layers, the gradient information may be lost, which is known as the vanishing gradient problem. Skip connections have the advantage of passing feature information directly to later layers, which makes it easier to classify minute details. Some spatial information is lost in the max-pooling operation, whereas skip connections carry more information to the final layer, so the classification accuracy increases.
In the activation layer, different activation functions can be used.
The sigmoid activation function [45] is given by the equation:
$$\sigma(x) = \frac{1}{1 + e^{-x}}$$
It is nonlinear in nature, and combinations of it are also nonlinear, which gives us the liberty to stack layers together. Between $x = -2$ and $x = 2$ the curve is fairly steep, so small changes in $x$ produce large changes in $y$. One of the advantages of this activation function is that its output always remains within the range (0, 1).
The tanh function is defined as follows:
$$f(x) = \tanh(x) = \frac{2}{1 + e^{-2x}} - 1$$
This is also known as the scaled sigmoid function:
$$\tanh(x) = 2\,\mathrm{sigmoid}(2x) - 1$$
Its range is from −1 to 1. The gradient is stronger for tanh than for the sigmoid function.
The rectified linear unit (ReLU) is the most commonly used activation function [45,46,47], where $g$ denotes a pixel-wise function that is nonlinear in nature. That is, it outputs $x$ if $x$ is positive and 0 otherwise:
$$g(x) = \max(0, x)$$
ReLU is nonlinear in nature and its combination is also nonlinear, meaning different layers can be stacked together. Its range is from 0 to infinity, meaning the activation is unbounded and can blow up. For the pooling layer, $g$ reduces the size of the features while acting as a layer-wise down-sampling nonlinear function. A fully connected layer can be regarded as a convolution with a 1 × 1 kernel. The prediction layer has a softmax, which predicts the probability that $X_j$ belongs to each of the possible classes.
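The activation functions of this section, together with the softmax used in the prediction layer, can be written in a few lines. The sketch below also checks numerically that tanh is a scaled sigmoid; the function names are chosen for this example, and subtracting the maximum score inside the softmax is a common numerical-stability choice rather than something prescribed by the text.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def tanh_from_sigmoid(x):
    """tanh expressed as a scaled and shifted sigmoid: 2 * sigmoid(2x) - 1."""
    return 2.0 * sigmoid(2.0 * x) - 1.0

def relu(x):
    return max(0.0, x)

def softmax(scores):
    """Turn a list of class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probabilities = softmax([1.0, 2.0, 3.0])
predicted_class = probabilities.index(max(probabilities))  # index of largest score
```

The sigmoid and softmax outputs are bounded, while ReLU passes positive inputs through unchanged and is therefore unbounded above.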

3.2. Multi-Scale Convolutional Neural Network (M-CNN)

One of the multi-scale CNN architectures, as described by researchers in [48], consists of three convolutional blocks, each of which comprises a convolutional layer and a rectified linear unit (ReLU) followed by the max-pooling layer and two fully connected layers. Each input image is first down-sampled to multiple different scales and then the patches are collected. These patches are passed to the multi-scale CNN model scale-wise.

3.3. Multi-Instance Learning Convolutional Neural Network (MIL-CNN)

Multi-instance learning (MIL) deals with problems in which labels are available only for sets of data points. Here, the sets of data points are called bags and the individual data points are referred to as instances. With binary labels, the most common assumption is that a bag is positive if at least one instance within it is positive. Many functions have been used to map the instance space to the bag space, including Noisy-OR, generalized mean (GM), and log-sum-exponential (LSE).
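The three instance-to-bag mappings named above can be sketched as follows, using their commonly used forms. The function names, the sharpness parameter `r`, and the sample probabilities are assumptions of this example; the exact formulations in the cited works may differ.

```python
import math

def noisy_or(probs):
    """Bag is positive unless every instance is negative: 1 - prod(1 - p_i)."""
    product = 1.0
    for p in probs:
        product *= 1.0 - p
    return 1.0 - product

def generalized_mean(probs, r=5.0):
    """Generalized mean; larger r moves the result from the mean towards the max."""
    return (sum(p ** r for p in probs) / len(probs)) ** (1.0 / r)

def log_sum_exp(probs, r=5.0):
    """Smooth approximation of the max, controlled by r."""
    return math.log(sum(math.exp(r * p) for p in probs) / len(probs)) / r

bag = [0.1, 0.2, 0.9]  # instance-level probabilities (toy values)
bag_scores = {
    "noisy_or": noisy_or(bag),
    "gm": generalized_mean(bag),
    "lse": log_sum_exp(bag),
}
```

All three functions give a high bag score when a single instance is strongly positive, which matches the assumption that one positive instance makes the bag positive.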

CNN Architectures

Many CNN architectures have been proposed by several researchers in the past. They are briefly described in this section, and summaries of CNN applications are given in Table 4 and Table 5.
In 1998, LeCun et al. proposed a seven-level convolutional neural network named LeNet-5. Its main application was digit classification, and it was used by banks to classify the handwritten numbers of customers. It took 32 × 32 pixel grey-scale images as input. Processing larger, higher-resolution images demands more convolutional layers, which limits this architecture.
AlexNet won the ImageNet challenge in 2012, reducing the top-5 error from 26% to 15.3%. This network is similar to LeNet but is deeper, with an increased number of filters per layer and more stacked convolutional layers. It consists of 11 × 11, 5 × 5, and 3 × 3 convolutional kernels, max pooling, dropout, data augmentation, and ReLU activations. A ReLU activation is attached after every convolutional and fully connected layer. Training took five to six days on two Nvidia GeForce GTX 580 GPUs, which is why the network was split into two pipelines. AlexNet was designed by the SuperVision group, consisting of Alex Krizhevsky, Geoffrey Hinton, and Ilya Sutskever.
ZFNet was the winner of ImageNet Large Scale Visual Recognition Competition (ILSVRC) 2013. The authors reduced the top-5 error rate to 14.8%, which is half the non-neural error rate. They achieved it by keeping the AlexNet structure the same but changing its hyperparameters.
GoogLeNet/Inception V1
This network won ILSVRC 2014 with a top-5 error rate of 6.67%, which is so close to human-level performance that the creators of the network were forced to perform a human evaluation. After weeks of training, the human experts achieved a top-5 error rate of 5.1% (single model) and 3.6% (ensemble). The network is a CNN inspired by LeNet that adds the inception module. It uses batch normalization, image distortions, and RMSprop. It is a 22-layer-deep CNN, yet it reduces the number of parameters from 60 million (AlexNet) to 4 million.
VGGNet was the runner-up in ILSVRC 2014. It is made up of 16 convolutional layers and a uniform architecture. It uses only 3 × 3 convolutions but many filters. It was trained for three weeks on 4 GPUs. Because of its architectural uniformity, it is the most appealing network for feature extraction from images. The weight configurations of this architecture were made public, and it has been used as a baseline feature extractor in many applications and challenges. The biggest challenge with this network is its 138 million parameters, which are difficult to handle.
The residual neural network (ResNet), presented at ILSVRC 2015, uses skip connections and batch normalization. These skip connections resemble the gated units, such as gated recurrent units, recently applied in RNNs. The technique enables training a neural network with 152 layers while keeping its complexity lower than that of VGGNet. It achieved a top-5 error rate of 3.57%, thus beating human-level performance on the given dataset.

3.4. Fully Convolutional Networks (FCNs)

An FCN differs from a CNN in that the fully connected layer is replaced by an up-sampling layer and a deconvolutional layer [106], as shown in Figure 4. These layers are considered to be the backward versions of the pooling and convolutional layers, respectively. FCNs generate a score map for each class instead of a single probability score. This map has exactly the same size as the input image and classifies the image pixel by pixel. Accuracy is then improved by skip connections, which combine the up-sampled output with feature maps from earlier layers. These new layers are used in many deep learning algorithms and applications [107,108,109].

3.4.1. U-Net Fully Convolutional Neural Network

Ronneberger et al. developed U-Net for biomedical image segmentation. The architecture consists of two paths. The first, the contracting path, is known as the encoder and captures the context in the image; it is mainly a stack of convolution and pooling layers. The second is the symmetric expanding path, also known as the decoder, which uses transposed convolutions and enables precise localization. It is an end-to-end FCN with no dense layers, only convolutional layers. Therefore, it can accept an image of any size.

3.4.2. Generative Adversarial Networks (GANs)

Goodfellow et al. introduced the generative adversarial network (GAN) [110], which is basically a two-player min-max game with a generator as the first player and a discriminator as the second player. The generative network $G$ transforms random noise $z \sim p_z$, drawn from a prior distribution $p_z$, into realistic-looking images $G(z) \sim p_{fake}$ [111]. The discriminator network $D$ learns to distinguish the fake samples $G(z)$ produced by the generator from samples of the real training data distribution, $x \sim p_{real}$. The parameters of the generator $G$ are adjusted using the feedback from the discriminator $D$, so that the generator's samples are able to fool the discriminator in the classification task. As training proceeds, $G$ produces better and more realistic fake samples while $D$ learns to tell them apart from the real samples. GANs have the ability to map random noise to a realistic distribution [112,113]. GANs have been used for various applications including reconstruction [114,115,116], segmentation [117,118,119], domain adaptation [120,121] and detection [122,123]. GANs have also been used for synthetic data generation; for example, Hou et al. used nuclei masks, dealing with the foreground and background separately, to generate pathology data. For the 3D segmentation of images, the generator of the GAN architecture is used to generate images from the learned data distribution $p_{data}(x)$, with the simultaneous training of the discriminator to differentiate between the generated images and true examples [124]. The generator maps the noise to the synthetic image vector. Singh et al. [125] used conditional adversarial networks for the segmentation of breast masses in mammography.
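The two-player game described above is commonly written as a single minimax objective. The form below is the standard one from the original GAN formulation; the symbol $V(D, G)$ and the expectation notation are not used elsewhere in this review and are introduced here only for illustration:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{real}}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right]$$

The discriminator $D$ tries to make both terms large by assigning high scores to real samples and low scores to generated ones, while the generator $G$ tries to make the second term small by producing samples that $D$ scores as real.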

3.5. Recurrent Neural Networks (RNNs)

Recurrent neural networks are a powerful model of sequential data [126]. A hidden vector sequence $f = (f_1, \ldots, f_T)$ is computed by the RNN from the input sequence $v = (v_1, \ldots, v_T)$ by iterating from $t = 1$ to $T$, and an output sequence $o = (o_1, \ldots, o_T)$ is obtained:
$$f_t = F(W_{vf} v_t + W_{ff} f_{t-1} + b_f)$$
$$o_t = W_{fo} f_t + b_o$$
where $W$ denotes the weight matrices and $b$ represents the bias vectors. $F$ is the hidden layer function, which is usually the sigmoid function. Recurrent neural networks are deep in time because they are a function of all the previous hidden states.
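The recurrence above translates directly into a short forward pass. The sketch below uses plain Python lists for vectors and matrices, takes the sigmoid as the hidden function, and assumes an all-zero initial hidden state; the function names and the toy weights are illustrative assumptions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def rnn_forward(inputs, w_vf, w_ff, w_fo, b_f, b_o):
    """Return (hidden_states, outputs) for a sequence of input vectors."""
    hidden = [0.0] * len(b_f)  # the initial hidden state is assumed to be zero
    hidden_states, outputs = [], []
    for v_t in inputs:
        pre = [a + b + c for a, b, c in
               zip(matvec(w_vf, v_t), matvec(w_ff, hidden), b_f)]
        hidden = [sigmoid(p) for p in pre]          # f_t
        o_t = [a + b for a, b in zip(matvec(w_fo, hidden), b_o)]
        hidden_states.append(hidden)
        outputs.append(o_t)
    return hidden_states, outputs

# Toy check: with zero input and recurrent weights, every hidden unit is sigmoid(0).
states, outputs = rnn_forward(
    inputs=[[1.0], [2.0], [3.0]],
    w_vf=[[0.0], [0.0]],
    w_ff=[[0.0, 0.0], [0.0, 0.0]],
    w_fo=[[1.0, 1.0]],
    b_f=[0.0, 0.0],
    b_o=[0.5],
)
```

Because each hidden state is computed from the previous one, the output at step $t$ depends on the whole input history up to $t$.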

3.6. Long Short-Term Memory (LSTM)

Long short-term memory (LSTM) is a form of recurrent neural network [127] introduced by Hochreiter and Schmidhuber in 1997. LSTMs were designed mainly to avoid the long-term dependency problem; remembering information over long periods is their default behavior. Because it has feedback connections, the LSTM is also referred to as a general-purpose computer. It can process sequences of data, for example, audio speech signals or video signals, as well as single data points, such as images.

3.7. Restricted Boltzmann Machine (RBM)

The restricted Boltzmann machine is characterized by a very simple architecture. It is made up of a visible layer, also referred to as the input layer, and a hidden layer, arranged as a bipartite graph: there is no intra-layer communication in an RBM, which is the major restriction of this architecture. Restricted Boltzmann machines are trained to maximize the product of the probabilities assigned to the patterns in a given training set, using a contrastive divergence algorithm that performs Gibbs sampling.

3.8. Autoencoders (AEs)

Autoencoders belong to the unsupervised learning class of neural networks [128]. A general example of an autoencoder is shown in Figure 5. They learn a lower-dimensional feature representation from the input data. The basic structure of AEs has an input layer followed by a hidden layer and an output layer [129]. Training is done in two stages: encoding and decoding. In the first stage, input $I$ is encoded into a representation $J$ using a weight matrix $Y_{I,J}$ and bias $B_{I,J}$:
$$J = \sigma(Y_{I,J} I + B_{I,J})$$
where $\sigma$ is an activation function, such as the sigmoid function.
In the second stage, the representation $J$ is decoded using a new weight matrix $Y_{J,\hat{I}}$ and bias $B_{J,\hat{I}}$ to reconstruct $\hat{I}$:
$$\hat{I} = \sigma(Y_{J,\hat{I}} J + B_{J,\hat{I}})$$
where $\sigma$ is again an activation function. $Y_{J,\hat{I}}$ can be taken as the transpose of $Y_{I,J}$ or as a new learnable matrix. These AEs are trained to minimize the reconstruction error:
$$\arg\min_{Y, B} \lVert I - \hat{I} \rVert^2$$
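The encoding and decoding stages can be sketched as two small functions plus the squared reconstruction error. This example assumes sigmoid activations and tied weights (the decoder uses the transpose of the encoder matrix), which is only one of the two options mentioned above; the function names and the toy values are hypothetical, and no training loop is included.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def encode(weights, bias, inputs):
    """Map the input to the hidden representation with a sigmoid layer."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, bias)]

def decode_tied(weights, bias, code):
    """Decode with the transpose of the encoder matrix (tied weights)."""
    n_inputs = len(weights[0])
    return [
        sigmoid(sum(weights[j][i] * code[j] for j in range(len(code))) + bias[i])
        for i in range(n_inputs)
    ]

def reconstruction_error(inputs, reconstruction):
    """Squared Euclidean distance between input and reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(inputs, reconstruction))

# Toy setup: 3 inputs, 2 hidden units, all parameters zero.
weights = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
code = encode(weights, [0.0, 0.0], [1.0, 0.0, 0.5])
reconstruction = decode_tied(weights, [0.0, 0.0, 0.0], code)
error = reconstruction_error([1.0, 0.0, 0.5], reconstruction)
```

With all parameters at zero every unit outputs 0.5, so the error is simply the squared distance of the input from 0.5 in each coordinate; training would adjust the weights and biases to drive this error down.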

3.9. Stacked Autoencoders

Stacking of n autoencoders into n hidden layers using unsupervised layer-wise learning followed by the fine-tuning using a supervised method makes the basic structure of the stacked autoencoders (SAEs) [130]. Hence, the SAE method is composed of three steps: Firstly, using input data, the first autoencoder is trained and the feature vector is formed. Secondly, this feature vector is the input of the next layer and the process is repeated until the end of the training of hidden layers. Third, a backpropagation (BP) scheme is used for minimization of the cost function after the training of the hidden layers and the weights are updated with labeled training set to obtain the fine-tuning.

3.10. Sparse Autoencoders (SAE)

These are a special type of autoencoders in which sparsity is introduced in the hidden units of the hidden layer by making the number of nodes in a hidden layer larger than the input layer. Stack of SAE (SSAE) is trained in greedy fashion while they are connected with the encoding part only. First, the hidden layer is separately trained as SAE, and the output of this layer becomes the input of the next layer training. Features are extracted by using low-level SAE, after which multiple SAE are stacked together where these extracted features are fed to the input of high-level SAE for the extraction of deeper features. Hence, these SSAEs are able to extract the deeper features from the data. A fully unsupervised sparse convolutional autoencoder (CAE) for the detection of the nucleus and for feature extraction of tissues from histopathology images was proposed by the researchers in [131]. The CAE network is composed of six convolutional layers and two average-pooling layers. Then, the network is divided into three branches: the nucleus detection branch, the foreground feature branch, and the background branch. The reconstructed images of foreground and background are made by decoding the foreground and background feature maps. The final image is constructed by adding the two intermediate images. The authors evaluated their method on four datasets and they could reduce the state-of-the-art system errors by up to 42%.

3.11. Convolutional Autoencoders (CAE)

Convolutional autoencoders learn features from the unlabeled images by using end-to-end learning scheme [132]. The spatial relationship between the image pixels makes it superior to the stacked autoencoders. It belongs to the category of unsupervised learning algorithms. Features can be extracted from them once the filters have been learned. The extracted features can easily be used to reconstruct the input. In CAEs, the number of parameters required to create an activation map is always the same, which makes it well suited for the scaled high dimensional images. If the fully connected layers of a simple autoencoder are replaced by the convolutional layer, it becomes the convolutional autoencoder. The sizes of the input layer and the output layer remain the same as in the simple autoencoder, except the decoding part, which changes to the convolutional network [133]. A self-clustering adversarial convolutional network with an unsupervised principle was proposed for the classification of prostate tissue as a tumor or non-tumor without the labeled data.

3.12. Deep Belief Networks (DBN)

Using a stack of restricted Boltzmann machines (RBMs) [134], a probabilistic generative model is constructed, which is called a deep belief network (DBN) and is shown in Figure 6. An RBM has two layers, a visible layer and a hidden layer. The energy function used by the RBM is defined as:
$$E(u, J) = -a^{T} u - b^{T} J - u^{T} Y J$$
where $u$ is the visible vector, $J$ is the hidden vector, $Y$ is the weight matrix connecting the two layers, $a$ is the bias vector of the visible layer and $b$ is the bias vector of the hidden layer. The RBM maximizes the product of the probabilities of the visible vectors, where each probability is obtained from the energy function:
$$\arg\max_{Y, a} P(u) = \frac{1}{Z} \sum_{J} \exp\left(-E(u, J)\right)$$
where $Z$ is the partition function. The RBM is optimized using the contrastive divergence algorithm, which combines Gibbs sampling and gradient descent [135]. The DBN is also trained in a greedy fashion. AEs and RBMs share a similar structure.
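The energy function and the probability of a visible vector can be checked numerically on a very small RBM. The following is a minimal sketch, assuming binary units and hypothetical values for the biases and weights; it evaluates the partition function by brute-force enumeration, which is feasible only for a handful of units and is not how RBMs are trained in practice.

```python
import itertools
import math


def energy(u, J, a, b, Y):
    """E(u, J) = -a^T u - b^T J - u^T Y J for binary vectors u and J."""
    visible_term = sum(a_i * u_i for a_i, u_i in zip(a, u))
    hidden_term = sum(b_j * J_j for b_j, J_j in zip(b, J))
    interaction = sum(u[i] * Y[i][j] * J[j]
                      for i in range(len(u)) for j in range(len(J)))
    return -visible_term - hidden_term - interaction


def binary_states(n):
    """All 2^n binary vectors of length n."""
    return list(itertools.product([0, 1], repeat=n))


def visible_probability(u, a, b, Y):
    """P(u) = (1/Z) * sum over hidden vectors J of exp(-E(u, J))."""
    n_visible, n_hidden = len(a), len(b)
    # Partition function: sum over every visible/hidden configuration.
    Z = sum(math.exp(-energy(v, h, a, b, Y))
            for v in binary_states(n_visible)
            for h in binary_states(n_hidden))
    unnormalised = sum(math.exp(-energy(u, h, a, b, Y))
                       for h in binary_states(n_hidden))
    return unnormalised / Z


# Hypothetical parameters: 3 visible units, 2 hidden units.
a = [0.1, -0.2, 0.0]
b = [0.3, -0.1]
Y = [[0.5, -0.4], [0.2, 0.1], [-0.3, 0.6]]

total = sum(visible_probability(v, a, b, Y) for v in binary_states(3))
print(total)  # the probabilities of all visible vectors sum to 1 (up to rounding)
```

Because the cost of computing Z grows exponentially with the number of units, practical training relies on sampling-based approximations such as contrastive divergence instead of this enumeration.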

3.13. Adaptive Fuzzy Inference Neural Network (AFINN)

The inference ability of fuzzy logic, human expert knowledge and the adaptive learning of neural networks are combined in a machine learning approach known as the adaptive fuzzy inference neural network (AFINN). This is a more powerful approach than those based on neural networks or fuzzy logic alone. Sometimes the information gain method is used to reduce the number of inputs in AFINN systems. The network consists of two layers. One is the input–output (I/O) layer and the other is the rule-layer. The I/O layer consists of the input-part and the output-part. Each node in the rule-layer represents one fuzzy rule. The rule-layer is fully connected to the output-part, and the connection weights store the fuzzy if-then rules. At the learning stage, the membership functions are tuned automatically. Weights are adjusted in AFINN by backpropagation.

4. Evaluation Metrics

True positive (TP) is the correct classification of the positive class; for example, an image contains cancerous cells, the model segments the cancerous part successfully, and the outcome indicates the presence of cancer. True negative (TN) is the correct classification of the negative class; for example, there is no cancer present in the image and the model declares that cancer is not present. False positive (FP) is the incorrect prediction of the positive class; for example, there is no cancer in the image but the model classifies the image as cancerous. False negative (FN) is the incorrect prediction of the negative class; for example, the image does contain cancerous cells but the model classifies it as not containing cancer.
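The four outcomes can be counted directly from a list of ground-truth labels and a list of predicted labels. The following is a minimal sketch, assuming binary labels in which 1 means "cancer present" and 0 means "cancer absent", and using the standard convention that a false positive is a healthy image flagged as cancerous; the function name and the example data are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels (1 = cancer, 0 = no cancer)."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1   # cancer present and detected
        elif truth == 0 and pred == 0:
            tn += 1   # no cancer and none reported
        elif truth == 0 and pred == 1:
            fp += 1   # no cancer, but the model reports cancer
        else:
            fn += 1   # cancer present, but the model misses it
    return tp, tn, fp, fn


# Hypothetical labels for eight images.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
print(confusion_counts(y_true, y_pred))  # (3, 3, 1, 1)
```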

4.1. Receiver Operating Characteristic Curve (ROC-Curve)

The receiver operating characteristic curve (ROC-curve) represents the performance of the proposed model at all classification thresholds. It is the graph of true positive rate vs. false positive rate (TPR vs. FPR).
$$TPR = \frac{TP}{TP + FN}$$
$$FPR = \frac{FP}{FP + TN}$$

4.2. Area under the ROC Curve (AUC)

AUC is the area under the ROC-curve integrated from (0, 0) to (1, 1). It gives an aggregate measure of performance across all possible classification thresholds. AUC ranges from 0 to 1. A model whose predictions are 100% correct has an AUC of 1.0, and one whose predictions are 100% wrong has an AUC of 0.0. It is attractive for two reasons: first, it is scale-invariant, which means it measures how well the predictions are ranked rather than their absolute values; and, second, it is classification-threshold invariant, as it measures the model’s performance irrespective of the threshold chosen.
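To make the relation between the ROC-curve and the AUC concrete, the following minimal sketch sweeps the classification threshold over the predicted scores, records one (FPR, TPR) point per threshold, and integrates the resulting curve with the trapezoidal rule. The function names and the example scores are hypothetical, and the sketch assumes that both classes are present in the data.

```python
def roc_points(y_true, scores):
    """(FPR, TPR) points obtained by lowering the threshold step by step."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical ground truth and predicted scores for four images.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

curve = roc_points(y_true, scores)
print(curve)       # [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
print(auc(curve))  # 0.75
```

Rescaling all scores by a positive constant leaves the curve unchanged, which illustrates the scale invariance mentioned above.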

4.3. F1-Score

Precision: It measures how precise the model is, i.e., the fraction of the predicted positives that are true positives.
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
Recall: It measures how many of the actual positives the model has captured by labeling them as positive.
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
The F1-score is a function of precision and recall (their harmonic mean). It is used when a balance between precision and recall is needed.
$$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$

4.4. Accuracy

Accuracy is the fraction of all items, positive and negative, that were classified correctly, i.e., the true positives (TP) and true negatives (TN) divided by the total number of items:
$$\mathrm{Acc.} = \frac{TP + TN}{TP + TN + FN + FP}$$

4.5. Specificity

It is the rate of correct identification of negative items:
$$\mathrm{Spef.} = \frac{TN}{TN + FP}$$

4.6. Sensitivity

Sensitivity is the fraction of positive items that are correctly identified; it is identical to recall:
$$\mathrm{Sens.} = \frac{TP}{TP + FN}$$

4.7. Precision

It is the ratio of correctly predicted positive items to the total number of items predicted as positive:
$$\mathrm{Prec.} = \frac{TP}{TP + FP}$$

4.8. Jaccard Index

It is a measure of similarity rate between two sample sets:
$$\mathrm{Jac}_{idx} = \frac{TP}{TP + FP + FN}$$

4.9. Dice-Coefficient

It is a statistical measure of similarity rate between two sample sets:
$$\mathrm{Dice}_{cof} = \frac{2 \times TP}{2 \times TP + FP + FN}$$
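For segmentation, the two sample sets are usually the predicted mask and the ground-truth mask, and TP, FP and FN are counted per pixel. The following minimal sketch, with hypothetical function names and a hypothetical pair of 3 × 4 binary masks, computes both measures and also checks the known identity Dice = 2J/(1 + J) that links them.

```python
def overlap_counts(pred_mask, true_mask):
    """Pixel-wise TP, FP and FN between two binary masks of equal size."""
    tp = fp = fn = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1       # lesion pixel correctly segmented
            elif p and not t:
                fp += 1       # background pixel marked as lesion
            elif t and not p:
                fn += 1       # lesion pixel that was missed
    return tp, fp, fn


def jaccard_index(tp, fp, fn):
    return tp / (tp + fp + fn)


def dice_coefficient(tp, fp, fn):
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical ground-truth and predicted masks (1 = lesion pixel).
true_mask = [[0, 1, 1, 0],
             [0, 1, 1, 0],
             [0, 0, 0, 0]]
pred_mask = [[0, 1, 1, 1],
             [0, 1, 0, 0],
             [0, 0, 0, 0]]

tp, fp, fn = overlap_counts(pred_mask, true_mask)
jaccard = jaccard_index(tp, fp, fn)
dice = dice_coefficient(tp, fp, fn)
print(jaccard, dice)  # 0.6 0.75
```

Since the Dice-coefficient is a monotonic function of the Jaccard index, the two measures always rank segmentation results in the same order, although their numerical values differ.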

4.10. Average Accuracy

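It is a measure of the effectiveness of the classifier, obtained by averaging the accuracy over all outputs:
$$\mathrm{Avg}_{acc} = \frac{\sum_{s=1}^{m} \frac{TP_{s} + TN_{s}}{TP_{s} + TN_{s} + FP_{s} + FN_{s}}}{m}$$
where m is the total number of outputs of the system and the counts with subscript s are those of the sth output.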
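The count-based measures of Sections 4.3 to 4.10 can be gathered in a few lines. The following is a minimal sketch using the standard definitions, in which specificity is TN/(TN + FP) and sensitivity coincides with recall, TP/(TP + FN); the function names and the example counts are hypothetical, and the sketch assumes that no denominator is zero.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard metrics computed from the four confusion-matrix counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)          # also called recall or TPR
    specificity = tn / (tn + fp)          # equals 1 - FPR
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


def average_accuracy(counts_per_output):
    """Mean of the per-output accuracies; each item is (TP, TN, FP, FN)."""
    accuracies = [(tp + tn) / (tp + tn + fp + fn)
                  for tp, tn, fp, fn in counts_per_output]
    return sum(accuracies) / len(accuracies)


# Hypothetical counts for a test set of 100 images.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")

# Hypothetical system with two outputs.
print(average_accuracy([(40, 45, 5, 10), (30, 60, 5, 5)]))
```

With these counts the sensitivity (0.80) is lower than the specificity (0.90), a difference that the accuracy of 0.85 alone would hide; this is why the reviewed studies usually report several measures together.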

5. Models and Algorithms

5.1. Breast Cancer

Much research has been done on the detection and diagnosis of breast cancer in the past few years. Some of the related papers are briefly discussed in this section. Albayrak et al. [52] designed a deep learning method for feature extraction from histopathological images of the breast, focused in particular on the detection of mitosis. The proposed model extracted features with a CNN; these features were used to train a support vector machine, which detected the mitoses. AlexNet was used for the construction of a CNN to classify benign mitosis from malignant mitosis using histopathological images [49]. A deep cascade network was proposed for the detection of mitosis from breast histology slides [85]. From the histology slides, the mitosis candidates were extracted using a trained FCN model. Then, a CaffeNet model [114] pretrained on ImageNet images was fine-tuned for mitosis classification. Then, three networks with fully connected layers but different configurations were trained and the outputs were generated in the form of multiple scores or probabilities. These scores were averaged to generate the final output. In the biomedical context, deep CNNs were explored by Albarqouni and his fellow researchers for non-expert crowd annotations. They proposed a multi-scale CNN architecture in which the CNN was combined with the crowd annotations: after every softmax layer, they introduced an aggregation layer (AL) to aggregate the prediction results with the annotation results from multiple participants. Classification of nuclei from breast histopathological images using a stacked sparse autoencoder (SSAE) based algorithm was presented in [24]. Optimization of the SSAE was done using a greedy strategy, where only one hidden layer at a time was trained and the output of the previous layer became the input of the next hidden layer. 
In addition to the histopathological image-based detection of breast cancer, mammographic images have also been used in studies on breast cancer detection. A hybrid model combining a CNN with an SVM was introduced in [50] for mass detection on digital mammograms. Mammogram patches were used for the training of the CNN model and a high-level feature representation was obtained from the output of the last fully connected layer. This high-level feature representation was used to train the SVM for the classification. In [54], a transfer learning strategy was employed to train a CNN model because the available training data were insufficient. Using this CNN, it was possible to detect the mass from the available mammograms. When the training data are limited to a few patterns, over-fitting can occur. To overcome it, Swiderski et al. presented a way to enrich the training data using non-negative matrix factorization (NMF) and statistical self-similarity [53]. Ertosun et al. presented a model which first detects the presence of a mass in the mammograms and then locates the mass in the mammographic images [51]. To learn features from mammograms at multiple scales, Kallenberg et al. trained a model with a stacked convolutional sparse autoencoder (SCAE) [136]. The robustness of the model was enhanced by including a sparsity regularizer. Different potential functions were combined by Dhungel et al. using a structured support vector machine [137]. These potential functions included a Gaussian mixture model, a location prior, and a deep belief network for mass segmentation in the mammograms. In another paper, Dhungel et al. proposed a model using a cascade of random forest classifiers and deep learning for mass detection [138]. A 3D multi-view model for the learning of bilateral features from digital breast tomosynthesis (DBT) was proposed in [88]. 
From the source volume, they obtained the volume of interest (VOI), which was treated as a separate input than the VOI in the registered target. To extract high-level features from these two separate VOIs, two separate CNNs were used.

5.2. Lung Cancer

In addition to breast cancer, deep learning has found its use in lung cancer as well. Some of the studies which have applied deep learning for this purpose are discussed in this section. Patients' survival time was successfully predicted directly from lung cancer pathological images by Zhu et al. using deep convolutional neural networks [93]. A pretrained CNN, which was trained on large-scale data, was adopted by Paul et al. for the detection of lung cancer by extracting features from CT images [87]. On the raw images of the lung, the DBN and CNN were applied with end-to-end learning [13]. They used 2D CT images for pulmonary nodule classification, whereas, in [61], researchers used 3D CT images with a multi-view CNN, which could be trained end-to-end. They extracted 2D patches from the 3D images and used them in the CNN for feature extraction. The features were fused together and then fed to the classifier. Dou et al. formed a CNN model which dealt with 3D images directly instead of mapping them into a 2D model [56].
A multi-crop convolutional neural network (Mc-CNN) was constructed in [57]. This model was designed to overcome the problem of variable nodule size. It produces multi-scale features by replacing the max-pooling layer with a multi-crop pooling layer in the CNN structure. For the nonlinear transform, randomized leaky rectified linear units (RReLU) were used. The convolutional operation is defined as follows:
$$y^{l} = \mathrm{RReLU}\left(\sum_{k} c^{kl} \ast h^{k} + b^{l}\right)$$
where $h^{k}$ is the kth input map and $y^{l}$ is the lth output map, $c^{kl}$ is the convolutional kernel between the kth input map and the lth output map, and $b^{l}$ is the bias of the lth output map. There were 64 CT slices and so the output feature maps were also 64. RReLU is defined as:
$$\mathrm{RReLU}(x) = \begin{cases} x & \text{if } x \geq 0 \\ \dfrac{x}{a} & \text{if } x < 0 \end{cases}, \qquad a \sim U(b_{l}, b_{u})$$
where $U(b_{l}, b_{u})$ is the uniform distribution and $a$ is a random factor sampled from this distribution. $b_{l}$ is the lower bound of the distribution and $b_{u}$ is the upper bound of the distribution. The max-pooling used is defined as:
$$y^{i}_{(j,k)} = \max_{0 \leq m, n < s} \left\{ h^{i}_{(j \cdot s + m,\ k \cdot s + n)} \right\}$$
where $y^{i}_{(j,k)}$ is the neuron at position $(j, k)$ in the ith output map and $h^{i}_{(j \cdot s + m,\ k \cdot s + n)}$ is the neuron at position $(j \cdot s + m, k \cdot s + n)$ in the ith input map, $m$ and $n$ are the position offsets, and $s$ is the pool size. The multi-crop pooling strategy is able to capture nodule-centric visual features, whereas traditional max-pooling is used for feature subset selection and feature map size reduction. Thus, it can be said that the pooling operation is basically a one-level reduction of the features. In multi-crop pooling, a repetitive pooling strategy is used, which enables the system to obtain multi-scale features. Consider the three concatenated nodule-centric features $f = [f_{0}, f_{1}, f_{2}]$ formed from $R_{0}$, $R_{1}$ and $R_{2}$, respectively. The size of $R_{0}$ is $l \times l \times n$, $R_{1}$ has the size $l/2 \times l/2 \times n$ and $R_{2}$ has the size $l/4 \times l/4 \times n$, where $n$ is the number of features.
$$f_{i} = \text{max-pool}^{(2-i)}\{R_{i}\}, \quad i = 0, 1, 2$$
The superscript of max-pool gives the number of times max-pooling is applied to the region $R_{i}$. $R_{0}$ is max-pooled twice and generates the feature $f_{0}$. $R_{1}$ is the center region cropped from $R_{0}$; it is max-pooled once to generate the feature $f_{1}$. $R_{2}$ is the center region cropped from $R_{1}$; it is not max-pooled and serves directly as the feature $f_{2}$. All three features therefore have the same spatial size, $l/4 \times l/4$, and the final result of multi-crop pooling is their concatenation. The network is trained by minimizing the cross-entropy, which is defined as:
$$\mathrm{LOSS} = -\left(q \log p_{1} + (1 - q) \log p_{0}\right)$$
where $q$ is the suspiciousness label, which is 1 for high suspiciousness and 0 for low suspiciousness, and $p_{1}$ and $p_{0}$ are the corresponding predicted probabilities. Stochastic gradient descent is used for the training of the network. The dataset used comprises 1010 patients with nodule diameters ranging from 3 mm to 30 mm. They achieved an accuracy of 87.14%, a sensitivity of 0.77 and a specificity of 0.93.
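The RReLU activation and the multi-crop pooling described above can be illustrated on a single two-dimensional feature map. The following is a minimal sketch and not the implementation of [57]: it assumes a square map whose side l is divisible by 4, omits the feature dimension n, uses a pool size of s = 2, and takes hypothetical bounds of 3 and 8 for the uniform distribution of a. All function names are hypothetical.

```python
import random


def rrelu(x, lower=3.0, upper=8.0, rng=random):
    """x for x >= 0, x / a for x < 0, with a drawn from U(lower, upper)."""
    if x >= 0:
        return x
    a = rng.uniform(lower, upper)  # the bounds are assumed values, not from [57]
    return x / a


def max_pool(region, s=2):
    """Non-overlapping s x s max-pooling of a square 2D map."""
    size = len(region) // s
    return [[max(region[j * s + m][k * s + n]
                 for m in range(s) for n in range(s))
             for k in range(size)]
            for j in range(size)]


def center_crop(region):
    """Central square with half the side length of the input."""
    side = len(region)
    quarter = side // 4
    return [row[quarter:side - quarter] for row in region[quarter:side - quarter]]


def multi_crop_pool(r0):
    """f_i = max-pool applied (2 - i) times to R_i, for i = 0, 1, 2."""
    r1 = center_crop(r0)            # l/2 x l/2 center of R0
    r2 = center_crop(r1)            # l/4 x l/4 center of R1
    f0 = max_pool(max_pool(r0))     # R0 pooled twice
    f1 = max_pool(r1)               # R1 pooled once
    f2 = r2                         # R2 used as it is
    return [f0, f1, f2]             # three l/4 x l/4 maps to be concatenated


# Hypothetical 8 x 8 feature map with entries 0..63.
r0 = [[j * 8 + k for k in range(8)] for j in range(8)]
f0, f1, f2 = multi_crop_pool(r0)
print(f0)  # [[27, 31], [59, 63]]
print(f1)  # [[27, 29], [43, 45]]
print(f2)  # [[27, 28], [35, 36]]
```

The three outputs have the same size but summarize progressively smaller neighbourhoods around the center of the map, which is how the multi-scale, nodule-centric representation arises.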
According to the research carried out by Wang et al., the deep model implementation on lung cancer classification can capture additional information with respect to considering only lung nodules, the information of interest [12]. To avoid this extra information, they calculated 26 handcrafted features and fused them with the CNN extracted features for lung nodules detection [12]. Ground glass opacity (GGO) candidate region selection was made by Hirayama et al. using the fine-tuned CNN model instead of using a pre-trained CNN [59]. GGO candidate regions were calculated by the equation:
$$g(x, y, z) = \left(\frac{\Delta\rho_{x}}{\Delta x}\right)^{2} + \left(\frac{\Delta\rho_{y}}{\Delta y}\right)^{2} + \left(\frac{\Delta\rho_{z}}{\Delta z}\right)^{2}$$
where the differences along the x, y and z directions were determined by the equations:
$$\Delta\rho_{x} = |\rho(x+1, y, z) - \rho(x, y, z)| + |\rho(x, y, z) - \rho(x-1, y, z)|$$
$$\Delta\rho_{y} = |\rho(x, y+1, z) - \rho(x, y, z)| + |\rho(x, y, z) - \rho(x, y-1, z)|$$
$$\Delta\rho_{z} = |\rho(x, y, z+1) - \rho(x, y, z)| + |\rho(x, y, z) - \rho(x, y, z-1)|$$
The morphological opening was performed followed by the labeling techniques and the noise was reduced using thresholding methods for each label volume sphericity. This process generated the GGO candidates at the end. They used the support vector machine classifier and achieved 93% true positives and 210.52% false positives.

5.3. Brain Cancer

Brain cancer grows in an uncontrolled manner and may occur in any part of the brain. It has been quite challenging to detect which part of the brain contains cancer. Consequently, the biggest challenge for brain cancer is the segmentation of the tumor from the healthy tissue. Several challenges have been conducted by BRATS for this purpose. Here, we include some of the research work in which deep learning has been successfully applied to brain images. Two algorithms based on 2D CNN and 3D CNN were proposed by Gao et al., working on 2D sliced images and 3D images, respectively. The final result was obtained by fusing the output from these two models. This hybrid model outperformed the 2D and 3D scale-invariant feature transform (SIFT) and KAZE features. An automatic magnetic resonance image segmentation method based on CNN was discussed by Author1 [80]. They investigated intensity normalization and augmentation for brain tumor detection. By exploring the local and global contextual features in the CNN model, Havaei et al. used a fully connected layer in the final layer of the CNN to increase the speed of the system and detected brain cancer successfully. A fully convolutional neural network (FCN) and conditional random fields (CRFs) were used in [79] for brain cancer segmentation. First, image patches were used to train the FCN model and the CRF was trained. In the end, the system was fine-tuned using the image slices directly. Adjacent image patches were joined together into one pass using a dense training scheme in the CNN model [81]. After the 3D segmentation of the images by the CNN, the false positives were removed by using a 3D fully connected random field. Zhao et al. combined the multi-modality information from T1, T1C, T2, and fluid-attenuated inversion recovery (FLAIR) images and trained the proposed CNN from this information [82]. The algorithm proposed by them was a 3D voxel classification based on CNN. 
Different scaled 2D patches were extracted from 2D slices obtained by slicing the 3D dataset and these 2D patches were fed to multiple CNNs for the learning process.

5.4. Skin Cancer

“Melanoma” is curable if it is detected early. Differentiating between benign melanoma and malignant melanoma is really difficult, as they appear to be the same in the early stages. The main causes of melanoma and the risk factors are provided in Table 6 [139]. Many methods have been used to differentiate among them, including the most famous ABCD rule, seven-point checklist method, Menzies method, and pattern analysis.
Pomponiu et al. used 399 images captured with a standard camera to classify benign nevi from melanoma [65]. Firstly, preprocessing was performed along with data augmentation. High-level features of skin samples were extracted using pre-trained CNN and AlexNet. For lesion classification, K-nearest neighbor was used. They were able to achieve an accuracy of 93.62% with a specificity of 95.18% and a sensitivity of 92.1%. In total, 129,450 images were used by Esteva et al. for the pretraining of CNN, 2032 of which were from a skin lesion and the rest were taken from dermatoscopic devices [86]. There were two types of classification: (1) classification of benign nevi from malignant melanoma; and (2) classification of benign seborrheic keratosis from keratinocyte carcinomas. They used transfer learning for the classification. The AUC achieved was 0.96 for both melanomas and carcinomas. A pre-trained CNN along with pre-trained AlexNet and VGG-16 [140] were used for deep feature extraction and lesion classification [62]. Using 19,398 images for training a ResNet model, Han et al. proposed a classifier model for classifying 12 different types of skin diseases [13]. With the help of the Asan dataset, the achieved AUCs were 0.83 for squamous cell carcinoma, 0.82 for intraepithelial carcinoma, and 0.96 for melanoma and basal cell carcinoma. Despite the presence of pre-trained CNNs, some efforts were made to develop new CNN algorithms. A self-advised semi-supervised learning model was proposed by Masood et al. [141] for melanoma detection. Their proposed system consisted of a deep belief network and two self-advised support vector machines (SA-SVM) trained on three different datasets, with two kernels, a radial basis function (RBF) kernel and a polynomial kernel, respectively. The maximization of labeled data separation was done by a fine-tuning procedure with an exponential loss function. Deep features and hand-crafted features were combined in [67]. 
In the proposed system there were two SVM classifiers, one trained on local binary patterns (LBPs) and rotated speeded-up robust features (RSurf), while the other was trained on raw color images using the deep features extracted by the CNN model, and probability scores were generated. The final decision was based on the higher scores. Sabbaghi et al. used deep neural networks and mapped the images into a bag-of-features (BoF) space to enhance classification accuracy [167]. Demyanov et al. used stochastic gradient descent to train the CNN model and detect the typical network patterns and regular globules patterns in [68]. Yu et al. used residual blocks to replace the FCN’s convolutional layers and thereby formed a fully convolutional residual network (FCRN), which was further used for classification [142]. Nasr-Esfahani et al. detected melanoma by feeding preprocessed images to a CNN model [70], whereas a border detection based CNN system was proposed in [74] for skin cancer diagnosis. Author1 [143] and Author1 [42] used the ABCDE method for cancer detection along with the image processing tools of segmentation, histogram analysis, and contour tracing. Sujaya et al. studied lesion probability using a graphical user interface [144]. Palak et al. proposed a method using Fuzzy C Mean (FCM) for skin cancer analysis [145]. Colored unsupervised segmentation, k-means clustering and Gradient Vector Flow (GVF) were used in [146]. Sumithra et al. used Support Vector Machine (SVM) and K-nearest neighbors (K-NN) for lesion analysis [147]. In the following, the algorithms proposed in 2018 for skin cancer detection are briefly summarized. Jianfeng et al. used a backward-propagation method in an eight-layer CNN model [148]. Nine hundred images were used for classification testing. The achieved accuracy was 91.92% on the training set and 89.5% on the test set. A system based on content-based image retrieval (CBIR) was proposed by P. Tschandi et al. 
in comparison with CNN. Three datasets were used to train the neural network, including 888, 2750 and 16,691 images. The prediction was done using Softmax. Performance measures were area under the characteristic-curve (AUC) and mAP (multi-class-accuracy and mean-average-prediction). Dataset 1 achieved 0.842 AUC value and 0.830 mAP value; Dataset 2 achieved 0.806 AUC value and 0.810 mAP value; and Dataset 3 achieved 0.852 AUC value and 0.847 mAP value. This was further tested on eight classes and performed well on that as well with respect to the normal CNN model. The dataset provided by ISIC in the 2016 challenge was used in [149] for the classification of the lesion using the CNN model along with ANN. Firstly, image segmentation was performed using intensity thresholding and then they used CNN for feature extraction. ANN classifier used these features to perform the classification. According to them, they achieved 98.32% accuracy, which was better than the previous best 97%. T.C. Pham et al. proposed a method for improvement of classification using CNN along with the method of data-augmentation [150]. In addition, they tried to overcome the issue of data limitation and its influence on the classifier’s performance. The dataset used contained 600 images for testing and 6162 for training. AUC value achieved was 89.2%, ACC value was 89.0% and AP value was 73.9%. They studied the influence of image augmentation on three different classifiers and found that they performed differently and showed improved results compared to the standard methods used previously. Four cutaneous diseases were diagnosed by using deep learning methods [151]. They made a hierarchical structure to make a summary of classification and diagnosis criteria. They were able to achieve the accuracy of 87.25% with a probability error of 2.24%. A SkinNet convolutional neural network was used in [152] for the segmentation and detection of skin cancer. A modified version of U-net CNN was proposed. 
A comparison of their results was made with state-of-the-art techniques. The dataset used in this work was from the 2017 ISBI challenge. They achieved an average dice-coefficient of 85.10%, a Jaccard index of 76.67% and a sensitivity of 93.0%. H.A. Haenssle used 100 test images on the Google Inception v4 CNN model in two levels [153]. At first, only dermoscopic images were used and then clinical images were also used along with dermoscopic images. Comparison was done with 58 dermatologists internationally as well as with the five algorithms from the 2016 ISBI challenge. Level 1 achieved a sensitivity of 86.6% with a standard deviation of 9.3%, and a specificity of 71.3% with a standard deviation of 11.2%. Level 2, where clinical images were also added, improved the results to a sensitivity of 88.9% with a standard deviation of 9.6% and a specificity of 75.7% with a standard deviation of 11.7%. Y. Wang used DeepLab 3, where instead of convolutional neural networks, they proposed atrous convolution method for segmentation of input image [154]. They achieved a Jaccard index of 0.498. Further improvement is necessary to improve the performance of the system. Different methods were tested on the vector extracted by PH2 using a dataset of dermatoscopic images [155]. Overall, 92.5% accuracy and 85.71% precision were achieved using Logistic Regression and the VGG19 network. A multi-task convolutional neural network with the framework of joint detection and segmentation called faster region-based CNN was proposed in [156]. Region proposals and bounding boxes were generated by a region proposal network (RPN) for localization of the lesion. Softmax refines these bounding boxes, which were then further processed for cropping, and SkinNet was used for their segmentation. Using the dataset of the 2017 ISBI challenge, this method achieved a dice coefficient of 0.93, accuracy of 0.96, Jaccard index of 0.88 and sensitivity of 0.95. A. Rezvantalab et al. 
used multiple state-of-the-art architectures to classify eight types of skin diseases [157]. The dataset contained 10,135 images including melanoma, nevi, BCC, BK, AK, ITC, DF, vascular lesion and atypical nevi. The architectures used were ResNet 152, Inception ResNet v2, Inception v3, and DenseNet 201. The AUC of ResNet 152 for melanoma and BCC classification was 94.40% and for DenseNet its value was 99.30%, while the average AUC value by DenseNet 201 was 98.16%. In an article published in JAMA Dermatology in January 2019, Philipp et al. used combined CNNs for pigmented melanocyte lesions to achieve expert-level accuracy. Their dataset consisted of 13,724 images, 7895 of which were dermoscopic images and 5829 were closeup images. The data were collected from 2008 to 2017 of lesions at a skin cancer clinic. Testing of this algorithm was made in 2072 cases, while the comparison was made by 95 medical experts (Human raters) and 62 board-certified dermatologists. They observed that cCNNs performed well as compared to human raters and they achieved a high percentage of correct diagnosis. Walker et al. used deep learning to improve the diagnosis of skin lesion [158]. They conducted two levels of study, one of which called LABS (Laboratory retrospective study) and the second one non-interventional OBS (Observational study). For experimenting with LABS, 482 biopsies were used while 63 biopsies were used for OBS. A deep learning classifier was trained on 3954 training visual data. Then, on the output of this classifier, sonification was performed (which means the conversion of the signal into sound files). Then, there was a second ML classifier that operates this raw sound for LABS and image analysis for OBS. This algorithm provided AUC of 0.976; AUC for raw sound was 0.931, for FFT was 0.90 and for spectrogram was 0.988. OBS obtained AUC of 0.819 from raw sound and 0.808 from image analysis. 
They showed that the addition of the second stage to the DL algorithm, including sonification and heuristic analysis, can improve diagnostic accuracy. A new cause of cancer proposed in 2018 by Zioutas and Valachovic was counter-argued by Hector this year. According to Zioutas and Valachovic, there is a connection between melanoma cancer and dark matter. They proposed the idea that inner planetary motion significantly increases the dark matter density on Earth. They specifically considered the planetary motion of Mercury and Earth. According to them, this increase in the density of dark matter causes melanoma cancer, to which the black population is largely immune. To counter their argument, Hector used the same large amount of data and also presented the periodic statistics provided by Zioutas and Valachovic. Yoshimasa et al. used a convolutional neural network for the detection of esophageal cancer, SCC (squamous cell carcinoma), and adenocarcinoma [159]. The training images used in this study included 8428 images collected from 384 patients in Japan. Test data contained 1118 images where 47 patients had esophageal cancer while 50 did not. They were able to achieve an accuracy of 98% while the sensitivity achieved was 98%. Forty percent of each image was predicted positively while 95% was negatively predicted because of the shadows, which was the reason for misdiagnosis. Gomez-Martin et al. studied clinical, dermoscopic and confocal parameters for the detection of flat pink lesions on the legs of elderly patients [160]. They achieved an accuracy of 49.1%, a specificity of 73.4% and a sensitivity of 68.7% with the clinical diagnosis system. Dermoscopy provided 59.6% accuracy, and 85% and 67.6% sensitivity and specificity, respectively. Confocal microscopy achieved an accuracy of 85.1%, and 97.5% and 88.2% sensitivity and specificity, respectively. Parpti et al. 
used image enhancement to improve the image quality followed by the multi-scale retinex MSR along with Color restoration to detect skin cancer [161].

5.5. Prostate Cancer

Prostate cancer is the third highest cause of death among men and is frequently diagnosed in males [162]. To facilitate timely radiotherapy, its successful segmentation is very important. A combination of sparse patch matching and deep feature learning for prostate segmentation was proposed by Author1 [162]. To extract the feature representation from the MR images, they used the SSAE technique. Author1 [76] proposed a prostate cancer detection method using the SAE classifier (Table 7).
The collected features were improved by supervised fine-tuning of the SSAE model. The recognition map was refined by using an energy minimization procedure based on the neighbor pixel relationship. Tian et al. used a fully convolutional network for prostate segmentation [165]. Using 3D MR images, Yu et al. segmented the prostate by using volumetric convolutional networks [142]. They extended their FCN with residual blocks to enable volume-to-volume prediction. Maa et al. proposed a patch-based CNN method that uses the region of interest to detect prostate cancer [77]. The final segmentation result was obtained by using a multi-atlas label fusion. Lumen segmentation using the CNN model was done in [92], and maps were generated for the classification of prostate cancers. Patch-based CNN was used in [90].

6. Discussion

According to the reviewed studies, CNNs have the best performance of all architectures. LeNet, introduced in 1998, is a seven-level CNN architecture, and the winner of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2012 was AlexNet, which is also a very successful version of CNN. From 2012 to 2015, the winners of this competition were the CNN architectures AlexNet, ZFNet, GoogleNet/Inception V1, VGGNet and ResNet, which shows the success of CNN architectures in this field. Since these are all different architectures of the same CNN, as the model changes, the only evaluation measure is their percentage performance. As described in the competition, the necessary part was the reduction of top-5 errors, which AlexNet reduced from 26% to 15.3%, while ZFNet reduced it to 14.8%. This performance was beaten by GoogleNet/Inception V4, which reduced the error to 3.6%. The best performance was shown by ResNet, which beat human-level performance by reducing the error to 3.57%.
When implementing deep learning for cancer diagnosis, one of the major challenges is the limited availability of datasets. Every learning algorithm requires a large amount of training data to perform well. Efforts have been made to build medical image archives, which contain confidential information about many patients, through picture archiving and communication systems (PACS). Researchers also use images from cancer research organizations and hospitals to run their algorithms. One of the major breakthroughs in data collection was made by Esteva et al. [86], who assembled a dataset with 127,463 training images and 1942 test images. Many researchers, however, still use small datasets for their algorithms. In addition, most of the open-access datasets available online contain raw images, so researchers are required to obtain the ground truth themselves.
To deal with limited datasets, many researchers use data augmentation, which applies techniques such as rotation, cropping and filtering to increase the amount of available training data. Another way to avoid over-fitting is transfer learning, which has been used by many of the researchers discussed above in this review.
Low contrast and low SNR in medical images can degrade the performance of deep learning algorithms. Thus, another issue is how to improve the performance of a proposed model when the data have low contrast and poor SNR. Furthermore, studies on brain tumor segmentation raised a question: how can the performance of algorithms be maintained on data from multiple sources? When the algorithms were trained on multi-institutional data, their performance decreased gradually. Some of the datasets available online, along with their access links, are given in Table 8.
Another issue that was observed is imbalance in the training data distribution. If the positive cases greatly outnumber the negative cases, the system becomes biased and mostly predicts positive results; the same happens in reverse if the data contain more negative than positive cases. Thus, a balanced training set is very important, a point that was ignored by a few researchers.
One of the problems faced when implementing a convolutional neural network is the size of the target object inside the image. Because the target object varies in size, studies have proposed training the model with images at various scales so that the model learns this size variation. To capture multi-scale features directly from the image, Shen et al. replaced the standard pooling operation with multi-crop pooling [57].
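Two of the augmentation operations named above, rotation and cropping, can be sketched with NumPy as follows. This is only an illustration under stated assumptions: rotations are restricted to multiples of 90 degrees so that no interpolation is needed, the function name `augment` is hypothetical, and a small synthetic array stands in for a real scan.

```python
import numpy as np

def augment(image, rng, crop_size):
    """Return a randomly rotated and randomly cropped copy of a 2D image."""
    # Rotate by a random multiple of 90 degrees (pixel values stay exact).
    out = np.rot90(image, k=int(rng.integers(0, 4)))
    # Pick a random crop_size x crop_size window inside the rotated image.
    h, w = out.shape
    top = int(rng.integers(0, h - crop_size + 1))
    left = int(rng.integers(0, w - crop_size + 1))
    return out[top:top + crop_size, left:left + crop_size].copy()

rng = np.random.default_rng(0)                          # fixed seed for repeatability
image = np.arange(64, dtype=np.float32).reshape(8, 8)   # stand-in for a real scan
augmented = [augment(image, rng, crop_size=6) for _ in range(10)]
print(len(augmented), augmented[0].shape)
```

In practice, the same geometric transform has to be applied to the ground-truth mask when the task is segmentation, and filtering-based augmentation would be added as a further step.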
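When a perfectly balanced dataset cannot be collected, a common workaround is to weight each class inversely to its frequency during training. This is a general technique rather than a method taken from the reviewed studies. The sketch below uses the heuristic n_samples / (n_classes * class_count); the function name `balanced_class_weights` and the 90/10 split are illustrative assumptions.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Made-up imbalanced dataset: 90 positive cases (1) and 10 negative cases (0).
labels = [1] * 90 + [0] * 10
weights = balanced_class_weights(labels)
print(weights)
```

With these weights, each class contributes the same total weight to the loss, so the rare class is no longer dominated by the frequent one.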
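The idea behind multi-crop pooling can be illustrated on a single feature map: progressively smaller central crops are pooled fewer times, so that all views end at the same resolution and can be stacked. The sketch below is a simplified reading of that idea, not the implementation from [57]; the crop sizes, the pooling factors, and the function names are assumptions.

```python
import numpy as np

def max_pool2x2(x):
    """Non-overlapping 2x2 max pooling on a 2D array with even side lengths."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def center_crop(x, size):
    """Central size x size region of a 2D array."""
    h, w = x.shape
    top, left = (h - size) // 2, (w - size) // 2
    return x[top:top + size, left:left + size]

def multi_crop_pool(feature_map):
    """Stack three views of one square feature map at a common resolution."""
    n = feature_map.shape[0]  # assumed square, with side divisible by 4
    full = max_pool2x2(max_pool2x2(feature_map))          # whole map, pooled twice
    half = max_pool2x2(center_crop(feature_map, n // 2))  # central half, pooled once
    quarter = center_crop(feature_map, n // 4)            # central quarter, not pooled
    return np.stack([full, half, quarter])

fmap = np.arange(64, dtype=np.float32).reshape(8, 8)  # synthetic 8 x 8 feature map
pooled = multi_crop_pool(fmap)
print(pooled.shape)
```

The stacked output keeps coarse context from the whole map alongside fine detail from the center, which is where a nodule-centered patch carries most of its information.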

7. Summary and Conclusions

This review aims to provide beginners in this field with all the necessary information, starting from the main concepts of cancer diagnosis, evaluation criteria and medical methods. As this manuscript mainly focuses on deep learning for cancer diagnosis, it introduces the deep learning techniques that can be used for diagnostic purposes. Furthermore, for the convenience of readers, the practice code for each technique, which is readily available online, is collected in a table. One of the major issues that one can encounter in implementing any algorithm is dataset availability; therefore, access links to the available datasets are presented in this work.
Different CNN architectures are also described in this manuscript. The focus is on the implementation of deep learning algorithms for brain cancer, lung cancer, breast cancer, and skin cancer, and the performance measures for the different studies are provided. Fifteen of the reviewed studies used histopathology images with CNNs for the classification and detection of different types of cancers, as provided in Table 9. Six of these studies provided the source of data [49,50,52,54,85], while nine studies did not publish the source of data [90,91,92,93,95,101,102,104]. Two studies used mammograms with a CNN for detection and published their data source [51,53]. Eight studies [12,59,77,87,96,98,103,105] used CT slices, three of which used data from PROMISE [75] and LIDC [166]. Five studies used volumetric computed tomography [56,57,58,60,61]. Seven studies addressed brain cancer classification [76,79,80,81,89,94,94].
In the field of dermatology, Esteva et al. used a pre-trained CNN for skin lesion classification with an accuracy of 93.62% [86]. Sabbaghi et al. mapped images to a bag-of-features representation to increase classification accuracy [167]. Demyanov et al. detected globule patterns on the skin using a model trained with stochastic gradient descent [68]. Yu et al. formed an FCRN by replacing the convolutional layer of the FCN with a residual layer [142]. Nasr-Esfahani et al. performed melanoma detection by feeding preprocessed images to a CNN model [70].
Two of the methods reviewed in this study, by Chandrahasa et al. and Garbaj et al., used the ABCDE method for skin cancer detection, making use of image segmentation, histogram analysis and contour tracing [42,143]. Sujaya et al. used a graphical user interface to classify skin lesions [144], whereas fuzzy C-means was used by Palak et al. for skin cancer analysis [145]. Sumithra et al. used a support vector machine for skin lesion classification [147].
In total, 27 different algorithms provided by 27 different researchers [42,65,67,68,70,74,86,141,142,143,144,145,147,148,149,150,151,152,153,155,156,157,158,159,161,167] are reviewed for skin cancer diagnosis. As discussed above, these methods use different algorithmic schemes and different training datasets, which makes them difficult to compare; no single standard can be defined for comparing their results.

Author Contributions

The main concept of this research article was conceived by K.M., A.R., and F.F. The methodology and framework were developed by K.M. and A.R. Formal analysis and investigation were performed by K.M. and H.E. F.F. and A.R. supervised the research. K.M. and A.A. carried out resource collection and data curation. Original draft preparation, review and editing were done by K.M.


Funding

This research received no external funding.

Conflicts of Interest

The authors declared no conflict of interest.


Abbreviations

The following abbreviations are used in this manuscript:
ABCD: Asymmetry, Border, Color Variation and Diameter
ABCDE: Asymmetry, Border, Color Variation, Diameter and Expansion
ANN: Artificial neural networks
AI-ANN: Artificial intelligence-artificial neural networks
AK: Actinic keratosis
AUC: Area under the ROC curve
BCC: Basal cell carcinoma
BK: Benign keratosis
CAD: Computer-aided diagnosis
CAE: Convolutional autoencoders
CNNs: Convolutional neural networks
cCCNs: Combined convolutional networks
CT: Computed tomography
DANs: Deep autoencoders
FCNs: Fully convolutional networks
FCRN: Fully convolutional residual networks
FCM: Fuzzy c-means
FPS: Fourier power spectrum
FP: False positives
FN: False negatives
GVF: Gradient vector flow
GAN: Generative adversarial models
GPU: Graphics processing units
GLCM: Grey level co-occurrence matrix
GM: Generalization mean
IC: Intraepithelial carcinoma
ILSVRC: ImageNet large scale visual recognition competition
K-NN: K-nearest neighbor
LBPS: Local binary patterns
LABS: Laboratory retrospective study
LSTM: Long short-term memory
M-CNN: Multi-scale convolutional neural network
MIL-CNN: Multi-instance learning convolutional neural network
MRI: Magnetic resonance imaging
mAP: Multi class accuracy and mean average prediction
Neg.F: Negative features
OBS: Observational study
PACS: Picture archiving and communication system
Pos.F: Positive features
PCA: Principal component analysis
RBF: Radial basis function
RBM: Restricted Boltzmann machine
ReLU: Rectified linear unit
ROC: Receiver operating characteristic curve
ROI: Region of interest
Rsurf: Rotated speeded-up robust features
RNN: Recurrent neural networks
SAE: Stacked autoencoders
SA-SVM: Self-advised support vector machine
SCC: Squamous cell carcinoma
SNR: Signal-to-noise ratio
SVM: Support vector machine
TP: True positives
TN: True negatives
WPT: Wavelet packet transform


References

  1. Torre, L.A.; Bray, F.; Siegel, R.L.; Ferlay, J.; Lortet-Tieulent, J.; Jemal, A. Global cancer statistics, 2012. CA Cancer J. Clin. 2015, 65, 87–108.
  2. Siegel, R.L.; Miller, K.D.; Jemal, A. Cancer Statistics, 2016. CA Cancer J. Clin. 2016, 66, 7–30.
  3. Cancer Facts and Figures 2019, American Cancer Society. 2019. Available online: (accessed on 8 January 2019).
  4. Doi, K. Computer-aided diagnosis in medical imaging: Historical review, current status and future potential. Comput. Med. Imaging Graph. 2007, 31, 198–211.
  5. Te Brake, G.M.; Karssemeijer, N.; Hendriks, J.H. An automatic method to discriminate malignant masses from normal tissue in digital mammograms. Phys. Med. Biol. 2000, 45, 2843.
  6. Beller, M.; Stotzka, R.; Muller, T.; Gemmeke, H. An example-based system to support the segmentation of stellate lesions. In Bildverarbeitung für die Medizin 2005; Springer: Berlin/Heidelberg, Germany, 2005; pp. 475–479.
  7. Yin, F.F.; Giger, M.L.; Vyborny, C.J.; Schmidt, R.A. Computerized detection of masses in digital mammograms: Automated alignment of breast images and its effects on bilateral-subtraction technique. Phys. Med. 1994, 3, 445–452.
  8. Aerts, H.J.; Velazquez, E.R.; Leijenaar, R.T.; Parmar, C.; Grossmann, P.; Carvalho, S.; Bussink, J.; Monshouwer, R.; Haibe-Kains, B.; Rietveld, D. Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. Nat. Commun. 2014, 5, 4006.
  9. Eltonsy, N.H.; Tourassi, G.D.; Elmaghraby, A.S. A concentric morphology for the detection of masses in mammography. IEEE Trans. Med. Imaging 2007, 26, 880–889.
  10. Wei, J.; Sahiner, B.; Hadjiiski, L.M.; Chan, H.P.; Petrick, N.; Helvie, M.A.; Roubidoux, M.A.; Ge, J.; Zhou, C. Computer-aided detection of breast masses on full field digital mammograms. Med. Phys. 2005, 32, 2827–2838.
  11. Hawkins, S.H.; Korecki, J.N.; Balagurunathan, Y.; Gu, Y.; Kumar, V.; Basu, S.; Hall, L.O.; Goldgof, D.B.; Gatenby, R.A.; Gillies, R.J. Predicting outcomes of nonsmall cell lung cancer using CT image features. IEEE Access 2005, 2, 1418–1426.
  12. Balagurunathan, Y.; Gu, Y.; Wang, H.; Kumar, V.; Grove, O.; Hawkins, S.H.; Kim, J.; Goldgof, D.B.; Hall, L.O.; Gatenby, R.A. Reproducibility and prognosis of quantitative features extracted from CT images. Transl. Oncol. 2005, 7, 72–87.
  13. Han, F.; Wang, H.; Zhang, G.; Han, H.; Song, B.; Li, L.; Moore, W.; Lu, H.; Zhao, H.; Liang, Z. Texture feature analysis for computer-aided diagnosis on pulmonary nodules. J. Digit. Imaging 2015, 28, 99–115.
  14. Barata, C.; Marques, J.S.; Rozeira, J. A system for the detection of pigment network in dermoscopy images using directional filters. IEEE Trans. Biomed. Eng. 2012, 59, 2744–2754.
  15. Barata, C.; Marques, J.S.; Celebi, M.E. Improving dermoscopy image analysis using color constancy. In Proceedings of the 2014 IEEE International Conference on Image Processing (ICIP), Paris, France, 27–30 October 2014; pp. 3527–3531.
  16. Barata, C.; Ruela, M.; Mendonca, T.; Marques, J.S. A bag-of-features approach for the classification of melanomas in dermoscopy images: The role of color and texture descriptors. In Computer Vision Techniques for the Diagnosis of Skin Cancer; Springer: Berlin/Heidelberg, Germany, 2014; pp. 49–69.
  17. Sadeghi, M.; Lee, T.K.; McLean, D.; Lui, H.; Atkins, M.S. Detection and analysis of irregular streaks in dermoscopic images of skin lesions. IEEE Trans. Med. Imaging 2013, 32, 849–861.
  18. Zikic, D.; Glocker, B.; Konukoglu, E.; Criminisi, A.; Demiralp, C.; Shotton, J.; Thomas, O.M.; Das, T.; Jena, R.; Price, S.J. Decision forest for tissue-specific segmentation of high-grade gliomas in multi-channel MR. In International Conference on Medical Image Computing and Computer-Assisted Intervention; Springer: Berlin/Heidelberg, Germany, 2012; pp. 369–376.
  19. Meier, R.; Bauer, S.; Slotboom, J.; Wiest, R.; Reyes, M. A hybrid model for multi-modal brain tumor segmentation. In Proceedings of the MICCAI Challenge on Multimodal Brain Tumor Image Segmentation, NCI-MICCAI BRATS, Nagoya, Japan, 22 September 2013; pp. 31–37.
  20. Pinto, A.; Pereira, S.; Correia, H.; Oliveira, J.; Rasteiro, D.M.; Silva, C.A. Brain tumour segmentation based on extremely randomized forest with high-level features. In Proceedings of the 37th Annual International Conference on IEEE Engineering in Medicine and Biology Society (EMBC), Milan, Italy, 25–29 August 2015; pp. 3037–3040.
  21. Tustison, N.J.; Shrinidhi, K.; Wintermark, M.; Durst, C.R.; Kandel, B.M.; Gee, J.C.; Grossman, M.C.; Avants, B.B. Optimal symmetric multimodal templates and concatenated random forests for supervised brain tumor segmentation (simplified) with ANTsR. Neuroinformatics 2015, 13, 209–225.
  22. Bengio, Y.; Courville, A.; Vincent, P. Representation learning: A review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 1798–1828.
  23. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
  24. Xu, B.; Wang, N.; Chen, T.; Li, M. Empirical evaluation of rectified activations in convolutional network. arXiv 2015, arXiv:1505.00853.
  25. Messadi, M.; Bessaid, A.; Taleb-Ahmed, A. Extraction of specific parameters for skin tumour classification. J. Med. Eng. Technol. 2009, 33, 288–295.
  26. Reddy, B.V.; Reddy, P.B.; Kumar, P.S.; Reddy, S.S. Developing an approach to brain MRI image preprocessing for tumor detection. Int. J. Res. 2014, 1, 2348–6848.
  27. Zacharaki, E.I.; Wang, S.; Chawla, S.; Soo Yoo, D.; Wolf, R.; Melhem, E.R.; Davatzikos, C. Classification of brain tumor type and grade using MRI texture and shape in a machine learning scheme. Magn. Reson. Med. 2009, 2, 1609–1618.
  28. Miah, M.B.A.; Yousuf, M.A. Detection of lung cancer from CT image using image processing and neural network. In Proceedings of the 2015 International Conference on Electrical Engineering and Information Communication Technology (ICEEICT), Jahangirnagar University, Dhaka, Bangladesh, 21–23 May 2015.
  29. Ponraj, D.N.; Jenifer, M.E.; Poongodi, P.; Manoharan, J.S. A survey on the preprocessing techniques of mammogram for the detection of breast cancer. J. Emerg. Trends Comput. Inf. Sci. 2011, 2, 656–664.
  30. Zhang, Y.; Sankar, R.; Qian, W. Boundary delineation in transrectal ultrasound image for prostate cancer. Comput. Biol. Med. 2007, 37, 1591–1599.
  31. Emre Celebi, M.; Kingravi, H.A.; Iyatomi, H.; Alp Aslandogan, Y.; Stoecker, W.V.; Moss, R.H.; Malters, J.M.; Grichnik, J.M.; Marghoob, A.A.; Rabinovitz, H.S.; et al. Border detection in dermoscopy images using statistical region merging. Skin Res. Technol. 2008, 14, 347–353.
  32. Tong, N.; Lu, H.; Ruan, X.; Yang, M.H. Salient object detection via bootstrap learning. In Proceedings of the IEEE Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 1884–1892.
  33. Bozorgtabar, B.; Abedini, M.; Garnavi, R. Sparse coding based skin lesion segmentation using dynamic rule-based refinement. In International Workshop on Machine Learning in Medical Imaging; Springer: Cham, Switzerland, 2016; pp. 254–261.
  34. Li, X.; Li, Y.; Shen, C.; Dick, A.; Van Den Hengel, A. Contextual hypergraph modeling for salient object detection. In Proceedings of the 2013 IEEE International Conference on the Computer Vision (ICCV), Sydney, Australia, 1–8 December 2013; pp. 3328–3335.
  35. Yuan, X.; Situ, N.; Zouridakis, G. A narrow band graph partitioning method for skin lesion segmentation. Pattern Recogn. 2009, 42, 1017–1028.
  36. Sikorski, J. Identification of malignant melanoma by wavelet analysis. In Proceedings of the Student/Faculty Research Day, New York, NY, USA, 7 May 2004.
  37. Chiem, A.; Al-Jumaily, A.; Khushaba, N.R. A novel hybrid system for skin lesion detection. In Proceedings of the 3rd International Conference on Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP'07), Melbourne, Australia, 3–6 December 2007; pp. 567–572.
  38. Maglogiannis, I.; Zafiropoulos, E.; Kyranoudis, C. Intelligent segmentation and classification of pigmented skin lesions in dermatological images. In Advances in Artificial Intelligence; Springer: Berlin/Heidelberg, Germany, 2006; pp. 214–223.
  39. Tanaka, T.; Torii, S.; Kabuta, I.; Shimizu, K.; Tanaka, M.; Oka, H. Pattern classification of nevus with texture analysis. In Proceedings of the 26th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC'04), San Francisco, CA, USA, 1–5 September 2004; pp. 1459–1462.
  40. Zhou, H.; Chen, M.; Rehg, J.M. Dermoscopic interest point detector and descriptor. In Proceedings of the 6th IEEE International Symposium on Biomedical Imaging: From Nano to Macro (ISBI'09), Boston, MA, USA, 28 June–1 July 2009; pp. 1318–1321.
  41. Lee, C.; Landgrebe, D.A. Feature extraction based on decision boundaries. IEEE Trans. Pattern Anal. Mach. Intell. 1993, 15, 388–400.
  42. Garbaj, M.; Deshpande, A.S. Detection and Analysis of Skin Cancer in Skin Lesions by using Segmentation. IJARCCE 2015. Available online: (accessed on 1 August 2019).
  43. Johr, R.H. Dermoscopy: Alternative Melanocytic Algorithms—The ABCD Rule of Dermatoscopy, Menzies Scoring Method, and 7-Point Checklist; Elsevier: Amsterdam, The Netherlands, 2002.
  44. Lee, C.Y.; Gallagher, P.W.; Tu, Z. Generalizing pooling functions in convolutional neural networks: Mixed, gated, and tree. In Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS), Cadiz, Spain, 9–11 May 2016.
  45. Nwankpa, C.; Ijomah, W.; Gachagan, A.; Marshall, S. Activation functions: Comparison of trends in practice and research for deep learning. arXiv 2018, arXiv:1811.03378.
  46. Nair, V.; Hinton, G.E. Rectified linear units improve restricted Boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (ICML), Haifa, Israel, 21–24 June 2010; pp. 807–814.
  47. Kingma, D.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
  48. Albarqouni, S.; Baur, C.; Achilles, F.; Belagiannis, V.; Demirci, S.; Navab, N. AggNet: Deep learning from crowds for mitosis detection in breast cancer histology images. IEEE Trans. Med. Imaging 2016, 35, 1313–1321.
  49. Spanhol, F.A.; Oliveira, L.S.; Petitjean, C.; Heutte, L. Breast cancer histopathological image classification using convolutional neural networks. In Proceedings of the 2016 International Joint Conference on Neural Networks (IJCNN), Vancouver, BC, Canada, 24–29 July 2016; pp. 2560–2567.
  50. Wichakam, I.; Vateekul, P. Combining deep convolutional networks and SVMs for mass detection on digital mammograms. In Proceedings of the 8th International Conference on Knowledge and Smart Technology (KST), Bangkok, Thailand, 3–6 February 2016; pp. 239–244.
  51. Ertosun, M.G.; Rubin, D.L. Probabilistic visual search for masses within mammography images using deep learning. In Proceedings of the 2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Washington, DC, USA, 9–12 November 2015; pp. 1310–1315.
  52. Albayrak, A.; Bilgin, G. Mitosis detection using convolutional neural network based features. In Proceedings of the IEEE Seventeenth International Symposium on Computational Intelligence and Informatics (CINTI), Budapest, Hungary, 17–19 November 2016; pp. 335–340.
  53. Swiderski, B.; Kurek, J.; Osowski, S.; Kruk, M.; Barhoumi, W. Deep learning and non-negative matrix factorization in recognition of mammograms. In Proceedings of the Eighth International Conference on Graphic and Image Processing, International Society of Optics and Photonics, Tokyo, Japan, 8 February 2017; Volume 10225, p. 102250B.
  54. Suzuki, S.; Zhang, X.; Homma, N.; Ichiji, K.; Sugita, N.; Kawasumi, Y.; Ishibashi, T.; Yoshizawa, M. Mass detection using deep convolutional neural networks for mammographic computer-aided diagnosis. In Proceedings of the 55th Annual Conference of the Society of Instrument and Control Engineers of Japan (SICE), Tsukuba, Japan, 20–23 September 2016; pp. 1382–1386.
  55. Wang, C.; Elazab, A.; Wu, J.; Hu, Q. Lung nodule classification using deep feature fusion in chest radiography. Comput. Med. Imaging Graph. 2017, 57, 10–18.
  56. Dou, Q.; Chen, H.; Yu, L.; Qin, J.; Heng, P.-A. Multilevel contextual 3-D CNNs for false positive reduction in pulmonary nodule detection. IEEE Trans. Biomed. Eng. 2017, 64, 1558–1567.
  57. Shen, W.; Zhou, M.; Yang, F.; Yu, D.; Dong, D.; Yang, C.; Tian, J. Multicrop convolutional neural networks for lung nodule malignancy suspiciousness classification. Pattern Recognit. 2017, 61, 663–673.
  58. Hua, K.L.; Hsu, C.H.; Hidayati, S.C.; Cheng, W.H.; Chen, Y.J. Computer-aided classification of lung nodules on computed tomography images via deep learning technique. Onco Targets Ther. 2015, 57, 2015–2022.
  59. Hiryama, K.; Tan, J.K.; Kim, H. Extraction of GGO candidate regions from the LIDC database using deep learning. In Proceedings of the Sixteenth International Conference on Control, Automation and Systems (ICCAS), Gyeongju, Korea, 16–19 October 2016; pp. 724–727.
  60. Setio, A.A.A.; Ciompi, F.; Litjens, G.; Gerke, P.; Jacobs, C.; Van Riel, S.J.; Wille, M.M.W.; Naqibullah, M.; Sanchez, C.I.; van Ginneken, B. Pulmonary nodule detection in CT images: False positive reduction using multi-view convolutional networks. IEEE Trans. Med. Imaging 2016, 35, 1160–1169.
  61. Hussein, S.; Gillies, R.; Cao, K.; Song, Q.; Bagci, U. TumorNet: Lung Nodule Characterization Using Multi-View Convolution Neural Network with Gaussian Process. In Proceedings of the IEEE 14th International Symposium on Biomedical Imaging (ISBI), Melbourne, Australia, 18–21 April 2017; pp. 1007–1010.
  62. Mahbod, A.; Ecker, R.; Ellinger, I. Skin lesion classification using hybrid deep neural networks. arXiv 2017, arXiv:1702.08434.
  63. DermQuest. Online Medical Resource. Available online: (accessed on 10 December 2018).
  64. Dey, T.K. Curve and Surface Reconstruction: Algorithms with Mathematical Analysis; Cambridge Monographs on Applied and Computational Mathematics: Cambridge, UK, 2006.
  65. Pomponiu, V.; Nejati, H.; Cheung, N.-M. Deepmole: Deep neural networks for skin mole lesion classification. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 25–28 September 2016; pp. 2623–2627.
  66. Gutman, D.; Codella, N.C.; Celebi, E.; Helba, B.; Marchetti, M.; Mishra, N.; Halpern, A. Skin lesion analysis toward melanoma detection: A challenge at the international symposium on biomedical imaging (ISIC). arXiv 2016, arXiv:1605.01397.
  67. Majtner, T.; Yildirim-Yayilgan, S.; Hardeberg, J.Y. Combining deep learning and hand-crafted features for the skin lesion classification. In Proceedings of the Sixth International Conference on Image Processing Theory Tools and Applications (IPTA), Oulu, Finland, 12–15 December 2016; pp. 1–6.
  68. Demyanov, S.; Chakravorty, R.; Abedini, M.; Halpern, A.; Garnavi, R. Classification of dermoscopy patterns using deep convolutional neural networks. In Proceedings of the 13th International Symposium on Biomedical Imaging (ISBI), Prague, Czech Republic, 13–16 April 2016; pp. 364–368.
  69. Giotis, I.; Molders, N.; Land, S.; Biehl, M.; Jonkman, M.F.; Petkov, N. MED-NODE: A computer-assisted melanoma diagnosis system using non-dermoscopic images. Expert Syst. Appl. 2015, 42, 6578–6585.
  70. Nasr-Esfahani, E.; Samavi, S.; Karimi, N.; Soroushmehr, S.M.R.; Jafari, M.H.; Ward, K.; Najarian, K. Melanoma detection by analysis of clinical images using convolutional neural network. In Proceedings of the IEEE 38th Annual International Conference of Engineering in Medicine and Biology Society (EMBC), Orlando, FL, USA, 16–20 August 2016; pp. 1373–1376.
  71. An Atlas of Clinical Dermatology. 2014. Available online: (accessed on 14 September 2018).
  72. Online Medical Resources. 2014. Available online: (accessed on 20 November 2018).
  73. Interactive Dermatology Atlas. 2014. Available online: (accessed on 22 December 2018).
  74. Sabouri, P.; GholamHosseini, H. Lesion border detection using deep learning. In Proceedings of the 2016 IEEE Congress on Evolutionary Computation (CEC), Vancouver, BC, Canada, 24–29 July 2016; pp. 1416–1421.
  75. Litjens, G.; Toth, R.; van de Ven, W.; Hoeks, C.; Kerkstra, S.; van Ginneken, B.; Vincent, G.; Guillard, G.; Birbeck, N.; Zhang, J. Evaluation of prostate segmentation algorithms for MRI: The PROMISE12 challenge. Med. Image Anal. 2014, 18, 359–373.
  76. Yan, K.; Li, C.; Wang, X.; Li, A.; Yuan, Y.; Feng, D.; Khadra, M.; Kim, J. Automatic prostate segmentation on MR images with deep network and graph model. In Proceedings of the 38th Annual International Conference of the Engineering in Medicine and Biology Society (EMBC), Orlando, FL, USA, 16–18 August 2016; pp. 635–638.
  77. Maa, I.; Guoa, R.; Zhanga, G.; Tadea, F.; Schustera, D.M.; Niehc, P.; Masterc, V.; Fei, B. Automatic segmentation of the prostate on CT images using deep convolutional neural network. In Proceedings of the SPIE Medical Imaging, International Society for Optics and Photonics, Orlando, FL, USA, 11–16 February 2017; Volume 10133, p. 101332O.
  78. Kistler, M.; Bonaretti, S.; Pfahrer, M.; Niklaus, R.; Buchler, P. The virtual skeleton database: An open access repository for biomedical research and collaboration. J. Med. Internet Res. 2013, 15, e245.
  79. Zhao, L.; Jia, K. Deep feature learning with discrimination mechanism for brain tumor segmentation and diagnosis. In Proceedings of the International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP), Adelaide, SA, Australia, 23–25 September 2015; pp. 306–309.
  80. Pereira, S.; Pinto, A.; Alves, V.; Silva, C.A. Brain tumor segmentation using convolutional neural networks in MRI images. IEEE Trans. Med. Imaging 2016, 35, 1240–1251.
  81. Kamnitsas, K.; Ledig, C.; Newcombe, V.F.; Simpson, J.P.; Kane, A.D.; Menon, D.K.; Rueckert, D.; Glocker, B. Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation. Med. Image Anal. 2017, 36, 61–78.
  82. Zhao, X.; Wu, Y.; Song, G.; Li, Z.; Zhang, Y.; Fan, Y. A deep learning model integrating FCNNs and CRFs for brain tumor segmentation. Med. Image Anal. 2018, 43, 98–111.
  83. Sirinukunwattana, K.; Snead, D.R.; Rajpoot, N.M. A stochastic polygons model for glandular structures in colon histology images. IEEE Trans. Med. Imaging 2015, 34, 2366–2378.
  84. Sirinukunwattana, K.; Pluim, J.P.; Chen, H.; Qi, X.; Heng, P.-A.; Guo, Y.B.; Wang, L.Y.; Matuszewski, B.J.; Bruni, E.; Sanchez, U. Gland segmentation in colon histology images: The glas challenge contest. Med. Image Anal. 2017, 35, 489–502.
  85. Chen, H.; Qi, X.; Yu, L.; Heng, P.-A. DCAN: Deep contour-aware networks for accurate gland segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 2487–2496.
  86. Esteva, A.; Kuprel, B.; Novoa, R.A.; Ko, J.; Swetter, S.M.; Blau, H.M.; Thrun, S. Dermatologist-level classification of skin cancer with deep neural networks. Nature 2017, 542, 115–118.
  87. Paul, R.; Hawkins, S.H.; Hall, L.O.; Goldgof, D.B.; Gillies, R.J. Combining deep neural network and traditional image features to improve survival prediction accuracy for lung cancer patients from diagnostic CT. In Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics (SMC), Budapest, Hungary, 9–12 October 2016; pp. 2570–2575.
  88. Kim, D.H.; Kim, S.T.; Ro, Y.M. Latent feature representation with 3-D multi-view deep convolutional neural network for bilateral analysis in digital breast tomosynthesis. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, China, 20–25 March 2016; pp. 927–931. [Google Scholar]
  89. Liu, R.; Hall, L.O.; Goldgof, D.B.; Zhou, M.; Gatenby, R.A.; Ahmed, K.B. Exploring deep features from brain tumor magnetic resonance images via transfer learning. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), Vancouver, BC, Canada, 24–29 July 2016; pp. 235–242. [Google Scholar]
  90. Kallen, H.; Molin, J.; Heyden, A.; Lundstrom, C.; Astrom, K. Towards grading gleason score using generically trained deep convolutional neural networks. In Proceedings of the 2016 IEEE 13th International Symposium on Biomedical Imaging (ISBI), Prague, Czech Republic, 13–16 April 2016; pp. 1163–1167. [Google Scholar]
  91. Gummeson, A.; Arvdsson, I.; Ohlsson, M.; Overgaard, N.C.; Krzyzanowska, A.; Heyden, A.; Bjartell, A.; Astrom, K. Automatic Gleason grading of H&E stained microscopic prostate images using deep convolutional neural networks. In Proceedings of the SPIE Medical Imaging, International Society of Optics and Photonics, Orlando, FL, USA, 11–16 February 2017; Volume 10140, p. 101400S. [Google Scholar]
  92. Kwak, J.T.; Hewitt, S.M. Lumen-based detection pf prostate cancer via convolutional neural networks. In Proceedings of the SPIE Medical Imaging, International Society of Optics and Photonics, Orlando, FL, USA, 11–16 February 2017; Volume 10140, p. 1014008. [Google Scholar]
  93. Zhu, X.; Yao, J.; Huang, J. Deep convolutional neural network for survival analysis with pathological images. In Proceedings of the 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Shenzhen, China, 15–18 December 2016; pp. 544–547. [Google Scholar]
  94. Ahmed, K.B.; Hall, L.O.; Goldgof, D.B.; Liub, R.; Gatenby, R.A. Fine-tuning convolutional deep features for MRI based brain tumor classification. In SPIE Proceedings: Medical Imaging 2017: Computer-Aided Diagnosis; International Society for Optics and Photonics: Bellingham, WA, USA, 2017; Volume 10134, p. 101342E. [Google Scholar]
  95. Song, Y.; Zhang, L.; Chen, S.; Ni, D.; Lei, B.; Wang, T. Accurate segmentation of cervical cytoplasm and nuclei based on multi-scale convolutional network and graph partitioning. IEEE Trans. Biomed. Eng. 2015, 62, 2421–2433. [Google Scholar] [CrossRef] [PubMed]
  96. Cha, K.H.; Hadjiiski, L.; Samala, R.K.; Chan, H.P.; Caoili, E.M.; Cohan, R.H. Urinary bladder segmentation in CT urography using deep-learning convolutional neural network and level sets. Med. Phys. 2016, 43, 1882–1896. [Google Scholar] [CrossRef] [PubMed]
  97. Gibson, E.; Robu, M.R.; Thompson, S.; Edwards, P.E.; Schneider, C.; Gurusamy, K.; Davidson, B.; Hawkes, D.J.; Barratt, D.C.; et al. Deep residual networks for automatic segmentation of laparoscopic videos of the liver. In Medical Imaging 2017: Image-Guided Procedures, Robotic Interventions, and Modeling; International Society for Optics and Photonics: Bellingham, WA, USA, 2017; Volume 10135, p. 101351M. [Google Scholar]
  98. Gordon, M.; Hadjiiski, L.; Cha, K.; Chan, H.-P.; Samala, R.; Cohan, R.H.; Caoili, E.M. Segmentation of inner and outer bladder wall using deep-learning convolutional neural networks in CT urography. In Medical Imaging 2017: Computer-Aided Diagnosis; International Society for Optics and Photonics: Bellingham, WA, USA, 2017; Volume 10134, p. 1013402. [Google Scholar]
  99. Xu, T.; Zhang, H.; Huang, X.; Zhang, S.; Metaxas, D.N. Multimodal deep learning for cervical dysplasia diagnosis. In International Conference on Medical Image Computing and Computer-Assisted Intervention; Springer: Cham, Switzerland, 2016; pp. 115–123. [Google Scholar]
  100. BenTaieb, A.; Kawahara, J.; Hamarneh, G. Multi-loss convolutional networks for gland analysis in microscopy. In Proceedings of the IEEE Thirteenth International Symposium on Biomedical Imaging (ISBI), Prague, Czech Republic, 13–16 April 2016; pp. 642–645. [Google Scholar]
  101. Xing, F.; Xie, Y.; Yang, L. An automatic learning-based framework for robust nucleus segmentation. IEEE Trans. Med. Imaging 2016, 35, 550–566. [Google Scholar] [CrossRef] [PubMed]
  102. Mao, Y.; Yin, Z.; Schober, J. A deep convolutional neural network trained on representative samples for circulating tumor cell detection. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Placid, NY, USA, 7–10 March 2016; pp. 1–6. [Google Scholar]
  103. Li, W.; Jia, F.; Hu, Q. Automatic segmentation of liver tumor in CT images with deep convolutional neural networks. J. Comput. Commun. 2015, 3, 146. [Google Scholar] [CrossRef]
  104. Song, Y.; Cheng, J.-Z.; Ni, D.; Chen, S.; Lei, B.; Wang, T. Segmenting overlapping cervical cell in pap smear images. In Proceedings of the IEEE Thirteenth International Symposium on Biomedical Imaging (ISBI), Prague, Czech Republic, 13–16 April 2016; pp. 1159–1162. [Google Scholar]
  105. Cha, K.H.; Hadjiiski, L.M.; Chan, H.-P.; Samala, R.K.; Cohan, R.H.; Caoili, E.M.; Paramagul, C.; Alva, A.; Weizer, A.Z. Bladder cancer treatment response assessment using deep learning in CT with transfer learning. In Medical Imaging 2017: Computer-Aided Diagnosis; International Society for Optics and Photonics: Bellingham, WA, USA, 2017; Volume 10134, p. 1013404. [Google Scholar]
  106. Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional network for semantic segmentation. arXiv 2015, arXiv:1411.4038v2. [Google Scholar]
  107. Jain, V.; Seung, S. Natural image denoising with convolutional networks. Adv. Neural Inf. Process. Syst. 2009, 21, 769–776. [Google Scholar]
  108. Dong, C.; Loy, C.C.; He, K.; Tang, X. Learning a deep convolutional network for image super-resolution. In Proceedings of the European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; pp. 769–776. [Google Scholar]
  109. Noh, H.; Hong, S.; Han, B. Learning deconvolution network for semantic segmentation. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 13–16 December 2015; pp. 1520–1528. [Google Scholar]
  110. Mahmood, F.; Borders, D.; Chen, R.; McKay, G.N.; Salimian, K.J.; Baras, A.; Durr, N.J. Deep adversarial training for multi-organ nuclei segmentation in histopathology images. IEEE Trans. Med. Imaging 2019. [Google Scholar] [CrossRef] [PubMed]
  111. Baur, C.; Albarqouni, S.; Navab, N. MelanoGANs: High resolution skin lesion synthesis with GANs. arXiv 2018, arXiv:1804.04338. [Google Scholar]
  112. Zhu, J.-Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired image-to-image translation using cycle-consistent adversarial networks. arXiv 2017, arXiv:1703.10593. [Google Scholar]
  113. Wang, T.C.; Liu, M.Y.; Zhu, J.Y.; Tao, A.; Kautz, J.; Catanzaro, B. High-resolution image synthesis and semantic manipulation with conditional gans. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 19–21 June 2018; Volume 1, p. 5. [Google Scholar]
  114. Quan, T.M.; Nguyen-Duc, T.; Jeong, W.K. Compressed sensing mri reconstruction with cyclic loss in generative adversarial networks. arXiv 2017, arXiv:1709.00753. [Google Scholar] [CrossRef] [PubMed]
  115. Wang, Y.; Yu, B.; Wang, L.; Zu, C.; Lalush, D.S.; Lin, W.; Wu, X.; Zhou, J.; Shen, D.; Zhou, L. 3D conditional generative adversarial networks for high-quality pet image estimation at low dose. NeuroImage 2018, 174, 550–562. [Google Scholar] [CrossRef]
  116. Mardani, M.; Gong, E.; Cheng, J.Y.; Vasanawala, S.; Zaharchuk, G.; Alley, M.; Thakur, N.; Han, S.; Dally, W.; Pauly, J.M.; et al. Deep generative adversarial networks for compressed sensing automates mri. arXiv 2017, arXiv:1706.00051. [Google Scholar]
  117. Dou, Q.; Ouyang, C.; Chen, C.; Chen, H.; Heng, P.-A. Unsupervised cross-modality domain adaptation of convnets for biomedical image segmentations with adversarial loss. arXiv 2018, arXiv:1804.10916. [Google Scholar]
  118. Li, Z.; Wang, Y.; Yu, J. Brain tumor segmentation using an adversarial network. In International MICCAI Brainlesion Workshop; Springer: Cham, Switzerland, 2017; pp. 123–132. [Google Scholar]
  119. Rezaei, M.; Yang, H.; Meinel, C. Whole heart and great vessel segmentation with context-aware of generative adversarial networks. In Bildverarbeitung für die Medizin; Springer Vieweg: Berlin/Heidelberg, Germany, 2018; pp. 353–358. [Google Scholar]
  120. Zhang, Y.; Miao, S.; Mansi, T.; Liao, R. Task driven generative modeling for unsupervised domain adaptation: Application to X-ray image segmentation. arXiv 2018, arXiv:1806.07201. [Google Scholar]
  121. Chen, C.; Dou, Q.; Chen, H.; Heng, P.-A. Semantic-aware generative adversarial nets for unsupervised domain adaptation in chest X-ray segmentation. arXiv 2018, arXiv:1806.00600. [Google Scholar]
  122. Alex, V.; Mohammed Safwan, K.P.; Chennamsetty, S.S.; Krishnamurthi, G. Generative adversarial networks for brain lesion detection. In Medical Imaging 2017: Image Processing; International Society for Optics and Photonics: Bellingham, WA, USA, 2017; Volume 10133, p. 101330G. [Google Scholar]
  123. Schlegl, T.; Seebock, P.; Waldstein, S.M.; Schmidt-Erfurth, U.; Langs, G. Unsupervised anomaly detection with generative adversarial networks to guide marker discovery. In International Conference on Information Processing in Medical Imaging; Springer: Cham, Switzerland, 2017; pp. 146–157. [Google Scholar]
  124. Mondal, A.K.; Dolz, J.; Desrosiers, C. Few-shot 3D multi-modal medical image segmentation using generative adversarial learning. arXiv 2018, arXiv:1810.12241. [Google Scholar]
  125. Singh, V.K.; Romani, S.; Rashwan, H.A.; Akram, F.; Pandey, N.; Sarker, M.M.K.; Abdulwahab, S.; Torrents-Barrena, J.; Saleh, A.; Arquez, M.; et al. Conditional generative adversarial and convolutional networks for X-ray breast mass segmentation and shape classification. arXiv 2018, arXiv:1805.10207v2. [Google Scholar]
  126. Graves, A.; Mohamed, A.R.; Hinton, G. Speech recognition with deep recurrent neural networks. In Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP, Vancouver, BC, Canada, 26–31 May 2013. [Google Scholar] [CrossRef]
  127. Zaremba, W.; Sutskever, I.; Vinyals, O. Recurrent neural network regularization. arXiv 2014, arXiv:1409.2329. [Google Scholar]
  128. Bishop, C.M. Neural Networks for Pattern Recognition; Oxford University Press: Oxford, UK, 1995. [Google Scholar]
  129. Ng, A. Sparse autoencoder. CS294A Lect. Notes 2011, 72, 1–19. [Google Scholar]
  130. Liu, G.; Bao, H.; Han, B. A stacked autoencoder-based deep neural network for achieving gearbox fault diagnosis. Math. Probl. Eng. 2018. [Google Scholar] [CrossRef]
  131. Hou, L.; Nguyen, V.; Kanevsky, A.B.; Samaras, D.; Kurc, T.M.; Zhao, T.; Gupta, R.R.; Gao, Y.; Chen, W.; Foran, D.; et al. Sparse autoencoder for unsupervised nucleus detection and representation in histopathology images. Pattern Recognit. 2019, 86, 188–200. [Google Scholar] [CrossRef]
  132. Guo, X.; Liu, X.; Zhu, E.; Yin, J. Deep clustering with convolutional autoencoders. In International Conference on Neural Information Processing; Springer: Cham, Switzerland, 2017; pp. 373–382. [Google Scholar]
  133. Zhang, Y. A Better Autoencoder for Image: Convolutional Autoencoder. ICONIP17-DCEC. Available online: Tom.Gedeon/conf/ABCs2018/paper/ABCs2018paper58.pdf (accessed on 23 March 2017).
  134. Hinton, G.E. Deep belief networks. Scholarpedia 2009, 4, 5947. [Google Scholar] [CrossRef]
  135. Hinton, G.E.; Osindero, S.; Teh, Y.-W. A fast learning algorithm for deep belief nets. Neural Comput. 2006, 18, 1527–1554. [Google Scholar] [CrossRef]
  136. Kallenberg, M.; Petersen, K.; Nielsen, M.; Ng, A.Y.; Diao, P.; Igel, C.; Vachon, C.M.; Holland, K.; Winkel, R.R.; Karssemeijer, N.; et al. Unsupervised deep learning applied to breast density segmentation and mammographic risk scoring. IEEE Trans. Med. Imaging 2016, 35, 1322–1331. [Google Scholar] [CrossRef]
  137. Dhungel, N.; Carneiro, G.; Bradley, A.P. Deep structured learning for mass segmentation from mammograms. In Proceedings of the 2015 IEEE International Conference on Image Processing (ICIP), Quebec City, QC, Canada, 27–30 September 2015; pp. 2950–2954. [Google Scholar] [CrossRef]
  138. Dhungel, N.; Carneiro, G.; Bradley, A.P. Automated Mass Detection in Mammograms Using Cascaded Deep Learning and Random Forests. In Proceedings of the 2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA), Adelaide, Australia, 23–25 November 2015. [Google Scholar]
  139. Taqdir, B. Cancer detection techniques—A review. Int. Res. J. Eng. Technol. (IRJET) 2018, 4, 1824–1840. [Google Scholar]
  140. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
  141. Masood, A.; Al-Jumaily, A.; Anam, K. Self-supervised learning model for skin cancer diagnosis. In Proceedings of the Seventh International IEEE/EMBS Conference on Neural Engineering (NER), Montpellier, France, 22–24 April 2015; pp. 1012–1015. [Google Scholar]
  142. Yu, L.; Chen, H.; Dou, Q.; Qin, J.; Heng, P.-A. Automated melanoma recognition in dermoscopy images via very deep residual networks. IEEE Trans. Med. Imaging 2016, 36, 994–1004. [Google Scholar] [CrossRef] [PubMed]
  143. Chandrahasa, M.; Vadigeri, V.; Salecha, D. Detection of skin cancer using image processing techniques. Int. J. Mod. Trends Eng. Res. (IJMTER) 2016, 5, 111–114. [Google Scholar]
  144. Saha, S.; Gupta, R. An automated skin lesion diagnosis by using image processing techniques. Int. J. Recent Innov. Trends Comput. Commun. 2015, 5, 1081–1085. [Google Scholar]
  145. Mehta, P.; Shah, B. Review on techniques and steps of computer aided skin cancer diagnosis. Procedia Comput. Sci. 2016, 85, 309–316. [Google Scholar] [CrossRef]
  146. Bhuiyan, M.A.H.; Azad, I.; Uddin, M.K. Image processing for skin cancer features extraction. Int. J. Sci. Eng. Res. 2013, 4, 1–6. [Google Scholar]
  147. Sumithra, R.; Suhil, M.; Guru, D.S. Segmentation and classification of skin lesions for disease diagnosis. Procedia Comput. Sci. 2015, 45, 76–85. [Google Scholar] [CrossRef]
  148. He, J.; Dong, Q.; Yi, S. Prediction of skin cancer based on convolutional neural network. In Recent Developments in Mechatronics and Intelligent Robotics; Springer: Cham, Switzerland, 2018; pp. 1223–1229. [Google Scholar]
  149. Rehman, M.; Khan, S.H.; Rizvi, S.D.; Abbas, Z.; Zafar, A. Classification of skin lesion by interference of segmentation and convolution neural network. In Proceedings of the 2nd International Conference on Engineering Innovation (ICEI), Bangkok, Thailand, 5–6 July 2018; pp. 81–85. [Google Scholar]
  150. Pham, T.C.; Luong, C.M.; Visani, M.; Hoang, V.D. Deep CNN and data augmentation for skin lesion classification: Intelligent information and database systems. In Intelligent Information and Database Systems; Springer: Cham, Switzerland, 2018; pp. 573–582. [Google Scholar]
  151. Zhang, X.; Wang, S.; Liu, J.; Tao, C. Towards improving diagnosis of skin diseases by combining deep neural network and human knowledge. BMC Med. Inform. Decis. Mak. 2018, 18, 59. [Google Scholar] [CrossRef]
  152. Vesal, S.; Ravikumar, N.; Maier, A. SkinNet: A deep learning framework for skin lesion segmentation. arXiv 2018, arXiv:1806.09522v1. [Google Scholar]
  153. Haenssle, H.A.; Fink, C.; Schneiderbauer, R.; Toberer, F.; Buhl, T.; Blum, A.; Kalloo, A.; Hassen, A.B.H.; Thomas, L.; Enk, A.; et al. Man against machine: Diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists. Ann. Oncol. 2018, 29, 1836–1842. [Google Scholar] [CrossRef]
  154. Wang, Y.; Sun, S.; Yu, J.; Yu, D. Skin lesion segmentation using atrous convolution via DeepLab v3. arXiv 2018, arXiv:1807.08891. [Google Scholar]
  155. Maia, L.B.; Lima, A.; Pereira, R.M.P.; Junior, G.B.; de Almeida, J.D.S.; de Paiva, A.C. Evaluation of melanoma diagnosis using deep features. In Proceedings of the 25th International Conference on Systems, Signals and Image Processing (IWSSIP), Maribor, Slovenia, 20–22 June 2018. [Google Scholar]
  156. Vesal, S.; Patil, S.M.; Ravikumar, N.; Maier, A.K. A multi-task framework for skin lesion detection and segmentation. arXiv 2018, arXiv:1808.01676. [Google Scholar]
  157. Rezvantalab, A.; Safigholi, H.; Karimijeshni, S. Dermatologist level dermoscopy skin cancer classification using different deep learning convolutional neural networks algorithms. arXiv 2018, arXiv:1810.10348. [Google Scholar]
  158. Walker, B.N.; Rehg, J.M.; Kalra, A.; Winters, R.M.; Drews, P.; Dascalu, J.; David, E.O.; Dascalu, A. Dermoscopy diagnosis of cancerous lesions utilizing dual deep learning algorithms via visual and audio (sonification) outputs: Laboratory and prospective observational studies. EBio Med. 2019, 40, 176–183. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  159. Horie, Y.; Yoshio, T.; Aoyama, K.; Yoshimizu, S.; Horiuchi, Y.; Ishiyama, A.; Hirasawa, T.; Tsuchida, T.; Ozawa, T.; Ishihara, S.; et al. Diagnostic outcomes of esophageal cancer by artificial intelligence using convolutional neural networks. Gastrointest. Endosc. 2019, 89, 25–32. [Google Scholar] [CrossRef] [PubMed]
  160. Gómez-Martín, I.; Moreno, S.; Duran, X.; Pujol, R.M.; Segura, S. Diagnostic accuracy of non-melanocytic pink flat skin lesions on the legs: Dermoscopic and reflectance confocal microscopy evaluation. Acta Dermato-Venereologica 2019, 99, 33–40. [Google Scholar] [CrossRef] [PubMed]
  161. Pandey, P.; Saurabh, P.; Verma, B.; Tiwari, B. A multi-scale retinex with color restoration (MSR-CR) technique for skin cancer detection. In Soft Computing for Problem Solving; Springer: Singapore, 2018; pp. 465–473. [Google Scholar]
  162. Guo, Y.; Gao, Y.; Shen, D. Deformable MR prostate segmentation via deep feature learning and sparse patch matching. IEEE Trans. Med. Imaging 2016, 35, 1077–1089. [Google Scholar] [CrossRef] [PubMed]
  163. Milletari, F.; Navab, N.; Ahmadi, S.-A. V-net: Fully convolutional neural networks for volumetric medical image segmentation. In Proceedings of the Fourth International Conference on 3D-Vision (3DV), Stanford, CA, USA, 25–28 October 2016; pp. 565–571. [Google Scholar]
  164. Yu, L.; Chen, H.; Dou, Q.; Qin, J.; Heng, P.A. Integrating online and offline three-dimensional deep learning for automated polyp detection in colonoscopy videos. IEEE J. Biomed. Health Inform. 2017, 21, 65–75. [Google Scholar] [CrossRef]
  165. Tian, Z.; Liu, L.; Zhang, Z.; Fei, B. PSNet: Prostate segmentation on MRI based on a convolutional neural network. J. Med. Imaging 2018, 5, 021208. [Google Scholar] [CrossRef]
  166. Armato, S.G.; McLennan, G.; Bidaut, L.; McNitt-Gray, M.F.; Meyer, C.R.; Reeves, A.P.; Zhao, B.; Aberle, D.R.; Henschke, C.I.; Hoffman, E.A. The lung image database consortium (LIDC) and image database resource initiative (IDRI): A completed reference database of lung nodules on CT scans. Med. Phys. 2011, 38, 915–931. [Google Scholar] [CrossRef]
  167. Sabbaghi, S.; Aldeen, M.; Garnavi, R. A deep bag-of-features model for the classification of melanomas in dermoscopy images. In Proceedings of the IEEE 38th Annual International Conference of the Engineering in Medicine and Biology Society (EMBC), Orlando, FL, USA, 16–20 August 2016; pp. 1369–1372. [Google Scholar]
Figure 1. Summary of typically used skin cancer classification methods.
Figure 2. Artificial Neural Networks (ANNs).
Figure 3. General convolutional neural networks.
Figure 4. Fully convolutional neural networks.
Figure 5. Autoencoders.
Figure 6. Deep belief networks (DBNs).
Table 1. Steps of cancer diagnosis.

| Pre-Processing | Image Segmentation | Post-Processing |
|---|---|---|
| Contrast adjustment | Histogram thresholding | Opening and closing operations |
| Vignetting effect removal | Distributed and localized region identification | Island removal |
| Color correction | Clustering & Active contours | Region merging |
| Image smoothing | Supervised learning | Border expansion |
| Hair removal | Edge detection & Fuzzy logic | Smoothing |
| Normalization and localization | Probabilistic modeling and graph theory | |
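
As a rough illustration of how one entry from each column of Table 1 can be chained, the following Python sketch applies contrast adjustment (a linear stretch), histogram thresholding (Otsu's method, chosen here only as one common instance) and island removal to a tiny grayscale image. All function names, the `min_size` parameter, and the assumption that the lesion is brighter than the background are illustrative choices, not taken from the reviewed papers.

```python
from collections import deque


def stretch_contrast(img):
    """Pre-processing: linearly rescale intensities to the 0..255 range."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:
        return [[0 for _ in row] for row in img]
    return [[round(255 * (v - lo) / (hi - lo)) for v in row] for row in img]


def otsu_threshold(img):
    """Segmentation: threshold that maximizes the between-class variance."""
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += t * hist[t]
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def remove_islands(mask, min_size):
    """Post-processing: drop 4-connected foreground regions below min_size."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    seen = [[False] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or seen[r][c]:
                continue
            region, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) >= min_size:
                for y, x in region:
                    out[y][x] = 1
    return out


def segment(img, min_size=3):
    """Chain the three stages; pixels above the threshold count as lesion."""
    stretched = stretch_contrast(img)
    t = otsu_threshold(stretched)
    mask = [[1 if v > t else 0 for v in row] for row in stretched]
    return remove_islands(mask, min_size)


if __name__ == "__main__":
    image = [
        [10, 10, 10, 10, 10],
        [10, 200, 200, 10, 10],
        [10, 200, 200, 10, 10],
        [10, 10, 10, 10, 200],
        [10, 10, 10, 10, 10],
    ]
    for row in segment(image):
        print(row)
```

In practice these stages would normally be delegated to an image-processing library; the pure-Python version is kept only so that each step stays visible.
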
Table 2. Menzies method.

| Positive features | Negative features |
|---|---|
| Blue-white veil | Lesion's symmetry |
| Scar-like depigmentation | Single color presence |
| Gray and blue dots | |
| Broadened networks | |
| Radial streaming | |
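
The decision rule usually associated with the Menzies method can be sketched in a few lines of Python: a lesion is flagged as suspicious for melanoma when neither negative feature (lesion symmetry, single color) is present and at least one positive feature is present. The sketch below covers only the features listed in Table 2; the identifier names and the function `menzies_suspicious` are hypothetical, introduced purely for illustration.

```python
# Features as listed in Table 2; the identifier names are illustrative assumptions.
NEGATIVE_FEATURES = {"lesion_symmetry", "single_color"}
POSITIVE_FEATURES = {
    "blue_white_veil",
    "scar_like_depigmentation",
    "gray_blue_dots",
    "broadened_network",
    "radial_streaming",
}


def menzies_suspicious(observed):
    """Return True when no negative and at least one positive feature is observed."""
    observed = set(observed)
    unknown = observed - NEGATIVE_FEATURES - POSITIVE_FEATURES
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    has_negative = bool(observed & NEGATIVE_FEATURES)
    has_positive = bool(observed & POSITIVE_FEATURES)
    return has_positive and not has_negative


if __name__ == "__main__":
    print(menzies_suspicious({"blue_white_veil", "radial_streaming"}))
    print(menzies_suspicious({"blue_white_veil", "single_color"}))
```

The feature observations themselves still have to come from a clinician or from an upstream image-analysis step; the function only encodes the final rule.
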
Table 3. List of available codes online.
Table 4. Summary of references with CNN Applications. (Histo.path: histopathology, vol.CT: Volumetric computed tomography, MG: Mammographs, MRI: Magnetic Resonance Imaging, DermoS: Dermoscopic Segmentation, and BraTS: Brain tumor segmentation.)

| Application Type | Modality | Dataset | Reference |
|---|---|---|---|
| Breast Cancer Classification | Histo.path | BreakHis | Spanhol et al. [49] |
| Mass Detection | Histo.path | INbreast | Wichakam et al. [50] |
| Mass segmentation | Mammo.graph. | DDSM | Ertosun et al. [51] |
| Mitosis Detection | Histo.path | MITOSATYPIA-14 | Albayrak et al. [52] |
| Lesion recognition | Mammo.graph. | DDSM | Swiderski et al. [53] |
| Mass Detection | Histo.path | DDSM | Suzuki et al. [54] |
| Lung nodule (LN) Classification | CT Slices | JSRT | Wang et al. [55] |
| Pulmonary nodule Detection | Volumetric CT | LIDC-IDRI | Dou et al. [56] |
| Lung nodule (LN) suspiciousness classification | Volumetric CT | LIDC-IDRI | Shen et al. [57] |
| Nodule characterization | Volumetric CT | LIDC-IDRI | Hua et al. [58] |
| Ground glass opacity (GGO) extraction | CT Slices | LIDC | Hirayama et al. [59] |
| Pulmonary nodules detect. | Volumetric CT | LIDC | Setio et al. [60] |
| Nodule Characterization | Volumetric CT | DLCST, LIDC | Hussein et al. [61] |
| Skin lesion classification | Dermo.S. | ISIC | Mahbod et al. [62] |
| Skin lesion classification | Dermo.S. | DermIS, DermQuest [63,64] | Pomponiu et al. [65] |
| Skin lesion classification | Dermo.S. | ISIC [66] | Majtner et al. [67] |
| Dermoscopy patterns classification | Dermo.S. | ISIC | Demyanov et al. [68] |
| Melanoma detection | Clinical photography | MED-NODE [69] | Nasr-Esfahani et al. [70] |
| Lesion border detection | Clinical photography | DermIS, Online dataset, DermQuest [71,72,73] | Sabouri et al. [74] |
| Prostate Segmentation | MRI | PROMISE12 [75] | Yan et al. [76] |
| Prostate Segmentation | CT Scans | PROMISE12 | Maa et al. [77] |
| Brain tumor Segmentation, Cancer detection | MRI | BraTS [78] | Zhao et al. [79] |
| Brain tumor Segmentation | MRI | BraTS | Pereira et al. [80] |
| Brain tumor Segmentation | MRI | BraTS | Kamnitsas et al. [81] |
| Prostate segmentation | MRI | BraTS | Zhao et al. [82] |
| Gland segmentation | Histo.path | Warwick-QU [83,84] | Chen et al. [85] |
Table 5. Summary of references with CNN Applications.

| Application Type | Modality | Reference |
|---|---|---|
| Dermatologists-level skin cancer | Dermo.S. | Esteva et al. [86] |
| Survival Prediction | CT Slices | Paul et al. [87] |
| Latent bilateral feature representation learning | Tomosynthesis | Kim et al. [88] |
| Feature learning of Brain tumor | MRI | Liu et al. [89] |
| Gleason grading | Histo.path | Kallen et al. [90] |
| Gleason grading | Histo.path | Gummeson et al. [91] |
| Lumen-based prostate cancer detection | Histo.path | Kwak et al. [92] |
| Survival analysis | Histo.path | Zhu et al. [93] |
| Classification of Brain tumor | MRI | Ahmed et al. [94] |
| Cervical cytoplasm and nuclei segmentation | Histo.path | Song et al. [95] |
| Urinary bladder segmentation | CT Slices | Cha et al. [96] |
| Liver segmentation on Laparoscopic videos | Laparoscopy | Gibson et al. [97] |
| Inner/outer bladder wall segmentation | CT Slices | Gordon et al. [98] |
| Cervical dysplasia diagnosis | Digital cervicography | Xu et al. [99] |
| Colon adenocarcinoma glands segmentation | Histo.path | BenTaieb et al. [100] |
| Nucleus segmentation | Histo.path | Xing et al. [101] |
| Circulating tumor-cell detection | Histo.path | Mao et al. [102] |
| Liver tumor segmentation | CT Slices | Li et al. [103] |
| Cervical cytoplasm segmentation | Histo.path | Song et al. [104] |
| Bladder cancer treatment response assessment | CT Slices | Cha et al. [105] |
Table 6. Skin cancer risk factors and their causes.

| Cause | Risk Factors |
|---|---|
| 1. Sunlight | (a) UV radiation leading to cancer; (b) Sunburn blisters: sunburns in adults are more prone to cancer; (c) Tanning |
| 2. Tanning booths and sun lamps | Lead to cancer before the age of 30 |
| 3. Inherited | Two or more carriers of melanoma in the family pass the disease on to their descendants |
| 4. Easily burnable skin | Gray/blue eyes, fair/pale skin, blond/red hair |
| 5. Medication side effects | Side effects of anti-depressants, antibiotics and hormones |
Table 7. Summary of CNN for different cancers.

| Application Type | Modality | Reference |
|---|---|---|
| Prostate Segmentation | 3D MRI | Yu et al. [142] |
| Prostate Segmentation | 3D MRI | Milletari et al. [163] |
| Prostate Segmentation | MRI | Zhao et al. [79] |
| Polyp detection | Colonoscopy | Yu et al. [164] |
Table 8. Datasets and their online access links.

| Dataset Name | Link to Data Access |
|---|---|
| MED-NODE | imaging/databases/melanoma_naevi/ |
| STACOM-SLAWT | rkarim/la_lv_framework/wall/datasets.html |
| ISBI15 | cweiwang/ISBI2015/challenge2/index.html |
| OCCISC-14 | carneiro/isbi14_challenge/dataset.html |
Table 9. Summary of references for different cancers.

| Application Name | References | No. of Papers |
|---|---|---|
| Breast Cancer | [49,50,52,53,54,85,88,114,136,137,138] | 12 |
| Lung Cancer | [12,56,57,59,60,61,87,93] | 12 |
| Prostate Cancer | [76,77,90,91,92] | 5 |
| Brain Cancer | [79,80,81,89,94] | 5 |
| Skin Cancer | [42,65,67,68,70,74,86,141,142,143,144,145,147,148,149,150,151,152,153,154,155,156,157,158,159,161,167] | 27 |

Share and Cite

MDPI and ACS Style

Munir, K.; Elahi, H.; Ayub, A.; Frezza, F.; Rizzi, A. Cancer Diagnosis Using Deep Learning: A Bibliographic Review. Cancers 2019, 11, 1235.

