An Intelligent Computer-Aided Scheme for Classifying Multiple Skin Lesions

Abstract: Skin disease cases are increasing daily and are difficult to handle because of the global imbalance between skin disease patients and dermatologists. Skin diseases are among the top five leading causes of the worldwide disease burden. To reduce this burden, computer-aided diagnosis (CAD) systems are in high demand. A major shortcoming of existing work is that it classifies only a single disease. Because skin diseases share similar characteristics, classifying multiple skin lesions is very challenging. This research work extends our existing work by proposing a novel scheme for multi-class classification. The proposed framework classifies an input skin image into one of six non-overlapping classes, i.e., healthy, acne, eczema, psoriasis, benign and malignant melanoma. It comprises four steps, i.e., pre-processing, segmentation, feature extraction and classification, each accomplished using different image processing and machine learning techniques. Experiments are performed on 1800 images using 10-fold cross-validation. An accuracy of 94.74% was achieved using a quadratic support vector machine. The proposed classification scheme can help patients in the early classification of skin lesions.


Introduction
Skin lesion cases are increasing day by day and are a major contributor to the growing global disease burden. Skin lesions rank fourth among the major causes of the global disease burden [1]. The after-effects of skin lesions are severe. Their burden is multi-dimensional and includes social, financial and psychological consequences for the patient's life and for society [2]. People of all ages suffer from skin diseases, but young and elderly people suffer the most. Unemployment, self-harm, emotional distress, relationship loss, increased alcoholism and suicide are some of the prominent issues found in skin disease patients [3].
A huge gap exists between the number of skin disease patients and the resources available to cope with them. These resources include skilled dermatologists, equipment, medicines and researchers. According to the World Health Organization, people living in rural areas suffer the most because of the lack of resources [4]. Due to this gross imbalance between skin patients and the available expertise, automated expert
For classifying erythemato-squamous diseases, Guvenir and his colleague [20] proposed an automated classification scheme using three different classifiers. The expert system was trained on biopsy features, and 99.2% classification accuracy was achieved using the voting feature algorithm. Similar work was proposed by Ubeyli et al. [21], who classified erythemato-squamous diseases using a combined neural network approach with an accuracy of 97.7%.
Chang et al. [22] utilized a decision tree and an artificial neural network (ANN) to diagnose the same diseases and attained an accuracy of 92.62%. Xie et al. [23], Kumar et al. [24] and Nanni et al. [25] proposed schemes for multi-class classification of erythemato-squamous lesions based on features extracted after biopsy, a painful method. The scheme by Xie et al. achieved an accuracy of 98.61%, whereas the other two approaches achieved 97.22% and 95%, respectively. As stated earlier, all of the above work on erythemato-squamous disease classification relied on features extracted after a painful procedure, i.e., biopsy [35]. Clinical feature extraction is painful, time-consuming and expensive, and requires domain expertise. These features are very difficult to obtain for people living with limited resources.
To detect malignancy, Erol et al. [36] extracted texture features from the region within the lesion boundary, which was determined by active-contour segmentation. The texture features comprised the homogeneity, standard deviation (SD) and mean of the pixel values. Artificial neural network (ANN) and support vector machine (SVM) classifiers were compared, and the best performance, 78% specificity, was achieved using the ANN on a dataset of 900 images containing 173 malignant lesions. Schnurle et al. [37] provided an automated approach to classify hand eczema. To balance the data, they used oversampling and then extracted colour, texture and histogram features from the images. To evaluate their approach, an SVM was applied to the features extracted from 48 images. F-scores of 58.6% and 43.8% were achieved for the front and back of the hands, respectively.
Hameed et al. [38] proposed a computer-aided classification system for multiple skin lesions using a hybrid approach in which features are extracted using a convolutional neural network (CNN) and classification is performed using an SVM. Because the features are extracted by a CNN, they are not interpretable. The computer-aided classification systems presented by different scholars achieved good accuracy but are limited in the range of diseases they cover. These limitations in the current literature indicate the demand for an intelligent classification system that can classify multiple skin lesions with high accuracy.

Materials
The dataset plays a vital role in classifying different skin lesions. For the experiments, an image dataset was collected from different sources: online medical data repositories, research challenges and researchers working in this domain. The online data repositories include the DermIS [26], DermQuest [27], DermNZ [28] and PH2 [29] datasets. The publicly available "11k Hands" dataset is used for healthy images. Some of the images in the eczema and healthy categories were collected from researchers [30] working in the field of skin lesion classification. The IEEE International Symposium on Biomedical Imaging (ISBI) skin lesion challenge [31] is an international skin lesion classification challenge organized every year since 2016; some of the benign and malignant images were taken from the ISBI skin lesion repository. Figure 1 presents sample images belonging to the different categories. After collecting the data from the different sources, a unified dataset was created for this work.
Data imbalance is an important issue that needs to be addressed while training the classification model, as the model may incline towards the class having more images [1,32]. Considering this, a stratified sampling technique was used to balance the dataset. The images downloaded from the above-mentioned sources were organized based on the disease features, and random down-sampling was then applied. The psoriasis category has the fewest images (N = 300), so the other categories were down-sampled to 300 images each to balance the dataset. After down-sampling, a total of 1800 images of size 227 × 227 × 3 were used to train and test the classification model. The detailed dataset division used in this research work is presented in Table 1.
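The balancing step described above can be illustrated with a minimal sketch of random down-sampling to the size of the smallest class. This is not the authors' implementation: the function name `downsample_to_smallest`, the file names and all class sizes except psoriasis (N = 300) are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict

def downsample_to_smallest(samples, seed=0):
    """Randomly down-sample every class to the size of the smallest class.

    samples: list of (item, label) pairs.
    """
    by_class = defaultdict(list)
    for item, label in samples:
        by_class[label].append((item, label))
    n_min = min(len(members) for members in by_class.values())
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    balanced = []
    for label in sorted(by_class):
        balanced.extend(rng.sample(by_class[label], n_min))
    return balanced

# Hypothetical class sizes before balancing; only psoriasis = 300 is from the text.
counts = {"healthy": 520, "acne": 410, "eczema": 365,
          "psoriasis": 300, "benign": 480, "malignant": 450}
samples = [(f"{label}_{i}.jpg", label)
           for label, n in counts.items() for i in range(n)]

balanced = downsample_to_smallest(samples)
per_class = defaultdict(int)
for _, label in balanced:
    per_class[label] += 1
print(len(balanced), dict(per_class))
```

With six classes and a minority class of 300 images, the balanced set contains 6 × 300 = 1800 images, which agrees with the dataset size reported above.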

Method
Pre-processing, segmentation, feature extraction and classification are the key phases of a CAD system for medical image classification [10]. The proposed scheme for multi-class skin lesion classification, which comprises these four phases, is illustrated in Figure 2.

Pre-Processing and Segmentation
Capturing and digitisation is a noisy process, affected by angle, lighting, camera resolution and dimensional alignment; pre-processing is therefore the first step of the proposed classification scheme. The gathered images are of different sizes and contain noise because they were captured using different devices in different environments. In this stage, the images are resized, hair is removed and the images are smoothed. For consistency, the images are resized to 227 × 227 × 3. Noise is present in the images in the form of hair, which is removed using the well-known "Dull Razor" technique [33]. To remove the remaining noise, a Gaussian filter with a 3 × 3 filter size is applied.
Segmentation for multi-disease classification is difficult because the diseases differ in their characteristics and in their location on the human body. Malignant melanoma and benign lesions usually have a definite shape and boundary; therefore, shape and geometric features can be easily extracted from them [1]. Diseases like acne, eczema and psoriasis may cover the full body area and have no definite shape; therefore, extraction of geometric and boundary features is very challenging. For this reason, segmentation in this research is performed with respect to human skin: any non-skin area is discarded from the image, and the remaining part is extracted and considered the region of interest (ROI). The ROI is segmented using the methodology proposed by Phung et al. [34]. The segmentation accuracy achieved is 81.24%; errors occur because, in some cases, the colour of the background matches that of the skin.
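The smoothing step of the pre-processing stage can be sketched as a 3 × 3 Gaussian filter applied to one colour channel. The text specifies only the filter size, so the standard deviation (`sigma=0.5`) and the border handling (replicating edge pixels) are assumptions of this sketch, as are the function names; hair removal and skin segmentation are not reproduced here.

```python
import math

def gaussian_kernel_3x3(sigma=0.5):
    """Return a normalised 3 x 3 Gaussian kernel (sigma is an assumed value)."""
    raw = [[math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
            for dx in (-1, 0, 1)] for dy in (-1, 0, 1)]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]

def smooth_channel(channel, kernel):
    """Filter one colour channel (list of rows) with a 3 x 3 kernel.

    Border pixels are handled by replicating the nearest edge pixel.
    """
    rows, cols = len(channel), len(channel[0])
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            acc = 0.0
            for ky in (-1, 0, 1):
                for kx in (-1, 0, 1):
                    yy = min(max(y + ky, 0), rows - 1)
                    xx = min(max(x + kx, 0), cols - 1)
                    acc += kernel[ky + 1][kx + 1] * channel[yy][xx]
            out[y][x] = acc
    return out

kernel = gaussian_kernel_3x3()
# Tiny synthetic channel: flat background with one bright "noise" pixel.
channel = [[10] * 5 for _ in range(5)]
channel[2][2] = 250
smoothed = smooth_channel(channel, kernel)
print(round(smoothed[2][2], 2), round(smoothed[0][0], 2))
```

In an RGB image the same filter would be applied to each of the three channels; the isolated bright pixel is attenuated while the flat background is left unchanged.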

Feature Extraction
Feature extraction for multi-disease classification is challenging because different diseases may have similar features and because skin lesions are diverse in nature. For example, shape features are easy to extract from skin cancer images, which have a clear boundary and a definite size, whereas the same features are difficult to extract from acne, eczema and psoriasis images, where the lesion may cover the whole body area in the captured image and has no clear shape. In this research work, a bag of features that can be extracted from any skin lesion image is proposed. In the feature extraction step, 35 colour and texture features are extracted from the skin lesion images for multi-class classification.

Colour Features
Colour features play a vital role in multi-disease classification [33,34] and are among the most important features for distinguishing between different skin diseases. This work explores the RGB colour space and extracts different features from it: the minimum, maximum, mean, mode, standard deviation, skewness, energy, entropy and kurtosis of the red, green and blue channels. The colour features, along with their description and formulae, are given in Table 2.

Table 2. Different colour features extracted from the red, green and blue colour channels along with their description and formulae. The colour features include minimum, maximum, mean, mode, standard deviation, skewness, energy, entropy and kurtosis.

| Name | Description | Formula |
|---|---|---|
| Entropy | Measures the amount of information required to code the image data | $-\sum_{g=0}^{w-1} P(g)\,\log_2 P(g)$ |
| Kurtosis | Measures the peakedness of the probability distribution of an image | $\frac{1}{\sigma_g^{4}} \sum_{g=0}^{w-1} (g-\bar{g})^{4}\, P(g)$ |

Legend: $w$ is the number of intensity levels, $g$ is the intensity level, $r$ is the number of rows, $c$ is the number of columns in the image, $\bar{g}$ is the mean and $\sigma_g$ is the standard deviation.
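As an illustration of the nine colour statistics listed above, the following sketch computes them for one channel from its normalised histogram P(g) and then concatenates the red, green and blue results. The entropy follows the formula given in the text; the definitions used for the other statistics (in particular energy as the sum of squared P(g), and histogram-based central moments for skewness and kurtosis) are standard choices assumed here, not confirmed by the source. The function name and the toy pixel values are hypothetical.

```python
import math
from collections import Counter

FEATURE_NAMES = ["min", "max", "mean", "mode", "std", "skewness",
                 "energy", "entropy", "kurtosis"]

def colour_features(channel):
    """Histogram statistics of one colour channel given as a list of rows."""
    pixels = [v for row in channel for v in row]
    hist = Counter(pixels)
    n = len(pixels)
    prob = {g: count / n for g, count in hist.items()}  # P(g)
    mean = sum(g * p for g, p in prob.items())
    std = math.sqrt(sum((g - mean) ** 2 * p for g, p in prob.items()))
    # Standardised moments are undefined for a constant channel; report 0.0.
    if std > 0:
        skew = sum((g - mean) ** 3 * p for g, p in prob.items()) / std ** 3
        kurt = sum((g - mean) ** 4 * p for g, p in prob.items()) / std ** 4
    else:
        skew, kurt = 0.0, 0.0
    return {
        "min": min(pixels),
        "max": max(pixels),
        "mean": mean,
        # Most frequent intensity; ties are broken by the smaller value.
        "mode": min(hist, key=lambda g: (-hist[g], g)),
        "std": std,
        "skewness": skew,
        "energy": sum(p * p for p in prob.values()),
        "entropy": -sum(p * math.log2(p) for p in prob.values()),
        "kurtosis": kurt,
    }

# Toy 2 x 2 image with separate red, green and blue channels (made-up values).
rgb = {
    "red":   [[0, 0], [2, 2]],
    "green": [[5, 5], [5, 5]],
    "blue":  [[1, 2], [3, 4]],
}
features = {name: colour_features(ch) for name, ch in rgb.items()}
colour_vector = [features[c][f]
                 for c in ("red", "green", "blue") for f in FEATURE_NAMES]
print(len(colour_vector), features["red"])
```

Nine statistics for each of the three channels give 27 colour features per image.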

Texture Features
In the existing literature, the grey-level co-occurrence matrix (GLCM) is the most commonly used source of texture features [39]. In this research work, the GLCM [39] is computed first, and then contrast, correlation, energy and homogeneity are calculated from it. The extracted GLCM features along with their description and formula are given in Table 3.

Table 3. GLCM features with their description and formulae. GLCM features include contrast, correlation, energy and homogeneity.

| Name | Description |
|---|---|
| Contrast | Measures the local fluctuations of grey levels between neighboring pixels |
| Correlation | Measures the joint probability of occurrence of the specified pixel pairs |
| Energy | Measures the sum of squared elements in the GLCM |
| Homogeneity | Measures the local uniformity |

Features extracted from the neighborhood grey-tone difference matrix (NGTDM) are also important and reflect the human perception of texture [40]. These features have not been fully investigated for the classification of multiple skin diseases. In this research, four features are extracted from the NGTDM. The NGTDM is a column matrix computed from the greyscale image. Let $f(k, l)$ be the grey tone of any pixel at $(k, l)$ having grey-tone value $i$; the average grey tone over a neighborhood centred at, but excluding, $(k, l)$ is calculated using Equation (1).

$$\bar{A}_i = \bar{A}(k,l) = \frac{1}{W-1}\left[\sum_{m=-d}^{d}\sum_{n=-d}^{d} f(k+m,\,l+n)\right],\quad (m,n)\neq(0,0) \tag{1}$$
where $d$ specifies the neighborhood size and $W = (2d + 1)^2$. Then the $i$th entry in the NGTDM is calculated using Equation (2).

$$s(i) = \begin{cases} \sum_{(k,l)\in N_i} \left|\, i - \bar{A}(k,l) \,\right|, & N_i \neq \emptyset \\ 0, & \text{otherwise} \end{cases} \tag{2}$$

where $\bar{A}(k,l)$ is the neighborhood average grey tone of Equation (1) and $N_i$ is the set of all pixels having grey tone $i$. After calculating the NGTDM, busyness, complexity, contrast and strength are extracted. Their description and formula are given in Table 4.

Table 4. Features extracted from the neighborhood grey-tone difference matrix along with their description and formula.

| Name | Description |
|---|---|
| Busyness | Measures changes in grey levels between neighboring voxels |
| Complexity | Measures the non-uniformity and rapid changes in grey levels |
| Contrast | Measures the changes between voxels and their neighborhood |
| Strength | Measures the primitives in an image |

All 35 colour and texture features (27 colour, 4 GLCM and 4 NGTDM features) are stored in the feature vector, which is then passed to the classification step for training the classification model.
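The two texture matrices described above can be sketched as follows. The GLCM feature definitions used here (contrast, correlation, energy, homogeneity) are the commonly used ones and are assumed rather than taken from the source, as are the pixel offset (one pixel to the right), the non-symmetric counting and the neighborhood size d = 1. The NGTDM part computes only the entries s(i), following the neighborhood-average definition in the text with border pixels skipped; busyness, complexity, contrast and strength are not reproduced. All function names are hypothetical.

```python
import math

def glcm(img, levels, dx=1, dy=0):
    """Normalised co-occurrence matrix for the pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(img), len(img[0])
    total = 0
    for y in range(rows):
        for x in range(cols):
            yy, xx = y + dy, x + dx
            if 0 <= yy < rows and 0 <= xx < cols:
                counts[img[y][x]][img[yy][xx]] += 1
                total += 1
    return [[c / total for c in row] for row in counts]

def glcm_features(p):
    """Contrast, correlation, energy and homogeneity of a normalised GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu_i = sum(i * v for i, j, v in cells)
    mu_j = sum(j * v for i, j, v in cells)
    sd_i = math.sqrt(sum((i - mu_i) ** 2 * v for i, j, v in cells))
    sd_j = math.sqrt(sum((j - mu_j) ** 2 * v for i, j, v in cells))
    cov = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        # Correlation is undefined for a constant image; 0.0 is returned then.
        "correlation": cov / (sd_i * sd_j) if sd_i > 0 and sd_j > 0 else 0.0,
        "energy": sum(v * v for i, j, v in cells),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
    }

def ngtdm(img, levels, d=1):
    """Entries s(i): summed |i - neighborhood average| over pixels of tone i."""
    rows, cols = len(img), len(img[0])
    w = (2 * d + 1) ** 2
    s = [0.0] * levels
    for k in range(d, rows - d):          # skip the border of width d
        for l in range(d, cols - d):
            window = sum(img[k + m][l + n]
                         for m in range(-d, d + 1) for n in range(-d, d + 1))
            avg = (window - img[k][l]) / (w - 1)  # centre pixel excluded
            s[img[k][l]] += abs(img[k][l] - avg)
    return s

# Small 4 x 4 test image with four grey levels (illustrative values only).
img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 2, 2, 2],
       [2, 2, 3, 3]]
texture = glcm_features(glcm(img, levels=4))
s = ngtdm(img, levels=4, d=1)
print(texture, s)
```

For a real image the grey levels would typically be quantised first, and several offsets could be averaged; both choices are left open here because the source does not state them.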

Classification
Classification is the last phase of the computer-aided classification model; in this step, an inference is made to produce a diagnosis for the input image. The classification model is trained on the feature vector using supervised learning. Experiments are performed using different classification models, and the one with the best performance is selected to develop the computer-aided classification application. The models utilized are decision tree, support vector machine (SVM), k-nearest neighbor (KNN) and ensemble methods, each in several variants [41]. For the decision tree, fine, medium and coarse trees are used. Linear, quadratic, cubic, fine Gaussian and coarse Gaussian kernels are used for the SVM. The KNN variants are fine, medium, coarse, cosine, cubic and weighted, and the ensemble variants are boosted trees, bagged trees, subspace discriminant, subspace KNN and RUSBoosted trees. The variants of each classifier are given in Table 5.
The performance of the classifiers is calculated from the confusion matrix. As the proposed CAD system performs multi-class classification, a multi-class confusion matrix is obtained. First, the accuracy, sensitivity and specificity of each individual class are computed; the overall performance is then calculated from these per-class values using macro averaging [42]. The formulae used to calculate the overall performance are given in Table 6.
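The evaluation procedure described above can be sketched as follows: each class is treated one-vs-rest to obtain its true/false positives and negatives from the multi-class confusion matrix, and the per-class accuracy, sensitivity and specificity are then macro-averaged. The confusion matrix below is made up for illustration, the function names are hypothetical, and rows are assumed to hold the true classes and columns the predicted classes.

```python
def per_class_metrics(cm):
    """One-vs-rest accuracy, sensitivity and specificity for every class.

    cm[i][j] is the number of samples of true class i predicted as class j.
    Assumes every class has at least one sample and one negative.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    metrics = []
    for i in range(n):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp
        fp = sum(cm[r][i] for r in range(n)) - tp
        tn = total - tp - fn - fp
        metrics.append({
            "accuracy": (tp + tn) / total,
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
        })
    return metrics

def macro_average(metrics):
    """Unweighted mean of each metric over the classes."""
    return {key: sum(m[key] for m in metrics) / len(metrics)
            for key in metrics[0]}

# Made-up 3-class confusion matrix (rows: true class, columns: predicted).
cm = [[8, 1, 1],
      [2, 7, 1],
      [0, 2, 8]]
overall = macro_average(per_class_metrics(cm))
correct_fraction = sum(cm[i][i] for i in range(len(cm))) / sum(map(sum, cm))
print(overall, correct_fraction)
```

In this toy example the macro-averaged per-class accuracy is about 0.844, whereas the fraction of correctly classified samples is about 0.767, so the two quantities should not be read interchangeably.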

Results and Discussion
The experiments were performed using the gathered dataset, and the classification model was trained and tested on 1800 images using k-fold cross-validation (k = 10). In k-fold cross-validation, the data is divided into k equal subsets, and the holdout method is repeated k times. Each time, one of the k subsets is used for testing and the remaining k − 1 subsets are used for training; finally, the average performance across all k trials is calculated. As mentioned above, a multi-class confusion matrix was obtained for each classifier. The confusion matrices for fine tree, quadratic SVM, weighted KNN and bagged trees are provided in the Supplementary Material.
Using the 35 colour and texture features, the SVM with the quadratic kernel performed best among all classifiers, with an average per-class accuracy, sensitivity and specificity of 94.74%, 84.23% and 96.85%, respectively. Its training time was 3.0624 s, and its prediction speed was approximately 8400 obs/s (observations per second). Among the decision tree classifiers, the fine tree gave the highest average per-class accuracy, 88.40%, with a sensitivity of 70.24% and a specificity of 93.04%. The computational time for training this model was 3.4608 s, and the maximum number of splits was 10. Among the KNN classifiers, weighted KNN performed best, with an average per-class accuracy, sensitivity and specificity of 92.80%, 78.38% and 95.68%, respectively; these experiments used the Euclidean distance and 10 neighbors. The performance of the bagged trees was close to that of the quadratic SVM, with 94.16% accuracy, 82.48% sensitivity and 96.49% specificity. The results of the fine tree, quadratic SVM, weighted KNN and bagged trees are given in Table 7.
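The 10-fold protocol described above can be sketched as an index generator: the samples are shuffled once, split into k equal folds, and each fold serves once as the test set while the remaining k − 1 folds form the training set. The shuffling seed and the function name are assumptions of this sketch, and the split is not stratified by class, since the text does not say whether the folds were stratified.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # assumed: shuffle once
    folds = [indices[i::k] for i in range(k)]   # k folds of (near-)equal size
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

splits = list(k_fold_indices(1800, k=10))
for train, test in splits[:2]:
    print(len(train), len(test))

# The reported score is the mean of the k per-fold scores, for example:
# mean_accuracy = sum(fold_accuracies) / len(fold_accuracies)
```

With 1800 images and k = 10, each trial trains on 1620 images and tests on 180, and every image is used for testing exactly once.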
The dispersion boxplot for fine tree, quadratic SVM, weighted KNN and bagged trees is presented in Figure 3, and a comparison of these classifiers is presented in Figure 4.

Table 7. Performance of the different classifiers using 10-fold cross-validation. Values depict the mean score in %.

| Classifier | Accuracy | Sensitivity | Specificity |
|---|---|---|---|
| Fine tree | 88.40 | 70.24 | 93.04 |
| Quadratic SVM | 94.74 | 84.23 | 96.85 |
| Weighted KNN | 92.80 | 78.38 | 95.68 |
| Bagged trees | 94.16 | 82.48 | 96.49 |

Based on this performance, the model trained using the quadratic SVM was chosen, and the CAD system was developed. Two research works can be compared if they have used the same dataset. The proposed research work is compared with existing research work in Table 8. To classify a new image, the unseen image is sent to the trained model and is classified in a fraction of a second. Currently, the proposed skin lesion classification system can only classify an image into one of the six non-overlapping classes, i.e., healthy, acne, eczema, psoriasis, benign and malignant. If an image of a rarer disease outside these classes is presented, it will be assigned to one of the provided classes, generating false positives (FPs) and false negatives (FNs); this can be considered a limitation of the proposed work. However, it can be overcome by adding more disease classes. Factors causing difficulties in segmentation and classification are also identified in this work. One of the main hurdles is noise, which is present in the form of hairs, black frames, circles, skin lines, etc. The homogeneous characteristics of different skin lesions are another factor: some lesions can have the same colour and texture, which may adversely affect the classification accuracy.

Conclusions
In the literature, most work on automated skin lesion classification considers only malignant melanoma, and multi-class skin lesion classification has been neglected. A novel multi-class skin lesion classification framework is proposed in this work for the classification of commonly occurring and prominent skin lesions. The proposed framework comprises four steps. The first step is pre-processing, where noise is removed from the skin images. The second step is segmentation, where the ROI is extracted from the skin lesion image. In the third step, 35 different features are extracted from the ROI, and finally different classifiers are used to train the classification model. Among the different classifiers, the SVM with the quadratic kernel performed best, with an accuracy of 94.74%. The proposed classification scheme performed very well on the images gathered from different sources. Because it is trained on images collected from different sources, the proposed system is expected to perform well on new, unseen images.
Segmentation for multi-class skin lesion classification needs further investigation in order to propose a unified scheme that can be applied to images of different skin lesions. In this research work, a bag of features was extracted manually, which was time-consuming. Future studies are required on automated feature extraction that remains easily interpretable. The proposed classification scheme is designed for desktop use; more research is required to make it compatible with smartphone applications.