An Identification Method of Feature Interpretation for Melanoma Using Machine Learning

Abstract: Melanoma is a fatal skin cancer that can be treated efficiently when detected early. There is a pressing need for dependable computer-aided diagnosis (CAD) systems to address this concern effectively. In this work, a melanoma identification method with feature interpretation was designed. The method comprises preprocessing, feature extraction, feature ranking, and classification. Initially, image quality was improved through preprocessing, and k-means segmentation was used to identify the lesion area. The texture, color, and shape features of this region were then extracted. These features were further refined through recursive feature elimination (RFE) to optimize them for the classifiers. The classifiers, including a support vector machine (SVM) with four kernels, logistic regression (LR), and Gaussian naive Bayes (GaussianNB), were then applied. Additionally, cross-validation and 100 randomized experiments were designed to guarantee the generalization of the model. The experiments generated explainable feature importance rankings, and importantly, the model demonstrated robust performance across diverse datasets.


Introduction
Skin diseases encompass a wide range of conditions, some of which can progress into severe and fatal skin cancers. The World Health Organization projects that there will be approximately 2.2 million cases of skin cancer by 2025 [1]. Among skin cancers, melanoma stands out due to its very high mortality rate; its prevalence continues to rise, with over 130,000 new cases diagnosed [2]. Research has established a strong link between exposure to sunlight and ultraviolet radiation and the development of melanoma. Prolonged exposure to ultraviolet radiation, along with cellular mutations or genetic defects, can cause normal melanocytes in the basal layer of the epidermis to mutate and proliferate rapidly, culminating in melanoma. Due to its aggressive nature and high mutation frequency, early diagnosis of melanoma is of utmost importance. At the initial stage, the cure rate can exceed 90% [3].
For the accurate early diagnosis of skin diseases, dermoscopic images are often employed. Dermoscopy, a non-invasive imaging technique, provides a magnified view of the deeper layers of the skin, facilitating the analysis and diagnosis of skin lesions and affected areas [4]. The utilization of dermoscopic images enhances medical observation, significantly improving the diagnosis of various skin diseases.
Image processing for the detection, segmentation, and classification of skin lesions encounters various difficulties [5], including: (a) the presence of noise and artifacts such as hairs, bubbles, and blood vessels; (b) irregular, random, and sometimes diffuse edges with low contrast between the lesion and healthy skin; (c) uneven illumination; and (d) variability in image characteristics due to different types of capturing equipment. Because of these challenges, image preprocessing is essential before diagnosing melanoma.
In recent years, the rapid advancement of artificial intelligence and machine learning technologies has garnered significant attention from researchers, leading to the development of automated image processing systems for diagnosing diseases in the medical field. Various image CAD systems have been developed for the diagnostic classification of melanoma, providing valuable assistance to physicians [6].
The CAD system of melanoma based on machine learning typically involves three steps:
1. Segmentation of the skin lesion region;
2. Feature extraction from the skin lesion region;
3. Classification of the extracted features using a classifier.
To achieve more accurate classification results, segmentation is crucial, as it provides precise samples for the classifier. While numerous studies have focused on modifying CAD systems for melanoma classification, few have analyzed the features extracted from melanoma and explored their link with the pathological perspective.
Dermoscopic images can pose challenges for identification due to hair problems or acquisition processes, but these issues can be addressed through image preprocessing. Seena, J. et al. [7] demonstrated that preprocessing significantly impacts the segmentation process of skin lesion areas, leading to more accurate results. Proper segmentation is crucial as it affects the subsequent feature extraction and melanoma identification, making preprocessing necessary. Ashraf, H. et al. [8] and Bakheet, S. et al. [9] resized images and employed simple filtering to address hair effects and noise elimination.
The preprocessing stage primarily aims to improve image quality without altering the size of the skin lesion region. Segmentation can also be integrated into the preprocessing stage, utilizing traditional algorithms like adaptive threshold segmentation [10] and Otsu's clustering-based approach [11].
Automated CAD systems are commonly based on various feature descriptions, including color, texture, and shape [12]. In some studies, color and texture information from dermoscopic images were extracted for melanoma identification, followed by classification using classifiers. Rastgoo, M. et al. [13] proposed a classification framework that compared shape, texture, and color features both locally and holistically to differentiate between melanomas and dysplastic nevi. They utilized SVMs, gradient boosting (GB), and random forest (RF) classifiers, considering single and combined features.
During feature extraction, Bharathi, G. et al. [14] employed color map histogram equalization and a fuzzy system to enhance dermoscopic images, using a genetic algorithm (GA) to optimize extracted texture features for melanoma detection. Nasir, M. et al. [15] utilized the Boltzmann entropy method to select fused texture, color, and shape features, and employed SVM for classification.
Previous research in this field has concentrated on extracting diverse features such as texture, color, and shape to enhance melanoma recognition methods or models. However, few studies have focused on the importance of features and the interpretability of models. For instance, Wahba, M.A. et al. [16] extracted texture features using the GLCM method enhanced by the GLDM technique, but a detailed analysis of these features was lacking. Moldovanu, S. et al. [17] predominantly extracted color features in the BGR color space, leaving an opportunity to explore the HSV color space for broader color conditions. Chatterjee, S. et al. [4] employed the RFE method for feature selection; however, the focus was on optimal features without a comprehensive selection study. Lin et al. [18] discussed the capacity of the RFE method to rank features. Visualizing the RFE method as a gold standard for wrapper-based feature selection, as seen in the work by Sanz, H. et al. [19], remains limited.
The trend leans toward improving accuracy in melanoma identification methods or models. While accuracy is crucial, model generalizability holds equal significance. Explaining the internal role of model features through visualization can increase the credibility of the model. Moreover, bestowing a degree of generalizability on the model through multifaceted design holds valuable research potential. This paper designs a randomized experiment that employs RFE's feature importance ranking and cross-validation for melanoma identification. The goal is to enhance the credibility of the model and ensure its broader applicability. The process involves preprocessing dermoscopic images to enhance lesion regions and remove impurities. Segmentation using the k-means method extracts lesion regions, followed by cropping to isolate background effects. Three feature groups (color, texture, and shape) are extracted from the lesion region and visualized. RFE is used to filter influential features based on the visualization insights. This approach helps in understanding individual feature contributions, explaining the model's performance. Model generalization is ensured through 100-fold randomization and cross-validation during the performance evaluation.
The major contributions are as follows:
1. Preprocessing: employing morphological algorithms, filtering, and image sharpening during preprocessing effectively eliminates noise and artifacts from dermoscopic images. This process enhances image quality and accurately highlights skin lesion regions;
2. Comprehensive feature extraction: the method extracts features from processed images using GLCM for texture, color moments for color, and morphological descriptors for shape, giving a holistic representation of lesion attributes;
3. Interpretable feature selection: the designed RFE feature selection method employs a 100-fold randomization strategy to derive the feature importance ranking. This ranking also serves to elucidate individual feature contributions. Consequently, selecting higher-ranked features combines enhanced credibility with interpretability;
4. Model performance: through feature screening and model evaluation, including tenfold cross-validation and 100-fold randomization, the model's generalization is rigorously ensured.

Methodology
The CAD-system melanoma classification method includes a preprocessing stage, a feature selection stage, and a classification stage. Preprocessing improved the quality of the dermoscopic images, and the skin lesions were segmented and cropped. The extracted texture, color, and shape features (TCS feature set) were carefully analyzed, ranked by their internal contributions, and selected (RFE-TCS feature set). Finally, SVM (with "polynomial (poly)", "radial basis function (rbf)", "linear", and "sigmoid" kernels), logistic regression (LR), and Gaussian naive Bayes (GaussianNB) were used to train and test on the selected optimized feature group. The detailed steps are shown in Figure 1.

Preprocessing of Dermoscopic Images

As noted above, dermoscopic images are affected mainly by noise such as hairs and bubbles, as well as edge and contrast problems. For this purpose, morphological methods [20], filtering, and sharpening were used to solve these problems and improve the quality of the image. Considering the possible damage to the original image during the removal of hairs and other artifacts, a masking technique was used to perform restoration. Figure 2 shows a comparison of dermoscopic images before and after processing; the above-mentioned problems, such as noise, were improved. The k-means clustering algorithm is an unsupervised, partition-based clustering technique known for its fast convergence and easy implementation [21]. It is known to provide locally optimal solutions [22] and is suitable for the segmentation of color images, including separating skin lesion areas from background regions in an image. In dermoscopic images, there is a clear color contrast between lesion areas and normal skin: diseased skin typically appears brown or black, while normal skin appears white or yellow. Given this difference, the RGB color space is effective for identifying skin lesion regions. Therefore, the k-means clustering algorithm was used for RGB color-based segmentation of dermoscopic images [23].
The k-means clustering [24] algorithm was used to segment dermoscopic images with two initial cluster centers representing skin lesions and normal skin regions, respectively. The image masking technique was then used to extract the skin lesion region and ignore the background. To further minimize the background effect, the skin lesion region of the segmented dermoscopic image was cropped. Figure 3 illustrates the segmentation and cropping process.
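As a minimal sketch of this segmentation step, two-cluster k-means on RGB pixels followed by a bounding-box crop can be written in a few lines of NumPy. The toy image, the deterministic darkest/brightest initialization, and the crop logic are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def kmeans_segment(image, iters=10):
    """Two-cluster k-means on RGB pixels; returns a boolean lesion mask.

    The darker cluster is taken as the lesion and the lighter one as
    normal skin, matching the color contrast described in the text.
    """
    h, w, _ = image.shape
    pixels = image.reshape(-1, 3).astype(float)
    # Deterministic initialization: darkest and brightest pixels.
    brightness = pixels.sum(axis=1)
    centers = np.stack([pixels[brightness.argmin()],
                        pixels[brightness.argmax()]])
    for _ in range(iters):
        # Assign each pixel to its nearest center, then recompute centers.
        dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(2):
            if np.any(labels == c):
                centers[c] = pixels[labels == c].mean(axis=0)
    lesion = centers.sum(axis=1).argmin()      # darker cluster = lesion
    return (labels == lesion).reshape(h, w)

# Hypothetical toy image: a dark "lesion" square on lighter "skin".
img = np.full((20, 20, 3), 220.0)
img[5:15, 5:15] = (80.0, 50.0, 40.0)
mask = kmeans_segment(img)
ys, xs = np.where(mask)                        # bounding box for cropping
crop = img[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
```

The crop at the end corresponds to the step of isolating the lesion from the background before feature extraction.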

Feature Extraction

Texture Feature
Texture characterization [25,26] captures changes in surface or structural patterns in an image. The gray-level co-occurrence matrix (GLCM) [27,28] is a fundamental method for analyzing texture features, accurately revealing the roughness and repetition direction of the texture. In total, six feature parameters were selected: Contrast, Dissimilarity, Homogeneity, Energy, Correlation, and Angular Second Moment (ASM). The co-occurrence matrix allows features to be extracted from images scanned at different orientation angles, as well as from images with different gray levels. The formulas [25-28] are shown in (1)-(6). To explore the effect of this, four direction angles (0°, 45°, 90°, and 135°) and three gray levels (8, 16, and 32) were selected.
Contrast quantifies the change in intensity within an image and represents the difference between the intensity of one pixel compared to another. Mathematically, it is calculated as follows:

Contrast = Σ_{i,j} (i − j)² P(i, j) (1)

Dissimilarity assesses the level of distinction between pairs of pixels with varying gray levels in an image. It is calculated as follows:

Dissimilarity = Σ_{i,j} |i − j| P(i, j) (2)

Angular second moment gauges the uniformity of the distribution of gray levels and the texture thickness within an image. This metric is calculated as follows:

ASM = Σ_{i,j} P(i, j)² (3)

Homogeneity quantifies the extent of variation in the textural components of the image, particularly in its uniformity. It is calculated using the following formula:

Homogeneity = Σ_{i,j} P(i, j) / (1 + (i − j)²) (4)

Energy measures the stability of gray-level variations within the texture of an image. The calculation for energy is as follows:

Energy = √ASM (5)

Correlation evaluates the similarity of gray levels within an image, either along rows or columns. The calculation for correlation is as follows:

Correlation = Σ_{i,j} (i − μ_i)(j − μ_j) P(i, j) / (σ_i σ_j) (6)

In these texture features, i and j are the row and column coordinates in the GLCM, and P(i, j) denotes the normalized frequency of occurrence of the corresponding pixel pair. The means μ_i and μ_j are the average values along each axis of the GLCM, while the standard deviations σ_i and σ_j represent the dispersion or spread of those values.
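Formulas (1)-(6) can be checked with a small, self-contained NumPy sketch that builds a GLCM for a single pixel offset and evaluates the six descriptors. The offset, the quantization scheme, and the toy gradient image are illustrative assumptions; the paper computes these over four angles and three gray levels.

```python
import numpy as np

def glcm_features(gray, levels=8, dx=1, dy=0):
    """GLCM for one offset (dx, dy) plus the six texture descriptors.

    gray: 2-D array with values in [0, 255]; quantized to `levels` bins.
    """
    q = (gray.astype(float) * levels / 256).astype(int)
    glcm = np.zeros((levels, levels))
    h, w = q.shape
    # Count co-occurrences of gray-level pairs at the given offset.
    for y in range(h - dy):
        for x in range(w - dx):
            glcm[q[y, x], q[y + dy, x + dx]] += 1
    p = glcm / glcm.sum()                      # normalize to probabilities
    i, j = np.indices((levels, levels))
    mu_i, mu_j = (i * p).sum(), (j * p).sum()
    sd_i = np.sqrt(((i - mu_i) ** 2 * p).sum())
    sd_j = np.sqrt(((j - mu_j) ** 2 * p).sum())
    asm = (p ** 2).sum()
    return {
        "contrast": ((i - j) ** 2 * p).sum(),
        "dissimilarity": (np.abs(i - j) * p).sum(),
        "homogeneity": (p / (1 + (i - j) ** 2)).sum(),
        "ASM": asm,
        "energy": np.sqrt(asm),
        "correlation": ((i - mu_i) * (j - mu_j) * p).sum() / (sd_i * sd_j),
    }

# Toy image: a horizontal gray-level ramp repeated over 8 rows.
gray = np.tile(np.arange(0, 256, 32), (8, 1))
feats = glcm_features(gray, levels=8)
```

In practice, a library implementation such as scikit-image's `graycomatrix`/`graycoprops` would be used for the four angles and three gray levels described above; this sketch just makes the definitions concrete.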

Color Feature
Color moments were used in [29] to extract the color features of the skin lesion region in dermoscopic images. The parameters are the first, second, and third moments of color. Formulas (7)-(9) for the calculation of the color moments [30] are as follows. The first moment quantifies the average intensity of an image, reflecting its overall intensity distribution. It is calculated as follows:

First moment = (1/N) Σ_j P(i, j) (7)

The second moment provides insight into the range of the color distribution within an image and offers information about patterns and contrasts. It is calculated as follows:

Second moment = [(1/N) Σ_j (P(i, j) − First moment)²]^(1/2) (8)

The third moment conveys the symmetry of the color distribution in an image, indicating the balance and arrangement of colors. It is calculated as follows:

Third moment = [(1/N) Σ_j (P(i, j) − First moment)³]^(1/3) (9)

Here, i indexes the color channel, j indexes the pixels, P(i, j) is the value of pixel j in channel i, and N is the total number of pixels in the image. The six color channels R, G, B, H, S, and V were separated under the RGB and HSV color spaces, respectively, and color features were then extracted from each single-color channel using color moments.
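Formulas (7)-(9) amount to the mean, standard deviation, and signed cube root of the third central moment of each channel. A minimal sketch, with a hypothetical constant-channel toy image standing in for a lesion crop:

```python
import numpy as np

def color_moments(channel):
    """First three color moments of a single color channel.

    Returns the mean (first moment), the standard deviation (second
    moment), and the cube root of the third central moment.
    """
    v = channel.astype(float).ravel()
    n = v.size
    first = v.sum() / n
    second = np.sqrt(((v - first) ** 2).sum() / n)
    third = np.cbrt(((v - first) ** 3).sum() / n)  # cube root keeps the sign
    return first, second, third

# Six channels (R, G, B, H, S, V) x three moments = 18 color features.
# matplotlib.colors.rgb_to_hsv is one convenient option for the HSV
# channels; only the RGB channels of a toy image are shown here.
img = np.zeros((4, 4, 3))
img[..., 0] = 100.0                                # constant red channel
features = [m for c in range(3) for m in color_moments(img[..., c])]
```

Applied to all six channels, this yields the 18 color features referenced later in the feature-selection section.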

Shape Feature
The shape feature is a visual description of the lesion area and includes various metrics such as area and perimeter [30]. The area feature (A) quantifies the total number of pixels within the lesion area, while the perimeter feature (P) quantifies the total number of contour pixels on the boundary of the lesion area. Based on these two parameters, other shape descriptors, such as dispersion, saturation, and roundness, can be derived, as shown in (10)-(12) [31]. In the shape characterization process, dispersion, saturation, and roundness were chosen as descriptors to fully present the shape characteristics of the lesion area.
Dispersion characterizes the extent of spread of a region in an image, describing how the elements within the region are distributed. It is the ratio of the square of the perimeter of the lesion area to its area:

Dispersion = P² / A (10)

Saturation, which can also be referred to as convexity, pertains to the shape of the region's boundary and describes how closely the region resembles a convex shape. It is the ratio of the area to the perimeter of the lesion region:

Saturation = A / P (11)

Circularity (or roundness) evaluates how closely the shape of a region resembles a circle, providing insight into the compactness of the region:

Circularity = 4πA / P² (12)
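The three shape descriptors above can be computed directly from a binary lesion mask. In this sketch, area counts foreground pixels and perimeter counts boundary pixels (foreground pixels with a 4-connected background neighbour); this pixel-counting approximation of the contour length, and the square toy mask, are illustrative assumptions.

```python
import numpy as np

def shape_features(mask):
    """Dispersion, saturation, and circularity from a binary lesion mask."""
    m = mask.astype(bool)
    area = m.sum()
    padded = np.pad(m, 1)
    # A pixel is interior if all four 4-connected neighbours are foreground.
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1]
                & padded[1:-1, :-2] & padded[1:-1, 2:])
    perimeter = (m & ~interior).sum()          # boundary pixel count
    return {
        "dispersion": perimeter ** 2 / area,           # P^2 / A
        "saturation": area / perimeter,                # A / P
        "circularity": 4 * np.pi * area / perimeter ** 2,
    }

# Hypothetical 10 x 10 square lesion inside a 12 x 12 image.
mask = np.zeros((12, 12), dtype=bool)
mask[1:11, 1:11] = True
feats = shape_features(mask)
```

For a square, area is 100, the boundary has 36 pixels, and the descriptors follow directly from formulas (10)-(12).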

Feature Selection by RFE Ranking
According to the above, a total of 93 features (72 texture features, 18 color features, and 3 shape features) were extracted from the skin lesion area. These features were combined into a feature vector matrix named TCS for melanoma classification. To simplify the computational process, the range of values for the color and shape parameters was narrowed down.
RFE was used to select features and determine their importance. RFE selects the best set of features by iteratively eliminating the weakest ones; after repetition, the best features are wrapped [4]. In addition, the importance and contribution of each feature were obtained at each elimination step [31], so the features could be ranked according to the size of their contribution. To select features more reliably, a 100-fold randomization method was designed: the dataset was shuffled before each training run, and the final ranking was calculated from all of the individual rankings. The features with high contributions were selected and combined into a new feature vector matrix, called RFE-TCS, which was specifically used for melanoma classification.
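A sketch of this ranking scheme using scikit-learn's RFE with a linear-kernel SVM is shown below. The synthetic data stands in for the 93-feature TCS matrix, the repetition count is reduced from 100 to 5, and the per-repeat subsampling is an illustrative way of making the shuffles matter; none of these choices are the paper's exact settings.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

# Hypothetical stand-in for the 93-dimensional TCS feature matrix.
X, y = make_classification(n_samples=200, n_features=20,
                           n_informative=5, random_state=0)

n_repeats = 5                 # the paper uses 100 randomized repetitions
rank_sum = np.zeros(X.shape[1])
rng = np.random.default_rng(0)
for _ in range(n_repeats):
    idx = rng.permutation(len(y))[:160]    # random 80% subsample per repeat
    rfe = RFE(SVC(kernel="linear"), n_features_to_select=1)
    rfe.fit(X[idx], y[idx])
    rank_sum += rfe.ranking_               # ranking_: 1 = most important
mean_rank = rank_sum / n_repeats
order = np.argsort(mean_rank)              # feature indices, best first
```

Features at the front of `order` would then be retained to form the RFE-TCS set.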

Methodological Process Design
SVM [32] is a supervised machine learning method widely used for classification and regression tasks [33]. It is particularly effective for binary classification problems. The goal is to identify the support vectors, i.e., the subset of samples closest to the separating hyperplane, which define the margin between the classes.
One of the strengths of SVM [34] is its stability, even with small sample sizes. It can handle different classification problems and optimize the classification results by choosing appropriate kernel functions. For this experiment, the "poly", "rbf", "linear", and "sigmoid" kernels were chosen to train and classify the experimental dataset to obtain the best classification model.
Logistic regression [35] serves as a widely employed classification algorithm that is particularly suitable for binary classification tasks. Hence, it is embraced as a classifier model. Additionally, naive Bayes classifiers employ Bayes' theorem for classification. Thus, the Gaussian naive Bayes classifier was selected [36].
The entire experiment was conducted in Python, employing the machine learning techniques described above. The training phase involved utilizing the extracted features to train the melanoma classification model; the workflow of this machine learning process is depicted in Figure 4. The dataset was divided into two subsets: an 80% portion designated as the training set and the remaining 20% as the test set.
(1) The initial step used the complete feature set (TCS) as input features for the machine learning process. The recursive feature elimination (RFE) algorithm was employed for feature ranking and selection. To facilitate an internal visualization of these features, a 100-fold randomization method was employed for the statistical analysis. The outcomes of the feature selection process were then subjected to comprehensive statistical analysis, culminating in a ranking of the importance of each feature. Based on this ranking, features with high importance were selectively chosen to create a new feature set termed "RFE-TCS";
(2) The repertoire of machine learning methods encompassed SVM (using the polynomial, radial basis function, linear, and sigmoid kernels), logistic regression, and a Gaussian naive Bayes classifier;
(3) The train/test division was repeated 100 times, with each instance involving a random shuffling of the dataset. All models were cross-validated using tenfold cross-validation. As part of the evaluation protocol, performance metrics such as average classification accuracy and the area under the curve (AUC) were computed.
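The repeated-holdout-plus-internal-cross-validation protocol can be sketched with scikit-learn. The synthetic data, the rbf-kernel classifier, and the reduced repeat count (5 instead of 100) are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import SVC

# Hypothetical stand-in for the RFE-TCS feature matrix and labels.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

accs = []
for seed in range(5):                      # the paper repeats this 100 times
    # Shuffle and split 80/20 with a fresh random state each repeat.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, shuffle=True, random_state=seed)
    clf = SVC(kernel="rbf")
    # Internal tenfold cross-validation on the training split.
    cv_acc = cross_val_score(clf, X_tr, y_tr, cv=10).mean()
    clf.fit(X_tr, y_tr)
    accs.append(clf.score(X_te, y_te))     # holdout accuracy
mean_acc = float(np.mean(accs))
```

Averaging the holdout accuracies over all repeats yields the reported average classification accuracy.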

Results and Discussion
In the feature selection process, each random shuffle produced one optimized feature combination, and statistics were computed over the 100 resulting combinations. The number of times a feature appeared in an optimized combination was recorded as its frequency, and the feature ranking was counted and computationally analyzed based on all of the results.
In the classification training stage, the evaluation metrics used to assess the performance of the CAD-system-based melanoma lesion identification method were accuracy, sensitivity, and specificity. The ROC-AUC [37] curve was also drawn. These metrics are commonly used in medical image analysis to measure the effectiveness of classification models. The formulas for accuracy, sensitivity, and specificity [38] are shown in (13)-(15):

Accuracy = (TP + TN) / (TP + TN + FP + FN) (13)

Sensitivity = TP / (TP + FN) (14)

Specificity = TN / (TN + FP) (15)

where TP is the number of melanoma lesions correctly identified as melanoma, TN is the number of benign lesions correctly identified as non-melanoma, FP is the number of benign lesions incorrectly identified as melanoma, and FN is the number of melanoma lesions incorrectly identified as benign.
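As a concrete check of formulas (13)-(15), the three metrics follow directly from the four confusion-matrix counts; the example counts below are hypothetical.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # true-positive rate (melanoma recall)
    specificity = tn / (tn + fp)   # true-negative rate (benign recall)
    return accuracy, sensitivity, specificity

# Hypothetical counts for a 400-image balanced test set.
acc, sen, spe = confusion_metrics(tp=90, tn=85, fp=15, fn=10)
```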
The ISIC dataset is widely recognized as the largest public database for research in dermoscopic image analysis. In the experiment, a total of 200 dermoscopic images of melanoma and 200 dermoscopic images of benign nevus were selected from the ISIC 2019 database. This dataset consists of 400 images in total, with an equal representation of both melanoma and benign nevus cases. The balanced dataset allows for the fair and reliable evaluation of the CAD system's performance in distinguishing between melanoma and benign lesions.
The PH2 dataset, introduced and detailed by Mendonca et al., comprises 200 dermoscopic images. This dataset categorically segregates images into melanomas and benign nevi.
Melanoma has a presentation relatively similar to that of benign nevi [39], and the classification between melanoma and benign nevi is mainly designed to minimize misclassification.

RFE Ranking Explains the Importance of Features
The results of ranking the importance of texture, color, and shape features in melanoma identification using the RFE method are shown in Figure 5. This ranking is the feature importance ranking. The ranking in Figure 5 shows that texture features occupy most of the high positions. In Popecki's study [40], although the key role of texture features was observed, it was not specifically discussed or explained. The feature importance ranking obtained by the designed methodology explains why texture features are key features. This feature importance ranking method provides more insights and bridges the gap in feature selection left by previous studies. Moreover, the ranking, obtained from the 100-fold randomization experiments that effectively highlighted the contribution of features, is also more convincing.
The performance results in Tables 1 and 2 and Figures 6 and 7 affirm the filtered feature set's effectiveness, underscoring the optimized features' role in enhancing melanoma recognition. The RFE feature importance rankings are plausible and offer explanatory potential for model outcomes.

Generalization Discussion of the Classifiers Models
In Table 3, the PH2 dataset was utilized for the experimental procedures. Specifically, the individual classifiers trained on the ISIC-selected dataset were employed to test the PH2 dataset. The results, as presented in the table, demonstrate consistent and stable performance across all individual classifiers. Particularly, each classifier exhibited an observable enhancement in accuracy. These findings serve as a positive indication of the generalized nature of the proposed method's model.
To evaluate the generalization capability of SVM models, the application of cross-validation techniques is a common practice [41,42]. Prior studies have effectively demonstrated the generalization achieved through cross-validation [43] by training SVM models on a discovery dataset and subsequently assessing their performance using a distinct replicated dataset [44]. In this approach, internal tenfold cross-validation [45] was employed to optimize model performance.
Furthermore, the model's capacity to generalize was further assessed using an external method involving 100 randomized replicate instances, which utilized a holdout approach. This methodology was chosen to enhance control over the model's generalization capability and provide a comprehensive evaluation of its overall performance.

Comparative Performance with Other Models
Table 4 presents a comparative assessment of the accuracy achieved by different models in the context of melanoma identification. In the existing literature, GaussianNB [33], LR [46], RF [47], and KNN [48] classifiers have been reported to achieve accuracy rates of 65.93%, 72%, 74.28%, and 75.00%, respectively. The proposed SVM model, employing the specially designed radial basis function (rbf) kernel, exhibits higher recognition accuracy when compared to other prevalent machine learning classifiers.
In addition to its capacity for automated melanoma recognition, the proposed method goes a step further by visually illustrating the contributions of individual features. This visualization not only bolsters confidence in the model but also incorporates the use of randomized experiments to ensure the model's generalizability. This distinctive approach sets it apart from the other methods, providing a transparent and reliable model. Some studies [49] have explored the use of deep learning neural networks in achieving greater accuracy. This is an approach that may be considered in the future.

Limitations
Although the features were extended in a number of ways, there are still other feature selection approaches that can be adopted in future work. In terms of model performance, LR and GaussianNB perform relatively better here than in previous studies. However, although the proposed model outperforms the SVM in [48], its accuracy can still be improved. There is still room for improvement in the final model's performance; the next step will be to use different modeling design methods to improve the accuracy of the results.

Conclusions
This study presents an interpretable machine learning method for melanoma recognition that emphasizes the importance of features. The method starts with pre-processing, feature extraction (texture features using the GLCM, color features using color moments, and shape features using morphological techniques), and feature ranking of dermoscopic images. Subsequently, SVM classifiers (poly, rbf, linear, sigmoid), logistic regression, and Gaussian naive Bayes machine learning techniques are used for training.
An important aspect of the method is the visual ranking of feature contributions by RFE during the feature-extraction phase. The obtained feature importance ranking explains the importance of texture features and makes the feature selection more convincing. In addition, in the classification stage, the model is cross-validated ten times and 100 randomized experiments are used to enhance the credibility of the results and the generalization ability of the model.
The results confirm that the feature set optimized by the feature importance ranking performs better, and the model demonstrated significant generalization ability on different datasets. This approach provides insights into melanoma identification and emphasizes the value of feature selection and interpretation for robust model design.

Figure 1. Block diagram of the proposed melanoma classification method.

Figure 4. The flowchart of machine learning. RFE was used to rank the importance of features. The classification algorithms use SVM (poly, rbf, linear, sigmoid), LR, and GaussianNB. This method was randomly repeated 100 times, with 80% of the data used for training and 20% for testing.

Figure 6. Comparison of classification performance of different algorithms by adjusting the RFE-TCS feature set.

Figure 7. Classification performance and AUC for the 100 randomized experiments.

Table 1. Performance of original features in different classifiers.

Table 2. Performance of selected features in different classifiers.

Table 3. Performance of selected features on the PH2 dataset.

Table 4. Performance comparisons of the proposed model with other recent models.