A Measurement Software for Professional Training in Early Detection of Melanoma

Featured Application: An original software measurement system has been developed as a digital tool supporting the training of physicians during post-graduate courses in dermatology and the corresponding clinical activity, in order to increase the diagnostic performance (mainly in terms of sensitivity) in the early detection of cutaneous melanoma when suspicious pigmented lesions are examined. To pursue this goal, the system automatically processes the ELM images of melanocytic lesions (acquired through digital dermoscopy) according to a well-known diagnostic method. As an outlook, in order to boost massive screening campaigns, the results of the automatic image processing could be adopted as a dermoscopic triage carried out in suitable tele-medicine projects involving general practitioners and/or pharmacists (for real-time image acquisition) and young dermatologists (for off-line lesion classification).

Abstract: Software systems have long been introduced as support to the early detection of melanoma through the automatic analysis of suspicious skin lesions. Nevertheless, their behavior is not yet similar to the performance exhibited by expert dermatologists in terms of diagnostic accuracy. Instead, a software system should be adopted by non-experienced dermatologists in order to improve the measurement and detection results for atypical skin patterns and the accuracy of the corresponding second opinion. This paper describes an image-based measurement and classification system able to score pigmented skin lesions according to the Seven-Point Checklist diagnostic method. Focus is devoted to the measurement procedure of the biological structures more closely related to the atypical character of the nevus. Moreover, the performance of the measurement system is evaluated by considering the support provided to dermatologists with different levels of experience during the clinical activity.


Introduction
Malignant melanoma is one of the leading cancers worldwide, and its dramatic number of diagnoses also seems to be influenced by both changes in recreational behavior and increasing ultraviolet radiation. According to the World Health Organization (WHO), there are more new cases of skin cancer than the combined incidence of cancers of the breast, prostate, lung, and colon [1]. In detail, melanoma is characterized by a biphasic growth which, at the beginning, evolves "horizontally", forming a stratified lesion above the basement membrane (intra-epidermal [...]

Regarding the training and testing of the automatic system, a dermoscopic image set of more than 600 skin lesions was adopted; it includes both items randomly extracted from the reference atlas [20] and cases examined and recorded during the clinical activity at the Department of Clinical Medicine and Surgery, University of Naples Federico II. As a result, the main biological structures (corresponding to the dermoscopic criteria included in the 7-Point Checklist) are highlighted by applying to each lesion the image processing techniques briefly reported in Tables 1 and 2. As a further step, a statistical analysis is performed with the aim of providing measurement information about the reliability of the adopted procedure for the detection of each criterion. Thus, a label (low, medium, or high) is assigned to each detected dermoscopic structure according to different ranges of the corresponding metric value, which is introduced to qualify the processing algorithm. In the following sections, the ad-hoc classification algorithms are briefly reported for all the dermoscopic criteria, as well as the statistical analysis proposed to evaluate the reliability of the automatic detection.

Color-Based Detection of Dermoscopic Criteria
The automatic measurement and classification of the dermoscopic structures most dependent on the chromatic distribution of the image pixels (i.e., blue-whitish veil, regression, and irregular pigmentation) is performed through suitable logistic model trees (LMTs) [21]. In detail, the machine learning technique is adopted to classify the lesion map, i.e., the set of the chromatically homogeneous regions (low-level structures) resulting from segmentation performed through either principal component analysis (PCA) [15] or statistical region merging (SRM) [22]. The vector x of the feature descriptors for each region includes the red, green, and blue (RGB), the hue, saturation, and intensity (HSI), and the CIELUV components of the corresponding pixels; in addition, the percentage ratio of the regional pixels with respect to the total number of the lesion map pixels is considered.
The result of the LMT training is depicted in Figure 2a regarding the automatic detection of the blue-whitish veil criterion: according to different ranges for the hue mean value of the region to be analyzed, three patterns (blue, red, or polychromatic) are highlighted by the corresponding logistic regression models. On the basis of the mean and standard deviation for the hue (σ_H), saturation (µ_S, σ_S), and intensity (µ_I, σ_I) components, the functions F_i(x) are adopted to compute the probability P_i that each segment of the lesion map (see Figure 2b) belongs to an area characterized by the blue-whitish veil according to:

$$P_i = \frac{e^{F_i(x)}}{\sum_{j=1}^{2} e^{F_j(x)}} \qquad (1)$$

where the classes i = 1 and i = 2 correspond respectively to presence and absence of the dermoscopic criterion. An example of detection result is depicted in Figure 2c.
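As a minimal sketch of how the class probabilities P_i can be obtained from the leaf functions F_i(x), the snippet below assumes the softmax link conventionally used by logistic model trees; the function name and the numeric inputs are hypothetical and do not come from the trained models of this work.

```python
import math


def lmt_class_probabilities(f_values):
    """Map leaf-model outputs F_i(x) to class probabilities P_i (softmax link)."""
    shift = max(f_values)  # subtract the maximum for numerical stability
    exps = [math.exp(f - shift) for f in f_values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical outputs for one lesion-map segment:
# F_1 (criterion present) and F_2 (criterion absent).
p_present, p_absent = lmt_class_probabilities([1.2, -1.2])
```

The same function accepts any number of classes, so it also covers a three-class problem when three leaf functions are supplied.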
The segments of the lesion map not associated with the blue-whitish veil are then classified by further LMT models as areas of either regression (see Figure 3) or irregular pigmentation. The former model computes the corresponding probability for each chromatically homogeneous region by adopting a simple logistic regression, which takes into account only the mean value (l) for the L component, the neighbor difference for the intensity (i_d) and saturation (s_d) components, and the area percentage A% as truly significant attributes. In detail, irregular pigmentation is detected (see Figure 4) if P_1 > 0.5 for at least one segment of the lesion map (which is then classified as irregular); the reliability of the automatic detection is then evaluated by comparing the probability P_{1,S_irr} associated with the widest irregular segment S_irr against predefined ranges. Analogous computation and thresholds, concerned with the probability P_1 associated with the widest color segment, have been adopted to provide the reliability of the automatic detection for the blue-whitish veil and the regression.
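The detection rule and the reliability labeling described above can be sketched as follows. The decision threshold P_1 > 0.5 and the use of the widest irregular segment come from the text; the range boundaries `low_max` and `medium_max` are hypothetical placeholders, since the actual ranges are not reproduced here.

```python
def detect_irregular_pigmentation(segments, low_max=0.7, medium_max=0.9):
    """Detect the criterion and label the reliability of the detection.

    segments: list of (area_percentage, p1) pairs, one per lesion-map segment.
    low_max, medium_max: hypothetical range boundaries (assumptions).
    """
    irregular = [s for s in segments if s[1] > 0.5]  # segments classified as irregular
    if not irregular:
        return False, None
    # The reliability is driven by the probability of the widest irregular segment.
    _, p_widest = max(irregular, key=lambda s: s[0])
    if p_widest < low_max:
        label = "low"
    elif p_widest < medium_max:
        label = "medium"
    else:
        label = "high"
    return True, label


detected, reliability = detect_irregular_pigmentation(
    [(40.0, 0.30), (25.0, 0.95), (10.0, 0.60)]
)
```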

Texture-Based Detection of Dermoscopic Criterion
A vector x of 69 structural, geometric, and chromatic features was extracted to describe the lesion texture of interest by adopting the graph-based approach introduced in [23] and the iterative loop counting algorithm suggested in [24]. As an example, Figure 5 highlights the darker mesh of the pigmented network (the "net") and the lighter colored areas (the "holes") surrounded by the net. Thus, the automatic detection of the dermoscopic criterion appears as a three-class problem (absent/typical/atypical classification). Again, the logistic model tree approach is adopted as the solution to compute the probabilities P_i for the pigmented network according to the generalization of Equation (1), with i = 1, 2, and 3 corresponding respectively to the atypical, typical, and absent classes.
The atypical pigmented network is detected on the basis of these probabilities, whereas the detection reliability is estimated by assigning one of the labels (low, medium, or high).

Atypical Vascular Pattern (AVP)
Automatic detection of the dermoscopic criterion integrates structural analysis and chromatic measurement of the lesion: the main linear/globular structures are selected through image enhancement as the tubularness filter response [25] and matched with the red segments resulting from the application of the statistical region merging at a fine level (Q = 256) to the inner area (see Figure 6). The spatial distribution of the candidate low-level features is then evaluated with respect to the main symmetry axes of the lesion. Indeed, a binomial distribution is expected when the red linear/globular structures are randomly scattered over the 4 quadrants (resulting from drawing the major and minor axes of the ellipse characterized by the same normalized second central moments as the lesion). Thus, the analysis of the atypical vascular pattern is carried out by performing a binomial test for the N highlighted linear/globular objects. If, in any quadrant and/or pair of quadrants, a deficit or an excess (with respect to the expected value) of the low-level features is measured, the null hypothesis (i.e., the random distribution of the N linear/globular objects) may be rejected with the accepted risk α of Type I error, and the atypical vascular pattern is detected.
Moreover, the reliability of the classification procedure is labeled (low, medium, or high) according to a dedicated scheme.
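The quadrant-based binomial test can be illustrated with the sketch below. It assumes an exact two-sided test with success probability 0.25 for a single quadrant and 0.5 for a pair of quadrants, and a hypothetical significance level α = 0.05; no correction for multiple comparisons is applied, which is a simplification of this example and not a statement about the original procedure.

```python
from math import comb


def binomial_two_sided_p(k, n, p):
    """Exact two-sided binomial p-value: total probability of the outcomes
    that are no more likely than the observed count k."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    return min(1.0, sum(q for q in pmf if q <= observed * (1 + 1e-9)))


def is_spatially_atypical(quadrant_counts, alpha=0.05):
    """Reject random scattering if any quadrant, or any pair of quadrants,
    holds significantly too few or too many of the N objects."""
    n = sum(quadrant_counts)
    if n == 0:
        return False
    for count in quadrant_counts:
        if binomial_two_sided_p(count, n, 0.25) < alpha:
            return True
    for i in range(4):
        for j in range(i + 1, 4):
            pair = quadrant_counts[i] + quadrant_counts[j]
            if binomial_two_sided_p(pair, n, 0.5) < alpha:
                return True
    return False


balanced = is_spatially_atypical([5, 5, 5, 5])    # evenly scattered objects
clustered = is_spatially_atypical([20, 0, 0, 0])  # all objects in one quadrant
```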

Irregular Dots and Globules
Automatic detection of small dark areas within the lesion is provided through the analysis of the chromatic and morphological measurements of the segments highlighted, as previously introduced, by the statistical region merging at a fine level. In detail, the chromatically uniform regions (low-level features) are ordered by increasing intensity, within a suitable range for the hue component. Then, based on the experimental testing and tuning activity, a thresholding operation is performed with respect to the morphological feature descriptors represented by the region percentage area A% and eccentricity e, in order to select N significant rounded objects inside the lesion. Again, the spatial distribution of the observed items is analyzed with respect to the main symmetry axes of the lesion. If the number of dark round objects in each quadrant is outside the range expected from the binomial distribution, the null hypothesis of the uniform distribution is rejected, and the corresponding low-level features are classified as irregular dots and globules.
The reliability of the automatic detection is estimated as follows:

$$\text{High} \quad \text{if } N \geq 50 \qquad (7)$$
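A possible way to select the rounded objects and count them per quadrant is sketched below. The orientation of the major axis is derived from the second central moments of the lesion pixels; the thresholds `max_area_pct` and `max_eccentricity` and the region dictionary layout are hypothetical, and the preliminary ordering by intensity and hue range is omitted for brevity.

```python
import math


def principal_axis(lesion_pixels):
    """Centroid and major-axis orientation from the second central moments."""
    n = len(lesion_pixels)
    cx = sum(x for x, _ in lesion_pixels) / n
    cy = sum(y for _, y in lesion_pixels) / n
    mu20 = sum((x - cx) ** 2 for x, _ in lesion_pixels) / n
    mu02 = sum((y - cy) ** 2 for _, y in lesion_pixels) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in lesion_pixels) / n
    theta = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    return (cx, cy), theta


def quadrant_counts(lesion_pixels, regions, max_area_pct=1.0, max_eccentricity=0.8):
    """Count small rounded regions in the 4 quadrants defined by the lesion axes.

    regions: dicts with 'centroid', 'area_pct', and 'eccentricity' keys.
    The two thresholds are hypothetical tuning values (assumptions).
    """
    (cx, cy), theta = principal_axis(lesion_pixels)
    ux, uy = math.cos(theta), math.sin(theta)
    counts = [0, 0, 0, 0]
    for region in regions:
        if region["area_pct"] > max_area_pct or region["eccentricity"] > max_eccentricity:
            continue  # too large or too elongated to be a dot/globule
        dx = region["centroid"][0] - cx
        dy = region["centroid"][1] - cy
        along_major = dx * ux + dy * uy
        along_minor = -dx * uy + dy * ux
        index = (0 if along_major >= 0 else 1) + (0 if along_minor >= 0 else 2)
        counts[index] += 1
    return counts


lesion = [(x, y) for x in range(10) for y in range(4)]
regions = [
    {"centroid": (8, 3), "area_pct": 0.5, "eccentricity": 0.30},
    {"centroid": (1, 0), "area_pct": 0.4, "eccentricity": 0.50},
    {"centroid": (8, 0), "area_pct": 0.2, "eccentricity": 0.10},
    {"centroid": (1, 3), "area_pct": 0.3, "eccentricity": 0.95},
]
counts = quadrant_counts(lesion, regions)
```

The resulting per-quadrant counts are the quantities on which a binomial test of random scattering would operate.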

Irregular Streaks
Two approaches, both based on the adoption of color segmentation and structural analysis, are provided for the automatic detection of linear or bulbous extensions asymmetrically arranged at the edge of the lesion.
The former approach (depicted in Figure 7) is able to: (i) detect the black/brown pigmentation localized along the lesion periphery (through the statistical region merging, see Figure 7b); and (ii) track the finger-like contour of the highlighted structure (which is split into 10 equal-length segments). Finally, a morphological irregularity index is determined as the ratio between the number of pixels constituting the lesion contour and the shortest path (i.e., the set of points belonging to the line that connects the farthest contour pixels in the region, see Figure 7c) and compared with a suitable threshold (experimentally estimated during the training phase) to determine the set S_IRR of candidate segments with streaks.
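The irregularity index can be sketched as below for a single contour segment. The example assumes that the shortest path is the digital straight line joining the two farthest contour pixels, whose pixel count is max(|dx|, |dy|) + 1; this reading of the definition, as well as the function name, is an assumption of the sketch.

```python
from itertools import combinations


def irregularity_index(contour_pixels):
    """Ratio between the number of contour pixels and the number of pixels on
    the straight line joining the two farthest contour pixels."""
    (x1, y1), (x2, y2) = max(
        combinations(contour_pixels, 2),
        key=lambda pair: (pair[0][0] - pair[1][0]) ** 2 + (pair[0][1] - pair[1][1]) ** 2,
    )
    chord_pixels = max(abs(x1 - x2), abs(y1 - y2)) + 1  # pixels on a digital line
    return len(contour_pixels) / chord_pixels


straight = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
finger = [(0, 0), (0, 1), (0, 2), (0, 3), (1, 3), (2, 3), (2, 2), (2, 1), (2, 0)]
index_straight = irregularity_index(straight)
index_finger = irregularity_index(finger)
```

A straight segment yields an index of 1, whereas a finger-like contour yields a larger value that would then be compared with the experimentally tuned threshold.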

According to the latter approach, the flux analysis of the streaks' principal curvature vectors [26] is performed by adopting the Frangi filter [25]. In detail, the parallel and perpendicular flows of the streak vector field are measured over the equal-length segments of the lesion border. The candidate segments with streaks are included in a set S_FRANGI by comparing the corresponding mean values and variances of the measured flows with experimentally tuned thresholds.
The presence of irregular streaks and the reliability of the corresponding procedure are determined by combining the outcomes of the two approaches introduced above.

Lesion Classification
A weighted version of the 7-Point Checklist scoring system is suggested for the automatic lesion classification, which takes into account the detection algorithms previously described and the corresponding reliability. In detail, three constants have been associated experimentally with the reliability of the detection procedures (0.2, 0.5, and 1.0, corresponding to the low, medium, and high labels, respectively) and are adopted to weight the partial scores according to the detection uncertainty. Finally, the total score for the skin lesion is obtained by summing the seven weighted scores, and the clinical decision (follow-up or excision) is provided as briefly reported in Table 3. The application of the authors' proposal to an example of a dermoscopic image is depicted in Figure 8.
Table 3. Clinical decision as a function of the total weighted score (surviving entries: atypical nevus, excision; score > 3.0, melanoma, excision).
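The weighted scoring can be sketched as follows. The reliability weights (0.2, 0.5, 1.0) are those stated above; the partial scores are assumed to be those of the classic Seven-Point Checklist (2 for each major criterion, 1 for each minor criterion), and the criterion identifiers are hypothetical names chosen for this example.

```python
# Assumed partial scores of the classic Seven-Point Checklist.
CLASSIC_SCORES = {
    "atypical_pigmented_network": 2,
    "blue_whitish_veil": 2,
    "atypical_vascular_pattern": 2,
    "irregular_streaks": 1,
    "irregular_pigmentation": 1,
    "irregular_dots_globules": 1,
    "regression": 1,
}

# Weights associated with the reliability labels of the detection procedures.
RELIABILITY_WEIGHT = {"low": 0.2, "medium": 0.5, "high": 1.0}


def weighted_total_score(detections):
    """Sum the partial scores of the detected criteria, each weighted by the
    reliability label of its detection.

    detections: dict mapping a detected criterion to 'low', 'medium', or 'high'.
    """
    return sum(
        CLASSIC_SCORES[criterion] * RELIABILITY_WEIGHT[label]
        for criterion, label in detections.items()
    )


score = weighted_total_score(
    {
        "blue_whitish_veil": "high",
        "atypical_pigmented_network": "medium",
        "irregular_streaks": "low",
    }
)
```

In this example the total is 2 × 1.0 + 2 × 0.5 + 1 × 0.2 = 3.2; criteria that are not detected contribute nothing to the sum.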

Results
A database including 270 ELM images of skin lesions and corresponding information (such as patient age and sex, lesion position, histology after excision) was collected during the continuous screening campaign by the Section of Dermatology, University Federico II of Naples, Italy, under the two-year tele-medicine project, "Di che segno è il tuo neo?" (2018-2019), which involved more than 200 patients and 25 local general practitioners. On the basis of a posteriori analysis (biopsy), the dermoscopic database includes:
• 107 benign nevi;
• 99 atypical nevi (73 suspicious nevi);
• 64 melanomas.
The corresponding digital images were acquired through iPhone 6/7 smartphones equipped with two dermoscope models (DermLite DL1 and MetaOptima Molescope I); the dimensions of the JPEG color images range from 700 × 447 to 2272 × 1520 pixels.
Aiming to estimate the performance of the proposed measurement and classification system, two groups of physicians were selected, namely expert dermatologists (ED) and non-experienced dermatologists (NED), and each dermatologist was asked to provide the diagnosis and to apply the Seven-Point Checklist method to each record of the database. In detail, both the diagnostic and the clinical accuracy have been evaluated by weighting the corresponding computation according to the different class dimensions:
• The former performance represents the weighted mean of sensitivity (i.e., the ratio between the number of correctly detected melanomas and the corresponding class dimension, 64) and specificity (i.e., the ratio between the number of correctly detected benign/atypical nevi and the number of non-melanoma lesions, 206);
• The latter performance takes into account the correct decision of both excision (with respect to the group of lesions including melanomas and strongly atypical nevi) and follow-up/ignoring (with respect to the group of benign and suspicious nevi).
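The class-weighted diagnostic accuracy defined in the first item can be written as a short sketch; the function name is hypothetical, and the default class dimensions (64 melanomas, 206 non-melanoma lesions) are those of the database described above.

```python
def diagnostic_accuracy(sensitivity, specificity, n_melanoma=64, n_non_melanoma=206):
    """Weighted mean of sensitivity and specificity, with the class dimensions
    used as weights."""
    total = n_melanoma + n_non_melanoma
    return (n_melanoma * sensitivity + n_non_melanoma * specificity) / total


# Example with a sensitivity of 69% and a specificity of 90%.
accuracy = diagnostic_accuracy(0.69, 0.90)
```

With these inputs the weighted mean is about 0.85, which shows how a high specificity on the larger class can mask a low sensitivity on the smaller one.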
The automatic measurement and classification system (MS) has been implemented as Matlab routines for the different stages (image pre-processing, contour extraction, measurement of low-level features, detection of dermoscopic structures, lesion classification) and executed as distributed code (jobs with multiple parallel tasks) on a 64-bit Microsoft Windows computer equipped with an Intel Core i7 CPU at 2.67 GHz and 6 GB of RAM.
To evaluate the quality of the developed automatic system, the reference detection for the dermoscopic structures of each image was achieved by matching the biopsy results with the Seven-Point Checklist filed by the expert dermatologist (S.C.) who exhibited the best diagnostic and clinical accuracy. In detail, every dermoscopic structure should be detected if the corresponding lesion belongs to either a melanoma or strongly atypical nevus group.
The authors first evaluated the diagnostic concordance between the two dermatologist groups with different degrees of dermoscopy experience and the MS. Then, they considered the same parameters to evaluate the diagnostic performance of NED when helped by the MS during the dermoscopic diagnosis. In detail, after three months, the NED group was asked to re-examine the database which also included both the detection results (images highlighting the dermoscopic structures) corresponding to the single criterion and the overall lesion classification provided by the MS.
Tables 4-6 report, respectively, the Seven-Point Checklist scores and the diagnostic/clinical accuracy achieved by the dermatologist groups and the automatic measurement system.

Discussion
The different levels of experience are evident in the large number of atypical dermoscopic structures detected by the NED group, which did not lead to high diagnostic performance. Indeed, the results showed that in terms of mean sensitivity (>87%), specificity (>92%), and therefore diagnostic accuracy (>92%), ED was the most reliable, forming the "ideal" reference. The diagnostic performance gap between NED and ED is much more important than the difference in the mean value of diagnostic accuracy (less than 7% lower than ED) suggests. This can be easily understood by observing that the mean value of NED's specificity (90%) was maintained at the price of a significantly lower sensitivity (69% compared with 87% for ED), resulting in more missed melanomas.
It is interesting to note that the behavior of the automatic measurement system mirrored that of NED: diagnostic accuracy was almost overlapping, but in this case its value was guaranteed by good performance in terms of sensitivity (90%) with a significant loss of specificity (surplus excisions), with a value of 74% versus 90% for the ED. The low specificity observed for the MS agrees with the literature, which identifies the tendency toward excessive false positives as the main limit of automated systems, confirming, in our opinion, the impossibility of using any software system as a totally autonomous and independent diagnostic tool.
This limitation, however, was considerably reduced by the semi-automatic approach (see the performance reported in Tables 7 and 8), as a result of the reciprocal compensation between sensitivity (better in MS than in NED) and specificity (better in NED than in MS). This compensation was also favored by the interactive nature of the diagnostic decision process: the MS showed the NED the areas of the lesion in which each of the seven parameters of the Seven-Point Checklist had been recognized, so that the dermatologist could agree or disagree with the final score obtained. This explains why, when assisted by the MS, NED greatly improved its diagnostic sensitivity without compromising its specificity, resulting in a gain in diagnostic accuracy. The latter went from 85% for NED to 88% for NED + MS, reaching a value only 4 percentage points below the accuracy of ED.
Similar improvements also hold for clinical accuracy. Cohen's Kappa test allowed us to verify the statistical validity of this data, excluding that the variation of diagnostic accuracy between NED and NED + MS had been purely random. It is important to point out that the MS was, from the beginning, designed with the aim of assisting and not replacing the clinician in the diagnosis of suspected pigmented lesions.
On the basis of the data in the literature, we concluded that an automated system cannot be considered as a totally independent diagnostic tool, but that a non-expert clinician could improve their own diagnostic performance if supported by a system that compensates for the clinician's limits, especially in terms of sensitivity.
Since its introduction in dermatology in 1950, dermatoscopy has spread widely in clinical practice, thanks to its non-invasiveness and easy accessibility. Numerous studies have shown that dermoscopy is able to improve the diagnostic performance of an expert dermatologist "like a bridge" between the clinic and histopathology [27,28]. The definition of dermoscopic criteria has led to the establishment of a real "dermoscopic semeiotic" in which each ELM criterion correlates with specific morphological and histopathological parameters. However, it has been shown that dermoscopy is a counter-productive tool for diagnostic accuracy when used by a NED [10,11]. In fact, dermatoscopy remains a highly operator-dependent method, despite the various dermoscopic algorithms (ABCD, Menzies method, Seven-Point Checklist, Three-Point Checklist) developed in order to simplify and standardize the diagnostic procedures to follow during a dermoscopic examination. For this reason, in the last twenty years, an increasing interest has developed in the automated analysis of dermoscopic images of pigmented skin lesions to discriminate between melanoma and other pigmented skin lesions [29].
Several research groups have dedicated time to developing automated diagnostic systems capable of identifying dermoscopic criteria indicating malignancy [30]. Many of these systems are based on the ABCD algorithm. For example, Umbaugh et al. [31] developed an automated color segmentation algorithm used to identify the characteristics of cutaneous tumors. In 1999, Schmid and Schmid-Saugeon [32,33] proposed a pattern-based color segmentation without extraction. Two years later, Ganster et al. [34] developed an automated recognition system for melanoma, in which 21 different parameters were extracted from the images. More recently, Grana and colleagues [35] introduced a new algorithm based on the gray-scale comparison of points extracted from the ELM images of pigmented lesions. In [12], an algorithm was described to diagnose pigmented lesions by taking into account 64 analytical parameters, while a software for automatically evaluating 50 objective parameters of skin lesions (divided into 3 categories: geometry, texture, and pigmented areas) was proposed in [29]. More than 400 further studies about automated diagnostic systems have been published since 2002; the review in [36] shows that computerized diagnosis systems have higher diagnostic sensitivity than diagnostic specificity, the latter being lower than that of classical dermoscopy practiced by experienced dermatologists, with a comparable overall diagnostic performance. Other reviews about automated melanoma diagnosis were published in 2011 and 2012 [37,38], and confirmed the high diagnostic accuracy of these systems and their remarkable potential for application in the clinical field. No previously published study has analyzed the application of automated systems to the Seven-Point Checklist diagnostic algorithm. The Seven-Point Checklist is still one of the most widely used algorithms in the clinical field [39] for the early detection of melanoma.
We preferred the classic Seven-Point Checklist to the version revised for higher sensitivity. Thus, the classic Seven-Point Checklist is better suited to processing by automated software, which by itself tends to over-detect.

Conclusions
An image-based measurement system is proposed for the automatic detection of melanomas according to a well-known diagnostic method as support to the dermatologist activity. It adopts advanced statistical techniques to perform the main tasks (automatic recognition of the skin lesion within the dermoscopic image, measurement of morphological and chromatic parameters, detection of the dermoscopic structures included in the Seven-Point Checklist method, overall classification of the lesion) necessary for providing a second opinion about the clinical decision.
Main research efforts were addressed both to provide the reliability measurement about the intermediate results (in order to improve the lesion classification capability) and to validate the performance of the measurement system during clinical practice in dermatology departments. As a result, the automatic measurement system was not able to outperform the expert dermatologist group in terms of the diagnostic capabilities. However, it allowed the dermatologists with low dermoscopic experience (<3 years) to improve the sensitivity and overall diagnostic accuracy.
Although further studies are needed to confirm the validity of the preliminary results, the measurement system that we designed could be a valid support tool for the diagnostic process of dermatologists with low dermoscopy experience, bringing their diagnostic accuracy closer to that of an experienced dermatologist. This would result in a significant reduction in the number of melanomas left in place, especially in situ melanomas, whose dermoscopic features are ambiguous and often misleading, particularly for non-expert dermatologists.
To further improve the automatic application of the Seven-Point Checklist, future research will investigate in depth the correlations of the dermoscopic structures among the different classes of skin lesions, and take into account the measurement uncertainty associated with the multi-scale application of the proposed techniques for measuring low-level features. Moreover, convolutional neural networks (CNNs) will also be investigated to reduce the classification uncertainty associated with the dermoscopic criteria included in the 7-Point Checklist [40,41] and consequently improve the specificity of the measurement system.

Conflicts of Interest:
The authors declare no conflict of interest.