1. Introduction
Malignant melanoma (MM) is one of the most common cancers worldwide. It has a favorable prognosis only if the affected area is removed at an early stage. MM reportedly causes the large majority of skin cancer deaths even though it accounts for <2% of skin cancer cases [1]. The incidence of MM has been increasing for >30 years [2], and one of its most ominous characteristics is its high propensity to produce distant metastases, because it can disseminate throughout the body by lymphatic and hematogenous spread. For this reason, early detection and treatment of MM are crucial life-saving measures [3]. Although dermoscopy is a powerful diagnostic technique [4] and the ABCDE rule (asymmetry, border, color, diameter, and evolution) provides a guide to the identification of involved areas [5], pathological examination remains the gold standard for MM diagnosis. However, diagnosis is subjective and highly reliant on the skill level of the pathologist, and interobserver reproducibility of MM diagnosis varies even among experts.
Fractal analysis has been developed as a strategy to improve diagnostic reliability. This method is based on the calculation of the fractal dimension (FD) of the structure of MM cells and their distribution [6,7,8,9,10,11]. Although this is a very attractive idea, the method has three drawbacks. First, the technique is based on self-similarity, that is, a structure recursively containing reduced copies of itself; if no self-similarity is present in the structure, the arbitrarily defined FD is difficult to connect to the structural features under discussion. Second, the FD is calculated from the line length or area of the covered region, and changes in its value are followed by changing the analyzed domain. This procedure limits the ratio between the minimal size of evaluation and the total size, resulting in limited spatial resolution and spatially dependent information. Finally, marker-free phenotyping of tumor cells by fractal analysis of reflection interference contrast microscopy (RICM) images yields only a very small difference in FD values between the different types of MM cells [12]. For example, two types of MM cells in one study had FD values of 1.353 ± 0.004 and 1.312 ± 0.005, which corresponds to only a 3.08% difference relative to the average value of the two groups [12]. Even though these authors reported standard deviations (SDs) as small as 0.005 and 0.004, the FDs in Figure 5 of their paper showed nearly 80% overlap, while their RICM images showed very clear visual differences in the apparent structural features between the two types (Figure 5A of the paper). This means that similar FDs can be obtained even when the image patterns are quite different, which makes the method difficult to apply to clinical diagnosis. This disadvantage motivated us to seek a more reliable and useful diagnostic application of textural structure to skin cancer cells in the present paper.
Melanin carries information about the metabolism and location of melanocytes and melanogenesis; therefore, the melanin distribution could act as a marker for MM [13,14]. The two dominant types of melanin (eumelanin and pheomelanin) absorb a large cross section of visible light without substantial fluorescence emission [15], which makes the MM distribution difficult to image by fluorescence. However, this non-fluorescent property of melanin enables even more sensitive imaging of MM by photothermal (PT) microscopy (PTM), which is the main subject of the present paper.
PTM detects changes in probe light intensity caused by thermal lensing, which arises from local heating of the sample upon absorption of the laser light, and it has demonstrated potential for biological imaging and clinical application. The key advantages of PTM are its high sensitivity and the lack of a requirement for staining [16,17,18,19,20,21]. It facilitates real-time, high-resolution imaging of nanometer-sized absorbers buried among light scatterers with a high signal-to-noise ratio [19,22,23]. However, the PT signal intensity in normal PTM has two extrema in the axial direction [24], which introduces distortions and limits the axial resolution of three-dimensional PT images. Confocal PTM (CPTM), which has a detection scheme similar to that of confocal microscopy, can ameliorate this drawback and improve the axial resolution [24]. Our group has used CPTM to obtain super-resolution microscopic images of neurons in mouse brains [25,26] and of mouse skin MM [27].
In our previous paper [28], we developed a CPTM system and applied it to examine the features of melanin aggregates of both nevus and MM cells in thick (10–15 μm) specimens of a melanoma model mouse. The ret gene is a receptor tyrosine kinase type oncogene, and rfp is a fusion of the 5′-half of the ret gene and a finger structure that is probably capable of binding DNA. RET-transgenic mice of the 304/B6 (RET-Tg) line develop benign melanocytic tumors and MM in a stepwise manner, and they are widely used for the study of melanoma genesis [29,30,31]. The obtained 3D images were analyzed to characterize features of the melanin aggregates such as the density, size, and surface FD. Even though the differences in those features between the two cell types were clearly demonstrated, we were not able to evaluate the diagnostic capability accurately because image quality was limited: the stage scanned slowly, so photodegradation occurred during the acquisition of a single frame. Photodamage was unavoidable at the intensity needed for a high signal-to-noise (S/N) ratio, and it reduced the signal intensity in the later part of each single-frame acquisition.
In this study, we improved the CPTM technique and applied it to noninvasive, label-free imaging of MM in the previously used mouse model, with future application to humans in mind. The scanning method was changed from a piezo-driven stage to a galvanometer, which widened the swept area from 20 × 20 μm² to 72 × 72 μm² and shortened the scanning period from 3 ms to 20 μs. This solved the problem that had prevented us from obtaining images of sufficient quality in the previous study. The performance of the analytical system was tested with a sample of 20 nm gold nanoparticles. Using the PT imaging data, we then analyzed and compared the structural properties of MM and nevus cells using the gray-level co-occurrence matrix (GLCM) method [
32,33,34,35,36,37,38]. We calculated nine different parameters: angular second moment (ASM), Contrast, Correlation, Entropy, inverse difference moment (IDM), Homogeneity, Prominence, Shade, and Variance. These textural parameters were determined by analyzing relevant regions of interest on 12 two-dimensional PT images obtained at two different positions in three tissue sections containing nevus and MM cells. The details of this analysis are described in the following experimental section. Our method provides an objective evaluation, independent of the experience, skill, and knowledge of individual medical doctors, and prognostication at each pathological diagnosis. Thus, GLCM calculation provides a quantitative indicator that may become a “standard” in the future as cases accumulate in real clinical settings from various individuals with a variety of experiences. We first applied difference criterion (DIF) analysis and receiver operating characteristic (ROC) curve analysis. These analyses showed that Entropy, Contrast, and Variance were the most suited for discrimination between MM and nevus with 405 nm excitation, and that Prominence, Variance, and Shade were the most suited with 488 nm excitation. Second, we defined a new index, a “clearness discrimination parameter” (DISC), for the discrimination between nevus and MM cells. This index suggested that Entropy and Homogeneity with the 405 nm pump, and Entropy and Prominence with the 488 nm pump, were the two most suited of the nine parameters for discrimination between nevus and MM cells in a mouse.
This research received no external funding.
3. Results
The images of cell samples of about 1 µm on 1 mm thick microscope slides are shown in Figure 2, Figure 3 and Figure 4. Figure 2 shows a photograph of cutaneous tissues taken from mice with MM. Microscopic regions of 72 × 72 µm² in the PT images were selected from the images for analysis, as shown in Figure 3 and Figure 4. The two bright field images in Figure 3 were obtained with a charge-coupled device (CCD) (DCC1645C, Thorlabs, NJ, USA) for the nevus sample (top left) and the MM sample (bottom left) with a 40× objective lens. The excitation wavelength and power of the LD were 405 nm and 0.3 mW, respectively. The top right and bottom right panels are the PT images obtained with the 405 nm pump (600 × 600 pixels, 120 nm/pixel) of nevus and MM cells, respectively. Red squares in the bright field images show the 72 × 72 μm² areas of the PT images. Four equal-size (18 × 18 μm²) areas segmented from the red square (36 × 36 μm²) portion in the PT images were used for GLCM analysis. In Figure 4, the top left and bottom left panels are the bright field CCD images of nevus samples and MM samples, respectively, obtained with a 40× objective lens and with 488 nm excitation under the same conditions as those in Figure 3. The top right and bottom right panels are the corresponding PT images (600 × 600 pixels, 120 nm/pixel) of nevus and MM samples, respectively. Red squares in the bright field images show the 72 × 72 μm² areas of the PT images. Four equal-size (18 × 18 μm²) areas were segmented out of the four red square areas in the PT images and used for GLCM analysis. Twelve areas from the PT images were evaluated, and 48 calculation regions were selected for both 405 nm and 488 nm excitation.
The textural structure of the PT images of the mouse skin samples containing both nevus and MM cells was analyzed by GLCM. The nine parameters were calculated as shown below. The areas of imaging data within the red lines in Figure 3 at 405 nm excitation were analyzed by the GLCM method, and the 8-bit gray-level intensity distribution of the PT signal is shown in Figure 5. Twelve images of 72 × 72 μm² were also obtained from multiple samples at 488 nm excitation for both nevus and MM samples. The areas of imaging data within the red lines in Figure 4 (left) at 488 nm excitation were analyzed by this method, and the 8-bit gray-level intensity distribution of the PT signal is shown in Figure 6. Four sets of 18 × 18 μm² areas (shown in Figure 3 and Figure 4) with higher intensity were chosen out of the 12 images and analyzed by GLCM. In total, there were 48 images with an area of 18 × 18 μm² from both nevus and MM. This provided sufficient data at the two pump wavelengths to ensure statistical reliability. As shown in the bright field images, the sample areas were selected from various parts of the skin. For this technique to become a standard method, many more samples from a larger number of patients are required. However, as discussed below, we found that a few of the nine GLCM parameters clearly showed the ability to discriminate between nevus and MM and could potentially be used as criteria for pathological diagnosis.
Hereafter, the formalism of GLCM is described [39,40] by showing nine of the most frequently used indices.
In all of the following formulas, P(i, j) stands for the (i, j)th entry or value in a normalized GLCM.
where
for a symmetric matrix.
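As a rough illustration of this formalism, the following Python sketch builds a normalized, symmetric GLCM for a horizontal shift of d pixels and evaluates four of the nine parameters. The function names, the restriction to a single horizontal direction, the small test image, and the textbook (Haralick-style) forms used for ASM, Contrast, Entropy, and IDM are assumptions made for illustration; they are not taken from the analysis code used in this work, and conventions for the remaining parameters vary between implementations.

```python
import math


def glcm(image, d, levels):
    """Return a normalized, symmetric GLCM for a horizontal shift of d pixels.

    image  : list of rows, each a list of integer gray levels in [0, levels)
    d      : shift distance in pixels (d = 1 is 120 nm at 120 nm/pixel)
    levels : number of gray levels (256 for 8-bit data)
    """
    counts = [[0] * levels for _ in range(levels)]
    for row in image:
        for x in range(len(row) - d):
            i, j = row[x], row[x + d]
            counts[i][j] += 1  # pair counted left-to-right
            counts[j][i] += 1  # and right-to-left, which makes the matrix symmetric
    total = sum(sum(r) for r in counts)
    if total == 0:
        raise ValueError("image is too narrow for this shift distance")
    return [[c / total for c in r] for r in counts]


def texture_features(p):
    """Compute ASM, Contrast, Entropy and IDM from a normalized GLCM p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "ASM": sum(v * v for _, _, v in cells),
        "Contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "Entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
        "IDM": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
    }


# Hypothetical 4 x 4 test image with four gray levels (not data from this study).
sample = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(sample, d=1, levels=4)
features = texture_features(p)
```

Repeating the call with d = 1 to 10 on an 8-bit region (levels = 256) would give the kind of distance dependence discussed in the text, under the assumptions stated above.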
Figure 7 and Figure 8 depict bar charts of the nine parameters, averaged at each distance parameter d = 1–10 (corresponding to shift distances of 120–1200 nm of the image in the GLCM calculations) [39,40], together with their standard deviations (SDs). In each graph, the error bar is the SD of the corresponding parameter calculated for the four sets of image data shown in Figure 5 and Figure 6.
The GLCM analysis of the PT images obtained at 405 nm excitation showed that the extent of the difference between the nevus and MM distributions depends on the value of d: as d increased from 1 to 10, some parameters (namely Contrast, Entropy, Homogeneity and IDM) increased considerably, whereas others (Correlation and Variance) showed very little change. Overall, d = 10 provided the best separation. Among the nine parameters, Contrast, Entropy and Variance showed large differences between the means of nevus and MM cells; these differences were almost equal to the sum of the corresponding SD values (ratios between 0.9 and 0.95).
GLCM analysis of the PT images obtained by 488 nm excitation was also performed. Among the nine parameters, Prominence, Shade, and Variance were well separated between the nevus and MM cells compared with the other parameters. In these cases, the ratios of the difference between the two cell types to the sum of the SDs at d = 10 were 0.737, 0.635, and 0.557, respectively. The ratios for the other parameters were smaller than 0.31. To obtain suitable parameters for the identification of MM, we analyzed the data with commonly used diagnostic measures, including sensitivity, specificity, positive predictive value, and negative predictive value.
For the diagnosis, we used the data in Figure 7 and Figure 8 for each GLCM parameter at d = 10. For each parameter, we constructed Gaussian curves whose peak positions and widths are the mean and SD values, respectively, of the d = 10 data in Figure 7 and Figure 8; these curves are shown in Figure 9 and Figure 10.
The parameters differ between malignant and benign cells, although for some parameters the difference may not be large enough for good discrimination. We call this diagnosis the GLCM-DIAG method. The resulting sensitivity (Sn), specificity (Sp), accuracy (AC), positive likelihood ratio (LR+), negative likelihood ratio (LR−), positive predictive value (PPV) and negative predictive value (NPV) are listed in Supplementary Tables S1 and S2, based on the data shown in Figure 9 and Figure 10 for 405 nm and 488 nm excitation, respectively. These figures show that the relative positions and widths of the distributions at the two excitation wavelengths are quite different from each other. This is discussed later.
As shown in
Supplementary Table S1, Variance, Entropy, and Contrast had large AC and LR+ values and a small LR− value, indicating that these three GLCM parameters are more reliable than the other GLCM parameters at 405 nm excitation. In the case of 488 nm excitation, as tabulated in
Supplementary Table S2, Prominence, Shade, and Variance had large AC and LR+ values and a small LR− value, indicating that these three GLCM parameters are more reliable than the other GLCM parameters.
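For reference, the seven diagnostic indices named above follow from a 2 × 2 confusion table through their standard definitions. The sketch below assumes MM is the positive class; the function name and the counts in the usage line are hypothetical and are not values from Supplementary Tables S1 and S2.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic indices from a 2 x 2 confusion table.

    tp, fn : MM samples classified as MM / as nevus
    tn, fp : nevus samples classified as nevus / as MM
    Assumes every row and column of the table contains at least one sample.
    """
    sn = tp / (tp + fn)  # sensitivity
    sp = tn / (tn + fp)  # specificity
    return {
        "Sn": sn,
        "Sp": sp,
        "AC": (tp + tn) / (tp + fn + tn + fp),
        "LR+": sn / (1 - sp) if sp < 1 else float("inf"),
        "LR-": (1 - sn) / sp if sp > 0 else float("inf"),
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
    }


# Hypothetical counts for 48 MM and 48 nevus regions.
metrics = diagnostic_metrics(tp=40, fn=8, tn=36, fp=12)
```

A parameter that discriminates well shows large AC and LR+ together with a small LR−, which is the pattern used in the text to rank the GLCM parameters.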
The distribution profiles calculated for the nine GLCM parameters at both 405 nm and 488 nm excitation are shown in
Figure 9 and
Figure 10.
To obtain suitable parameters for the identification of MM, we defined the difference criterion (DIF) as follows:
where PM and PN are the parameter values of the MM and nevus cells, respectively, and DM and DB are the SDs of PM and PN, respectively, for the four sets of imaging data. The rank orders of the magnitude and the absolute values of DIF for the nine parameters are shown in Supplementary Tables S3–S6, respectively. Entropy, Variance, and Contrast were consistently ranked within the top three positions when d is near 10 among the 10 different d values. Therefore, these three parameters (Entropy, Variance, and Contrast) are likely to be suited for the discrimination between nevus and MM cells.
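The defining equation for DIF is not reproduced in this text, so the following sketch is only one plausible reading: it assumes that DIF is the absolute difference of the two means divided by the average of the two SDs. Under this assumption DIF is exactly twice the "difference over sum of SDs" ratio quoted earlier, which is roughly in line with the reported ratios of 0.9–0.95 and DIF values near 1.85–1.88 at 405 nm, but the form remains an assumption. The function name and the input values are hypothetical.

```python
def dif(pm, pn, dm, db):
    """Assumed difference criterion: |PM - PN| divided by the mean of the SDs.

    pm, pn : mean parameter values for MM and nevus cells
    dm, db : the corresponding standard deviations
    NOTE: this functional form is an assumption, not the published definition.
    """
    return abs(pm - pn) / ((dm + db) / 2)


# Hypothetical values: means 10 and 6, both SDs equal to 2.
example_dif = dif(pm=10.0, pn=6.0, dm=2.0, db=2.0)
ratio_to_sd_sum = abs(10.0 - 6.0) / (2.0 + 2.0)
```

With this reading, a DIF of about 2 means that the two means are separated by roughly the sum of the two SDs.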
From the above orders of the DIF values in Supplementary Tables S3 and S5, it can be concluded that Entropy, Contrast, and Variance are the most suited for the discrimination between MM and nevus cells at 405 nm excitation. For Entropy and Contrast, the distance parameter d = 10 gave the highest DIF values of 1.879 and 1.847, respectively, and for Variance, d = 2 produced the largest value, although it was not substantially larger than those at other values of d. The probabilities of correct identification of nevus cells were 39.49%, 39.21%, and 39.49% for Entropy (d = 10), Contrast (d = 10), and Variance (d = 10), respectively, while the probabilities of correct identification of MM cells were 39.49%, 39.21%, and 39.49% for Entropy (d = 10), Contrast (d = 10), and Variance (d = 10), respectively. These values are much better than those obtained by fractal analysis [6,7,8,9,10,11].
From the orders of the DIF values in Supplementary Tables S4 and S6, it can be concluded that Prominence, Variance, and Shade are the most suited for the discrimination between MM and nevus cells at 488 nm excitation. For Shade and Variance, the distance parameter d = 10 gave the highest DIF values of 1.121 and 1.273, respectively, while for Prominence, d = 9 produced the highest value, although the differences from the DIF at other values of d were very small. The probabilities of correct identification of nevus cells were 38.9%, 33.3%, and 33.6% for Prominence (d = 10), Shade (d = 10), and Variance (d = 10), respectively. The probabilities of correct identification of MM cells were 38.9%, 33.3%, and 33.6% for Prominence (d = 10), Shade (d = 10), and Variance (d = 10), respectively. These values are much better than those obtained by fractal analysis [6,7,8,9,10,11].
These findings indicate that the GLCM parameter method, especially GLCM-DIF analysis, is a simple and useful method for the identification of suitable parameters for differentiation between different stages of cancers and detection of various types of disease that alter cell structure.
We then performed receiver operating characteristic (ROC) curve analysis based on those Gaussian curves. A ROC curve is commonly used to evaluate the diagnostic ability of a test. When a threshold on a parameter is used to classify examinees into two groups, positive and negative for some feature, the curve is obtained by plotting the sensitivity against the false positive rate as the threshold is varied. As shown in Figure 7 and Figure 8, d = 10 provides the best performance for all nine GLCM parameters with both the 405 nm and the 488 nm pump. We plotted ROC curves for the nine parameters (Figure 11A,B) based on the Gaussian curves (Figure 9 and Figure 10). The area under the curve (AUC) is an indicator of the diagnostic ability; >0.9, 0.7–0.9, and <0.7 correspond to high, moderate, and poor accuracy, respectively. As shown in Figure 11A, Entropy, Contrast and Variance show high AUCs, namely 0.909, 0.905 and 0.897, respectively. These values, which lie at or just below the 0.9 threshold, indicate that those parameters distinguish nevus and MM cells with high or nearly high accuracy. In the case of 488 nm excitation, the AUCs of Prominence, Variance and Shade are 0.812, 0.808 and 0.768, respectively, indicating moderate accuracy and worse performance than at 405 nm excitation. These results, at both 405 nm and 488 nm excitation, agreed with those of the DIF analysis.
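The construction of a ROC curve from two Gaussian curves can be sketched as follows. This is a minimal illustration that assumes a sample is called positive when its parameter value exceeds the threshold; the function names, the ±6 SD sweep range, and the means and SDs in the usage lines are hypothetical and are not the fitted values of Figure 9 and Figure 10. The closed-form expression is the standard binormal AUC, included only as a cross-check of the numerical integration.

```python
from math import sqrt
from statistics import NormalDist


def roc_from_gaussians(mean_neg, sd_neg, mean_pos, sd_pos, steps=2000):
    """Sweep a threshold and return ROC points as (FPR, TPR) pairs.

    A sample is called positive when its value exceeds the threshold.
    """
    neg, pos = NormalDist(mean_neg, sd_neg), NormalDist(mean_pos, sd_pos)
    span = 6 * max(sd_neg, sd_pos)
    lo = min(mean_neg, mean_pos) - span
    hi = max(mean_neg, mean_pos) + span
    points = []
    for k in range(steps + 1):
        t = hi - (hi - lo) * k / steps  # threshold moves from high to low
        points.append((1 - neg.cdf(t), 1 - pos.cdf(t)))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


def auc_closed_form(mean_neg, sd_neg, mean_pos, sd_pos):
    """Binormal AUC: probability that a positive value exceeds a negative one."""
    return NormalDist().cdf((mean_pos - mean_neg) / sqrt(sd_neg ** 2 + sd_pos ** 2))


# Hypothetical distributions: means one SD apart, equal SDs.
points = roc_from_gaussians(0.0, 1.0, 1.0, 1.0)
auc_numeric = auc_trapezoid(points)
auc_exact = auc_closed_form(0.0, 1.0, 1.0, 1.0)
```

When the two SDs differ strongly, the curve produced by this sweep is no longer convex everywhere, because the broader distribution dominates both tails.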
4. Discussion
In the previous analysis, DIF was calculated using the averages of the parameters obtained from the four sets of images, which contain 48 raw images. The calculated values of DIF for the nine GLCM parameters are shown in Table S6. To further utilize the parameters obtained by the GLCM analysis, we assessed them by taking the dispersion of their distributions into account. The mean and dispersion of each of the nine parameters were represented by a Gaussian distribution, as shown in Figure 9 and Figure 10. These figures show that the values of the parameters are widely distributed and that the degree of overlap between the nevus and MM cells differs among the parameters. Using the distribution functions, the probability of benign (defined by B = Benign/(Benign + Malignant)) was then calculated and plotted against the parameter values for the distance d = 10. The results are shown in Figure 12 and Figure 13.
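The probability of benign can be sketched directly from the two Gaussian curves. The example assumes that "Benign" and "Malignant" in the definition of B are the heights of the nevus and MM Gaussians at a given parameter value, each normalized to unit area; the function name and the means and SDs in the usage lines are hypothetical.

```python
from statistics import NormalDist


def probability_of_benign(x, mean_n, sd_n, mean_m, sd_m):
    """B = Benign / (Benign + Malignant) at parameter value x.

    Benign and Malignant are taken as the Gaussian densities of the nevus
    and MM distributions at x (an assumption about the definition).
    """
    benign = NormalDist(mean_n, sd_n).pdf(x)
    malignant = NormalDist(mean_m, sd_m).pdf(x)
    total = benign + malignant
    if total == 0.0:
        return float("nan")  # both densities underflow far from either mean
    return benign / total


# Hypothetical distributions: nevus centered at 0, MM centered at 1, unit SDs.
b_mid = probability_of_benign(0.5, 0.0, 1.0, 1.0, 1.0)
b_nevus_side = probability_of_benign(0.0, 0.0, 1.0, 1.0, 1.0)
b_mm_side = probability_of_benign(1.0, 0.0, 1.0, 1.0, 1.0)
```

The more steeply B falls from values near 1 to values near 0 as the parameter changes, the more clearly that parameter separates the two cell types.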
We evaluated the appropriateness of assigning cells to either nevus or MM using another simple discrimination level for each parameter. For this, we adopted a “clearness discrimination parameter” (DISC value), which is defined by the following equation for each parameter using the probability of benign (B) shown in Figure 12 and Figure 13:
Here, “35% of B” means the value of the corresponding parameter at 35% of its maximum intensity. Usually, 90% and 10% are used as the levels for discrimination or for evaluating the steepness of a distribution curve; in the present analysis, however, the B value did not reach 10% for some of the parameters. The smaller the DISC value, the more accurate the discrimination. The calculated results are shown in Table 1 and Table 2 in ascending order of DISC value at 405 nm and 488 nm excitation, respectively. At 405 nm excitation, the DISCs of Entropy, Homogeneity, and Variance were conspicuously small, so these parameters are expected to be effective for discriminating nevus from MM (Table 1). At 488 nm excitation, the DISCs of Entropy, Prominence, and Correlation were the three smallest (Table 2). Thus, the effective parameters differed depending on the excitation wavelength. This can be explained in terms of the sensitivity of the signal intensity to the components in the cells, namely melanin and porphyrin, owing to the differences in their absorption cross sections.
The ROC curve is drawn by plotting the true positive rate (sensitivity) against the false positive rate as the threshold parameter sweeps through the measuring range. We plotted the ROC curves based on Figure 9 and Figure 10, and the curves took on distinct shapes depending on the SDs and the difference between the means of the two cell types. (1) The difference between the mean values of the two cell types was close to the SDs, as seen for ASM, Contrast, Entropy, Homogeneity, IDM and Variance at 405 nm excitation and Entropy at 488 nm excitation. In this case, the ROC curves rose quite steeply, then gradually decreased in slope and became almost horizontal at the end, remaining convex upward throughout. (2) The SDs of the parameters for the nevus cells were large enough that their distribution spread to cover the main part of the distribution for MM cells, as seen for Correlation, Prominence and Shade at 405 nm excitation and Correlation at 488 nm excitation. In this case, the ROC curve rose steeply, soon flattened to nearly horizontal, and became steep again at the end. This is because at first only nevus cells were counted as positive and MM cells were not yet included within the threshold, so the false positive rate was kept very low. As the threshold parameter passed through the main part of the MM distribution, the accumulated counts were mainly due to MM cells, resulting in the horizontal shape. After the threshold parameter passed the main part of the MM distribution, nevus cells again accounted for most of the positives. (3) The SDs of the parameters for the MM cells were large enough that their distribution spread to cover the main part of the distribution for nevus cells, as seen for Prominence, Shade and Variance at 488 nm excitation. In this case, the ROC curves first ran horizontally, soon rose very steeply, and became nearly horizontal at the end. The reason for this behavior is the reverse of that in (2).
As mentioned previously, the ROC and DIF analyses agreed quite well at both 405 nm and 488 nm. However, there were some differences between the results of the ROC or DIF analyses and those of the DISC analysis, especially at 488 nm excitation. The correlation diagrams between AUC and DISC at 405 nm and at 488 nm excitation are shown in Figure 14. The correlation coefficient at 405 nm excitation was −0.844, indicating a strong negative correlation, and at 488 nm excitation, it was −0.3735, showing a weak negative correlation.
Analyses based on DIF, ROC, and DISC for evaluating the diagnostic ability of the GLCM parameters showed some differences between 405 nm and 488 nm excitation. These differences between the results obtained at 405 nm and 488 nm excitation can be explained in more detail as follows.
The molar extinction coefficients of melanin at 405 nm and 488 nm in the literature are approximately 2500 and 1500/mol/cm, respectively [41]. In contrast, the molar extinction coefficients of hemoglobin at 405 nm and 488 nm are about 275,000 and 16,000/mol/cm, respectively. At both wavelengths, the molar extinction coefficient of hemoglobin is high, but the absolute values are quite different. The extinction coefficient of hemoglobin is about 10 times larger than that of melanin at 488 nm and about 100 times larger at 405 nm. The amount of melanin contained in the cell slice is 10 times greater than that of hemoglobin. Thus, it might be possible to assess the state or degree of transformation of the skin tissue by GLCM analysis of 488 nm pump images. At 488 nm excitation, Prominence and Entropy may have detected the changes in melanin distribution induced by the transformation of cells.
At 405 nm excitation, Homogeneity and Entropy might detect the changes in hemoglobin distribution induced by cellular transformation.
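The arithmetic behind this comparison can be checked with a short calculation. The sketch uses only the extinction coefficients and the 10:1 melanin-to-hemoglobin amount quoted above, and it assumes that the relative contribution of each absorber scales as its extinction coefficient multiplied by its relative amount; the names in the code are hypothetical.

```python
# Molar extinction coefficients quoted in the text (same units for both species).
EPS = {
    "melanin": {405: 2500.0, 488: 1500.0},
    "hemoglobin": {405: 275000.0, 488: 16000.0},
}
# Relative amount of melanin to hemoglobin in the cell slice, as stated in the text.
MELANIN_TO_HEMOGLOBIN_AMOUNT = 10.0


def hemoglobin_to_melanin_ratio(wavelength, weighted=False):
    """Ratio of hemoglobin to melanin extinction at a pump wavelength (nm).

    With weighted=True the ratio is divided by the relative melanin amount,
    assuming absorption scales as coefficient times amount.
    """
    ratio = EPS["hemoglobin"][wavelength] / EPS["melanin"][wavelength]
    if weighted:
        ratio /= MELANIN_TO_HEMOGLOBIN_AMOUNT
    return ratio


ratio_405 = hemoglobin_to_melanin_ratio(405)                  # 110.0
ratio_488 = hemoglobin_to_melanin_ratio(488)                  # about 10.7
weighted_405 = hemoglobin_to_melanin_ratio(405, weighted=True)  # 11.0
weighted_488 = hemoglobin_to_melanin_ratio(488, weighted=True)  # about 1.07
```

Under this assumption, hemoglobin would dominate the absorption at 405 nm by roughly an order of magnitude, whereas at 488 nm the melanin and hemoglobin contributions would be comparable, which is consistent with the 488 nm images being the more sensitive of the two to melanin.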
We defined DISC using the 65% and 35% B values instead of the commonly used 10% and 90% values. This might diminish the performance of the DISC analysis and might cause some discordance with the ROC analysis (Figure 14A,B). We devised the DISC analysis as a simple and versatile method to evaluate the diagnostic abilities of the GLCM parameters. Applying this to a wider range of data will hopefully improve the performance of DISC analysis.
In view of the purpose of this study, it can be concluded that melanin observation with 488 nm excitation is more suitable for the identification of cancerous tumors, and that Entropy and Prominence at 488 nm excitation are suitable for distinguishing benign from malignant cells. With 405 nm excitation, we may be able to study the effect of cancerous tumors on hemoglobin-containing tissues, such as muscle attached to the sample slices. In this case, Homogeneity and Entropy can be used for benign-malignant determination. This may correspond to the morphological change in hemoglobin distribution induced by cellular tumorigenesis, which suggests that the spatial distribution of cancerous tumors can be investigated through hemoglobin.