Article

Novel Snapshot-Based Hyperspectral Conversion for Dermatological Lesion Detection via YOLO Object Detection Models

1 Diving Medical and Physiology Training Center, Zuoying Armed Forces General Hospital, No. 553, Junxiao Rd., Zuoying District, Kaohsiung City 813204, Taiwan
2 Department of Information Engineering, I-Shou University, No.1, Sec. 1, Syuecheng Rd., Dashu District, Kaohsiung City 84001, Taiwan
3 Department of Mechanical Engineering, National Chung Cheng University, 168, University Rd., Min Hsiung, Chia Yi 62102, Taiwan
4 Department of Biomedical Imaging, Chennai Institute of Technology, Sarathy Nagar, Chennai 600069, India
5 Department of Computer Science and Engineering, Chitkara University, Chandigarh-Patiala National Highway NH-64 Village Jansla, Rajpura 140401, India
6 Department of General Surgery, Kaohsiung Armed Forces General Hospital, 2, Zhongzheng 1st. Rd., Kaohsiung City 80284, Taiwan
7 Department of Medical Research, Dalin Tzu Chi Hospital, Buddhist Tzu Chi Medical Foundation, No. 2, Minsheng Road, Dalin, Chiayi 62247, Taiwan
8 Hitspectra Intelligent Technology Co., Ltd., Kaohsiung 80661, Taiwan
* Authors to whom correspondence should be addressed.
Bioengineering 2025, 12(7), 714; https://doi.org/10.3390/bioengineering12070714
Submission received: 13 May 2025 / Revised: 23 June 2025 / Accepted: 25 June 2025 / Published: 30 June 2025
(This article belongs to the Special Issue Medical Artificial Intelligence and Data Analysis)

Abstract

Objective: Skin lesions, including dermatofibroma, lichenoid lesions, and acrochordons, are increasingly prevalent worldwide and often require timely identification for effective clinical management. However, conventional RGB-based imaging can overlook subtle vascular characteristics, potentially delaying diagnosis. Methods: A novel spectrum-aided vision enhancer (SAVE) that transforms standard RGB images into simulated narrowband imaging representations in a single step was proposed. The performances of five cutting-edge object detectors, based on You Only Look Once (YOLOv11, YOLOv10, YOLOv9, YOLOv8, and YOLOv5) models, were assessed across three lesion categories using white-light imaging (WLI) and SAVE modalities. Each YOLO model was trained separately on SAVE and WLI images, and performance was measured using precision, recall, and F1 score. Results: Among all tested configurations, YOLOv10 attained the highest overall performance, particularly under the SAVE modality, demonstrating superior precision and recall across the majority of lesion types. YOLOv9 exhibited robust performance, especially for dermatofibroma detection under SAVE, albeit slightly lagging behind YOLOv10. Conversely, YOLOv11 underperformed on acrochordon detection (cumulative F1 = 65.73%), and YOLOv8 and YOLOv5 displayed lower accuracy and higher false-positive rates, especially in WLI mode. Although SAVE improved the performance of YOLOv8 and YOLOv5, their results remained below those of YOLOv10 and YOLOv9. Conclusions: Combining the SAVE modality with advanced YOLO-based object detectors, specifically YOLOv10 and YOLOv9, markedly enhances the accuracy of lesion detection compared to conventional WLI, facilitating expedited real-time dermatological screening. These findings indicate that integrating snapshot-based narrowband imaging with deep learning object detection models can improve early diagnosis and has potential applications in broader clinical contexts.

1. Introduction

Skin cancer is a major global health concern, with incidence rates steadily rising over recent decades. According to the World Health Organization, skin cancer accounts for approximately one-third of all cancers worldwide. In 2022, an estimated 115,320 new cases of skin cancer (excluding basal and squamous) were reported, alongside around 11,540 deaths [1]. Mortality rates for skin lesions vary, with men generally exhibiting higher overall mortality than women [2]. Dermatofibromas (DFs), lichenoid keratosis, and acrochordons are less common types of skin lesions, each accounting for less than 10% of cases [3]. DFs with classical morphology have been observed in children under the age of five [4]. Lesions larger than 1 cm with positive surgical margins show a recurrence probability of approximately 10%, which is lower than previously reported rates ranging from 26% to 50% [5]. DFs typically develop on the lower body but can also appear in various locations, including the head, face, auricle, neck, trunk, shoulder, pelvic girdles, and fingers [6,7,8,9]. They are more frequently diagnosed in women, with a lifetime prevalence estimated between 1% and 2% [10]. Lichenoid eruptions occur in children and arise from various causes. Despite the lack of a definite cause, immune-related mechanisms are believed to play a central role [11]. Owing to their typically asymptomatic nature and frequent misdiagnosis, the precise incidence rates of lichenoid eruptions are not well documented. Skin tags, clinically known as acrochordons, are small, irregularly shaped growths commonly found in skin folds [12]. Approximately 46% of the population develop skin tags, though severe cases are rare [13]. These lesions are especially prevalent among obese individuals and patients with type II diabetes, affecting roughly one-quarter of the adult population [14,15].
DFs are firm, reddish to brown nodules mainly found on the distal limbs [16]. Histologically, these nodules comprise fibroblasts, histiocytes, and collagenosis. However, DFs can sometimes mimic malignant lesions due to their firmness and coloration [17]. While typically asymptomatic, patients may occasionally experience itching or tenderness in the affected area. Lichenoid keratosis primarily presents as several erythematous-brown, flat-topped papules or slightly raised, reddish-brown, round or oval, flat-topped plaques. Histopathologically, this lesion features basal layer hyperplasia resembling lichen planus (LP), degeneration of basal keratinocytes, and band-like distribution of estrogen receptor-positive lymphocytes. Lichenoid lesions refer broadly to papular lesions observed in several dermatologic conditions, with LP being a notable example. These papules are shiny, flat-topped, polygonal, and clustered, often likened to lichen growing on rocks. Histologic examination reveals inflammatory cell infiltration arranged in a dense, band-like pattern that obscures the dermo–epidermal interface [11]. Acrochordons, or skin tags, are asymptomatic, skin-colored, frustum-shaped growths commonly found in skin folds, particularly in the neck, axillae, and groin regions. They comprise loose, fibrous connective tissue covered by an epidermal layer [18]. Although benign, acrochordons are susceptible to irritation due to their frequent occurrence in areas subject to friction or trauma [19]. Unlike malignant skin lesions such as melanoma and non-melanoma carcinomas—which demonstrate cytologic atypia, invasive growth, and metastatic potential—dermatofibromas, lichenoid keratosis, and acrochordons lack these aggressive features. Therefore, the three target categories are classified as benign lesions in the SAVE-enhanced imaging and the subsequent YOLO-based detection tasks.
Imaging spectroscopy, also known as hyperspectral imaging (HSI), refers to the measurement, analysis, and interpretation of spectral data from images [20]. HSI is an advanced imaging modality that captures detailed information on the pathological and molecular characteristics of tissues, insights that are often difficult to obtain from routine diagnostic imaging methods [21]. This technique operates by capturing hyperspectral data across hundreds of contiguous wavebands, offering a rich, multilevel representation of the target area [22]. The availability of crucial spectral information allows HSI to distinguish between mucosal and submucosal features that are invisible to the human eye within the white light imaging (WLI) optical range [23]. Unlike traditional imaging, HSI constructs a data hypercube by recording a spectrum at every spatial resolution cell. HSI cameras can capture light across a wide spectral range, including ultraviolet bands ranging from 200 nm to 380 nm, visible light bands ranging from 380 nm to 780 nm, and near-infrared bands ranging from 780 nm to 2500 nm [24,25]. Therefore, HSI has been applied in several fields of study, including astronomy [26], agriculture [27], molecular biology [28], medical imaging [29,30], mineralogy [31], archeology [32], the food industry [33], and environmental studies [34], due to its versatility and rich spectral output.
Narrowband imaging (NBI) is an HSI technique that uses narrow wavelength bands to enhance certain tissue features required for medical diagnosis. This method employs two narrow beams of light: one in the blue spectrum at 415 nm and the other in the green spectrum at 540 nm [35]. These wavelengths enhance optical images by improving the visibility of superficial mucosal layers and submucosal intrapapillary capillary loops. The 415 nm wavelength enhances and highlights superficial vessels, while the 540 nm wavelength enhances deeper submucosal vessels, typically rendering them in cyan and brown hues, respectively. This special contrast enhances the sharpness between vessels and surrounding mucosa [36], resulting in the clearer visualization of shallow surface features, mucosal patterns, and vascular structures [37]. Band selection enhances the visibility of early-stage skin cancer (SC) by providing clearer visualization of vascular patterns and superficial structures, compared to other imaging methods. When comparing WLI and band selection for detecting malignant features in skin lesions, band selection has demonstrated higher sensitivity, specificity, and overall accuracy [38,39]. While multispectral imaging (MSI) and HSI systems offer detailed spatiospectral tissue characterization, their dependence on bulky scanning apparatuses or costly tunable filters limits their widespread clinical implementation. In endoscopic applications, pseudo-HSI technologies, such as Pentax i-Scan and Fujinon FICE, apply proprietary lookup tables to RGB sensor data. However, these systems often lack transparency and necessitate vendor-specific platforms, reducing flexibility. Although NBI is extensively validated and utilizes well-defined hemoglobin absorption peaks to enhance vascular detail, it generally necessitates specialized or modified light sources.
The prompt and accurate identification of skin lesions is crucial for improving patient outcomes. However, traditional RGB imaging often fails to adequately capture the subtle vascular patterns associated with the initial stages of lesion development. Although deep learning-based object detectors have seen rapid advancements, they are typically trained on standard color images, limiting their sensitivity to intricate capillary networks. Given the increasing prevalence of dermatological disorders and the high cost of clinical imaging devices, there is a growing need for an accessible and economical approach that improves vascular contrast without requiring a specialized apparatus. Therefore, this study introduces a novel approach using spectrum-aided vision enhancer (SAVE) technology. SAVE converts WLI into representations similar to NBI, comparable to systems developed for Olympus endoscopes, thereby enabling a clearer visualization of SC lesions. The proposed work integrates SAVE with advanced imaging modalities, employing machine learning methods (YOLOv11, YOLOv10, YOLOv9, YOLOv8, and YOLOv5) to develop an SC detection system targeting three lesion types: dermatofibroma, lichenoid keratosis, and acrochordon. The SAVE methodology addresses the limitations of standard RGB imaging by converting RGB images into narrowband-like representations that emphasize hemoglobin absorption characteristics, thus offering enhanced input to cutting-edge YOLO-based detectors. The principal contributions of this manuscript are as follows:
  • SAVE: a snapshot-based technique that transforms any RGB image into a narrowband representation aligned with hemoglobin absorption peaks.
  • Integration of SAVE with YOLO: adaptation of YOLOv5–YOLOv11 by incorporating SAVE outputs for real-time skin lesion detection.
  • Comprehensive Evaluation: an exhaustive comparison of five YOLO variants across WLI and SAVE modalities, reporting class-specific precision, recall, F1 score, and statistical significance.

2. Materials and Methods

2.1. Dataset

The images used in this study were sourced from the publicly available International Skin Imaging Collaboration (ISIC) archive, which offers dermoscopic and clinical photographs annotated by experts. For the experiments, the following three distinct types of skin lesions were selected:
  • Acrochordon: 577 images;
  • Dermatofibroma: 821 images;
  • Lichenoid lesions: 805 images.
All ISIC images originally had varying resolutions, ranging from 800 × 600 to 2048 × 1536 pixels. Aiming to standardize the input for the deep learning models, all images were resized to 640 × 640 pixels before training and inference. The SAVE method converted each three-channel RGB image into a five-band NBI approximation in a single snapshot. The five output bands corresponded to the central wavelengths described in Section 2.2. Therefore, each SAVE-processed image of size 640 × 640 was represented as a 5 × 640 × 640 tensor. For comparison, WLI inputs were retained as standard 3 × 640 × 640 tensors (RGB). The full dataset of 2203 labeled images was partitioned into training, validation, and test sets using a 70%/20%/10% ratio, respectively, ensuring a proportional representation of each lesion class in all subsets. Table 1 below summarizes the exact image counts. A training set of 1542 images was used to optimize model weights. A validation set of 441 images was used to tune hyperparameters (e.g., learning rate, confidence thresholds) and apply early stopping. Meanwhile, a test set of 220 images was held out entirely during model development and was used only once at the end to report final performance metrics, as shown in Table 1. Experiments on data augmentation techniques (random 90° rotations, horizontal/vertical flips, and small shear transformations) were also conducted. However, extensive augmentation introduced slight overfitting and did not improve validation accuracy. Consequently, all experiments reported herein were conducted using the original, non-augmented images.
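As a concrete illustration of this preprocessing, the snippet below sketches the resize-and-split step in Python; the directory layout, file names, and use of PIL are our assumptions rather than the authors' actual pipeline.

```python
import random
from pathlib import Path
from PIL import Image

# Illustrative sketch of the preprocessing described above: resize every image
# to 640 x 640 and split each lesion class 70%/20%/10% into train/val/test.
# Paths and directory layout are assumptions, not the authors' file structure.
random.seed(0)
SRC, DST = Path("isic_raw"), Path("isic_640")
CLASSES = ["acrochordon", "dermatofibroma", "lichenoid"]

for cls in CLASSES:
    images = sorted((SRC / cls).glob("*.jpg"))
    random.shuffle(images)
    n = len(images)
    bounds = {"train": (0, int(0.7 * n)),
              "val": (int(0.7 * n), int(0.9 * n)),
              "test": (int(0.9 * n), n)}
    for split, (lo, hi) in bounds.items():
        out_dir = DST / split / cls
        out_dir.mkdir(parents=True, exist_ok=True)
        for img_path in images[lo:hi]:
            img = Image.open(img_path).convert("RGB").resize((640, 640))
            img.save(out_dir / img_path.name)
```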

2.2. SAVE

In this study, RGB digital camera images were efficiently converted into hyperspectral representations using the proposed VIS-HSI conversion algorithm. While conventional NBI relies on specialized illumination at specific hemoglobin absorption peaks, the SAVE module learned a regression-based mapping from standard RGB camera responses to five designated center wavelengths (450, 500, 550, 600, and 650 nm). These wavelengths were selected based on the intersection between the spectral sensitivity curves of commercial RGB sensors and the documented absorption spectra of oxy- and deoxy-hemoglobin. This alignment ensured that the SAVE output improved vascular contrast similarly to authentic NBI. Although SAVE did not directly quantify tissue reflectance, this module empirically replicated the relative contrast between blood-rich regions and surrounding tissues, enabling cost-effective, NBI-like enhancement using any standard RGB image. While numerous manufacturers currently provide pseudo-HSI image enhancement, NBI was selected as the conceptual benchmark due to its extensive research backing and clinical validation as the leading technique for real-time vascular contrast enhancement in endoscopy. NBI uses well-defined narrowband illuminations centered on hemoglobin absorption peaks, offering a strong physiological foundation and extensive operator familiarity, in contrast to proprietary, device-specific lookup tables used by other systems. By aligning the spectral bands of SAVE with the target wavelengths of NBI, the proposed method not only mimics the vascular contrast mechanisms trusted by clinicians but also remains fully compatible with standard RGB cameras, eliminating the need for specialized hardware or proprietary processing pipelines. During system calibration, data from a spectrometer were compared against corresponding RGB values using the Macbeth ColorChecker (X-Rite Classic). The raw pixel data were normalized and linearized to produce accurate RGB values in the sRGB color space. A nonlinear correlation matrix and transformation coefficients were then applied to convert these values into the CIE 1931 XYZ color space. More specifically, the R, G, and B values (0 to 255) in sRGB were first reduced to a 0–1 range. A gamma correction function was applied to convert sRGB to linear RGB values. After error correction, the final X, Y, and Z values were updated ($XYZ_{\text{Correct}}$) and calculated using Equations (1) and (2).
$$C = XYZ_{\text{Spectrum}} \times \mathrm{pinv}(V),$$
$$XYZ_{\text{Correct}} = C \times [V].$$
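The calibration steps above can be illustrated with a short numerical sketch. The gamma expansion and the D65 sRGB-to-XYZ matrix below follow the standard sRGB definition, while the correction matrix C of Equations (1) and (2) is fitted with a pseudo-inverse; variable names and array shapes are illustrative assumptions, not the study's actual calibration data.

```python
import numpy as np

def srgb_to_linear(rgb_8bit):
    """Normalize 8-bit sRGB to [0, 1] and apply the standard inverse gamma."""
    c = np.asarray(rgb_8bit, dtype=np.float64) / 255.0
    return np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)

# Standard sRGB (D65) linear-RGB -> XYZ matrix; the study fitted its own
# transformation coefficients during calibration, so this matrix is a stand-in.
M_RGB2XYZ = np.array([[0.4124, 0.3576, 0.1805],
                      [0.2126, 0.7152, 0.0722],
                      [0.0193, 0.1192, 0.9505]])

def rgb_to_xyz(rgb_8bit):
    return M_RGB2XYZ @ srgb_to_linear(rgb_8bit)

# Correction step of Equations (1)-(2): C maps an expanded variable vector V
# (built from the camera XYZ values) onto the spectrometer-derived XYZ values.
def fit_correction(V, xyz_spectrum):          # V: (k, 24), xyz_spectrum: (3, 24)
    return xyz_spectrum @ np.linalg.pinv(V)   # C, Equation (1)

def correct_xyz(C, V):
    return C @ V                              # XYZ_correct, Equation (2)
```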
The integrals were then used to convert the reflectance spectra measured by the spectrometer into the XYZ color space.
$$X = k \int_{400\,\mathrm{nm}}^{700\,\mathrm{nm}} S(\lambda)\, R(\lambda)\, \bar{x}(\lambda)\, d\lambda,$$
$$Y = k \int_{400\,\mathrm{nm}}^{700\,\mathrm{nm}} S(\lambda)\, R(\lambda)\, \bar{y}(\lambda)\, d\lambda,$$
$$Z = k \int_{400\,\mathrm{nm}}^{700\,\mathrm{nm}} S(\lambda)\, R(\lambda)\, \bar{z}(\lambda)\, d\lambda,$$
$$k = 100 \Big/ \int_{400\,\mathrm{nm}}^{700\,\mathrm{nm}} S(\lambda)\, \bar{y}(\lambda)\, d\lambda.$$
Multivariate regression analysis was employed to generate the correction coefficient matrix C. Using the reflectance spectrum data (Rspectrum), the transformation matrix (M) corresponding to the colors on the X-Rite ColorChecker board was calculated. Aiming to reduce dimensionality and extract dominant spectral features, principal component analysis was applied to Rspectrum, resulting in six notable principal components (PCs) and their associated eigenvectors. The six PCs accounted for 99.64% of the total variance, effectively capturing the essential spectral information. The resulting analog spectrum, denoted as [ S Spectrum ] 380 780 n m , was reconstructed using Equations (7) and (8).
$$M = \mathrm{Score} \times \mathrm{pinv}(V_{\text{Color}}),$$
$$[S_{\text{Spectrum}}]_{380\sim780\,\mathrm{nm}} = EV \cdot M \cdot [V_{\text{Color}}].$$
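A minimal numerical sketch of Equations (7) and (8) is given below, assuming the 24 ColorChecker reflectance spectra are sampled at 1 nm steps from 380 to 780 nm; the placeholder arrays, shapes, and variable names are ours, not the study's measured data.

```python
import numpy as np

# Sketch of the PCA-based spectrum reconstruction in Equations (7)-(8).
R_spectrum = np.random.rand(401, 24)           # placeholder measured spectra (380-780 nm)
V_color = np.random.rand(10, 24)               # expanded XYZ variable vectors per patch

mean_spec = R_spectrum.mean(axis=1, keepdims=True)
U, S, _ = np.linalg.svd(R_spectrum - mean_spec, full_matrices=False)
EV = U[:, :6]                                   # six principal eigenvectors
score = EV.T @ (R_spectrum - mean_spec)         # PC scores of each color patch

M = score @ np.linalg.pinv(V_color)             # Equation (7)
S_sim = mean_spec + EV @ (M @ V_color)          # Equation (8), simulated spectra

rmse = np.sqrt(np.mean((S_sim - R_spectrum) ** 2))   # compare with ~0.056 reported above
```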
An average RMSE of 0.056 was computed across all bands from 380 to 780 nm, demonstrating the high accuracy of the spectral reconstruction. Before calibrating the camera, the mean chromatic aberration across all 24 color blocks was 10.76. This value decreased to 0.63 after calibration, indicating the improved accuracy of the vectorized kernels. A qualitative analysis was conducted to assess the color match in the LAB color space. The results showed that the colors were similar, with an average color difference of 0.75, supporting the effectiveness of the algorithm in converting RGB images to HSI images. The SAVE method presented in this study provides a detailed method for calibrating the NBI of the Olympus endoscope based on the HSI conversion algorithm (Supplement Figure S20 for the flowchart of the VIS-HSI imaging algorithm). Aiming to initiate the simulation process, the output images from the conversion algorithm were compared to realistic NBI images obtained using the Olympus endoscope (Supplement Figure S21 for the lighting spectrum difference between the Olympus WLI, Olympus NBI, and the Capsule WLI). The CIEDE 2000 color difference [40] was measured and reduced for each of the 24 color blocks. After correction, the average color difference across the blocks decreased to 2.79. Three main factors affected the accuracy of the color match. The first was the range of light wavelengths, or the light spectrum. The second was the color matching function, a mathematical function that quantifies the amount of light input at a specific color. The third factor was the reflection spectrum, which provides information regarding the amount of light reflected by a specific hue [34]. The illumination spectra of WLI and NBI were corrected using the Cauchy–Lorentz distribution, as shown in Equation (9), to reduce errors arising from differences in these spectra, particularly in the 450–540 nm region, where hemoglobin absorbance was high (Supplement Figure S22 for the difference between the Olympus SAVE and the VCE-simulated NBI lighting): (a) shows the difference between Olympus SAVE and Olympus WLI, and (b) shows the difference between VCE SAVE and VCE WLI images. The actual NBI images captured by the Olympus endoscope contained not only green and blue hues but also different shades of brown corresponding to a wavelength of approximately 650 nm, despite the peak absorption wavelengths of hemoglobin occurring at 415 and 540 nm (Supplement Figure S17 for the comparison of SSIM between the simulated NBI images and the WLI images of VCE and Olympus; Table S1 for the SSIM of 20 randomly chosen images in VCE and the Olympus endoscope). This finding indicates that nuanced image post-processing techniques contribute to highly realistic NBI images (Supplement Figure S18 for the comparison of entropy between the simulated NBI images and the WLI images): (a) shows the entropy for Olympus endoscopy, and (b) shows the entropy for the VCE camera. Table S2 presents the entropy comparison of WLI and NBI images in Olympus and VCE endoscopes. Therefore, in addition to the wavelengths of 415 and 540 nm, three additional spectral regions at 600, 700, and 780 nm were observed in this study (Supplement Figure S19 for the PSNR comparison of the 20 randomly chosen images in Olympus and VCE; Table S3 for the comparison of the corresponding PSNR values).
$$f(x; x_0, \gamma) = \frac{1}{\pi \gamma \left[ 1 + \left( \frac{x - x_0}{\gamma} \right)^2 \right]} = \frac{1}{\pi} \left[ \frac{\gamma}{(x - x_0)^2 + \gamma^2} \right]$$
Therefore, the optimization function used in this study was the dual annealing method, which enhances the simulated annealing algorithm by combining elements of classical and fast simulated annealing with local search parameters to finetune the light spectrum [35]. A small average CIEDE 2000 color difference of 3.06 was observed across the 24 colors. The overall flowchart of the project is shown in Figure 1 (Supplement Figure S20 for endoscopic imaging using three imaging techniques: (a) WLI, (b) NBI, and (c) SAVE).
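The spectrum-fitting step can be sketched with SciPy's dual_annealing optimizer, as shown below. The objective uses a plain Euclidean Lab distance as a stand-in for the CIEDE 2000 metric, and simulate_lab() is a hypothetical helper; only the overall structure of the optimization is intended to match the description above.

```python
import numpy as np
from scipy.optimize import dual_annealing

wavelengths = np.arange(380, 781)
target_lab = np.random.rand(24, 3) * 100        # placeholder reference Lab values

def cauchy_lorentz(x, x0, gamma):
    # Cauchy-Lorentz band shape of Equation (9).
    return (1.0 / np.pi) * gamma / ((x - x0) ** 2 + gamma ** 2)

def simulate_lab(spectrum):
    # Hypothetical stand-in: map an illumination spectrum to 24 Lab patches.
    return np.outer(np.linspace(0, 1, 24), spectrum[:3]) * 100

def objective(params):
    x0, gamma, scale = params
    spectrum = scale * cauchy_lorentz(wavelengths, x0, gamma)
    lab = simulate_lab(spectrum)
    # Euclidean Lab distance used here as a simple proxy for CIEDE 2000.
    return np.mean(np.linalg.norm(lab - target_lab, axis=1))

result = dual_annealing(objective, bounds=[(400, 700), (1, 100), (0.1, 10)], seed=0)
print(result.x, result.fun)
```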

2.3. ML Algorithms

All YOLO variants were trained in PyTorch 1.10.0 on a single NVIDIA V100 GPU (16 GB VRAM). Each model was optimized for 300 epochs with a batch size of 32 via SGD (momentum 0.937, weight decay 0.0005). The learning rate was initialized at 0.005 and annealed to 1 × 10−4 using a cosine schedule with a 3-epoch linear warmup. The validation F1 score was evaluated every epoch, and early stopping was triggered after 75 epochs without improvement, with the best checkpoint retained for final testing.
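The training configuration described above can be approximated with the Ultralytics API as in the sketch below; the exact training scripts are not published here, so the dataset YAML, checkpoint name, and the lrf value (chosen so the final learning rate is roughly 1 × 10−4) are assumptions.

```python
from ultralytics import YOLO

# Hedged sketch of the stated training setup: 300 epochs, batch 32, SGD with
# momentum 0.937 and weight decay 0.0005, cosine LR schedule with 3 warmup
# epochs, early stopping after 75 epochs without improvement.
model = YOLO("yolov8n.pt")                 # swap in the desired YOLO variant
model.train(
    data="skin_lesions.yaml",              # hypothetical dataset definition
    epochs=300,
    batch=32,
    imgsz=640,
    optimizer="SGD",
    lr0=0.005,                             # initial learning rate
    lrf=0.02,                              # final LR factor (~1e-4 / 5e-3)
    momentum=0.937,
    weight_decay=0.0005,
    warmup_epochs=3,
    cos_lr=True,                           # cosine annealing schedule
    patience=75,                           # early stopping criterion
)
```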

2.3.1. YOLOv11

YOLOv11 introduces several architectural improvements for real-time object detection, enhancing accuracy and speed [41]. This algorithm features a multi-path ConvNeXt backbone for improved feature extraction and incorporates dynamic convolutional attention (DCA) for highly effective object localization and classification. Additionally, YOLOv11 employs cross-stage partial connections to facilitate improved gradient flow in deep models without notably increasing computational cost. The adaptive anchor-free head replaces predefined anchor boxes for object detection, while the four-level prediction head improves detection across various object scales. For classification loss, YOLOv11 uses binary cross-entropy loss for multiclass classification.
$$\text{Loss}_{cls} = -\sum_{i=1}^{N} \sum_{c=1}^{C} y_{i,c} \log p_{i,c}$$
Objectness loss uses logistic regression to estimate the confidence that a predicted bounding box contains an object.
$$\text{Loss}_{loc} = \sum_{i=1}^{N} \left[ 1 - \text{GIoU}(b_i, \hat{b}_i) \right]$$
The improvements added to YOLOv11 make it one of the most efficient and accurate real-time object detection models.
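For illustration, the snippet below computes a binary cross-entropy classification term and a GIoU-based localization term of the kind defined above, using dummy tensors; it is not YOLOv11's internal implementation.

```python
import torch
import torch.nn.functional as F
from torchvision.ops import generalized_box_iou

# Dummy predictions: 8 detections, 3 lesion classes, boxes in (x1, y1, x2, y2).
logits = torch.randn(8, 3)
targets = F.one_hot(torch.randint(0, 3, (8,)), 3).float()
loss_cls = F.binary_cross_entropy_with_logits(logits, targets)   # BCE classification term

pred_boxes = torch.tensor([[10., 10., 50., 50.]]).repeat(8, 1) + torch.rand(8, 4)
true_boxes = torch.tensor([[12., 12., 52., 48.]]).repeat(8, 1)
giou = generalized_box_iou(pred_boxes, true_boxes).diagonal()    # matched pairs only
loss_loc = (1.0 - giou).mean()                                   # GIoU localization term

total = loss_cls + loss_loc
```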

2.3.2. YOLOv10

YOLOv10 is a version of the YOLO family of models that offers enhanced accuracy and fast processing times [42]. Aiming to enhance feature extraction, this version incorporates an upgraded backbone network and integrates attention mechanisms. The loss function of YOLOv10 comprises the following three primary components: localization, objectness, and classification losses. For accurate class prediction, the classification loss is computed using binary cross-entropy. Using logistic regression, the objectness loss determines a confidence value for each anticipated bounding box.
$$\text{Loss}_{obj} = -\sum_{i=1}^{N} \left[ y_i \log p_i + (1 - y_i) \log(1 - p_i) \right].$$
YOLOv10 uses a generalized IoU (GIoU) loss to improve localization by eliminating bounding box coordinate inconsistencies and measuring the overlap between predicted and ground truth boxes. These enhancements help YOLOv10 achieve high performance in real-time object detection.

2.3.3. YOLOv9

Two key innovations introduced by YOLOv9 in the field of real-time object detection are the generalized efficient layer aggregation network (GELAN) and programmable gradient information (PGI) [43]. PGI enhances model training by computing gradients through an auxiliary reversible branch. In addition to the operational methods discussed above, GELAN increases the number of parameters while maintaining high computational efficiency. This approach is achieved by building on the concepts of CSPNet and integrating ELAN. The corresponding loss function is defined in Equation (13).
$$\text{Loss}_{total} = \lambda_{cls}\,\text{Loss}_{cls} + \lambda_{loc}\,\text{Loss}_{loc} + \lambda_{obj}\,\text{Loss}_{obj}.$$
Bounding box regression loss is crucial for precise bounding box predictions and typically employs a mean squared error (MSE) approach, as shown in Equation (14).
$$\text{Loss}_{bb} = \sum_{i=0}^{N} \lambda_{\text{coord}}\, \mathbb{1}_{i}^{obj} \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 + (w_i - \hat{w}_i)^2 + (h_i - \hat{h}_i)^2 \right].$$
The confidence of the model in the presence of an object within a bounding box is measured using the objectness loss, which also helps determine the confidence score. The definition of the confidence loss function is presented in Equation (15).
$$\text{Loss}_{obj} = \lambda_{noobj} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{i,j}^{noobj} \left( c_i - \hat{c}_i \right)^2 + \lambda_{obj} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{i,j}^{obj} \left( c_i - \hat{c}_i \right)^2.$$
Classification loss is a criterion that ensures the model correctly identifies the detected items by evaluating the accuracy of class regression using cross-entropy. The definition of the classification loss function is provided in Equation (16).
$$\text{Loss}_{cls} = -\lambda_{class} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{i,j}^{obj} \sum_{c \in \text{classes}} p_i(c) \log \hat{p}_i(c)$$
These enhancements make YOLOv9 a highly effective model for applications such as autonomous driving, traffic surveillance, vehicle recognition, and pedestrian detection.

2.3.4. YOLOv8

YOLOv8 introduces further architectural improvements over its predecessors by refining the backbone and neck of the model [44]. This algorithm uses a CSPDarknet53 backbone for enhanced feature extraction and integrates a combination of feature pyramid networks and PANet in the neck, enabling the efficient aggregation of multiscale features. This structure enhances detection across varying object sizes and scales, making YOLOv8 particularly effective in real-time scenarios where high accuracy and speed are required. The YOLOv8 loss function comprises specific loss components designed to address different aspects of object detection: localization, classification, and bounding box refinement. Accordingly, focal loss (classification loss) is used and is calculated for each object instance, as shown in Equation (17).
$$FL(p_t) = -\alpha (1 - p_t)^{\gamma} \log(p_t),$$
where $p_t$ is the model’s estimated probability for the true class, $\alpha$ is a balancing factor, and $\gamma$ is a modulating factor that focuses training on hard, misclassified examples.
Distribution focal loss (DFL) is used for bounding box refinement. This function helps smooth bounding box predictions by leveraging a discretized distribution of regression targets, as shown in Equation (18).
$$\text{DFL} = \sum_{k=0}^{reg\_max} w_k \cdot \text{CE}(p_k, t_k),$$
where CE is the cross-entropy loss between the predicted distribution $p_k$ and the target $t_k$, and $w_k$ is a weight factor based on the bounding box offset distance. For object bounding boxes, the IoU (intersection over union) loss is computed as shown in Equation (19).
$$\text{IoU Loss} = 1 - \frac{\text{Intersection}}{\text{Union}}$$
Additionally, CIoU (complete IoU) is often used for precise localization, as shown in Equation (20).
$$\text{CIoU} = \text{IoU} - \frac{\rho^2(b, b^{gt})}{c^2} - \alpha v,$$
where $\rho$ is the Euclidean distance between the centers of the predicted and ground truth boxes, $c$ is the diagonal length of the smallest enclosing box, and $\alpha$ and $v$ account for the aspect ratio. The total loss function for YOLOv8 is expressed as shown in Equation (21).
$$L_{total} = \lambda_1 L_{box} + \lambda_2 L_{obj} + \lambda_3 L_{cls},$$
where $\lambda_1$, $\lambda_2$, and $\lambda_3$ act as balancing weights. These enhancements increase the efficiency of YOLOv8 for real-time object detection, providing accuracy and speed.
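The focal-loss term of Equation (17) and the weighted total of Equation (21) can be illustrated with the short PyTorch sketch below; the tensors are dummies and the lambda weights are illustrative values, not YOLOv8's defaults.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    # Equation (17): alpha * (1 - p_t)^gamma * (-log p_t), averaged over instances.
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    return (alpha * (1 - p_t) ** gamma * ce).mean()

logits = torch.randn(8, 3)
targets = F.one_hot(torch.randint(0, 3, (8,)), 3).float()

l_cls = focal_loss(logits, targets)
l_box = torch.rand(())     # stand-in for the CIoU / DFL box terms
l_obj = torch.rand(())     # stand-in for the objectness term
lambda_box, lambda_obj, lambda_cls = 7.5, 1.0, 0.5       # illustrative weights
l_total = lambda_box * l_box + lambda_obj * l_obj + lambda_cls * l_cls   # Equation (21)
```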

2.3.5. YOLOv5

YOLOv5 is an efficient single-stage object detection model that processes input images through its backbone, neck, and head, enabling fast and accurate detection [45]. The backbone uses cross-stage partial networks (CSPNet) to improve feature extraction while reducing computational workload. The neck is based on a path aggregation network, which integrates multiscale feature maps to improve detection across varying object sizes. The head generates the final predictions, including bounding box coordinates, objectness scores, and class probabilities. Optimization is conducted at three different scales to accommodate objects of various sizes. A key structural improvement in YOLOv5 lies in its balanced tradeoff between detection accuracy and computational speed. Its loss function combines bounding box regression, confidence (objectness), and classification terms, each measuring the difference between predicted and ground truth values. The bounding box regression (GIoU) term of the YOLOv5 model can be expressed as shown in Equation (22).
$$L_{GIoU} = \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} \left[ 1 - IoU + \frac{A_c - U}{A_c} \right],$$
where $S^2$ represents the number of grid cells, and $B$ is the number of bounding boxes in each grid cell. The value of $\mathbb{1}_{ij}^{obj}$ is equal to 1 in the presence of an object in the bounding box; otherwise, it is 0.
$$L_{conf} = -\sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} \left[ \hat{C}_i^j \log C_i^j + (1 - \hat{C}_i^j) \log(1 - C_i^j) \right] - \lambda_{noobj} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{noobj} \left[ \hat{C}_i^j \log C_i^j + (1 - \hat{C}_i^j) \log(1 - C_i^j) \right],$$
where $C_i^j$ is the predicted confidence of bounding box $j$ in grid cell $i$, $\hat{C}_i^j$ is the true confidence of bounding box $j$ in grid cell $i$, and $\lambda_{noobj}$ is the confidence weight in the absence of objects in the bounding box.
$$L_{class} = -\sum_{i=0}^{S^2} \mathbb{1}_{ij}^{obj} \sum_{c \in \text{classes}} \left[ \hat{P}_i^j(c) \log P_i^j(c) + (1 - \hat{P}_i^j(c)) \log(1 - P_i^j(c)) \right],$$
where $P_i^j(c)$ is the predicted probability that the detected object belongs to category $c$, and $\hat{P}_i^j(c)$ is the probability that it actually belongs to that category.

3. Results

In the pipeline, each raw RGB skin lesion image was first processed by the SAVE module, which applied a learned regression mapping to convert the three-channel RGB input into a five-band narrowband representation. Its RGB counterpart was then uniformly resized to 640 × 640 pixels and normalized on a per-channel basis. Weight initialization was performed by duplicating pretrained RGB kernels where appropriate. During training, the combined loss (bounding box regression + object confidence + classification) was optimized using a learning rate of 0.005, a batch size of 16, and early stopping based on the validation F1 score. This integration enabled the YOLO model to leverage the enhanced spectral contrast provided by SAVE, resulting in improved lesion detection performance.
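The overall inference flow can be sketched as follows; save_convert() is a hypothetical stand-in for the SAVE regression mapping (its learned coefficients are not reproduced here), and the checkpoint and image paths are illustrative.

```python
import numpy as np
import cv2
from ultralytics import YOLO

def save_convert(rgb):
    # Placeholder for the SAVE regression: map an H x W x 3 RGB image to a
    # five-band narrowband-like representation. Real coefficients are learned
    # during calibration; random weights here only illustrate the interface.
    bands = [rgb @ w for w in np.random.rand(5, 3, 1)]
    return np.concatenate(bands, axis=-1)          # H x W x 5

img = cv2.cvtColor(cv2.imread("lesion.jpg"), cv2.COLOR_BGR2RGB)
img = cv2.resize(img, (640, 640))
nbi_like = save_convert(img)                        # SAVE representation (a stock
                                                    # 3-channel YOLO uses the RGB input)

model = YOLO("best.pt")                             # hypothetical trained checkpoint
results = model.predict(img, imgsz=640, conf=0.25)
for box in results[0].boxes:
    print(box.cls, box.conf, box.xyxy)              # class, confidence, bounding box
```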
The performances of several YOLO object detection algorithms were evaluated for skin lesion detection, comparing their effectiveness under WLI and SAVE imaging modalities. The key performance metrics, which included precision, recall, and F1 scores, were assessed across three lesion categories: acrochordon, dermatofibroma, and lichenoid, as shown in Table 2.
The validation showed that SAVE outperformed WLI in object detection using YOLOv11, especially for classes such as acrochordon and lichenoid as shown in Figure 2. For lesions such as dermatofibroma, WLI demonstrated good detection capability, which was further enhanced by the additional spectral detail provided by SAVE. This improvement across all classes indicates that SAVE provides substantial contributions to the reliability of SC detection with YOLOv11 (Supplement Figure S1 for the loss and precision of WLI and SAVE of YoloV11; Figure S2 for the confusion matrix of WLI and SAVE of YoloV11; and Figure S3 for the F1–confidence curve of WLI and SAVE of YoloV11).
In the case of YOLOv10, SAVE achieved better validation results, particularly for lichenoid detection, which showed moderate performances with WLI but improved accuracy with SAVE. The overall consistency of the model increased with SAVE, especially for classes requiring high spectral contrast. These findings demonstrate that SAVE helps YOLOv10 in producing more accurate detections for lesion types that are typically difficult to identify compared to WLI (Supplement Figure S4 for the loss and precision of WLI and SAVE of YoloV10; Figure S5 for the confusion matrix of WLI and SAVE of YoloV10; and Figure S6 for the F1–confidence curve of WLI and SAVE of YoloV10).
The validation results of YOLOv9 demonstrated improved performance with SAVE, especially for complex classes. While WLI performed well for the dermatofibroma class, SAVE provided more accurate detection for the acrochordon and lichenoid classes. The positive results highlight the capability of SAVE to enhance spectral resolution and contrast, enabling YOLOv9 to effectively differentiate between small features in classes that are often difficult to identify under WLI (Supplement Figure S7 for the loss and precision of WLI and SAVE of YoloV9; Figure S8 for the confusion matrix of WLI and SAVE of YoloV9; and Figure S9 for the F1–confidence curve of WLI and SAVE of YoloV9).
YOLOv8 showed improved validation performance on classes such as acrochordon and lichenoid when used with SAVE. In contrast, WLI produced good results for well-defined classes. By highlighting the capability of YOLOv8 to identify key lesion characteristics, SAVE further improved the detection performance of the model. This finding reflects a consistent pattern across models, with SAVE substantially improving object detection accuracy, particularly for lesion types requiring remarkable spectral clarity (Supplement Figure S10 for the loss and precision of WLI and SAVE of YoloV8; Figure S11 for the confusion matrix of WLI and SAVE of YoloV8; and Figure S12 for the F1–confidence curve of WLI and SAVE of YoloV8).
Although YOLOv5 maintained balanced detection accuracy with WLI, especially for distinct classes such as dermatofibroma, SAVE contributed to improvements across all classes. During validation, SAVE enhanced the detection accuracy of YOLOv5, leading to highly accurate differentiation between classes. These findings support the idea that SAVE enhances the reliability of object detection (Supplement Figure S13 for the loss and precision of WLI and SAVE of YoloV5; Figure S14 for the confusion matrix of WLI and SAVE of YoloV5; and Figure S15 for the F1–confidence curve of WLI and SAVE of YoloV5).
Validation results across all YOLO models demonstrated that SAVE consistently outperformed WLI in the detection of SC. By enhancing spectral information and visual contrast, SAVE proved particularly effective for complex lesion types, such as lichenoid and acrochordon. Based on these results, integrating SAVE into the detection workflow improves detection reliability. Thus, SAVE is a useful technique for improving performance in medical imaging tasks using object detection models such as YOLO (Supplement Figure S16 for the comparison of WLI and SAVE results, highlighting the contrast between the normal and the SAVE images).
This study evaluated the performance of five YOLO models—YOLOv11, YOLOv10, YOLOv9, YOLOv8, and YOLOv5—across two imaging modalities, WLI and SAVE. The models were tested on the following three lesion classes: dermatofibroma, acrochordon, and lichenoid. The evaluation metrics considered included precision, recall, and F1 score. YOLOv10 demonstrated the highest accuracy, particularly in classifying all lesion types using the SAVE modality, where precision and recall were notably higher compared to WLI, as shown in Table 3. Analysis of SAVE results showed that the F1 scores for acrochordon and dermatofibroma were high, indicating the effectiveness of the model in improving precision and recall. This finding also implies that YOLOv10 is highly accurate in cancer detection within the SAVE modality, outperforming the other models. In the case of YOLOv9, the performance metrics were slightly lower than those of YOLOv10, but it still produced strong precision and recall values across all classes in the WLI modality. However, the model achieved strong performance under the SAVE modality, particularly for dermatofibroma. YOLOv9 demonstrated good generalization across both modalities, although its results were slightly lower than those of YOLOv10, especially in WLI for acrochordon detection. YOLOv11 achieved an overall F1 score of 65.73%, indicating good performance across all classes, particularly with the SAVE method. However, its accuracy in detecting acrochordon was lower compared to dermatofibroma and lichenoid. Overall, better performance was observed with the SAVE modality, where overall precision, recall, and F1 scores were higher than in WLI, especially for acrochordon, which demonstrated weaker results under WLI. These results indicate that, despite the excellent performance of YOLOv11, it does not surpass YOLOv10 in terms of precision and recall metrics. Among all the models, YOLOv8 had the lowest accuracy and F1 score, particularly on the WLI modality. This finding indicates that YOLOv8 is least effective in detecting some classes, particularly acrochordon. Although the SAVE modality improved the performance of YOLOv8, its results were still lower compared to those of YOLOv9 and YOLOv10. Thus, YOLOv8 was less precise with WLI and produced more false positives than the newer versions of YOLO. YOLOv5, despite being from an earlier model generation, performed well in both imaging modalities, particularly when using the SAVE method. Its performance was only slightly lower than that of YOLOv9 and YOLOv10. Although the results showed lower precision compared to the latest YOLO models, YOLOv5 remained an effective model for lesion diagnosis, especially with the SAVE modality. After comparing all the results from YOLOv11, YOLOv8, and YOLOv5, notably higher precision was observed in YOLOv10 and YOLOv9, particularly with the SAVE modality, where both models performed substantially better. This finding indicates that the SAVE imaging method provides high accuracy. Considering the evaluation metrics, YOLOv10 achieved the highest precision and recall values, making it the best-performing algorithm overall. In contrast, YOLOv11 performed less strongly in WLI and did not surpass the other models, even when using SAVE. Overall, improvements in SC detection performance were greatest for the YOLOv10 and YOLOv9 models when using the SAVE imaging modality.
Acrochordon emerged as the most challenging lesion category across all models and imaging techniques, consistently yielding the lowest F1 scores, even after SAVE conversion as shown in Figure 3 (ranging from 49.69% for YOLOv11–SAVE to 66.60% for YOLOv8–SAVE). Multiple factors possibly contributed to this difficulty, including the small size of the lesions—skin tags typically occupy only a few dozen pixels, even after resizing to 640 × 640, rendering them challenging to localize and distinguish from background noise. Additionally, acrochordons exhibit considerable variations in color, texture, and attachment morphology, reducing the effectiveness of RGB and narrowband contrast cues as shown in Figure 4. In contrast, dermatofibromas display distinct vascular patterns that are effectively captured by the spectral bands of SAVE. Although SAVE enhanced recall (for instance, YOLOv10’s recall for acrochordon increased from 42.9% to 49.7%), improvements in precision were highly modest, indicating persistent false positives involving other minor skin features. Aiming to address these challenges, future efforts should enhance the acrochordon subset through targeted oversampling and synthetic data creation using controlled cropping and color jittering to accentuate tag-like textures. Additionally, exploring alternative narrowband wavelengths may improve the discrimination of skin tag structures. Customized strategies will be essential to improve acrochordon detection performance to levels comparable with more easily recognizable lesions.
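The targeted oversampling suggested above could be prototyped with standard torchvision transforms, as in the sketch below; the crop scale and jitter strengths are illustrative values rather than tuned settings.

```python
import torch
from torchvision import transforms
from PIL import Image

# Sketch of targeted augmentation for the acrochordon class: controlled
# cropping plus color jitter to accentuate tag-like textures.
augment = transforms.Compose([
    transforms.RandomResizedCrop(640, scale=(0.6, 1.0)),    # controlled cropping
    transforms.ColorJitter(brightness=0.2, contrast=0.2,
                           saturation=0.2, hue=0.05),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

img = Image.open("acrochordon_sample.jpg").convert("RGB")   # illustrative path
oversampled = torch.stack([augment(img) for _ in range(4)]) # 4 synthetic variants
```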

4. Discussion

SAVE consistently produced statistically significant improvements in F1 across all models, with YOLOv10 showing a notable increase of +10.3 percentage points. This finding highlights the effectiveness of snapshot-based narrowband contrast in enhancing detection performance. Unlike device-specific filters such as i-Scan or FICE, a regression-based framework attains comparable or superior lesion detection efficacy without requiring specialized hardware. These findings indicate that SAVE not only connects RGB and hyperspectral methodologies but also outperforms current pseudo-HSI techniques in practical detection applications. The results from this study indicate promising advancements in SC diagnosis; however, several limitations must be addressed to enable further improvements. First, the computational load associated with HSI, machine learning algorithms, and video analysis remains a major challenge for scaling these technologies to real-world applications [46]. Advanced computing resources such as GPUs, TPUs, and FPGAs are essential to reduce processing time and allow real-time diagnostics [47]. Moreover, the use of ensemble models combined with transfer learning enhances system performance by leveraging pretrained networks, thus avoiding the need for exhaustive training on large new datasets and improving model accuracy and reliability [48].

4.1. Uncertainty Analysis

A major limitation to consider relates to the dataset. This study was conducted using data from an open-source resource, the ISIC platform, which has broad demographic coverage. However, incorporating additional data from different clinical and geographical settings would enhance the generalizability of this model. Although the ISIC archive offers a valuable and meticulously annotated benchmark, dependence on a single public dataset may inadequately reflect the variability encountered in clinical practice, such as differences in imaging devices, lighting conditions, skin phototypes, and actual lesion presentations. Consequently, the performance of the model may vary when applied to smartphone-acquired images, cross-institutional dermoscopic systems, or patient demographics exhibiting a range of skin tones. Future work will include prospective validation on multicenter clinical cohorts, refinement and domain adaptation for smartphone-acquired images and underrepresented skin types, and a pilot implementation in dermatology clinics to assess real-time detection accuracy and workflow integration. Such expansion will improve the model’s adaptability and ensure broader applicability across diverse patient populations.

4.2. Limitations and Computational Cost

Another limitation involves the preprocessing of images. In this study, all images were resized to a standard resolution of 640 pixels to reduce computational complexity and enhance data consistency [49]. However, this resizing may result in the loss of critical details for SC analysis. Future research should investigate adaptive resolution techniques that preserve high-resolution features where necessary, balancing image quality with computational efficiency. With frame-by-frame processing, optimizing the quality and computational speed of video analysis may be possible by capturing minimal changes that static images may fail to capture. Despite its robust detection capabilities, the proposed methodology possesses two primary limitations: computational resources and algorithmic scope. Aiming to obtain more robust and less optimistic performance estimates, future studies should implement stratified k-fold or nested cross-validation across RGB and SAVE datasets. Distributed training frameworks will also be essential to manage the increased computational load. Additionally, a systematic ablation study will be conducted to evaluate the impact of individual components, such as SAVE’s spectral band selection, normalization pipeline, and network architecture, on detection performance. The current study was based primarily on static, visual-based SC identification; extending the data analysis to video formats could introduce novel possibilities for improving diagnostic accuracy. By tracking lesion variations over time, video-based analysis could help identify dynamic patterns indicative of disease progression, which may be particularly valuable in early cancer detection. Subsequent research will also examine the application of novel architectures such as SimPoolFormer, FDSSC, Tri-CNN, and other CNN models to further improve SC detection using SAVE-enhanced imaging [50,51,52].

4.3. Comparison to Alternative Spectral Modalities and Clinical Translation

MSI and HSI systems provide intricate spatiospectral tissue signatures; however, these systems depend on cumbersome scanning apparatuses or costly tunable filters, limiting their regular clinical implementation. In endoscopy, solutions such as Pentax i-Scan and Fujinon FICE use proprietary lookup tables on RGB sensors. However, they provide limited transparency and necessitate vendor-specific platforms. NBI is extensively validated and grounded in well-defined hemoglobin absorption peaks; however, it generally necessitates modified light sources. In contrast, SAVE holds a distinctive role: it emulates the vascular contrast of NBI through a learned regression model applied to standard RGB data. Thus, SAVE is entirely hardware-agnostic, capable of generating five-band narrowband outputs from a single image capture using conventional equipment. In addition to promoting early action, SAVE has the potential to improve patient outcomes by enabling thorough and dynamic examinations. The extent of diagnosis and optimization for clinical usage should receive increased attention in future research. Further development for static images and real-time video data is essential to achieving an accurate, responsive evaluation system that enhances diagnostic precision and clinical decision-making. As this technology matures, this approach is expected to become a vital tool in the medical field.

5. Conclusions

Skin disorders such as acrochordon, dermatofibroma, and lichenoid lesions can be effectively identified by incorporating HSI with advanced object detection algorithms, such as YOLOv11, YOLOv10, YOLOv9, YOLOv8, and YOLOv5. While all models demonstrated excellent performance, YOLOv10 achieved the best overall results across key evaluation metrics—accuracy, precision, recall, and F1 score—particularly for the dermatofibroma class. YOLOv9 and YOLOv8 also performed well, confirming the effectiveness of modern YOLO architectures in skin lesion detection. This study indicates that deploying these advanced YOLO models in conjunction with HSI can notably enhance the accuracy and efficiency of skin lesion detection. These methods hold strong potential for clinical applications, especially in cases where early diagnosis is critical to successful treatment outcomes. The integration of real-time object detection into dermatological imaging systems could assist clinicians in identifying skin abnormalities with remarkable speed and precision. Further studies should focus on finetuning these models with highly diverse patient datasets and exploring the application of HSI combined with YOLO models in diagnosing other types of skin cancers or skin conditions. Such advancements could enable timely diagnosis, leading to better-targeted treatment methods and ultimately improving patient survival rates when timely intervention is needed.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/bioengineering12070714/s1, Figure S1 represents the loss and precision of WLI and SAVE; Figure S2 represents the confusion matrix of WLI and SAVE; Figure S3 represents the F1–confidence curve of WLI and SAVE; Figure S4 represents the loss and precision of WLI and SAVE; Figure S5 represents the confusion matrix of WLI and SAVE; Figure S6 represents the F1–confidence curve of WLI and SAVE; Figure S7 represents the loss and precision of WLI and SAVE; Figure S8 represents the confusion matrix of WLI and SAVE; Figure S9 represents the F1–confidence curve of WLI and SAVE; Figure S10 represents the loss and precision of WLI and SAVE; Figure S11 represents the confusion matrix of WLI and SAVE; Figure S12 represents the F1–confidence curve of WLI and SAVE; Figure S13 represents the loss and precision of WLI and SAVE; Figure S14 represents the confusion matrix of WLI and SAVE; Figure S15 represents the F1–confidence curve of WLI and SAVE; Figure S16: comparison of WLI and SAVE results highlighting the contrast between the normal and the SAVE images; Figure S17: comparison of SSIM between the simulated NBI images and the WLI images of VCE and Olympus; Figure S18: comparison of entropy between the simulated NBI images and the WLI images, where (a) shows the entropy for Olympus endoscopy and (b) shows the entropy for the VCE camera; Figure S19: comparison of PSNR of the twenty randomly chosen images in Olympus and VCE; Figure S20: endoscopic imaging using three imaging techniques: (a) WLI, (b) NBI, (c) SAVE; Figure S21: the lighting spectrum difference between the Olympus WLI, Olympus NBI, and the Capsule WLI; Figure S22: the difference between the Olympus SAVE and the VCE-simulated NBI lighting, where (a) shows the difference between the Olympus SAVE and Olympus WLI and (b) shows the difference between VCE SAVE and VCE WLI images; Table S1: SSIM of twenty randomly chosen images in VCE and the Olympus endoscope; Table S2: entropy comparison of the WLI and NBI images in Olympus and VCE endoscopes; Table S3: comparison of PSNR of the twenty randomly chosen images in Olympus and VCE; Table S4: comparison of F1 scores of the different models.

Author Contributions

Conceptualization, N.-C.H., H.-C.W. and A.M.; data curation, N.-C.H. and A.M.; formal analysis, W.-Y.C., R.K., S.S. and A.M.; funding acquisition, N.-C.H., A.M. and H.-C.W.; investigation, R.K. and A.M.; methodology, S.S., H.-C.W. and A.M.; project administration, A.M. and H.-C.W.; resources, H.-C.W. and A.M.; software, W.-Y.C. and A.M.; supervision R.K. and H.-C.W.; validation, S.S. and R.K.; writing—original draft, S.S., R.K. and A.M.; writing—review and editing, R.K., A.M., W.-Y.C. and H.-C.W. All authors have read and agreed to the published version of the manuscript.

Funding

National Science and Technology Council, the Republic of China, under grants NSTC 113-2221-E-194-011-MY3, and the Dalin Tzu Chi Hospital, Buddhist Tzu Chi Medical Foundation-National Chung Cheng University Joint Research Program and Kaohsiung Armed Forces General Hospital Research Program KAFGH_D_114014 in Taiwan.

Institutional Review Board Statement

This study was conducted following the guidelines of the Declaration of Helsinki and approved by the Institutional Review Board of Dalin Tzu Chi General Hospital (B11302014) on 19 June 2024.

Informed Consent Statement

Written informed consent was waived in this study because of its retrospective, anonymized design.

Data Availability Statement

The data presented in this study are available in this article upon request to the corresponding author. The data are not publicly available due to participant privacy and confidentiality restrictions; they contain sensitive personal health information that cannot be openly shared.

Conflicts of Interest

Author Hsiang-Chen Wang was employed by the company Hitspectra Intelligent Technology Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Siegel, R.L.; Miller, K.D.; Fuchs, H.E.; Jemal, A. Cancer Statistics, 2021. CA Cancer J. Clin. 2021, 71, 7–33. [Google Scholar] [CrossRef]
  2. Rundle, C.W.; Militello, M.; Barber, C.; Presley, C.L.; Rietcheck, H.R.; Dellavalle, R.P. Epidemiologic Burden of Skin Cancer in the Us and Worldwide. Curr. Dermatol. Rep. 2020, 9, 309–322. [Google Scholar] [CrossRef]
  3. LeBoit, P.E. Pathology and Genetics of Tumours of the Skin: Who Classification of Tumours; IARC: Lyon, France, 2006; Volume 6. [Google Scholar]
  4. Berklite, L.; Ranganathan, S.; John, I.; Picarsic, J.; Santoro, L.; Alaggio, R. Fibrous Histiocytoma/Dermatofibroma in Children: The Same as Adults? Hum. Pathol. 2020, 99, 107–115. [Google Scholar] [CrossRef]
  5. Gaufin, M.; Michaelis, T.; Duffy, K. Cellular Dermatofibroma: Clinicopathologic Review of 218 Cases of Cellular Dermatofibroma to Determine the Clinical Recurrence Rate. Dermatol. Surg. 2019, 45, 1359–1364. [Google Scholar] [CrossRef]
  6. Kim, J.M.; Cho, H.J.; Moon, S.-H. Rare Experience of Keloidal Dermatofibroma of Forehead. Arch. Craniofacial Surg. 2018, 19, 72. [Google Scholar] [CrossRef]
  7. Kadakia, S.; Chernobilsky, B.; Iacob, C. Dermatofibroma of the Auricle. J. Drugs Dermatol. 2016, 15, 1270–1272. [Google Scholar]
  8. Lehmer, L.M.; Ragsdale, B.D. Digital Dermatofibromas-Common Lesion, Uncommon Location: A Series of 26 Cases and Review of the Literature. Dermatol. Online J. 2011, 17, 2. [Google Scholar] [CrossRef]
  9. Orzan, O.A.; Dorobanțu, A.M.; Gurău, C.D.; Ali, S.; Mihai, M.M.; Popa, L.G.; Giurcăneanu, C.; Tudose, I.; Bălăceanu, B. Challenging Patterns of Atypical Dermatofibromas and Promising Diagnostic Tools for Differential Diagnosis of Malignant Lesions. Diagnostics 2023, 13, 671. [Google Scholar] [CrossRef]
  10. Cazzato, G.; Colagrande, A.; Cimmino, A.; Marrone, M.; Stellacci, A.; Arezzo, F.; Lettini, T.; Resta, L.; Ingravallo, G. Granular Cell Dermatofibroma: When Morphology Still Matters. Dermatopathology 2021, 8, 371–375. [Google Scholar] [CrossRef] [PubMed]
  11. Tilly, J.J.; Drolet, B.A.; Esterly, N.B. Lichenoid Eruptions in Children. J. Am. Acad. Dermatol. 2004, 51, 606–624. [Google Scholar] [CrossRef] [PubMed]
  12. Shah, R.; Jindal, A.; Patel, N.M. Acrochordons as a Cutaneous Sign of Metabolic Syndrome: A Case—Control Study. Ann. Med. Health Sci. Res. 2014, 4, 202–205. [Google Scholar] [PubMed]
  13. Faiz, S.M.; Bhargava, A.; Srivastava, S.; Singh, H.; Goswami, D.; Bhat, P. Giant Acrochordon of Left Side Neck, a Rare Case Finding. Era’s J. Med. Res. 2020, 7, 260–261. [Google Scholar] [CrossRef]
  14. Mohamed, T. A Comparative Study of Cutaneous Manifestations in Obese Patients and Non-Obese Controls at Vims, Ballari. Ph.D. Thesis, Rajiv Gandhi University of Health Sciences, Bengaluru, India, 2018. [Google Scholar]
  15. Shrestha, P.; Poudyal, Y.; Rajbhandari, S.L. Acrochordons and Diabetes Mellitus: A Case Control Study. Nepal J. Dermatol. Venereol. Leprol. 2016, 13, 32–37. [Google Scholar] [CrossRef]
  16. Sergi, C.M.; Consolato, M. Pathology of Childhood Sergi, and Adolescence: An Illustrated Guide. Soft Tissue; Springer: Berlin/Heidelberg, Germany, 2020; pp. 1003–1094. [Google Scholar]
  17. Backer, E.L. Skin Tumors. In Family Medicine: Principles and Practice; Springer: Berlin/Heidelberg, Germany, 2022; pp. 1681–1705. [Google Scholar]
  18. Singh, B.S.; Tripathy, T.; Kar, B.R. Association of Skin Tag with Metabolic Syndrome and Its Components: A Case–Control Study from Eastern India. Indian Dermatol. Online J. 2019, 10, 284–287. [Google Scholar] [CrossRef]
  19. Sherin, N.; Khader, A.; Binitha, M.P.; George, B. Acrochordon as a Marker of Metabolic Syndrome—A Cross-Sectional Study from South India. J. Ski. Sex. Transm. Dis. 2023, 5, 40–46. [Google Scholar] [CrossRef]
  20. Johansen, T.H.; Møllersen, K.; Ortega, S.; Fabelo, H.; Garcia, A.; Callico, G.M.; Godtliebsen, F. Recent Advances in Hyperspectral Imaging for Melanoma Detection. Wiley Interdiscip. Rev. Comput. Stat. 2020, 12, e1465. [Google Scholar] [CrossRef]
  21. Huang, H.-Y.; Nguyen, H.-T.; Lin, T.-L.; Saenprasarn, P.; Liu, P.-H.; Wang, H.-C. Identification of Skin Lesions by Snapshot Hyperspectral Imaging. Cancers 2024, 16, 217. [Google Scholar] [CrossRef]
  22. Leon, R.; Martinez-Vega, B.; Fabelo, H.; Ortega, S.; Melian, V.; Castaño, I.; Carretero, G.; Almeida, P.; Garcia, A.; Quevedo, E.; et al. Non-Invasive Skin Cancer Diagnosis Using Hyperspectral Imaging for In-Situ Clinical Support. J. Clin. Med. 2020, 9, 1662. [Google Scholar] [CrossRef]
  23. Pathan, S.; Prabhu, K.G.; Siddalingaswamy, P.C. Techniques and Algorithms for Computer Aided Diagnosis of Pigmented Skin Lesions—A Review. Biomed. Signal Process. Control 2018, 39, 237–262. [Google Scholar] [CrossRef]
  24. Yoon, J. Hyperspectral Imaging for Clinical Applications. BioChip J. 2022, 16, 1–12. [Google Scholar] [CrossRef]
  25. Adesokan, M.; Alamu, E.O.; Otegbayo, B.; Maziya-Dixon, B. A Review of the Use of Near-Infrared Hyperspectral Imaging (NIR-HSI) Techniques for the Non-Destructive Quality Assessment of Root and Tuber Crops. Appl. Sci. 2023, 13, 5226. [Google Scholar] [CrossRef]
  26. Guilloteau, C.; Oberlin, T.; Berne, O.; Dobigeon, N. Hyperspectral and Multispectral Image Fusion Under Spectrally Varying Spatial Blurs—Application to High Dimensional Infrared Astronomical Imaging. IEEE Trans. Comput. Imaging 2020, 6, 1362–1374. [Google Scholar] [CrossRef]
  27. Guerri, M.F.; Distante, C.; Spagnolo, P.; Bougourzi, F.; Taleb-Ahmed, A. Deep Learning Techniques for Hyperspectral Image Analysis in Agriculture: A Review. ISPRS Open J. Photogramm. Remote Sens. 2024, 12, 100062. [Google Scholar] [CrossRef]
  28. Rehman, A.; Qureshi, S.A. A Review of the Medical Hyperspectral Imaging Systems and Unmixing Algorithms in Biological Tissues. Photodiagnosis Photodyn. Ther. 2021, 33, 102165. [Google Scholar] [CrossRef]
  29. Khan, U.; Paheding, S.; Elkin, C.P.; Devabhaktuni, V.K. Trends in Deep Learning for Medical Hyperspectral Image Analysis. IEEE Access 2021, 9, 79534–79548. [Google Scholar] [CrossRef]
  30. Gutiérrez-Gutiérrez, J.A.; Pardo, A.; Real, E.; López-Higuera, J.M.; Conde, O.M. Custom Scanning Hyperspectral Imaging System for Biomedical Applications: Modeling, Benchmarking, and Specifications. Sensors 2019, 19, 1692. [Google Scholar] [CrossRef]
  31. Thiele, S.T.; Bnoulkacem, Z.; Lorenz, S.; Bordenave, A.; Menegoni, N.; Madriz, Y.; Dujoncquoy, E.; Gloaguen, R.; Kenter, J. Mineralogical Mapping with Accurately Corrected Shortwave Infrared Hyperspectral Data Acquired Obliquely from UAVs. Remote Sens. 2021, 14, 5. [Google Scholar] [CrossRef]
  32. Cucci, C.; Picollo, M.; Chiarantini, L.; Uda, G.; Fiori, L.; De Nigris, B.; Osanna, M. Remote-Sensing Hyperspectral Imaging for Applications in Archaeological Areas: Non-Invasive Investigations on Wall Paintings and on Mural Inscriptions in the Pompeii site. Microchem. J. 2020, 158, 105082. [Google Scholar] [CrossRef]
  33. Aviara, N.A.; Liberty, J.T.; Olatunbosun, O.S.; Shoyombo, H.A.; Oyeniyi, S.K. Potential Application of Hyperspectral Imaging in Food Grain Quality Inspection, Evaluation and Control During Bulk Storage. J. Agric. Food Res. 2022, 8, 100288. [Google Scholar] [CrossRef]
  34. Faltynkova, A.; Johnsen, G.; Wagner, M. Hyperspectral Imaging as an Emerging Tool to Analyze Microplastics: A Systematic Review and Recommendations for Future Development. Microplastics Nanoplastics 2021, 1, 13. [Google Scholar] [CrossRef]
  35. He, Z.; Wang, P.; Liang, Y.; Fu, Z.; Ye, X.; Liu, A. Clinically Available Optical Imaging Technologies in Endoscopic Lesion Detection: Current Status and Future Perspective. J. Healthc. Eng. 2021, 2021, 7594513. [Google Scholar] [CrossRef] [PubMed]
  36. Yang, Q.; Liu, Z.; Sun, H.; Jiao, F.; Zhang, B.; Chen, J. A Narrative Review: Narrow-Band Imaging Endoscopic Classifications. Quant. Imaging Med. Surg. 2023, 13, 1138–1163. [Google Scholar] [CrossRef] [PubMed]
  37. Yi, D.; Kong, L.; Zhao, Y. Contrast-Enhancing Snapshot Narrow-Band Imaging Method for Real-Time Computer-Aided Cervical Cancer Screening. J. Digit. Imaging 2019, 33, 211–220. [Google Scholar] [CrossRef] [PubMed]
  38. Gono, K. Principles and History of NBI. In Atlas of Endoscopy with Narrow Band Imaging; Springer: Berlin/Heidelberg, Germany, 2015; pp. 3–10. [Google Scholar]
  39. Goyal, M.; Knackstedt, T.; Yan, S.; Hassanpour, S. Artificial Intelligence-Based Image Classification Methods for Diagnosis of Skin Cancer: Challenges and Opportunities. Comput. Biol. Med. 2020, 127, 104065. [Google Scholar] [CrossRef]
  40. Tejada-Casado, M.; Herrera, L.J.; Carrillo-Perez, F.; Ruiz-López, J.; Ghinea, R.I.; Pérez, M.M. Exploring the CIEDE2000 Thresholds for Lightness, Chroma, and Hue Differences in Dentistry. J. Dent. 2024, 150, 105327. [Google Scholar] [CrossRef]
  41. Ultralytics. Ultralytics YOLO11; Ultralytics: Frederick, MD, USA, 2024. [Google Scholar]
  42. Wang, A.; Chen, H.; Liu, L.; Chen, K.; Lin, Z.; Han, J.; Ding, G. YOLOv10: Real-Time End-to-End Object Detection. arXiv 2024, arXiv:2405.14458. [Google Scholar]
  43. Wang, C.-Y.; Yeh, I.-H.; Liao, H.-Y.M. YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information. arXiv 2024, arXiv:2402.13616. [Google Scholar]
  44. Yao, Q.; Zhuang, D.; Feng, Y.; Wang, Y.; Liu, J. Accurate Detection of Brain Tumor Lesions from Medical Images Based on Improved YOLOv8 Algorithm. IEEE Access 2024, 12, 144260–144279. [Google Scholar] [CrossRef]
  45. Bashir, S.; Qureshi, R.; Shah, A.; Fan, X.; Alam, T. YOLOv5-M: A Deep Neural Network for Medical Object Detection in Real-Time. In Proceedings of the 2023 IEEE Symposium on Industrial Electronics & Applications (ISIEA), Kuala Lumpur, Malaysia, 15–16 July 2023. [Google Scholar]
  46. Camastra, F.; Vinciarelli, A. Machine Learning for Audio, Image and Video Analysis: Theory and Applications; Springer: Berlin/Heidelberg, Germany, 2015. [Google Scholar]
  47. Ravikumar, A.; Sriraman, H.; Saketh, P.M.S.; Lokesh, S.; Karanam, A. Effect of Neural Network Structure in Accelerating Performance and Accuracy of a Convolutional Neural Network with Gpu/Tpu for Image Analytics. PeerJ Comput. Sci. 2022, 8, e909. [Google Scholar] [CrossRef]
  48. Mahmud, T.; Barua, K.; Barua, A.; Das, S.; Basnin, N.; Hossain, M.S.; Andersson, K.; Kaiser, M.S.; Sharmen, N. Exploring Deep Transfer Learning Ensemble for Improved Diagnosis and Classification of Alzheimer’s Disease. In Proceedings of the 16th International Conference on Brain Informatics (BI2023), Hoboken, NJ, USA, 1–3 August 2023. [Google Scholar]
  49. Mayya, V.; Sowmya Kamath, S.; Kulkarni, U.; Surya, D.K.; Acharya, U.R. An Empirical Study of Preprocessing Techniques with Convolutional Neural Networks for Accurate Detection of Chronic Ocular Diseases Using Fundus Images. Appl. Intell. 2022, 53, 1548–1566. [Google Scholar] [CrossRef]
  50. Jiang, X.; Hu, Z.; Wang, S.; Zhang, Y. Deep Learning for Medical Image-Based Cancer Diagnosis. Cancers 2023, 15, 3608. [Google Scholar] [CrossRef] [PubMed]
  51. Chen, Z.; Hao, Y.; Liu, Q.; Liu, Y.; Zhu, M.; Xiao, L. Deep Learning for Hyperspectral Image Classification: A Critical Evaluation via Mutation Testing. Remote Sens. 2024, 16, 4695. [Google Scholar] [CrossRef]
  52. Roy, S.K.; Jamali, A.; Chanussot, J.; Ghamisi, P.; Ghaderpour, E.; Shahabi, H. SimPoolFormer: A Two-Stream Vision Transformer for Hyperspectral Image Classification. Remote Sens. Appl. Soc. Environ. 2025, 37, 101478. [Google Scholar] [CrossRef]
Figure 1. Schematic of the entire study.
Figure 2. Comparative analysis of precision, recall, and F1 score for dermatofibroma detection using different object detection algorithms (YOLOv11, YOLOv10, YOLOv9, YOLOv8, and YOLOv5) across the two imaging modalities, WLI and SAVE, based on the validation results.
Figure 3. Comparative analysis of precision, recall, and F1 score for dermatofibroma detection using different object detection algorithms (YOLOv11, YOLOv10, YOLOv9, YOLOv8, and YOLOv5) across the two imaging modalities, WLI and SAVE, based on the testing results.
Figure 4. Detection results. WLI: (a) normal class, (b) bounding box for dermatofibroma, (c) bounding box for lichenoid, and (d) bounding box for acrochordon. SAVE: (e) normal class, (f) bounding box for dermatofibroma, (g) bounding box for lichenoid, and (h) bounding box for acrochordon.
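Figure 4 shows per-class bounding-box predictions on WLI and SAVE images. As a minimal, non-authoritative sketch of how such detections can be produced and visualized with the Ultralytics-style YOLO API, the snippet below assumes a hypothetical trained weights file (skin_lesion_yolo.pt) and a placeholder image name; it illustrates the general workflow rather than the authors' exact pipeline.

    # Minimal sketch: run a trained YOLO detector on one image and save the annotated result.
    # "skin_lesion_yolo.pt" and "lesion_sample.jpg" are hypothetical placeholder names.
    import cv2
    from ultralytics import YOLO

    model = YOLO("skin_lesion_yolo.pt")              # load trained detector weights
    results = model("lesion_sample.jpg", conf=0.25)  # single-image inference

    for r in results:
        for box in r.boxes:
            label = model.names[int(box.cls)]        # e.g., dermatofibroma, lichenoid, acrochordon
            print(label, round(float(box.conf), 3), box.xyxy[0].tolist())
        cv2.imwrite("lesion_annotated.jpg", r.plot())  # write the image with bounding boxes drawn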
Table 1. Dataset used in this study.
Lesion Type | Total Images | Training (70%) | Validation (20%) | Test (10%)
Acrochordon | 577 | 404 (70%) | 116 (20%) | 57 (10%)
Dermatofibroma | 821 | 575 (70%) | 164 (20%) | 82 (10%)
Lichenoid | 805 | 563 (70%) | 161 (20%) | 81 (10%)
Total | 2203 | 1542 (70%) | 441 (20%) | 220 (10%)
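The 70/20/10 partition in Table 1 can be reproduced with a standard stratified split. The sketch below only illustrates that ratio; it assumes scikit-learn and uses hypothetical placeholder lists for the image paths and labels, not the authors' data-handling code.

    # Illustrative 70/20/10 stratified split matching the proportions in Table 1.
    # image_paths and lesion_labels are hypothetical stand-ins for the real dataset index.
    from sklearn.model_selection import train_test_split

    image_paths = [f"img_{i:04d}.png" for i in range(2203)]
    lesion_labels = ["acrochordon"] * 577 + ["dermatofibroma"] * 821 + ["lichenoid"] * 805

    # First hold out 30% of the images, then split that hold-out 2:1 into validation (20%) and test (10%).
    train_x, hold_x, train_y, hold_y = train_test_split(
        image_paths, lesion_labels, test_size=0.30, stratify=lesion_labels, random_state=42)
    val_x, test_x, val_y, test_y = train_test_split(
        hold_x, hold_y, test_size=1/3, stratify=hold_y, random_state=42)

    print(len(train_x), len(val_x), len(test_x))  # approximately 1542 / 441 / 220 images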
Table 2. Validation results of multiple models.
Algorithm | Imaging Modality | Class | Precision in % | Recall in % | F1 in %
YOLOv11 | WLI | All | 71 | 62.9 | 66.71
YOLOv11 | WLI | Acrochordon | 62.8 | 40 | 48.87
YOLOv11 | WLI | Dermatofibroma | 82.3 | 89.7 | 85.84
YOLOv11 | WLI | Lichenoid | 67.9 | 59 | 63.14
YOLOv11 | SAVE | All | 74.3 | 54.2 | 62.68
YOLOv11 | SAVE | Acrochordon | 70.9 | 39.8 | 50.98
YOLOv11 | SAVE | Dermatofibroma | 86.1 | 76.3 | 80.9
YOLOv11 | SAVE | Lichenoid | 65.8 | 46.5 | 54.49
YOLOv10 | WLI | All | 79.2 | 58.6 | 67.36
YOLOv10 | WLI | Acrochordon | 69.3 | 34.2 | 45.8
YOLOv10 | WLI | Dermatofibroma | 90.9 | 83.3 | 86.93
YOLOv10 | WLI | Lichenoid | 77.4 | 58.2 | 66.44
YOLOv10 | SAVE | All | 79.2 | 58.3 | 67.16
YOLOv10 | SAVE | Acrochordon | 72.3 | 39.2 | 50.84
YOLOv10 | SAVE | Dermatofibroma | 90.1 | 81.4 | 85.53
YOLOv10 | SAVE | Lichenoid | 75.1 | 54.4 | 63.1
YOLOv9 | WLI | All | 75.8 | 67 | 71.13
YOLOv9 | WLI | Acrochordon | 62.8 | 48.8 | 54.92
YOLOv9 | WLI | Dermatofibroma | 87.9 | 90.4 | 89.13
YOLOv9 | WLI | Lichenoid | 76.8 | 61.9 | 68.55
YOLOv9 | SAVE | All | 86.4 | 56.1 | 68.03
YOLOv9 | SAVE | Acrochordon | 83.6 | 36.7 | 51.01
YOLOv9 | SAVE | Dermatofibroma | 87.8 | 79.2 | 83.28
YOLOv9 | SAVE | Lichenoid | 87.9 | 52.6 | 65.82
YOLOv8 | WLI | All | 78.2 | 60.8 | 68.41
YOLOv8 | WLI | Acrochordon | 64.4 | 32.8 | 43.46
YOLOv8 | WLI | Dermatofibroma | 89.5 | 87.5 | 88.49
YOLOv8 | WLI | Lichenoid | 80.8 | 61.9 | 70.1
YOLOv8 | SAVE | All | 81.2 | 57.9 | 67.6
YOLOv8 | SAVE | Acrochordon | 74.8 | 36.7 | 49.24
YOLOv8 | SAVE | Dermatofibroma | 88.2 | 79.9 | 83.85
YOLOv8 | SAVE | Lichenoid | 88.5 | 57.2 | 69.49
YOLOv5 | WLI | All | 70.7 | 62.8 | 66.52
YOLOv5 | WLI | Acrochordon | 57.8 | 40.5 | 47.63
YOLOv5 | WLI | Dermatofibroma | 85.1 | 84.3 | 84.7
YOLOv5 | WLI | Lichenoid | 69.1 | 63.4 | 66.13
YOLOv5 | SAVE | All | 71.6 | 59.3 | 64.87
YOLOv5 | SAVE | Acrochordon | 64.4 | 39.2 | 48.74
YOLOv5 | SAVE | Dermatofibroma | 81.8 | 81.5 | 81.65
YOLOv5 | SAVE | Lichenoid | 68.5 | 57.2 | 62.34
Table 3. Testing results of multiple machine learning models.
Algorithm | Imaging Modality | Class | Precision in % | Recall in % | F1 in %
YOLOv11 | WLI | All | 74.5 | 50.7 | 57.67
YOLOv11 | WLI | Acrochordon | 72.6 | 37.9 | 49.91
YOLOv11 | WLI | Dermatofibroma | 83.7 | 66.7 | 73.77
YOLOv11 | WLI | Lichenoid | 67.2 | 47.4 | 55.71
YOLOv11 | SAVE | All | 77.5 | 57.9 | 65.73
YOLOv11 | SAVE | Acrochordon | 58.1 | 43.3 | 49.69
YOLOv11 | SAVE | Dermatofibroma | 93.4 | 82.7 | 87.04
YOLOv11 | SAVE | Lichenoid | 80.9 | 47.6 | 60.17
YOLOv10 | WLI | All | 70 | 52.9 | 60.3
YOLOv10 | WLI | Acrochordon | 56.4 | 42.9 | 48.7
YOLOv10 | WLI | Dermatofibroma | 85.4 | 68.5 | 76
YOLOv10 | WLI | Lichenoid | 68 | 47.4 | 55.9
YOLOv10 | SAVE | All | 85 | 60.3 | 70.6
YOLOv10 | SAVE | Acrochordon | 75.7 | 49.7 | 60
YOLOv10 | SAVE | Dermatofibroma | 92 | 79.3 | 85.2
YOLOv10 | SAVE | Lichenoid | 87.3 | 52.1 | 65.3
YOLOv9 | WLI | All | 81.4 | 56.2 | 66.5
YOLOv9 | WLI | Acrochordon | 85 | 47.6 | 61
YOLOv9 | WLI | Dermatofibroma | 85 | 70.8 | 77.3
YOLOv9 | WLI | Lichenoid | 74.1 | 50.2 | 59.9
YOLOv9 | SAVE | All | 87.3 | 60.6 | 71.5
YOLOv9 | SAVE | Acrochordon | 79.1 | 50 | 61.3
YOLOv9 | SAVE | Dermatofibroma | 92.3 | 81.7 | 86.7
YOLOv9 | SAVE | Lichenoid | 90.3 | 50 | 64.4
YOLOv8 | WLI | All | 80.4 | 55.9 | 65.9
YOLOv8 | WLI | Acrochordon | 78.8 | 45.2 | 57.4
YOLOv8 | WLI | Dermatofibroma | 87.5 | 71.9 | 78.9
YOLOv8 | WLI | Lichenoid | 74.9 | 50.5 | 60.3
YOLOv8 | SAVE | All | 83.3 | 64.3 | 72.6
YOLOv8 | SAVE | Acrochordon | 80.5 | 56.8 | 66.6
YOLOv8 | SAVE | Dermatofibroma | 88.4 | 84.1 | 86.2
YOLOv8 | SAVE | Lichenoid | 80.9 | 52.1 | 63.4
YOLOv5 | WLI | All | 68 | 49.5 | 57.3
YOLOv5 | WLI | Acrochordon | 60.8 | 33.3 | 43
YOLOv5 | WLI | Dermatofibroma | 80.9 | 67.4 | 73.5
YOLOv5 | WLI | Lichenoid | 62.3 | 47.8 | 54.1
YOLOv5 | SAVE | All | 74.7 | 61.4 | 67.4
YOLOv5 | SAVE | Acrochordon | 63.2 | 52.3 | 57.2
YOLOv5 | SAVE | Dermatofibroma | 87.5 | 82.9 | 85.1
YOLOv5 | SAVE | Lichenoid | 73.4 | 49 | 58.8
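As a quick sanity check on the tabulated metrics, F1 is the harmonic mean of precision and recall. The short sketch below (plain Python, illustrative only) recomputes one entry of the testing table, the YOLOv10 SAVE "All" row, from its reported precision and recall.

    # F1 as the harmonic mean of precision and recall (values in percent).
    def f1_score(precision_pct: float, recall_pct: float) -> float:
        return 2 * precision_pct * recall_pct / (precision_pct + recall_pct)

    # Example: YOLOv10 with SAVE, "All" classes in the testing table (precision 85%, recall 60.3%).
    print(round(f1_score(85.0, 60.3), 1))  # ~70.6, in line with the tabulated F1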
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
