Article

Optimized Lung Nodule Classification Using CLAHE-Enhanced CT Imaging and Swin Transformer-Based Deep Feature Extraction

Dorsaf Hrizi *, Khaoula Tbarki and Sadok Elasmi
1 COSIM Laboratory, Higher School of Communication of Tunis, University of Carthage, Ariana 2083, Tunisia
2 National Institute of Technology and Science of Kef, University of Jendouba, El Kef 7100, Tunisia
3 LR-SITI, National Engineering School of Tunis (ENIT), University of Tunis El Manar, Tunis 2092, Tunisia
* Author to whom correspondence should be addressed.
J. Imaging 2025, 11(10), 346; https://doi.org/10.3390/jimaging11100346
Submission received: 21 September 2025 / Revised: 27 September 2025 / Accepted: 30 September 2025 / Published: 4 October 2025
(This article belongs to the Special Issue Advancements in Imaging Techniques for Detection of Cancer)

Abstract

Lung cancer remains one of the most lethal cancers globally, and its early detection is vital to improving survival rates. In this work, we propose a hybrid computer-aided diagnosis (CAD) pipeline for lung cancer classification using Computed Tomography (CT) scan images. The proposed CAD pipeline integrates ten image preprocessing techniques, ten pretrained deep learning models for feature extraction (including convolutional neural networks and transformer-based architectures), and four classical machine learning classifiers. Unlike traditional end-to-end deep learning systems, our approach decouples feature extraction from classification, enhancing interpretability and reducing the risk of overfitting. A total of 400 model configurations were evaluated to identify the optimal combination. The proposed approach was evaluated on the publicly available Lung Image Database Consortium and Image Database Resource Initiative dataset, which comprises 1018 thoracic CT scans annotated by four thoracic radiologists. For the classification task, the dataset included 6568 images labeled as malignant and 4849 labeled as benign. Experimental results show that the best-performing pipeline, combining Contrast Limited Adaptive Histogram Equalization, Swin Transformer feature extraction, and eXtreme Gradient Boosting, achieved an accuracy of 95.8%.

1. Introduction

Lung cancer is one of the most aggressive and deadly cancers, responsible for 2.5 million new cases and 1.8 million deaths annually, making it the leading cause of cancer mortality worldwide [1,2]. Its poor prognosis results from late diagnosis, driven by rapid progression, early metastasis, and nonspecific symptoms. Early detection is therefore essential, as it greatly improves prognosis.
Computed tomography (CT) is the primary tool for detecting pulmonary nodules. However, a radiologist’s interpretation is constrained by fatigue, varying expertise, and image complexity [3]. Computer-aided diagnosis (CAD) systems address these challenges by providing objective and rapid assessments. Recent advances in deep learning, particularly convolutional neural networks (CNNs) and transfer learning, have improved the performance of CAD by enabling automated feature extraction and leveraging pre-trained models on large datasets.
Nevertheless, the high variability of nodules, their similarity to benign structures, and the integration of segmentation into classification remain major obstacles. To overcome these challenges, this study presents a hybrid CAD system combining feature extraction through transfer learning and machine learning (ML) classifiers. This system is preceded by a preprocessing step, which improves image quality, reduces noise, and standardizes input data, thereby improving classification accuracy and efficiency. The proposed system aims to facilitate reliable early diagnosis of lung cancer in clinical settings.
The remainder of this paper is structured as follows. Section 2 reviews recent literature on lung cancer classification; Section 3 details the proposed methodology; Section 4 and Section 5 present the experimental results and discussion; and Section 6 concludes the study.

2. Related Works

Recent advances in deep learning (DL) and machine learning have significantly improved medical image analysis, particularly for lung cancer detection and classification. Various approaches combine deep architectures, transfer learning, and optimization strategies to enhance diagnostic accuracy.
Dechao Chen et al. [4] applied a CNN optimized with the Beetle Antennae Search algorithm to cerebral hemorrhage diagnosis, illustrating the benefit of optimization techniques. Vijayan et al. [5] evaluated six DL models (AlexNet, GoogleNet, ResNet, Inception V3, EfficientNet B0, and SqueezeNet), finding that GoogleNet with the Adam optimizer achieved the highest accuracy, 92.08%.
Kumar et al. [6] developed PneumoNet, an ensemble model for pneumothorax detection, reaching 98.41% accuracy. Nawreen et al. [7] highlighted hybrid pipelines combining edge detection, thresholding, and SVMs for tumor assessment. Ayad, Al-Jamimi, and El Kheir [8] proposed a hybrid RFE-SVM + XGBoost model, achieving 100% accuracy on two datasets, while Wang et al. [9] used residual networks with transfer learning for lung cancer subtypes, reaching 85.71% accuracy. Other notable studies include Sari et al. [10] (modified ResNet-50, 93.33% accuracy), Bakchy et al. [11] (lightweight CNN with Grad-CAM, 99.48% accuracy), Raza et al. [12] (LungEffNet using EfficientNet variants), and Fan and Bu [13] (DenseNet-121, 93.7% accuracy).
Hrizi et al. [14,15,16] developed a series of optimized CAD pipelines, culminating in a lightweight model achieving 97.06% accuracy in under 0.25 s. Idrees et al. [17] used a marker-controlled watershed algorithm for ROI identification, achieving 88.5% accuracy.
Xu et al. [18] proposed DCSAU-Net, a compact split-attention U-Net that preserves multiscale semantic features, outperforming state-of-the-art methods on CVC-ClinicDB, ISIC-2018, and BraTS-2021 in Dice score and mean IoU.
Liu et al. [19] developed a lightweight 3D CNN with attention mechanisms to classify lung nodule malignancy from CT scans, incorporating nodules and their fibrotic microenvironment, achieving 80.84% accuracy and an AUC of 0.89. Dhiaa and Awad [20] integrated transcriptomic data with XGBoost to predict postsurgical lung cancer recurrence, demonstrating the potential of multi-omics approaches. Li et al. [21] introduced a Swin Transformer–based dual-channel model combining CT and histopathological data to predict bone metastasis, achieving an AUC of 0.966 and outperforming ResNet50 baselines.
Despite these advances, challenges remain in generalization, model complexity, and clinical applicability. Many systems are dataset-specific and lack efficient integration of segmentation and classification. Few approaches combine transfer learning with lightweight classifiers for real-time use. To address these gaps, our study introduces a hybrid CAD system that unifies transfer learning–based feature extraction with classical ML classifiers, aiming for accurate and computationally efficient clinical diagnosis.

3. Lung Cancer Detection System

This work proposes a modular system for lung cancer classification, aimed at distinguishing benign from malignant pulmonary cases using chest CT images. The pipeline comprises three principal stages: (1) data preparation, (2) deep feature extraction, and (3) classification via machine learning models. Throughout these stages, various techniques were systematically evaluated to identify the optimal configuration for accurate classification. Overall, four hundred distinct combinations were examined.
The overall architecture of our proposed methodology is illustrated in Figure 1.

3.1. Dataset

The publicly available Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI) dataset from The Cancer Imaging Archive was used in this study. This collection contains 1018 thoracic CT scans, each annotated by four thoracic radiologists through a rigorous two-phase process. The first phase was conducted in a blinded manner to identify lung nodules, while the second phase involved an unblinded review to refine the assessments, deliberately preserving inter-observer variability.

3.2. Dataset Preparation

The dataset preparation process comprised two main steps: labeling images based on radiologists' scores (Section 3.2.1) and preprocessing to enhance image quality for subsequent analysis (Section 3.2.2).

3.2.1. Labeling Strategy

The labeling strategy was based on the malignancy scores provided by radiologists in the LIDC-IDRI dataset. Nodules with an average score of 2 or lower were classified as benign, whereas those with an average score of 4 or higher were classified as malignant. Nodules with average scores between 2 and 4 were excluded to avoid ambiguity. At the patient (scan) level, the following rules were applied: if all nodules in a scan were benign, the case was labeled benign; if at least one nodule was malignant, the entire scan was labeled malignant.
This labeling strategy ensured consistency between nodule-level and patient-level annotations, while also reflecting clinical practice, where the presence of a single malignant lesion determines the overall diagnosis.
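For illustration, this rule set is compact enough to express directly in code. The sketch below is a minimal Python rendering under the stated thresholds; the table layout, column names, and helper functions are hypothetical, not taken from the LIDC-IDRI tooling.

```python
import pandas as pd

def label_nodule(avg_malignancy: float) -> str | None:
    """Map an averaged radiologist malignancy score (1-5) to a class label.

    Averages <= 2 are benign, >= 4 are malignant; ambiguous mid-range
    scores are excluded (None).
    """
    if avg_malignancy <= 2:
        return "benign"
    if avg_malignancy >= 4:
        return "malignant"
    return None  # excluded to avoid ambiguity

def label_scan(nodule_labels: list[str]) -> str:
    """A scan is malignant if any retained nodule is malignant."""
    return "malignant" if "malignant" in nodule_labels else "benign"

# Hypothetical per-nodule annotation table.
annotations = pd.DataFrame({
    "scan_id": [1, 1, 2],
    "avg_malignancy": [1.5, 4.25, 2.0],
})
annotations["label"] = annotations["avg_malignancy"].map(label_nodule)
kept = annotations.dropna(subset=["label"])
scan_labels = kept.groupby("scan_id")["label"].agg(lambda s: label_scan(list(s)))
print(scan_labels)  # scan 1 -> malignant, scan 2 -> benign
```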

3.2.2. Preprocessing

To improve the quality and diagnostic value of the medical images prior to feature extraction, a comprehensive set of ten preprocessing techniques was applied. These techniques, selected for their prevalence in the literature, clinical relevance, and complementary technical roles, aim to enhance contrast, suppress noise, correct imaging artifacts, and emphasize relevant anatomical structures. In this section, we categorize the preprocessing techniques into four functional groups: histogram-based enhancement, intensity and illumination correction, edge and structure enhancement, and noise reduction and filtering.
(a)
Histogram-Based Enhancement
Two prominent techniques in this category are Histogram Equalization (HE) and Contrast Limited Adaptive Histogram Equalization (CLAHE), both of which have shown significant potential for improving the quality and diagnostic value of medical images.
HE: Redistributes pixel intensity values across the histogram to enhance global contrast, improving the visibility of overall features, as in the work of Sunanda and Rani [22].
CLAHE: Unlike standard HE, CLAHE operates on small image tiles and limits contrast amplification to avoid noise, highlighting local structures such as tissue boundaries [23].
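Both techniques are available in standard imaging libraries. The snippet below is a minimal sketch using OpenCV; the clip limit and tile size are common defaults, not settings reported in this study.

```python
import cv2

# Load an 8-bit grayscale CT slice (the path is illustrative).
img = cv2.imread("ct_slice.png", cv2.IMREAD_GRAYSCALE)

# Global histogram equalization (HE): redistributes intensities over the
# whole image to stretch global contrast.
he = cv2.equalizeHist(img)

# CLAHE: equalizes small tiles independently and clips each tile's
# histogram (clipLimit) so local noise is not over-amplified.
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(img)
```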
(b)
Intensity and Illumination Correction
Techniques under this category aim to correct uneven illumination and intensity artifacts, which are common in medical imaging and can hinder accurate analysis.
Logarithmic Transformation: Enhances low-intensity pixels, emphasizing faint structures and low-contrast details, as in the Domain-Aware Adaptive Logarithmic Transformation of Fang and Feng [24].
Intensity Inhomogeneity Correction: Compensates for spatial intensity variations (bias fields) in MRI, improving contrast uniformity and supporting tissue segmentation or lesion detection [25].
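As a worked example of the first technique, the logarithmic mapping s = c · log(1 + r) can be sketched in a few lines; the scaling constant c simply stretches the output to the full display range. Bias-field correction, by contrast, typically relies on dedicated tools (e.g., N4-style algorithms) and is not reproduced here.

```python
import numpy as np

def log_transform(img: np.ndarray, out_max: float = 255.0) -> np.ndarray:
    """Apply s = c * log(1 + r): expands dark intensities and compresses
    bright ones, making faint structures more visible.

    Assumes a non-constant image so that img.max() > 0.
    """
    img = img.astype(np.float64)
    c = out_max / np.log1p(img.max())  # scale output to [0, out_max]
    return c * np.log1p(img)
```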
(c)
Edge and Structure Enhancement
Edge and structure enhancement techniques aim to highlight important anatomical contours and assist in the preliminary localization of regions of interest, which are crucial for subsequent diagnostic tasks.
Edge and Contour Enhancement: Filters and algorithms highlight anatomical boundaries without increasing noise [26].
Preliminary Segmentation: Thresholding separates regions of interest from the background to guide further analysis [27].
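A hedged sketch of these two steps is shown below, using unsharp masking for contour enhancement and Otsu's method for preliminary thresholding; the study does not specify the exact filters, so these are representative choices.

```python
import numpy as np
from skimage import filters

def enhance_and_segment(img: np.ndarray):
    """img: grayscale slice scaled to [0, 1]."""
    smooth = filters.gaussian(img, sigma=1.0)
    # Unsharp masking: add back the high-frequency residual to sharpen
    # anatomical contours without a separate edge map.
    sharpened = np.clip(img + 1.5 * (img - smooth), 0.0, 1.0)
    # Preliminary segmentation: Otsu picks a global threshold separating
    # bright structures from the background.
    mask = sharpened > filters.threshold_otsu(sharpened)
    return sharpened, mask
```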
(d)
Noise Reduction and Filtering
Effective noise suppression is a crucial preprocessing step in medical image analysis, and several filtering techniques including linear, non-linear, and multiscale methods are commonly used to address this challenge.
Gaussian Filter: Suppresses high-frequency noise while preserving structure [28].
Median Filter: Removes salt-and-pepper noise without blurring edges [29].
Anisotropic Filtering: Smooths images while preserving edges through gradient-based diffusion [30].
Wavelet-Based Denoising: Multi-resolution transforms separate noise from signal, allowing scale-specific noise suppression while maintaining edges [31].
Each technique was evaluated independently to assess its impact on image clarity, contrast, and downstream model performance, guiding the selection of the most effective preprocessing combination.
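For illustration, the four filter families can be exercised side by side as sketched below: the Gaussian, median, and wavelet calls use SciPy and scikit-image, while anisotropic diffusion is written out as a minimal Perona–Malik loop. Parameter values are illustrative, not those used in the study.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, median_filter
from skimage.restoration import denoise_wavelet

def perona_malik(img, n_iter=10, kappa=30.0, gamma=0.1):
    """Minimal anisotropic (Perona-Malik) diffusion: smooths flat regions
    while the conduction term suppresses diffusion across strong edges.
    Boundaries wrap via np.roll, which is acceptable for a sketch."""
    img = img.astype(np.float64)
    for _ in range(n_iter):
        diffs = [np.roll(img, shift, axis) - img
                 for axis in (0, 1) for shift in (-1, 1)]
        img += gamma * sum(np.exp(-(d / kappa) ** 2) * d for d in diffs)
    return img

ct = np.random.rand(512, 512)           # stand-in for a CT slice
gauss = gaussian_filter(ct, sigma=1.0)  # linear low-pass smoothing
med = median_filter(ct, size=3)         # removes salt-and-pepper noise
aniso = perona_malik(ct)                # edge-preserving diffusion
wave = denoise_wavelet(ct)              # scale-specific wavelet shrinkage
```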

3.3. Feature Extraction

Following the preprocessing step, features were extracted using ten transfer learning architectures: Residual Network (ResNet50), InceptionV3, Densely Connected Convolutional Network (DenseNet121), EfficientNet (B0, B1, B3), Swin Transformer, Denoising Autoencoder, Variational Autoencoder, and Convolutional Autoencoder (CAE). Input images were resized to match model requirements, and feature vectors were extracted from the penultimate layer for subsequent machine learning classification. This setup allowed evaluation of how different preprocessing and feature extraction combinations affect the classification of benign and malignant cases. ResNet50 uses residual connections, InceptionV3 captures multi-scale features, DenseNet121 strengthens inter-layer information flow, and the EfficientNet variants balance depth, width, and resolution. The Denoising and Variational Autoencoders enhance robustness and generalization, while the CAE captures spatial features in a compact latent representation.
Although ten deep learning models were explored for feature extraction, we describe in detail the architectures of the Swin Transformer and the CAE, as they combine strong performance with distinctive design principles and were among the most successful models in our experiments. The Swin Transformer serves as a representative example of the feature extraction process in this study, while the CAE demonstrates strengths in unsupervised spatial feature learning. The other architectures used in this study are well established and extensively described in the literature. For instance, ResNet50 [32], InceptionV3 [33], DenseNet121 [34], and EfficientNet (B0, B1, and B3) [35] are widely adopted convolutional neural networks, especially in transfer learning contexts. The Denoising Autoencoder and Variational Autoencoder are also standard architectures in unsupervised learning and have been applied successfully in medical imaging [36]. The CAE combines spatial feature extraction with compact latent encoding through a bottleneck structure [37]. Readers are referred to relevant studies, including [38], which illustrate similar use of transfer learning and CNN-based models in medical diagnosis tasks.
(a)
Swin Transformer
To ensure compatibility with the Swin Transformer model pre-trained on ImageNet, grayscale images were converted to three-channel RGB images by duplicating the single channel.
The architecture, illustrated in Figure 2, processes images in four stages. In the first stage, the input RGB image (224 × 224) is split into non-overlapping 4 × 4 patches, resulting in 56 × 56 patches with 48 channels each. Next, patches are organized into windows where self-attention is applied locally to capture spatial dependencies. The third stage shifts window positions to create overlaps, enabling interactions between adjacent windows. Finally, features are progressively merged in a hierarchical aggregation to combine local details with global context.
This hierarchical structure maintains computational efficiency while progressively refining feature extraction.
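A sketch of this extraction step is given below, assuming the timm implementation of a Swin-Base backbone (the paper does not name a specific library). With the classification head removed, the pooled output is the 1024-dimensional vector used downstream; ImageNet mean/std normalization is omitted for brevity.

```python
import numpy as np
import timm
import torch

# Pretrained Swin-Base with the classifier removed (num_classes=0):
# the model then returns pooled features, 1024-dimensional for this variant.
model = timm.create_model("swin_base_patch4_window7_224",
                          pretrained=True, num_classes=0)
model.eval()

def extract_features(gray: np.ndarray) -> np.ndarray:
    """gray: preprocessed slice scaled to [0, 1] and resized to 224 x 224."""
    x = torch.from_numpy(gray).float()
    x = x.unsqueeze(0).repeat(3, 1, 1)  # duplicate channel: grayscale -> RGB
    x = x.unsqueeze(0)                  # add batch dim -> (1, 3, 224, 224)
    with torch.no_grad():
        feats = model(x)                # shape (1, 1024)
    return feats.squeeze(0).numpy()
```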
(b)
Convolutional Autoencoder
The CAE is specifically designed for high-resolution grayscale CT images (512 × 512 pixels) and is trained in an unsupervised manner. It consists of two main parts:
The encoder captures hierarchical spatial features and compresses them into a compact latent space. Its architecture consists of three convolutional blocks. Each block uses increasing filter sizes, starting with 32, then 64, and finally 128. Following each convolutional block, a 2 × 2 max-pooling layer with same padding is applied. This progressive reduction in spatial resolution helps to preserve important anatomical details and results in a bottleneck feature map of 64 × 64 × 128. This final feature map effectively encodes both local texture and global structural information.
The decoder is responsible for reconstructing the input from the compact latent representation generated by the encoder. Its architecture mirrors the encoder’s, using upsampling and convolutional layers to reverse the encoding process and restore the original image.
For feature extraction, only the encoder is used. The 64 × 64 × 128 bottleneck is flattened to form a vector of 524,288 features, which is then stored in CSV format and passed to classical classifiers. The architecture of the CAE used in this study is illustrated in Figure 3.
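A minimal Keras sketch of this architecture is given below. The 3 × 3 kernels, ReLU activations, and training loss are assumptions, since the text specifies only the filter counts, the 2 × 2 pooling, and the bottleneck shape.

```python
from tensorflow import keras
from tensorflow.keras import layers

inp = keras.Input(shape=(512, 512, 1))

# Encoder: three conv blocks with 32 -> 64 -> 128 filters, each followed
# by 2x2 max-pooling, giving a 64 x 64 x 128 bottleneck.
x = layers.Conv2D(32, 3, activation="relu", padding="same")(inp)
x = layers.MaxPooling2D(2, padding="same")(x)
x = layers.Conv2D(64, 3, activation="relu", padding="same")(x)
x = layers.MaxPooling2D(2, padding="same")(x)
x = layers.Conv2D(128, 3, activation="relu", padding="same")(x)
bottleneck = layers.MaxPooling2D(2, padding="same")(x)  # (64, 64, 128)

# Decoder: mirrors the encoder with upsampling to reconstruct the input.
y = layers.Conv2D(128, 3, activation="relu", padding="same")(bottleneck)
y = layers.UpSampling2D(2)(y)
y = layers.Conv2D(64, 3, activation="relu", padding="same")(y)
y = layers.UpSampling2D(2)(y)
y = layers.Conv2D(32, 3, activation="relu", padding="same")(y)
y = layers.UpSampling2D(2)(y)
out = layers.Conv2D(1, 3, activation="sigmoid", padding="same")(y)

autoencoder = keras.Model(inp, out)
autoencoder.compile(optimizer="adam", loss="mse")

# For feature extraction only the encoder is kept; flattening the
# bottleneck yields 64 * 64 * 128 = 524,288 features per image.
encoder = keras.Model(inp, layers.Flatten()(bottleneck))
```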

3.4. Classification

The feature vectors extracted from the deep learning models were classified using four supervised machine learning algorithms, selected for their proven efficiency in handling high-dimensional data and their frequent use in medical image classification.
Support Vector Machine (SVM) is well suited for datasets with high-dimensional feature spaces, as it identifies the optimal hyperplane maximizing class separation. Its robustness and generalization make it reliable for medical decision making.
Random Forest (RF) builds multiple decision trees on random subsets of data and features, aggregating outputs via majority voting. This ensemble approach reduces overfitting and handles noisy or redundant features effectively.
Decision Tree (DT) is simpler and more interpretable than ensemble methods, providing rapid inference. Although prone to overfitting with high-dimensional data, it serves as a useful baseline.
eXtreme Gradient Boosting (XGBoost) sequentially constructs trees, optimizing performance by correcting previous errors. Its regularization and efficiency make it a strong alternative to RF and SVM.
Each classifier was tasked with distinguishing between two lung cancer categories: benign and malignant. To ensure fairness and reproducibility, stratified train–test splits were employed to preserve class distributions.
Performance was primarily assessed using classification accuracy, which reflects the proportion of correctly predicted samples. Although additional metrics such as precision and recall provide further clinical insights, accuracy was chosen here to enable consistent comparison across all preprocessing, feature extraction, and classification configurations.
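A hedged sketch of this stage using scikit-learn and XGBoost follows; the synthetic features stand in for the extracted deep feature vectors, and all hyperparameters are library defaults rather than tuned values.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from xgboost import XGBClassifier

# Stand-in for deep feature vectors (e.g., 1024-dim Swin features);
# y encodes 0 = benign, 1 = malignant.
X, y = make_classification(n_samples=500, n_features=64, random_state=0)

# Stratified split preserves the benign/malignant class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

classifiers = {
    "SVM": SVC(),
    "Random Forest": RandomForestClassifier(n_estimators=100),
    "Decision Tree": DecisionTreeClassifier(),
    "XGBoost": XGBClassifier(eval_metric="logloss"),
}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    print(name, accuracy_score(y_test, clf.predict(X_test)))
```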

4. Results

This section presents the outcomes of the experimental pipeline described previously. The performance of each classification model was evaluated across the various combinations of preprocessing techniques and feature extraction methods. The results are organized to highlight the influence of (1) data preparation, (2) feature extraction models, and (3) classification algorithms on the accuracy of medical image classification into benign or malignant cases.

4.1. Dataset Preparation Results

The following two subsections present the results of labeling and preprocessing.

4.1.1. Labeling Strategy Results

After applying the labeling strategy described in Section 3.2.1, the initial set of annotated images from the LIDC-IDRI dataset was transformed into a consistent binary classification problem. The distribution of images across the different categories is summarized in Table 1.

4.1.2. Preprocessing Results

Among the various preprocessing techniques evaluated in our study, CLAHE demonstrated the most effective results. CLAHE significantly improved the visibility of structural details within the CT images by enhancing local contrast, particularly in regions with low intensity variation. One of the key advantages of CLAHE is its ability to normalize image intensities across different scans, which helps to mitigate the impact of illumination inconsistencies and intensity inhomogeneities.
This normalization effect is particularly beneficial in the context of transfer learning, as it ensures that the input features are more consistent and representative, thereby facilitating more robust and accurate feature extraction by deep learning models. Consequently, CLAHE was selected as the optimal preprocessing method for our classification pipeline. The innovative aspect of this study lies in the exhaustive and systematic evaluation of the combined effect of the ten preprocessing techniques (alongside the unprocessed baseline) on image quality and on the performance of feature extraction and classification models. This comprehensive approach allows precise identification of the most effective combinations, which is rarely explored in the current literature.
Furthermore, we visually compared the outcomes of all preprocessing techniques on three representative samples from the dataset in Table 2. This visual analysis helped to qualitatively validate the effectiveness of CLAHE in enhancing image quality across different pathological conditions.
To quantitatively assess the contribution of each preprocessing method to classification performance, we computed the average accuracy achieved across all combinations of feature extractors and classifiers for each technique. Table 3 summarizes these average accuracies. This analysis supports the visual findings and further highlights the superiority of CLAHE in consistently enhancing downstream classification results.

4.2. Feature Extraction Results

In our study, we evaluated several deep learning models for feature extraction, including ResNet50, InceptionV3, DenseNet121, and the Swin Transformer. Among these, the Swin Transformer was selected for detailed analysis due to its strong ability to model both local and global image dependencies.
Unlike traditional convolutional neural networks that operate with fixed receptive fields, the Swin Transformer uses a hierarchical architecture with shifted windows, allowing it to efficiently capture fine-grained local structures as well as broader contextual information. This is particularly advantageous in medical imaging tasks such as lung cancer detection, where subtle texture patterns and spatial relationships between anatomical regions are critical. The Swin Transformer’s capacity to maintain high-resolution feature representations while progressively aggregating context makes it especially powerful for extracting rich and discriminative features from CT images.
After applying the CLAHE preprocessing technique, each CT scan image was resized to a standard dimension of 512 × 512 pixels to ensure uniformity across the dataset, and subsequently to each model's required input size (224 × 224 for the Swin Transformer). When processed through the Swin Transformer model, each preprocessed image yielded a feature vector of 1024 dimensions, capturing both local and global characteristics of the lung region. In total, we obtained 11,417 labeled samples corresponding to the two classes: malignant and benign. These extracted features, illustrated in Figure 4, served as the input for the subsequent classification stage.

4.3. Classification Results

In the binary classification task distinguishing benign from malignant lung nodules, the XGBoost classifier demonstrated the strongest performance. Its gradient boosting framework builds decision trees sequentially to minimize classification error, effectively capturing complex feature interactions while controlling overfitting through regularization. This makes XGBoost particularly well suited for structured features extracted from deep learning models.
Random Forest ranked second, benefiting from its ensemble of decision trees that reduce variance and improve generalization. However, its averaging nature can smooth critical decision boundaries, which may slightly limit performance in subtle distinctions between benign and malignant nodules.
Decision Tree showed moderate performance due to its limited ability to model complex patterns, while SVM, despite its theoretical strength in high-dimensional spaces, achieved the lowest accuracy in this binary setting.
Table 4 shows that the XGBoost classifier achieves outstanding class-specific performance in differentiating benign and malignant lung nodules. For the Benign class, the model achieved a precision of 91.0%, a recall of 93.0%, and an F1-score of 92.0%, indicating that most nodules predicted as benign were correct and that the majority of benign nodules were successfully identified. For the Malignant class, the classifier showed even higher performance, with a precision of 97.0%, a recall of 98.0%, and an F1-score of 97.5%, reflecting strong reliability in detecting malignant tumors. Overall, these results highlight XGBoost’s robustness in handling class imbalance and maintaining high diagnostic accuracy, making it highly effective for binary classification of lung nodules.
(a)
Statistical Significance Analysis
To determine whether the observed differences in F1-scores between the evaluated configurations were statistically significant, we performed a two-step non-parametric analysis using the five-fold cross-validation results reported in Table 5 for all CLAHE-based configurations.
First, a Friedman test was applied to the complete set of configurations (all combinations of feature extraction models and classifiers using CLAHE preprocessing). The null hypothesis stated that all models had equal median performance. The Friedman test revealed a statistically significant difference among the configurations (p < 0.001), indicating that at least one configuration outperformed the others.
Second, post hoc pairwise comparisons were conducted using the Wilcoxon signed-rank test with Bonferroni correction to control for multiple comparisons. In this step, the best-performing configuration (CLAHE + Swin Transformer + XGBoost) was compared individually to each of the other configurations. The results are presented in Table 6. The proposed CLAHE + Swin Transformer + XGBoost pipeline achieved significantly higher F1-scores than all other configurations (p < 0.05, and p < 0.01 in most comparisons).
These results provide strong statistical evidence that the proposed configuration not only achieves the highest average F1-score but also delivers a performance advantage that is statistically significant compared to most alternative pipelines tested in this study.
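For reference, a minimal sketch of this two-step procedure with SciPy is shown below; the score matrix is a random stand-in for the per-fold F1-scores, and column 0 plays the role of the best configuration.

```python
import numpy as np
from scipy.stats import friedmanchisquare, wilcoxon

rng = np.random.default_rng(0)
scores = rng.random((5, 8))  # rows = CV folds, columns = configurations

# Step 1 - Friedman test: do the configurations differ in median performance?
stat, p = friedmanchisquare(*[scores[:, j] for j in range(scores.shape[1])])
print(f"Friedman: chi2 = {stat:.3f}, p = {p:.4f}")

# Step 2 - post hoc Wilcoxon signed-rank tests of the best configuration
# against each alternative, with Bonferroni correction.
n_comparisons = scores.shape[1] - 1
for j in range(1, scores.shape[1]):
    _, p_raw = wilcoxon(scores[:, 0], scores[:, j])
    p_corr = min(1.0, p_raw * n_comparisons)
    print(f"config 0 vs {j}: corrected p = {p_corr:.4f}")
```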

5. Discussion

The experimental results of our study demonstrate that the proposed hybrid pipeline achieved a classification accuracy of 95.8% on the LIDC-IDRI dataset, confirming its competitiveness with current state-of-the-art methods in lung cancer detection. Notably, the ExtRanFS framework [39] and the attention-enhanced InceptionNeXt model [40] used the IQ-OTH/NCCD dataset and reported accuracies of 99.09% and 99.54%, respectively. While these approaches rely on tightly coupled end-to-end architectures combining CNNs, Vision Transformers, and tree-based or deep classifiers, our method introduces a modular design that decouples preprocessing, feature extraction, and classification. This strategy provides greater interpretability, robustness, and the flexibility to explore a wide range of configurations for optimized performance.
Unlike these methods, which focus on specific architectures or fused attention mechanisms, our pipeline evaluates 400 distinct combinations, integrating 10 preprocessing techniques, 10 pretrained deep models (including CNNs and transformer-based architectures), and 4 classical machine learning classifiers. This comprehensive and systematic exploration enabled the identification of the best-performing configuration (CLAHE + Swin Transformer + XGBoost), while also offering valuable insights into the contribution of each component. The use of Swin Transformer for feature extraction brings the advantage of modeling both local and global dependencies in CT images, and its combination with a lightweight, explainable classifier like XGBoost reduces overfitting risks.
In contrast, the study presented in [41] proposes a hybrid algorithm that combines SURF feature extraction, genetic optimization, and a Feed-Forward Back Propagation Neural Network (FFBPNN), achieving an accuracy of 98.08%. While promising, this approach is based on handcrafted feature engineering and limited data volume (500 images), which may limit its generalizability compared to deep feature-based models.
As for [42], although it focuses on a different lung condition (pneumothorax) and a different imaging modality (chest X-rays), its lower performance (91.23% accuracy, 92.20% F1-score) further illustrates the challenges of using 2D X-ray data for accurate diagnosis of thoracic pathologies and highlights the relevance of CT imaging in this context.
In summary, our method offers a strong balance between accuracy, modularity, and interpretability. It is particularly well suited for adaptation to clinical settings where understanding the contribution of each module (preprocessing, feature extraction, classification) is critical. Moreover, the flexibility of our pipeline allows easy integration of future models or preprocessing enhancements.
Nevertheless, the study has certain limitations. It was conducted on a single dataset, and further evaluation on external datasets is needed to validate generalizability. Clinical validation and integration with patient metadata are also part of our future work.

6. Conclusions

Unlike traditional deep learning approaches that rely solely on pre-trained models for both feature extraction and classification, our proposed system adopts a hybrid architecture that separates these two stages. Specifically, we leverage pre-trained models to extract deep, high-level feature representations from CT scan images, and subsequently feed these features into classical machine learning classifiers such as SVM, XGBoost, and Random Forest.
This design offers several key advantages. First, it allows for better control and interpretability of the classification stage, enabling fine-tuned optimization of decision boundaries through well-established machine learning algorithms. Second, it reduces the risk of overfitting, by avoiding the need to retrain the final fully connected layers of deep models.
Furthermore, our system enables modular experimentation with different preprocessing techniques and classifiers, allowing the construction of a flexible and extensible diagnostic system. Remarkably, the combination of CLAHE preprocessing, Swin Transformer feature extraction, and XGBoost classification achieved the highest accuracy of 95.8%, highlighting the effectiveness of our lung cancer detection system in distinguishing between malignant and benign lung CT scans with high precision.

Author Contributions

Conceptualization, D.H.; methodology, D.H.; software, D.H.; validation, K.T. and S.E.; formal analysis, D.H.; investigation, D.H.; resources, D.H.; data curation, D.H.; writing—original draft preparation, D.H. and K.T.; writing—review and editing, D.H.; visualization, D.H.; supervision, S.E. and K.T.; project administration, D.H.; funding acquisition, D.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Ethical review and approval were waived for this study because the data used were obtained from public databases.

Informed Consent Statement

Patient consent was waived because the data used were obtained from public databases.

Data Availability Statement

The original data presented in the study are openly available in LIDC-IDRI dataset at https://www.cancerimagingarchive.net/collection/lidc-idri/ (accessed on 29 September 2025).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Bade, B.C.; Cruz, C.S.D. Lung cancer 2020: Epidemiology, etiology, and prevention. Clin. Chest Med. 2020, 41, 1–24. [Google Scholar] [CrossRef]
  2. American Cancer Society. Global Cancer Facts & Figures 2024. 2024. Available online: https://www.cancer.org/content/dam/cancer-org/research/cancer-facts-and-statistics/global-cancer-facts-and-figures/global-cancer-facts-and-figures-2024.pdf (accessed on 7 July 2025).
  3. Howlader, N.; Noone, A.; Krapcho, M.; Miller, D.; Brest, A.; Yu, M.; Ruhl, J.; Tatalovich, Z.; Mariotto, A.; Lewis, D.; et al. SEER Cancer Statistics Review, 1975–2017; National Cancer Institute: Bethesda, MD, USA, 2020; Volume 4.
  4. Chen, D.; Li, X.; Li, S. A Novel Convolutional Neural Network Model Based on Beetle Antennae Search Optimization Algorithm for Computerized Tomography Diagnosis. IEEE Trans. Neural Netw. Learn. Syst. 2021, 34, 1418–1429. [Google Scholar] [CrossRef]
  5. Vijayan, N.; Kuruvilla, J. The impact of transfer learning on lung cancer detection using various deep neural network architectures. In Proceedings of the 2022 IEEE 19th India Council International Conference (INDICON), Kochi, India, 24–26 November 2022; pp. 1–5. [Google Scholar]
  6. Kumar, V.D.; Rajesh, P.; Geman, O.; Craciun, M.D.; Arif, M.; Filip, R. "Diagnostic Quo Vadis": Application of informatics to the early detection of pneumothorax. Diagnostics 2023, 13, 1305. [Google Scholar] [PubMed]
  7. Nawreen, N.; Hany, U.; Islam, T. Lung cancer detection and classification using CT scan image processing. In Proceedings of the 2021 International Conference on Automation, Control and Mechatronics for Industry 4.0 (ACMI), Rajshahi, Bangladesh, 8–9 July 2021; pp. 1–6. [Google Scholar]
  8. Al-Jamimi, H.A.; Ayad, S.; El Kheir, A. Integrating Advanced Techniques: RFE-SVM Feature Engineering and Nelder-Mead Optimized XGBoost for Accurate Lung Cancer Prediction. IEEE Access 2025, 13, 29589–29600. [Google Scholar]
  9. Wang, S.; Dong, L.; Wang, X.; Wang, X. Classification of pathological types of lung cancer from CT images by deep residual neural networks with transfer learning strategy. Open Med. 2020, 15, 190–197. [Google Scholar] [CrossRef]
  10. Sari, S.; Soesanti, I.; Setiawan, N.A. Best performance comparative analysis of architecture deep learning on ct images for lung nodules classification. In Proceedings of the 2021 IEEE 5th International Conference on Information Technology, Information Systems and Electrical Engineering (ICITISEE), Purwokerto, Indonesia, 24–25 November 2021; pp. 138–143. [Google Scholar]
  11. Bakchy, S.C.; Peyal, H.I.; Islam, M.I.; Yeamin, G.K.; Miraz, S.; Abdal, M.N. A lightweight-cnn model for efficient lung cancer detection and grad-cam visualization. In Proceedings of the 2023 International Conference on Information and Communication Technology for Sustainable Development (ICICT4SD), Dhaka, Bangladesh, 21–23 September 2023; pp. 254–258. [Google Scholar]
  12. Raza, R.; Zulfiqar, F.; Khan, M.O.; Arif, M.; Alvi, A.; Iftikhar, M.A.; Alam, T. Lung-EffNet: Lung cancer classification using EfficientNet from CT-scan images. Eng. Appl. Artif. Intell. 2023, 126, 106902. [Google Scholar] [CrossRef]
  13. Fan, R.; Bu, S. Transfer-learning-based approach for the diagnosis of lung diseases from chest X-ray images. Entropy 2022, 24, 313. [Google Scholar] [CrossRef]
  14. Hrizi, D.; Tbarki, K.; Elasmi, S. An Efficient Method for Lung Cancer Image Segmentation and Nodule Type Classification Using Deep Learning Algorithms. In Proceedings of the International Conference on Advanced Information Networking and Applications, Kitakyushu, Japan, 17–19 April 2024; Springer: Berlin/Heidelberg, Germany, 2024; pp. 46–56. [Google Scholar]
  15. Hrizi, D.; Tbarki, K.; Attia, M.; Elasmi, S. Lung cancer detection and nodule type classification using image processing and machine learning. In Proceedings of the 2023 International Wireless Communications and Mobile Computing (IWCMC), Marrakesh, Morocco, 19–23 June 2023; pp. 1154–1159. [Google Scholar]
  16. Hrizi, D.; Tbarki, K.; Elasmi, S. Lung cancer detection and classification using CNN and image segmentation. In Proceedings of the 2023 IEEE Tenth International Conference on Communications and Networking (ComNet), Hammamet, Tunisia, 1–3 November 2023; pp. 1–10. [Google Scholar]
  17. Idrees, R.; Abid, M.; Raza, S.; Kashif, M.; Waqas, M.; Ali, M.; Rehman, L. Lung Cancer Detection using Supervised Machine Learning Techniques. Lahore Garrison Univ. Res. J. Comput. Sci. Inf. Technol. 2022, 6, 49–68. [Google Scholar] [CrossRef]
  18. Xu, Q.; Ma, Z.; Duan, W.; He, N. DCSAU-Net: A deeper and more compact split-attention U-Net for medical image segmentation. Comput. Biol. Med. 2023, 154, 106626. [Google Scholar] [CrossRef]
  19. Liu, Y.; Hsu, H.Y.; Lin, T.; Peng, B.; Saqi, A.; Salvatore, M.M.; Jambawalikar, S. Lung nodule malignancy classification with associated pulmonary fibrosis using 3D attention-gated convolutional network with CT scans. J. Transl. Med. 2024, 22, 51. [Google Scholar] [CrossRef]
  20. Dhiaa, R.; Awad, O. A Comparative analysis study of lung cancer detection and relapse prediction using XGBoost classifier. IOP Conf. Ser. Mater. Sci. Eng. 2021, 1076, 012048. [Google Scholar] [CrossRef]
  21. Li, W.; Zou, X.; Zhang, J.; Hu, M.; Chen, G.; Su, S. Predicting lung cancer bone metastasis using CT and pathological imaging with a Swin Transformer model. J. Bone Oncol. 2025, 52, 100681. [Google Scholar] [CrossRef]
  22. Sunanda, P.; Rani, K.A. Automatic Brain Tumor Segmentation and Detection with Histogram Equalization of Morphological Image Processing. In Proceedings of the International Conference on Advanced Materials, Manufacturing and Sustainable Development (ICAMMSD 2024), Kurnool, India, 22–23 November 2024; Atlantis Press: Dordrecht, The Netherlands, 2025; pp. 343–355. [Google Scholar] [CrossRef]
  23. Halloum, K.; Ez-Zahraouy, H. Enhancing Medical Image Classification through Transfer Learning and CLAHE Optimization. Curr. Med. Imaging 2025, 21, e15734056342623. [Google Scholar]
  24. Fang, X.; Feng, X. Domain-Aware Adaptive Logarithmic Transformation. Electronics 2023, 12, 1318. [Google Scholar] [CrossRef]
  25. Nguyen, A.A.T.; Onishi, N.; Carmona-Bozo, J.; Li, W.; Kornak, J.; Newitt, D.C.; Hylton, N.M. Post-Processing Bias Field Inhomogeneity Correction for Assessing Background Parenchymal Enhancement on Breast MRI as a Quantitative Marker of Treatment Response. Tomography 2022, 8, 891–904. [Google Scholar] [CrossRef] [PubMed]
  26. Hu, G.; Saeli, C. Enhancing deep edge detection through normalized Hadamard-product fusion. J. Imaging 2024, 10, 62. [Google Scholar] [CrossRef] [PubMed]
  27. Hu, P.; Han, Y.; Zhang, Z.; Chu, S.C.; Pan, J.S. A multi-level thresholding image segmentation algorithm based on equilibrium optimizer. Sci. Rep. 2024, 14, 29728. [Google Scholar] [CrossRef]
  28. Lin, H.; Zhao, M.; Zhu, L.; Pei, X.; Wu, H.; Zhang, L.; Li, Y. Gaussian filter facilitated deep learning-based architecture for accurate and efficient liver tumor segmentation for radiation therapy. Front. Oncol. 2024, 14, 1423774. [Google Scholar] [CrossRef]
  29. Ullah, F.; Kumar, K.; Rahim, T.; Khan, J.; Jung, Y. A new hybrid image denoising algorithm using adaptive and modified decision-based filters for enhanced image quality. Sci. Rep. 2025, 15, 8971. [Google Scholar] [CrossRef]
  30. Radhi, E.A.; Kamil, M.Y. Anisotropic Diffusion Method for Speckle Noise Reduction in Breast Ultrasound Images. Int. J. Intell. Eng. Syst. 2024, 17, 621–631. [Google Scholar] [CrossRef]
  31. Taassori, M. Enhanced Wavelet-Based Medical Image Denoising with Bayesian-Optimized Bilateral Filtering. Sensors 2024, 24, 6849. [Google Scholar] [CrossRef] [PubMed]
  32. Sharma, A.K.; Nandal, A.; Dhaka, A.; Polat, K.; Alwadie, R.; Alenezi, F.; Alhudhaif, A. HOG transformation based feature extraction framework in modified Resnet50 model for brain tumor detection. Biomed. Signal Process. Control 2023, 84, 104737. [Google Scholar] [CrossRef]
  33. Bae, J.; Kim, M.; Lim, J.S. Feature extraction model based on inception v3 to distinguish normal heart sound from systolic murmur. In Proceedings of the 2020 International Conference on Information and Communication Technology Convergence (ICTC), Jeju, Republic of Korea, 21–23 October 2020; pp. 460–463. [Google Scholar]
  34. Siddarth, S.G.; Chokkalingam, S. DenseNet 121 framework for automatic feature extraction of diabetic retinopathy images. In Proceedings of the 2024 International Conference on Emerging Systems and Intelligent Computing (ESIC), Bhubaneswar, India, 9–10 February 2024; pp. 338–342. [Google Scholar]
  35. Hidayah, A.R.; Wisesty, U.N. Lung Cancer Classification Based on Ensembling EfficientNet Using Histopathology Images. In Proceedings of the 2024 International Conference on Intelligent Cybernetics Technology & Applications (ICICyTA), Bali, Indonesia, 17–19 December 2024; pp. 76–81. [Google Scholar]
  36. Nowroozilarki, Z.; Mortazavi, B.J.; Jafari, R. Variational autoencoders for biomedical signal morphology clustering and noise detection. IEEE J. Biomed. Health Inform. 2023, 28, 169–180. [Google Scholar] [CrossRef] [PubMed]
  37. Zhang, Y.; Xiao, X.; Guo, J. TransMCS: A Hybrid CNN-Transformer Autoencoder for End-to-End Multi-Modal Medical Signals Compressive Sensing. Theor. Comput. Sci. 2025, 1051, 115409. [Google Scholar] [CrossRef]
  38. Krasteva, V.; Stoyanov, T.; Naydenov, S.; Schmid, R.; Jekova, I. Detection of Atrial Fibrillation in Holter ECG Recordings by ECHOView Images: A Deep Transfer Learning Study. Diagnostics 2025, 15, 865. [Google Scholar] [CrossRef]
  39. Nitha, V.R.; Vinod Chandra, S.S. ExtRanFS: An automated lung cancer malignancy detection system using extremely randomized feature selector. Diagnostics 2023, 13, 2206. [Google Scholar]
  40. Ozdemir, B.; Aslan, E.; Pacal, I. Attention enhanced InceptionNeXt based hybrid deep learning model for lung cancer detection. IEEE Access 2025, 13, 27050–27069. [Google Scholar] [CrossRef]
  41. Nanglia, P.; Kumar, S.; Mahajan, A.N.; Singh, P.; Rathee, D. A hybrid algorithm for lung cancer classification using SVM and Neural Networks. ICT Express 2021, 7, 335–341. [Google Scholar] [CrossRef]
  42. Gourisaria, M.K.; Singh, V.; Chatterjee, R.; Panda, S.K.; Pradhan, M.R.; Acharya, B. PneuNetV1: A deep neural network for classification of pneumothorax using CXR images. IEEE Access 2023, 11, 65028–65042. [Google Scholar] [CrossRef]
Figure 1. Pipeline architecture of the proposed lung cancer detection system.
Figure 2. Swin Transformer architecture for feature extraction.
Figure 3. Architecture of the Convolutional Autoencoder (CAE) used for feature extraction from 512 × 512 CT images.
Figure 4. Example of feature extraction process. (a) CLAHE-enhanced CT slice. (b) Simplified representation of the Swin Transformer feature extraction pipeline, showing consecutive Swin Transformer blocks across four stages. (c) Flattened feature vector (size: 1 × 1024) obtained from the final stage; only the first five and last four feature values are displayed for illustration.
Table 1. Distribution of images after applying the labeling strategy.
Class | Number of Images
Malignant | 6568
Benign | 4849
Table 2. Visual comparison of different preprocessing techniques.
[Image grid: for each technique, one benign and one malignant example slice are shown. Techniques compared: original image, edge and contour enhancement, histogram equalization, CLAHE, intensity inhomogeneity correction, median filter, Gaussian filter, logarithmic transformation, anisotropic filter, preliminary segmentation, and wavelet-based denoising.]
Table 3. Average classification accuracy per preprocessing method across all configurations.
Preprocessing Technique | Average Accuracy (%)
Original Image | 84.3
Histogram Equalization | 78.2
CLAHE | 89.0
Intensity Inhomogeneity Correction | 78.4
Median Filter | 76.6
Gaussian Filter | 80.5
Logarithmic Transformation | 79.3
Anisotropic Filter | 78.8
Preliminary Segmentation | 79.1
Wavelet-Based Denoising | 76.6
Edge and Contour Enhancement | 80.1
Table 4. Class-wise performance metrics of the XGBoost classifier.
Class | Precision (%) | Recall (%) | F1-Score (%) | Support
Benign | 91.0 | 93.0 | 92.0 | 1455
Malignant | 97.0 | 98.0 | 97.5 | 1971
Table 5. Combinations of preprocessing techniques with feature extraction models and classification results. Each cell reports accuracy (%) / Brier score loss / F1-score.
Preprocessing | Feature Extraction Model | SVM | RF | DT | XGBoost
CLAHE | ResNet50 | 85.88 / 0.18 / 0.80 | 86.82 / 0.16 / 0.84 | 74.34 / 0.25 / 0.77 | 87.39 / 0.15 / 0.85
CLAHE | InceptionV3 | 86.94 / 0.14 / 0.87 | 79.45 / 0.22 / 0.86 | 85.63 / 0.18 / 0.80 | 82.53 / 0.20 / 0.88
CLAHE | DenseNet121 | 89.22 / 0.12 / 0.86 | 83.18 / 0.19 / 0.83 | 74.57 / 0.27 / 0.71 | 81.96 / 0.21 / 0.82
CLAHE | EfficientNet B0 | 87.73 / 0.13 / 0.86 | 85.63 / 0.16 / 0.85 | 79.48 / 0.23 / 0.70 | 85.24 / 0.15 / 0.85
CLAHE | EfficientNet B1 | 88.65 / 0.11 / 0.84 | 87.69 / 0.14 / 0.87 | 70.53 / 0.28 / 0.70 | 86.73 / 0.13 / 0.87
CLAHE | EfficientNet B3 | 87.77 / 0.12 / 0.87 | 84.6 / 0.18 / 0.84 | 65.89 / 0.30 / 0.66 | 83.88 / 0.19 / 0.84
CLAHE | Swin Transformer | 90.0 / 0.08 / 0.89 | 91 / 0.07 / 0.90 | 92.0 / 0.06 / 0.91 | 95.8 / 0.04 / 0.94
CLAHE | Denoising Autoencoder | 78.5 / 0.23 / 0.76 | 85.0 / 0.17 / 0.80 | 91.8 / 0.09 / 0.90 | 87.0 / 0.15 / 0.85
CLAHE | Variational Autoencoder | 83.66 / 0.19 / 0.81 | 80 / 0.22 / 0.79 | 79.26 / 0.24 / 0.78 | 78.23 / 0.25 / 0.77
CLAHE | Convolutional Autoencoder | 71.26 / 0.29 / 0.71 | 91.45 / 0.08 / 0.91 | 82.29 / 0.20 / 0.82 | 91.23 / 0.09 / 0.91
Table 6. Post hoc pairwise Wilcoxon signed-rank test with Bonferroni correction for CLAHE preprocessing (p-values).
Model A | Model B | p-Value
Swin Transformer + XGBoost | ResNet50 + XGBoost | 0.001
Swin Transformer + XGBoost | DenseNet121 + XGBoost | 0.0005
Swin Transformer + XGBoost | EfficientNet B3 + XGBoost | 0.0002
Swin Transformer + XGBoost | Denoising Autoencoder + XGBoost | 0.0001
Swin Transformer + XGBoost | Variational Autoencoder + XGBoost | 0.0001
Swin Transformer + XGBoost | Convolutional Autoencoder + XGBoost | 0.045
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
