Article

Optimizing Cervical Cancer Diagnosis with Feature Selection and Deep Learning

by Łukasz Jeleń 1,*, Izabela Stankiewicz-Antosz 2, Maria Chosia 3 and Michał Jeleń 4

1 Department of Computer Engineering, Wroclaw University of Science and Technology, wyb. Stanisława Wyspiańskiego 27, 50-370 Wrocław, Poland
2 Faculty of Security Studies, General Tadeusz Kościuszko Military University of Land Forces, ul. Piotra Czajkowskiego 109, 51-147 Wrocław, Poland
3 Department of Pathology, Pomeranian Medical University, ul. Unii Lubelskiej 1, 71-252 Szczecin, Poland
4 Department of Immunopathology and Molecular Biology, Wroclaw Medical University, ul. Borowska 211, 50-556 Wrocław, Poland
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(3), 1458; https://doi.org/10.3390/app15031458
Submission received: 12 December 2024 / Revised: 24 January 2025 / Accepted: 29 January 2025 / Published: 31 January 2025

Abstract: The main purpose of cervical cancer diagnosis is the correct and rapid detection of the disease and the determination of its histological type. This study investigates the effectiveness of combining handcrafted feature-based methods with convolutional neural networks (CNNs) for the determination of cancer histological type, emphasizing the role of feature selection in enhancing classification accuracy. A data set of liquid-based cytology images was analyzed and a set of handcrafted morphological features was introduced. These features were then optimized through advanced selection techniques, including stepwise and significant feature selection, to reduce feature dimensionality while retaining critical diagnostic information. The reduced feature sets were evaluated using several classifiers, including support vector machines (SVMs), and compared with a CNN-based approach, highlighting differences in accuracy and precision. The results demonstrate that optimized feature sets, paired with SVM classifiers, achieve classification performance comparable to that of CNNs while significantly reducing computational complexity. This finding underscores the potential of feature reduction techniques in creating efficient diagnostic frameworks. The study concludes that while convolutional neural networks offer robust classification capabilities, optimized handcrafted features remain a viable and cost-effective alternative, particularly when data are limited. This work contributes to advancing automated diagnostic systems by balancing accuracy, efficiency, and interpretability.

1. Introduction

Looking at the latest statistics on cervical cancer, one can notice that the incidence of this disease is decreasing. However, it remains a significant challenge in both oncological and social contexts. According to the World Health Organization (WHO), more than 570,000 new cases of cervical cancer are reported each year, and approximately 311,000 of these cases result in death [1]. In Poland, the five-year survival rate in the years 2010–2015 was 48.3%, while in Europe it was 62.1% [2]. For cancer, it is a well-known fact that a faster diagnosis increases the chances of successful treatment [3]. For cervical cancer, the risk factors are well known and can be taken into account to improve the diagnostic process. These factors can be divided into two categories [4]:
  • Main factors, which include the patient’s age, chronic HPV infection, early sexual initiation, a large number of sexual partners, a large number of child deliveries, long-term smoking, and low economic status.
  • Possible factors, which include long-term use of contraceptive pills, a diet poor in antioxidants, HIV infection, as well as frequent, untreated inflammations.
The main purpose of diagnosis is to quickly and accurately detect the disease and determine its histological type. The process starts with a patient interview and a physical examination. If a suspicious region is found, a cytological examination is performed.
Current diagnostic workflows often rely on cytological examinations, such as Pap smears or liquid-based cytology (LBC), to detect cervical abnormalities. However, these methods are often subjective and depend heavily on the expertise of pathomorphologists. Despite offering improved slide quality and reduced sample variability, LBC still suffers from interpretive limitations, leading to inconsistencies in diagnosis. Most existing methods address the problem with deep learning models trained on large data sets or focus on slide-level classification, which limits their applicability in resource-constrained environments. To address these gaps, this study proposes a novel approach that integrates handcrafted feature-based methods derived from clinically relevant diagnostic criteria. Unlike methods that classify whole slides or rely on resource-intensive deep learning models, this framework works at the per-cell level, aligning closely with the diagnostic process used by pathologists and thereby advancing automated diagnosis in resource-constrained environments. In addition, traditional diagnostic systems, such as the Bethesda system, often focus on broad classifications, such as normal versus abnormal, which may lack the specificity required for detailed treatment planning.
Recent developments in computer vision and machine learning, especially in the area of convolutional neural networks, have demonstrated high accuracies in medical image analysis and classification. However, most of these approaches focus on distinguishing between normal and abnormal samples, rarely providing the granular categorization necessary for effective treatment planning. These limitations make them less practical in resource-constrained environments or for granular classifications such as Normal Squamous Intraepithelial Lesion (NSIL), Low-Grade Squamous Intraepithelial Lesion (LSIL), and High-Grade Squamous Intraepithelial Lesion (HSIL).
To address these challenges, this study proposes a framework that combines handcrafted feature-based methods with deep learning techniques, such as convolutional neural networks, to classify LBC images into specific diagnostic categories: NSIL, LSIL or HSIL. According to the literature, hybrid approaches that integrate handcrafted features with deep learning have shown significant promise in the advancement of diagnostic frameworks. For example, Anurag et al. proposed a feature blending approach that combined handcrafted techniques with CNN features, resulting in improved classification accuracy when applied to histopathological images [5]. The results obtained by Daoud et al. also show that combined features can achieve high precision in breast ultrasound image classifications [6].
Unlike existing methods, the proposed approach reduces dimensionality through the incorporation of advanced feature selection methods, enabling high diagnostic accuracy while minimizing computational demands. This allows retaining critical diagnostic information, achieving a balance between computational efficiency, accuracy, and interpretability. Feature reduction is particularly important for LBC data, where extracted morphological features such as nucleus size and nucleus-to-cell ratio capture key diagnostic patterns without requiring extensive processing power. Furthermore, extraction of handcrafted features provides a practical solution for automated cytological diagnostics, offering scalability, cost effectiveness, and adaptability to a variety of clinical settings.
This work bridges the gap between traditional diagnostic methods and modern AI-driven solutions, offering a robust and scalable system tailored to the nuanced requirements of clinical practice. The details of the data set, feature extraction, and classification methods are described in the following section.

2. Materials and Methods

This section describes the proposed integration of machine learning for pattern recognition in the detection of cervical cancer using liquid-based cytology (LBC) slides. The study uses a combination of high-resolution LBC imaging and sophisticated algorithmic models to systematically identify and classify cellular patterns. The proposed scheme takes an image as input and assigns one of the squamous intraepithelial lesion grades as described in Section 2.1. The framework involves morphometric and convolutional neural network feature extraction, dimensionality reduction through feature selection, and classification. Each phase is described in detail in the following subsections, highlighting the methodologies used to achieve efficient and accurate classification. The hyperparameters for each classifier were selected using grid search optimization on the training data.

2.1. Diagnostic Process Overview

Cervical cancer diagnosis traditionally involves a multistage process that begins with a patient interview and physical examination. If abnormalities are suspected, cytological samples are collected from the cervix for further analysis. This subjective analysis often determines the patient’s subsequent treatment plan. Over time, efforts have been made to improve the traditional method in order to increase its objectivity. According to Kitchener et al., incorporating HPV testing into cervical screening protocols improved the sensitivity of detection of high-grade cervical intraepithelial neoplasia, indicating significant advances in the methods’ ability to identify serious diseases [7].
In this paper, we use images from liquid-based cytology (LBC), which is the latest advancement in cytological examination techniques. Liquid-based cytology is becoming more popular than traditional cytology as it reduces variability in sample preparation and improves the quality of diagnostic slides [7]. Unlike the conventional smear method, LBC involves suspending cells in a liquid medium before processing them on a glass slide, resulting in cleaner and more uniform preparations (see Figure 1).
In the next step, a slide is analyzed by a pathomorphologist. To standardize the interpretation and reporting of cervical cytological results, the Bethesda system is used. This system was developed in 1988 and was primarily used to report the results of cervical and vaginal cytology. Today, this system acts as a framework for consistent communication between healthcare professionals, helping to effectively manage and treat irregularities in cervical cytology.
George N. Papanicolaou, who in 1941 described the use of a vaginal smear in the diagnosis of uterine carcinoma [8] and in 1942 introduced a new staining procedure for these smears [9], played a crucial role in the development of cytological diagnostics. These advances brought substantial improvements in cytological examinations, mainly by identifying atypical cells in smears. This technique, named after its creator, is now universally recognized as the Pap smear. Building on these basic techniques, the Bethesda system was established as a uniform method of reporting cervical and vaginal cytology findings. Having been revised on multiple occasions since its introduction, it continues to facilitate efficient communication and handling of cervical cytological irregularities. In 2012, Solomon and Kurman described the Bethesda system (TBS), detailing definitions, criteria, and terminological clarifications [10]. The TBS format is adaptable and designed to integrate both evolving diagnostic approaches for cervical cancer and advances in cervical pathology. This system employs a two-level reporting structure based on the condition of uterine cervix cells, which can be categorized into two distinct cases:
  • Low-Grade Squamous Intraepithelial Lesion—also called LSIL. This grade describes a non-cancerous lesion in which cells of the uterine cervix are slightly abnormal (see Figure 2b).
  • High-Grade Squamous Intraepithelial Lesion—also called HSIL. This grade describes cases that can become cancerous; here, uterine cervix cells are moderately or severely abnormal (see Figure 2c).
In addition, the Bethesda system contains information about the method used for slide preparation, enhancing the reproducibility and reliability of cytological evaluations and helping to ensure that the diagnostic process is consistent and standardized across different healthcare settings. To distinguish between LSIL and HSIL cases, pathomorphologists evaluate morphological differences, including cell size, nucleus size, and nucleus-to-cytoplasm ratio, as key diagnostic features [4]. These features are crucial for an accurate diagnosis and the appropriate management of cervical cytological abnormalities.
In this study, we take the comprehensive details captured in the Bethesda system to propose a robust machine learning framework. Using this well-documented and standardized information, we can suggest algorithms for accurate LBC image classification into the appropriate Bethesda system categories. This integration of machine learning with classical diagnostic criteria holds promise in enhancing diagnostic accuracy and efficiency in cytology.

2.2. Cervical Cancer Classification

As previously noted, advancements in diagnostic techniques are essential for improving patient outcomes. Recent studies have highlighted the potential of integrating machine learning with traditional diagnostic techniques as a way to enhance the accuracy and efficiency of cervical cancer diagnosis. This section explores various cervical cancer classification approaches. Over the years, a variety of studies have explored the classification of cervical cancer using various techniques. In 2011, Rahmadwati et al. described a texture description based on a Gabor filter to analyze nuclei structure and determine the feature vector [11]. For cancer grade classification, the authors applied a clustering algorithm to the feature vector and calculated a ratio between normal and abnormal cells. Based on that ratio, they assigned a grade according to the grading scheme used by pathomorphologists, with an accuracy of 89%. Mariarputham and Stephen described a classification system for Pap smear evaluation based on the gray-level co-occurrence matrix (GLCM) and local binary patterns (LBP) [12]. The extracted features were then classified with support vector machines (SVM) and single-layer neural networks, revealing better performance of the SVM classifier. Amole and Osalusi [13] explored a comparable method to evaluate the performance of k-NN and SVM classifiers. Their framework, utilizing GLCM features, successfully classified Pap smear data with an accuracy of 90% for SVM and 88.3% for k-NN. In 2019, William et al. described a fuzzy clustering method for cancer classification that achieved an accuracy of over 96%. The most recent papers report on the combination of serum infrared spectroscopy with machine learning and the application of deep learning techniques [14,15]. Various other studies have also explored the use of deep learning in the classification of cervical cancer. Khoulqi and Nailae, as well as Hemalatha and Vetriselvi, achieved high accuracies with deep learning models [16,17]. Khoulqi and Nailae focused on the classification of MRI images, and Hemalatha and Vetriselvi on cervical images. The approach of Subarna and Sukumar combines edge detection, wavelet transform, and a CNN architecture for classification [18]. In 2023, Majeed et al. used transfer learning to improve classification performance.
The above discussion has highlighted the integration of machine learning with traditional diagnostic techniques, significantly enhancing the accuracy and efficiency of cervical cancer diagnosis. However, most studies have focused on the classical Pap smear method rather than liquid-based cytology (LBC). Considering the differences between these methods, it is essential to develop a new specialized framework for LBC to enhance both the accuracy and efficiency of cervical cancer detection [19,20]. Klug et al. described a comparative study showing the supremacy of the liquid-based cytology method over the conventional method [20]. The authors presented a comprehensive statistical study demonstrating the superiority of LBC in decision-making processes. Such studies provided new challenges, sparking a new research area, and a review of the recent literature shows how these challenges are being addressed. In 2019, Sornapudi et al. provided a description of a deep learning approach that was able to extract features and classify the LBC slides with an accuracy of 95% in the best case [21]. Hut et al. and Kanavati et al. both demonstrated the potential of machine learning in this context, achieving good classification performance [22,23]. Hut et al. used a VGG16 convolutional network to classify between low- and high-risk cases, achieving an accuracy of 70.8%, while the accuracy of Kanavati et al. reached 90.7%. In 2023, Wong et al. described the application of the ResNet architecture for the classification of cervical cancer [24]. On the other hand, Tan et al. evaluated pre-trained DenseNet-201 models for Pap smear image classification, underscoring the value of transfer learning in leveraging large-scale data sets for specific medical applications [25]. Further improvements showed very good classification results when combining 3D convolutional neural networks with Vision Transformer modules to achieve a remarkable 98.6% accuracy [26].
In the literature, we can also see a variety of studies exploring the application of feature-based approaches but focusing mainly on texture descriptors [27]. These studies have shown that LBP enhances the efficiency of deep learning architectures, especially in the context of content-based image retrieval (CBIR) systems [28]. The development of new feature extraction techniques, including the square symmetric LBP texture descriptor (SSLBP) and the merging two-class linear discriminant analysis (M2CLDA) approach, has improved the performance of image retrieval and classification systems [29]. These studies collectively emphasize the capabilities of LBP and additional feature extraction techniques to improve the performance of deep learning models.
From the above results, we can conclude that the introduction of liquid-based cytology considerably improved the cancer diagnosis process. It extended the set of features taken into account, making the morphometric analysis of the LBC features important. To our knowledge, among the works described above, only the attempts of Sornapudi et al. [21] and Zhang et al. [30] provide the calculation of cervical cancer features from Pap smears; both employ convolutional neural networks to evaluate features. This study proposes an alternative approach that integrates handcrafted feature-based methods with feature selection techniques to address these limitations. By employing stepwise and significant feature selection, the dimensionality of feature vectors is reduced, retaining critical diagnostic information while enabling efficient and interpretable classification. The approach demonstrates competitive accuracy with reduced computational complexity, offering a cost-effective solution particularly suited for low-data environments. In Section 2.4, we propose a set of features resembling the traditional LBC features.

2.3. Image Segmentation

This paper investigates the morphometric characteristics of a cell and its nucleus. For this purpose, binary representations of both objects are needed, which requires segmentation of the input color image. As segmentation of medical images is not easy, it has become an increasingly active research field [31,32]. To reduce the noise in the images, median filtering was applied as a pre-processing step. This technique is well known to effectively remove noise while preserving important structural features such as the nucleus boundary. Subsequently, a K-means clustering algorithm was used to segment the images, isolating the nucleus and cell regions and minimizing the influence of irrelevant background data. These pre-processing steps mitigate noise and enhance the reliability of subsequent feature extraction.
K-means clustering is one of the simplest unsupervised learning algorithms for solving a clustering problem [33]. It has a single input parameter, k, representing the number of clusters, so this number must be known before clustering begins. The main idea behind the algorithm is to determine the k centers, one per cluster. This heuristic algorithm may fail to converge to the global optimum, so the initial centroids should be placed carefully. Once the centroids are assigned, the next step is to take each data point and associate it with the closest center; when this is done, the initial grouping is finished. The following step is to recalculate the new centroids as the centers of the groups just formed. This yields k new centroids, and the association of all data points must be repeated [33]. The consecutive steps of this loop change the locations of the k centroids until convergence is achieved. The aim of the algorithm is to minimize the squared-error objective function (see Equation (1)).
$$ J = \sum_{j=1}^{k} \sum_{i=1}^{n} \left\| x_i^{(j)} - c_j \right\|^2, \quad (1) $$
where $\| x_i^{(j)} - c_j \|^2$ is a chosen distance measure between a data point $x_i^{(j)}$ and the cluster center $c_j$, and $J$ is an indicator of the distance of the $n$ data points from their respective cluster centers. Our research uses the RGB color distance between a pixel and the mean RGB color of a cluster as the distance measure. The initial centroids are randomly selected, and the cluster with the highest mean RGB value is identified as the result of segmentation. In this study, the optimal number of clusters was empirically determined to be three. Selecting a larger k-value led to the loss of significant data, occasionally forming gaps within nuclei or uneven clusters. Conversely, using two clusters introduced too much meaningless data, leading to unsatisfactory image segmentation. Given this context, opting for three clusters represents a balance between analyzing irrelevant data and ignoring valuable information.
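To make this step concrete, the sketch below shows how the per-pixel RGB clustering described above could be implemented. It is a minimal illustration assuming NumPy and scikit-learn; the function name kmeans_segment and its exact interface are our own and not part of the original implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_segment(image_rgb, k=3):
    """Cluster the pixels of an RGB image into k color clusters.

    Returns the per-pixel label map and the index of the cluster with
    the highest mean RGB value, which the text above identifies as the
    segmentation result.
    """
    h, w, _ = image_rgb.shape
    pixels = image_rgb.reshape(-1, 3).astype(np.float64)
    # init='random' mirrors the randomly selected initial centroids
    # described above; n_init restarts guard against poor local optima.
    km = KMeans(n_clusters=k, init="random", n_init=10, random_state=0)
    labels = km.fit_predict(pixels).reshape(h, w)
    brightest = int(np.argmax(km.cluster_centers_.sum(axis=1)))
    return labels, brightest
```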
For complete segmentation of a cell and its nucleus, a space partition was required to obtain separate binary images for both cell and nucleus (see Figure 3). These images were produced by thresholding the segmented image ($img_{seg}$) as described in Equations (2) and (3). The thresholds were determined based on the values obtained from the k-means segmentation, which served as a representation of the identified regions. This representation allows for further determination of the morphological features of the cell and nucleus in Section 2.4 and will be treated as an entry point for the feature extraction subsystem.
$$ cell(x, y) = \begin{cases} 0 & \text{if } img_{seg}(x, y) > t_c \\ 1 & \text{otherwise,} \end{cases} \quad (2) $$

$$ nucleus(x, y) = \begin{cases} 0 & \text{if } img_{seg}(x, y) > t_n \\ 1 & \text{otherwise,} \end{cases} \quad (3) $$

where $cell(x, y)$ is a binary representation of a cell shape (see Figure 3b), $nucleus(x, y)$ is a binary representation of a nucleus (see Figure 3c), and $t_c$ and $t_n$ are thresholds for the cell and the nucleus, respectively.
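A direct transcription of Equations (2) and (3) is shown below; the thresholds t_c and t_n are assumed to be supplied from the k-means step, as described above.

```python
import numpy as np

def binary_masks(img_seg, t_c, t_n):
    """Binary cell and nucleus masks per Equations (2) and (3):
    pixels above the threshold become 0 (background), all others 1."""
    cell = np.where(img_seg > t_c, 0, 1).astype(np.uint8)
    nucleus = np.where(img_seg > t_n, 0, 1).astype(np.uint8)
    return cell, nucleus
```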

2.4. Feature Extraction

A feature vector serves as a parametric representation of objects within an image, making its development a crucial phase in every classification system. Thus, accurate feature selection can significantly enhance classification performance. This section introduces a collection of features that characterize cells on LBC slides and reflect their morphological aspects. For effective feature extraction, it is essential to identify the features that distinguish between uterine cervix cell conditions.
Table 1 describes these features, using a normal cell (NSIL) as the reference. Based on this description, we propose the following 27 morphological features:
  • Nucleus size ($N$)—defined as the total number of pixels in the nucleus region.
  • Cell size ($C$)—calculated as the total number of pixels of the cell in the segmented image.
  • Nucleus/cell ratio ($NCr$)—this feature was calculated as the ratio of the nucleus size to the cell size, providing insight into nuclear enlargement, a hallmark of malignancy:

    $$ NCr = \frac{N}{C}. \quad (4) $$
  • Nucleus perimeter ($Np$)—we define $Np$ as the length of the nuclear boundary of a nucleus $N$, approximated by the length of a polygonal approximation of the boundary [34].
  • Cell perimeter ($Cp$)—similar to $Np$, the cell perimeter is defined as the length of the cell membrane of a cell $C$, approximated by the length of a polygonal approximation of the membrane.
  • Nucleus/cell perimeter ratio ($NCp$)—this feature was calculated as the ratio of the nucleus perimeter to the cell perimeter:

    $$ NCp = \frac{Np}{Cp}. \quad (5) $$
  • Min Axis ($MinA$)—calculated as the shortest Euclidean distance between extreme points of the segmented region, both for the nucleus and the cell.
  • Min Axis ratio ($MinAr$)—represents the ratio between the Min Axis of the nucleus and that of the cell.
  • Max Axis ($MaxA$)—calculated as the largest Euclidean distance between extreme points of the segmented region, both for the nucleus and the cell.
  • Max Axis ratio ($MaxAr$)—determined as the ratio between $MaxA$ for the nucleus and that for the cell.
  • Nucleus Aspect Ratio ($Nar$)—to compute the aspect ratio, the bounding box ($BB$) of the nucleus is obtained. The feature is calculated as the ratio between the width ($BB_w$) and height ($BB_h$) of the bounding box.
  • Cell Aspect Ratio ($Car$)—this feature is similar to $Nar$; here, the bounding box is determined from the cell image rather than the nucleus.
  • Nucleus/cell Aspect Ratio ($NCar$)—the ratio of $Nar$ to $Car$.
  • Extent ($Ext$)—extent represents the amount of space that the nucleus or cell occupies with respect to its bounding box, calculated according to Equation (6). It provides information about how compact or irregular a shape is.

    $$ Ext = \frac{area}{BB_w \cdot BB_h}, \quad (6) $$

    where $area$ represents the size of the nucleus or cell.
  • Nucleus/cell extent ($NCExt$)—the ratio of the nucleus extent to the cell extent.
  • Solidity ($Sol$)—solidity measures the smoothness or irregularity of the nucleus boundary. It represents the number of pixels occupied by the object with respect to its convex hull ($CH$). Solidity is defined by Equation (7):

    $$ Sol = \frac{area}{Area(CH)}, \quad (7) $$

    where $area$ represents the size of the nucleus or cell and $Area(CH)$ is the area of $CH$.
  • Nucleus/cell solidity ($NCs$)—defined as the ratio between $NSol$ and $CSol$.
  • Equivalence diameter ($Eq$)—simply the diameter of a circle whose area is the same as $N$ or $C$.
  • Equivalence diameter ratio ($NCEq$)—the ratio between the equivalence diameter of the nucleus and that of the cell.
  • Orientation ($Or$)—this feature is known as the axis of the least second moment and provides information about the angle between the row axis of a coordinate system placed at the centroid of the nucleus/cell and that axis. We calculate the orientation $Or$ according to Equation (8) [35]:

    $$ Or = \tan(2\theta_i), \quad (8) $$

    where the angle $\theta_i$ is measured counterclockwise from the x-axis.
  • Orientation ratio ($NCOr$)—similar to the previous ratio features, defined as the ratio between the nucleus and cell orientations.
All of the features mentioned above were calculated for each cell in a binary image ($I$) obtained during segmentation. The constructed feature vector is then presented as an input to the classification system as described in Section 2.6.
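For illustration, several of the features above map directly onto measurements available in scikit-image's regionprops; the sketch below computes a representative subset from the binary masks of Section 2.3. The mapping is ours and is not the authors' implementation.

```python
from skimage.measure import label, regionprops

def morphological_features(nucleus_mask, cell_mask):
    """Compute a representative subset of the 27 features from binary masks."""
    # Assumes one connected region per mask (single cell per image).
    n = regionprops(label(nucleus_mask))[0]
    c = regionprops(label(cell_mask))[0]
    return {
        "N": n.area,                        # nucleus size
        "C": c.area,                        # cell size
        "NCr": n.area / c.area,             # nucleus/cell ratio, Eq. (4)
        "Np": n.perimeter,                  # nucleus perimeter
        "Cp": c.perimeter,                  # cell perimeter
        "NCp": n.perimeter / c.perimeter,   # perimeter ratio, Eq. (5)
        "MaxA": n.axis_major_length,        # max axis (nucleus)
        "MinA": n.axis_minor_length,        # min axis (nucleus)
        "Ext": n.extent,                    # extent, Eq. (6)
        "Sol": n.solidity,                  # solidity, Eq. (7)
        "Or": n.orientation,                # orientation angle (radians)
    }
```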
Table 1. LSIL and HSIL in comparison to a healthy cell [4].

| Feature | NSIL | LSIL | HSIL |
|---|---|---|---|
| Nucleus size | Very small | Smaller than 1/2 of cell size | Larger than 1/2 of cell size |
| Cell size | Large cell | Modest decrease | Significant decrease |
| Nucleus/cell ratio | 1:10 | Change in favor of nucleus | Significant change in favor of nucleus |
| Stainability | Normal | Hyper-stainability | Hyper- or hypo-stainability |
| Brightness around nucleus | None | Can be visible | None |

2.5. Feature Selection and Ranking

Guyon and Elisseeff noted that using a large number of features is not always beneficial [36]. They suggested that when classification accuracy is not significantly compromised and the error rate remains relatively stable, it is preferable to use fewer features in the feature vector. In this study, we have embraced this approach to determine the optimal combination of the features proposed for the feature vector in the classification of cervical cancer. To achieve this, we have employed five feature selection methods, as described below. The statistical methods are robust to noise, as they evaluate feature distributions rather than raw data values, reducing the impact of outliers and irrelevant variations. Combined with pre-processing steps, such as median filtering, these feature selection techniques ensure the selection of highly informative and resilient features for classification.
  • Kruskal–Wallis Test
    This is a non-parametric statistical test used to rank features based on their discriminatory power across diagnostic categories. It determines whether the samples originate from the same distribution [37] and is often regarded as the one-way ANOVA on ranks. The null hypothesis assumes that the cumulative distribution functions of the $k > 2$ populations are equal. For this test, we recorded a probability value (p-value) for the null hypothesis.
  • Kolmogorov–Smirnov Test
    This statistical test compares the distributions associated with data samples [38,39]. It determines whether the samples are drawn from the same distribution. If the samples come from two different distributions, the KS test returns a value of 1; otherwise, it returns a value of 0.
  • Friedman Test
    A test proposed by Milton Friedman [40] that can be used for feature ranking. It assesses differences between at least three correlated classes. Here, we recorded a probability value (p-value) for this nonparametric Friedman test.
  • Stepwise Feature Selection
    This is an iterative approach that adds or removes features to maximize classification performance. Given a set of extracted features, the stepwise feature selection method (SFS) chooses a subset of features that provides the best performance. The forward variant iteratively adds the best features to the subset of selected features; in every iteration, the best predictive features are used and their significance is estimated with the F-test (see [41]). Here, we use the backward version of this algorithm (SBS), which assumes that initially all features of the original feature set are used; at each iteration, the features with the worst predictive power are eliminated.
  • Significant Feature Selection
    This method retains features with statistically significant differences between classes, using statistical significance to reduce the full feature vector. It estimates a p-value for each feature in the feature vector. The calculation is based on the two-sample Student’s t-test for the difference of the population means with unequal variances, under the assumption that the samples are normally distributed. A feature is usually considered statistically significant if its p-value is smaller than 0.05, a typical significance level in biomedical sciences and elsewhere.
Since careful selection and ranking of features can be essential for optimizing the performance of the LBC cervical cancer classification system, we adopted the above feature selection and ranking procedures to construct several reduced feature vectors. By implementing the described feature selection methods, we aim to enhance the model’s efficiency, while maintaining its high accuracy of early cancer detection. Furthermore, the choice of feature selection methods was made to identify features with both high clinical relevance and strong discriminatory power. Features such as nucleus-to-cytoplasm ratio, nucleus size, and cell texture were selected because they align with established diagnostic criteria used by pathologists for cervical cancer diagnosis. These features were validated for their relevance and statistical significance using robust tests like Kruskal–Wallis and Kolmogorov–Smirnov. This approach ensures that the selected features not only contribute to classification accuracy but also enhance the interpretability and clinical applicability of the results.
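As an illustration of how these tests can drive feature ranking and selection, the sketch below uses SciPy on a hypothetical feature matrix X (samples by features) with class labels y; the pairing of classes in the t-test step is our simplification of the significant feature selection described above.

```python
import numpy as np
from scipy import stats

def rank_and_select(X, y, alpha=0.05):
    """Rank features by Kruskal-Wallis p-value and keep those deemed
    significant by a two-sample t-test with unequal variances."""
    classes = np.unique(y)
    kw_pvals, selected = [], []
    for j in range(X.shape[1]):
        groups = [X[y == c, j] for c in classes]
        kw_pvals.append(stats.kruskal(*groups).pvalue)
        # Welch's t-test between the first and last class; a feature is
        # retained if p < alpha (0.05, the usual significance level).
        p = stats.ttest_ind(groups[0], groups[-1], equal_var=False).pvalue
        if p < alpha:
            selected.append(j)
    ranking = np.argsort(kw_pvals)  # most discriminative features first
    return ranking, selected
```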

2.6. Feature Classification

To analyze the selected features, we explore various classification algorithms and present their accuracy in distinguishing between TBS grades. We start with three popular and simple classifiers:
  • Linear discriminant analysis—here, we define a function that is a linear combination of the observations x (see Equation (9)):

    $$ g(x) = w^t x + w_0, \quad (9) $$

    where $w$ is the weighting vector and $w_0$ is a bias [42]. If $g(x) = 0$, then Equation (9) defines a decision surface that separates observations of one class from another. For a two-category classifier, we classify the data into the first class ($C_1$) if $g(x) > 0$ and into the second class ($C_2$) if $g(x) < 0$. For $g(x) = 0$, the observation may be classified into either class.
    For a multi-class problem, we need to define several linear discriminant functions as in Equation (9). The number of functions should equal the number of classes used for classification. In this case, the classification rule is defined by Equation (10):

    $$ g_i(x) = w_i^t x + w_{i0} \quad \text{for } i = 1, \ldots, n_c, \quad (10) $$

    where $n_c$ is the number of classes. Equation (10) defines the decision regions $R_i$ for the $n_c$ classes, where $g_i(x)$ is the largest discriminant for observations $x$ in that region.
  • K-Nearest Neighbors (KNN)—a popular and simple classifier that classifies a new data point based on the majority class among its k closest neighbors in the feature space [42]. Training of the KNN classifier consists of saving the feature vector values along with the class labels. Classification is based on the calculation of a distance metric, typically the Euclidean distance, to find the k closest points [43]. In this work, we adopted the same metric and additionally tested different numbers of neighbors (between 1 and 10) for label estimation. In the results section (see Section 4), we report accuracy values for k = 1 and k = 10.
  • Support Vector Machines (SVM)—used to separate two or more classes of observations by constructing a boundary between them using border points from each class, called support vectors [44]. Here, we employ a kernel-based approach to segregate the data, which ensures a robust generalization of the problem. An unknown point is then classified by its position with respect to the boundary. Training of the SVM model involves an iterative minimization of the error function (Equation (11)):

    $$ \frac{1}{2} w^T w + C \sum_{i=1}^{N} \varepsilon_i \quad (11) $$

    subject to the following constraints:

    $$ y_i \left( w^T \phi(x_i) + b \right) \geq 1 - \varepsilon_i \quad \text{and} \quad \varepsilon_i \geq 0, \; i = 1, \ldots, N, $$

    where $C$ and $b$ are constants, $w$ is the weight vector, $\varepsilon_i$ is a slack variable that deals with overlapping cases, and $\phi$ is a kernel function that transforms the input data into the feature space. The constant $C$ has a significant influence on the error rate and must be carefully estimated during the training process. In this study, we perform a grid search over values between $10^{-3}$ and $10^{3}$ (see the sketch after this list).
  • Neural Networks (NN)—designed to simulate the behavior of the human brain to enable computers to learn from data. They consist of multiple layers of interconnected neurons, linked through connections similar to synapses that carry weighted information. These weights are adjusted during training, and the network’s predictive ability is then optimized based on the input features. In this work, we adjust the internal parameters (weights and biases) based on the differences between the actual output and the desired output. This process, known as backpropagation, uses gradient descent to minimize the network error across 200 iterations. The Adam optimizer was used for optimization, and the rectified linear unit (ReLU) served as the activation function in the hidden layers. The learning rate, the number of hidden layers, and the number of neurons per layer were fine-tuned, with the learning rate ranging from 0.001 to 0.1 and the number of neurons per layer ranging from 10 to 100.
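The sketch below illustrates the grid search over the SVM penalty C described above, using scikit-learn; the feature matrix and labels are random placeholders standing in for the 428 cell images and 27 features, so the interface, not the numbers, is what matters here.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder data: 428 single-cell images x 27 morphological features,
# with three classes standing in for NSIL/LSIL/HSIL.
rng = np.random.default_rng(0)
X = rng.normal(size=(428, 27))
y = rng.integers(0, 3, size=428)

# Kernel SVM with C searched over 10^-3 ... 10^3, as in the text.
pipe = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
param_grid = {"svc__C": np.logspace(-3, 3, 7)}
search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X, y)
print("best C:", search.best_params_["svc__C"])
```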
In addition to the above traditional classifiers, we explore how features extracted with Convolutional Neural Networks (CNN) perform on the same data set, exercising their ability to recognize spatial hierarchies and patterns in images. In our research, networks that have been shown to be highly effective in histological image recognition tasks were applied [45,46]. These are the VGG16 and VGG19 CNN models pre-trained on the ImageNet data set. VGG16 consists of 16 layers with weights, while VGG19 includes 19 weighted layers; both have a deep architecture of convolutional layers followed by fully connected layers. These are simple architectures that use only 3 × 3 convolutional layers stacked on top of each other in increasing depth. Due to their ability to model complex spatial hierarchies in images, they have the potential to handle the intricacies and variations specific to cervical cytology, providing robust classifications that support a timely and accurate medical diagnosis.
To maintain consistency and comparability, the CNN-extracted features were evaluated using the same classifiers as the traditionally extracted features. This approach takes advantage of the powerful feature representation of convolutional neural networks while avoiding their computational burden during the classification phase. As shown in Section 4.2, this method achieved high classification performance, highlighting the robustness of the proposed framework.
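A minimal sketch of this feature extraction step, assuming TensorFlow/Keras: the pre-trained VGG16 convolutional base is used without its classification head, and global average pooling produces a fixed-length vector per image that is then passed to the classifiers above. The batch shape is an assumption on our part.

```python
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input

# VGG16 as a frozen feature extractor: drop the fully connected head,
# apply global average pooling to obtain one 512-dimensional vector
# per image.
extractor = VGG16(weights="imagenet", include_top=False, pooling="avg")

def cnn_features(batch_rgb):
    """batch_rgb: float array of shape (n, 224, 224, 3) in RGB order."""
    return extractor.predict(preprocess_input(batch_rgb.copy()))
```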

2.7. Classification Metrics

This section will introduce the classification metrics used for LBC slide classification. To comprehensively evaluate the obtained results, it is essential to define metrics that can demonstrate the effectiveness of the proposed model. This approach allows for further comparison of the classification methods [47].
The fundamental method includes defining the confusion matrix, which records the classifiers’ output [48]. In this research, we employ a multi-class variant of the confusion matrix. This matrix includes three values on its diagonal that indicate the count of correct responses (“true”). The other six values represent misclassifications. Based on the recorded values, the values of “true positive” (TP), “false negative” (FN), and “false positive” (FP) can be calculated. These metrics reflect the classifier’s decision-making accuracy. The “false negative” metric is determined by summing all incorrectly classified examples of a class. Conversely, the “false positive” metric represents the total number of instances where other classes have been incorrectly identified as this particular class. For further evaluation of model accuracy, Precision and Recall measures are defined. They help to understand how well a model performs in terms of its predictive capabilities, particularly when dealing with data sets where the balance between classes is a critical factor.
Based on the confusion matrix values, we define the measures Precision and Recall for our system. Precision measures the accuracy of positive predictions made by the model. It is calculated as the ratio of true positive observations to the total predicted positives. Recall, on the other hand, measures system’s ability to make true positive classifications and is defined as a ratio of true positives and all actual positive values. These metrics are often used together to provide a complete picture of the performance of a model. In many practical applications, there is a trade-off between precision and recall, which can be balanced depending on the specific requirements. In the oncological classification, keeping both values as high as possible is very important. For the comprehensive assessment of the classifiers, Error and Accuracy were also calculated. Error is the proportion of false positives and false negatives to the total number of samples, while Accuracy is simply 100% − Error.
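Stated compactly, the verbal definitions above correspond to the standard per-class (one-vs-rest) formulas, where TN denotes the true negative count (not named explicitly in the text):

$$ \mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN}, \qquad \mathrm{Error} = \frac{FP + FN}{TP + TN + FP + FN}, $$

with Accuracy given by 100% minus Error, as stated above.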
For a reliable assessment of model performance, a five-fold stratified cross-validation technique was applied. It involves dividing the data set into five equal parts, ensuring that each fold maintains the same proportion of classes as the original data set, which preserves the integrity of the distribution. During each of the five iterations of model assessment, four of the five folds are used for training, and the remaining fold is used to test the performance. By repeatedly training and testing the model across these folds, we gain insight into the stability of the model and its ability to generalize across different data samples.
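The evaluation loop itself can be expressed in a few lines; the sketch below again uses random placeholder data and scikit-learn's StratifiedKFold, which preserves class proportions in each fold as described.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(428, 27))    # placeholder feature vectors
y = rng.integers(0, 3, size=428)  # placeholder NSIL/LSIL/HSIL labels

# Five-fold stratified cross-validation: each fold keeps the same
# class proportions as the full data set.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(SVC(kernel="rbf"), X, y, cv=cv, scoring="accuracy")
print(f"accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```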

3. Cervical Cancer Database

To evaluate the proposed framework, we assembled a database of cytological images of cervix uteri smears prepared using the liquid-based cytology (LBC) method according to the following procedure:
  • Specimen Collection—the samples were collected with a brush and suspended in a preservation solution.
  • Specimen processing—the solution was then processed through an automated separator to isolate cells from debris and placed on glass slides.
  • Slide staining—slides were stained using the Papanicolaou technique, which employs specific staining agents to highlight cellular details for morphological analysis. In general, staining methods like hematoxylin and eosin or Papanicolaou staining enhance the visibility of cells and nucleus structures.
  • Slide digitization—the stained slides were acquired from the Department of Pathomorphology of the Pomeranian Medical University in Szczecin, Poland, and digitized in the Department of Pathology and Oncological Cytology at the Medical University of Wrocław, Poland, using an Olympus BX 51 microscope equipped with an Olympus DP72 camera attached to the microscope head. The images were captured using the Olympus CellD software at a resolution of 300 dpi, with dimensions of 2510 × 1540 pixels. The collected database contains 263 LBC cytology images, with 158 classified as Normal Squamous Intraepithelial Lesion, 70 as High-Grade Squamous Intraepithelial Lesion, and 35 as Low-Grade Squamous Intraepithelial Lesion. For cell shape analysis, we utilized ImageJ software to extract single-cell images, resulting in 428 images: 233 NSIL, 110 HSIL, and 85 LSIL, all annotated at the per-cell level. For data curation, the following inclusion and exclusion criteria were applied:
    Inclusion criteria: Slides of adequate quality and clear diagnostic classification by expert pathologists.
    Exclusion criteria: Low slide quality, ambiguous diagnoses, or insufficient cellular material for accurate segmentation.
    The images were retrospectively collected at a single institution to ensure consistency in sample preparation and diagnostic annotations. Although this limits the diversity of the data set, it provides a high-quality foundation for per-cell classification, which is not available in most public data sets.
  • Case validation—in the next step, an expert pathomorphologist examined the slides to categorize the samples into NSIL, LSIL, or HSIL based on the Bethesda system criteria. This step provided ground-truth labels for training and validating the classification algorithms.
The methods described in this section establish a framework for automated classification of cervical cancer cytology data. By integrating liquid-based cytology imaging, advanced feature extraction, dimensionality reduction techniques, and machine learning, this study aims to improve the accuracy and precision of cervical cancer diagnosis. The next section provides a description of the results obtained from the implementation of the proposed methodology. It includes an evaluation of the effectiveness of feature selection methods in reducing dimensionality, classification performance across different machine learning approaches, and comparisons between handcrafted feature-based methods and a convolutional neural network. Performance metrics such as accuracy, precision, recall, and computational efficiency are analyzed to validate the proposed approach.

4. Results

To check the behavior of the proposed features, we evaluated the performance of different classifiers in several testing scenarios. The scenarios were defined according to the feature selection methods described in Section 2.5, and the features chosen in each scenario are listed in Table 2. Additionally, for comparison purposes, classification was performed for the full feature set, as well as for features extracted with convolutional neural networks. These results are summarized in Table 3. In the subsequent sections, we present the results of a systematic performance evaluation of the proposed scenarios across various metrics, including accuracy, precision, and recall. The results highlight the effectiveness of the feature selection methods in reducing the feature vector.

4.1. Feature Vector Reduction Results

To improve the computational efficiency and performance of our classifiers, we implemented five techniques to decrease the dimensions of our feature vectors. For this purpose, we implemented a multifaceted approach, integrating statistical tests and algorithmic feature selection techniques to streamline our feature set. Specifically, the implementation of the Kruskal–Wallis test, Kolmogorov–Smirnov test, and Friedman test provided a robust statistical framework to identify features with significant differences. In addition, stepwise feature selection and significant feature selection methods were utilized to refine the feature vectors. The following description details the results of these reduction methods and highlights their impact on model performance.
First, descriptors reflecting the real morphometric features of the cells were calculated. These resembled the characteristics evaluated by a specialist when making a diagnosis, which led to the calculation of 27 morphometric features. Upon examining these features, it becomes evident that their importance varies when classifying cervical cancer. In Table 2, we summarize the features that are the most important according to the test used for size reduction. The purpose of our experiments was to check whether the reduced feature vector provides sufficient information for accurate cancer classification in the automated diagnosis task. To accurately assess the classification performance of the reduced feature vectors, a reduction rate metric was introduced. This metric indicates the extent of the reduction by representing the percentage of features omitted from the entire feature vector. The reduction rate is visualized in Figure 4.
From the results presented in Table 2, it is evident that both stepwise feature selection and significant feature selection provide the most substantial reduction of the set, at 74.1% and 40.7%, respectively. A detailed examination of the classification results for the reduced feature set will be discussed in the next subsection.

4.2. Classification Results

With the feature vectors created as described above, we performed a test to check the ability of the handcrafted features to distinguish correctly between cervical cancer grades. The tests were performed on images from the database described in Section 3. First, we checked the classification accuracy for the 27 features as a baseline; then the reduced feature vector classification was performed. Furthermore, we compared these results with the classification results for features extracted with the VGG16 convolutional neural network. The obtained results are summarized in Table 3. Each classifier was presented with the full feature vector (baseline), five reduced feature vectors, and a CNN feature vector. Looking at the results, we can easily notice that the SVM classifier outperformed the other classifiers for all feature vectors (see Figure 5); no comparable consistency was observed for the remaining classifiers.
In Table 4, the recorded precision and recall values for all classifier types are presented. From the table, we can see that the highest precision was noted for the Friedman, stepwise, and significant feature selection methods. In the case of recall (see Figure 6), the highest values were observed for the significant feature selection method, reaching 91.40% when the SVM classifier was used. The same observation was made for precision, as can be seen in Figure 7. Another important finding is that feature reduction not only substantially increased the accuracy of the proposed framework but also enhanced the precision and recall of the system. Only in the case of the Kolmogorov–Smirnov test and the stepwise feature selection method did we not observe such an increase.
Taking the above into account, we can see that SVM is the best classifier, for which we recorded the highest accuracy values in all cases. When this classifier is combined with the significant feature selection method (the highest accuracy for SVM), the precision and recall values are also very high. It is also important to note that this combination reaches an accuracy similar to that of the CNN feature vector with the SVM classifier. Furthermore, an efficiency comparison between CNN-based feature extraction and the conventional methods showed that the latter are more efficient. For a single cell image, the processing time of the conventional methods was approximately 3.25 s, compared to 4.47 s for VGG16, making the conventional methods 1.38 times faster. At the same time, memory usage was also slightly lower for the conventional methods (0.0208 GB vs. 0.0246 GB). Considering this, we can propose a classification system based on SVM and a feature vector consisting of the features selected with the significant feature selection method. These findings demonstrate the potential of combining handcrafted features with feature selection for accurate and efficient diagnosis.

5. Discussion and Conclusions

In this paper, we present a classification framework for liquid-based cytology (LBC) slides for automated cervical cancer diagnosis. The methodology described here uses the conventional classification approach in which morphological features are calculated. To our knowledge, this is the first automated classification approach using hand-crafted features for slides prepared with the LBC method rather than the typical Pap smear method. Using feature reduction methods, the framework reduces computational demands without sacrificing diagnostic accuracy. This approach addresses critical challenges in resource-constrained regions, where access to high-end computational resources and large data sets for deep learning is limited.
Furthermore, this study shows the significant clinical implications of integrating machine learning with traditional cytological techniques in the diagnosis of cervical cancer. The application of machine learning, especially a Support Vector Machine (SVM) classifier, improves diagnostic accuracy and consistency. This approach reduces the risk of misdiagnosis and ensures standardized diagnostic processes by simplifying the examination of cytological slides. The results clearly demonstrate that the presented framework is capable of making accurate predictions. As observed, the SVM classifier achieved the best performance, with an error rate of only 7%, while also providing the highest precision rate. The recall values are at an acceptable level, allowing us to conclude that among the five tested classifiers, SVMs are the optimal choice.
Additionally, our findings highlight the computational efficiency of the proposed framework. A comparison between conventional methods and CNN-based feature extraction revealed that conventional methods are approximately 1.38 times faster for processing a single cell image and require slightly less memory. This efficiency makes conventional methods a better choice for resource-constrained clinical environments, further emphasizing the practicality of handcrafted features combined with SVM classifiers.
Moreover, our study shows that the stepwise and significant feature selection methods offer the best trade-off between feature vector size and classification accuracy. As mentioned in Section 2.5, feature selection can enhance the efficiency and interpretability of the diagnostic framework. Recent research has introduced sophisticated algorithms such as Opposition-based Harmony Search to select features, which improved performance on Pap smear image classification tasks [49]. Similarly, the integration of deep learning techniques, such as transfer learning with DenseNet and Vision Transformers, allowed the achievement of up to 98.6% classification accuracy [25,26]. Although these studies describe very good results, they also face challenges, including computational demands and reliance on large data sets. These constraints can limit their applicability in resource-constrained environments or for granular classifications like HSIL and LSIL. In contrast, our approach focuses on handcrafted morphological features combined with advanced feature reduction techniques, which significantly minimize computational requirements without sacrificing diagnostic accuracy. This makes our method particularly advantageous in clinical settings where resources and data are limited.
One limitation of the proposed methodology is the relatively small size of the data set (263 images), as it may affect the generalizability of the results. Furthermore, the data set was collected from a single institution, which may introduce biases due to the lack of diversity in patient demographics or preparation methods. However, the data set was specifically curated for per-cell classification, ensuring high-quality annotations aligned with diagnostic workflows. To address these limitations, future research should focus on validating the framework on a larger and more diverse data set. The expansion of the database with images from multiple institutions and the incorporation of diverse patient populations would improve the robustness and clinical relevance of the framework. Furthermore, integrating hybrid approaches that combine the interpretability of handcrafted features with the robustness of advanced deep learning models could further enhance performance and adaptability. Future work should explore the optimization of such hybrid frameworks for cervical cytology diagnostics.
Taking all of the above into consideration, we can deduce that the proposed novel LBC classification framework is suitable for automatic liquid-based cytology classification, and the described features are effective in characterizing the morphological properties of both the nucleus and the cell. Furthermore, we have shown that the features chosen by the stepwise and significant feature selection methods provided the highest quality of the features, reducing the feature vector by 74.1% and 40.7%, respectively.
In conclusion, the study presents a robust and clinically relevant framework for the automated classification of cervical cancer using LBC images. Significant improvements in diagnostic accuracy and reduced subjectivity underscore the potential of integrating machine learning with traditional cytological methods. The clinical impact of this research is significant, offering a pathway to more reliable and timely diagnoses, which is critical in the treatment and management of cervical cancer. Further research and adaptation of this framework can extend its benefits to other areas of medical diagnostics, demonstrating the versatility and value of this innovative approach.

Author Contributions

Conceptualization, Ł.J. and M.J.; methodology, Ł.J., I.S.-A., M.C. and M.J.; software, Ł.J.; validation, M.C. and M.J.; formal analysis, Ł.J. and I.S.-A.; investigation, Ł.J. and I.S.-A.; resources, Ł.J.; data curation, M.C. and M.J.; writing—original draft preparation, Ł.J.; writing—review and editing, Ł.J. and M.J.; visualization, Ł.J.; supervision, M.J.; project administration, Ł.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data set generated and analyzed during the current study is not publicly available, as it is part of an ongoing research project and subject to institutional policies. Data sharing is not applicable at this time, but specific inquiries can be addressed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ANOVA      Analysis of Variance
C          Cell Size
CH         Convex Hull
CNN        Convolutional Neural Network
Cp         Cell Perimeter
Eq         Equivalence Diameter
F-Test     Fisher Test
Fr         Friedman Test
GLCM       Gray-Level Co-Occurrence Matrix
HPV        Human Papillomavirus
HSIL       High-Grade Squamous Intraepithelial Lesion
k-NN       k-Nearest Neighbors
KS Test    Kolmogorov–Smirnov Test
LBP        Local Binary Patterns
LBC        Liquid-Based Cytology
LSIL       Low-Grade Squamous Intraepithelial Lesion
MaxA       Maximum Area of a Cell
MaxAr      Maximum Aspect Ratio
MinA       Minimum Area of a Cell
MinAr      Minimum Aspect Ratio
N          Nucleus Size
Nar        Nucleus Aspect Ratio
NCAr       Nucleus-to-Cell Aspect Ratio
NCp        Nucleus-to-Cell Perimeter Ratio
NCEq       Nucleus-to-Cell Equivalence Diameter Ratio
NCExt      Nucleus-to-Cell Extent Ratio
NCr        Nucleus-to-Cell Ratio
NCOr       Nucleus-to-Cell Orientation Ratio
NCs        Nucleus-to-Cell Solidity
Np         Nucleus Perimeter
NSIL       Normal Squamous Intraepithelial Lesion
Or         Orientation of a Cell
RGB        Red, Green, Blue
ReLU       Rectified Linear Unit
SBS        Stepwise Backward Selection
SFS        Stepwise Feature Selection
Sol        Solidity
SVM        Support Vector Machine
TBS        The Bethesda System

References

1. Wild, C.P.; Weiderpass, E.; Stewart, B.W. (Eds.) World Cancer Report: Cancer Research for Cancer Prevention; IARC: Lyon, France, 2020.
2. National Cancer Registry of Poland. Incidence Statistics for 2021. 2021. Available online: https://onkologia.org.pl/en (accessed on 19 November 2024).
3. Jeleń, Ł.; Krzyżak, A.; Fevens, T.; Jeleń, M. Influence of Feature Set Reduction on Breast Cancer Malignancy Classification of Fine Needle Aspiration Biopsies. Comput. Biol. Med. 2016, 79, 80–91.
4. Stankiewicz, I. Using a Computer Program to Evaluate Cytological Smears Received from the Vaginal Part of the Cervix Using the LBC Method. Master's Thesis, Wroclaw Medical University, Wroclaw, Poland, 2018.
5. Anurag, A.; Das, R.; Jha, G.K.; Thepade, S.D.; Dsouza, N.; Singh, C. Feature Blending Approach for Efficient Categorization of Histopathological Images for Cancer Detection. In Proceedings of the 2021 IEEE Pune Section International Conference (PuneCon), Pune, India, 16–19 December 2021; pp. 1–6.
6. Daoud, M.I.; Abdel-Rahman, S.; Bdair, T.M.; Al-Najjar, M.; Al-Hawari, F.; Alazrai, R. Breast Tumor Classification in Ultrasound Images Using Combined Deep and Handcrafted Features. Sensors 2020, 20, 6838.
7. Kitchener, H.; Almonte, M.; Thomson, C.; Wheeler, P.; Sargent, A.; Stoykova, B.; Gilham, B.; Baysson, H.; Roberts, C.; Dowie, R.; et al. HPV testing in combination with liquid-based cytology in primary cervical screening (ARTISTIC): A randomised controlled trial. Lancet Oncol. 2009, 10, 672–682.
8. Papanicolaou, G.N.; Traut, H.F. The diagnostic value of vaginal smears in carcinoma of the uterus. Am. J. Obstet. Gynecol. 1941, 42, 193–206.
9. Papanicolaou, G.N. A new procedure for staining vaginal smears. Science 1942, 95, 438–439.
10. Solomon, D.; Kurman, R. The Bethesda System for Reporting Cervical/Vaginal Cytologic Diagnoses: Definitions, Criteria, and Explanatory Notes for Terminology and Specimen Adequacy; Springer: Berlin/Heidelberg, Germany, 2012.
11. Rahmadwati; Naghdy, G.; Ros, M.; Todd, C.; Norahmawati, E. Cervical Cancer Classification Using Gabor Filters. In Proceedings of the 2011 IEEE First International Conference on Healthcare Informatics, Imaging and Systems Biology, San Jose, CA, USA, 26–29 July 2011; IEEE: Piscataway, NJ, USA, 2011; pp. 48–52.
12. Mariarputham, E.J.; Stephen, A. Nominated Texture Based Cervical Cancer Classification. Comput. Math. Methods Med. 2015, 2015, 586928.
13. Amole, A.; Osalusi, B.S. Textural Analysis of Pap Smears Images for k-NN and SVM Based Cervical Cancer Classification System. Adv. Sci. Technol. Eng. Syst. J. 2018, 3, 218–223.
14. Qu, H.; Yan, Z.; Wu, W.; Chen, F.; Ma, C.; Chen, Y.; Wang, J.; Lu, X. Rapid diagnosis and classification of cervical lesions by serum infrared spectroscopy combined with machine learning. In Proceedings of the AOPC 2021: Biomedical Optics, Beijing, China, 20–22 June 2021; Wei, X., Liu, L., Eds.; International Society for Optics and Photonics, SPIE: Beijing, China, 2021; Volume 12067, p. 120670A.
15. Rajeev, M.A. A Framework for Detecting Cervical Cancer Based on UD-MHDC Segmentation and MBD-RCNN Classification Techniques. In Proceedings of the 2021 2nd Global Conference for Advancement in Technology, Bangalore, India, 1–3 October 2021; pp. 1–9.
16. Khoulqi, I.; Idrissi, N. Deep learning-based Cervical Cancer Classification. In Proceedings of the 2022 International Conference on Technology Innovations for Healthcare (ICTIH), Magdeburg, Germany, 14–16 September 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 30–33.
17. Hemalatha, K.; Vetriselvi, V. Deep Learning based Classification of Cervical Cancer using Transfer Learning. In Proceedings of the 2022 International Conference on Electronic Systems and Intelligent Computing, Odisha, India, 17–18 December 2022; pp. 134–139.
18. Subarna, T.; Sukumar, P. Detection and classification of cervical cancer images using CEENET deep learning approach. J. Intell. Fuzzy Syst. 2022, 43, 3695–3707.
19. Hattori, M.; Kobayashi, T.; Nishimura, Y.; Machida, D.; Toyonaga, M.; Tsunoda, S.; Ohbu, M. Comparative image analysis of conventional and thin-layer preparations in endometrial cytology. Diagn. Cytopathol. 2013, 41, 527–532.
20. Klug, S.; Neis, K.; Harlfinger, W.; Malter, A.; König, J.; Spieth, S.; Brinkmann-Smetanay, F.; Kommoss, F.; Weyer, V.; Ikenberg, H. A randomized trial comparing conventional cytology to liquid-based cytology and computer assistance. Int. J. Cancer 2013, 132, 2849–2857.
21. Sornapudi, S.; Brown, G.; Xue, Z.; Long, R.; Allen, L.; Antani, S. Comparing Deep Learning Models for Multi-cell Classification in Liquid-based Cervical Cytology Images. AMIA Annu. Symp. Proc. 2020, 2019, 820–827.
22. Hut, I.; Jeftic, B.; Dragicevic, A.; Matija, L.; Koruga, D. Computer-Aided Diagnostic System for Whole Slide Imaging of Liquid-Based Cervical Cytology Sample Classification Using Convolutional Neural Networks. Contemp. Mater. 2022, 13, 169–177.
23. Kanavati, F.; Hirose, N.; Ishii, T.; Fukuda, A.; Ichihara, S.; Tsuneki, M. A Deep Learning Model for Cervical Cancer Screening on Liquid-Based Cytology Specimens in Whole Slide Images. Cancers 2022, 14, 1159.
24. Wong, L.; Ccopa, A.; Diaz, E.; Valcarcel, S.; Mauricio, D.; Villoslada, V. Deep Learning and Transfer Learning Methods to Effectively Diagnose Cervical Cancer from Liquid-Based Cytology Pap Smear Images. Int. J. Online Biomed. Eng. 2023, 19, 77–93.
25. Tan, S.L.; Selvachandran, G.; Ding, W.; Paramesran, R.; Kotecha, K. Cervical Cancer Classification From Pap Smear Images Using Deep Convolutional Neural Network Models. Interdiscip. Sci. Comput. Life Sci. 2024, 16, 16–38.
26. Abinaya, K.; Sivakumar, B. A Deep Learning-Based Approach for Cervical Cancer Classification Using 3D CNN and Vision Transformer. J. Imaging Inform. Med. 2024, 37, 280–296.
27. Su, Z.; Pietikäinen, M.; Liu, L. From Local Binary Patterns to Pixel Difference Networks for Efficient Visual Representation Learning. In Proceedings of the Image Analysis; Gade, R., Felsberg, M., Kämäräinen, J.K., Eds.; Springer: Cham, Switzerland, 2023; pp. 138–155.
28. Tarawneh, A.S.; Celik, C.; Hassanat, A.B.; Chetverikov, D. Detailed Investigation of Deep Features with Sparse Representation and Dimensionality Reduction in CBIR: A Comparative Study. Intell. Data Anal. 2018, 24, 47–68.
29. Shi, Z.; Liu, X.; Li, Q.; He, Q.; Shi, Z. Extracting discriminative features for CBIR. Multimed. Tools Appl. 2011, 61, 263–279.
30. Zhang, J.; Liu, Y. Cervical Cancer Detection Using SVM Based Feature Screening. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention, Saint-Malo, France, 26–29 September 2004; Springer: Berlin/Heidelberg, Germany, 2004; pp. 873–880.
31. Jeleń, Ł.; Krzyżak, A.; Fevens, T.; Jeleń, M. Influence of Pattern Recognition Techniques on Breast Cytology Grading. Sci. Bull. Wroc. Sch. Appl. Inform. 2012, 2, 16–23.
32. Kowal, M.; Filipczuk, P.; Obuchowicz, A.; Korbicz, J.; Monczak, R. Computer-aided diagnosis of breast cancer based on fine needle biopsy microscopic images. Comput. Biol. Med. 2013, 43, 1563–1572.
33. Kanungo, T.; Mount, D.; Netanyahu, N.; Piatko, C.; Silverman, R.; Wu, A. An efficient k-means clustering algorithm: Analysis and implementation. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 881–892.
34. Street, N. Xcyt: A System for Remote Cytological Diagnosis and Prognosis of Breast Cancer. In Artificial Intelligence Techniques in Breast Cancer Diagnosis and Prognosis; Jain, L., Ed.; World Scientific Publishing Company: Singapore, 2000; pp. 297–322.
35. Umbaugh, S. Digital Image Processing and Analysis: Human and Computer Vision Applications with CVIPTools, 2nd ed.; CRC Press: New York, NY, USA, 2011.
36. Guyon, I.; Elisseeff, A. An Introduction to Variable and Feature Selection. J. Mach. Learn. Res. 2003, 3, 1157–1182.
37. Kruskal, W.H.; Wallis, W.A. Use of Ranks in One-Criterion Variance Analysis. J. Am. Stat. Assoc. 1952, 47, 583–621.
38. Conover, W. Practical Nonparametric Statistics; Wiley: New York, NY, USA, 1980.
39. Corder, G.W.; Foreman, D.I. Nonparametric Statistics: A Step by Step Approach, 2nd ed.; Wiley: New York, NY, USA, 2014.
40. Friedman, M. The Use of Ranks to Avoid the Assumption of Normality Implicit in the Analysis of Variance. J. Am. Stat. Assoc. 1937, 32, 675–701.
41. Kleinbaum, D.G.; Kupper, L.L.; Nizam, A.; Muller, K.E. Applied Regression Analysis and Multivariable Methods, 4th ed.; Duxbury Press: Belmont, CA, USA, 2007.
42. Duda, R.; Hart, P.; Stork, D. Pattern Classification, 2nd ed.; Wiley Interscience Publishers: New York, NY, USA, 2000.
43. Altman, N.S. An introduction to kernel and nearest-neighbor nonparametric regression. Am. Stat. 1992, 46, 175–185.
44. Cristianini, N.; Shawe-Taylor, J. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods; Cambridge University Press: Cambridge, UK, 2000.
45. Idlahcen, F.; Colere Mboukou, P.F.; Zerouaoui, H.; Idri, A. Whole-slide Classification of H&E-stained Cervix Uteri Tissue using Deep Neural Networks. In Proceedings of the 14th International Joint Conference on Knowledge Discovery, Knowledge Engineering, and Knowledge Management, Valletta, Malta, 24–26 October 2022; pp. 322–329.
46. Mohammed, B.A.; Senan, E.M.; Al-Mekhlafi, Z.G.; Alazmi, M.; Alayba, A.M.; Alanazi, A.A.; Alreshidi, A.; Alshahrani, M. Hybrid Techniques for Diagnosis with WSIs for Early Detection of Cervical Cancer Based on Fusion Features. Appl. Sci. 2022, 12, 8836.
47. Tharwat, A. Classification Assessment Methods. Appl. Comput. Inform. 2021, 17, 168–192.
48. Shaikh, S. Measures Derived from a 2 × 2 Table for an Accuracy of a Diagnostic Test. J. Biom. Biostat. 2011, 2, 1–4.
49. Das, N.; Mandal, B.; Santosh, K.; Shen, L.; Chakraborty, S. Cervical cancerous cell classification: Opposition-based harmony search for deep feature selection. Int. J. Mach. Learn. Cybern. 2023, 14, 3911–3922.
Figure 1. Comparison of cytological slide preparation methods. (a) Conventional cytology produces more variability and debris, making analysis challenging; (b) liquid-based cytology ensures cleaner and more uniform slides, providing improved quality for image analysis and feature extraction.
Figure 2. Examples of squamous intraepithelial lesions as classified by the Bethesda system: (a) normal (NSIL), (b) low (LSIL), (c) high (HSIL).
Figure 3. Segmentation of a cell and its nucleus.
Figure 4. Reduction rate visualization.
Figure 5. Classifier performance for reduced data sets.
Figure 6. Classification recall vs. feature-selection method.
Figure 7. Classification precision vs. feature-selection method.
Table 2. Reduction rates for feature selection methods.

Test | Chosen Features | Reduction Rate
Kruskal–Wallis | N, C, NCr, Np, Cp, NCp, NMinA, CMinA, NCMinAr, NMaxA, CMaxA, NCMaxr, NExt, NSol, CSol, NCSol, Neq, Ceq, NCeq | 29.6%
Kolmogorov–Smirnov | N, C, NCr, Np, Cp, NCp, NMinA, CMinA, NCMinAr, NMaxA, CMaxA, NCMaxr, NExt, CExt, NCExt, NSol, CSol, NCSol, Neq, Ceq, NCeq | 22.2%
Friedman | N, C, NCr, Np, Cp, NCp, NMinA, CMinA, NCMinAr, NMaxA, CMaxA, NCMaxr, NExt, NCExt, NSol, CSol, NCSol, Neq, Ceq, NCeq | 25.9%
Stepwise feature selection | N, C, NCMinAr, CMaxA, NExt, Neq, Ceq | 74.1%
Significant feature selection | NCr, Cp, NCp, NMinA, CMinA, NCMinAr, NMaxA, CMaxA, NCMaxr, CAr, NCExt, NSol, NCSol, Neq, Ceq, NCeq | 40.7%
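As a sketch of how the statistical filters in Table 2 operate, the snippet below screens each feature with the Kruskal–Wallis test and keeps those whose p-value falls below 0.05. The data, labels, and threshold are illustrative assumptions, not the paper's exact test configuration.

import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(0)
X = rng.normal(size=(263, 27))      # placeholder feature matrix
y = rng.integers(0, 3, size=263)    # placeholder NSIL/LSIL/HSIL labels

kept = []
for j in range(X.shape[1]):
    # Compare the distribution of feature j across the three classes.
    groups = [X[y == c, j] for c in np.unique(y)]
    _, p_value = kruskal(*groups)
    if p_value < 0.05:              # retain features that separate the classes
        kept.append(j)

print(f"kept {len(kept)} of {X.shape[1]} features "
      f"(reduction rate {1 - len(kept) / X.shape[1]:.1%})")

The Kolmogorov–Smirnov and Friedman rows follow the same filter pattern with a different per-feature test statistic.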
Table 3. Classification results with respect to feature selection methods.

Classifier | CNN | Baseline | KW | KS | Fr | SFS | SBS
NN | 92.9% | 82.7% | 87.7% | 86.3% | 88.1% | 89.7% | 88.1%
SVM | 93.3% | 90.1% | 90.9% | 91.8% | 91.4% | 90.1% | 93.0%
KNN1 | 87.1% | 80.7% | 88.9% | 88.5% | 88.9% | 86.8% | 85.6%
KNN10 | 84.4% | 84.0% | 88.5% | 87.7% | 88.1% | 86.8% | 86.8%
Disc. An. | 90.9% | 87.2% | 86.4% | 87.2% | 87.7% | 87.2% | 87.7%
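A hedged sketch of how a comparison like Table 3 can be produced: the same reduced feature matrix is scored with each classifier family under 5-fold cross-validation. The hyperparameters and synthetic data below are assumptions for illustration, not the paper's settings.

from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder for a reduced 7-feature matrix (e.g., the SFS subset).
X, y = make_classification(n_samples=263, n_features=7, n_informative=5,
                           n_classes=3, random_state=0)

models = {
    "NN":        MLPClassifier(max_iter=2000, random_state=0),
    "SVM":       SVC(kernel="rbf"),
    "KNN1":      KNeighborsClassifier(n_neighbors=1),
    "KNN10":     KNeighborsClassifier(n_neighbors=10),
    "Disc. An.": LinearDiscriminantAnalysis(),
}

for name, clf in models.items():
    pipe = make_pipeline(StandardScaler(), clf)
    acc = cross_val_score(pipe, X, y, cv=5).mean()
    print(f"{name:>9s}: {acc:.1%}")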
Table 4. Precision and recall rates with respect to feature selection method, including confidence intervals [in %].

Precision (95% CI):

Method | NN | SVM | KNN1 | KNN10 | Disc. An.
Baseline | 82.7 (81.5–84.0) | 89.3 (88.0–91.5) | 81.1 (80.0–82.5) | 85.4 (84.0–86.7) | 85.5 (84.1–86.9)
KW | 87.6 (85.3–89.7) | 90.9 (89.2–92.7) | 87.5 (85.0–89.8) | 88.5 (86.0–90.8) | 86.6 (85.0–88.1)
KS | 86.3 (83.7–88.5) | 88.5 (86.4–90.3) | 86.7 (84.2–88.9) | 89.7 (87.5–91.6) | 87.3 (85.0–89.3)
Fr | 88.2 (85.5–90.4) | 89.7 (87.6–91.5) | 85.5 (82.5–88.1) | 90.1 (87.7–92.0) | 87.9 (85.2–90.1)
SFS | 89.8 (87.3–92.1) | 88.8 (86.2–91.0) | 88.9 (86.3–91.1) | 87.0 (84.4–89.2) | 86.9 (84.3–89.0)
SBS | 88.0 (85.2–90.4) | 91.4 (89.2–93.2) | 84.9 (82.0–87.6) | 89.7 (87.4–91.9) | 85.2 (82.5–87.7)

Recall (95% CI):

Method | NN | SVM | KNN1 | KNN10 | Disc. An.
Baseline | 82.7 (81.5–83.9) | 89.6 (88.5–91.4) | 81.1 (80.0–82.3) | 83.9 (83.0–85.0) | 84.8 (83.2–85.9)
KW | 87.7 (85.3–89.9) | 90.9 (89.0–92.9) | 87.2 (84.5–89.6) | 88.5 (86.2–90.4) | 86.4 (84.2–88.0)
KS | 86.4 (84.0–88.6) | 88.5 (86.5–90.6) | 86.8 (84.3–88.8) | 89.3 (87.2–91.0) | 86.8 (84.3–88.7)
Fr | 88.1 (85.3–90.3) | 89.7 (87.4–91.8) | 85.6 (82.8–87.9) | 89.7 (87.2–91.7) | 87.6 (84.8–90.0)
SFS | 89.7 (87.0–92.0) | 88.8 (86.0–91.2) | 88.8 (86.0–91.2) | 86.8 (84.0–89.3) | 86.8 (84.0–89.0)
SBS | 88.1 (85.2–90.6) | 91.4 (89.1–93.4) | 85.2 (82.3–87.8) | 89.3 (87.0–91.6) | 84.8 (82.1–87.2)
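The text does not state how the 95% confidence intervals in Table 4 were derived; the sketch below assumes a normal-approximation (Wald) interval on the precision and recall proportions, which is one common choice. The confusion counts are placeholders, not values from the study.

from math import sqrt

def proportion_ci(successes: int, total: int, z: float = 1.96):
    """Wald 95% CI for a proportion (an assumed method, see note above)."""
    p = successes / total
    half = z * sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half), min(1.0, p + half)

# Illustrative confusion counts for a single class; TP/FP/FN are placeholders.
tp, fp, fn = 220, 25, 24

precision, p_lo, p_hi = proportion_ci(tp, tp + fp)  # precision = TP / (TP + FP)
recall, r_lo, r_hi = proportion_ci(tp, tp + fn)     # recall    = TP / (TP + FN)
print(f"precision {precision:.1%} (95% CI {p_lo:.1%}-{p_hi:.1%})")
print(f"recall    {recall:.1%} (95% CI {r_lo:.1%}-{r_hi:.1%})")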
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
