Hybridization of Deep Learning Pre-Trained Models with Machine Learning Classifiers and Fuzzy Min–Max Neural Network for Cervical Cancer Diagnosis

Medical image analysis and classification is an important application of computer vision in which a disease prediction based on an input image is provided to assist healthcare professionals. Many deep learning architectures accept different medical image modalities and support decisions about the diagnosis of various cancers, including breast cancer and cervical cancer. The Pap-smear test is the most commonly used diagnostic procedure for early identification of cervical cancer, but it has a high rate of false-positive results due to human error. Therefore, computer-aided diagnostic systems based on deep learning need further research to classify Pap-smear images accurately. A fuzzy min–max neural network is a neuro-fuzzy architecture with many advantages, such as training in a minimum number of passes, handling overlapping classes, and supporting online training and adaptation. This paper proposes a novel hybrid technique that combines deep learning architectures for feature extraction with machine learning classifiers and a fuzzy min–max neural network for Pap-smear image classification. The pre-trained deep learning models used are AlexNet, ResNet-18, ResNet-50, and GoogleNet. The benchmark datasets used for experimentation are Herlev and Sipakmed. The highest classification accuracy of 95.33% is obtained using the fine-tuned ResNet-50 architecture, followed by AlexNet, on the Sipakmed dataset. In addition to the improved accuracies, the proposed model benefits from the advantages of fuzzy min–max neural network classifiers reported in the literature.


Introduction
Cervical cancer is a type of cancer that develops in the cells of the cervix, which is the lower part of the uterus that connects to the vagina. Cervical cancer is usually caused by a human papillomavirus (HPV) infection, which is a sexually transmitted infection. HPV is a very common virus that can cause abnormal changes in the cells of the cervix, which can eventually lead to cancer if left untreated [1].
Cervical carcinoma is the most prevalent cancer diagnosed in 23 countries and the primary cause of cancer-related mortality in 36 nations [1,2]. Furthermore, 85 percent of cervical cancers were encountered in the late stages. It is the fourth most frequent cancer in women and a leading cause of cancer death, with approximately 604,000 new cases and 342,000 deaths worldwide in 2020 [1]. Figure 1 depicts the region-specific age-standardized incidence and mortality rates for cervical cancer in 2020. The world (W) age-standardized incidence rate is shown in descending order, and the highest national age-standardized incidence and mortality rates are overlaid. In such areas, it is critical to ensure that resource-intensive vaccination and screening programs are carried out to improve the situation [2].
Pap smear, liquid-based cytology, and colposcopy are the main screening methods for cervical cancer. In a Pap-smear test, cell samples are collected from the transformation zone of the cervix and examined under the microscope for abnormalities. Colposcopy is a direct visual examination of the cervix for abnormalities, performed by a gynecologist with the help of a colposcope [3]. Regular screening of women over 30 years of age is advisable for early detection and treatment.
The human-based smear analysis is difficult, laborious, time consuming, costly, and prone to errors since each smear slide consists of approximately 3 million cells with varying overlapping and orientation, necessitating the development of a computerized system capable of analyzing the Pap smear effectively and efficiently [4]. Extensive research has been conducted to assist pathologists in tracking cervical cancer with the development of computer-aided diagnostic (CAD) systems. This type of system consists of different steps, including image preprocessing, segmentation, feature extraction, feature selection, and classification. To enhance the image quality, filtering-based preprocessing is carried out. Much work is carried out to segment the nucleus and cytoplasm using different image-processing techniques [5]. The images are used to extract texture, morphological, and color metric features. The feature selection techniques are applied for the identification of the most discriminant features, and then, classifiers are designed to classify the cervical cytology cell images [6].
The above-mentioned workflow requires multiple data-processing steps. Handcrafted features do not guarantee superior classification performance, which highlights the need for automatic feature learning. Deep learning methods have demonstrated success in a variety of applications over the last decade, including object recognition, natural language processing, signal processing, image classification, and segmentation [7][8][9][10]. Deep network architectures can learn features automatically from the spatial relationships among pixels. Multiple layers with simple nonlinear activation functions transform the input data through multiple levels of feature representation, from low-level to increasingly abstract.
The network can learn such hierarchical feature representations from a large scale of training data in an unsupervised or supervised manner. In many practical applications, such learned hierarchical features have outperformed handcrafted designs [11].
Lotfi A. Zadeh [12] proposed fuzzy logic as both a data analysis approach and an engineering approach. Fuzzy set theory is the basis for fuzzy logic, which deals with reasoning that is approximate, rather than precise as in classical two-valued logic. It is therefore a technique for formalizing the human capacity for imprecise reasoning, that is, the ability to reason roughly and make decisions in the face of uncertainty [12]. Fuzzy set theory is considered a good framework for classification problems because of the inherent fuzziness of class clusters. The fuzzy min-max neural network (FMMN) has been used in many applications, including fault detection, lung cancer detection, breast cancer detection, and medical data analysis [13][14][15].
This paper presents a hybrid method for classifying cytology Pap-smear images as abnormal or normal. The machine learning classifiers and the fuzzy min-max neural network are trained on the two-class problem using features extracted by fine-tuning the pre-trained deep learning models. The main contributions of the proposed work are as follows.
(1) Presents a novel and hybrid approach by leveraging the strengths of pre-trained deep learning models with machine learning classifiers and fuzzy min-max neural networks.
(3) Extracts learned, task-specific features from Pap-smear images, which have proven more effective than handcrafted features, classifies them using different machine learning classifiers, and enhances the classification performance using a fuzzy min-max neural network.
(4) Provides improved accuracy with the advantages of different properties of the fuzzy min-max neural network classifier given by Simpson [16].

Literature Review
Various deep learning and machine learning-based techniques are used to classify cervical cytology images. For example, the researchers in [17,18] make use of local binary pattern, texture, histogram, and grey-level features. The features are then given as input to a hybrid classifier system that combines an SVM and a neuro-fuzzy classifier for classification of the cervical images [19].
Jyothi Priyankaa et al. (2021) [20] consider Pap smear test images for cancerous cell prediction combined with deep learning techniques for more efficient results. The ResNet50 pre-trained model of convolutional neural networks (CNNs) for the prediction of cancerous cells produces accurate results. Except for the final layer, which is trained according to the requirements, all the layers in the proposed work are considered as they are. This methodology correctly classifies all classes with 74.04 percent accuracy.
The authors of [24] proposed a method for automatically classifying cervical cell images by generating labelled patch data, fine-tuning convolutional neural networks to extract deep hierarchical features, and applying a novel graph-based cell detection approach for cellular-level evaluation. The results demonstrated that the proposed pipeline could classify images of single cells as well as overlapping cells. The VGG-19 model performed accurately at classifying cervical cytology patch data, with a precision-recall curve of 95%.
The deep learning approach reviewed in Swati Shinde et al. (2022) [25] can directly process raw images and offers automated learning of features based on specific objective functions, such as detection, segmentation, and classification. Different existing pre-trained models, such as ResNet-50, ResNet-152, and VGG are used in the literature for the classification of Pap-smear images for the diagnosis of cervical cancer. Table 1 shows the summarization of the different papers studied and analyzed.

Proposed Methodology
In this paper, a hybrid convolutional neural network classification technique is proposed to classify cervical cytology images as abnormal or normal. Figure 2 shows the block diagram of the proposed work. The hybrid CNN framework is divided into two major phases. In the first phase, a pre-trained deep learning model is used for feature extraction; fully connected layers such as FC6 and FC7 are used to extract the features. In the second phase, machine learning classifiers and a fuzzy min-max neural network are used for classification [27]. Deep learning architectures are the most prevalent approach to medical image analysis. However, training a convolutional neural network from scratch requires a massive quantity of data, high computational resources, and a long training time. Transfer learning (TL) addresses this problem: it helps to build an accurate model by starting from knowledge learned while solving other problems instead of starting from scratch [28,29]. TL is thus a technique in artificial intelligence that transfers knowledge from one model to another [30]. A TL process consists of two steps.
Step 1: Choose a pre-trained model that is trained on large-scale data that is relevant to the problem at hand.
Step 2: Fine-tune the pre-trained model based on its similarity to our dataset.
AlexNet, GoogleNet, ResNet-18, and ResNet-50 are the pre-trained deep learning architectures experimented with in the proposed hybrid technique. They are used in the transfer learning process with weights pre-trained on the ImageNet dataset [31]. ImageNet is made up of 1 million training images, 50,000 validation images, and 100,000 testing images from 1000 different classes. The earlier layers of the pre-trained models, which capture low-level features, are frozen. The AlexNet fc7 layer, ResNet-18 pool5 layer, ResNet-50 fc1000 layer, and GoogleNet loss3-classifier layer are used as feature layers. Figure 2 shows the overall process, with feature extraction illustrated using AlexNet; GoogleNet, ResNet-18, and ResNet-50 are used in the same way. For the machine learning classifiers in Module 2, the number of features fed for training and testing is listed in Table 2. Along with the various machine learning algorithms, the fuzzy min-max neural network is also tested. For classification, the features are normalized and fed into the fuzzy min-max neural network. One of the most common methods for normalizing data is min-max normalization. For each feature, the minimum value is converted to 0, the maximum value is converted to 1, and all other values are converted to a decimal between 0 and 1. The following equation is used to normalize the features [32]:
$$X_{norm} = \frac{X - X_{min}}{X_{max} - X_{min}} \quad (1)$$
where $X$ is the set of feature values obtained, $X_{min}$ is the minimum value in $X$, and $X_{max}$ is the maximum value in $X$.
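The min-max normalization described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' MATLAB implementation: the function name `min_max_normalize`, the sample values, and the choice to map a constant feature column to 0 are all assumptions made for the example.

```python
from typing import List


def min_max_normalize(features: List[List[float]]) -> List[List[float]]:
    """Scale every feature column to [0, 1] using (x - min) / (max - min)."""
    if not features:
        return []
    n_cols = len(features[0])
    col_min = [min(row[i] for row in features) for i in range(n_cols)]
    col_max = [max(row[i] for row in features) for i in range(n_cols)]
    normalized = []
    for row in features:
        new_row = []
        for i, x in enumerate(row):
            span = col_max[i] - col_min[i]
            # Assumption: a constant column carries no information, so map it to 0.
            new_row.append((x - col_min[i]) / span if span > 0 else 0.0)
        normalized.append(new_row)
    return normalized


# Illustrative feature vectors (three samples, two features); not real data.
sample = [[2.0, 10.0], [4.0, 30.0], [6.0, 20.0]]
scaled = min_max_normalize(sample)
print(scaled)
```

In practice, the minimum and maximum would typically be computed on the training features and reused for the test features, so that every value fed to the fuzzy min-max network lies in the unit hypercube.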

Module 2: Machine Learning Classifiers
Classification is a machine learning method that determines which of a set of predefined classes a new object belongs to. Numerous classifier families can be used to classify data, including decision trees, Bayes, functions, rules, lazy, and meta classifiers. In this work, we used classifiers belonging to different families and compared their performance to identify the best classifier. We experimented with the simple logistic, BayesNet, Naive Bayes, random forest, random tree, decision table, and PART machine learning classifiers.

Fuzzy Min-Max Neural Network
Simpson pioneered the use of hyperboxes for pattern classification [16]. The FMM network learns using hyperbox fuzzy sets. An expansion parameter theta (θ), which ranges from 0 to 1, controls the size of a hyperbox. The maximum (max) and minimum (min) points of a hyperbox are used by a fuzzy membership function to measure how well a training sample fits in the hyperbox [31].
Equation (2) defines a hyperbox fuzzy set in terms of its maximum point (HW), its minimum point (HV), and the unit hypercube $I^n$. Figure 3 depicts a 3-D hyperbox with its maximum point ($HW_j$) and minimum point ($HV_j$). Each hyperbox fuzzy set $H_j$ can be represented as follows [16]:
$$H_j = \{HA_h, HV_j, HW_j, f(HA_h, HV_j, HW_j)\} \quad \forall\, HA_h \in I^n \quad (2)$$
where the hth input pattern is $HA_h = (a_{h1}, a_{h2}, \ldots, a_{hn})$, and the minimum and maximum points of the jth hyperbox are $HV_j = (hv_{j1}, hv_{j2}, \ldots, hv_{jn})$ and $HW_j = (hw_{j1}, hw_{j2}, \ldots, hw_{jn})$, respectively.
The fuzzy min-max classifier is made up of three layers. The first is the input layer of feature vectors (FA), the second is the layer of fuzzy hyperbox sets (FB), and the third is the layer of classification nodes (FC). The fuzzy membership function computes the degree to which the input pattern belongs to each hyperbox, and this determines the pattern's class label. The feature vector obtained from the feature extraction step is provided to the input layer, FA. The membership function of each hyperbox is evaluated by the nodes ($b_j$) in the fuzzy hyperbox set layer (FB). V and W represent the weights of the connections between layers FA and FB, which are the sets of min and max points of the hyperboxes, respectively. The FMMN expansion process [16] is used to update these parameters. U stores the weights between the nodes in the second and third layers. Equation (3) shows how U is computed:
$$u_{jk} = \begin{cases} 1 & \text{if } b_j \text{ is a hyperbox for class } c_k \\ 0 & \text{otherwise} \end{cases} \quad (3)$$
where $b_j$ is the jth node of FB and $c_k$ is the kth node of FC.
The FMMN evaluates the membership function whenever a new input sample is presented. Equation (4) is used to calculate the membership value:
$$H_j(HA_h) = \frac{1}{2n} \sum_{i=1}^{n} \Big[ \max\big(0, 1 - \max(0, \gamma \min(1, a_{hi} - HW_{ji}))\big) + \max\big(0, 1 - \max(0, \gamma \min(1, HV_{ji} - a_{hi}))\big) \Big] \quad (4)$$
where $H_j$ denotes the membership of the jth hyperbox, $HA_h$ is the hth input pattern, $HW_{ji}$ and $HV_{ji}$ are the ith components of the maximum and minimum points of $H_j$, and γ is the sensitivity parameter, which controls how quickly the membership value decreases as the distance between $HA_h$ and $H_j$ increases. FMMN learning is primarily based on three steps: the expansion test, the overlap test, and contraction.
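To make the role of Equation (4) concrete, the following Python sketch computes the membership of an input pattern in a hyperbox and assigns the class of the best-matching hyperbox, which is how the FB and FC layers interact when each hyperbox belongs to exactly one class. It assumes the standard membership function from Simpson [16]; the function names, the default sensitivity `gamma=4.0`, and the two example hyperboxes are illustrative assumptions, not values from this paper.

```python
from typing import List, Sequence, Tuple


def membership(a: Sequence[float], v: Sequence[float], w: Sequence[float],
               gamma: float = 4.0) -> float:
    """Membership of pattern a in the hyperbox with min point v and max point w."""
    total = 0.0
    for a_i, v_i, w_i in zip(a, v, w):
        # Term that decreases when a_i lies above the max point in this dimension.
        above = max(0.0, 1.0 - max(0.0, gamma * min(1.0, a_i - w_i)))
        # Term that decreases when a_i lies below the min point in this dimension.
        below = max(0.0, 1.0 - max(0.0, gamma * min(1.0, v_i - a_i)))
        total += above + below
    return total / (2 * len(a))


Hyperbox = Tuple[List[float], List[float], str]  # (min point, max point, class label)


def classify(a: Sequence[float], boxes: Sequence[Hyperbox], gamma: float = 4.0) -> str:
    """Return the label of the hyperbox with the highest membership for a."""
    best = max(boxes, key=lambda box: membership(a, box[0], box[1], gamma))
    return best[2]


# Hypothetical hyperboxes in a 2-D normalized feature space.
boxes = [([0.1, 0.1], [0.3, 0.3], "normal"),
         ([0.6, 0.6], [0.9, 0.9], "abnormal")]
inside = membership([0.3, 0.4], [0.2, 0.2], [0.5, 0.5])
outside = membership([0.6, 0.4], [0.2, 0.2], [0.5, 0.5])
label = classify([0.7, 0.8], boxes)
```

A pattern inside the hyperbox receives full membership, while a pattern slightly outside receives a value that falls off linearly at a rate set by γ.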

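Expansion
To include a new input pattern, $HA_h$, in a hyperbox, the following criterion is used to determine whether the hyperbox can be expanded:
$$n\theta \geq \sum_{i=1}^{n} \big( \max(HW_{ji}, a_{hi}) - \min(HV_{ji}, a_{hi}) \big) \quad (5)$$

Overlap Test
If a hyperbox is chosen for expansion, an overlap test is run to determine whether the expansion causes any overlap between two or more hyperboxes. Hyperboxes $H_j$ and $H_k$ overlap in dimension $i$ if any of the following conditions is met, where $\delta^{o}$ and $\delta^{n}$ denote the smallest overlap found before and after checking dimension $i$:
Case 1: $HV_{ji} < HV_{ki} < HW_{ji} < HW_{ki}$, $\delta^{n} = \min(HW_{ji} - HV_{ki}, \delta^{o})$ (6)
Case 2: $HV_{ki} < HV_{ji} < HW_{ki} < HW_{ji}$, $\delta^{n} = \min(HW_{ki} - HV_{ji}, \delta^{o})$ (7)
Case 3: $HV_{ji} < HV_{ki} < HW_{ki} < HW_{ji}$, $\delta^{n} = \min(\min(HW_{ji} - HV_{ki}, HW_{ki} - HV_{ji}), \delta^{o})$ (8)
Case 4: $HV_{ki} < HV_{ji} < HW_{ji} < HW_{ki}$, $\delta^{n} = \min(\min(HW_{ji} - HV_{ki}, HW_{ki} - HV_{ji}), \delta^{o})$ (9)

Contraction
If an overlap is detected, a suitable contraction rule is applied to eliminate the overlap between the hyperboxes. The contraction rules below correspond to the cases of the overlap test and are applied only in the dimension $\Delta$ with the smallest overlap:
Case 1: $HV_{j\Delta} < HV_{k\Delta} < HW_{j\Delta} < HW_{k\Delta}$: $HW_{j\Delta}^{new} = HV_{k\Delta}^{new} = (HW_{j\Delta}^{old} + HV_{k\Delta}^{old})/2$ (10)
Case 2: $HV_{k\Delta} < HV_{j\Delta} < HW_{k\Delta} < HW_{j\Delta}$: $HW_{k\Delta}^{new} = HV_{j\Delta}^{new} = (HW_{k\Delta}^{old} + HV_{j\Delta}^{old})/2$ (11)
Case 3(a): $HV_{j\Delta} < HV_{k\Delta} < HW_{k\Delta} < HW_{j\Delta}$ and $(HW_{k\Delta} - HV_{j\Delta}) < (HW_{j\Delta} - HV_{k\Delta})$: $HV_{j\Delta}^{new} = HW_{k\Delta}^{old}$ (12)
Case 3(b): $HV_{j\Delta} < HV_{k\Delta} < HW_{k\Delta} < HW_{j\Delta}$ and $(HW_{k\Delta} - HV_{j\Delta}) > (HW_{j\Delta} - HV_{k\Delta})$: $HW_{j\Delta}^{new} = HV_{k\Delta}^{old}$ (13)
Case 4(a): $HV_{k\Delta} < HV_{j\Delta} < HW_{j\Delta} < HW_{k\Delta}$ and $(HW_{k\Delta} - HV_{j\Delta}) < (HW_{j\Delta} - HV_{k\Delta})$: $HW_{k\Delta}^{new} = HV_{j\Delta}^{old}$ (14)
Case 4(b): $HV_{k\Delta} < HV_{j\Delta} < HW_{j\Delta} < HW_{k\Delta}$ and $(HW_{k\Delta} - HV_{j\Delta}) > (HW_{j\Delta} - HV_{k\Delta})$: $HV_{k\Delta}^{new} = HW_{j\Delta}^{old}$ (15)
The training process is complete once the preceding three processes have finished successfully, which results in a list of hyperboxes that represents the FMM network.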
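The three learning steps can be sketched in Python as follows. This is a simplified illustration that assumes the standard expansion, overlap, and contraction rules from Simpson [16]; the function names and example hyperboxes are hypothetical, ties in the Case 3 and Case 4 comparisons are resolved toward the (b) rule as an assumption, and a dimension in which none of the four cases holds is treated as proof that the boxes do not overlap.

```python
from typing import List, Optional, Tuple

Point = List[float]


def can_expand(a: Point, v: Point, w: Point, theta: float) -> bool:
    """Expansion criterion: n * theta >= sum_i (max(w_i, a_i) - min(v_i, a_i))."""
    spread = sum(max(w_i, a_i) - min(v_i, a_i) for a_i, v_i, w_i in zip(a, v, w))
    return len(a) * theta >= spread


def expand(a: Point, v: Point, w: Point) -> Tuple[Point, Point]:
    """Return the min and max points of the hyperbox grown to include pattern a."""
    new_v = [min(v_i, a_i) for v_i, a_i in zip(v, a)]
    new_w = [max(w_i, a_i) for w_i, a_i in zip(w, a)]
    return new_v, new_w


def overlap_dimension(vj: Point, wj: Point, vk: Point, wk: Point) -> Optional[int]:
    """Return the dimension with the smallest overlap, or None if there is no overlap."""
    delta_old = 1.0
    best = None
    for i in range(len(vj)):
        if vj[i] < vk[i] < wj[i] < wk[i]:        # Case 1
            delta_new = min(wj[i] - vk[i], delta_old)
        elif vk[i] < vj[i] < wk[i] < wj[i]:      # Case 2
            delta_new = min(wk[i] - vj[i], delta_old)
        elif vj[i] < vk[i] < wk[i] < wj[i]:      # Case 3
            delta_new = min(min(wj[i] - vk[i], wk[i] - vj[i]), delta_old)
        elif vk[i] < vj[i] < wj[i] < wk[i]:      # Case 4
            delta_new = min(min(wj[i] - vk[i], wk[i] - vj[i]), delta_old)
        else:
            return None  # separated along this dimension, so the boxes do not overlap
        if delta_new < delta_old:
            best, delta_old = i, delta_new
    return best


def contract(vj: Point, wj: Point, vk: Point, wk: Point, d: int) -> None:
    """Remove the overlap along dimension d by adjusting both boxes in place."""
    if vj[d] < vk[d] < wj[d] < wk[d]:            # Case 1
        wj[d] = vk[d] = (wj[d] + vk[d]) / 2
    elif vk[d] < vj[d] < wk[d] < wj[d]:          # Case 2
        wk[d] = vj[d] = (wk[d] + vj[d]) / 2
    elif vj[d] < vk[d] < wk[d] < wj[d]:          # Case 3
        if wk[d] - vj[d] < wj[d] - vk[d]:
            vj[d] = wk[d]                        # 3(a)
        else:
            wj[d] = vk[d]                        # 3(b), also used for ties
    elif vk[d] < vj[d] < wj[d] < wk[d]:          # Case 4
        if wk[d] - vj[d] < wj[d] - vk[d]:
            wk[d] = vj[d]                        # 4(a)
        else:
            vk[d] = wj[d]                        # 4(b), also used for ties


# Illustrative hyperboxes only (not taken from the paper's experiments).
vj, wj = [0.1, 0.1], [0.5, 0.5]
vk, wk = [0.4, 0.2], [0.8, 0.9]
d = overlap_dimension(vj, wj, vk, wk)
if d is not None:
    contract(vj, wj, vk, wk, d)
```

In this example the two boxes overlap least along the first dimension, so only that dimension is contracted and the boxes end up sharing a boundary instead of overlapping.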

Algorithm 1
The algorithm for the proposed work is as follows:
Step 4: For each model in Step 3
    Train the model
    Extract the feature vector
Step 5: Classifiers = {{machine learning classifiers: simple logistic, Naive Bayes, Bayes Net, decision table, random forest, random tree, PART}, {fuzzy min-max neural network}}
Step 6: For each classifier in Step 5
    Train with the feature vector
    Evaluate with the testing set
End

Experimentation Environment
The proposed technique is implemented in MATLAB on a machine with an Intel Core i5 processor and 4 GB of RAM. To investigate its effectiveness, the technique is applied to two standard datasets, namely the Herlev dataset and the Sipakmed dataset. Both datasets are rearranged into two classes, normal and abnormal, and the proposed technique is used to solve the resulting binary classification problem. Each dataset is split into training and testing sets. Table 3 shows the cell distribution of the Herlev dataset, and Figure 4 shows sample images from the Herlev dataset [33].

Sipakmed
The Sipakmed dataset consists of 4049 images. There are five categories for classification of the Sipakmed dataset: dyskeratotic, metaplastic, koilocytotic, parabasal, and superficial-intermediate [34]. The Sipakmed dataset samples are shown in Figure 5. Table 4 shows the cell distribution of the dataset.

Performance Measures
Choosing an appropriate evaluation metric is critical for a fair comparison among the various algorithms. Accuracy, sensitivity, specificity, precision, and F1 score are the performance metrics used to evaluate classification performance. True positive (TP) is the number of correctly classified positive samples, true negative (TN) is the number of correctly classified negative samples, false positive (FP) is the number of negative samples classified as positive, and false negative (FN) is the number of positive samples classified as negative [35]. Table 5 shows the formulas of the evaluation metrics.
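As a minimal sketch, the five metrics can be computed from the four confusion-matrix counts as shown below, assuming Table 5 uses the standard definitions. The function name, the zero-denominator convention, and the example counts are illustrative assumptions and do not correspond to any result reported in this paper.

```python
from typing import Dict


def safe_div(numerator: float, denominator: float) -> float:
    """Return 0.0 instead of raising when a metric is undefined (assumed convention)."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = safe_div(tp, tp + fn)   # true positive rate (recall)
    specificity = safe_div(tn, tn + fp)   # true negative rate
    precision = safe_div(tp, tp + fp)
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": safe_div(2 * precision * sensitivity, precision + sensitivity),
    }


# Hypothetical counts with "abnormal" taken as the positive class.
metrics = classification_metrics(tp=90, tn=80, fp=20, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For a screening task such as this one, sensitivity deserves particular attention because a false negative corresponds to a missed abnormal sample.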

Experiments and Results
Table 6 shows the results obtained when the AlexNet pre-trained model is used as a feature extractor. The simple logistic classifier gives the highest testing accuracy on both datasets: 88.6% on the Herlev dataset and 95.14% on the Sipakmed dataset. Hence, among the experiments with AlexNet features, the combination with the simple logistic classifier performs best. Table 7 shows the results obtained with the GoogleNet pre-trained model. The highest testing accuracy on the Herlev dataset, 87.32%, and on the Sipakmed dataset, 92.21%, are both obtained with the simple logistic classifier, which again outperforms the other classifiers. ... and 93.85% are obtained with the simple logistic classifier on the Herlev and Sipakmed datasets, respectively. Table 9 shows the results obtained when the ResNet-50 pre-trained model is used as a feature extractor. The highest testing accuracies of 92.03% and 93.60% are given by the simple logistic classifier on the Herlev and Sipakmed datasets, respectively.
The results of binary classification of cervical cytology images using the pre-trained models with the fuzzy min-max neural network are presented next. Table 10 shows the results when the AlexNet pre-trained model is used as a feature extractor. The highest classification accuracy on the Herlev dataset is 90.22%, with a sensitivity of 95% at θ = 0.3, whereas the highest classification accuracy on the Sipakmed dataset is 95.33%, with a sensitivity of 95% at θ = 0.5. Along with accuracy, the sensitivity, specificity, precision, and F1 score are calculated and presented in the table.
Table 11 presents the results of the GoogleNet pre-trained model. The highest classification accuracy on the Herlev dataset is 89.49%, with a sensitivity of 97% at θ = 0.6, whereas the highest classification accuracy on the Sipakmed dataset is 92.13%, with a sensitivity of 91% at θ = 0.3. The results of the ResNet-18 model are shown in Table 12. The highest classification accuracy on the Herlev dataset is 91.67%, with a sensitivity of 99% at θ = 0.5, whereas the highest classification accuracy on the Sipakmed dataset is 92.87%, with a sensitivity of 93% at θ = 0.4. The results of the ResNet-50 model are shown in Table 13. The highest classification accuracy on the Herlev dataset is 88.77%, with a sensitivity of 91%, whereas the highest classification accuracy on the Sipakmed dataset is 95.33%, with a sensitivity of 95%, at θ = 0 and θ = 0.5, respectively.

Performance Analysis
The results discussed above show that the proposed techniques give good overall classification accuracy. Figure 6 compares the best classification accuracy obtained with each of the pre-trained models. ResNet-50, followed by AlexNet, performed better than the other models, with best accuracies of 95.33% and 95.32%, respectively. The comparison between the machine learning classifiers and the FMMN shows that, overall, the FMMN outperforms the machine learning classifiers. Table 14 shows the comparative analysis. Comparing the two datasets, Figure 7 shows that the average classification accuracy across all the pre-trained models is higher on the Sipakmed dataset than on the Herlev dataset. As mentioned, convolutional neural networks need large amounts of data for training, and the Sipakmed dataset has considerably more images than the Herlev dataset. Table 15 compares the outcomes of this study with the results of existing studies on computer-aided cervical cancer diagnosis using Pap-smear images.
The advantage of the proposed method is that it achieves good accuracy and sensitivity for cervical cancer image classification compared with the existing methods. However, a limitation is that the FMMN is a complex architecture that requires a significant amount of computational resources and training data.

Conclusions
A novel hybrid deep learning technique is proposed for cervical cytology image classification to help pathologists carry out the smear test with good accuracy and in less time. The proposed hybrid technique is based on pre-trained deep learning models, transfer learning, machine learning classifiers, and a fuzzy min-max neural network. The performance of different deep learning models is compared. The highest classification accuracy, 95.33%, is obtained with ResNet-50 features at a theta value of 0.5. Experiments are performed on two different datasets to evaluate the performance. The results obtained on the Sipakmed dataset were better than those obtained on the Herlev dataset.
Future work will use modified versions of the fuzzy min-max neural network to improve the classification accuracy. The proposed techniques can also be applied to the seven-class and five-class classification problems to evaluate their performance on multiclass classification.