Diagnosis of Cervical Cancer and Pre-Cancerous Lesions by Artificial Intelligence: A Systematic Review

Objective: The likelihood of timely treatment for cervical cancer increases with timely detection of abnormal cervical cells. Automated methods of detecting abnormal cervical cells were developed because manual identification requires skilled pathologists, is time consuming, and is prone to error. The purpose of this systematic review is to evaluate the diagnostic performance of artificial intelligence (AI) technologies for the prediction, screening, and diagnosis of cervical cancer and pre-cancerous lesions. Materials and Methods: Comprehensive searches were performed on three databases, Medline, Web of Science Core Collection (Indexes = SCI-EXPANDED, SSCI, A&HCI), and Scopus, to find papers published until July 2022. Articles that applied any AI technique for the prediction, screening, or diagnosis of cervical cancer were included in the review. No restriction was placed on the earliest publication date. Articles were searched, screened, incorporated, and analyzed in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. Results: The primary search yielded 2538 articles. After screening and evaluation of eligibility, 117 studies were incorporated in the review. AI techniques were found to play a significant role in screening systems for pre-cancerous and cancerous cervical lesions. The accuracy of the algorithms in predicting cervical cancer varied from 70% to 100%. AI techniques distinguished between cancerous and normal Pap smears with 80–100% accuracy. The reported sensitivity and specificity of AI in colposcopy for the detection of cervical intraepithelial neoplasia grade 2 or worse (CIN2+) were 71.9–98.22% and 51.8–96.2%, respectively. AI is expected to serve as a practical tool for doctors in making accurate clinical diagnoses. Conclusion: The present review highlights the acceptable performance of AI systems in the prediction, screening, or detection of cervical cancer and pre-cancerous lesions, especially when faced with a paucity of specialized centers or medical resources. In combination with human evaluation, AI could serve as a helpful tool in the interpretation of cervical smears or images.

Figure: The process of screening and selecting relevant studies based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses.

Application of AI for Cervical Cancer and Its Cost-Effectiveness
From 1997, when cervical cancer screening was performed for the first time with AI, until today, various machine learning algorithms have been applied for the detection of cervical cancer [30,. Commonly used machine learning (ML) approaches included deep learning (DL), k-nearest neighbors (KNN), artificial neural networks (ANN), decision trees (DT), random forests (RF), support vector machines (SVM), logistic regression (LR), convolutional neural networks (CNN), multilayer perceptrons (MLP), deep neural networks (DNN), residual neural networks (ResNet), the PAPNET test, and combinations of these techniques, in some cases together with the synthetic minority oversampling technique (SMOTE) to rebalance classes [36,39,45,46,. The time taken for training and for the prediction of cervical cancer by each algorithm varied markedly. In the study by Kruczkowski et al., the training time for the Naïve Bayes and CNN algorithms varied from 7.54 ms to 5320 ms. The prediction time for cervical cancer by the Naïve Bayes and RF algorithms varied from 1.81 ms to 15.5 ms, and the accuracies differed [76]. Elakkiya et al. reported an average of 0.2 s to classify a cervical lesion using a hybrid deep learning technique that combined small-object detection generative adversarial networks (SOD-GAN) and a fine-tuned stacked autoencoder (F-SAE) [77]. In recent years, the rapid progress of AI technologies and the use of combined features have significantly reduced costs and time, including the cost of training and the inference time [52,78,79]. These advancements have also improved patients' access to professional pathologists and the prompt delivery of cytology results [78].
However, the cost-effectiveness of AI was questioned in some studies because the cost of AI techniques far exceeded that of manual screening. Additionally, commonly used techniques require a large dataset for model training, which constitutes a barrier to their use in the diagnosis of cervical cancer [80].
The following four groups were created from the 117 studies that were deemed suitable for the investigation: AI application in cervical cancer prediction (n = 22), AI application in cervical cancer screening (n = 25), AI application in cytology (n = 44), and colposcopy (n = 26) for the detection of cervical cancer.

Application of AI in Cervical Cancer Screening
Screening is a way of identifying apparently healthy people who may have an increased risk of a particular condition [97]. A screening test needs to be both sensitive and specific. When the prevalence of the disease is less than or equal to 5% (which covers the majority of screening populations), a screening test detects more true-positive cases than false-positive cases only if its sensitivity exceeds 95% when the specificity is less than or equal to 95%, and vice versa (specificity must be >95% if the sensitivity is 95%). Most screening tests do not meet this high standard, which means that the screening program must absorb the costs of many false-positive results [98].
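The break-even point behind this threshold can be checked with simple arithmetic on expected counts. The following is a minimal sketch, not taken from the cited source: the function names and the cohort size of 100,000 are hypothetical and chosen only for illustration.

```python
def expected_screen_counts(n, prevalence, sensitivity, specificity):
    """Return expected (true_positives, false_positives) for a cohort of size n."""
    diseased = n * prevalence
    healthy = n - diseased
    true_pos = diseased * sensitivity           # diseased people who test positive
    false_pos = healthy * (1.0 - specificity)   # healthy people who test positive
    return true_pos, false_pos


def positive_predictive_value(prevalence, sensitivity, specificity):
    """Share of positive results that are true positives."""
    tp, fp = expected_screen_counts(1.0, prevalence, sensitivity, specificity)
    return tp / (tp + fp)


# Hypothetical cohort: 100,000 people screened at 5% prevalence.
for sens, spec in [(0.95, 0.95), (0.90, 0.95), (0.95, 0.96)]:
    tp, fp = expected_screen_counts(100_000, 0.05, sens, spec)
    ppv = positive_predictive_value(0.05, sens, spec)
    print(f"sens={sens:.2f} spec={spec:.2f} TP={tp:.0f} FP={fp:.0f} PPV={ppv:.3f}")
```

At 5% prevalence, a test with 95% sensitivity and 95% specificity yields equal expected numbers of true and false positives, so one of the two measures has to exceed 95% before true positives outnumber false positives.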
The use of AI in screening for cervical cancer has produced contradictory results. In some studies, the use of AI reduced false-negative outcomes compared to traditional methods [33]. Other studies reported an increase in false-negative outcomes [35]. Yet other investigations registered no difference between AI and conventional methods [37]. Michelow and coworkers [34] reported no significant difference between manual detection and PAPNET for invasive carcinoma and high-grade squamous intraepithelial lesions (HSIL). Notably, these contradictory findings were reported in older studies [33,35,37]. In more recent studies, the performance of AI in cervical cancer screening was better than that of conventional and manual methods [38,42,53]. With equal sensitivity and much higher specificity compared to both Pap cytology and manual dual staining (DS), AI-based DS yielded lower positivity than cytology and manual DS [53]. The better performance of AI in recent studies may be attributed to the hybrid ensemble approach, combined algorithms and techniques, and the grade of squamous intraepithelial lesions [38,39,46]. Sarwar et al. [38] reported that the hybrid ensemble technique outperformed all other algorithms and demonstrated a screening efficiency of nearly 98%. According to Bao et al. [30], the overall agreement rate between manual reading and AI in CIN detection was 94.7% (95% confidence interval 94.5-94.8%), and the kappa coefficient was 0.92 (0.91-0.92). Furthermore, the performance of AI in the detection of CIN2+ increased with the severity of detected abnormalities on cytology. The accuracy of AI in screening for CIN 1-3 and adenocarcinoma in situ varied between 67% and 98.27% [39][40][41]. The application of AI in cervical cancer screening is shown in Table 2.
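The agreement rate and kappa coefficient quoted above measure different things: the former is the raw share of concordant readings, while the latter corrects that share for agreement expected by chance. The sketch below shows how both are derived from a cross-tabulation of two readers. The function name `agreement_and_kappa` and the counts in the example table are hypothetical and are not data from Bao et al.

```python
def agreement_and_kappa(table):
    """Return (observed agreement, Cohen's kappa) for a square contingency table.

    table[i][j] is the number of cases assigned category i by reader A
    (e.g. manual reading) and category j by reader B (e.g. AI).
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(table[i]) for i in range(k)]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected if the two readers rated independently.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical 2x2 table: rows = manual (negative, positive), columns = AI.
example = [[45, 5],
           [5, 45]]
obs, kappa = agreement_and_kappa(example)
print(f"observed agreement = {obs:.2f}, kappa = {kappa:.2f}")
```

In this illustrative table the readers agree on 90% of cases, but half of that agreement would be expected by chance, so kappa is lower than the raw agreement rate.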

Application of AI in Cytology for the Detection of Cervical Cancer
Cervical cytology image analysis is a time-consuming, challenging, and laborious task [99]. Computer-assisted diagnosis is believed to ease this situation because it can potentially lower the misdiagnosis rate and reduce the workload of cytologists [100]. Therefore, several studies have addressed the subject of automatic cervical cancer diagnosis [64][65][66][67][68]74,75,80,. The investigations showed that AI-assisted methods were promising and achieved high sensitivity and specificity in clinical cervical cytological screening [66,126]. In a multicenter, clinic-based observational study by Bao et al., AI-assisted reading identified CIN 2 (92.6%) and CIN 3+ (96.1%) lesions at a considerably higher rate than, or a similar rate to, manual reading. Compared to expert cytologists, AI-assisted reading showed a similar sensitivity (relative sensitivity of 1.01) and greater specificity (relative specificity of 1.26) [66]. Cao and co-workers [130] compared the performance of AI-assisted reading with that of four pathologists. The first two pathologists had 4 years of work experience, and the third and fourth pathologists had 7 and 10 years of work experience, respectively. The proposed model achieved an area under the receiver operating characteristic curve (AUROC) of 0.99 and an accuracy of 98.0%, which was comparable to that of the pathologist with a decade of experience (accuracy, 93.7%). Additionally, pathologists needed on average 14.83 s to diagnose each image, compared to the 0.04 s needed by the AI-assisted method. Reading with AI assistance was therefore approximately 380 times faster than reading by a typical pathologist. AI algorithms were able to distinguish between normal and cancerous Pap smears with an accuracy of 80-100% [68,110,111,113,115,116,119,120,125,127,130,131,135,137]. The application of AI in cytology for the detection of cervical cancer is shown in Table 3.

Application of AI in Colposcopy for the Detection of Cervical Cancer
AI-assisted tools appear to be well suited to the cervical cancer diagnostic protocol, which recommends colposcopy in cases of an abnormal Pap smear and/or high-risk HPV, and the collection of diagnostic tissue samples before initiating any potentially invasive treatment [138]. In response to this demand, a number of notable studies were published on the use of AI in colposcopy for the detection of cervical cancer [69][70][71][72][73]77,79,[137][138][139][140][141][142][143][144][145][146][147][148][149][150][151][152][153][154][155][156]. According to several investigations, the AI diagnostic approach could support or even potentially replace conventional colposcopy, permit more objective tissue specimen sampling, and reduce the number of cervical cancer cases in developing nations by offering an economical screening option in low-resource settings [137,141]. Some research suggests that AI could help less skilled clinicians to decide whether to perform a cervical biopsy [70,144,145,156]. In addition, AI helped gynecologists to accurately establish the presence of invasive cancer on cervical pathological images [156]. According to a large study (over 19,000 patients) performed in China by Xue et al., the agreement between pathology findings and colposcopic impressions graded by the Colposcopic Artificial Intelligence Auxiliary Diagnostic System (CAIADS) was higher than that of colposcopies interpreted by colposcopists (82.2% vs. 65.9%). Additionally, the CAIADS proved to be more accurate in predicting biopsy sites [147]. In published studies, the sensitivity and specificity of AI in colposcopy for the detection of CIN 2 or more severe lesions were reported at 71.9-98.22% and 51.8-96.2%, respectively [70,72,73,[137][138][139][140]143,147,148,[150][151][152]154]. The accuracy of AI in colposcopy for CIN 2+ detection varied from 40.5% to 98.3% [70,72,77,137,143,[146][147][148][150][151][152]154,156]. 
The use of AI in colposcopy for the detection of cervical cancer is summarized in Table 4.
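The three measures reported above are all derived from the same four counts in a confusion matrix against the histopathology reference. The following minimal sketch makes the definitions explicit. The class name `ConfusionCounts` and the example counts are hypothetical and do not correspond to any of the reviewed studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """AI calls cross-tabulated against the pathology reference (hypothetical)."""
    tp: int  # CIN2+ on pathology, flagged by AI
    fn: int  # CIN2+ on pathology, missed by AI
    tn: int  # below CIN2 on pathology, not flagged by AI
    fp: int  # below CIN2 on pathology, flagged by AI

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def accuracy(self) -> float:
        total = self.tp + self.fn + self.tn + self.fp
        return (self.tp + self.tn) / total


# Illustrative counts only.
counts = ConfusionCounts(tp=80, fn=20, tn=150, fp=50)
print(f"sensitivity = {counts.sensitivity():.3f}")
print(f"specificity = {counts.specificity():.3f}")
print(f"accuracy    = {counts.accuracy():.3f}")
```

Because accuracy is a prevalence-weighted mix of sensitivity and specificity, two studies with the same sensitivity and specificity can report different accuracies if the share of CIN2+ cases in their samples differs.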

Discussion
Despite tremendous progress in the treatment of cancer, cervical cancer is occasionally detected only after the disease has already caused distant metastasis [157]. This is a major problem in developing countries with weaker health systems [5,8]. Furthermore, financial and personnel limitations increase the need for alternative tools as the number of people to be treated by professionals increases [158]. Thus, we need better diagnostic tools for cancer [157]. The application of AI, not only in medicine but also in other fields, has grown significantly over the past ten years, particularly in the last five years [28,159,160]. In medicine, this growth has occurred at three levels: for patients, by enabling them to process their own data to promote health; for health systems, by improving workflow and the potential for reducing medical errors; and for clinicians, predominantly via rapid, accurate image interpretation [161].
The aim of this systematic review is to evaluate the diagnostic performance of AI technologies in cervical cancer and pre-cancerous lesions. Estimating the prognosis of cervical cancer is one of the most difficult tasks because its management requires a variety of cancer treatment approaches [162,163], which may even impair quality of life to a significant extent [164,165].
The studies included in this systematic review employed models to identify a variety of predictors, including age, number of sexual partners, age at first sexual intercourse, number of deliveries, smoking, hormonal and barrier contraceptives, sexually transmitted diseases (STDs), marital status, personal health level, education level, social status, number of caesarean deliveries, and the presence of 15 high-risk HPV genotypes. The accuracy of different AI algorithms in predicting cervical cancer varied from 70% to 100%. More reliable predictions are achieved when the prediction models for cervical cancer are combined with the hybrid ensemble approach. Outside the studies focused on AI techniques, a cohort study by Schulte-Frohlinde et al. [166] noted that cervical cancer could be predicted among high-risk HPV-positive women. Age at sexual debut was a significant modifier of the incidence of cervical cancer.
The prediction of cervical cancer on the basis of AI techniques has produced promising results. The findings of a study by Nsugbe showed how prediction machines can contribute towards early detection and prioritize the care of patients with cervical cancer, while also allowing for cost-saving benefits when compared with routine cervical cancer screening [167].
Cytology-based cervical cancer screening has poor accuracy [168]. The procedure requires clinical consultants who have undergone significant training, is time consuming, and is susceptible to human interpretation and error [81,169]. The use of computer technologies may reduce the likelihood of misdiagnoses, shorten analysis time, and assist in early diagnosis [169].
In the present systematic review, we reviewed studies investigating the performance of AI in screening and cytology for the detection of cervical cancer. The accuracy of AI in the detection of CIN 1-3 and adenocarcinoma in situ varied from 67% to 98.27%. However, we registered contradictory results regarding the performance of AI in cervical cancer screening, with both poorer and better performance reported compared to traditional methods. The poor performance of AI appears to be limited to older AI methods. Conversely, newer AI techniques that apply the hybrid ensemble approach and combined algorithms performed better than traditional and manual approaches. We observed that AI techniques were able to distinguish between normal and cancerous Pap smears with 80-100% accuracy, and were approximately 380 times faster than the typical pathologist. Pap smear testing is a fundamental procedure in protecting women from cervical cancer. However, the effort of a cytologist to detect morphologic changes among the 20,000-50,000 cells on a single slide is tedious, arduous, and dependent on experience [67]. In a cross-sectional study by Wergeland Sørbye et al. [168], four pathologists at three hospitals in Norway evaluated one hundred Pap smears (20 normal cases, 20 cases of low-grade squamous intraepithelial lesion (LSIL), 20 cases of HSIL, 20 cases of atypical squamous cells of undetermined significance (ASC-US), and 20 cases of atypical squamous cells, cannot exclude high-grade squamous intraepithelial lesion (ASC-H)). The accuracy for CIN2+ varied from 74.1% to 83.8%. Therefore, the claim that AI improves the effectiveness of diagnosis, reduces the clinician's workload, and even enhances the impact of treatment and prognosis would seem plausible [53,138]. Furthermore, AI was shown to be more adept than the human brain in recognizing specific patterns [138].
Currently, the two most common techniques used to diagnose pre-cancerous cervical lesions are colposcopy and guided biopsy. However, numerous investigations have shown that even practitioners who are skilled in colposcopy struggle to make the right diagnosis [170]. Consequently, the standardized and less variable diagnostic tools of AI might be useful [79]. Many studies included in the present review concluded that, in cervical cancer diagnosis, AI may be able to supplement or perhaps even replace current colposcopy procedures. The sensitivity and specificity of AI for the detection of CIN2+ were reported to be 71.9-98.22% and 51.8-96.2%, respectively [70,72,73,[137][138][139][140]143,147,148,[150][151][152]154]. The accuracy of AI for the detection of CIN 2+ varied from 40.5% to 98.3%. These data demonstrate the potential of AI in reading colposcopic images. According to a meta-analysis by Mitchell et al. [171], the performance of colposcopists in diagnosing CIN differs considerably from that reported for AI: the average weighted sensitivity of colposcopy in differentiating between a normal cervix and all cervical abnormalities (atypia, low-grade SIL, high-grade SIL, cancer) was 96%, and the average weighted specificity was 48%. At this threshold (normal cervix versus all other abnormalities), the area under the ROC curve was 0.80. A gynecologist with little experience might overlook high-grade lesions [172]. AI technologies may serve as a practical aid for the inexperienced gynecologist or general physician in making a precise clinical diagnosis or a wise choice in terms of diagnostic intervention, such as whether to perform a punch biopsy or transfer the patient to a specialized center [79,144,171]. According to Kim et al. [79], the interpretation of colposcopic images by AI had a higher area under the curve (AUC) in identifying low- and high-risk lesions than the clinical interpretation of colposcopic images by humans. These findings imply that AI interpretations may be used in the clinical setting. A recent study that evaluated deep learning algorithms for automatic categorization of colposcopic images supports this notion [144]. Automated visual evaluation of cervical images had a higher AUC than the original interpretation of cervical images by humans or conventional cytology [40].
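The AUC values compared in this paragraph can be read as the probability that a randomly chosen abnormal case receives a higher score than a randomly chosen normal case. The sketch below computes the AUC directly from that pairwise definition. The function name `auc_from_scores` and the scores are hypothetical, and this is a generic illustration rather than the method used in any of the cited studies.

```python
def auc_from_scores(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    wins = 0.0
    for p in positive_scores:
        for q in negative_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical model scores (higher = more suspicious).
abnormal = [0.9, 0.8, 0.4]
normal = [0.7, 0.3, 0.2]
print(f"AUC = {auc_from_scores(abnormal, normal):.3f}")
```

Unlike accuracy, this quantity does not depend on a single decision threshold, which is one reason it is often used to compare human and automated interpretation.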

Limitations and Recommendations
The limitations of the studies investigated for this systematic review are worthy of mention. First, the majority of the investigations were underpowered with regard to the primary outcome because of their small sample sizes. Some algorithms used in the studies are very unstable, which means that a slight change in the data can significantly change the structure of the resulting decision model. Furthermore, some algorithms are slow and need more memory to run. In fact, millions of observations may be needed for AI techniques to perform acceptably [173]. Second, AI-based models are not widely used in experimental and clinical settings on real datasets. Experimental results on small, intermediate, and large machine learning datasets can show the efficiency of the proposed methods in terms of space, speed, and accuracy [174].
Experimental tests or prospective clinical trials are urgently needed to better highlight the differences between the investigated studies and validate the findings discussed in the present study. Considering the use of different techniques and algorithms in the published studies, it would be meaningful to design a review comparing each technique with the others in order to obtain an accurate estimate of their effectiveness and establish the potential superiority of the respective methods. According to several studies, the cost-effectiveness of automated systems is limited because they are not suitable for use in less developed and moderately developed nations [175]. Some researchers are still working to improve the use of AI in cervical cytology. Since a reliable prediction of the clinical outcome would serve as a guide for treatment, and the prediction of cervical cancer is particularly challenging [176], it would be appropriate to specifically address the role of AI in predicting cervical cancer outcomes.

Conclusions
Our systematic review highlights the acceptable performance of AI systems in the prediction, screening or detection of cervical cancer and pre-cancerous lesions. AI could aid clinicians in making decisions, reducing their workload as well as the likelihood of misdiagnoses. Indeed, AI interpretation of cervical smears or images could serve as an aid when combined with human evaluation. Further studies on prediction and detection are needed for making appropriate decisions about the treatment of cervical cancer. Eventually, this will help to devise programs for the eradication of cervical cancer on a worldwide basis. However, further work will be needed to make AI feasible, reliable, and less expensive for clinical use. The development of novel techniques and algorithms to reduce the impact of data scarcity in the evaluation and prediction of clinical outcomes, as well as the independent validation of machine learning algorithms, may be included in future studies.