Prediction of Postoperative Pathologic Risk Factors in Cervical Cancer Patients Treated with Radical Hysterectomy by Machine Learning

Ou, Zhengjie; Mao, Wei; Tan, Lihua; Yang, Yanli; Liu, Shuanghuan; Zhang, Yanan; Li, Bin; Zhao, Dan

doi:10.3390/curroncol29120755


by Zhengjie Ou ¹,†, Wei Mao ¹,†, Lihua Tan ¹, Yanli Yang ², Shuanghuan Liu ¹, Yanan Zhang ¹, Bin Li ¹ and Dan Zhao ¹,*

¹ Department of Gynecology Oncology, National Cancer Center, National Clinical Research Center for Cancer, Cancer Hospital, Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing 100021, China

² Department of Gynecology Oncology, The Fifth People’s Hospital of Qinghai Province, Xining 810007, China

* Author to whom correspondence should be addressed.

† These authors contributed equally to this work.

Curr. Oncol. 2022, 29(12), 9613-9629; https://doi.org/10.3390/curroncol29120755

Submission received: 7 October 2022 / Revised: 27 November 2022 / Accepted: 29 November 2022 / Published: 6 December 2022

(This article belongs to the Topic Artificial Intelligence in Cancer Diagnosis and Therapy)


Abstract

Pretherapeutic serological parameters play a predictive role in pathologic risk factors (PRF), which correlate with treatment and prognosis in cervical cancer (CC). However, methods for preoperative prediction of PRF are limited, and the clinical utility of machine learning methods in CC remains unknown. Overall, 1260 early-stage CC patients treated with radical hysterectomy (RH) were randomly split into training and test cohorts. Six machine learning classifiers, including Gradient Boosting Machine, Support Vector Machine with Gaussian kernel, Random Forest (RF), Conditional Random Forest (Cforest), Naive Bayes, and Elastic Net (EN), were used to derive diagnostic information from nine clinical factors and 75 parameters readily available from pretreatment peripheral blood tests. The best results were obtained by RF in deep stromal infiltration prediction, with an accuracy of 70.8% and an AUC of 0.767. The highest accuracy and AUC for predicting lymph node metastasis, obtained with Cforest, were 64.3% and 0.620, respectively. The highest accuracy for predicting lymph-vascular space invasion, obtained with EN, was 59.7%, with an AUC of 0.628. Blood markers, including D-dimer and uric acid, were associated with PRF. Machine learning methods can provide critical diagnostic prediction of PRF in CC before surgical intervention. The use of predictive algorithms may facilitate individualized treatment options through diagnostic stratification.

Keywords:

blood biomarker; cervical cancer; deep stromal infiltration; lymph node metastasis; lymph-vascular space invasion; machine learning methods

1. Introduction

Cervical cancer remains one of the most frequent malignant tumors in women [1]. With the widespread application of human papillomavirus (HPV) vaccination and the growing uptake of screening, patients diagnosed at an early stage now account for the majority of cases. Radical hysterectomy (RH) is the standard-of-care treatment for these patients [2]. An unavoidable question after surgery is whether adjuvant treatment is required, which is decided according to postoperative pathologic risk factors. The likelihood of risk factors that increase the risk of recurrence is high, especially in stage IB3-IIA2 disease (2018 International Federation of Gynecology and Obstetrics (FIGO) staging), owing to large tumor bulk [2]. Previous studies have shown that neoadjuvant chemotherapy (NACT) plus surgery inhibited micro-metastasis and distant metastasis of tumors and was associated with a reduced incidence of pathologic risk factors [3]. However, although NACT reduces the rate of adjuvant therapy after surgery, patients treated with NACT are not entirely spared radiotherapy and its adverse effects.

In addition, concurrent chemoradiotherapy (CCRT) is an alternative initial treatment for early-stage cervical cancer, particularly for locally advanced disease. For a patient whose pathologic risk factors would meet the criteria for adjuvant therapy, CCRT rather than RH should be considered as the initial therapy, since it shortens the treatment course for the same effect and reduces treatment costs [4]. For patients with stage ⅠB-ⅡA disease, according to the National Comprehensive Cancer Network (NCCN) guidelines, concurrent chemoradiation and RH are both primary treatment options with nearly equivalent therapeutic efficacy. However, increased morbidity and complications have been reported when surgery and radiotherapy are combined [5,6]. Such multimodal treatment places a double treatment burden on patients and increases medical costs. In addition, the sequential therapeutic process prolongs the treatment period, compounds side effects and affects quality of life in the long run. Accordingly, it is necessary to construct a model that predicts pathologic risk factors before primary treatment, which will help select the patients for whom direct chemoradiation is more appropriate than RH. Additionally, a model that predicts postoperative pathologic risk factors is an important element of individual prognosis stratification and personalized medicine.

Pathologic risk factors in cervical cancer include lymph node metastasis (LNM), parametrial infiltration, positive surgical margins, lymph-vascular space invasion (LVSI), tumor size >4 cm and deep stromal infiltration (DSI) [2]. Previous studies using common statistical methods showed that many clinicopathologic factors were related to pathologic risk factors, but these methods were not suited to handling more complex data [7,8,9]. Machine learning is a branch of artificial intelligence (AI) technology that allows a computer to infer underlying rules from complex retrospective data. AI technology has been widely used to analyze clinical material and construct models that predict clinicopathological factors and treatment outcomes, achieving higher accuracy than traditional statistical methods [10,11,12]. Therefore, it is feasible and reasonable to apply machine learning to the prediction of postoperative pathologic risk factors.

Based on the successful application of AI technology and the discovery of factors related to pathologic risk factors, we hypothesized that pretreatment clinicopathological factors, analyzed by machine learning, would be effective in predicting postoperative pathologic risk factors in FIGO stage IB-IIA cervical cancer. Because the incidence of positive margins and parametrial infiltration was low in the primary cohort, and because tumor size was confirmed preoperatively via clinical palpation, this study focused on predicting the remaining pathologic risk factors. Therefore, in the present study, we aimed to construct models for predicting LNM, LVSI and DSI by combining clinicopathological biomarkers through machine learning, and to explore previously unreported parameters significantly associated with these factors.

2. Materials and Methods

2.1. Patients and Considered Features

This was a retrospective cohort study of 1260 patients with FIGO stage (2003) IB and IIA cervical cancer who were treated with RH with retroperitoneal lymphadenectomy between 2003 and 2017 in our institution (National Cancer Center/Cancer Hospital, Chinese Academy of Medical Sciences; CICAMS). We retrospectively collected clinicopathological parameters, including age at diagnosis, body mass index (BMI), menopausal status, clinical FIGO stage, gross type, histologic grade, clinical tumor diameter, 75 preoperative peripheral blood biomarkers, etc. (Table 1 and Table S1). Tumor diameter was obtained via clinical palpation before surgical intervention.

2.2. Data Splitting

We obtained 1260 samples after preliminary preprocessing: removing medically impossible data (containing obvious recording errors), removing the features with 10% missing values and removing the samples with missing values. The variables age, BMI, menopausal status, clinical tumor diameter, histology, FIGO stage, gross type, previous abdominal surgery, histologic grade (obtained via preoperative cervical biopsy) and 75 pretreatment peripheral blood markers were all incorporated into the model construction. The features were then processed as follows: continuous features were normalized, categorical features were one-hot encoded, and the LinearSVC method with an L1 penalty was used for feature selection.
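The feature handling described above can be illustrated with a minimal, standard-library sketch. It assumes that "normalized" means z-score standardization (the text does not say which normalization was used), the helper names `zscore` and `one_hot` are hypothetical, and the toy values are invented for illustration. The L1-penalized LinearSVC selection step is not reproduced here.

```python
import math

def zscore(values):
    """Standardize a continuous feature to mean 0 and unit (population) variance.
    Assumption: the source only says 'normalized'; z-scoring is one common choice."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var)
    if std == 0:
        return [0.0 for _ in values]  # constant feature carries no information
    return [(v - mean) / std for v in values]

def one_hot(values):
    """One-hot encode a categorical feature; returns (sorted categories, rows)."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return categories, rows

# Hypothetical toy records (not study data): age in years and clinical stage.
ages = [34.0, 45.0, 52.0, 61.0]
stages = ["IB1", "IB2", "IIA", "IB1"]

age_z = zscore(ages)
stage_names, stage_rows = one_hot(stages)

# Each patient becomes one numeric row: [standardized age, stage indicators...]
features = [[z] + row for z, row in zip(age_z, stage_rows)]
```

One-hot encoding is why individual stage levels such as IB1 and IB2 can later appear as separate entries in a variable-importance ranking.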

The dataset was split into training and test cohorts at a ratio of 1:1 by repeated random sampling until there was no significant difference (p value > 0.05) between the two cohorts with respect to the three tasks (Table 1). The p values were calculated using the Chi-square or Fisher exact test for categorical variables, and Student’s t-test or the Mann–Whitney U test for normally or non-normally distributed continuous variables, respectively. This resulted in a training cohort and a test cohort of 630 patients each.
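A rough sketch of this "resample until balanced" procedure is given below, using only the Python standard library. It is an illustration under stated assumptions, not the study's code: the function names are hypothetical, only binary outcome labels are checked, Pearson's chi-square test is applied without continuity correction, and the labels are synthetic.

```python
import math
import random

def chi2_p_2x2(a, b, c, d):
    """p-value of Pearson's chi-square test (1 d.f., no continuity correction)
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 1.0  # degenerate table: no evidence of a difference
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-square survival function is erfc(sqrt(x/2)).
    return math.erfc(math.sqrt(chi2 / 2))

def balanced_split(labels, alpha=0.05, max_tries=1000, seed=0):
    """Redraw a random 1:1 split until every binary outcome has p > alpha."""
    rng = random.Random(seed)
    idx = list(range(len(labels)))
    half = len(idx) // 2
    for _ in range(max_tries):
        rng.shuffle(idx)
        train, test = idx[:half], idx[half:]
        balanced = True
        for task in range(len(labels[0])):
            tr_pos = sum(labels[i][task] for i in train)
            te_pos = sum(labels[i][task] for i in test)
            p = chi2_p_2x2(tr_pos, len(train) - tr_pos,
                           te_pos, len(test) - te_pos)
            if p <= alpha:
                balanced = False
                break
        if balanced:
            return train, test
    raise RuntimeError("no balanced split found")

# Synthetic labels for three binary tasks (DSI, LNM, LVSI); not study data.
rng = random.Random(42)
labels = [[int(rng.random() < q) for q in (0.45, 0.2, 0.35)] for _ in range(200)]
train_idx, test_idx = balanced_split(labels)
```

Because most random splits of a fixed cohort are already balanced, the loop typically stops after one or two draws; the `max_tries` guard only prevents an endless loop on pathological inputs.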

2.3. Supervised Machine Learning Classifiers

In this study, we evaluated six types of supervised machine learning classifiers: GBM (Gradient Boosting Machine) [13,14], SVMRadial (Support Vector Machine with Gaussian kernel) [15], RF (Random Forest) [16], Cforest (Conditional Random Forest) [17], NB (Naive Bayes) [18] and EN (Elastic Net) [19]. In addition, a logistic regression (LR) classifier was used as a baseline. R software version 4.2.1 with the R package caret was used to implement all classifiers. To calculate variable importance for prediction, one hundred independent training runs were conducted using different random seeds. We used the median of the variable importance values obtained across these runs as a representative value. The importance of each variable was calculated using the varImp function of the caret package. An RF classifier is a group of decision trees that combines two machine learning techniques: bagging and random feature selection. Cforest is an algorithm that uses conditional inference trees as base learners and implements both the random forest and the bagging ensemble algorithms. EN is a logistic regression classifier trained with a regularization method that linearly combines the L1 and L2 penalties of the lasso and ridge methods.
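The idea of repeating training under different seeds and reporting the median importance can be sketched as follows. This is only an illustration of the aggregation step: the importance measure used here (absolute gap between class means) is a simple stand-in and is not caret's varImp, the seed-dependent variation is introduced by bootstrap resampling as an assumption, and the data are synthetic.

```python
import random
import statistics

def mean_gap_importance(rows, labels, n_features):
    """Stand-in importance score: absolute gap between the class means of
    each feature. Hypothetical; it does not reproduce caret's varImp."""
    scores = []
    for f in range(n_features):
        pos = [r[f] for r, y in zip(rows, labels) if y == 1]
        neg = [r[f] for r, y in zip(rows, labels) if y == 0]
        if not pos or not neg:
            scores.append(0.0)
        else:
            scores.append(abs(statistics.mean(pos) - statistics.mean(neg)))
    return scores

def median_importance(rows, labels, n_runs=100):
    """Repeat the importance estimate on seeded bootstrap resamples and
    return the per-feature median across runs."""
    n_features = len(rows[0])
    per_run = []
    for seed in range(n_runs):
        rng = random.Random(seed)  # one seed per run, as in the described protocol
        sample = [rng.randrange(len(rows)) for _ in rows]
        per_run.append(mean_gap_importance(
            [rows[i] for i in sample],
            [labels[i] for i in sample],
            n_features))
    return [statistics.median(run[f] for run in per_run)
            for f in range(n_features)]

# Synthetic data: feature 0 tracks the label, feature 1 is pure noise.
rng = random.Random(1)
labels = [i % 2 for i in range(200)]
rows = [[y + rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for y in labels]
importance = median_importance(rows, labels)
```

Taking the median rather than the mean keeps a single unlucky run from distorting the ranking of variables.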

2.4. Model Assessment

To assess the performance of the different models, we computed the accuracy (ACC) and the area under the ROC curve (AUC) on the test cohort as our evaluation metrics. Here, ACC was obtained by setting the threshold at the point on the ROC curve closest to the top-left corner. As the AUC is independent of the chosen threshold, we used it as the main evaluation metric.
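The relationship between the ROC curve, the AUC and the threshold-dependent ACC can be sketched in a few lines. The sketch assumes that the "top left point" means the ROC point with the smallest Euclidean distance to (FPR 0, TPR 1); other conventions, such as maximizing Youden's J, exist. The scores and labels are hypothetical.

```python
import math

def roc_points(scores, labels):
    """Return [(threshold, fpr, tpr)], predicting positive when score >= threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(math.inf, 0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((t, fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (_, x0, y0), (_, x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

def top_left_threshold(points):
    """Threshold whose ROC point is closest to the ideal corner (FPR 0, TPR 1)."""
    return min(points, key=lambda p: math.hypot(p[1], 1 - p[2]))[0]

def accuracy(scores, labels, threshold):
    """Fraction of cases where (score >= threshold) matches the true label."""
    hits = sum((s >= threshold) == (y == 1) for s, y in zip(scores, labels))
    return hits / len(labels)

# Hypothetical predicted probabilities and true labels (not study data).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

pts = roc_points(scores, labels)
area = auc(pts)                    # 0.8125 for this toy example
thr = top_left_threshold(pts)      # 0.6
acc = accuracy(scores, labels, thr)  # 0.75
```

The toy example shows why the two metrics can diverge: the AUC summarizes ranking quality over all thresholds, whereas ACC reflects a single operating point.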

2.5. Confidence of Prediction and Shannon’s Information Gain

Shannon’s information gain was used to assess the prediction confidence [20]. If no information is available about which of the k classes a patient i belongs to, the Shannon information entropy representing this uncertainty is:

H (i) = \log_{2} k

If a classifier provides prediction probabilities for each class, the entropy becomes:

H_{c} (i) = - \sum_{j = 1}^{k} p_{j} (i) \log_{2} (p_{j} (i))

Here, p_{j} (i) is the predicted probability that patient i belongs to class j. Thus, we obtain the information gain, i.e., the information gained by the prediction:

I G (i) = H (i) - H_{c} (i)

The individual information gain for each class is given by:

I G_{j} (i) = p_{j} (i) \times I G (i)
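As a worked illustration of these definitions, the following sketch computes the information gain for a binary prediction. The entropy is taken with the usual negative sign, so that it is non-negative and the gain lies between 0 and log2 k bits. The function name and the example probabilities are hypothetical.

```python
import math

def information_gain(probs):
    """Return (IG, per-class IG) in bits for one patient's class probabilities."""
    k = len(probs)
    h_prior = math.log2(k)  # H(i) = log2 k: uncertainty with no information
    # H_c(i): entropy of the predicted distribution, using 0 * log2(0) = 0
    h_pred = -sum(p * math.log2(p) for p in probs if p > 0)
    ig = h_prior - h_pred
    return ig, [p * ig for p in probs]

# Hypothetical predicted probabilities for (deep, non-deep); not study data.
ig_confident, per_class = information_gain([0.9, 0.1])  # about 0.531 bits
ig_uncertain, _ = information_gain([0.6, 0.4])          # about 0.029 bits

# A cut-off in bits separates higher-confidence predictions from the rest.
high_confidence = [ig > 0.2 for ig in (ig_confident, ig_uncertain)]
```

For two classes, a prediction of 0.9 versus 0.1 gains about 0.53 bits, whereas 0.6 versus 0.4 gains only about 0.03 bits, so only the former would pass a 0.2-bit cut-off.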

3. Results

3.1. Prediction of Deep Stromal Infiltration of Cervical Cancer Based on Multiple Preoperative Blood Markers Using Machine Learning Methods

Depth of stromal invasion was evaluated by an experienced pathologist and was recognized as significant when microscopic examination showed more than one millimeter of invasion into the stroma. The status of the depth of stromal infiltration was classified into two groups: “non-deep” and “deep”. The “deep” group referred to patients who had an invasive carcinoma with greater than one-third stromal invasion according to the pathologic findings. “Non-deep” indicated a carcinoma infiltrating no more than one third of the cervical stroma. The RF classifier achieved the highest ACC (70.8%) and AUC (0.767), with an AUC 5.4% higher than that of the traditional multiple logistic regression analysis (Figure 1A; Supplemental Table S2). It is notable that the two best classifiers, RF and GBM, both use ensemble methods that combine weak decision trees.

Next, we focused on the best model, RF, and examined its variables. The relative importance of each variable for separating patients with deep stromal infiltration from those with non-deep infiltration was calculated for RF (Figure 1B). We identified the top eight factors, SCC, D-dimer (D-D), tumor diameter, uric acid (URIC), age, neut%, ALP and TP, as important RF predictors for distinguishing deep infiltration from non-deep infiltration. Standard box plots presenting the distribution of each variable in deep and non-deep samples are shown in Figure 1C.

Interestingly, we found that D-D was a critical variable, in addition to SCC. According to the confusion matrix (Figure 1D), RF misclassified 81 patients with deep infiltration as non-deep and 108 patients with non-deep infiltration as deep. When we used the Shannon information gain to represent prediction confidence and selected the predictions designated as higher confidence (>0.2 bits from Shannon information gain computation), these contained only 21 mispredictions out of 148 instances (Figure 1E). In particular, among the predictions with higher confidence, a prediction of non-deep was correct at a rate of 1 − 7/52 = 86.5%.

3.2. Differentiation of Lymph Node Metastasis of Cervical Cancer with Machine Learning Methods

The status of lymph node metastasis was classified into two groups: “metastasis” and “non-metastasis”. We found that Cforest showed the best prediction performance with an ACC of 64.3% and an AUC of 0.620 (Figure 2A; Supplemental Table S2), which achieved a 5.8% higher score than LR in AUC.

Next, the relative importance of a variable for segregating metastatic patients from non-metastatic ones was calculated for Cforest (Figure 2B). We identified the top eight factors, including SCC, IB2, IB1, MONO%, diameter, PT(A), HCT and TT, as important Cforest predictors for distinguishing metastatic patients from non-metastatic ones. It should be noted that as the clinical stage progresses, SCC and tumor diameter can increase. Standard box plots that presented the distribution of each variable between metastatic and non-metastatic samples are shown in Figure 2C.

Interestingly, we found that SCC was a critical variable. According to the confusion matrix (Figure 2D), Cforest predictions had 105 false negative samples and 13 false positive samples. However, predictions designated as higher confidence (>0.2 bits from Shannon information gain computation) contained only 29 mispredictions out of 230 instances (Figure 2E). In particular, for the predictions with higher confidence, if a patient was predicted as non-metastasis, this was right at a rate of 1 − 29/230 = 87.4%.

3.3. Prediction of Lymph-Vascular Space Invasion of Cervical Cancer Based on Preoperative Blood Markers Using Machine Learning Methods

In the task of lymph-vascular space invasion, patients were labeled as “invasion” or “non-invasion”. LVSI refers to the presence of epithelial tumor cells in the lumen of vessels. “Invasion” indicated positive pathologic findings of LVSI and “non-invasion” indicated no pathologic proof of LVSI. We found that EN showed the best prediction performance, with an ACC of 59.7% and an AUC of 0.628, and that the traditional method of multiple logistic regression analysis was comparable, with an ACC of 59.5% and an AUC of 0.627 (Figure 3A; Supplemental Table S2).

Next, the relative importance of each variable for segregating invasion from non-invasion was calculated for EN (Figure 3B). We identified the top eight factors, including RDW-SD, CK-MB, PCT, A/G, PT(A), IB1, TT and TBIL, as important EN predictors for distinguishing invasion patients from non-invasion ones. Standard box plots that present the distribution of each variable between invasion and non-invasion are shown in Figure 3C.

Interestingly, we found that RDW-SD was a critical variable. According to the confusion matrix (Figure 3D), EN predictions had 180 false negative samples and 36 false positive samples. However, predictions designated as higher confidence (>0.2 bits from Shannon information gain computation) contained only 15 mispredictions out of 98 instances (Figure 3D,E). In particular, for the predictions with higher confidence, if a patient was predicted as non-invasion, this was right at a rate of 1 − 15/98 = 84.7%.

4. Discussion

In recent years, machine learning algorithms based on AI technology have been widely accepted and extensively utilized for diagnostic and prognostic assessment of various types of cancers in the context of precision medicine [11,21,22]. This innovative approach, serving as an important tool with high accuracy and efficient ability to process complex data, can explore the key related factors to effectively assist in the clinical decision making of cervical cancer treatment. More importantly, hidden and embedded patterns within familiar clinical data can be revealed with the aid of AI models. However, so far, no studies have been conducted on integrating readily accessible clinical blood markers into the model construction of predicting pathologic risk factors in cervical cancer based on AI technology. Our study allowed for the comparison of various machine learning algorithms with the traditional logistic regression analysis to identify the approach with the most favorable performance and explore the serologic biomarkers with potential diagnostic potency. In cervical cancer with FIGO stage IB-IIA, radical hysterectomy followed by tailored adjuvant radiotherapy and concurrent chemoradiotherapy are both recommended for suitable treatment modalities [21]. Postoperative adjuvant radiotherapy is warranted for women with histopathologically verified risk factors, such as LVSI, LNM, DSI, etc., to improve prognosis [22,23,24], which led to an increase in the risk of higher morbidity [25,26,27]. It is beneficial and meaningful to predict pathologic risk factors so as to identify those more likely to receive postoperative adjuvant radiotherapy to avoid compounding treatment-related morbidity. Currently, the lack of ability to accurately identify those with a higher chance to receive postoperative radiotherapy and achieve individualized medical management instead of a “one-size fits all” approach has been a primary clinical limitation. 
Therefore, predicting pathologic risk factors through comprehensive use of laboratory blood tests and other pretreatment information is a fundamental step toward individualized optimal medical care. In this study, we explored the ability of multiple machine learning methods to predict pathologic risk factors of patients with cervical cancer by incorporating readily available blood biomarkers. We found that three classifiers, RF, Cforest and EN, were able to predict pathologic risk factors of early-stage cervical cancer, among which RF showed the best predictive performance, with an appreciable accuracy of 70.8% and AUC of 0.767 for DSI. Cforest showed the most accurate predictive value for LNM (64.3% accuracy and 0.620 AUC), and EN for LVSI (59.7% accuracy and 0.628 AUC). Compared to the traditional approach of logistic regression analysis, the RF classifier achieved a 5.4% higher AUC in DSI prediction, Cforest achieved a 3.4% higher AUC in LNM prediction and EN showed almost the same performance in LVSI prediction. The weaker performance of these classifiers with regard to LNM and LVSI may be attributable to the lack of strong serum biomarker differences in early-stage cervical cancer. Nevertheless, the results indicate that AI technology can provide valuable predictive information before primary treatment to facilitate an individualized medical strategy. In addition, based on the best results of the machine learning algorithms, this study may offer useful clinical information about the variables that are most important for identifying pathologic risk factors, such as DSI, in early-stage patients.

Previous evidence has suggested that cancer is a metabolic disease associated with inflammation [28]. Cervical cancer harbors a unique collection of inflammatory and metabolic molecules in the serum [29]. In early-stage cervical cancer, local inflammatory processes may be at an initial state in which the peritumoral microenvironment is perhaps altered the most, while distant and systemic metabolic features and cancer-targeted responses are immunosuppressed [30], so that differences in cancer invasiveness are only faintly reflected in serum markers. Understandably, as the tumor grows, tumor burden increases, leading to greater invasiveness. In this study, we found that squamous cell carcinoma antigen (SCC), D-dimer and uric acid (UA) levels were among the top five significant plasma biomarkers for predicting DSI. SCC has been considered the most important diagnostic and prognostic tumor marker in cervical cancer. Many studies demonstrated that an elevated level of pretreatment serum SCC was closely associated with disease progression and recurrence [31,32]. UA is a powerful antioxidant and is considered a protective factor against cancer [33]. However, it has been reported that an elevated level of UA was associated with cancer risk, aggressiveness and poor oncologic outcomes in various cancer types [34,35,36], although few studies have focused on gynecologic cancer. Interestingly, previous studies have also shown a prooxidant role of UA [37], and lower levels of UA were associated with an elevated risk of cancer-related mortality compared with high levels [38]. The precise relation of UA with cancer, especially cervical cancer, needs further study. D-dimer serves as a valuable marker of activation of coagulation and fibrinolysis, and is also known as a biomarker of cancer prognosis, especially in patients with metastasis [39,40,41]. 
The pretreatment prediction model of DSI in cervical cancer performed well and revealed potential meaningful serum biomarkers that were readily available in clinical settings, which is also consistent with previous studies. This study’s findings suggest that the supervised machine learning analysis serves as a feasible and effective approach that can aid in discovering more meaningful biomarkers that are correlated with PRF in cervical cancer and are not identified by conventional multiple regression analysis.

Identification of reliable pretreatment blood markers associated with pathologic risk factors helps clinicians in clinical decision making [42]. In this study, we found some serologic indicators, such as RDW-SD, that had scarcely been related to the diagnosis and prognosis of cervical cancer in previous studies. We found that RDW-SD, the standard deviation of the red cell distribution width (RDW), was the top predictive indicator for LVSI. RDW is a routinely measured hematological index, primarily reflecting the degree of anisocytosis. It has been reported that this simple and inexpensive parameter is a strong and independent risk factor for death in the general population [43]. Research has demonstrated that an aberrantly elevated RDW is associated with poor survival outcomes in most tumor types and stages, independent of age, gender or region [44]. However, little is known about RDW in cervical cancer. One recent study indicated that RDW was associated with worse prognosis in cervical cancer [45]. Excessive oxidative stress, inflammation, and cell senescence have been proposed as the conditions linking RDW to mortality [46,47]. Analysis of more datasets is still needed to confirm the predictive ability of these factors. Given the usefulness of pretreatment blood markers, dynamic measurement of serological indicators over multiple time periods may be more powerful in prediction. As the dynamic analysis of serological indicators is more complex, future studies should develop machine learning algorithms that identify predictive features in time series of preoperative blood variables, which might significantly improve the accuracy of clinical characteristics prediction.

As tumors progress over time, the signal transduction and correlation between the tumor and its microenvironment, including fibroblasts, tumor-related immune cells and endothelial cells, will become increasingly closer [48]. The changes of peripheral blood parameters before surgery were inherently a combination of tumor-specific and microenvironment-specific factors and the result of the interaction between tumor and microenvironment. Given the importance of tumor microenvironment in the process of tumor development, clinicians should make full use of preoperative peripheral blood indicators for treatment decision making, cancer progression evaluation and prognosis assessment. In previous studies, clinicians often ignored the reflection of regular blood biomarkers on the biological characteristics of tumors and relied almost exclusively on tumor-specific factors as included indicators for assessment, which was also a common problem in previous retrospective analysis of tumors. In this study, we identified a series of blood indicators that were readily available and necessary for preoperative evaluation related to pathologic risk factors by machine learning methods, such as UA, D-dimer, thrombin time, AST, MONO%, RDW-SD, etc. These parameters have the potential to be related to the microenvironment in cancer progression or metastasis, and their changes will also influence treatment timing and selection.

There have been a few previous studies exploring the use of serologic biomarkers to predict PRF. One study [49] in 2016 incorporated clinical factors and three blood markers derived from pretreatment routine blood examination to predict LNM, overall survival and recurrence-free survival. It found that the platelet/lymphocyte ratio was significantly associated with LNM. Another study [50] in 2020 found that the pretreatment albumin to fibrinogen ratio was significantly related to lymph node metastasis, depth of stromal infiltration, etc. Many studies focused on predicting survival outcomes or a single PRF of cervical cancer based on clinical factors [51,52,53] and/or radiomic parameters [54,55]. However, no studies have attempted to predict three PRFs based on a series of clinically readily available blood markers. In addition to data analysis methods based on clinical factors, many studies are exploring new approaches to predicting postoperative pathologic risk factors. At present, pathologic risk factors can be diagnosed accurately only from the postoperative pathology report of cervical cancer. Identification of reliable approaches that are able to predict pathologic risk factors in advance would facilitate more accurate diagnostic stratification and a more appropriate treatment strategy. A previous study indicated that DSI can be determined by combining 2D or 3D ultrasound with clinical variables before treatment, with over 70% accuracy and AUC [56]. However, this diagnostic approach depended more on subjective judgment than on objective parameters and was based on relatively few cases. It was reported that the assessment of cervical cancer with full-thickness stromal invasion by MRI examination was limited [57]. 
In Bidus’s study, conization combined with clinical factors to determine DSI and LVSI before treatment also achieved good accuracy, but this method is a destructive examination and may easily interfere with complete resection during radical surgery [58]. In the study of LNM diagnosis, sentinel node staining is currently the most commonly developed method, but it is only used to determine whether complete lymph node resection is performed before surgery [59,60]. In this study, LNM was closely associated with primary tumor size, as staging and tumor diameter were among the top five predictors for LNM. These results indicate that imaging modalities such as MRI, which reflect the visual size of the tumor itself and of enlarged lymph nodes, would potentially provide more accurate predictive information preoperatively. However, previous studies that used magnetic resonance imaging (MRI) and ultrasound to determine lymph node metastasis found that imaging data could, in most cases, only identify lymph node enlargement rather than tumor cell metastases, which led to unsatisfactory accuracy of the prediction model [56,61]. This is a reminder that traditional data analysis based on simple integration of imaging information is not adequate to achieve LNM prediction. It is promising to achieve more comprehensive and precise prediction through AI-based, high-throughput extraction of a large amount of information from images, which will be the focus of our subsequent research. As the approach used in this study did not consider any information from pretreatment biopsies or imaging studies, its ability to predict pathologic risk factors before initial treatment may be limited; indeed, more independent datasets from other institutions are required to investigate how pretreatment blood signatures can be utilized for more accurate assessment of pathologic risk factors. 
The use of high-throughput sequencing analysis, such as RNA sequencing, of pretreatment peripheral blood may improve predictive performance. From another perspective, however, incorporating RNA analysis into preoperative assessment may be more complicated and expensive in current clinical settings. Further comprehensive investigation is needed in the hope of achieving the best clinical and socioeconomic benefits.

Our study has some limitations. Firstly, this study was a single-center retrospective study. The retrospective nature may result in inherent bias. Secondly, results from our database should be supplemented with external and prospective validation to guard against overfitting and to support wider application in clinical practice. Thirdly, other machine learning approaches should be undertaken to manage the missing data in future work. Fourthly, our assessment of diagnostic ability to predict pathological risk factors was preliminary, and further study is warranted to better validate the accuracy of blood biomarkers. At present, our model is not sufficiently powerful and accurate to predict LVSI and LNM, but it has revealed, for the first time and from a large number of variables, some blood biomarkers that may be useful predictors. However, a positive prediction is not trivial; compared with traditional methods, the machine learning algorithms could serve as a feasible tool for clinicians to predict oncologic outcomes based solely on pretherapeutic information.

5. Conclusions

This study indicates that AI-based algorithms are useful tools that may aid in providing critical information for diagnostic evaluation of pathologic risk factors in patients with cervical cancer before initial treatment. The use of predictive algorithms may facilitate personalized treatment selection through pretherapeutic assessment.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/curroncol29120755/s1, Table S1: Pretreatment peripheral blood tests of 1260 cervical cancer patients included in the primary cohort; Table S2: Diagnostic accuracy of clinicopathological factors using machine learning algorithms.

Author Contributions

Conceptualization, D.Z., Y.Y. and B.L.; methodology, D.Z.; formal analysis, Z.O.; investigation, Z.O., W.M., L.T., S.L. and Y.Z.; resources, D.Z. and B.L.; data curation, Z.O., W.M., S.L. and Y.Z.; writing—original draft preparation, Z.O. and W.M.; writing—review and editing, D.Z. and W.M.; visualization, W.M.; supervision, D.Z.; project administration, D.Z. and B.L.; funding acquisition, D.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (D.Z., grant number 62176267), the Natural Science Foundation of Qinghai Province (D.Z., grant number 2021-ZJ-922), the CAMS Innovation Fund for Medical Sciences (D.Z., grant number 2021-I2M-C&T-B-048), the Beijing Hope Run Special Fund of Cancer Foundation of China (D.Z., grant number LC2021A10) and Capital’s Funds for Health Improvement and Research (D.Z., grant number 2022-2-4026).

Institutional Review Board Statement

Ethical review and approval were waived for this study due to the retrospective nature of the data.

Informed Consent Statement

Patient consent was waived due to the retrospective nature of the study.

Data Availability Statement

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

1. Bray, F.; Ferlay, J.; Soerjomataram, I.; Siegel, R.L.; Torre, L.A.; Jemal, A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 2018, 68, 394–424.
2. Bhatla, N.; Aoki, D.; Sharma, D.N.; Sankaranarayanan, R. Cancer of the cervix uteri. Int. J. Gynaecol. Obstet. 2018, 143 (Suppl. 2), 22–36.
3. Peng, Y.H.; Wang, X.X.; Zhu, J.S.; Gao, L. Neo-adjuvant chemotherapy plus surgery versus surgery alone for cervical cancer: Meta-analysis of randomized controlled trials. J. Obstet. Gynaecol. Res. 2016, 42, 128–135.
4. Landoni, F.; Colombo, A.; Milani, R.; Placa, F.; Zanagnolo, V.; Mangioni, C. Randomized study between radical surgery and radiotherapy for the treatment of stage IB-IIA cervical cancer: 20-year update. J. Gynecol. Oncol. 2017, 28, e34.
5. Barter, J.F.; Soong, S.J.; Shingleton, H.M.; Hatch, K.D.; Orr, J.W., Jr. Complications of combined radical hysterectomy-postoperative radiation therapy in women with early stage cervical cancer. Gynecol. Oncol. 1989, 32, 292–296.
6. Ayhan, A.; Al, R.A.; Baykal, C.; Demirtas, E.; Ayhan, A.; Yüce, K. Prognostic factors in FIGO stage IB cervical cancer without lymph node metastasis and the role of adjuvant radiotherapy after radical hysterectomy. Int. J. Gynecol. Cancer 2004, 14, 286–292.
7. Kim, D.Y.; Shim, S.H.; Kim, S.O.; Lee, S.W.; Park, J.Y.; Suh, D.S.; Kim, J.H.; Kim, Y.M.; Kim, Y.T.; Nam, J.H. Preoperative nomogram for the identification of lymph node metastasis in early cervical cancer. Br. J. Cancer 2014, 110, 34–41.
8. Hutchcraft, M.L.; Smith, B.; McLaughlin, E.M.; Hade, E.M.; Backes, F.J.; O’Malley, D.M.; Cohn, D.E.; Fowler, J.M.; Copeland, L.J.; Salani, R. Conization pathologic features as a predictor of intermediate and high risk features on radical hysterectomy specimens in early stage cervical cancer. Gynecol. Oncol. 2019, 153, 255–258.
9. Li, X.; Zhou, J.; Huang, K.; Tang, F.; Zhou, H.; Wang, S.; Jia, Y.; Sun, H.; Ma, D.; Li, S. The predictive value of serum squamous cell carcinoma antigen in patients with cervical cancer who receive neoadjuvant chemotherapy followed by radical surgery: A single-institute study. PLoS ONE 2015, 10, e0122361.
10. Obrzut, B.; Kusy, M.; Semczuk, A.; Obrzut, M.; Kluska, J. Prediction of 5-year overall survival in cervical cancer patients treated with radical hysterectomy using computational intelligence methods. BMC Cancer 2017, 17, 840.
11. Matsuo, K.; Purushotham, S.; Jiang, B.; Mandelbaum, R.S.; Takiuchi, T.; Liu, Y.; Roman, L.D. Survival outcome prediction in cervical cancer: Cox models vs deep-learning model. Am. J. Obstet. Gynecol. 2019, 220, 381.e1–381.e14.
12. Papadia, A.; Bellati, F.; Bogani, G.; Ditto, A.; Martinelli, F.; Lorusso, D.; Donfrancesco, C.; Gasparri, M.L.; Raspagliesi, F. When Does Neoadjuvant Chemotherapy Really Avoid Radiotherapy? Clinical Predictors of Adjuvant Radiotherapy in Cervical Cancer. Ann. Surg. Oncol. 2015, 22 (Suppl. 3), S944–S951.
13. Friedman, J.H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 2001, 29, 1189–1232.
14. Natekin, A.; Knoll, A. Gradient boosting machines, a tutorial. Front. Neurorobot. 2013, 7, 21.
15. Cortes, C.; Vapnik, V. Support-Vector Networks. Mach. Learn. 1995, 20, 273–297.
16. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
17. Liu, L.; Chen, L.; Zhang, K.; Liusan, N.; Yang, Z. Conditional Random Forest Based Smiling Face Detector, Has Random Forest Smile Classification Module for Detecting Dynamic Smiling Face Classifying Random Forest Non-Classification Face Area of Smiling Face. China Patent CN106650637-A, 10 May 2017.
18. Lindley, D.V. Fiducial distributions and Bayes’ theorem. J. R. Stat. Soc. 1958, 1, 102–107.
19. Zou, H.; Hastie, T. Regularization and variable selection via the elastic net. J. R. Stat. Soc. 2005, 67, 301–320.
20. Feutrill, A.; Roughan, M. A Review of Shannon and Differential Entropy Rate Estimation. Entropy 2021, 23, 1046.
21. Bhatla, N.; Aoki, D.; Sharma, D.N.; Sankaranarayanan, R. Cancer of the cervix uteri: 2021 update. Int. J. Gynecol. Obstet. 2021, 155, 28–44.
22. Sedlis, A.; Bundy, B.N.; Rotman, M.Z.; Lentz, S.S.; Muderspach, L.I.; Zaino, R.J. A randomized trial of pelvic radiation therapy versus no further therapy in selected patients with stage IB carcinoma of the cervix after radical hysterectomy and pelvic lymphadenectomy: A Gynecologic Oncology Group Study. Gynecol. Oncol. 1999, 73, 177–183.
23. Pieterse, Q.D.; Trimbos, J.B.M.Z.; Dijkman, A.; Creutzberg, C.L.; Gaarenstroom, K.N.; Peters, A.A.W.; Kenter, G.G. Postoperative radiation therapy improves prognosis in patients with adverse risk factors in localized, early-stage cervical cancer: A retrospective comparative study. Int. J. Gynecol. Cancer 2006, 16, 1112–1118.
24. Ryu, S.-Y.; Park, S.-I.; Nam, B.-H.; Cho, C.-K.; Kim, K.; Kim, B.-J.; Kim, M.-H.; Choi, S.-C.; Lee, E.-D.; Lee, K.-H. Is adjuvant chemoradiotherapy overtreatment in cervical cancer patients with intermediate risk factors? Int. J. Radiat. Oncol. Biol. Phys. 2011, 79, 794–799.
25. Peters, W.A.; Liu, P.Y.; Barrett, R.J.; Stock, R.J.; Monk, B.J.; Berek, J.S.; Souhami, L.; Grigsby, P.; Gordon, W.; Alberts, D.S. Concurrent Chemotherapy and Pelvic Radiation Therapy Compared With Pelvic Radiation Therapy Alone as Adjuvant Therapy After Radical Surgery in High-Risk Early-Stage Cancer of the Cervix. J. Clin. Oncol. 2000, 18, 1606–1613.
26. Landoni, F.; Maneo, A.; Colombo, A.; Placa, F.; Milani, R.; Perego, P.; Favini, G.; Ferri, L.; Mangioni, C. Randomised study of radical surgery versus radiotherapy for stage Ib-IIa cervical cancer. Lancet 1997, 350, 535–540.
27. Kong, T.-W.; Lee, J.-D.; Son, J.-H.; Paek, J.; Chun, M.; Chang, S.-J.; Ryu, H.-S. Treatment outcomes in patients with FIGO stage IB–IIA cervical cancer and a focally disrupted cervical stromal ring on magnetic resonance imaging: A propensity score matching study. Gynecol. Oncol. 2016, 143, 77–82.
28. Wishart, D.S.; Mandal, R.; Stanislaus, A.; Ramirez-Gaona, M. Cancer Metabolomics and the Human Metabolome Database. Metabolites 2016, 6, 10.
29. Yang, K.; Xia, B.; Wang, W.; Cheng, J.; Yin, M.; Xie, H.; Li, J.; Ma, L.; Yang, C.; Li, A.; et al. A Comprehensive Analysis of Metabolomics and Transcriptomics in Cervical Cancer. Sci. Rep. 2017, 7, 43353.
30. Yuan, Y.; Cai, X.; Shen, F.; Ma, F. HPV post-infection microenvironment and cervical cancer. Cancer Lett. 2021, 497, 243–254.
31. Charakorn, C.; Thadanipon, K.; Chaijindaratana, S.; Rattanasiri, S.; Numthavaj, P.; Thakkinstian, A. The association between serum squamous cell carcinoma antigen and recurrence and survival of patients with cervical squamous cell carcinoma: A systematic review and meta-analysis. Gynecol. Oncol. 2018, 150, 190–200.
32. Choi, K.H.; Lee, S.W.; Yu, M.; Jeong, S.; Lee, J.W.; Lee, J.H. Significance of elevated SCC-Ag level on tumor recurrence and patient survival in patients with squamous-cell carcinoma of uterine cervix following definitive chemoradiotherapy: A multi-institutional analysis. J. Gynecol. Oncol. 2019, 30, e1.
33. Ames, B.N.; Cathcart, R.; Schwiers, E.; Hochstein, P. Uric acid provides an antioxidant defense in humans against oxidant- and radical-caused aging and cancer: A hypothesis. Proc. Natl. Acad. Sci. USA 1981, 78, 6858–6862.
34. Xu, Y.; Wu, Z.; Ye, W.; Xiao, Y.; Zheng, W.; Chen, Q.; Bai, P.; Lin, Z.; Chen, C. Prognostic value of serum uric acid and tumor response to induction chemotherapy in locally advanced nasopharyngeal carcinoma. BMC Cancer 2021, 21, 519.
35. Hayashi, M.; Yamada, S.; Tanabe, H.; Takami, H.; Inokawa, Y.; Sonohara, F.; Shimizu, D.; Hattori, N.; Kanda, M.; Tanaka, C.; et al. High Serum Uric Acid Levels Could Be a Risk Factor of Hepatocellular Carcinoma Recurrences. Nutr. Cancer 2021, 73, 996–1003.
36. Yan, S.; Zhang, P.; Xu, W.; Liu, Y.; Wang, B.; Jiang, T.; Hua, C.; Wang, X.; Xu, D.; Sun, B. Serum Uric Acid Increases Risk of Cancer Incidence and Mortality: A Systematic Review and Meta-Analysis. Mediat. Inflamm. 2015, 2015, 764250.
37. Kang, D.H.; Ha, S.K. Uric Acid Puzzle: Dual Role as Anti-oxidant and Pro-oxidant. Electrolyte Blood Press. 2014, 12, 1–6.
38. Kuo, C.F.; See, L.C.; Yu, K.H.; Chou, I.J.; Chiou, M.J.; Luo, S.F. Significance of serum uric acid levels on the risk of all-cause and cardiovascular mortality. Rheumatology 2013, 52, 127–134.
39. Watanabe, A.; Araki, K.; Harimoto, N.; Kubo, N.; Igarashi, T.; Ishii, N.; Yamanaka, T.; Hagiwara, K.; Kuwano, H.; Shirabe, K. D-dimer predicts postoperative recurrence and prognosis in patients with liver metastasis of colorectal cancer. Int. J. Clin. Oncol. 2018, 23, 689–697.
40. Kim, E.Y.; Song, K.Y. Prognostic value of D-dimer levels in patients with gastric cancer undergoing gastrectomy. Surg. Oncol. 2021, 37, 101570.
41. Lin, Y.; Liu, Z.; Qiu, Y.; Zhang, J.; Wu, H.; Liang, R.; Chen, G.; Qin, G.; Li, Y.; Zou, D. Clinical significance of plasma D-dimer and fibrinogen in digestive cancer: A systematic review and meta-analysis. Eur. J. Surg. Oncol. 2018, 44, 1494–1503.
42. Ma, J.Y.; Ke, L.C.; Liu, Q. The pretreatment platelet-to-lymphocyte ratio predicts clinical outcomes in patients with cervical cancer: A meta-analysis. Medicine 2018, 97, e12897.
43. Montagnana, M.; Danese, E. Red cell distribution width and cancer. Ann. Transl. Med. 2016, 4, 399.
44. Wang, P.F.; Song, S.Y.; Guo, H.; Wang, T.J.; Liu, N.; Yan, C.X. Prognostic role of pretreatment red blood cell distribution width in patients with cancer: A meta-analysis of 49 studies. J. Cancer 2019, 10, 4305–4317.
45. Lima, P.S.V.d.; Mantoani, P.T.S.; Murta, E.F.C.; Nomelini, R.S. Laboratory parameters as predictors of prognosis in uterine cervical neoplasia. Eur. J. Obstet. Gynecol. Reprod. Biol. 2021, 256, 391–396.
46. Salvagno, G.L.; Sanchis-Gomar, F.; Picanza, A.; Lippi, G. Red blood cell distribution width: A simple parameter with multiple clinical applications. Crit. Rev. Clin. Lab. Sci. 2015, 52, 86–105.
47. Pan, J.; Borné, Y.; Engström, G. The relationship between red cell distribution width and all-cause and cause-specific mortality in a general population. Sci. Rep. 2019, 9, 16208.
48. Whiteside, T.L. The tumor microenvironment and its role in promoting tumor growth. Oncogene 2008, 27, 5904–5912.
49. Chen, L.; Zhang, F.; Sheng, X.G.; Zhang, S.Q.; Chen, Y.T.; Liu, B.W. Peripheral platelet/lymphocyte ratio predicts lymph node metastasis and acts as a superior prognostic factor for cervical cancer when combined with neutrophil: Lymphocyte. Medicine 2016, 95, e4381.
50. Huang, L.; Mo, Z.; Zhang, L.; Qin, S.; Qin, S.; Li, S. Diagnostic Value of Albumin to Fibrinogen Ratio in Cervical Cancer. Int. J. Biol. Markers 2020, 35, 66–73.
51. Chen, X.; Duan, H.; Liu, P.; Lin, L.; Ni, Y.; Li, D.; Dai, E.; Zhan, X.; Li, P.; Huo, Z.; et al. Development and validation of a prognostic nomogram for 2018 FIGO stages IB1, IB2, and IIA1 cervical cancer: A large multicenter study. Ann. Transl. Med. 2022, 10, 121.
52. Chu, R.; Zhang, Y.; Qiao, X.; Xie, L.; Chen, W.; Zhao, Y.; Xu, Y.; Yuan, Z.; Liu, X.; Yin, A.; et al. Risk Stratification of Early-Stage Cervical Cancer with Intermediate-Risk Factors: Model Development and Validation Based on Machine Learning Algorithm. Oncologist 2021, 26, e2217–e2226.
53. Yang, H.S.; Li, B.; Liu, S.H.; Ao, M. Nomogram model for predicting postoperative survival of patients with stage IB-IIA cervical cancer. Am. J. Cancer Res. 2021, 11, 5559–5570.
54. Du, W.; Wang, Y.; Li, D.; Xia, X.; Tan, Q.; Xiong, X.; Li, Z. Preoperative Prediction of Lymphovascular Space Invasion in Cervical Cancer With Radiomics–Based Nomogram. Front. Oncol. 2021, 11, 637794.
55. Huang, G.; Cui, Y.; Wang, P.; Ren, J.; Wang, L.; Ma, Y.; Jia, Y.; Ma, X.; Zhao, L. Multi-Parametric Magnetic Resonance Imaging-Based Radiomics Analysis of Cervical Cancer for Preoperative Prediction of Lymphovascular Space Invasion. Front. Oncol. 2021, 11, 663370.
56. Palsdottir, K.; Fischerova, D.; Franchi, D.; Testa, A.; Di Legge, A.; Epstein, E. Preoperative prediction of lymph node metastasis and deep stromal invasion in women with invasive cervical cancer: Prospective multicenter study using 2D and 3D ultrasound. Ultrasound Obstet. Gynecol. 2015, 45, 470–475.
57. Okuno, K.; Joja, I.; Miyagi, Y.; Sakaguchi, Y.; Notohara, K.; Kudo, T.; Hiraki, Y. Cervical carcinoma with full-thickness stromal invasion: Relationship between tumor size on T2-weighted images and parametrial involvement. J. Comput. Assist. Tomogr. 2002, 26, 119–125.
58. Bidus, M.A.; Caffrey, A.S.; You, W.B.; Amezcua, C.A.; Chernofsky, M.R.; Barner, R.; Seidman, J.; Rose, G.S. Cervical biopsy and excision procedure specimens lack sufficient predictive value for lymph-vascular space invasion seen at hysterectomy for cervical cancer. Am. J. Obstet. Gynecol. 2008, 199, 151.e1–151.e4.
59. Salvo, G.; Ramirez, P.T.; Levenback, C.F.; Munsell, M.F.; Euscher, E.D.; Soliman, P.T.; Frumovitz, M. Sensitivity and negative predictive value for sentinel lymph node biopsy in women with early-stage cervical cancer. Gynecol. Oncol. 2017, 145, 96–101.
60. Gortzak-Uzan, L.; Jimenez, W.; Nofech-Mozes, S.; Ismiil, N.; Khalifa, M.A.; Dube, V.; Rosen, B.; Murphy, J.; Laframboise, S.; Covens, A. Sentinel lymph node biopsy vs. pelvic lymphadenectomy in early stage cervical cancer: Is it time to change the gold standard? Gynecol. Oncol. 2010, 116, 28–32.
61. Chen, X.L.; Chen, G.W.; Xu, G.H.; Ren, J.; Li, Z.L.; Pu, H.; Li, H. Tumor Size at Magnetic Resonance Imaging Association With Lymph Node Metastasis and Lymphovascular Space Invasion in Resectable Cervical Cancer: A Multicenter Evaluation of Surgical Specimens. Int. J. Gynecol. Cancer 2018, 28, 1545–1552.

Figure 1. Prediction of deep stromal infiltration of cervical cancer based on multiple preoperative blood markers using machine learning methods. (A) ROC curves for predicting deep stromal infiltration of cervical cancer based on all 75 peripheral blood markers, comparing the machine learning methods with logistic regression. (B) Relative importance of variables for prediction of deep stromal infiltration calculated in the RF. Variable importance is represented as a percentage of the highest value. (C) Box and jitter plots representing the distribution of the top eight important parameters for distinguishing infiltration from non-infiltration. (D,E) Confusion matrices indicating the prediction quality of the RF classification for all predictions (D) and for those predictions with high (>0.2 bits) confidence (E). Notes: SCC, squamous cell carcinoma antigen; D-D, D-dimer; URIC, uric acid; ALP, alkaline phosphatase; TP, total protein; IgA, immunoglobulin A; LDH, lactate dehydrogenase; TT, thrombin time; PT(A), plasma prothrombin time ratio (A); MONO%, percentage of monocytes; HCT, hematocrit; HGB, hemoglobin; CK-MB, creatine kinase-MB isoenzyme; b1-G, beta 1 globulin; PT(r), plasma prothrombin time ratio (r).

Figure 2. Differentiation of lymph node metastasis of cervical cancer with machine learning methods. (A) ROC curves for predicting lymph node metastasis of cervical cancer based on all 75 peripheral blood markers, comparing the machine learning methods with logistic regression. (B) Relative importance of variables for prediction of lymph node metastasis calculated in the Cforest. Variable importance is represented as a percentage of the highest value. (C) Box and jitter plots representing the distribution of the top eight important parameters for distinguishing metastasis from non-metastasis. (D,E) Confusion matrices indicating the prediction quality of the Cforest classification for all predictions (D) and for those predictions with high (>0.2 bits) confidence (E). Notes: SCC, squamous cell carcinoma antigen; MONO%, percentage of monocytes; PT(A), plasma prothrombin time ratio (A); HCT, hematocrit; TT, thrombin time; LDH, lactate dehydrogenase; D-D, D-dimer; PT(r), plasma prothrombin time ratio (r); HGB, hemoglobin; ALP, alkaline phosphatase; TP, total protein; URIC, uric acid; neut%, percentage of neutrophils; b1-G, beta 1 globulin; CK-MB, creatine kinase-MB isoenzyme; IgA, immunoglobulin A.

Figure 3. Prediction of lymph-vascular space invasion of cervical cancer based on preoperative blood markers using machine learning methods. (A) ROC curves for predicting lymph-vascular space invasion of cervical cancer based on all 75 peripheral blood markers, comparing the machine learning methods with logistic regression. (B) Relative importance of variables for prediction of lymph-vascular space invasion calculated in the EN. Variable importance is represented as a percentage of the highest value. (C) Box and jitter plots representing the distribution of the top eight important blood markers for distinguishing invasion from non-invasion. (D,E) Confusion matrices indicating the prediction quality of the EN classification for all predictions (D) and for those predictions with high (>0.2 bits) confidence (E). Notes: RDW-SD, standard deviation of red blood cell distribution width; CK-MB, creatine kinase-MB isoenzyme; PCT, plateletcrit; A/G, albumin to globulin ratio; PT(A), plasma prothrombin time ratio (A); TT, thrombin time; TBIL, total bilirubin; TP, total protein; TBA, total bile acid; MCV, mean corpuscular volume; abdo_surgery_0.0, previous abdominal surgery; MONO%, percentage of monocytes; LDL-CHO, low density lipoprotein cholesterol; D-D, D-dimer; b2-MG, beta 2 microglobulin.
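
The two conventions shared by the figure captions above (variable importance shown as a percentage of the highest value, and confusion matrices restricted to predictions with more than 0.2 bits of confidence) can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from this section: confidence is taken here to be one bit minus the binary Shannon entropy of the predicted probability, the classification cut-off is 0.5, and all numbers and function names are hypothetical. It is not the authors' implementation.

```python
import math

def scale_importance(raw):
    """Express each raw importance as a percentage of the largest one."""
    top = max(raw.values())
    return {name: 100.0 * value / top for name, value in raw.items()}

def information_gain_bits(p_positive):
    """Assumed confidence measure: 1 bit minus the binary entropy of the prediction."""
    if p_positive <= 0.0 or p_positive >= 1.0:
        return 1.0  # a certain prediction carries the full bit
    q = 1.0 - p_positive
    entropy = -(p_positive * math.log2(p_positive) + q * math.log2(q))
    return 1.0 - entropy

def confusion_matrix(y_true, p_pred, min_bits=None, cutoff=0.5):
    """Count TP/FP/TN/FN, optionally keeping only predictions above min_bits."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for truth, p in zip(y_true, p_pred):
        if min_bits is not None and information_gain_bits(p) <= min_bits:
            continue  # drop low-confidence predictions
        predicted = 1 if p >= cutoff else 0
        if predicted == 1:
            counts["TP" if truth == 1 else "FP"] += 1
        else:
            counts["FN" if truth == 1 else "TN"] += 1
    return counts

# Hypothetical values for illustration only (not study data).
raw_importance = {"SCC": 12.0, "D-D": 9.0, "URIC": 6.0}
y_true = [1, 0, 1, 0]
p_pred = [0.9, 0.1, 0.6, 0.7]

print(scale_importance(raw_importance))
print(confusion_matrix(y_true, p_pred))                # all predictions, as in panel D
print(confusion_matrix(y_true, p_pred, min_bits=0.2))  # confident subset, as in panel E
```

Under the assumed definition, a 0.2-bit threshold keeps only predictions whose probability lies roughly below 0.24 or above 0.76, which is why panel E summarizes fewer cases than panel D.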

Table 1. Clinical and pathologic characteristics of 1260 patients with cervical cancer.

Variables	All Patients (n = 1260)	Training Cohort (n = 630)	Test Cohort (n = 630)	p Value
Age (years)	45 (18–74)	45 (18–74)	45 (21–73)	0.777
BMI (kg/m²)	23.6 (16.0–42.7)	23.6 (16.0–47.5)	23.7 (16.5–42.7)	0.453
Menopausal status
Yes	353 (28.0%)	184 (29.2%)	169 (26.8%)	0.347
No	907 (72.0%)	446 (70.8%)	461 (73.2%)
Clinical tumor diameter (cm)	3.5 (0.5–8.0)	3.5 (0.5–10.0)	3.5 (0.5–8.0)	0.211
Histology
Squamous carcinoma	1053 (83.6%)	525 (83.3%)	528 (83.8%)	0.82
Adenocarcinoma	133 (10.6%)	69 (11.0%)	64 (10.2%)	0.647
Others	74 (5.8%)	36 (5.7%)	38 (6.0%)	0.811
FIGO stage (2003)
IB1	707 (56.1%)	361 (57.3%)	346 (54.9%)	0.394
IB2	289 (22.9%)	142 (22.5%)	147 (23.3%)	0.738
IIA1	135 (10.7%)	60 (9.5%)	75 (11.9%)	0.172
IIA2	129 (10.3%)	67 (10.6%)	62 (9.8%)	0.642
Gross type
Exophytic	1163 (92.3%)	587 (93.2%)	576 (91.4%)	0.245
Endophytic	97 (7.7%)	43 (6.8%)	54 (8.6%)
Previous abdominal surgery
Yes	255 (20.2%)	133 (21.1%)	122 (19.4%)	0.441
No	1005 (79.8%)	497 (78.9%)	508 (80.6%)
Histologic grade
Good	87 (6.9%)	43 (6.8%)	44 (7.0%)	0.912
Moderate	506 (40.2%)	256 (40.6%)	250 (39.7%)	0.73
Poor	667 (52.9%)	331 (52.5%)	336 (53.3%)	0.778
Deep stromal infiltration
Negative	653 (51.8%)	335 (53.2%)	318 (50.5%)	0.338
Positive	607 (48.2%)	295 (46.8%)	312 (49.5%)
Lymph-vascular space invasion
Negative	829 (65.8%)	415 (65.9%)	414 (65.7%)	0.953
Positive	431 (34.2%)	215 (34.1%)	216 (34.3%)
Lymph node metastasis
Negative	1017 (80.7%)	496 (78.7%)	521 (82.7%)	0.074
Positive	243 (19.3%)	134 (21.3%)	109 (17.3%)
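
Table 1 does not name the test behind its p values. As a hedged illustration, the sketch below assumes an uncorrected Pearson chi-square test on the 2 × 2 table of cohort by status and applies it to the three pathologic risk factor rows, using the counts from Table 1. The function name is hypothetical, and the p value uses the closed form for one degree of freedom, p = erfc(sqrt(χ²/2)).

```python
import math

def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square without continuity correction for [[a, b], [c, d]].

    Returns (statistic, p_value) with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = 0.0
    for observed, row, col in ((a, row1, col1), (b, row1, col2),
                               (c, row2, col1), (d, row2, col2)):
        expected = row * col / n
        stat += (observed - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Counts from Table 1: (training negative, training positive, test negative, test positive).
rows = {
    "Deep stromal infiltration": (335, 295, 318, 312),
    "Lymph-vascular space invasion": (415, 215, 414, 216),
    "Lymph node metastasis": (496, 134, 521, 109),
}

for name, counts in rows.items():
    stat, p = pearson_chi2_2x2(*counts)
    print(f"{name}: chi2 = {stat:.3f}, p = {p:.3f}")
```

Under this assumption the three rows evaluate to approximately 0.338, 0.953 and 0.074, in line with the p values reported in the table; a continuity-corrected test would give somewhat larger values.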

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
