Engineering Proceedings
  • Proceeding Paper
  • Open Access

5 December 2023

Prediction Model for Preoperative Diagnosis of Ovarian Cancer Using Tumor Markers, CBC, and LFT †

Department of Mathematics, Faculty of Science, King Mongkut’s University of Technology Thonburi, Bangkok 10140, Thailand
Presented at the IEEE 5th Eurasia Conference on Biomedical Engineering, Healthcare and Sustainability, Tainan, Taiwan, 2–4 June 2023.
This article belongs to the Proceedings 2023 IEEE 5th Eurasia Conference on Biomedical Engineering, Healthcare and Sustainability

Abstract

Preoperative diagnosis indexes for ovarian cancer (OC) were developed for three risk factor groups using secondary data. Binary and multiple logistic regression and the receiver operating characteristic curve were used to analyze the risk factor groups of tumor markers, complete blood count (CBC), and liver function tests (LFT), and to explore potential predictors for each group. Data from 202 patients, 116 with and 86 without ovarian cancer, were analyzed in this research. For the tumor marker group, menopausal status, human epididymis protein 4, and carbohydrate antigen 19-9 were included in the derivation of the preoperative diagnosis index. For the CBC group, menopausal status, lymphocyte count, and basophil cell ratio were used as predictors. Menopausal status, albumin, alkaline phosphatase, and indirect bilirubin were used as predictors for the LFT group. The areas under the receiver operating characteristic curve (AUROC) for tumor markers, CBC, and LFT were 0.89 (95% CI, 0.845–0.935; sensitivity = 0.776, specificity = 0.919), 0.813 (95% CI, 0.755–0.871; sensitivity = 0.741, specificity = 0.767), and 0.81 (95% CI, 0.751–0.868; sensitivity = 0.664, specificity = 0.837), respectively.

1. Introduction

Gynecological cancer is the most common cancer in women. However, its diagnosis is complicated because the cancer is found in the pelvis and can be diagnosed only by internal examination. Diagnosis may therefore be delayed, which adversely affects the outcome because treatment is more effective at an early stage than at an advanced stage. Among gynecological cancers, ovarian cancer is the second most common and the leading cause of death. In 2020, the age-standardized incidence rates per 100,000 women were 7.1 and 5.8 for countries with high/very high Human Development Index (HDI) and low/medium HDI, respectively [1]. Ovarian cancer often shows no symptoms in its early stages, so patients tend to see a doctor only at an advanced stage. This results in a high mortality rate compared to other gynecological cancers. An important diagnostic method for ovarian cancer is the detection of pelvic masses that are relatively hard and have a rough texture, along with the presence of ascites. The stage of ovarian cancer is determined by the International Federation of Gynecology and Obstetrics (FIGO) system, which describes how much cancer is in the body and how serious the cancer is. The initial treatment of patients at early stages is planned, while cytoreductive surgery is usually considered for patients at the advanced stage of ovarian cancer [2]. Treatment can be decided on the basis of a preoperative diagnosis that classifies patients as high or low risk. Generally, patients at high risk must be referred to gynecologic oncologists for appropriate management.
Many indexes have been developed for preoperative diagnosis, such as the Risk of Malignancy Index (RMI) [3] and the Risk of Ovarian Malignancy Algorithm (ROMA) [4]. RMI is recommended by the American College of Obstetricians and Gynecologists (ACOG) as a tool for referring patients to gynecologic oncologists [2]. These indexes require the collection of demographic data, blood tests, morphological patterns, and biomarkers for prognostic purposes in the preoperative diagnosis. Consequently, several hospitals are unable to use these indexes because of a shortage of ultrasound specialists or gynecologists. In this study, preoperative diagnosis indexes were therefore developed separately from tumor markers, complete blood count (CBC), and liver function tests (LFT) for women without pelvic or adnexal mass data.

2. Methods

This retrospective study developed preoperative diagnostic indexes using clinical data published by Mi et al. [5]. Three groups of risk factors for ovarian cancer were of interest: tumor markers, CBC, and LFT. The data of each group were analyzed to construct preoperative diagnosis indexes using multiple logistic regression (or binary logistic regression) in SPSS. Significant predictors were identified from the results of the multiple logistic regression, and their predictive significance was assessed based on the diagnostic odds ratio and p-value. The regression was then repeated with these predictors until all remaining variables were statistically significant, which determined the logistic response function. After that, the Hosmer–Lemeshow goodness-of-fit statistic for logistic regression models was used to assess the calibration of the models. The discriminative ability, or predictive performance, was assessed by calculating the area under the receiver operating characteristic curve (AUROC), where the curve plots the sensitivity against 1 − specificity at the various possible cut-off points. The optimal cut-off point was taken as the point on the receiver operating characteristic curve closest to the perfect cut-off point [6]. Furthermore, the sensitivity, specificity, positive predictive value, negative predictive value, and accuracy were calculated using a 2 × 2 table.
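The cut-off selection and the 2 × 2 table metrics described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the SPSS procedure used in the study: it assumes that the "perfect cut-off point" is the top-left corner of the ROC plane (1 − specificity = 0, sensitivity = 1), that a patient is classified as positive when the predicted probability is at or above the cut-off, and it uses made-up labels and probabilities. The function names are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple


def confusion_metrics(y_true: Sequence[int], y_prob: Sequence[float],
                      cutoff: float) -> Dict[str, float]:
    """Summarise the 2 x 2 table for the rule 'positive if probability >= cutoff'."""
    pairs = list(zip(y_true, y_prob))
    tp = sum(1 for y, p in pairs if y == 1 and p >= cutoff)
    fn = sum(1 for y, p in pairs if y == 1 and p < cutoff)
    tn = sum(1 for y, p in pairs if y == 0 and p < cutoff)
    fp = sum(1 for y, p in pairs if y == 0 and p >= cutoff)

    def ratio(num: int, den: int) -> float:
        # Guard against empty rows or columns of the 2 x 2 table.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, len(pairs)),
    }


def closest_to_perfect_cutoff(y_true: Sequence[int],
                              y_prob: Sequence[float]) -> Tuple[float, Dict[str, float]]:
    """Return the cut-off whose ROC point lies nearest to (0, 1)."""
    best_cutoff, best_metrics, best_dist = float("nan"), {}, float("inf")
    for cutoff in sorted(set(y_prob)):  # every observed probability is a candidate
        m = confusion_metrics(y_true, y_prob, cutoff)
        dist = math.hypot(1.0 - m["specificity"], 1.0 - m["sensitivity"])
        if dist < best_dist:
            best_cutoff, best_metrics, best_dist = cutoff, m, dist
    return best_cutoff, best_metrics


# Made-up example data (1 = ovarian cancer, 0 = no ovarian cancer).
y_true = [0, 0, 0, 1, 0, 1, 1, 1, 1]
y_prob = [0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.70, 0.80, 0.90]

cutoff, metrics = closest_to_perfect_cutoff(y_true, y_prob)
print(cutoff, metrics)
```

Other criteria for choosing a cut-off exist (for example, maximizing sensitivity + specificity − 1), and they need not select the same point as the distance criterion sketched here.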

3. Results

The data were collected from the records of 202 patients to develop the preoperative diagnosis indexes [5]. The data contained information on the status, blood test results, and biomarkers of each patient, including 86 without ovarian cancer and 116 with ovarian cancer. The risk factors of ovarian cancer were divided into three groups: tumor markers, CBC, and LFT. Menopause was included in the analysis because postmenopausal patients are known to have a higher risk of developing epithelial cancer.

3.1. Logistic Regression Analysis: Tumor Marker Group

The preoperative diagnosis index based on tumor markers was created by using binary logistic regression analysis. The tumor markers considered were the levels of human epididymis protein 4 (HE4) and carbohydrate antigen 19-9 (CA19-9). The details and parameters of the tumor markers are presented in Table 1. The logistic response function of the tumor marker group was described as
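$$P(\mathrm{OC}) = \frac{e^{\beta_0 + \beta_1\,\mathrm{Menopausal} + \beta_2\,\mathrm{HE4} + \beta_3\,\mathrm{CA19\text{-}9}}}{1 + e^{\beta_0 + \beta_1\,\mathrm{Menopausal} + \beta_2\,\mathrm{HE4} + \beta_3\,\mathrm{CA19\text{-}9}}},$$
where $\beta_0$ is the regression constant, $\beta_i$ ($1 \le i \le 3$) is the regression coefficient of each independent variable, and $P(\mathrm{OC})$, with $0 \le P(\mathrm{OC}) \le 1$, represents the probability that a patient has ovarian cancer.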
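The same response function can be rewritten on the log-odds scale, which makes the role of each coefficient easier to read. The following is a standard algebraic rearrangement of the logistic function, not an additional result of the study; $\mathrm{OR}_i$ is shorthand introduced here for the odds ratio associated with the $i$-th predictor.

$$\ln\frac{P(\mathrm{OC})}{1 - P(\mathrm{OC})} = \beta_0 + \beta_1\,\mathrm{Menopausal} + \beta_2\,\mathrm{HE4} + \beta_3\,\mathrm{CA19\text{-}9}, \qquad \mathrm{OR}_i = e^{\beta_i}.$$

A one-unit increase in a predictor therefore multiplies the odds of ovarian cancer by $e^{\beta_i}$ when the other predictors are held fixed, so a positive coefficient raises the predicted probability and a negative coefficient lowers it.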
Table 1. Parameters and their details for creation of preoperative diagnosis index of tumor marker group.
After conducting binary logistic regression based on tumor markers, the menopausal status, human epididymis protein 4 level, and carbohydrate antigen 19-9 level were included in the multiple logistic regression analysis. The model of the binary logistic regression was described as
$$P(\mathrm{OC}) = \frac{e^{-3.867 + 0.623\,\mathrm{Menopausal} + 0.056\,\mathrm{HE4} + 0.008\,(\mathrm{CA19\text{-}9})}}{1 + e^{-3.867 + 0.623\,\mathrm{Menopausal} + 0.056\,\mathrm{HE4} + 0.008\,(\mathrm{CA19\text{-}9})}},$$
where the levels of HE4 and CA19-9 are measured in pmol/L and units/mL, respectively. The results of multiple logistic regression analysis are presented in Table 2. For discriminative ability, the AUROC of the model of the tumor marker group was 0.89 (95% CI, 0.845–0.935), and the p-value of the Hosmer–Lemeshow goodness-of-fit test was 0.694. With the optimal cutoff point of 0.5, the sensitivity, specificity, positive predictive value, negative predictive value, and accuracy were 0.776, 0.919, 0.928, 0.752, and 0.837, respectively.
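As an illustration of how the fitted tumor marker model could be applied to a single patient, the sketch below evaluates the logistic response function and compares the result with the cutoff of 0.5. Several points are assumptions rather than statements from the study: the intercept is taken as −3.867, because a positive intercept would push every predicted probability above 0.98 and could not give the reported specificity at this cutoff; menopausal status is assumed to be coded as 1 for postmenopausal and 0 for premenopausal; and the function name, dictionary keys, and patient values are hypothetical.

```python
import math
from typing import Dict


def logistic_probability(intercept: float, coefficients: Dict[str, float],
                         values: Dict[str, float]) -> float:
    """Logistic response P = e^z / (1 + e^z) with z = intercept + sum(coef * value)."""
    z = intercept + sum(coefficients[name] * values[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-z))  # algebraically equal to e^z / (1 + e^z)


# Coefficients as read from the fitted model; the intercept sign is an assumption.
TUMOR_MARKER_INTERCEPT = -3.867
TUMOR_MARKER_COEFFICIENTS = {"menopausal": 0.623, "he4": 0.056, "ca19_9": 0.008}
TUMOR_MARKER_CUTOFF = 0.5

# Hypothetical patients: HE4 in pmol/L, CA19-9 in units/mL,
# menopausal coded 1 = postmenopausal, 0 = premenopausal (assumed coding).
patient_a = {"menopausal": 1, "he4": 100.0, "ca19_9": 30.0}
patient_b = {"menopausal": 0, "he4": 40.0, "ca19_9": 10.0}

p_a = logistic_probability(TUMOR_MARKER_INTERCEPT, TUMOR_MARKER_COEFFICIENTS, patient_a)
p_b = logistic_probability(TUMOR_MARKER_INTERCEPT, TUMOR_MARKER_COEFFICIENTS, patient_b)

high_risk_a = p_a >= TUMOR_MARKER_CUTOFF
high_risk_b = p_b >= TUMOR_MARKER_CUTOFF
print(f"patient A: P(OC) = {p_a:.3f}, high risk = {high_risk_a}")
print(f"patient B: P(OC) = {p_b:.3f}, high risk = {high_risk_b}")
```

The same helper could be reused for the CBC and LFT models by swapping in their coefficients and cutoff points.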
Table 2. Result of logistic regression analysis of preoperative diagnosis index of tumor marker group.

3.2. Logistic Regression Analysis: CBC Group

To create the preoperative diagnosis index of the CBC group, basophil count (BASO), lymphocyte count (LYM), and menopausal status were included in the binary logistic regression analysis. The logistic response function is as follows.
$$P(\mathrm{OC}) = \frac{e^{\beta_0 + \beta_1\,\mathrm{BASO} + \beta_2\,\mathrm{LYM} + \beta_3\,\mathrm{Menopausal}}}{1 + e^{\beta_0 + \beta_1\,\mathrm{BASO} + \beta_2\,\mathrm{LYM} + \beta_3\,\mathrm{Menopausal}}},$$
where $\beta_0$ is the regression constant, $\beta_i$ ($1 \le i \le 3$) is the regression coefficient of each independent variable, and $P(\mathrm{OC})$, with $0 \le P(\mathrm{OC}) \le 1$, represents the probability of a patient having ovarian cancer. Table 3 presents information on the parameters of the index for the CBC group.
Table 3. Parameters and their details for creation of preoperative diagnosis index of CBC group.
The model based on the result of the binary logistic regression analysis included BASO, LYM, and menopausal status, where BASO is expressed as a ratio to the white blood cell count and LYM is expressed in units of 10^9/L. The details of the parameters for multiple logistic regression analysis are shown in Table 4, and the logistic response function is as follows:
$$P(\mathrm{OC}) = \frac{e^{2.395 - 1.02\,\mathrm{BASO} - 1.283\,\mathrm{LYM} + 1.87\,\mathrm{Menopausal}}}{1 + e^{2.395 - 1.02\,\mathrm{BASO} - 1.283\,\mathrm{LYM} + 1.87\,\mathrm{Menopausal}}}.$$
Table 4. Result of logistic regression analysis of preoperative diagnosis index of CBC group.
The AUROC of the model was 0.813 (95% CI, 0.755–0.871), and the model fitted the data well (p-value = 0.2 in the Hosmer–Lemeshow test). At the optimal cutoff point of 0.53, the sensitivity, specificity, positive predictive value, negative predictive value, and accuracy were 0.741, 0.767, 0.811, 0.688, and 0.753, respectively.
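To make the direction and size of each effect in the CBC model easier to read, the coefficients can be converted to odds ratios by exponentiation. The sketch below is a derived illustration, not a reproduction of Table 4: the coefficients of BASO and LYM are taken as negative, an assumption that matches the statement in the discussion that the likelihood of ovarian cancer decreases as BASO and LYM increase, and the variable names are hypothetical.

```python
import math

# Coefficients of the CBC model; the negative signs of BASO and LYM are assumed.
CBC_COEFFICIENTS = {"BASO": -1.02, "LYM": -1.283, "Menopausal": 1.87}

# Odds ratio per one-unit increase of each predictor: OR = exp(beta).
cbc_odds_ratios = {name: math.exp(beta) for name, beta in CBC_COEFFICIENTS.items()}

for name, odds_ratio in cbc_odds_ratios.items():
    direction = "raises" if odds_ratio > 1 else "lowers"
    print(f"{name}: OR = {odds_ratio:.3f} ({direction} the odds of ovarian cancer)")
```

Under these assumptions, an odds ratio below 1 for BASO and LYM corresponds to lower odds of ovarian cancer at higher values, while the odds ratio for menopausal status is well above 1.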

3.3. Logistic Regression Analysis: LFT Group

The data of the LFT group included albumin (ALB), alkaline phosphatase (ALP), and indirect bilirubin (IBIL). These parameters were used to create a preoperative diagnosis index by using binary logistic regression. The information for each parameter is shown in Table 5. The logistic function that describes the response of this case is expressed as
$$P(\mathrm{OC}) = \frac{e^{\beta_0 + \beta_1\,\mathrm{Menopausal} + \beta_2\,\mathrm{ALB} + \beta_3\,\mathrm{ALP} + \beta_4\,\mathrm{IBIL}}}{1 + e^{\beta_0 + \beta_1\,\mathrm{Menopausal} + \beta_2\,\mathrm{ALB} + \beta_3\,\mathrm{ALP} + \beta_4\,\mathrm{IBIL}}},$$
where $\beta_0$ is the regression constant, $\beta_i$ ($1 \le i \le 4$) is the regression coefficient of each independent variable, and $P(\mathrm{OC})$, with $0 \le P(\mathrm{OC}) \le 1$, represents the probability of a patient having ovarian cancer.
Table 5. Parameters and their details for creation of preoperative diagnosis index of LFT group.
The binary logistic regression analysis included ALB, ALP, IBIL, and menopausal status; the units of ALB, ALP, and IBIL are g/L, units/L, and μmol/L, respectively. The coefficient and detail of each parameter are shown in Table 6. The model is described as
$$P(\mathrm{OC}) = \frac{e^{2.606 + 1.919\,\mathrm{Menopausal} - 0.093\,\mathrm{ALB} + 0.02\,\mathrm{ALP} - 0.198\,\mathrm{IBIL}}}{1 + e^{2.606 + 1.919\,\mathrm{Menopausal} - 0.093\,\mathrm{ALB} + 0.02\,\mathrm{ALP} - 0.198\,\mathrm{IBIL}}}.$$
Table 6. Result of logistic regression analysis of preoperative diagnosis index of LFT group.
The AUROC of the model was 0.81 (95% CI, 0.751–0.868), and the Hosmer–Lemeshow test indicated a good fit of the model to the data (p-value = 0.596). With an optimal cutoff point of 0.61, the sensitivity, specificity, positive predictive value, negative predictive value, and accuracy were 0.664, 0.837, 0.846, 0.649, and 0.738, respectively.
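The reported diagnostic indexes of the LFT model can be checked for internal consistency by rebuilding a 2 × 2 table from the sensitivity, the specificity, and the group sizes given in Section 3 (116 patients with and 86 without ovarian cancer). The cell counts below are back-calculated for illustration and are not reported in the study; the variable names are hypothetical.

```python
# Group sizes from Section 3 and the reported sensitivity/specificity of the LFT model.
N_CANCER, N_NO_CANCER = 116, 86
SENSITIVITY, SPECIFICITY = 0.664, 0.837

# Back-calculated cell counts (an assumption: nearest whole patients).
tp = round(SENSITIVITY * N_CANCER)     # true positives
fn = N_CANCER - tp                     # false negatives
tn = round(SPECIFICITY * N_NO_CANCER)  # true negatives
fp = N_NO_CANCER - tn                  # false positives

ppv = tp / (tp + fp)                   # positive predictive value
npv = tn / (tn + fn)                   # negative predictive value
accuracy = (tp + tn) / (N_CANCER + N_NO_CANCER)

print(f"TP={tp}, FN={fn}, TN={tn}, FP={fp}")
print(f"PPV={ppv:.3f}, NPV={npv:.3f}, accuracy={accuracy:.3f}")
```

With these counts, the positive predictive value, negative predictive value, and accuracy round to 0.846, 0.649, and 0.738, which agrees with the values reported for the LFT group.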

4. Discussion and Conclusions

In this study, three binary logistic regression models were developed for the preoperative diagnosis of ovarian cancer based on the data of tumor markers, CBC, and LFT. The model based on the data of tumor markers showed that HE4 and CA19-9 levels were significant predictive parameters for ovarian cancer, with the HE4 level presenting the highest odds ratio. Menopausal status, BASO, and LYM were significant predictive parameters in the preoperative diagnosis based on the CBC data. For the preoperative diagnosis based on the data of LFT, menopausal status, ALB, ALP, and IBIL were the significant parameters. Menopausal status had the highest odds ratio in the CBC and LFT groups. The discriminative performance of each model (Figure 1) was evaluated with the AUROC of the logistic regression models for the tumor marker, CBC, and LFT groups. The model of the tumor marker group had the largest AUROC, indicating better diagnostic performance than that of the CBC and LFT groups. The diagnostic indexes of the models for each group are compared in Table 7.
Figure 1. AUROC of logistic regression models of the tumor marker, CBC, and LFT groups.
Table 7. Comparison of the AUROC of the logistic regression models of the tumor marker, CBC, and LFT groups.
It was found that the likelihood of ovarian cancer in patients decreased as the levels of BASO and LYM increased. Therefore, treatments that increase the levels of BASO and LYM may be beneficial. The indexes of the model of the LFT group indicated that increasing levels of ALB, IBIL, and direct bilirubin (DBIL) decreased the likelihood of having the disease.
The developed preoperative diagnosis indexes of ovarian cancer without pelvic or adnexal mass data can be used as a reference for the evaluation of patients presenting with ovarian tumors by physicians or gynecologists and assist in management planning and patient prioritization for surgery, potentially reducing surgical risks.

Author Contributions

Conceptualization, S.T. and T.S.; methodology, S.T. and T.S.; software, S.T.; validation, S.T. and T.S.; formal analysis, S.T.; investigation, S.T. and T.S.; resources, S.T. and T.S.; data curation, S.T.; writing—original draft preparation, S.T.; writing—review and editing, S.T. and T.S.; visualization, S.T.; supervision, T.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Data is available in a publicly accessible repository. The data presented in this study and supporting information are openly available in Mendeley Data at https://doi.org/10.17632/th7fztbrv9.11 [5].

Acknowledgments

The authors sincerely thank the Science Achievement Scholarship of Thailand and the Department of Mathematics, Faculty of Science, King Mongkut’s University of Technology Thonburi for their support.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Sung, H.; Ferlay, J.; Siegel, R.L.; Laversanne, M.; Soerjomataram, I.; Jemal, A.; Bray, F. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 2021, 71, 209–249.
  2. Chirdchim, W.; Wanichsetakul, P.; Phinyo, P. Development and Validation of a Predictive Score for Preoperative Diagnosis of Early Stage Epithelial Ovarian Cancer. Asian Pac. J. Cancer Prev. 2019, 20, 1207–1213.
  3. Jacobs, I.; Oram, D.; Fairbanks, J.; Turner, J.; Frost, C.; Grudzinskas, J.G. A risk of malignancy index incorporating CA 125, ultrasound and menopausal status for the accurate preoperative diagnosis of ovarian cancer. Br. J. Obstet. Gynaecol. 1990, 97, 922–929.
  4. Moore, R.G.; McMeekin, D.S.; Brown, A.K.; DiSilvestro, P.; Miller, M.C.; Allard, W.J.; Gajewski, W.; Kurman, R.; Bast, R.C.; Skates, S.J. A novel multiple marker bioassay utilizing HE4 and CA125 for the prediction of ovarian cancer in patients with a pelvic mass. Gynecol. Oncol. 2009, 112, 40–46.
  5. Data for: Using Machine Learning to Predict Ovarian Cancer. Available online: https://data.mendeley.com/datasets/th7fztbrv9/11 (accessed on 23 October 2020).
  6. Pepe, M.S. The Statistical Evaluation of Medical Tests for Classification and Prediction; Oxford University Press: Oxford, UK, 2003.
