Machine Learning Algorithms to Predict Recurrence within 10 Years after Breast Cancer Surgery: A Prospective Cohort Study
Simple Summary
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Design and Patients
2.2. Statements of the Forecasting Models
2.2.1. Artificial Neural Networks (ANN) Model
2.2.2. K-Nearest Neighbor (KNN) Model
2.2.3. Support Vector Machine (SVM) Model
2.2.4. Naïve Bayesian Classifier (NBC) Model
2.2.5. Cox Proportional-Hazards Regression (COX) Model
2.3. Potential Predictors
2.4. Statistical Analysis
3. Results
3.1. Study Characteristics
3.2. Comparison of Forecasting Models
3.3. Significant Predictors in the ANN Model
3.4. Sensitivity Analysis
4. Discussion
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
Variables | N (%) | Mean ± SD |
---|---|---|
Demographic characteristics | ||
Age, years | | 52.30 ± 10.98 |
Education, years | | 10.20 ± 3.77 |
Current residence with family member(s) | 1095 (96.1%) | |
Married | 1002 (87.9%) | |
Body mass index, kg/m² | | 24.55 ± 4.71 |
Charlson Comorbidity Index, score | | 1.05 ± 1.38 |
Tumor size, cm | | 2.42 ± 1.80 |
Tumor stage | ||
I | 369 (32.4%) | |
II | 456 (40.0%) | |
III | 315 (27.6%) | |
Smoker | 57 (5.0%) | |
Drinker | 30 (2.6%) | |
Breast cancer history | 153 (13.4%) | |
Clinical characteristics | ||
Surgery | ||
BCS | 117 (10.3%) | |
MRM | 306 (26.8%) | |
Mastectomy with reconstruction | 717 (62.9%) | |
ASA score | | 2.04 ± 0.39 |
Chemotherapy | 816 (71.6%) | |
Radiotherapy | 663 (58.2%) | |
Hormonal therapy | 681 (59.7%) | |
Surgeon volume | ||
Low (≤8 cases/year) | 376 (33.0%) | |
Medium (9–16 cases/year) | 381 (33.4%) | |
High (≥17 cases/year) | 383 (33.6%) | |
Hospital volume | ||
Low (≤19 cases/year) | 374 (32.8%) | |
Medium (20–29 cases/year) | 381 (33.4%) | |
High (≥30 cases/year) | 385 (33.8%) | |
Quality of care within 10 years | ||
Readmission in 30 days | 285 (25.0%) | |
Recurrence | 225 (19.7%) | |
Survival | 840 (73.7%) | |
Preoperative quality of life | ||
Preoperative SF36 PCS score | | 56.02 ± 7.44 |
Preoperative SF36 MCS score | | 41.12 ± 18.39 |

Variables | HR (95% CI) | p Value |
---|---|---|
Demographic characteristics | ||
Age, years | 0.97 (0.97–0.98) | <0.001 |
Education, years | 0.88 (0.87–0.90) | <0.001 |
Current residence with family member(s) (no vs. yes) | 0.25 (0.22–0.29) | <0.001 |
Marital status (unmarried vs. married) | 0.47 (0.44–0.51) | <0.001 |
Body mass index, kg/m² | 0.95 (0.94–0.95) | <0.001 |
Charlson Comorbidity Index, score | 0.62 (0.57–0.68) | <0.001 |
Tumor size, cm | 0.66 (0.62–0.70) | <0.001 |
Tumor stage | ||
I vs. 0 | 0.19 (0.15–0.26) | <0.001 |
II vs. 0 | 0.19 (0.15–0.24) | <0.001 |
≥III vs. 0 | 0.47 (0.36–0.62) | <0.001 |
Smoker (no vs. yes) | 0.46 (0.26–0.81) | 0.007 |
Drinker (no vs. yes) | 0.11 (0.34–0.37) | <0.001 |
Breast cancer history (no vs. yes) | 0.34 (0.24–0.49) | <0.001 |
Clinical characteristics | ||
Surgery type | ||
BCS | 0.31 (0.16–0.63) | <0.001 |
MRM | 2.00 (1.47–2.73) | <0.001 |
Mastectomy with reconstruction | 0.75 (0.56–1.00) | 0.050 |
ASA score | 0.50 (0.47–0.54) | <0.001 |
Chemotherapy (no vs. yes) | 0.28 (0.24–0.33) | <0.001 |
Radiotherapy (no vs. yes) | 0.28 (0.23–0.33) | <0.001 |
Hormonal therapy (no vs. yes) | 0.24 (0.20–0.30) | <0.001 |
Surgeon volume (medium vs. low) | 0.98 (0.98–0.99) | <0.001 |
Surgeon volume (high vs. low) | 0.97 (0.97–0.98) | <0.001 |
Hospital volume (medium vs. low) | 0.98 (0.98–0.99) | <0.001 |
Hospital volume (high vs. low) | 0.97 (0.97–0.98) | <0.001 |
Quality of care within 10 years | ||
Readmission in 30 days (no vs. yes) | 0.23 (0.17–0.31) | <0.001 |
Postoperative reconstruction (no vs. yes) | 0.50 (0.35–0.72) | <0.001 |
Preoperative quality of life | ||
Preoperative SF36 PCS score | 0.97 (0.96–0.97) | <0.001 |
Preoperative SF36 MCS score | 0.98 (0.97–0.98) | <0.001 |
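
The hazard ratios in the table above are exponentiated Cox regression coefficients. As a minimal sketch, assuming a Wald-type interval (the table does not state how its intervals were computed), an HR and its 95% CI can be obtained from a coefficient and its standard error as shown below. The function name and the values of `beta` and `se` are hypothetical and are not taken from the study.

```python
import math


def hazard_ratio(beta, se, z=1.96):
    """Return (HR, lower, upper) for a Cox coefficient and its standard error.

    Assumes a Wald interval on the log-hazard scale: exp(beta +/- z * se).
    """
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )


# Hypothetical inputs chosen for illustration only.
hr, lower, upper = hazard_ratio(beta=-0.693, se=0.18)
print(f"HR = {hr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

Because the interval is built symmetrically around the coefficient on the log scale, the point estimate always lies between the two bounds, and an HR below 1 with an upper bound below 1 corresponds to p < 0.05 under the same assumption.
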
Variables | Training Dataset (n = 798) | Testing Dataset (n = 171) | p Value |
---|---|---|---|
Demographic characteristics | |||
Age, years | 51.97 ± 11.32 | 53.22 ± 10.94 | 0.189 |
Education, years | 10.24 ± 3.83 | 10.13 ± 3.69 | 0.722 |
Current residence with family member(s) | 770 (96.5%) | 163 (95.3%) | 0.462 |
Married | 706 (88.5%) | 150 (87.6%) | 0.742 |
Body mass index, kg/m² | 24.66 ± 4.92 | 24.36 ± 4.67 | 0.462 |
Charlson Comorbidity Index, score | 1.05 ± 1.38 | 1.11 ± 1.35 | 0.656 |
Tumor size, cm | 2.40 ± 1.82 | 2.56 ± 1.79 | 0.312 |
Tumor stage | 0.052 | ||
I | 271 (33.9%) | 44 (25.7%) | |
II | 314 (39.3%) | 75 (43.9%) | |
≥III | 213 (26.8%) | 52 (30.4%) | |
Smoker | 36 (4.5%) | 13 (7.6%) | 0.094 |
Drinker | 22 (2.8%) | 6 (3.5%) | 0.593 |
Breast cancer history | 99 (12.4%) | 32 (18.7%) | 0.060 |
Clinical characteristics | |||
Surgery type | 0.572 | ||
BCS | 75 (9.4%) | 19 (11.1%) | |
MRM | 218 (27.3%) | 46 (26.9%) | |
Mastectomy with reconstruction | 505 (63.3%) | 106 (62.0%) | |
ASA score | 2.04 ± 0.40 | 2.06 ± 0.35 | 0.399 |
Chemotherapy | 565 (70.8%) | 129 (75.4%) | 0.237 |
Radiotherapy | 464 (58.1%) | 96 (56.1%) | 0.581 |
Hormonal therapy | 480 (60.2%) | 94 (55.0%) | 0.186 |
Surgeon volume | |||
Low | 263 (33.0%) | 56 (32.8%) | 0.897 |
Medium | 267 (33.4%) | 57 (33.3%) | |
High | 268 (33.6%) | 58 (33.9%) | |
Hospital volume | |||
Low | 262 (32.8%) | 57 (33.3%) | 0.796 |
Medium | 266 (33.3%) | 57 (33.3%) | |
High | 270 (33.8%) | 57 (33.4%) | |
Quality of care within 10 years | |||
Readmission in 30 days | 203 (25.4%) | 47 (27.6%) | 0.551 |
Recurrence | 147 (18.4%) | 43 (25.3%) | 0.057 |
Postoperative reconstruction | 92 (11.5%) | 20 (11.8%) | 0.911 |
Preoperative quality of life | |||
Preoperative SF36 PCS score | 56.19 ± 7.42 | 55.18 ± 7.86 | 0.112 |
Preoperative SF36 MCS score | 41.16 ± 18.17 | 40.03 ± 19.46 | 0.468 |
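
The p values in the table above compare the training and testing datasets, but the test used for categorical variables is not stated here. The sketch below assumes a Pearson chi-square test without continuity correction on a 2 × 2 table and applies it to the smoker row (36 of 798 vs. 13 of 171). The function name is hypothetical, and for one degree of freedom the p value is computed as erfc(√(χ²/2)).

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, p_value) with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [
        row1 * col1 / n,
        row1 * col2 / n,
        row2 * col1 / n,
        row2 * col2 / n,
    ]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with df = 1.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Smoker counts from the table: training 36 of 798, testing 13 of 171.
stat, p_value = chi_square_2x2(36, 798 - 36, 13, 171 - 13)
print(f"chi-square = {stat:.2f}, p = {p_value:.3f}")
```

Under this assumption the smoker row gives a p value close to the tabulated 0.094, which suggests, but does not confirm, that an uncorrected chi-square test was used for the categorical rows.
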
Models | Sensitivity | Specificity | PPV | NPV | Accuracy | AUROC |
---|---|---|---|---|---|---|
Training dataset (n = 798) | ||||||
ANN | 95.89 (94.68–97.09) | 99.54 (99.43–99.66) | 97.90 (96.04–99.75) | 99.08 (98.90–99.26) | 98.87 (98.12–99.63) | 97.62 (96.87–98.37) |
KNN | 75.00 (71.67–78.34) | 94.97 (92.58–97.36) | 78.94 (75.12–82.76) | 93.78 (92.67–94.89) | 90.95 (89.47–92.44) | 85.00 (83.78–86.23) |
SVM | 81.29 (80.04–82.57) | 96.61 (95.49–97.73) | 85.80 (83.78–87.82) | 95.35 (94.22–96.48) | 95.53 (94.87–96.19) | 88.90 (86.76–91.04) |
NBC | 100.00 (99.91–100.00) | 0.00 (0.00–0.00) | 79.88 (76.47–83.29) | 0.00 (0.00–0.00) | 79.88 (76.23–83.53) | 50.00 (42.65–57.36) |
COX | 34.93 (24.91–44.99) | 0.00 (0.00–0.00) | 7.30 (4.37–10.24) | 0.00 (0.00–0.00) | 6.42 (5.78–7.06) | 17.50 (12.14–22.86) |
p value | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 |
Testing dataset (n = 171) | ||||||
ANN | 95.35 (94.47–96.23) | 100.00 (99.94–100.00) | 100.00 (99.97–100.00) | 98.45 (97.87–99.03) | 98.82 (97.67–99.96) | 99.81 (99.43–99.99) |
KNN | 46.15 (36.21–56.09) | 76.66 (73.69–79.63) | 46.15 (35.98–56.32) | 76.67 (72.41–80.93) | 67.44 (55.67–79.21) | 61.40 (47.93–74.87) |
SVM | 70.37 (65.45–75.29) | 94.35 (93.35–95.37) | 75.50 (71.41–79.59) | 93.13 (91.47–94.79) | 89.79 (87.77–91.80) | 82.40 (80.34–84.46) |
NBC | 100.00 (99.91–100.00) | 0.00 (0.00–0.00) | 19.01 (10.98–27.04) | 0.00 (0.00–0.00) | 19.01 (10.23–27.79) | 50.00 (40.67–59.34) |
COX | 20.93 (8.79–33.07) | 0.00 (0.00–0.00) | 6.62 (3.78–9.46) | 0.00 (0.00–0.00) | 5.29 (3.02–7.47) | 10.50 (4.71–16.29) |
p value | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 |
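
The five threshold-based indices in the table above all follow from the four cells of a confusion matrix. The sketch below uses hypothetical counts rather than study data. Returning 0.0 when a denominator is zero is an assumption made here to mirror how the table prints such cells.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # recurrence predicted, recurrence observed
    fp: int  # recurrence predicted, no recurrence observed
    tn: int  # no recurrence predicted, none observed
    fn: int  # no recurrence predicted, recurrence observed


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(c):
    """Compute sensitivity, specificity, PPV, NPV, and accuracy as fractions."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": safe_ratio(c.tp, c.tp + c.fn),
        "specificity": safe_ratio(c.tn, c.tn + c.fp),
        "ppv": safe_ratio(c.tp, c.tp + c.fp),
        "npv": safe_ratio(c.tn, c.tn + c.fn),
        "accuracy": safe_ratio(c.tp + c.tn, total),
    }


# Hypothetical case: every one of 100 patients is labelled "recurrence",
# and 20 of them actually recur.
all_positive = ConfusionCounts(tp=20, fp=80, tn=0, fn=0)
metrics = classification_metrics(all_positive)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

A classifier that labels every patient as a recurrence has 100% sensitivity, 0% specificity, and a PPV and accuracy both equal to the recurrence prevalence. The NBC rows show this same pattern (100.00, 0.00, and matching PPV and accuracy), although the table alone does not establish that this is what happened.
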

Rank | 1st | 2nd | 3rd |
---|---|---|---|
Variable | Surgeon volume | Hospital volume | Tumor stage |
VSR (95% CI) | 14.56 (12.33–16.79) | 14.23 (11.34–17.12) | 11.09 (7.98–14.21) |
Models | Sensitivity | Specificity | PPV | NPV | Accuracy | AUROC |
---|---|---|---|---|---|---|
ANN | 88.90 (86.69–91.11) | 95.52 (94.87–96.17) | 84.21 (82.67–85.75) | 96.97 (95.57–98.38) | 94.12 (93.01–95.23) | 97.62 (96.83–98.41) |
KNN | 46.15 (35.14–57.16) | 76.66 (71.48–81.84) | 46.15 (35.24–57.06) | 76.67 (65.69–87.65) | 67.44 (61.12–73.76) | 61.40 (55.41–67.39) |
SVM | 70.37 (65.67–75.07) | 94.35 (93.24–95.46) | 75.50 (70.49–80.51) | 93.13 (92.01–94.25) | 89.79 (88.12–91.46) | 82.40 (81.12–83.68) |
NBC | 100.00 (99.94–100.00) | 0.00 (0.00–0.00) | 19.01 (9.98–28.04) | 0.00 (0.00–0.00) | 19.01 (9.57–28.45) | 50.00 (39.68–60.32) |
COX | 20.93 (7.93–33.95) | 0.60 (0.00–1.01) | 6.62 (4.32–8.92) | 0.23 (0.10–0.36) | 5.29 (3.37–7.21) | 10.50 (4.78–16.22) |
p value | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Lou, S.-J.; Hou, M.-F.; Chang, H.-T.; Chiu, C.-C.; Lee, H.-H.; Yeh, S.-C.J.; Shi, H.-Y. Machine Learning Algorithms to Predict Recurrence within 10 Years after Breast Cancer Surgery: A Prospective Cohort Study. Cancers 2020, 12, 3817. https://doi.org/10.3390/cancers12123817