Article

Longitudinal Risk Analysis of Second Primary Cancer after Curative Treatment in Patients with Rectal Cancer

by Jiun-Yi Hsia 1,2,†, Chi-Chang Chang 3,4,†, Chung-Feng Liu 5, Chia-Lin Chou 6,7,* and Ching-Chieh Yang 8,9,10,*

1 Division of Thoracic Surgery, Department of Surgery, Chung Shan Medical University Hospital, Taichung 402367, Taiwan
2 School of Medicine, Chung Shan Medical University, Taichung 40201, Taiwan
3 School of Medical Informatics, Chung Shan Medical University, IT Office, Chung Shan Medical University Hospital, Taichung 40201, Taiwan
4 Department of Information Management, Ming Chuan University, Taoyuan 33348, Taiwan
5 Department of Medical Research, Chi Mei Medical Center, Tainan 710402, Taiwan
6 Division of Colon & Rectal Surgery, Department of Surgery, Chi Mei Medical Center, Tainan 710402, Taiwan
7 Department of Medical Laboratory Science and Biotechnology, Chung Hwa University of Medical Technology, Tainan 71703, Taiwan
8 Department of Radiation Oncology, Chi Mei Medical Center, Tainan 71004, Taiwan
9 Department of Pharmacy, Chia-Nan University of Pharmacy and Science, Tainan 717301, Taiwan
10 School of Medicine, College of Medicine, National Sun Yat-sen University, Kaohsiung 80404, Taiwan
* Authors to whom correspondence should be addressed.
† These authors contributed equally to this work.
Diagnostics 2024, 14(13), 1461; https://doi.org/10.3390/diagnostics14131461
Submission received: 11 June 2024 / Accepted: 4 July 2024 / Published: 8 July 2024
(This article belongs to the Special Issue Advances in the Diagnosis of Gastrointestinal Diseases—2nd Edition)

Abstract

Predicting and reducing the risk of second primary cancers (SPCs) in patients with rectal cancer remains an active and challenging field of clinical research. Identifying predictive risk factors for SPCs will help guide more personalized treatment strategies. In this study, we propose that experience data be used as evidence to support patient-oriented decision-making. The proposed model consists of two main components: a pipeline for extraction and classification and a clinical risk assessment. The study includes data from 4402 patients, including 395 SPC patients, collected from three cancer registry databases at three medical centers; based on literature reviews and discussion with clinical experts, 10 predictive variables were considered risk factors for SPCs. The proposed extraction and classification pipelines ranked age at diagnosis, chemotherapy, smoking behavior, combined stage group, and sex as the most important variables, consistent with previous studies. The C5.0 method had the highest predicted AUC (84.88%). In addition, the proposed model was associated with a classification pipeline that showed an acceptable testing accuracy of 80.85%, a recall of 79.97%, a specificity of 88.12%, a precision of 85.79%, and an F1 score of 79.88%. Our results indicate that chemotherapy is the most important prognostic risk factor for SPCs in rectal cancer survivors. Furthermore, our decision tree for clinical risk assessment illuminates the possibility of assessing the effect of combinations of these risk factors. The proposed model may support the evaluation and longitudinal follow-up needed for personalized treatment of rectal cancer survivors in the future.

1. Introduction

Rectal cancer is among the most common malignancies, accounting for about one-third of all colorectal cancer cases worldwide [1]. A multidisciplinary approach to rectal cancer treatment includes preoperative therapy followed by total mesorectal excision and adjuvant chemotherapy [2]. The development of new anticancer regimens, such as monoclonal antibodies and immune checkpoint inhibitors, has significantly decreased the mortality rate of rectal cancer [3,4]. Because rectal cancer survivors now live longer, second primary cancers (SPCs) are receiving increasing attention in clinical practice [5].
Phipps et al. found a higher rate of SPCs among rectal cancer survivors compared to the general population. SPCs in rectal cancer survivors arise from various lifestyle, genetic, environmental, and treatment factors and are associated with the use of alcohol, tobacco, betel nuts, and anticancer drugs [6,7,8]. Several studies have also examined the effect of radiotherapy or chemotherapy on SPC risk, with inconsistent results [9,10]. Currently, no risk factors have been established that can predict the occurrence of SPCs, and no tools have been incorporated into clinical practice to improve the prediction of SPCs in patients with rectal cancer. The objective of this study was to determine the risk factors for SPCs and perform a clinical assessment of SPC risk in patients with rectal cancer to ultimately contribute to clinical treatment.

2. Materials and Methods

2.1. Ethics Statement

Chi-Mei Medical Center Institutional Review Board (CMFHR11006-006) approved this study in accordance with the Declaration of Helsinki. Because no personally identifiable information was used, the IRB waived the need for individual informed consent. In addition, this study had a noninterventional retrospective design, with no human subjects, and all data were analyzed anonymously.

2.2. Study Population

We included 4402 patients diagnosed with rectal cancer across multiple institutions between 1 January 2009 and 31 December 2016. The follow-up deadline was 31 December 2022 for survivors. All cases in this study were staged according to the 7th edition of the American Joint Committee on Cancer (AJCC) staging manual, and cases were selected considering second primary cancer [11,12,13,14,15,16].
All data were collected based on the following criteria: (1) cases with a primary site of the rectosigmoid junction (code C19.9) or the rectum (code C20.9) according to the International Classification of Diseases for Oncology, 3rd edition; (2) patients treated in the hospital who met the previous criterion. The exclusion criteria were (1) no clear coding on follow-up or curative treatment; (2) never being disease-free; (3) previous cancer history or metastatic disease or missing coding; and (4) SPCs diagnosed within 6 months, which were excluded from this study as we sought to investigate the prevention of recurrence and metastasis and to observe the effect of treatment over time. Furthermore, the NCCN’s latest guidelines recommend a 6-month first surveillance examination after the removal of large adenomas or sessile serrated polyps with unfavorable features or those that have been sporadically removed [17].

2.3. The Evidence-Based Clinical Decision-Making Model

Three cancer registry databases were used, and coding data collected from three medical centers were input into the model as case data. Then, 10 important risk factors were considered, namely (1) sex, (2) age at diagnosis, (3) tumor size, (4) combined stage group, (5) radiotherapy, (6) chemotherapy, (7) body mass index (BMI) (kg/m²), (8) smoking behavior, (9) drinking behavior, and (10) carcinoembryonic antigen (CEA) lab value. To ensure the robustness and accuracy of our predictive models, we implemented a comprehensive validation strategy. Initially, the original dataset was divided into training and testing datasets at a ratio of 7:3. This initial split was used to perform a preliminary assessment of the model’s performance, providing a baseline indication of its effectiveness on new, unseen data. During this initial validation phase, several metrics such as accuracy, sensitivity, specificity, and the area under the receiver operating characteristic curve (AUC) were calculated to evaluate the model’s prediction ability. These metrics helped identify any potential overfitting at an early stage and guided the further tuning of model parameters. Furthermore, to enhance the generalizability of our model, we applied a 10-fold cross-validation technique within the training dataset. During training, the training dataset was randomly divided into 10 subsets of equal size, with each subset serving once as the validation dataset. The format of the test dataset was the same as that of the training dataset.
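The validation scheme described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `train_test_split` and `k_fold` and the fixed seed are hypothetical, the split is a plain random shuffle (the text does not say whether stratification by SPC status was used), and row indices stand in for the registry records.

```python
import random

def train_test_split(n_samples, train_fraction=0.7, seed=0):
    """Shuffle row indices and cut them into training and testing parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]

def k_fold(indices, k=10, seed=0):
    """Yield (fit, validation) index lists; each fold is the validation set once."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]  # fold sizes differ by at most one
    for i in range(k):
        validation = folds[i]
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, validation

# 4402 is the cohort size reported in the text; everything else is illustrative.
train_idx, test_idx = train_test_split(4402)
splits = list(k_fold(train_idx))
print(len(train_idx), len(test_idx), len(splits))  # 3081 1321 10
```

Because 3081 is not divisible by 10, the folds here are only approximately equal in size (one fold of 309 and nine of 308).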
In the extraction and classification pipelines, we used two types of extraction process: machine learning techniques and statistical testing. In the machine learning approach, all 10 risk factors were directly used as predictors for C5.0, random forest (RF), C4.5, classification and regression tree (CART), support vector machine (SVM), logistic regression (LGR), and linear discriminant analysis (LDA) to construct seven classification pipelines. As described in previous studies, support vector machines separate classes using a decision boundary called a hyperplane, which is placed to maximize the distance between the hyperplane and the nearest instances of each class [18,19]. Linear discriminant analysis is a supervised learning algorithm that also extracts features and compresses data for dimensionality reduction and classification [20,21]. Logistic regression is the most commonly used approach in epidemiology and medicine. As a generalized linear model, it explicitly models the relationship between the explanatory variables X and the response variable Y [22,23]. Based on the concept of information entropy, C4.5 decision trees select the splitting attribute at each node according to its information gain ratio [24]. Using a greedy approach, decision trees are built in a top-down, recursive, divide-and-conquer manner. In random forests, many classification trees are grown on random subsets of the data and predictor variables, and their results are aggregated to produce the final classification [25]. Using a recursive process, the C5.0 decision tree generates a tree based on the provided information using a top-down approach [26]. For splitting and estimation, the Gini index was used to construct the classification and regression tree. The resulting binary tree was built by splitting records according to a single input field at each node [27].
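To make the two splitting criteria mentioned above concrete, the following sketch computes information gain and gain ratio (the entropy-based criteria associated with C4.5) and the Gini decrease (associated with CART) for one binary predictor. It is a minimal illustration, not the packages the study used; the eight toy records and all function names are invented for the example.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def partition(values, labels):
    """Group the labels by the value of the candidate splitting attribute."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    return list(groups.values())

def information_gain(values, labels):
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in partition(values, labels))
    return entropy(labels) - remainder

def gain_ratio(values, labels):
    """Information gain normalised by the entropy of the split itself."""
    split_info = entropy(values)
    return information_gain(values, labels) / split_info if split_info > 0 else 0.0

def gini_decrease(values, labels):
    n = len(labels)
    weighted = sum(len(g) / n * gini(g) for g in partition(values, labels))
    return gini(labels) - weighted

# Invented toy data: 1 = received chemotherapy / developed an SPC, 0 = did not.
chemo = [1, 1, 1, 1, 0, 0, 0, 0]
spc = [1, 1, 1, 0, 0, 0, 0, 1]
print(round(information_gain(chemo, spc), 4), round(gini_decrease(chemo, spc), 4))
```

A tree learner would evaluate such a score for every candidate attribute at a node and split on the best one, then repeat within each branch.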
Each classifier for SPCs was evaluated based on the area under the curve (AUC) of the receiver operating characteristic (ROC) curve, which can also be used to determine how well a risk prediction model differentiates between patients with and without a certain condition. In general, the better the model discriminates, the closer the ROC curve approaches the upper left corner of the plot. In this study, there were 10 independent variables, generating 216 input combinations, each of which yielded a predicted value, and each threshold value on the ROC curve corresponded to a pair of sensitivity and 1 − specificity values. For the risk factor rankings, GainRatio, InfoGain, RF, C5.0, and MARS classifications were selected. The ranking of each risk factor was determined by calculating the average ranking of the above methods. The final model performance was calculated by averaging the 10 classification accuracy metric results. These classifiers were modeled using “rpart”, “RWeka”, “MASS”, “elmNN”, “e1071”, “lgr”, and “randomForest”, respectively, in the R environment, version 4.2.1. InfoGain and GainRatio were used with the Waikato Environment for Knowledge Analysis (WEKA), version 3.8. In the statistical testing methods, the independent variables included sex, age at diagnosis, tumor size, combined stage group, radiotherapy, chemotherapy, body mass index (BMI) (kg/m²), smoking behavior, drinking behavior, and carcinoembryonic antigen (CEA) lab value. A t-test was used to compare patients with and without SPCs. We employed the chi-square test and odds ratio to assess the associations between the dependent variable and all independent variables.
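As a rough illustration of how an ROC curve and its AUC follow from predicted values, the sketch below sweeps every distinct score as a threshold, records the (1 − specificity, sensitivity) pair, and integrates with the trapezoidal rule. The scores and labels are invented, and the function names are hypothetical; the study itself used R packages for this step.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per distinct threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        # A case is predicted positive when its score reaches the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Trapezoidal area under ROC points ordered by false-positive rate."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Invented predicted risks and outcomes (1 = SPC, 0 = no SPC).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 1, 0]
print(auc(roc_points(scores, labels)))  # 0.75
```

An AUC of 0.75 here means that a randomly chosen SPC case receives a higher score than a randomly chosen non-SPC case in 12 of the 16 possible pairs.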
In the clinical risk assessment, different classification models were used to identify the predictive factors for conditions of interest, namely support vector machines, linear discriminant analyses, logistic regression, C4.5 decision trees, classification and regression trees, random forests, and C5.0 decision trees. All subjects were divided into 10 subgroups, from the root to the leaf node, through different branches. By using these models, clinicians can identify the combination of risk factors for the condition.

3. Results

The descriptive characteristics of the study cohort are shown in Table 1. Of the 4402 patients in this study, 395 subsequently developed SPCs (males, 69.9%; females, 30.1%). The most frequent SPCs were colorectal (n = 231; 58.5%), followed by lung (n = 42; 10.6%), others (n = 20; 5.1%), urinary system (n = 20; 5.1%), liver (n = 13; 3.3%), breast (n = 12; 3.0%), and prostate (n = 11; 2.8%). Our statistical analysis (see Table 1) indicated that sex, age at diagnosis, combined stage group, radiotherapy, chemotherapy, BMI, and smoking/drinking behavior differed significantly between rectal cancer patients who developed SPCs and those who did not.
Table 2 presents the importance rankings of the 10 predictor variables derived from the GainRatio, InfoGain, RF, C5.0, and MARS models. The table shows how each method ranked the predictor variables. In addition, it shows the application of the Borda count procedure to combine these rankings and create a global ranking. In particular, chemotherapy appears to be a major risk factor among the treatments associated with SPCs.
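The Borda count step can be sketched as follows. Each method contributes a best-first ranking, an item earns more points the higher it is ranked, and the totals give the global order (equivalent to ordering by average rank). The three rankings below are invented for illustration and are not the values in Table 2; `borda_ranking` is a hypothetical helper name.

```python
def borda_ranking(rankings):
    """Combine several best-first rankings of the same items into one global order."""
    items = rankings[0]
    n = len(items)
    scores = {item: 0 for item in items}
    for ranking in rankings:
        for position, item in enumerate(ranking):
            scores[item] += n - 1 - position  # best gets n-1 points, worst gets 0
    # Higher total first; ties are broken alphabetically for a deterministic result.
    order = sorted(scores, key=lambda item: (-scores[item], item))
    return order, scores

# Invented rankings from three hypothetical methods (best first).
rankings = [
    ["chemotherapy", "age", "smoking", "sex"],
    ["age", "chemotherapy", "sex", "smoking"],
    ["chemotherapy", "smoking", "age", "sex"],
]
order, scores = borda_ranking(rankings)
print(order)
```

With these made-up inputs, the totals are 8, 6, 3, and 1 points, so the combined order matches the order in which the variables are listed in the first ranking.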
Considering the AUC values, C5.0 showed more stable AUC performance (0.8488) than the other classifiers. To further analyze SPC prediction in relation to chemotherapy, we chose C5.0 as the basis for further analysis (see Figure 1, Table 3). To further analyze the risks associated with the occurrence of SPCs in rectal cancer patients after chemotherapy, we conducted demographic analyses of patients who received chemotherapy, as shown in Table 4. Among patients receiving chemotherapy, age at diagnosis (≥65 years; p = 0.004), smoking behavior (yes; p = 0.017), and drinking behavior (yes; p = 0.014) were associated with an increased risk of SPCs (see Table 4). Decision tree stratification based on C5.0 prioritized all independent variables to determine their branch status. Through different branches from the root node to the leaf node, all subjects were divided into 13 subgroups (see Figure 2). In the classified decision tree, drinking behavior was identified as the root node due to its strong influence on SPCs among rectal cancer patients. The relevant decision rules are explained as follows. The first rule was defined by drinking behavior (no), CEA lab value (≤050 ng/mL), sex (male), and age at diagnosis (<65 years), resulting in an accuracy of 57.0% across 149 samples. The fourth rule was defined by drinking behavior (no), CEA lab value (≤050 ng/mL), sex (female), and age at diagnosis (≥65 years), resulting in an accuracy of 65.1% across 114 samples. The sixth rule was defined by drinking behavior (no), CEA lab value (>50 ng/mL), age at diagnosis (≥65 years), and sex (male), yielding an accuracy of 59.5% across 84 samples.
The ninth rule was defined by drinking behavior (yes), behavior (yes), BMI (<24), sex (male), age at diagnosis (<65 years), and CEA lab value (>50 ng/mL), yielding an accuracy of 69.4% in 25 samples. The 10th rule was defined by drinking behavior (yes), BMI (<24), sex (male), age at diagnosis (≥65 years), and CEA lab value (50 ng/mL), resulting in a precision of 68.8% in 31 samples. The 12th rule was defined by drinking behavior (yes), BMI (<24), and sex (female), yielding an accuracy of 77.7% in seven samples. The 13th rule was defined by drinking behavior (yes) and BMI (≥24), yielding an accuracy of 68.0% in 153 samples. The rules related to the prediction models for SPCs in rectal cancer patients receiving chemotherapy are summarized in Table 5.
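Decision rules of this kind are simply conjunctions of conditions along one root-to-leaf path, so they can be written as predicates. The sketch below encodes only the two rules whose conditions are fully stated in the text without a CEA cutoff (drinking behavior yes with BMI ≥24, and drinking behavior yes with BMI <24 and female sex). The `Patient` class and `match_rule` function are hypothetical, and because the text does not state which class each leaf predicts, the function returns only the rule number.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Patient:
    drinks: bool   # drinking behavior (yes/no)
    bmi: float     # body mass index in kg/m^2
    sex: str       # "male" or "female"

def match_rule(p: Patient) -> Optional[int]:
    """Return the number of the matching rule, or None if neither rule applies."""
    if p.drinks and p.bmi >= 24:
        return 13  # drinking behavior (yes) and BMI (>=24)
    if p.drinks and p.bmi < 24 and p.sex == "female":
        return 12  # drinking behavior (yes), BMI (<24), and sex (female)
    return None    # the remaining rules depend on CEA and age and are not encoded

print(match_rule(Patient(drinks=True, bmi=26.0, sex="male")))  # 13
```

A full implementation would need the complete tree from Figure 2 and Table 5, including the CEA thresholds and the predicted class at each leaf.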

4. Discussion

As survival improves in patients with rectal cancer, SPCs become more likely to occur. In this study, we observed that SPCs occurred in 395 (9.0%) of the 4402 primary rectal cancer patients. Of the treatments used for primary rectal cancer, chemotherapy posed the highest risk for developing SPCs. Among patients receiving chemotherapy, an age of 65 years or older, smoking, and drinking behavior were strongly correlated with the development of SPCs. These findings provide important information for the effective prevention and surveillance of SPCs in rectal cancer survivors. This study reports several interesting findings. First, 9.9% of male and 6.4% of female survivors experienced SPCs during their follow-up. Zhang et al. reported a higher incidence of SPCs (males, 17.1%; females, 13.0%) in their colorectal study cohort [28]. Rectal cancer survivors, however, had an 8% higher rate of SPCs than the general population [29]. Our regression analysis indicated that sex, age at diagnosis, combined stage group, radiotherapy, chemotherapy, BMI, and smoking/drinking behavior were related to rectal cancer patients developing SPCs. It is therefore important to consider the characteristics of cancer survivors, since these characteristics may influence their health in the future.
Most cancer survivors are increasingly concerned about the identification of the factors that may increase their risk of developing SPCs. Compared to radiotherapy, the use of chemotherapy was a more significant risk factor for the SPCs investigated here (see Table 1). As we know, many effective chemotherapeutic agents have recently been developed for the management of recurrence or metastases in rectal cancer [30,31]. Thus, carcinogenesis caused by increased use of these chemotherapeutic agents should be investigated. Similarly, Hung et al. reported that chemotherapy was significantly associated with all types of SPCs during the follow-up period in some cancer survivors [32]. However, although this study used cancer registry databases in several hospitals, no information was found about regimens of chemotherapy. Thus, utilizing different databases that include chemotherapy regimens is still necessary to validate our findings.
The occurrence of SPCs is often recognized as a late adverse effect after cancer treatment. Since the risk of SPCs is not increased in the short term, long-term follow-up considering a latency period is necessary to observe this phenomenon. However, the risk pattern for SPCs has rarely been studied in depth; thus, this motivated us to perform the current analysis. According to our cancer registry database principles, our colorectal cancer patients treated with curative intent are routinely followed for 5 years. Our study materials complement existing data involving large populations and provide a more adequate duration of follow-up for assessing such low-frequency events. Although our results provide important insights into SPCs after rectal cancer treatments, with the increasing complexity of rectal cancer treatment, adding more information about modern techniques and drugs will be our next step.
Radiotherapy is a part of the current standard treatment for rectal cancer. Radiation for tumor control causes early and late toxicity, which is associated with the subsequent development of SPCs [33]. A study conducted by Rombouts et al. using data from the Netherlands Cancer Registry from 1989 to 2007 found that patients who underwent RT for previous pelvic cancer were at greater risk of rectal cancer (subhazard ratio, 1.72; 95% CI, 1.55–1.91) [34]. After primary pelvic radiotherapy, another systematic review and meta-analysis found a small increase in the incidence of second primary cancer. However, since the introduction of modern radiation techniques, which provide excellent preservation of normal tissue, studies have shown that, in some cases, radiotherapy does not increase the risk of SPCs and might even have a preventive effect [35,36]. Therefore, a reliable and accurate method such as machine learning, which can take into account complex interactions between multiple predictor variables, can help to resolve this important question.
Older age can lead to immunosenescence in cancer survivors, making it a critical prognostic factor for SPCs [37]. According to Zhang et al., SPC risk was determined by a combination of demographic factors, including age, race, and marital status [28]. Lifestyle factors such as smoking or alcohol use also increase the risk of SPCs in cancer patients, particularly with respect to SPCs of the head and neck, esophagus, lung, urinary bladder, and kidney. Smoking and alcohol use impair DNA damage repair in cells, and longer exposure increases the cancer risk. All this evidence supports our findings. Regarding the critical risk factors of second primary cancer in medical practice, our results are consistent with existing research [15], showing that the C5.0 classification agrees more closely with clinical interpretations than alternative classification methods.
Postcancer treatment surveillance is crucial to detecting second lesions and improving survival. Of the treatments used for primary rectal cancer, chemotherapy posed the highest risk for developing SPCs in our results. An age of over 65, smoking, and drinking behavior are independent risk factors for SPCs after chemotherapy. These findings may help develop effective prevention and surveillance programs for high-risk rectal cancer survivors in their follow-up. For example, enhancing clinical health education on smoking cessation for elderly rectal cancer patients is a recommended strategy. The government can also reduce the future occurrence of secondary cancers and subsequent treatment costs through smoking cessation policies.
This study has a few noteworthy limitations. First, data for our cohort were missing regarding dietary habits, comorbidities, and hereditary syndromes, which may significantly increase the risk of developing malignancies. Second, data regarding the type, length, and cycles of chemotherapeutic agents administered were not available. However, we focused our analysis on the patients who received chemotherapy to increase the credibility of this study. Including the above variables in the model would help to improve the prediction performance. Third, the risk of SPCs varies according to race and ethnicity. Taiwanese residents, who are mostly Asian, accounted for 99% of our study cohort. Thus, our prediction models still need to be verified in external populations, although internal validation showed good consistency. Finally, our study used multiple machine learning models without comprehensive clinical validation, which may lead to overfitting and overly optimistic performance estimates. To address this, collecting new and unseen data for further validation is crucial. Additionally, the absence of detailed analyses may limit the ability to gain further insights. For example, future research should consider applying multiple testing adjustments, such as the Benjamini–Hochberg procedure [38], to reduce inflated false-positive rates. Moreover, incorporating calibration analysis, such as Platt scaling [39], is essential for future implementations to adjust predicted probabilities to more closely reflect actual outcomes, thus enhancing the model’s predictive accuracy.
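As an illustration of the multiple testing adjustment suggested above, the following sketch applies the Benjamini–Hochberg step-up procedure to the three p-values reported in the Results for age at diagnosis, smoking behavior, and drinking behavior among patients receiving chemotherapy. Treating these three tests as the whole family is an assumption made only for the example; a real adjustment would include every test performed. The function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min  # enforces monotone adjusted values
    return adjusted

# p-values quoted in the text: age >= 65 years, smoking (yes), drinking (yes).
p_values = [0.004, 0.017, 0.014]
print([round(p, 3) for p in benjamini_hochberg(p_values)])  # [0.012, 0.017, 0.017]
```

Under this simplified three-test family, all adjusted values stay below 0.05, but the conclusion could change once the full set of comparisons is included.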

5. Conclusions

Although patients with rectal cancer are at a high risk of developing SPCs, current clinical guidelines do not include priority treatment strategies. This has resulted in significant changes in the quality of care provided to this population. As rectal cancer burdens continue to increase, it is important to evaluate current treatment strategy recommendations. An age of over 65, smoking, and drinking behavior are independent risk factors for SPCs after chemotherapy. These findings may help develop effective prevention and surveillance programs for high-risk rectal cancer survivors in their follow-up. This study aimed to perform a longitudinal diagnosis and prediction of SPCs among patients with rectal cancer. In addition to reassessing the risk factors for rectal cancer patients, the proposed model can also help to assess chemoradiotherapy response, particularly with the development of nonsurgical approaches such as “watch and wait”. We suggest that future research further explore the relationships among the risk factors identified in this study. This study also serves as the basis for further clinical validation and a reference for healthcare education for both doctors and patients in the future.

Author Contributions

J.-Y.H. and C.-C.C.: Formal analysis, data curation, writing—review and editing, writing—original draft preparation, funding acquisition. C.-F.L.: formal analysis, data curation. C.-L.C.: writing—review and editing, supervision. C.-C.Y., writing—review and editing, supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by Chung Shan Medical University and Chi Mei Medical Center (CMCSMU11004) and funded by a research grant from Chung Shan Medical University Hospital (CSH-2023-C-026).

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and approved by the Chi-Mei Medical Center Institutional Review Board (CMFHR11006-006, date of approval 12 July 2021).

Informed Consent Statement

This study had a non-interventional retrospective design; no human subjects or personally identifying information were used, and all data were analyzed anonymously. Thus, informed consent was waived by our IRB.

Data Availability Statement

Clinicopathological datasets are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors have no conflicts of interest. The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; or decision to submit the manuscript for publication.

References

  1. Siegel, R.L.; Miller, K.D.; Fuchs, H.E.; Jemal, A. Cancer statistics, 2022. CA Cancer J. Clin. 2022, 72, 7–33.
  2. Gollins, S.; Sebag-Montefiore, D. Neoadjuvant Treatment Strategies for Locally Advanced Rectal Cancer. Clin. Oncol. 2016, 28, 146–151.
  3. Salazar, R.; Capdevila, J.; Laquente, B.; Manzano, J.L.; Pericay, C.; Villacampa, M.M.; López, C.; Losa, F.; Safont, M.J.; Gómez, A.; et al. A randomized phase II study of capecitabine-based chemoradiation with or without bevacizumab in resectable locally advanced rectal cancer: Clinical and biological features. BMC Cancer 2015, 15, 60.
  4. Bando, H.; Tsukada, Y.; Ito, M.; Yoshino, T. Novel Immunological Approaches in the Treatment of Locally Advanced Rectal Cancer. Clin. Color. Cancer 2022, 21, 3–9.
  5. Liu, L.; Lemmens, V.E.; De Hingh, I.H.; de Vries, E.; Roukema, J.A.M.; van Leerdam, M.E.; Coebergh, J.W.M.; Soerjomataram, I.M. Second primary cancers in subsites of colon and rectum in patients with previous colorectal cancer. Dis. Colon Rectum 2013, 56, 158–168.
  6. Yang, L.; Xiong, Z.; Xie, Q.K.; He, W.; Liu, S.; Kong, P.; Jiang, C.; Xia, X.; Xia, L. Second primary colorectal cancer after the initial primary colorectal cancer. BMC Cancer 2018, 18, 931.
  7. Cui, Y.; Han, B.; Zhang, H.; Liu, H.; Zhang, F.; Niu, R. Identification of Metabolic-Associated Genes for the Prediction of Colon and Rectal Adenocarcinoma. Onco Targets Ther. 2021, 14, 2259–2277.
  8. Robertson, D.; Ng, S.K.; Baade, P.D.; Lam, A.K. Risk of extracolonic second primary cancers following a primary colorectal cancer: A systematic review and meta-analysis. Int. J. Color. Dis. 2022, 37, 541–551.
  9. Green, R.J.; Metlay, J.P.; Propert, K.; Catalano, P.J.; Macdonald, J.S.; Mayer, R.J.; Haller, D.G. Surveillance for second primary colorectal cancer after adjuvant chemotherapy: An analysis of Intergroup 0089. Ann. Intern. Med. 2002, 136, 261–269.
  10. Kendal, W.S.; Nicholas, G. A population-based analysis of second primary cancers after irradiation for rectal cancer. Am. J. Clin. Oncol. 2007, 30, 333–339.
  11. Lee, H.C.; Lin, T.C.; Chang, C.C.; Lu, Y.C.A.; Lee, C.M.; Purevdorj, B. Clinical Risk Factor Prediction for Second Primary Skin Cancer: A Hospital-Based Cancer Registry Study. Appl. Sci. 2022, 12, 12520.
  12. Chang, C.C.; Huang, T.H.; Shueng, P.W.; Chen, S.H.; Chen, C.C.; Lu, C.J.; Tseng, Y.J. Developing a Stacked Ensemble-Based Classification Scheme to Predict Second Primary Cancers in Head and Neck Cancer Survivors. Int. J. Environ. Res. Public Health 2021, 18, 12499.
  13. Liao, H.H.; Chang, C.C.; Wang, Y.X.; Cheewakriangkrai, C. Predicting the risk factors of second primary cancer in patients with hepatocellular carcinoma. Stud. Health Technol. Inform. 2022, 289, 93–96.
  14. Chang, C.C.; Chen, C.C.; Cheewakriangkrai, C.; Chen, Y.C.; Yang, S.F. Risk Prediction of Second Primary Endometrial Cancer in Obese Women: A Hospital-Based Cancer Registry Study. Int. J. Environ. Res. Public Health 2021, 18, 8997.
  15. Ting, W.C.; Chang, H.R.; Chang, C.C.; Lu, C.J. Developing a Novel Machine Learning-Based Classification Scheme for Predicting SPCs in Colorectal Cancer Survivors. Appl. Sci. 2020, 10, 1355.
  16. Chang, C.C.; Chen, S.H. Developing a Novel Machine Learning-Based Classification Scheme for Predicting SPCs in Breast Cancer Survivors. Front. Genet. 2019, 10, 848.
  17. Benson, A.B.; Venook, A.P.; Al-Hawary, M.M.; Arain, M.A.; Chen, Y.-J.; Ciombor, K.K.; Cohen, S.; Cooper, H.S.; Deming, D.; Farkas, L.; et al. Colon Cancer, Version 2.2021, NCCN Clinical Practice Guidelines in Oncology. J. Natl. Compr. Cancer Netw. 2021, 19, 329–359.
  18. Ding, C.H.; Dubchak, I. Multi-class protein fold recognition using support vector machines and neural networks. Bioinformatics 2001, 17, 349–358.
  19. Sidey-Gibbons, J.A.M.; Sidey-Gibbons, C.J. Machine learning in medicine: A practical introduction. BMC Med. Res. Methodol. 2019, 19, 64.
  20. Pham, T.H.; Vicnesh, J.; Wei, J.K.E.; Oh, S.L.; Arunkumar, N.; Abdulhay, E.W.; Ciaccio, E.J.; Acharya, U.R. Autism Spectrum Disorder Diagnostic System Using HOS Bispectrum with EEG Signals. Int. J. Environ. Res. Public Health 2020, 17, 971.
  21. Witjes, H.; Rijpkema, M.; van der Graaf, M.; Melssen, W.; Heerschap, A.; Buydens, L. Multispectral magnetic resonance image analysis using principal component and linear discriminant analysis. J. Magn. Reson. Imaging 2003, 17, 261–269.
  22. Park, H.A. An introduction to logistic regression: From basic concepts to interpretation with particular attention to nursing domain. J. Korean Acad. Nurs. 2013, 43, 154–164.
  23. Levy, J.J.; O’Malley, A.J. Don’t dismiss logistic regression: The case for sensible extraction of interactions in the era of machine learning. BMC Med. Res. Methodol. 2020, 20, 171.
  24. Dima, S.; Wang, K.J.; Chen, K.H.; Huang, Y.-K.; Chang, W.-J.; Lee, S.-Y.; Teng, N.-C. Decision Tree Approach to the Impact of Parents’ Oral Health on Dental Caries Experience in Children: A Cross-Sectional Study. Int. J. Environ. Res. Public Health 2018, 15, 692.
  25. Ai, D.; Pan, H.; Han, R.; Li, X.; Liu, G.; Xia, L.C. Using Decision Tree Aggregation with Random Forest Model to Identify Gut Microbes Associated with Colorectal Cancer. Genes 2019, 10, 112.
  26. Peng, J.; Chen, C.; Zhou, M.; Xie, X.; Zhou, Y.; Luo, C.H. A Machine-learning Approach to Forecast Aggravation Risk in Patients with Acute Exacerbation of Chronic Obstructive Pulmonary Disease with Clinical Indicators. Sci. Rep. 2020, 10, 3118.
  27. Yang, C.C.; Su, Y.C.; Lin, Y.W.; Huang, C.I.; Lee, C.C. Differential impact of age on survival in head and neck cancer according to classic Cox regression and decision tree analysis. Clin. Otolaryngol. 2019, 44, 244–253.
  28. Zhang, B.; Guo, K.; Zheng, X.; Sun, L.; Shen, M.; Ruan, S. Risk of Second Primary Malignancies in Colon Cancer Patients Treated With Colectomy. Front. Oncol. 2020, 10, 1154.
  29. Phipps, A.I.; Chan, A.T.; Ogino, S. Anatomic subsite of primary colorectal cancer and subsequent risk and distribution of second cancers. Cancer 2013, 119, 3140–3147.
  30. Papaccio, F.; Rosello, S.; Huerta, M.; Gambardella, V.; Tarazona, N.; Fleitas, T.; Roda, D.; Cervantes, A. Neoadjuvant Chemotherapy in Locally Advanced Rectal Cancer. Cancers 2020, 12, 3611.
  31. Kim, J.K.; Marco, M.R.; Roxburgh, C.S.D.; Chen, C.-T.; Cercek, A.; Strombom, P.; Temple, L.K.F.; Nash, G.M.; Guillem, J.G.; Paty, P.B.; et al. Survival After Induction Chemotherapy and Chemoradiation Versus Chemoradiation and Adjuvant Chemotherapy for Locally Advanced Rectal Cancer. Oncologist 2022, 27, 380–388. [Google Scholar] [CrossRef] [PubMed]
  32. Hung, M.H.; Liu, C.J.; Teng, C.J.; Hu, Y.-W.; Yeh, C.-M.; Chen, S.-C.; Chien, S.-H.; Hung, Y.-P.; Shen, C.-C.; Chen, T.-J.; et al. Risk of Second Non-Breast Primary Cancer in Male and Female Breast Cancer Patients: A Population-Based Cohort Study. PLoS ONE 2016, 11, e0148597. [Google Scholar] [CrossRef] [PubMed]
  33. Rombouts, A.J.M.; Hugen, N.; Elferink, M.A.G.; Feuth, T.; Poortmans, P.M.P.; Nagtegaal, I.D.; de Wilt, J.H.W. Incidence of second tumors after treatment with or without radiation for rectal cancer. Ann. Oncol. 2017, 28, 535–540. [Google Scholar] [CrossRef] [PubMed]
  34. Rombouts, A.J.M.; Hugen, N.; Elferink, M.A.G.; Poortmans, P.M.P.; Nagtegaal, I.D.; de Wilt, J.H.W. Increased risk for second primary rectal cancer after pelvic radiation therapy. Eur. J. Cancer 2020, 124, 142–151. [Google Scholar] [CrossRef] [PubMed]
  35. Wiltink, L.M.; Nout, R.A.; Fiocco, M.; Kranenbarg, E.M.-K.; Jürgenliemk-Schulz, I.M.; Jobsen, J.J.; Nagtegaal, I.D.; Rutten, H.J.; van de Velde, C.J.; Creutzberg, C.L.; et al. No Increased Risk of Second Cancer After Radiotherapy in Patients Treated for Rectal or Endometrial Cancer in the Randomized TME, PORTEC-1, and PORTEC-2 Trials. J. Clin. Oncol. 2015, 33, 1640–1646. [Google Scholar] [CrossRef] [PubMed]
  36. Warschkow, R.; Guller, U.; Cerny, T.; Schmied, B.M.; Plasswilm, L.; Putora, P.M. Secondary malignancies after rectal cancer resection with and without radiation therapy: A propensity-adjusted, population-based SEER analysis. Radiother. Oncol. 2017, 123, 139–146. [Google Scholar] [CrossRef] [PubMed]
  37. Pawelec, G. Immunosenescence and cancer. Biogerontology 2017, 18, 717–721. [Google Scholar] [CrossRef]
  38. Benjamini, Y.; Hochberg, Y. Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B 1995, 57, 289–300. [Google Scholar] [CrossRef]
  39. Niculescu-Mizil, A.; Caruana, R. Predicting good probabilities with supervised learning. In Proceedings of the 22nd International Conference on Machine Learning, ICML ’05, Bonn, Germany, 7–11 August 2005; pp. 625–632. [Google Scholar]
Figure 1. Receiver operating characteristic curves of the seven methods, with AUCs, for rectal cancer patients receiving chemotherapy.
Figure 2. C5.0 classification tree depicting second primary cancers (SPCs) in patients with primary rectal cancer treated with chemotherapy. CEA Lab Value: carcinoembryonic antigen lab value; ABNL: abnormal; NL: normal; BMI: body mass index; ACC: accuracy.
Table 1. Subject demographics of primary rectal cancer patients.

| Risk Factors | With SPCs (%) | Without SPCs (%) | p-Value | χ² | Odds Ratio [95% CI] |
|---|---|---|---|---|---|
| n (%) | 395 (9.0%) | 4007 (91.0%) | | | |
| Sex | 395 | 4007 | 0.004 * | 8.477 | |
| Male | 276 (69.9%) | 2503 (62.5%) | | | 1.394 * [1.114–1.744] |
| Female | 119 (30.1%) | 1504 (37.5%) | | | 1.00 |
| Age at Diagnosis | 395 | 4007 | <0.001 *** | 25.530 | |
| <65 years | 174 (44.1%) | 2295 (57.3%) | | | 1.00 |
| ≥65 years | 221 (55.9%) | 1712 (42.7%) | | | 1.703 * [1.383–2.097] |
| Tumor Size | 395 | 4007 | 0.433 | 0.614 | |
| <5 cm | 267 (67.6%) | 2630 (65.6%) | | | 1.092 [0.876–1.362] |
| ≥5 cm | 128 (32.4%) | 1377 (34.4%) | | | 1.00 |
| Combine Stage Group | 395 | 4007 | 0.001 ** | 11.942 | |
| ≤stage II | 212 (53.7%) | 1787 (44.6%) | | | 1.439 * [1.170–1.771] |
| >stage II | 183 (46.3%) | 2220 (55.4%) | | | 1.00 |
| Radiotherapy | 395 | 4007 | 0.013 * | 6.208 | |
| No | 279 (70.6%) | 2579 (64.4%) | | | 1.332 * [1.062–1.669] |
| Yes | 116 (29.4%) | 1428 (35.6%) | | | 1.00 |
| Chemotherapy | 395 | 4007 | <0.001 *** | 20.410 | |
| No | 190 (48.1%) | 1465 (36.6%) | | | 1.608 * [1.307–1.979] |
| Yes | 205 (51.9%) | 2542 (63.4%) | | | 1.00 |
| BMI | 395 | 4007 | 0.019 * | 5.474 | |
| <24 | 179 (45.3%) | 2063 (51.5%) | | | 1.00 |
| ≥24 | 216 (54.7%) | 1944 (48.5%) | | | 1.281 * [1.041–1.576] |
| Smoking Behavior | 395 | 4007 | <0.001 *** | 12.882 | |
| No | 229 (58.0%) | 2682 (66.9%) | | | 1.00 |
| Yes | 166 (42.0%) | 1325 (33.1%) | | | 1.467 * [1.189–1.811] |
| Drinking Behavior | 395 | 4007 | 0.001 ** | 10.224 | |
| No | 272 (68.9%) | 3050 (76.1%) | | | 1.00 |
| Yes | 123 (31.1%) | 957 (23.9%) | | | 1.441 * [1.151–1.805] |
| Carcinoembryonic Antigen (CEA) Lab Value | 395 | 4007 | 0.894 | 0.018 | |
| ≤050 | 265 (67.1%) | 2675 (66.8%) | | | 1.015 [0.815–1.265] |
| >051–100, 987 | 130 (32.9%) | 1332 (33.2%) | | | 1.00 |
* p < 0.05, ** p < 0.01, *** p < 0.001.
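The odds ratios, confidence intervals, and χ² values in Table 1 can be checked from the cell counts alone. The sketch below is illustrative only: it assumes a Wald 95% confidence interval on the log odds ratio and a Pearson χ² test without continuity correction, neither of which is stated in the table, and the function name `odds_ratio_summary` is hypothetical.

```python
import math


def odds_ratio_summary(a, b, c, d, z=1.96):
    """Summarize a 2x2 table.

    a, b: comparison group with / without SPC
    c, d: reference group with / without SPC
    Assumes a Wald interval and an uncorrected Pearson chi-square.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    # Pearson chi-square for a 2x2 table (1 degree of freedom)
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return odds_ratio, lower, upper, chi2


# Sex row of Table 1: male 276 / 2503, female (reference) 119 / 1504
or_sex, lo_sex, hi_sex, chi2_sex = odds_ratio_summary(276, 2503, 119, 1504)
print(f"OR={or_sex:.3f} [{lo_sex:.3f}-{hi_sex:.3f}], chi2={chi2_sex:.3f}")
```

Under these assumptions, the Sex counts give values close to the reported 1.394 [1.114–1.744] and χ² = 8.477; other rows can be checked the same way by passing the non-reference category first.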
Table 2. The relative importance of variables associated with SPCs in rectal cancer patients.

| Rank | GainRatio * | InfoGain * | RF | C5.0 | MARS | Overall |
|---|---|---|---|---|---|---|
| 1 | Age at Diagnosis | Age at Diagnosis | Age at Diagnosis | Sex | Age at Diagnosis | Age at Diagnosis |
| 2 | Chemotherapy | Chemotherapy | Chemotherapy | Radiotherapy | Smoking Behavior | Chemotherapy |
| 3 | Smoking Behavior | Smoking Behavior | Smoking Behavior | Drinking Behavior | Chemotherapy | Smoking Behavior |
| 4 | Drinking Behavior | Sex | Sex | Combine Stage Group | Tumor Size | Combine Stage Group |
| 5 | Sex | Drinking Behavior | Drinking Behavior | Chemotherapy | Combine Stage Group | Sex |
| 6 | Combine Stage Group | Combine Stage Group | Combine Stage Group | Age at Diagnosis | Radiotherapy | Drinking Behavior |
| 7 | BMI | BMI | BMI | Tumor Size | BMI | Radiotherapy |
| 8 | Radiotherapy | Radiotherapy | Radiotherapy | BMI | Carcinoembryonic Antigen (CEA) Lab Value | BMI |
| 9 | Tumor Size | Tumor Size | Tumor Size | Carcinoembryonic Antigen (CEA) Lab Value | Drinking Behavior | Tumor Size |
| 10 | Carcinoembryonic Antigen (CEA) Lab Value | Carcinoembryonic Antigen (CEA) Lab Value | Carcinoembryonic Antigen (CEA) Lab Value | Smoking Behavior | Sex | Carcinoembryonic Antigen (CEA) Lab Value |
* Abbreviations: GainRatio, information gain ratio, i.e., the ratio of information gain to intrinsic information; InfoGain, information gain, which does not correct for the difference between attributes with many distinct values and those with fewer. GainRatio and InfoGain were obtained using the Waikato Environment for Knowledge Analysis (WEKA).
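For readers unfamiliar with the two ranking criteria, the following sketch computes information gain and gain ratio from category counts using the textbook entropy definitions. It is not WEKA's implementation, and the helper names are hypothetical; the counts are taken from Table 1.

```python
import math


def entropy(counts):
    """Shannon entropy (in bits) of a list of non-negative counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def info_gain_and_ratio(groups):
    """groups: one (with_spc, without_spc) pair per attribute level."""
    total = sum(sum(g) for g in groups)
    class_totals = [sum(g[i] for g in groups) for i in range(2)]
    # Expected class entropy after splitting on the attribute
    conditional = sum(sum(g) / total * entropy(g) for g in groups)
    gain = entropy(class_totals) - conditional
    # Intrinsic (split) information: entropy of the attribute's own levels
    split_info = entropy([sum(g) for g in groups])
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio


# Counts from Table 1 (with SPCs, without SPCs) for each level
gain_age, ratio_age = info_gain_and_ratio([(174, 2295), (221, 1712)])
gain_cea, ratio_cea = info_gain_and_ratio([(265, 2675), (130, 1332)])
print(f"Age: gain={gain_age:.5f}, ratio={ratio_age:.5f}")
print(f"CEA: gain={gain_cea:.7f}, ratio={ratio_cea:.7f}")
```

With these counts, age at diagnosis yields a larger gain than the CEA lab value, which agrees with their first and last positions in the GainRatio and InfoGain columns of Table 2.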
Table 3. Classification results of the rectal patients treated with chemotherapy.

| Method | Specificity | Sensitivity | Accuracy | F1 Score | Precision (PPV) | NPV | AUC |
|---|---|---|---|---|---|---|---|
| C5.0 | 0.8812 | 0.7082 | 0.7941 | 0.7759 | 0.8579 | 0.9684 | 0.8488 |
| RF | 0.8377 | 0.7439 | 0.7905 | 0.7814 | 0.8228 | 0.9707 | 0.8342 |
| C4.5 | 0.7565 | 0.7997 | 0.7783 | 0.7840 | 0.7689 | 0.9746 | 0.8330 |
| CART | 0.8551 | 0.7225 | 0.7883 | 0.7745 | 0.8347 | 0.9690 | 0.8285 |
| SVM | 0.8623 | 0.7554 | 0.8085 | 0.7988 | 0.8475 | 0.9728 | 0.8203 |
| LGR | 0.8188 | 0.4521 | 0.6343 | 0.5544 | 0.7166 | 0.9381 | 0.6580 |
| LDA | 0.8188 | 0.4421 | 0.6292 | 0.5455 | 0.7120 | 0.9371 | 0.6575 |
Abbreviations: FPR: the false-positive rate is the probability of incorrectly rejecting the null hypothesis for a particular test; MCC: Matthews correlation coefficient; a higher score is only obtained if the prediction had good results in all four categories of the confusion matrix (true positives, false negatives, true negatives, and false positives); F1 score: a harmonic score between sensitivity and precision; PPV: positive predictive value; NPV: negative predictive value.
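As a consistency check on the definition above, the F1 score can be recomputed as the harmonic mean of the sensitivity and precision columns of Table 3. This is a minimal sketch; the dictionary layout and the function name `f1_from` are illustrative choices, and the numbers are copied from the table.

```python
def f1_from(sensitivity, precision):
    """Harmonic mean of sensitivity (recall) and precision."""
    return 2 * sensitivity * precision / (sensitivity + precision)


# method: (sensitivity, precision, reported F1) from Table 3
table3 = {
    "C5.0": (0.7082, 0.8579, 0.7759),
    "RF": (0.7439, 0.8228, 0.7814),
    "C4.5": (0.7997, 0.7689, 0.7840),
    "CART": (0.7225, 0.8347, 0.7745),
    "SVM": (0.7554, 0.8475, 0.7988),
    "LGR": (0.4521, 0.7166, 0.5544),
    "LDA": (0.4421, 0.7120, 0.5455),
}

# Absolute difference between recomputed and reported F1 for each method
f1_gaps = {
    method: abs(f1_from(sens, prec) - reported)
    for method, (sens, prec, reported) in table3.items()
}
for method, gap in f1_gaps.items():
    print(f"{method}: |recomputed - reported| = {gap:.5f}")
```

Each recomputed value agrees with the reported F1 score to within rounding of the four-decimal inputs.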
Table 4. Subject demographics of primary rectal cancer patients with chemotherapy.

| Risk Factors | With SPCs (%) | Without SPCs (%) | p-Value | χ² | Odds Ratio [95% CI] |
|---|---|---|---|---|---|
| n (%) | 205 (7.5%) | 2542 (92.5%) | | | |
| Sex | 205 | 2542 | 0.1 | 2.702 | |
| Male | 143 (69.8%) | 1628 (64.0%) | | | 1.295 [0.951–1.763] |
| Female | 62 (30.2%) | 914 (36.0%) | | | 1.00 |
| Age at Diagnosis | 205 | 2542 | 0.004 ** | 8.207 | |
| <65 years | 107 (52.2%) | 1584 (62.3%) | | | 1.00 |
| ≥65 years | 98 (47.8%) | 958 (37.7%) | | | 1.514 * [1.138–2.015] |
| Tumor Size | 205 | 2542 | 0.831 | 0.046 | |
| <5 cm | 119 (58.0%) | 1495 (58.8%) | | | 1.00 |
| ≥5 cm | 86 (42.0%) | 1047 (41.2%) | | | 1.032 [0.773–1.377] |
| Combine Stage Group | 205 | 2542 | 0.397 | 0.718 | |
| ≤stage II | 52 (25.4%) | 579 (22.8%) | | | 1.152 [0.830–1.600] |
| >stage II | 153 (74.6%) | 1963 (77.2%) | | | 1.00 |
| Radiotherapy | 205 | 2542 | 0.302 | 1.065 | |
| No | 108 (52.7%) | 1244 (48.9%) | | | 1.162 [0.874–1.545] |
| Yes | 97 (47.3%) | 1298 (51.1%) | | | 1.00 |
| BMI | 205 | 2542 | 0.382 | 0.764 | |
| <24 | 101 (49.3%) | 1333 (52.4%) | | | 1.00 |
| ≥24 | 104 (50.7%) | 1209 (47.6%) | | | 1.135 [0.854–1.509] |
| Smoking Behavior | 205 | 2542 | 0.017 * | 5.695 | |
| No | 113 (55.1%) | 1614 (63.5%) | | | 1.00 |
| Yes | 92 (44.9%) | 928 (36.5%) | | | 1.416 * [1.063–1.886] |
| Drinking Behavior | 205 | 2542 | 0.014 * | 6.064 | |
| No | 134 (65.4%) | 1863 (73.3%) | | | 1.00 |
| Yes | 71 (34.6%) | 679 (26.7%) | | | 1.457 * [1.078–1.968] |
| Carcinoembryonic Antigen (CEA) Lab Value | 205 | 2542 | 0.475 | 0.511 | |
| ≤050 | 115 (56.1%) | 1491 (58.7%) | | | 1.00 |
| >051–100, 987 | 90 (43.9%) | 1051 (41.3%) | | | 1.110 [0.833–1.479] |
* p < 0.05, ** p < 0.01.
Table 5. Summarized rules of condition risk factors.

| Rule No. | Combinations of Condition Factors | SPCs/Observed (n) | Accuracy |
|---|---|---|---|
| 1 | Drinking Behavior (No) + CEA Lab Value (≤050 ng/mL) + Sex (Male) + Age at Diagnosis (<65 years) | 149/261 | 57.0% |
| 4 | Drinking Behavior (No) + CEA Lab Value (≤050 ng/mL) + Sex (Female) + Age at Diagnosis (≥65 years) | 114/175 | 65.1% |
| 6 | Drinking Behavior (No) + CEA Lab Value (>050 ng/mL) + Age at Diagnosis (≥65 years) + Sex (Male) | 84/141 | 59.5% |
| 9 | Drinking Behavior (Yes) + BMI (<24) + Sex (Male) + Age at Diagnosis (<65 years) + CEA Lab Value (>050 ng/mL) | 25/36 | 69.4% |
| 10 | Drinking Behavior (Yes) + BMI (<24) + Sex (Male) + Age at Diagnosis (≥65 years) + CEA Lab Value (≤050 ng/mL) | 31/45 | 68.8% |
| 12 | Drinking Behavior (Yes) + BMI (<24) + Sex (Female) | 7/9 | 77.7% |
| 13 | Drinking Behavior (Yes) + BMI (≥24) | 153/225 | 68.0% |
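To illustrate how the rules in Table 5 might be applied to a single patient record, the sketch below encodes three of them as condition sets and returns the first matching rule with its SPCs/Observed ratio. The field names (`drinking`, `bmi`, `sex`, `age`, `cea`), the category strings, and the function `match_rule` are hypothetical choices for illustration, not part of the study's software; only the counts come from the table.

```python
# rule number: (required field values, SPC count, observed count) from Table 5
RULES = {
    1: ({"drinking": "No", "cea": "<=050", "sex": "Male", "age": "<65"}, 149, 261),
    12: ({"drinking": "Yes", "bmi": "<24", "sex": "Female"}, 7, 9),
    13: ({"drinking": "Yes", "bmi": ">=24"}, 153, 225),
}


def match_rule(patient):
    """Return (rule number, SPCs/Observed ratio) for the first matching rule."""
    for rule_no, (conditions, spc, observed) in RULES.items():
        if all(patient.get(field) == value for field, value in conditions.items()):
            return rule_no, spc / observed
    # Table 5 lists only a subset of the tree's rules, so a record may match none
    return None


patient = {"drinking": "Yes", "bmi": ">=24", "sex": "Male", "age": "<65", "cea": "<=050"}
matched = match_rule(patient)
print(matched)
```

Because the rules are mutually exclusive leaves of one tree, the order of iteration does not affect which rule matches.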
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.