Nomogram Predicting the Survival of Young-Onset Patients with Colorectal Cancer Liver Metastases

Background: Although the global prevalence of colorectal cancer (CRC) is decreasing, there has been an increase in incidence among young-onset individuals, in whom the disease is associated with specific pathological characteristics, liver metastases, and a poor prognosis. Methods: From 2010 to 2016, 1874 young-onset patients with colorectal cancer liver metastases (CRLM) from the Surveillance, Epidemiology, and End Results (SEER) database were randomly allocated to training and validation cohorts. Multivariate Cox analysis was used to identify independent prognostic variables, and a nomogram was created to predict cancer-specific survival (CSS) and overall survival (OS). Receiver operating characteristic (ROC) curve, C-index, area under the curve (AUC), and calibration curve analyses were used to determine nomogram accuracy and reliability. Results: Factors independently associated with young-onset CRLM CSS included primary tumor location, the degree of differentiation, histology, M stage, N stage, preoperative carcinoembryonic antigen level, and surgery (all p < 0.05). The C-indices of the CSS nomogram for the training and validation sets (compared to TNM stage) were 0.709 and 0.635, and 0.735 and 0.663, respectively. The AUC values for 1-, 3-, and 5-year OS were 0.707, 0.708, and 0.755 in the training cohort and 0.765, 0.735, and 0.737 in the validation cohort, respectively; therefore, the nomogram had high sensitivity, and was superior to TNM staging. The calibration curves for the training and validation sets were relatively consistent. In addition, a similar result was observed with OS. Conclusions: We developed a unique nomogram incorporating clinical and pathological characteristics to predict the survival of young-onset patients with CRLM. This may serve as an early warning system allowing doctors to devise more effective treatment regimens.


Introduction
Colorectal cancer (CRC) is the third most common malignancy and leading cause of cancer death in both men and women worldwide [1]. Since the mid-2000s, the incidence of CRC in both sexes in the United States has fallen by 2-3% each year because of the widespread use of screening tests that allow detection and excision of pre-malignant lesions [1,2]. However, the incidence of young-onset CRC, defined as CRC developing before the age of 50 years, has risen in recent years [3]. Adult CRC survival improved from 1973 to 2005, whereas child and adolescent survival did not [4,5]. Compared to the elderly, young patients are more prone to distant metastases and microsatellite instability, both of which are linked to poor outcomes [6]. Young-onset CRC causes both financial loss and loss of life. Appropriate therapies for young-onset CRC patients are lacking.
In individuals with CRC, the liver is the most common site of metastatic disease. In 20-25% of patients in whom colorectal cancer is identified for the first time, liver metastases are also found [7]. Younger CRC patients have more liver metastases than older CRC patients, possibly due to delays in diagnosis [8,9]. No consensus has emerged on whether the colorectal metastases of young people are identical to the colorectal cancer liver metastases (CRLM) of older patients, or a unique molecular/immunological entity. Due to variation in genetic, cultural, nutritional, and regional factors, it is difficult to predict the long-term survival of young-onset patients with CRLM. Currently, both the American Joint Committee on Cancer (AJCC) TNM stage and clinical experience inform predictions of the prognosis and survival of young-onset patients with CRLM. However, the TNM stage considers only a few criteria, and many clinical features that affect prognosis are overlooked [10]. Reliable prognostic predictions are crucial when selecting therapy, and to ensure good communication between clinicians and young-onset CRLM patients.
Often, nomograms that consider many independent predictors of survival are more accurate, and more intuitive when applied clinically, than other survival prediction methods. We used the Surveillance, Epidemiology, and End Results (SEER) database to acquire information on CRLM in young-onset patients. All cases were separated into training and validation sets. Using common clinicopathological criteria, we developed an efficient and precise nomogram predicting CRLM prognosis in young-onset patients, and a histogram that assessed predictive power.

Patients
We identified 1874 young-onset CRLM patients in the SEER database using the following selection criteria: aged 20-49 years, and CRLM evident at the initial diagnosis from 2010 to 2016. Patients diagnosed on the basis of autopsies or death certificates were excluded, as were those with incomplete information. Figure 1 shows a flow diagram of the patient selection process. SEER database analyses are exempt from medical ethics approval, so no informed patient consent process was necessary.

Data Collection
Data on age, sex, tumor site, degree of differentiation, histological type, TNM stage, T stage, N stage, M stage, carcinoembryonic antigen (CEA) level, and primary and metastatic surgery status were retrieved. The seventh edition of the AJCC staging system was used to classify all clinicopathological factors. Adenocarcinoma (8010, 8020, 8140-8141, 8144, 8210, 8211, 8255, 8260, 8261, 8263, 8310, 8440, 8460, 8550, 8560) and mucinous adenocarcinoma (8470, 8471, 8472, 8480, 8481) were the two histological subtypes of CRC defined by the ICD-O-3 oncology codes. The SEER database was used to determine survival and the ultimate cause of death. For model creation and assessment, training and validation sets were generated.

Statistical Analysis
The chi-squared test and Fisher's exact test were used to compare categorical variables, which are expressed as numbers with percentages. Multivariate survival analysis was performed using Cox's regression, and a nomogram was created. The concordance index (C-index) is the proportion of all usable patient pairs in which the patient with the higher predicted risk had the shorter observed survival. To assess prognostic accuracy, time-dependent receiver operating characteristic (ROC) curves were drawn, and the areas under the curves (AUCs) were calculated at 1, 3, and 5 years. The calibration curve, generated with 1000 bootstrap resamples, was used to determine whether the nomogram-predicted survival probability matched the observed survival probability. The Kaplan-Meier method and log-rank test were used to examine the survival curves. SPSS (version 25.0; IBM Corp., Armonk, NY, USA) and R software (version 4.12; R Foundation for Statistical Computing, Vienna, Austria) were used to execute all statistical procedures. A p-value < 0.05 was regarded as statistically significant.
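To make the C-index concrete, the following is a minimal Python sketch of Harrell's concordance index for right-censored data. It is an illustration only, not the code used in this study (the analyses were run in R and SPSS); the function name `harrell_c_index` and the toy data are hypothetical, and pairs with tied follow-up times are simply skipped, which is a simplification.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data (simplified sketch).

    times       -- follow-up time for each patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- model output; a higher score means a higher predicted risk
    """
    concordant = 0.0
    usable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient a has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is usable only if the shorter time is an observed death.
        # Tied times are skipped here (a simplification).
        if times[a] == times[b] or not events[a]:
            continue
        usable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher risk died earlier: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable patient pairs")
    return concordant / usable


# Hypothetical toy data: months of follow-up, event flags, predicted risk.
toy_times = [5, 12, 20, 30, 38]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.4, 0.6, 0.3, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
```

In this toy example, 8 pairs are usable and 7 of them are concordant, so `toy_c` is 0.875. A value of 0.5 corresponds to random prediction and 1.0 to perfect ranking.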

Basic Patient Characteristics
A total of 36,616 patients with CRLM were found in the SEER database from 2010 to 2016, including 5038 aged 20-49 years (13.8% of all patients). After rigorous screening, 1874 young-onset patients with CRLM were included. The patients were randomly divided, at a 7:3 ratio, into a training set (1314 cases) and a validation set (560 cases) using the caret package in R software 4.12. The two groups did not differ significantly in demographic or clinical variables. Table 1 lists the patient characteristics.
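As a rough illustration of a 7:3 random allocation, the sketch below shuffles 1874 patient indices and cuts them at 70%. It is written in Python rather than R, the function name `split_7_3` and the seed are hypothetical, and it is a plain unstratified shuffle. It therefore yields 1312/562 rather than the reported 1314/560; a partition that samples within outcome strata (as caret's `createDataPartition` does) can shift the counts by a few patients, although which caret function was used here is an assumption.

```python
import random


def split_7_3(patient_ids, train_fraction=0.7, seed=2022):
    """Unstratified random split into training and validation sets.

    The seed is arbitrary and only makes the example reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical patient indices 0..1873 standing in for the 1874 SEER records.
train_ids, valid_ids = split_7_3(range(1874))
```

Every patient ends up in exactly one of the two sets, which is the property that matters for validation.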
In total, 54.5% of the patients were male and 33.5% of the tumors were in the rectum. Of all tumors, 75.7% were well or moderately differentiated. Of all patients, 80.4% were CEA-positive and 93.1% had adenocarcinomas. Furthermore, in 25.8% of patients, both the primary tumor and liver metastases were removed at the same time, while only the primary tumor was removed in 49.9%. Of all patients, 81.1% had hepatic metastases only and 18.9% had extrahepatic metastases.

Independent Features Predictive of Prognosis in Young-Onset Patients with CRLM
Seven characteristics (primary tumor site, degree of differentiation, N stage, M stage, histology, surgery, and CEA level) were independent predictors of CSS on multivariate Cox's regression analysis (Table 2).

Construction of the Nomogram
We used multivariable Cox's regression analysis to create a nomogram that considered the seven main factors affecting survival. The top ruler of the nomogram is used to read off a risk score for each variable; the scores are then summed, and the total is located on the bottom rulers to obtain the probability of 1-, 3-, and 5-year CSS (Figure 2). A prognostic nomogram for OS in young-onset patients with CRLM is presented in Figure S1.
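The way a nomogram turns Cox coefficients into points and then into a survival probability can be sketched as follows. This Python example is only an illustration under stated assumptions: the coefficients in `COEFS` and the baseline survival values in `BASELINE_SURVIVAL` are hypothetical placeholders (not the fitted values from Table 2), only three of the seven variables are shown, and the reference level of each variable is assumed to have a coefficient of 0. The variable with the largest effect is scaled to 100 points, and survival is computed as S(t) = S0(t)^exp(linear predictor).

```python
import math

# HYPOTHETICAL log hazard ratios per level; reference levels are 0.0.
COEFS = {
    "site": {"rectum": 0.0, "left colon": 0.05, "right colon": 0.40},
    "surgery": {"primary + metastasis": 0.0, "primary only": 0.45, "none": 0.90},
    "cea": {"negative": 0.0, "positive": 0.35},
}

# HYPOTHETICAL baseline survival S0(t) of the reference patient, by month.
BASELINE_SURVIVAL = {12: 0.95, 36: 0.75, 60: 0.60}

# The largest single-variable effect defines the 0-100 point scale.
MAX_EFFECT = max(max(levels.values()) for levels in COEFS.values())


def points(variable, level):
    """Points on the top ruler for one variable level."""
    return 100.0 * COEFS[variable][level] / MAX_EFFECT


def total_points(patient):
    """Sum of the points over all variables in the model."""
    return sum(points(variable, patient[variable]) for variable in COEFS)


def predict_survival(patient, months):
    """Survival probability read from the bottom ruler at a given time."""
    linear_predictor = total_points(patient) * MAX_EFFECT / 100.0
    return BASELINE_SURVIVAL[months] ** math.exp(linear_predictor)


reference = {"site": "rectum", "surgery": "primary + metastasis", "cea": "negative"}
worst = {"site": "right colon", "surgery": "none", "cea": "positive"}
```

Calling `predict_survival(reference, 60)` returns the baseline value, while `predict_survival(worst, 60)` is lower because the patient accumulates more points. The point scale is only a rescaling of the Cox linear predictor, which is why the printed nomogram and the underlying model give the same prediction.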

Validation of the Nomogram
For the training set, the CSS nomogram had a C-index of 0.709 (95% CI = 0.689-0.729), indicating high accuracy in terms of CSS prediction. For both the training and validation sets, the C-indices of the CSS nomogram were higher than those of the TNM stage (Table 3). The CSS nomogram was well-calibrated, with the mean projected probability for each subgroup being similar to the observed probability, as revealed by the calibration plots (Figure 3A-F).
We generated ROC curves to assess the survival predictions. The predictive accuracy of the CSS nomogram was better than that of the TNM stage in both the training and validation sets. When the CSS nomogram was compared to the TNM stage in terms of the training set, the 1-...
We calculated the relative risk coefficient for young-onset patients with CRLM based on the multivariate Cox regression risk scale model. Patients with relative risk coefficients greater than the median were defined as the high-risk group, while patients with relative risk coefficients less than the median were defined as the low-risk group. The high-risk group had a median survival of 20 months, whereas the low-risk group had a median survival of 38 months. Figure 5 shows the survival curves (p < 0.0001).
In addition, the time-dependent ROC results showed that the prediction accuracy of the OS nomogram was better than that of the TNM stage in both the training and validation sets (Figures S2 and S3).
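The risk stratification described in this section (splitting patients at the median relative risk coefficient and comparing median survival between groups) can be sketched in Python as follows. This is an illustration with hypothetical toy data, not the study's code; the function names are invented for the example, patients whose score equals the median are assigned to the low-risk group (an assumption, since the text does not say how ties were handled), and the log-rank test is not implemented.

```python
from statistics import median


def split_by_median(risk_scores):
    """Label each patient 'high' (score above the median) or 'low'."""
    cutoff = median(risk_scores)
    # Assumption: a score exactly equal to the median is labeled 'low'.
    return ["high" if score > cutoff else "low" for score in risk_scores]


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(time, survival)] at each death time."""
    curve = []
    survival = 1.0
    death_times = sorted({t for t, e in zip(times, events) if e})
    for t in death_times:
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


def median_survival(times, events):
    """First time at which the KM curve drops to 0.5 or below, else None."""
    for t, survival in kaplan_meier(times, events):
        if survival <= 0.5:
            return t
    return None


# Hypothetical toy cohort: risk score, months of follow-up, death flag.
toy_risk = [0.2, 0.5, 0.9, 1.4, 1.8, 2.3]
toy_months = [40, 36, 30, 22, 18, 10]
toy_death = [0, 1, 1, 1, 1, 1]

groups = split_by_median(toy_risk)
medians = {}
for label in ("low", "high"):
    idx = [i for i, g in enumerate(groups) if g == label]
    medians[label] = median_survival(
        [toy_months[i] for i in idx], [toy_death[i] for i in idx]
    )
```

In the toy cohort the low-risk group has a median survival of 36 months and the high-risk group 18 months; these numbers come from the invented data and are unrelated to the 38 and 20 months reported for the study cohort.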

Discussion
Distant organ metastasis (mainly to the liver) is a clinical hallmark of young-onset CRC, accounting for most of the deaths [11]. Multiple liver metastases, nodules, the degree of differentiation, extrahepatic metastasis, tumor size, CEA level, positive surgical margins, and venous infiltration are all significant predictors of CRLM patient survival [12-14]. However, clinicopathological characteristics differ significantly between younger and older CRC patients. Young-onset CRC tends to be more advanced at diagnosis, and exhibits poorer cell differentiation and a higher likelihood of signet ring cell histology; moreover, the primary tumor is more likely to be on the left side of the colon [15]. It is unclear whether models predicting CRLM prognosis in the general population are appropriate for young-onset patients. Therefore, we identified risk factors for younger patients and developed a model predicting survival based on specific pathological tumor characteristics. We created a nomogram that combined clinicopathological factors with the TNM stage to predict survival in young-onset CRLM patients. The primary tumor site and grade, N and M stages, pretreatment CEA level, histology, and resection of primary or metastatic sites were associated with prognosis. The nomogram was validated and calibrated by identifying the most important parameters, and has the potential for wide application. In both the ROC analysis and the C-index comparison, the nomogram outperformed the TNM staging method in predictive accuracy and prognostic utility. The survival curve indicated that the low-risk group had a much better prognosis.
The primary tumor site significantly affected the clinical outcome [16]. Young-onset CRLM patients with rectal and left colon tumors had better outcomes than those with right colon tumors, but there was no significant difference between patients with rectal and left colon tumors. According to a previous study based on the SEER database, CRLM patients with colon primaries had worse survival than those with rectal primaries [14]. This differs from our findings and suggests that young-onset CRLM patients may have unique prognostic factors. Right-sided CRC has a worse prognosis than left-sided CRC, as evidenced by the higher prevalence of mucinous, undifferentiated, and signet-ring cell tumors, and more advanced disease at diagnosis [17,18]. Patients with right-sided colon tumors and CRLM had a lower 5-year OS rate after surgery than those with left-sided colon tumors [19]. More studies are needed to determine whether the primary tumor site affects survival differently between younger patients with CRLM and older patients.
CEA is a cell surface glycoprotein that is expressed by normal mucosal cells and overexpressed in cancers. The CEA level was useful for predicting CRLM patient outcomes [10,20,21]. We also found that CEA-positive young-onset CRLM patients had a worse prognosis than CEA-negative patients. Furthermore, the lower the degree of differentiation of tumor cells, the poorer the survival. CEA-positive patients with a low degree of tumor differentiation must be closely monitored after discharge. In previously reported OS and cancer-specific survival nomograms for patients with CRLM, the risk scores for T1 tumors are higher than those for T2-T3 tumors, indicating that T1 tumors are associated with poorer survival [10,22]. However, we found that T stage had no significant effect on the survival of young-onset patients with CRLM. This difference in the effect of T stage by age requires further examination.
CRLM surgery remains contentious [23]. The most effective curative treatment for patients with CRLM is radical resection of the initial tumor combined with removal of liver metastases [24][25][26]. The median survival time of our young-onset CRLM patients who underwent resection of both the primary and metastatic sites was 38 months, which was significantly longer than that of patients who did not undergo surgery (18 months). For patients with unresectable liver metastases, the survival benefit afforded by resection of the primary lesion alone remains controversial. Primary tumor excision enhances quality of life and minimizes the adverse effects of systemic chemotherapy, as well as the risk of primary tumor complications (bleeding, blockage, and perforation) [27,28]. However, primary tumor excision delays systemic chemotherapy, particularly if complications emerge [29]. A multicenter retrospective cohort study showed that primary tumor excision significantly increased OS in patients with stage IV CRC and unresectable metastases [30,31]. We found that young-onset CRLM patients who underwent only primary resection had a median survival time of 26 months, which was much longer than that of patients who did not undergo surgery. This indicates that we should take a more aggressive approach to the management of the primary site in young-onset patients with CRLM.
In this study, we examined common clinicopathological features, individually and in combination, to develop a simple, quick, and accurate predictive model. However, the study had several limitations. First, the SEER database lacks information on the number and extent of liver metastases, gene mutations, and CA19-9 status. Second, apart from surgery, the database lacks treatment information. Finally, our nomogram remains to be externally validated.
In conclusion, our nomogram correctly predicted the survival of young-onset CRLM patients, showed good discrimination and calibration, and will allow physicians to individualize prognoses and explore therapeutic solutions for young-onset CRLM patients.

Supplementary Materials:
The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/diagnostics12061395/s1. Table S1: Multivariable analyses of overall survival in the training cohort, Figure S1: The Nomogram for predicting overall survival (OS), Figure S2: Calibration plots of the nomogram for 1-, 3-, and 5-year OS prediction in the training set (A-C) and validation set (D-F), Figure S3: Comparison of the ROC curves of the nomogram and the TNM stage for 1-, 3-, and 5-year OS prediction in the training set (A-C) and validation set (D-F).
Funding: This study was supported by the Zhejiang Province Natural Science Foundation of China, grant number LY19H160040.

Institutional Review Board Statement:
This investigation was conducted in accordance with the Declaration of Helsinki and with national and international guidelines. The institutional review board of our hospital approved this study on 1 December 2021 (ZY20001832).

Informed Consent Statement:
The study was based on a secondary analysis of previously collected, publicly available, and de-identified data. The SEER database holds no identifying patient information, and all data are anonymous; therefore, written informed consent was not required for this study.

Data Availability Statement:
The data that support the findings of this study are openly available in the Surveillance, Epidemiology, and End Results (SEER) database of the National Cancer Institute at https://seer.cancer.gov/ (accessed on 15 November 2021).

Conflicts of Interest:
The authors declare that they have no competing interest, and all authors confirm its accuracy.