Prognostic Discrimination of Alternative Lymph Node Classification Systems for Patients with Radically Resected Non-Metastatic Colorectal Cancer: A Cohort Study from a Single Tertiary Referral Center

Simple Summary
We compared the predictive and prognostic performance of different lymph node classification systems regarding overall survival in patients with colorectal cancer (CRC). Distinct lymph node ratio (LNR) and log odds of positive lymph nodes (LODDS) classifications demonstrated prognostic superiority over the N category only in patients with Stage III CRC.

Abstract
Background: Lymph node ratio (LNR) and the log odds of positive lymph nodes (LODDS) have been proposed as new prognostic indicators in surgical oncology. Various studies have shown a superior discriminating power of LODDS, examined as a continuous variable, over LNR and lymph node category (N) in diverse cancer entities. However, various cut-off values have been defined for each of the classification systems, and the question of which is most appropriate for patients with CRC remains open. The present study aimed to compare the predictive impact of different lymph node classification systems and to define the best cut-off values for accurate evaluation of overall survival in patients with resectable, non-metastatic colorectal cancer (CRC).
Methods: CRC patients who underwent surgical resection from 1996 to 2018 were extracted from our medical database. Cox proportional hazards regression models and C-statistics were used to assess the discriminative power of 25 LNR and 26 LODDS classifications. Regression models were adjusted for age, sex, extent of the tumor, differentiation, tumor size and localization.
Results: Our study group consisted of 654 consecutive patients with non-metastatic CRC. C-statistics identified 2 LNR and 5 LODDS classifications that demonstrated superior prognostic performance in patients with UICC III CRC, compared to the N category. No clear advantage of one classification over another could be demonstrated in any other patient subgroup.
Conclusions: Distinct LNR and LODDS classifications demonstrate a prognostic superiority over the N category only in patients with Stage III radically resected CRC.


Introduction
Colorectal cancer (CRC) is one of the most common malignancies worldwide, with approximately 148,000 new cases estimated in the United States in 2020 [1].
Lymph node (LN) status is a significant prognostic factor, directly associated with disease-free survival (DFS) as well as overall survival (OS) [2]. Its importance for therapeutic decision making is paramount, as LN metastasis constitutes an indication for perioperative treatment regimens in most solid tumors. The most widely accepted standardized LN-staging system among clinicians is incorporated into the Tumor Node Metastasis (TNM) system maintained by the American Joint Committee on Cancer (AJCC) and the Union for International Cancer Control (UICC) [3,4]. In this system, cases with no metastatic LNs are classified as N0, cases with 1-3 metastatic LNs as N1 and cases with more than 3 positive LNs as N2. Moreover, the N1 category is subdivided into N1a (1 metastatic LN), N1b (2-3 metastatic LNs) and N1c (no regional lymph nodes are positive but there are tumor deposits in the subserosa, mesentery or nonperitonealized pericolic or perirectal/mesorectal tissues), whereas the N2 category is subdivided into N2a (4-6 metastatic LNs) and N2b (7 or more metastatic LNs). The AJCC and UICC recommend that the number of examined LNs (NELN) should not be less than 12 for adequate staging, in order to minimize the possibility of stage migration [5]. However, population-based analyses suggest that the proportion of cases with sufficient NELN can be as low as 37% of the study population [6]. The strong association between NELN and N category constitutes an inherent weakness of the TNM system and has necessitated the development of novel nodal staging systems that allow better prognostic stratification. In this context, the metastatic LN ratio (LNR: the number of positive LNs divided by the NELN) has been suggested during the last decade and has been evaluated in several studies, demonstrating superior independent prognostic value in CRC [7].
However, a major drawback of LNR becomes evident in node-negative disease, where it provides no additional prognostic information compared to TNM. An additional limitation of LNR is that cases in which all harvested LNs are positive are staged in the same class, regardless of the total number of harvested lymph nodes. LNR therefore does not fully capture the information contained in the number of positive LNs and the NELN. The log odds of positive lymph nodes (LODDS) is the natural logarithm of the ratio of the number of positive LNs to the number of negative LNs and has been reported to diminish the risk of stage migration in various types of solid malignancies [8][9][10]. Essentially, LODDS is a mathematical approach to LN staging that is not influenced by the extent of lymphadenectomy and thus represents the probability that a harvested LN is metastatic. Existing data provide evidence on the suitability of LODDS as a predictor of OS in CRC and other cancers. Moreover, it has been demonstrated that LODDS is an independent prognostic factor for survival in CRC patients [11]. However, the existing literature is characterized by notable diversity regarding the cut-off points used to categorize the studied population into different subgroups. To date, these distinct cut-off points have not been compared and validated in independent sets of CRC patients. In the present study, we sought to shed light on this issue and to determine the prognostically most appropriate set of previously proposed cut-off points for different LN classification systems in patients with CRC who were treated at our department.
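The three nodal measures described above can be illustrated with a minimal Python sketch. The function names are hypothetical, N1c (tumor deposits without positive nodes) is not handled, and the continuity correction of 0.5 in the LODDS formula is an assumption: it is commonly used in the literature to avoid taking the logarithm of zero, but it is not stated in the text above.

```python
import math


def n_category(positive: int) -> str:
    """N subcategory from the number of metastatic LNs (N1c is not handled)."""
    if positive < 0:
        raise ValueError("number of positive LNs must be non-negative")
    if positive == 0:
        return "N0"
    if positive == 1:
        return "N1a"
    if positive <= 3:
        return "N1b"
    if positive <= 6:
        return "N2a"
    return "N2b"


def lnr(positive: int, examined: int) -> float:
    """Lymph node ratio: positive LNs divided by the number of examined LNs."""
    if examined <= 0 or not 0 <= positive <= examined:
        raise ValueError("need 0 <= positive <= examined and examined > 0")
    return positive / examined


def lodds(positive: int, examined: int, correction: float = 0.5) -> float:
    """Log odds of positive LNs; `correction` is an assumed continuity term."""
    if examined <= 0 or not 0 <= positive <= examined:
        raise ValueError("need 0 <= positive <= examined and examined > 0")
    negative = examined - positive
    return math.log((positive + correction) / (negative + correction))


# Hypothetical cases: (positive LNs, examined LNs)
for pos, total in [(0, 12), (0, 30), (5, 5), (20, 20)]:
    print(pos, total, n_category(pos), lnr(pos, total), round(lodds(pos, total), 2))
```

In this toy example, the two node-negative cases share LNR = 0 and the two cases with all nodes positive share LNR = 1, whereas LODDS differs within each pair, which is the limitation of LNR that LODDS is meant to address.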

Study Cohort
In the present study, we retrospectively analyzed patient charts, histopathological findings and surgical reports collected from the prospectively maintained computer-based patient records database of the University Hospital Duesseldorf. Between November 1996 and August 2018, a total of 996 adult patients with diagnosed primary CRC underwent surgery with curative intent at our department. Patients meeting any of the following criteria were excluded: metastatic disease (n = 148), incomplete histopathological information (n = 77), positive resection margins (n = 15), death within 30 days after surgery (n = 20), emergency surgery (n = 27), loss to follow-up (n = 48) and polyposis syndromes and inflammatory bowel disease (n = 7) (Figure S1).
All operations were performed via laparotomy. A high tie of the central vessels was performed with subsequent complete mesocolic excision. For rectal resections, a total mesorectal excision was performed. Circumferential resection margins could not be retrospectively retrieved for all cases of rectal cancer. OS was defined as the time between the date of surgery and death from any cause. All patients remained under outpatient follow-up of their oncological outcome, during which they were clinically examined by surgeons.
Cancers 2021, 13, 3898
The study was carried out in accordance with the principles of good clinical practice and the Declaration of Helsinki. Since this was a retrospective study, informed consent for data collection could not be obtained at a later date: for most of the patients, obtaining consent was no longer possible or would have involved a disproportionate amount of effort. In addition, all data analyzed were collected as part of routine diagnosis and treatment. The data were anonymized at the source and there was no evidence that any patient had objected to the use of their data.
Institutional review board (IRB) approval was obtained from the Medical Faculty, Heinrich Heine University Duesseldorf (IRB-No: 2019-428-ProspDEuA).

Statistical Analysis
Scatter plots were generated to investigate the relationship between the number of metastatic lymph nodes, LNR and LODDS. The accuracy of the various LN classifications, treated as continuous variables, was analyzed by measuring the area under the receiver operating characteristic (ROC) curve (AUC) using IBM SPSS Statistics for Windows, Version 25.0 (IBM Corp., Armonk, NY, USA). Kaplan-Meier curves were generated and compared by the log-rank (Mantel-Cox) test using GraphPad Prism for Windows (Version 8.0.2, GraphPad Software, San Diego, CA, USA).
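As an illustration of the AUC used here, the following Python sketch computes it as the proportion of patient pairs (one who died by a given time point, one who did not) in which the deceased patient has the higher marker value, with ties counted as one half. This is the standard rank-based interpretation of the AUC and not the SPSS procedure itself; the function name and the toy values are hypothetical, and how patients censored before the time point were handled is not stated in the text.

```python
def auc(marker_event, marker_no_event):
    """Rank-based AUC: P(marker of a deceased patient > marker of a survivor).

    marker_event:    marker values (e.g., LODDS) of patients who died by time t
    marker_no_event: marker values of patients known to be alive at time t
    Ties between the two groups contribute 0.5.
    """
    if not marker_event or not marker_no_event:
        raise ValueError("both groups must contain at least one patient")
    wins = 0.0
    for dead in marker_event:
        for alive in marker_no_event:
            if dead > alive:
                wins += 1.0
            elif dead == alive:
                wins += 0.5
    return wins / (len(marker_event) * len(marker_no_event))


# Made-up LODDS values, for illustration only
lodds_dead_by_5y = [-1.2, -0.4, 0.3]
lodds_alive_at_5y = [-3.5, -2.9, -1.2, -3.1]
print(auc(lodds_dead_by_5y, lodds_alive_at_5y))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups.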
The relationship between distinct cut-off values of the different LN classification systems and OS was analyzed using a multivariable Cox proportional hazards regression model, calculating the hazard ratio (HR) and 95% confidence intervals (CI). To this end, we fitted a base model including the following covariates: age at the time of surgery, sex, T-category (T1 + 2, T3 + 4), degree of histologic differentiation (well/moderately differentiated, G1 + G2; poorly differentiated/undifferentiated, G3 + G4), tumor size and localization (rectum, right/transverse/left/sigmoid colon).
Using this model, we further evaluated model discrimination for each LN classification using the C-statistic. We compared two C-statistics on the same data set by calculating the jackknife variance estimate of the difference between them. The comparison was made by interpreting the 95% CI of the difference of the C-statistics. The Delta C parameter was calculated to quantify the difference in C-statistic between the N category and any other classification. The false discovery rate (FDR) method was [...]
In addition, we performed the above-mentioned analysis in subgroups defined by the presence of lymph node metastases (UICC I/II versus UICC III), history of neoadjuvant therapy (yes versus no), tumor localization (rectum versus right/transverse/left/sigmoid colon) and number of resected LNs (≥12 versus <12).
For the sample size determination, we assumed a constant hazard ratio (HR) of 1.5 between the low and higher N-grading classes during the complete follow-up period of 150 months. With a total sample size of 420 and 210 required events, a two-sided log-rank test for equality of survival curves at the 0.05 significance level has 90% power to detect a difference between the two groups. The total number of n = 654 patients in our cohort, with 252 observed events, is therefore sufficient to detect a statistically significant difference between N-grading classification groups.
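To make the C-statistic and the Delta C parameter more concrete, the following Python sketch computes a simplified Harrell's concordance index for right-censored data. It is an illustration under stated assumptions and not the implementation used in the analysis: pairs with tied survival times are skipped, the jackknife variance of the difference is not computed, and the function name and toy data are hypothetical.

```python
def harrell_c(times, events, risk):
    """Simplified Harrell's C for right-censored survival data.

    times:  follow-up time per patient
    events: 1 if death was observed, 0 if censored
    risk:   model risk score (higher means worse predicted prognosis)
    A pair is comparable if the patient with the shorter time had an event.
    Tied risk scores count 0.5; tied times are skipped (simplification).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier failure
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up data: five patients, months of follow-up and event indicator
times = [5, 10, 15, 20, 25]
events = [1, 1, 0, 1, 0]
risk_n_category = [1, 1, 1, 0, 0]  # coarse score, many ties
risk_alternative = [3, 2, 2, 1, 0]  # finer score from another classification

c_n = harrell_c(times, events, risk_n_category)     # 0.75
c_alt = harrell_c(times, events, risk_alternative)  # 0.9375
delta_c = c_alt - c_n                               # 0.1875
print(c_n, c_alt, delta_c)
```

A positive Delta C in this sketch indicates better discrimination of the alternative classification than of the N category; whether such a difference is meaningful depends on the confidence interval of the difference, which is not shown here.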
For risk factors with missing data, we used a simple imputation method, using the median for continuous variables and the most frequent category for categorical variables. Statistical analysis was performed using the statistical software R version 3.6.3 [44]. We used reporting tools based on the standards of replicable research using the R package "knitr" [45]. The analysis based on Cox proportional hazards regression and the estimation of the C-statistics was performed with the R package "survival" [46].
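The simple imputation rule described above might look as follows in Python. This is a sketch only (the analysis itself was performed in R); the function name, the use of None to mark missing values and the toy data are assumptions made for illustration.

```python
from statistics import median, mode


def impute(values, kind):
    """Fill missing entries (None) with the median or the most frequent value.

    kind: "continuous" uses the median of the observed values,
          "categorical" uses the most frequent observed category.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a variable without observed values")
    if kind == "continuous":
        fill = median(observed)
    elif kind == "categorical":
        fill = mode(observed)
    else:
        raise ValueError("kind must be 'continuous' or 'categorical'")
    return [fill if v is None else v for v in values]


# Made-up values: tumor size in mm and histologic grade
tumor_size = impute([40, None, 20, 30], "continuous")
grading = impute(["G2", None, "G3", "G2"], "categorical")
print(tumor_size, grading)
```

Single imputation of this kind keeps all patients in the model but does not reflect the uncertainty of the imputed values.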

Results
A total of 654 consecutive patients with non-metastatic CRC were included in our study. Baseline clinical and histopathological characteristics of the study population are summarized in Table 1.
* One patient included with N1c (no regional lymph nodes were positive but there were pericolorectal tumor deposits).
There were 403 (61.6%) patients without LN metastases. The median (range) of NELN and positive LNs in the whole cohort was 17 (2-68) and 0 (0-56), respectively. The study population consisted of 239 (36.5%) rectal cancer patients and 415 (63.5%) cases of colon cancer. First, we evaluated the various LN classification systems as continuous variables by conducting a ROC analysis for 1-year, 3-year and 5-year OS (Figure 1A-C). LODDS exhibited the highest AUC values, with p < 0.001 at all three follow-up time points (Table S1). In addition, we displayed the LN parameters in scatter plots to examine the relationship between the number of metastatic lymph nodes (positive LN, pLN), LNR and LODDS (Figure 2A-C).
Scatter plots demonstrated that both LNR and LODDS increased with the number of pLN (rs = 0.992 and rs = 0.845, respectively). Moreover, LODDS also increased with LNR values (rs = 0.857). However, when LNR was 0 or 1, LODDS remained heterogeneous, implying that LODDS discriminates more precisely among patients without lymph node metastasis and among patients in whom the number of pLN equals the NELN. Of note, although we observed a tendency towards a more favorable prognosis in patients with a NELN ≥ 12, this difference was not statistically significant (Figure S2). We then performed Kaplan-Meier survival analysis for the AJCC 8th edition N-staging as well as for the distinct LNR and LODDS classification systems, all of which showed a statistically significant association with OS (Figures S3-S5).
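Assuming that rs denotes Spearman's rank correlation coefficient, as the subscript suggests, the coefficient can be sketched in Python as the Pearson correlation of average ranks. The function names and the toy data are hypothetical and serve only to illustrate the calculation.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up cohort: positive LNs and examined LNs per patient
positive = [0, 0, 1, 2, 5, 9]
examined = [12, 30, 15, 20, 18, 10]
lnr_values = [p / n for p, n in zip(positive, examined)]
print(spearman(positive, lnr_values))
```

Because the coefficient depends only on ranks, it captures any monotonic relationship between the nodal measures, not just a linear one.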
To identify the LN staging system with the highest discriminative power in predicting prognosis, we performed Cox proportional hazards regression and evaluated model discrimination for each LN parameter using the overall C index. To this end, we first examined the prognostic value of the selected covariates in our base model using Cox regression analysis. Age at the time of surgery, tumor size, grade of tumor differentiation and tumor localization were significantly associated with OS (Table 2).
Using this base model, we performed Cox regression analysis for each LN classification system and evaluated model discrimination using C-statistics in our entire cohort of CRC patients. Cox regression analysis revealed that advanced N categories as well as higher LNR or LODDS categories, irrespective of the cut-off values used, were significantly associated with a poor prognosis (Tables S2-S4). However, C-statistics were comparable across all classification systems, with no LNR or LODDS classification showing superiority over the AJCC 8th edition N category (Figure S6).
Postoperative chemotherapy has been shown to significantly improve survival of stage III CRC patients [47]. Given the side effects of the currently administered chemotherapeutics, a shorter course of chemotherapy would be desirable for selected patients with a lower risk of recurrence. In this context, Grothey and colleagues demonstrated that for CRC patients in a low-risk situation (T1, T2, or T3 and N1), 3 months of CAPOX (capecitabine and oxaliplatin) was not inferior to 6 months [48]. In contrast, for high-risk patients (T4, N2 or both), 6 months of therapy was superior to a 3-month regimen. In addition, the JFMC37-0801 study revealed superior recurrence-free and overall survival in stage III B, III C and IV CRC patients for 12 months of capecitabine [49]. Accordingly, among UICC III CRC patients there are lower-risk subgroups with a better prognosis.
This prompted us to investigate whether LNR and LODDS classification systems may further prognostically discriminate risk groups within the subgroup of UICC III cancer patients. Interestingly, C-statistics revealed that 2 LNR [2,21] and 5 LODDS [2,20,23,24,35] classifications exhibited superior discrimination in OS when compared with the N category (Table 3). Consistent with these observations, the survival curves for low- and high-risk patients either perfectly matched the survival curves of LNR groups 1 and 2 defined by Fortea-Sanchis [2] or at least ran parallel to the survival curves of certain LNR or LODDS groups of the remaining 6 LN classifications [2,20,21,23,24,35] in UICC III CRC patients (Figure 3). This observation suggests that these 7 LN classification systems might serve as a useful tool when deciding on the duration of adjuvant chemotherapy.
Figure 3. Kaplan-Meier survival curves for OS in UICC III CRC patients (n = 251) depending on the LNR classification system proposed by (A) Lee et al. [21], (B) Fortea-Sanchis et al. [2] or the LODDS classification system suggested by (C) Fortea-Sanchis et al. [2], (D) He et al. [35], (E) Calero et al. [20], (F) Bagante et al. [23] and (G) Jian-Hui et al. [24]. Red and green OS curves indicate high risk (T4, N2 or both) and low risk (T1, T2, or T3 and N1) patients, respectively.
Of note, in the other subgroups, defined by tumor localization, history of neoadjuvant radiochemotherapy and NELN, the LNR and LODDS classifications failed to demonstrate prognostic superiority (data not shown).

Discussion
The role of LN metastasis in the systemic dissemination of CRC is crucial. LN status is regarded as one of the major prognostic parameters for assessing the course of the disease after CRC resection. In addition to the N category of the TNM classification maintained by the AJCC/UICC, which is based on the number of positive LNs, further LN classification systems have been developed. This is a result of the limitations of the N category, as it is solely based on the number of positive LNs, regardless of the radicality of locoregional lymphadenectomy. The LNR system was introduced as an alternative to the N category, as it takes into account not only the positive LNs but also the total number of harvested LNs. However, an inherent limitation of LNR is the heterogeneity of patients in cases where all resected LNs are positive or all resected LNs are negative. LODDS was therefore introduced as an LN classification system that resolves this issue; it is defined as the logarithm of the ratio between the probability of being a metastatic LN and the probability of being a negative harvested LN, when an LN is retrieved. The prognostic value of LNR and LODDS has already been evaluated in patients with CRC and other types of cancer, including patients who underwent emergency surgery for complicated CRC [50]. Both LNR and LODDS are continuous biological variables. Nevertheless, such variables are of little use or, in the worst case, cannot be applied in clinical practice. As a result, a plethora of categorical cut-off values for various LN staging systems have been proposed. The remarkable heterogeneity of existing cut-off values is a consequence of clinical and/or methodological diversity among the existing studies. In our study, the prognostic impact of 25 LNR and 27 LODDS classifications was investigated in patients following curative-intent resection of CRC.
After confirming the predictive value of LNR and LODDS as continuous variables in our patient cohort, we further sought to compare the various LN classifications as categorical variables using C-statistics, based on already-published cut-off values. However, in our entire study cohort, none of the proposed sets of cut-off values was able to demonstrate superiority over the N category. Exclusively in the subgroup of UICC III CRC patients, 2 LNR [2,21] and 5 LODDS [2,20,23,24,35] classifications demonstrated a predictive superiority when compared with the N category. Of note, stage III CRC patients constitute a distinctive subgroup of cases that require the administration of adjuvant chemotherapy. However, the usual regimen of 6-month treatment is associated with cumulative neurotoxicity and, as a result, with a deterioration in quality of life. The balance between choice of regimen, therapy duration and risk of toxicity has been addressed in various trials [48,51]. Iveson et al. [51] conducted the largest single randomized study on adjuvant treatment of CRC and clearly demonstrated the non-inferiority of a 3-month oxaliplatin-based regimen versus the standard 6-month duration. On the other hand, in the study of Grothey et al. [48], a large prospective pooled analysis of six randomized Phase III trials, two different risk groups of UICC III CRC patients were identified in which a 3-month CAPOX regimen was non-inferior to 6 months of chemotherapy regarding disease-free survival. Furthermore, a study from Japan randomly assigned patients with radically resected UICC III stage CRC to oral adjuvant chemotherapy for 6 or 12 months, demonstrating an improved OS in the 12-month treatment group for advanced UICC III B, C and UICC IV stages [49].
Accordingly, our results provide valuable data for the further subclassification of stage III CRC patients into distinct risk groups. This is important because we identified which novel LN classification systems could be incorporated into the tailored decision-making process of selecting the most appropriate duration of adjuvant therapy for suitable subgroups of patients.
To our knowledge, our study is the first attempt to directly compare previously proposed LN classification systems. We provide novel data that form a basis for future research and point towards the evaluation of specific LN classification systems that appear to have clinical and therapeutic relevance in patients with UICC III CRC. However, there are a number of limitations inherent to all cohort studies of this type. The patients represented a selected cohort who underwent radical surgery in a highly specialized setting and are consequently not representative of all patients diagnosed with CRC. Cohort size was modest and disease-free survival was not recorded. Moreover, the administration of neoadjuvant radiochemotherapy in patients with rectal cancer was not consistent over the observed period and thus the results regarding this subgroup should be interpreted with caution. Additionally, the administration of adjuvant therapy could not be fully evaluated within this retrospective study. Data regarding the exact chemotherapeutic drugs administered, their dosage, frequency and duration are incomplete. It must also be stated that all surgeries were performed via laparotomy, which was our institutional standard during the study period; this explains the lack of laparoscopic and/or robotic approaches in our study. However, the total number of harvested lymph nodes has not been found to differ significantly between these three surgical approaches in the existing literature [52][53][54][55][56].
The strengths of the study, nevertheless, included robust follow-up data with a reasonable duration of follow-up. Patients were recruited from a consecutive series diagnosed with CRC, from a single geographical region, all treated by the same group of specialists, using a standardized staging algorithm and operative techniques.

Conclusions
In conclusion, LNR cut-off values as proposed by Lee et al. [21] and Fortea-Sanchis et al. [2], as well as LODDS classifications as proposed by Fortea-Sanchis et al. [2], He et al. [35], Calero et al. [20], Bagante et al. [23] and Jian-Hui et al. [24], demonstrate a clear prognostic superiority over the N category in the subgroup of patients with UICC III CRC. Therefore, we believe that future examination of LN classification systems with cut-off values other than those mentioned above should be abandoned in patients with UICC III CRC and that the focus should turn to further verification of our findings in the context of larger-scale clinical trials.