Clinical and Molecular Comparative Study of Colorectal Cancer Based on Age-of-Onset and Tumor Location: Two Main Criteria for Subclassifying Colorectal Cancer

Our aim was to validate that tumor location and age of onset are both important criteria for classifying colorectal cancer (CRC). We analyzed clinical and molecular characteristics of early-onset CRC (EOCRC) and late-onset CRC (LOCRC), and we compared each tumor location between both ages of onset. In right-sided colon tumors, early-onset cases showed extensive Lynch syndrome (LS) features, with a relatively low frequency of chromosomal instability (CIN) but a high CpG island methylation phenotype. In contrast, late-onset cases showed predominantly sporadic features, and their microsatellite instability cases were due to BRAF mutations. In left colon cancers, the most reliable clinical features were the tendency to develop polyps as well as multiple primary CRC, both associated with the late-onset subset. Apart from the higher degree of CIN in left-sided early-onset cancers, differential copy number alterations were also observed. Among rectal cancers, early-onset cases were diagnosed at later stages, had less association with polyps, and more than half of them were associated with a familial LS component. Stratifying CRC according to both location and age-of-onset criteria is meaningful, not only because it correlates the resulting categories with certain molecular bases, but also because, once confirmed in larger studies, it could allow new therapeutic algorithms to be defined according to this subclassification.


Introduction
In Europe, colorectal cancer (CRC) is the third most common cancer in males, the second most common cancer in females, and the fourth most common cause of cancer-related deaths; worldwide, 1,360,000 new cases are diagnosed each year and the disease causes 700,000 deaths [1]. Biologically, it represents a heterogeneous disease. Early-onset colorectal cancer (EOCRC) represents 11% of colon cancers and 18% of rectal cancers, and its incidence is increasing [2]. Several studies have described different genetic, biological and clinical behavior in this age group, suggesting that it may be a specific subgroup within CRC, and that age of onset should be a major criterion for its subclassification [3,4]. Moreover, new findings may also define some particular subtypes, such as rectal cancer with microsatellite stability (MSS) and without chromosomal instability (CIN-) [5].
The predisposition of the three main carcinogenetic CRC pathways to different locations in the colon, together with studies demonstrating that right- and left-sided CRCs exhibit different genetic, biological and demographic characteristics and risk factors, suggests that the carcinogenetic mechanism and progression of CRC may differ with tumor location. Thus, the anatomic site of origin of CRC appears to be another good discriminator for the subclassification of this type of neoplasm [6–8]. Along these lines, we recently analyzed tumor location (right colon, left colon and rectum) as a discriminatory factor within EOCRC; clinical differences emerged, as did differences in the main carcinogenetic pathway predominant at each location [9].
For the past few years, while characterizing EOCRC from different points of view, we have been comparing series of features with their correlates in late-onset CRC (LOCRC) in order to assess whether they have a different molecular basis [3,4,10]. Our purpose here is therefore to confirm whether two of our previous findings hold up together: that tumors at different locations show differential clinical and molecular features, and that age of onset is a major criterion for classifying CRC. If so, right colon, left colon and rectal cancers should be substantially different between early and late age-of-onset groups.

Clinicopathological and Familial Features
Comparative analysis of EOCRC with respect to tumor location has been published before [9]. Left colon was the most frequent location (43%), followed by rectum (33%). The main features for each location are shown in Tables 1-3. Approximately 15% of EOCRC tumors showed microsatellite instability (MSI); the most prevalent location was the right colon (30%), followed by the left colon (17%). None of the tumors located in the rectum showed MSI. Most MSI cases were due to LS (germline mutations in MMR genes). CIMP-High was also located mainly in the right colon (50%). Therefore, according to the molecular classification, the most homogeneous distribution was observed for right colon cancers. For left colon and rectal cancers, the largest category was MSS-CIMP-Low/0 (77.5% and 90.5%, respectively). Right colon tumors showed the lowest CIN, whereas left colon tumors exhibited the highest CIN, particularly with respect to losses. Rectal tumors showed the highest mean number of whole chromosome aberrations.

Clinicopathological and Familial Features
Clinical and molecular features of global LOCRC and comparison between colon locations have been published before [4,9–11], and are summarized in Tables 1-3. The least frequent location was left colon (23%), whereas the other two locations were observed at equivalent rates.

Molecular Features
With regard to the three main molecular carcinogenetic pathways, nine cases showed MSI (9.3%), with only one of them due to an MMR germline mutation; seven of the others were due to BRAF mutations and/or hypermethylation of the MLH1 promoter, the majority of which were located in the right colon. With regard to the molecular classification, the most frequent category was MSS-CIMP-Low/0, reaching almost 86% of all left colon cancers. Interestingly, another frequently occurring molecular category was MSS-CIMP-High in rectal tumors, reaching 36%. The highest CIN was observed in the right colon. Left colon cancers showed a notably low CIN, and rectal tumors showed a high mean number of whole altered chromosomes.

Clinicopathological and Familial Features
We compared right colon cancers between both age-of-onset populations (Table 1). The right colon location was more frequent in LOCRC (39% vs 24%). Most of the differential features of the early-onset subset were attributable to the LS component: cases were diagnosed at early stages, developed a higher mean number of polyps during follow-up and showed an important familial component. They also showed a better disease-free survival (DFS). On the other hand, late-onset cases developed on average fewer polyps during follow-up, and three-quarters of them were sporadic.

Molecular Features
With respect to the molecular component of right-sided EOCC, all MSI tumors were due to germline mutations in the MMR genes (30%), with the genomic instability index (GII) being equivalent to a low CIN. Of note, there is a considerable component of CIMP-High in this particular group. All of these features are shown in Table 1. Within right-sided LOCC, the MSI component (18%) was almost entirely due to BRAF mutations and/or hypermethylation of the MLH1 promoter; these cases display some already known particular characteristics, such as a CIMP-High component [12]. This subset showed a much higher CIN compared with its EOCC equivalent, as shown in Figure 1A.
The right colon is the location with the most differentially altered chromosomal segments according to the age of onset, mainly for the late-onset group (Table S1), and the most frequent specific differential segments for each location are shown in Table 4. From a total of 113 differentially altered chromosomal segments, 100 corresponded to right-sided LOCC, and losses in chr2 and gains in chr7 were the most frequent. Only 13 differential segments were particularly predominant in right-sided EOCC, in most cases corresponding to losses in 1q21.1-21.2, 10q11.21-11.22 and 14q11.1-11.2.

Clinicopathological and Familial Features
The age subset with the highest proportion of left colon cancers was EOCRC (43%). Most of the clinicopathological and familial features were similar for both groups, as shown in Table 2. The differential characteristics were the earlier stages at diagnosis in the left-sided EOCC group (80% in stages I and II), and the more frequent development of polyps as well as other CRCs (SCRC and/or MCRC) in left-sided LOCC.

Clinicopathological and Familial Features
Table 3 summarizes the main differential features between rectal cancers of both age groups. From a clinical point of view, EORCs showed less association with polyps, a very advanced stage at diagnosis (37% with metastasis), and a better DFS; a relevant aspect was the important familial cancer component, with more than 50% of cases with LS-related neoplasms in their families. Conversely, LORC cases were mainly sporadic.

Molecular Features
All LORCs were MSS, and an important number of these tumors showed CIMP-High (36%). More than 90% of EORCs were MSS-CIMP0. Lastly, there were differences in relation to CIN, with the early-onset rectal cancers displaying less instability for losses (Figure 1C and Table 3).
Although there were some differentially altered chromosomal segments between these age-of-onset groups, they were proportionally fewer than in other locations, with a slight predominance for LORC (Table 4 and Table S1). In this last group, losses in 1p36.32-36.13 and gains in 5q13.2 were the most frequent.

Discussion
CRC is a heterogeneous disease with different outcomes and drug responses. Subclassification according to the age of onset and tumor location may have applications in terms of diagnosis, prevention and therapy. We recently published differential features for EOCRC and LOCRC separately, according to tumor location, and some interesting subsets within these age-of-onset groups became manifest [9,11]. More interesting, however, was the possibility that both factors, age of onset and colon location, may differentiate CRC.
In right colon cancers, the differential phenotypes for both age groups are clearly correlated with their molecular basis. Right-sided early-onset cases showed important LS features (poorly differentiated tumors, a higher mean number of polyps, the familial component of LS-related neoplasms), all of which are likely to be associated with the 30% of MSI due to germline mutations in MMR genes; these tumors have a relatively low CIN but are CIMP-High. The late-onset subset was mainly present in females, with sporadic features, and only 18% of cases were MSI; this was seldom due to LS but mostly because of BRAF mutations. In spite of the higher CIN of right-sided LOCCs, the two most frequently occurring differential regions (1q21 and 10q11.21-11.22) were observed in early-onset cases. Increased alterations in 1q21 have been found before in EOCRC, although the criterion of tumor location was not applied [13], while 10q11.21-11.22 is a region where the RET gene is located [14]. Another important aspect that should be underlined is the presence of losses in 14q11.1-11.2. This region harbors NDRG2, a tumor suppressor gene that has been associated with c-MYC and the TGF-β pathway in colorectal carcinogenesis [15,16], and whose epigenetic silencing promotes tumor proliferation and invasiveness [17].
Left colon tumors differed predominantly in CIN (mainly in whole altered chromosomes). The most reliable clinical features were associated with the late-onset subgroup, which showed an important tendency to develop polyps as well as multiple primary CRC, in contrast with the early-onset subgroup. Nevertheless, left-sided EOCCs showed the higher CIN, as well as some differential CNAs, a few of which have been associated previously with other features of CRC or with the early onset of cancers. As mentioned above, losses in 1q21.1-21.2 are found significantly more often in EOCRC [13]; losses in 11q14.1-14.3 have been associated with aggressive behavior and familial linkage, albeit in prostate cancers [18]; gains in 19p13.12 may be related to NOTCH3 expression or function [19]. In addition, 11q14.1 harbors GAB2, a gene involved in epithelial-to-mesenchymal transition and, more importantly, in the development of lymph node invasion and metastasis [20].
Differences within rectal cancers are more prominent in the early-onset subset: later stages at diagnosis, less association with polyps, and more than half of cases associated with a familial LS component, although all of them are MSS. Within LORC, only the CIMP-High-MSS category (36%) is noteworthy. Only a few differential chromosomal segments appeared at high rates, and the most frequent were associated with LORC; one of these, gains in 5q13.2, involves a segment associated with rectal metastatic carcinomas [21].
Finally, we wanted to assess which categories of the recent consensus molecular classification of the CRC Subtyping Consortium [22] our location and age-of-onset subgroups corresponded to. Since consensus molecular subtype 1 (CMS1 or MSI immune) is mainly located in the right colon, we compared both age groups. As CMS1 is characterized by MSI, it is clearly related to both age groups, but mainly to the right-sided early-onset subgroup. The other features are shared by the separate groups individually. While right-sided EOCC is linked with CIMP-High and a low prevalence of somatic CNAs or CIN, right-sided LOCC showed BRAF mutation, high grade of differentiation, predominance in females, and worse survival after relapse, in agreement with recent studies showing worse prognosis of patients with MSI and recurring BRAF-mutant CRC [23–25]. In our study, the location group within EOCRC that had the best prognosis was the right colon, while for LOCRC it was the left colon.
Consensus molecular subtype 2 (CMS2 or canonical) is the most frequent and is defined by somatic copy number alterations (SCNAs) in a larger proportion than in the other subtypes, by being mainly left-sided and by having a larger proportion of long-term survivors. In our study, the first three characteristics are clearly connected with left-sided EOCC and only the last with left-sided LOCC.
Consensus molecular subtype 3 (CMS3 or metabolic) is characterized by MSI, CIMP-Low and low CNA. Some features are associated with EOCRC (left colon with CIMP-Low and MSI, and rectum also CIMP-Low), and others with left-sided LOCC and LORC, respectively (low CNA and CIMP-High).
Consensus molecular subtype 4 (CMS4 or mesenchymal) exhibits high SCNA, is diagnosed at more advanced stages and shows worse relapse-free and overall survival. Rectal tumors, mainly those with early onset, could be classified in this molecular subtype. This point about rectal tumors deserves separate consideration, given that the CMS classification focused mainly on right versus left colon cancer, and only a few rectal cancers were included. The study of rectal cancer within EOCRC may prove especially significant, as this is the subtype whose incidence has increased most markedly in recent years [26,27]. Other important points to take into account are that it has the worst prognosis in both age-of-onset subgroups and that rectal cancer is a different disease from colon cancer [28]. Efforts are needed to identify, with enough cases, the group of the aforementioned classification into which rectal cancer could be placed, unless a new category arises from such an analysis.
However, our results and their correlative implications should be considered cautiously given the limited sample size of the groups. Additional studies with a larger sample size are needed to confirm our findings.

Families, Samples and Data Collection
Between January 2002 and December 2008, a total of 82 consecutive individuals with CRC diagnosed at an age of 45 years or younger (EOCRC; 6 cases diagnosed with familial adenomatous polyposis were excluded), and 97 consecutive patients with CRC diagnosed at an age of 70 years or older (LOCRC), were collected at our institution. Each was considered the index case of his or her family.
The age of 45 years or younger was chosen as an inclusion criterion because, although there is no consensus on a specific age, according to the literature it is the age range where the majority of genetic and phenotypic alterations are found. The age of 70 years or older was chosen as an inclusion criterion because, according to the World Health Organization, 65-70 years is the cut-off point from which a patient is considered older. Patients between 45 and 70 years of age were not included since, according to the literature, such cases may behave like late or attenuated genetic syndromes or, more frequently, like sporadic tumors, thus overlapping with both groups. This overlap may act as a confounding factor that leads to false conclusions [29–32].
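The inclusion rule described above can be sketched in a few lines of Python. This is only an illustration of the stated cut-offs (45 years or younger, 70 years or older); the function and constant names are hypothetical and are not part of the study's actual data handling.

```python
from typing import Optional

# Cut-offs taken from the text; the names are illustrative assumptions.
EARLY_ONSET_MAX_AGE = 45  # EOCRC: diagnosed at 45 years or younger
LATE_ONSET_MIN_AGE = 70   # LOCRC: diagnosed at 70 years or older


def age_of_onset_group(age_at_diagnosis: int) -> Optional[str]:
    """Return 'EOCRC', 'LOCRC', or None for ages outside both groups."""
    if age_at_diagnosis <= EARLY_ONSET_MAX_AGE:
        return "EOCRC"
    if age_at_diagnosis >= LATE_ONSET_MIN_AGE:
        return "LOCRC"
    # Intermediate ages were not included in the study.
    return None
```

Returning None for the intermediate range mirrors the decision to leave those patients out rather than force them into either group.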
All subjects gave their informed consent for inclusion before they participated in the study. The study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by the Ethics Committee of the "12 de Octubre" University Hospital. Personal and clinicopathological information was obtained, including age of onset, gender, location of the CRC (right/left colon or rectum), grade of cell differentiation (low, medium or high), mucin production, the presence of "signet ring" cells, TNM stage (6th edition), the existence of polyps during follow-up, type of polyps (adenomatous, hyperplastic or mixed), the presence of synchronous or metachronous CRCs (SCRC or MCRC), and multiple primary neoplasms in the index case. The tumors (adenocarcinomas) were pathologically confirmed by a single pathologist.
Right colon tumors were defined as those ranging from the caecum to the proximal two-thirds of the transverse colon, and left colon tumors as those ranging from the distal third of the transverse colon, including the splenic flexure, to the sigmoid colon. Rectal cancers were defined as those located up to 16 cm from the anal verge (rectosigmoid junction).
To analyze the antecedents of cancer, families were classified into four groups: (a) families fulfilling the Amsterdam II criteria for Lynch syndrome (LS) [33]; (b) families with aggregation mainly of LS-related neoplasms (in at least one first-degree or two second-degree family members); (c) families with aggregation mainly of LS-unrelated neoplasms; (d) cases without oncological antecedents, which were considered sporadic cases.
Follow-up was at least 5 years from surgery; DFS and overall survival (OS), recurrence and cancer-related death were recorded for each case and are reported as means in the accompanying tables.

Microsatellite Instability and Mutational Analysis
A microsatellite instability (MSI) analysis was performed using the Bethesda panel [34], as published before [3]. Tumors were considered as MSI when showing high-frequency MSI (MSI-H; two or more of the five markers showing instability), while the rest (including MSI-L) were classified as MSS.
MSI cases were screened for germline mutations in the mismatch repair (MMR) system genes MLH1, MSH2 and MSH6 as previously reported, with minor modifications [35].
Sporadic MSI cases were identified by determining the methylation status of the MLH1 gene promoter, as well as assessing the V600E mutation in the BRAF gene. The methods used were described previously [4].
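As a minimal sketch of the calling rule stated above (two or more unstable markers out of the five-marker Bethesda panel count as MSI; everything else, including MSI-L, counts as MSS), the threshold can be written as follows. The function name and its input checks are illustrative assumptions, not the laboratory procedure itself.

```python
def classify_msi(unstable_markers: int, total_markers: int = 5) -> str:
    """Classify a tumor as 'MSI' (MSI-H) or 'MSS' from the marker count.

    Two or more unstable markers give MSI; zero or one unstable marker
    (stable or MSI-L) is grouped as MSS, as described in the text.
    """
    if not 0 <= unstable_markers <= total_markers:
        raise ValueError("unstable_markers must be between 0 and total_markers")
    return "MSI" if unstable_markers >= 2 else "MSS"
```

For example, a tumor with a single unstable marker (MSI-L) is reported as MSS under this rule.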

Chromosomal Instability. Array Comparative Genomic Hybridization (aCGH)
Array CGH (aCGH) was performed using oligonucleotide microarrays (Roche NimbleGen, Inc., Reykjavik, Iceland) to identify copy number alterations (CNAs) in both age-of-onset subgroups, as described before [10]. The degrees of genomic instability were also described in that same study. Both datasets were deposited in the Gene Expression Omnibus (GEO): LOCRC (GSE108166) and EOCRC (GSE108220).

Statistical Analyses
Continuous variables were expressed as mean values plus/minus standard deviation (SD), and categorical variables were expressed as the number of cases and their percentage. Differences were considered significant when p < 0.05. For associations between colon location and other discrete variables, statistical analyses were performed using Pearson's chi-square (χ²) test for parametric variables, and Fisher's exact test for non-parametric variables. When those features were continuous variables, Student's t-test was used. The SPSS v.11.5 for Windows (SPSS, Inc., Chicago, IL, USA) statistical package was used.
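To make the categorical comparison concrete, the following is a minimal, self-contained sketch of a two-sided Fisher's exact test on a 2x2 table, using only the Python standard library. It is not the SPSS procedure used in the study; the function name is hypothetical, and the two-sided p-value is taken (as an assumption, following the common convention) as the sum of the probabilities of all tables at least as unlikely as the observed one.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x counts in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more probable than the observed one;
    # the small relative tolerance guards against floating-point ties.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)
```

With the significance threshold stated above, a difference between two groups would be reported as significant when the returned value is below 0.05.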
For the CGH analysis, both univariate and multivariate analyses were carried out to identify significant minimum regions. For the univariate analysis, unconditional logistic regression was performed for each candidate region. For the multivariate analysis, each of the regions was tested separately, including other relevant clinical variables. Location was considered a factor in all analyses carried out.
We selected only regions larger than 1 Mb, to avoid any possible bias. This analysis was performed in R Statistical Software [36].
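The size filter mentioned above can be illustrated with a short sketch. The original analysis was run in R; the Python below is only an illustration, and the Segment type, its field names and the assumption that coordinates are given in base pairs are all hypothetical.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    """A candidate altered region (hypothetical representation)."""
    chrom: str
    start: int  # start coordinate in base pairs
    end: int    # end coordinate in base pairs


MIN_REGION_SIZE_BP = 1_000_000  # 1 Mb threshold stated in the text


def keep_large_segments(segments: List[Segment]) -> List[Segment]:
    """Keep only segments strictly larger than 1 Mb, preserving order."""
    return [s for s in segments if s.end - s.start > MIN_REGION_SIZE_BP]
```

A segment of exactly 1 Mb is dropped here because the text specifies regions larger than 1 Mb.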

Conclusions
In our opinion, dividing CRC according to both location and age-of-onset criteria is meaningful, not only because it homogenizes the resulting categories, but also because it facilitates new approaches in those subsets whose molecular basis remains as yet unknown: high-CIN left-sided EOCC, left-sided LOCC with multiple primary neoplasms, or EORC with a familial component and advanced stage at diagnosis. The subclassification also helps to define clinically which tumors are more likely to show certain already described molecular alterations, such as changes affecting NDRG2 or GAB2. Comparison with the consensus molecular classification also confirms the importance of considering both criteria when studying CRC. Once confirmed in larger studies, new therapeutic algorithms could be defined according to the classification by age of onset and tumor location.
Supplementary Materials: Supplementary materials can be found at http://www.mdpi.com/1422-0067/20/4/968/s1. Table S1: Differential chromosomal regions between each age-of-onset group, for each colon location.
Funding: This work was funded by Project PI10/0683 and PI13/0127 and PI16/01650 to J.P, and PI16/01920 to R.G.S, and PI17/01233 to D.G-O from the Spanish Ministry of Health and Consumer Affairs, and co-funded by the European Regional Development Fund (FEDER); and supported by grants R01 CA72851, CA18172, CA184792, and U01 CA187956 from the National Cancer Institute, National Institutes of Health to Ajay Goel.

Figure 1. Frequency plots of copy number gains (above zero, green) and losses (below zero, red) defined for each subgroup. The fraction gained or lost is plotted on the y-axis versus genomic location on the x-axis. (A) Comparison of right-sided cancers in EOCC and LOCC; (B) comparison of left-sided cancers in EOCC and LOCC; (C) comparison of rectal cancers in EORC and LORC. EOCC: Early-onset colon cancer; EORC: Early-onset rectal cancer; LOCC: Late-onset colon cancer; LORC: Late-onset rectal cancer.

Table 1. Comparison between right colon cancers with different ages of onset.

Table 2. Comparison between left colon cancers with different ages of onset.

Table 3. Comparison between rectal cancers with different ages of onset.

Table 4. Summary of the main specific differential chromosomal regions between each age-of-onset group, for each colon location.
EO: Early-onset; LO: Late-onset; Chr: chromosome; Green: gained regions; Red: lost regions. Percentages shown in bold indicate frequencies that are at least twice as high in one age-of-onset group as in the other.