Colorectal cancer (CRC) is a well-established example of a genetic disease, in which the sequential accumulation of multiple genetic alterations underlies development and progression to carcinoma and metastasis. The inactivation of APC
mutations, activation of KRAS
mutations, and the diverse mutations in TP53
, and SMAD4
, TGF-β pathway genes, drive the development and evolution of a malignant CRC [1
A comprehensive investigation of the genomic landscape of the early stages of CRC was reported by The Cancer Genome Atlas (TCGA) Network [2
]. More recently, metastatic CRCs were also analyzed by using MSK-IMPACT, a capture-based next generation sequencing (NGS) platform [3
]. Several studies have performed comparative genetic sequencing of paired primary and metastatic CRC [4
]. The majority have shown a high degree of concordance in the genetic profile between primary and metastatic CRC [4
]. In particular, early recurrent genetic alterations such as APC
, and BRAF
, which are involved in colorectal carcinogenesis, were highly concordant in matched pairs of primary and metastatic CRC [5
Genetic intratumor heterogeneity within tumors often occurs as a result of the progressive accumulation of genetic alterations during the spatial and temporal evolution of the tumor. More recently, several studies have reported that advanced CRCs harbor extensive intratumor heterogeneity, shaped by neutral evolution during tumor evolution [8
]. Furthermore, Saito et al. demonstrated that the evolutionary principle shaping genetic intratumor heterogeneity shifts from Darwinian to neutral evolution during CRC progression [9
A comparison of the genetic profiles of primary and metastatic CRCs is needed to discover useful therapeutic targets against metastatic CRCs. In this study, we performed a targeted next-generation sequencing (NGS) assay of 170 cancer-associated genes on 142 metastatic CRCs, including 95 pairs of primary and metastatic colorectal samples, to reveal their genomic characteristics and to assess the genetic heterogeneity between primary and metastatic CRCs.
2.1. Clinicopathologic Characteristics of Patients
A total of 146 patients were analyzed in this study. For 95 patients, paired primary and metastatic colon cancer samples were analyzed to compare the genetic profiles. An additional 4 primary-only and 47 metastatic-only samples were also analyzed. The clinicopathologic characteristics of the 142 metastatic tumors (95 paired with a primary tumor and 47 singletons) are summarized in Table 1. The median age at diagnosis was 61 years (range, 34–89). The most common sites of metastasis were the liver (54.2%) and the lung (24.7%), followed by the abdominopelvic cavity (12.7%); central nervous system, soft tissue, and bone metastases were rare. In total, 78 (54.9%) and 64 (45.1%) of 142 patients developed synchronous and metachronous metastasis, respectively. Only 4.9% of metastatic CRCs displayed an MSI-H genotype/phenotype.
2.2. Genomic Profiling of 95 Paired Samples
After the filtering steps described in the Methods section to identify clinically significant mutations, a total of 318 mutations in 51 genes, including 243 missense and nonsense single nucleotide variants (SNVs), 52 insertions and deletions (indels), and 23 splice site mutations, were identified in the 95 primary-metastatic tumor pairs. The full list of variants can be found in Table S1. Variants that occurred in more than 3 of the 95 pairs are summarized in Figure 1. Of the 318 mutations, 81% (258/318) were found in both primary and metastatic CRCs, 12% (37/318) were found only in primary CRCs, and 7% (23/318) were found only in metastatic CRCs. Concordance differed by variant type: 85% for SNVs, 73% for indels, and 57% for splice site mutations. The concordance of the variants in the top six genes was 85% on average; the lowest was 61% in SMAD4 and the highest was in FBXW7
The average number of concordant variants per sample was 5.4, and the average number of discordant variants per sample was 0.6. Discordant variants were more often primary-specific (37/60, 62%) than metastatic-specific (23/60, 38%). The portion of concordant mutations was high in the FBXW7, NRAS, PTEN, and BRCA2 genes (100% concordance), whereas the portion of discordant mutations was high in the ERBB2, PIK3R1, TSC1, and VHL genes (concordance ≤40%). The mean concordance of variants in all 95 paired primary and metastatic CRCs was 87%. Concordance was significantly higher in the synchronous group than in the metachronous group (92% vs. 86%, p = 0.0124). The ratio of primary-specific variants was significantly lower in the synchronous group than in the metachronous group (4% vs. 11%, p = 0.0016).
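The concordance figures above follow from classifying each variant as shared, primary-only, or metastatic-only. The sketch below is illustrative only: the per-pair variant labels are hypothetical, the function names are not from the study, and only the cohort-level counts (258, 37, 23) are taken from the text.

```python
def classify_pair(primary_variants, metastatic_variants):
    """Count shared, primary-only and metastatic-only variants for one pair."""
    shared = primary_variants & metastatic_variants
    primary_only = primary_variants - metastatic_variants
    metastatic_only = metastatic_variants - primary_variants
    return len(shared), len(primary_only), len(metastatic_only)


def percent(part, total):
    """Percentage rounded to the nearest integer, as reported in the text."""
    return round(100 * part / total)


# Hypothetical variant calls for a single primary/metastasis pair.
primary = {"APC p.R1450*", "KRAS p.G12D", "TP53 p.R175H"}
metastasis = {"APC p.R1450*", "KRAS p.G12D", "SMAD4 p.R361H"}
pair_counts = classify_pair(primary, metastasis)

# Cohort-level counts reported in the text.
shared, primary_only, metastatic_only = 258, 37, 23
total = shared + primary_only + metastatic_only
discordant = primary_only + metastatic_only

summary = {
    "concordant_pct": percent(shared, total),
    "primary_only_pct": percent(primary_only, total),
    "metastatic_only_pct": percent(metastatic_only, total),
    "primary_share_of_discordant_pct": percent(primary_only, discordant),
    "metastatic_share_of_discordant_pct": percent(metastatic_only, discordant),
}
print(pair_counts, summary)
```

Summing the per-pair counts over all 95 pairs would give the cohort totals used in the second half of the sketch.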
The most frequently mutated gene in primary CRCs was APC (71%, 67/95), followed by TP53 (54%, 52/95), KRAS (45%, 43/95), PIK3CA (16%, 15/95), SMAD4 (15%, 14/95) and FBXW7 (11%, 10/95). In metastatic CRCs, the order was the same as in the primary group: APC (65%, 62/95), followed by TP53 (57%, 54/95), KRAS (44%, 42/95), PIK3CA (19%, 18/95), SMAD4 (14%, 13/95) and FBXW7 (11%, 10/95). The frequency of the top 6 gene mutations was almost the same in primary and metastatic CRCs (Figure 2A). The frequency of mutations was numerically higher in primary tumors than in metastatic tumors in APC, although there was no overall statistical difference in the frequency of mutations between primary and metastatic tumors.
We also compared the variant allele frequency (VAF) of the six most frequently mutated genes between primary and metastatic CRCs (Figure 2
B and Table 2
). Of the six genes, VAFs of TP53
mutations in metastatic CRCs were higher than those of primary CRCs (0.24% ± 0.15% in primary CRCs vs. 0.33% ± 0.22% in metastatic CRCs), which was statistically significant by paired two sample t-test (p
No discordant mutation in FBXW7 was observed, and most of the VAFs in metastatic CRCs were lower than those in primary CRCs (9/11), which suggests that tumor cells harboring an FBXW7 mutation might be subclonal in metastatic CRCs. No obvious differences in the VAFs of APC, KRAS and PIK3CA were detected between primary and metastatic CRCs.
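As a minimal sketch of the paired comparison of VAFs described above, the statistic of a paired t-test can be computed from the per-pair differences. The VAF values below are hypothetical and are not data from this study; the function name is illustrative, and the p-value (which requires the t distribution with n − 1 degrees of freedom) is left to a statistics package.

```python
import math
import statistics


def paired_t_statistic(primary_vaf, metastatic_vaf):
    """Return (t, degrees of freedom) for a paired t-test on matched VAFs."""
    if len(primary_vaf) != len(metastatic_vaf) or len(primary_vaf) < 2:
        raise ValueError("need at least two matched pairs of equal length")
    # Difference within each pair: metastasis minus primary.
    diffs = [m - p for p, m in zip(primary_vaf, metastatic_vaf)]
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample standard deviation).
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff / std_err, len(diffs) - 1


# Hypothetical VAFs (as fractions) for the same mutation in four matched pairs.
primary_vaf = [0.20, 0.30, 0.10, 0.40]
metastatic_vaf = [0.30, 0.35, 0.20, 0.50]
t_value, dof = paired_t_statistic(primary_vaf, metastatic_vaf)
print(f"t = {t_value:.2f}, df = {dof}")
```

A positive t value here indicates higher VAFs in the metastatic samples; the pairing matters because each metastasis is compared only with its own primary tumor.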
Mutational counts per sample did not show significant differences according to the patterns of metastasis (synchronous vs. metachronous), location of metastasis (liver vs. others) and MSI status.
2.3. Genomic Profiling of All 142 Samples Including Primary/Metastatic Singletons
When all samples were analyzed, including the unpaired primary-only and metastatic-only samples in their respective groups, the top six genes were the same in both groups (Figure 3
). The variants in APC
were mostly of the truncating type in the metastatic group: 72% nonsense SNVs, 26% indels and 2% splice site mutations. For TP53, 66% were missense SNVs, including those encoding p.R175H (16%), p.R248Q/W (8%) or p.R273H/S (5%) altered proteins, and 34% were of the truncating type, including 22% nonsense SNVs, 8% indels and 4% splice site mutations, in the metastatic group. All of the variants of the two oncogenes, KRAS
occurred in mutational hotspots. The mutational frequencies of all genes other than the top six were less than 5%, and the rank of the mutational frequencies showed no significant difference. The genes with variants exclusively in the metastatic group were AKT1
, and MSH6
. These genes are probable tumor suppressor genes according to the Cancer Gene Census [10
], except for AKT1
Several clinicopathological factors differed significantly by mutational status (Table 3
). The MSI-H ratio was higher in the SMAD4
mutated group. The frequency of TP53
mutations was higher in synchronous metastasis than in metachronous metastasis. Overall survival did not differ according to the mutational status of the top six genes. Recurrence occurred significantly earlier in patients with FBXW7
mutant (137 days in FBXW7
mutant vs. 294 days in wild type, p
2.4. Comparing with Public Data: 99 Primary CRCs vs. TCGA and 142 Metastatic CRCs vs. MSKCC
We compared our data with public cancer datasets from cBioPortal (https://www.cbioportal.org/datasets, accessed on 19 May 2021): ‘Colorectal Adenocarcinoma (TCGA, Firehose Legacy)’ (TCGA, n = 223) and ‘Metastatic Colorectal Cancer (MSKCC, Cancer Cell 2018)’ (MSKCC, n = 1134) served as the primary and metastatic controls, respectively. The top 20 genes of each group are listed in Table 4
. Of the top 20 genes in TCGA or MSKCC, AMER1
were not included in our cancer panel. TCGA, a dataset of genetic profiles of primary CRCs, and MSKCC, a dataset of genetic profiles of metastatic CRCs, were compared with the results of this study (Figure 4
When the two public datasets were compared, the genes with a significantly lower mutational frequency in MSKCC than in TCGA were AMER1 (9% in TCGA vs. 4% in MSKCC, p = 0.0326) and NRAS (8% vs. 3%, p = 0.0463). The genes with a significantly higher mutational frequency in MSKCC than in TCGA were TP53 (53% vs. 72%, p < 0.0001), PIK3CA (13% vs. 19%, p = 0.0188), and SOX9 (4% vs. 9%, p = 0.0088). Among the genes that differed significantly between the two public datasets, NRAS, TP53, and PIK3CA showed similar mutational frequencies between the primary and metastatic CRCs of the present study.
When the primary group of the present study was compared with TCGA, the genes showing a difference in variant rate of at least 5 percentage points were SMAD4 (15% vs. 10%, p = 0.1995), FBXW7 (10% vs. 15%, p = 0.2225), and BRAF (2% vs. 9%, p = 0.0022). Between the metastatic group of the present study and MSKCC, APC (63% vs. 73%, p = 0.0129), TP53 (51% vs. 72%, p < 0.0001), PIK3CA (14% vs. 19%, p = 0.1209), and BRAF (3% vs. 11%, p < 0.0001) showed differences of at least 5 percentage points.
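Comparisons of mutation frequency between two cohorts, such as those above, are typically run on a 2 × 2 table of mutated and wild-type counts. This excerpt does not state which test produced the reported p-values, so the sketch below assumes a two-sided Fisher's exact test purely for illustration; the function name and the counts in the example are hypothetical and are not taken from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are cohorts, columns are mutated / wild-type counts.
    """
    row1 = a + b
    col1 = a + c
    n = a + b + c + d
    denom = comb(n, row1)

    def prob(k):
        # Hypergeometric probability of k mutated samples in the first cohort.
        return comb(col1, k) * comb(n - col1, row1 - k) / denom

    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p_value = sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


# Hypothetical counts: 8 mutated / 2 wild-type in one cohort, 1 / 5 in the other.
p_example = fisher_exact_two_sided(8, 2, 1, 5)
print(f"p = {p_example:.4f}")
```

An exact test is a reasonable default when some cells are small, as happens for rarely mutated genes; for large cohorts a chi-squared test gives similar results.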
APC mutations were mostly of the truncating type in MSKCC (1268/1217 mutations, 99.8%), as in the present study. In terms of variant types, the indel ratio was significantly higher in MSKCC than in our metastatic group (41% vs. 26%, p = 0.0004). For TP53 mutations, the ratio of missense SNVs was 66% in both the metastatic group of the present study and MSKCC, and the nonsense SNV ratio was higher in the metastatic group of the present study than in MSKCC (22% vs. 14%). However, the indel ratio was significantly lower in the metastatic group of the present study than in MSKCC (8% vs. 15%, p = 0.0488). PIK3CA mutations at p.H1047 accounted for 36% of PIK3CA variants in the metastatic group of the present study and 17% in MSKCC, and those at p.E545 accounted for 32% and 27%, respectively. Ninety-five percent of PIK3CA variants were found at amino acid positions 1047, 542, 545 and 546 in the metastatic group of the present study, whereas 32% of variants were found at other positions in MSKCC. The BRAF V600E mutation was found in only two patients of the present study.
We performed targeted NGS of 170 cancer-associated genes in 142 metastatic CRCs, including 95 pairs of primary and metastatic CRCs to define the mutational concordance of these genes in primary and metastatic CRCs. Furthermore, we compared our data with the public cancer datasets, TCGA (n =
] and MSKCC (n =
]. Previous comparative sequencing studies of primary and metastatic CRCs compared their results only with TCGA data [6
]. TCGA analyzed only the primary CRCs, the majority of which were derived from the early stages of CRCs [2
]. Meanwhile, our study cohort, which consists of all patients with metastatic CRCs, and the MSKCC CRC cohort, which also had more aggressive and advanced CRCs, were distinct from the TCGA cohort [3
]. Genomic analysis of the MSKCC cohort thus provided insights into metastatic CRCs that were not evident in the TCGA CRC cohort. Therefore, we compared the genetic profiles of the primary CRCs in our study with TCGA and those of the metastatic CRCs in our study with MSKCC. Our data are significant in comparing primary and metastatic CRCs in pairs, as well as in comparing them, under the same conditions, with two large public datasets, TCGA and MSKCC, which are representative of primary and metastatic CRCs, respectively. To the authors’ knowledge, this is the first study to compare the genetic profile of such a cohort with that of the metastatic CRCs from the MSKCC dataset.
We identified that the frequency of recurrent mutations of APC
, and SMAD4
were consistent with previous reports on metastatic CRCs [3
]. Overall concordance of the clinically significant mutations between primary and metastatic CRCs was 81%. This concordance rose to 85% for the six most recurrently mutated genes, which are known CRC driver genes. These results are consistent with those of previous comparative sequencing studies [3
]. In MSKCC, a high level of genomic concordance was also identified between the primary and metastatic CRCs [3
]. Therefore, there was a low degree of genetic heterogeneity between primary and metastatic CRCs with respect to CRC driver mutations. This supports the view that the main driver alterations involved in colorectal carcinogenesis are maintained during the evolution of tumor metastasis. Among the driver genes, the concordance was high in KRAS
(90%), which are early or universal mutations in CRCs. On the other hand, the concordance was relatively low in PIK3CA
(61%), which are known as later mutations. This observation was consistent with prior studies [11
] and could be explained through heterogeneous clonal evolution [15
Especially, our result showed that with KRAS
(100%), there was very high concordance between primary and metastatic CRCs. In a meta-analysis of all studies published between 1991 and 2018 that reported on biomarker concordance between primary and metastatic CRCs, a very high median biomarker concordance for KRAS
(93%) was reported, whereas meta-analytic pooled discordance was 8% for KRAS
, 8% for BRAF
, and 7% for PIK3CA
in 61 studies, including 3565 patient samples [16
are OncoKB level-1 resistance biomarkers for anti-EGFR (epidermal growth factor receptor) antibodies (cetuximab and panitumumab), mutational testing of these genes has now been incorporated into the National Comprehensive Cancer Network (NCCN) guidelines for the treatment of patients with metastatic CRCs [17
]. These results are also consistent with the recommendation that molecular analysis of the primary tumor is representative of the genetic characteristic of the metastatic tumor [17
]. Therefore, when adequate samples from metastatic sites are difficult to obtain, genetic investigation of archived primary tumor samples appears to be sufficient for the application of cancer precision medicine in the metastatic setting.
Concordance between primary and metastatic CRCs was significantly higher in the synchronous group than in the metachronous group. Synchronous metastatic tumors were mainly treatment-naïve, whereas the majority of patients with metachronous metastatic tumors had received systemic treatment such as chemotherapy for the primary tumor, which may influence the mutational concordance between the primary and the subsequent tumor.
The genes with a significantly lower mutational frequency in the metastatic group of the present study than in MSKCC were TP53 and APC, whereas no genes showed a significant difference in mutational frequency between the primary group of the present study and TCGA. Considering the distribution of variant types, the MSKCC group showed a significantly higher proportion of indels among the total variants than the present study and TCGA (35% for MSKCC vs. 14% for the present study and TCGA, p < 0.0001). TP53 and APC are among the most representative tumor suppressor genes, which commonly lose their function through deletion mutations. In the MSKCC group, a more efficient indel-calling algorithm may have detected more indels than in the present study, which could account for the difference in the mutational frequencies of these two genes commonly altered by deletion mutations.
alterations in the MSKCC group, the genetic profiles of metastatic CRCs, were the only genomic alteration significantly enriched in metastatic CRCs, which shows TP53
alterations are selectively enriched in metastatic CRCs [3
]. In our study, VAFs of TP53
mutations in metastatic CRCs were significantly higher than those in the primary CRCs, although there was no overall statistical difference in the frequency of mutations between paired primary and metastatic CRCs. This result suggests that the clones with the TP53
mutation might be expanded through sustained tumor growth and metastasis or an additional genetic hit, resulting in loss of heterozygosity.
, a downstream regulator in the TGF-β signaling pathway in CRC, has been highlighted. In particular, inactivation of SMAD4
has been associated with late stage or metastatic CRCs [18
]. Recent work has also highlighted that SMAD4
downregulation may occur in up to 60% of patients with metastatic CRC, which is significantly higher than the incidence of SMAD4
]. However, there was no statistical difference in the frequency of SMAD4
mutations between primary and metastatic CRCs in our study.
is a tumor suppressor gene and the frequency of mutation in CRCs has been reported at 6–10% [20
]. It was recently reported that FBXW7
mutations were associated with significantly worse survival in metastatic CRCs [23
]. In our study, recurrence occurred significantly earlier in patients with FBXW7 mutations than in patients with wild-type FBXW7, although overall survival (OS) did not differ according to FBXW7 mutational status.
There was a lower prevalence of BRAF
mutations in both the primary (2%) and metastatic (3%) CRCs of the present study than in the TCGA (9%) and MSKCC (11%) data. According to a previous review, the prevalence of BRAF
-mutated CRC is lower in Eastern Asian countries (0.7–11.4%) than in Western countries (3.7–20.6%) [24
Only 4.9% of metastatic CRCs displayed an MSI-H genotype/phenotype, a frequency that is lower than that reported for primary CRCs but similar to the MSK cohort (4%) [3
]. These findings could be explained by the lower tendency of MSI-H genotype/phenotype tumors to metastasize [25
One limitation of this study is that we could not analyze differences in the frequency and concordance rate of gene mutations between primary and metastatic CRCs according to MSI status, because the number of tumors with an MSI-H genotype/phenotype was too small. Future studies with a large cohort of MSI-H metastatic CRCs are therefore needed to further investigate the differences in frequency and concordance between the MSS and MSI-H groups. A second limitation is that our study examined only changes in DNA. Associations with clinicopathological factors that are not well explained here may be explained by other aspects of tumor biology, such as methylation and the microenvironment.