High Concordance of Genomic Profiles between Primary and Metastatic Colorectal Cancer

Comparing the genetic profiles of primary and metastatic colorectal cancer (CRC) is needed to enable the discovery of useful therapeutic targets against metastatic CRC. We performed a targeted next-generation sequencing assay of 170 cancer-associated genes in 142 metastatic CRCs, including 95 pairs of primary and metastatic CRCs, to reveal their genomic characteristics and to assess genetic heterogeneity. The most frequently mutated genes in primary and metastatic CRCs were APC (71% vs. 65%), TP53 (54% vs. 57%), KRAS (45% vs. 44%), PIK3CA (16% vs. 19%), SMAD4 (15% vs. 14%) and FBXW7 (11% vs. 11%). The concordance in these six most frequently mutated genes was 85% on average. The overall mutation frequencies were consistent with two public datasets (TCGA and MSKCC). To the authors' knowledge, this is the first study to compare the genetic profiles of our cohort with those of the metastatic CRCs from MSKCC. Comparative sequencing analysis between primary and metastatic CRCs revealed a high degree of genetic concordance in the current clinically actionable genes. Therefore, given the challenges of obtaining adequate samples from metastatic sites, genetic investigation of archived primary tumor samples appears to be sufficient for the application of cancer precision medicine in the metastatic setting.


Introduction
Colorectal cancer (CRC) is one of the best-established examples of a tumor type regarded as a genetic disease, in which the sequential accumulation of multiple genetic alterations underlies development and progression to carcinoma and metastasis. Inactivating mutations of APC, activating mutations of KRAS, and diverse mutations in TP53, PIK3CA, and SMAD4, a TGF-β pathway gene, drive the development and evolution of malignant CRC [1].
A comprehensive investigation of the genomic landscape of early-stage CRC was reported by The Cancer Genome Atlas (TCGA) Network [2]. More recently, metastatic CRCs were also analyzed using MSK-IMPACT, a capture-based next-generation sequencing (NGS) platform [3]. Several studies have performed comparative genetic sequencing of paired primary and metastatic CRC [4][5][6][7]. The majority have shown a high degree of concordance in the genetic profile between primary and metastatic CRC [4][5][6]. In particular, early recurrent genetic alterations involved in colorectal carcinogenesis, such as those in APC, KRAS, NRAS, and BRAF, were highly concordant in matched pairs of primary and metastatic CRC [5,6].
Genetic intratumor heterogeneity within tumors often occurs as a result of the progressive accumulation of genetic alterations during the spatial and temporal evolution of the tumor. More recently, several studies have reported that advanced CRCs harbor extensive intratumor heterogeneity, shaped by neutral evolution during tumor evolution [8,9]. Furthermore, Saito et al. demonstrated that the evolutionary principle shaping

Genomic Profiling of 95 Paired Samples
After the filtering steps described in the Methods section to identify clinically significant mutations, a total of 318 mutations in 51 genes, comprising 243 missense and nonsense single nucleotide variants (SNVs), 52 insertions and deletions (indels) and 23 splice site mutations, were identified in the 95 primary-metastatic tumor pairs. The full list of variants can be found in Table S1. Variants that occurred in more than 3 of the 95 pairs are summarized in Figure 1. Of the 318 mutations, 81% (258/318) were found in both primary and metastatic CRCs, 12% (37/318) were found only in primary CRCs, and 7% (23/318) were found only in metastatic CRCs. Concordance differed by variant type: 85% for SNVs, 73% for indels and 57% for splice site mutations. The concordance of the variants in the top six genes was 85% on average, ranging from 61% for SMAD4 (the lowest) to 100% for FBXW7 (the highest).
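The shared and private fractions reported above follow directly from the three counts. The short Python sketch below reproduces that arithmetic and shows one way such counts could be derived from per-pair variant sets. The helper names (`percent`, `split_variants`) and the use of plain variant keys are illustrative assumptions, not the pipeline used in this study.

```python
# Counts reported in the text for the 95 primary-metastatic pairs.
shared = 258           # variants found in both primary and metastatic tumors
primary_only = 37      # variants found only in the primary tumor
metastatic_only = 23   # variants found only in the metastatic tumor
total = shared + primary_only + metastatic_only


def percent(count: int, denominator: int) -> int:
    """Return count/denominator as a percentage rounded to the nearest integer."""
    return round(100 * count / denominator)


def split_variants(primary: set, metastatic: set) -> tuple:
    """Count shared, primary-only and metastatic-only variants for one pair.

    Each set holds hashable variant keys (hypothetical here, e.g. a string
    combining gene and protein change).
    """
    return (
        len(primary & metastatic),
        len(primary - metastatic),
        len(metastatic - primary),
    )


if __name__ == "__main__":
    print("total variants:", total)
    print("shared %:", percent(shared, total))
    print("primary-only %:", percent(primary_only, total))
    print("metastatic-only %:", percent(metastatic_only, total))
```

Summing the per-pair tuples returned by `split_variants` over all pairs would give the cohort-level counts used in the percentages.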
We also compared the variant allele frequency (VAF) of the six most frequently mutated genes between primary and metastatic CRCs (Figure 2B and Table 2). Of the six genes, the VAFs of TP53 mutations in metastatic CRCs were higher than those in primary CRCs (0.24% ± 0.15% in primary CRCs vs. 0.33% ± 0.22% in metastatic CRCs), a statistically significant difference by paired two-sample t-test (p = 0.0024). No discordant mutation in FBXW7 was observed, and most of the VAFs in metastatic CRCs were lower than those in primary CRCs (9/11), which suggests that tumor cells harboring an FBXW7 mutation might be subclonal in metastatic CRCs. No obvious differences in the VAFs of APC, KRAS and PIK3CA were detected between primary and metastatic CRCs. Mutational counts per sample did not differ significantly according to the pattern of metastasis (synchronous vs. metachronous), location of metastasis (liver vs. others) or MSI status.

Genomic Profiling of All 142 Samples Including Primary/Metastatic Singletons
When all samples were analyzed, including the unpaired primary-only and metastatic-only samples in the primary and metastatic groups, respectively, the top six genes were the same in both groups (Figure 3). The variants in APC were mostly of the truncating type: 72% nonsense SNVs, 26% indels and 2% splice site mutations in the metastatic group. For TP53 in the metastatic group, 66% were missense SNVs, encoding p.R175H (16%), p.R248Q/W (8%) or p.R273H/S (5%) altered proteins, and 34% were of the truncating type, including 22% nonsense SNVs, 8% indels and 4% splice site mutations. All of the variants of the two oncogenes, KRAS and PIK3CA, occurred in mutational hotspots. The mutational frequencies of all genes other than the top six were less than 5%, and the rank of the mutational frequencies showed no significant difference. The genes with variants exclusively in the metastatic group were AKT1, CDH1, BAP1, DNMT3A, FOXL2, and MSH6. Except for AKT1, these genes are probable tumor suppressor genes according to the Cancer Gene Census [10].
Several clinicopathological factors differed significantly according to mutational status (Table 3). The MSI-H ratio was higher in the SMAD4-mutated group. TP53 mutations were more frequent in synchronous than in metachronous metastasis. Overall survival did not differ according to the mutational status of the top six genes. Recurrence occurred significantly earlier in patients with FBXW7 mutations (137 days in FBXW7 mutant vs. 294 days in wild type, p = 0.0221).
We compared our data with public cancer datasets from cBioPortal (Table 4). Of the top 20 genes in TCGA or MSKCC, AMER1, SOX9, ARID1A and TCF7L2 were not included in our cancer panel. The TCGA dataset, which contains genetic profiles of primary CRCs, and the MSKCC dataset, which contains those of metastatic CRCs, were compared with the results of this study (Figure 4). The top six genes were common to all groups, although their order differed in TCGA. When the two public datasets were compared with each other, the genes with significantly lower mutational frequency in MSKCC than in TCGA were AMER1 (9% in TCGA vs. 4% in MSKCC, p = 0.0326) and NRAS (8% vs. 3%, p = 0.0463). The genes with significantly higher mutational frequency in MSKCC than in TCGA were TP53 (53% in TCGA vs. 72% in MSKCC, p < 0.0001), PIK3CA (13% vs. 19%, p = 0.0188), and SOX9 (4% vs. 9%, p = 0.0088). Of these, the genes covered by our panel were NRAS, TP53, and PIK3CA, and they showed similar mutational frequencies between the primary and metastatic CRCs of the present study.
APC mutations were mostly of the truncating type in MSKCC (1268/1217 mutations, 99.8%), as in the present data. Among APC variants, the proportion of indels was significantly higher in MSKCC than in our metastatic group (41% vs. 26%, p = 0.0004). For TP53 mutations, the proportion of missense SNVs was 66% in both the metastatic group of the present study and MSKCC, and the proportion of nonsense SNVs was higher in the metastatic group of the present study than in MSKCC (22% vs. 14%). However, the proportion of indels was significantly lower in the metastatic group of the present study than in MSKCC (8% vs. 15%, p = 0.0488). PIK3CA mutations at p.H1047 accounted for 36% of PIK3CA variants in the metastatic group of the present study and 17% in MSKCC, and those at p.E545 accounted for 32% vs. 27%. Ninety-five percent of PIK3CA variants were found at amino acid positions 1047, 542, 545 and 546 in the metastatic group of the present study, whereas 32% of variants were found at other positions in MSKCC. The BRAF V600E mutation was found in only two patients in the present study.

Discussion
We performed targeted NGS of 170 cancer-associated genes in 142 metastatic CRCs, including 95 pairs of primary and metastatic CRCs, to define the mutational concordance of these genes between primary and metastatic CRCs. Furthermore, we compared our data with the public cancer datasets TCGA (n = 223) [2] and MSKCC (n = 1134) [3]. Previous comparative sequencing studies of primary and metastatic CRCs compared their results only with TCGA data [6,11]. TCGA analyzed only primary CRCs, the majority of which were derived from early-stage CRCs [2]. In contrast, our study cohort, which consists entirely of patients with metastatic CRCs, and the MSKCC CRC cohort, which also included more aggressive and advanced CRCs, were distinct from the TCGA cohort [3]. Genomic analysis in the MSKCC cohort thus provided insights into metastatic CRCs that were not evident in the TCGA CRC cohort. Therefore, we compared the genetic profiles of the primary CRCs in our study with TCGA and those of the metastatic CRCs in our study with MSKCC. Our data are significant in that they compare primary and metastatic CRCs in pairs and also compare two large public datasets, TCGA and MSKCC, representative of primary and metastatic CRCs, under the same conditions. To the authors' knowledge, this is the first study to compare the genetic profile of our cohort with that of the metastatic CRCs from the MSKCC dataset.
The frequencies of recurrent mutations in APC, KRAS, PIK3CA, FBXW7, and SMAD4 were consistent with previous reports on metastatic CRCs [3,12,13]. The overall concordance of clinically significant mutations between primary and metastatic CRCs was 81%, and it rose to 85% for the six most recurrently mutated genes, which are known CRC driver genes. These results are consistent with those of previous comparative sequencing studies [3,5,14]. In MSKCC, a high level of genomic concordance was also identified between primary and metastatic CRCs [3]. These findings therefore indicate a low degree of genetic heterogeneity between primary and metastatic CRCs with respect to CRC driver mutations, which supports the view that the main driver alterations involved in colorectal carcinogenesis are maintained during the evolution of tumor metastasis. Among the driver genes, concordance was high for KRAS (98%), NRAS (100%) and APC (90%), which are early or universal mutations in CRCs. On the other hand, concordance was relatively low for PIK3CA (70%) and SMAD4 (61%), which are known as later mutations. This observation was consistent with prior studies [11] and could be explained by heterogeneous clonal evolution [15].
In particular, our results showed very high concordance between primary and metastatic CRCs for KRAS (98%) and NRAS (100%). A meta-analysis of all studies published between 1991 and 2018 on biomarker concordance between primary and metastatic CRCs reported a very high median concordance for KRAS (93%), NRAS (100%), BRAF (99.4%) and PIK3CA (93%), whereas the meta-analytic pooled discordance was 8% for KRAS, 8% for BRAF, and 7% for PIK3CA in 61 studies including 3565 patient samples [16]. KRAS and NRAS are OncoKB level-1 resistance biomarkers for anti-EGFR (epidermal growth factor receptor) antibodies (cetuximab and panitumumab), and mutational testing of these genes has now been incorporated into the National Comprehensive Cancer Network (NCCN) guidelines for the treatment of patients with metastatic CRCs [17]. Our results are also consistent with the recommendation that molecular analysis of the primary tumor is representative of the genetic characteristics of the metastatic tumor [17]. Therefore, given the challenges of obtaining adequate samples from metastatic sites, genetic investigation of archived primary tumor samples appears to be sufficient for the application of cancer precision medicine in the metastatic setting.
Concordance between primary and metastatic CRCs was significantly higher in the synchronous group than in the metachronous group. Synchronous metastatic tumors were mainly treatment-naïve, whereas the majority of patients with metachronous metastatic tumors had received systemic treatment such as chemotherapy for the primary tumor, which may have influenced the mutational concordance between the primary and the subsequent tumor.
The genes with significantly lower mutational frequency in the metastatic group of the present study than in MSKCC were TP53 and APC, whereas no genes showed a significant difference in mutational frequency between the primary group of the present study and TCGA. Considering the distribution of variant types, the MSKCC group showed a significantly higher proportion of indels among the total variants than the present study and TCGA (35% for MSKCC vs. 14% for the present study and TCGA, p < 0.0001). TP53 and APC are the most representative tumor suppressor genes that commonly lose their function through deletion mutations. A more efficient indel-calling algorithm in the MSKCC group may have detected more indels than in the present study, which could account for the difference in the mutational frequencies of these two genes.
In the MSKCC cohort of metastatic CRCs, TP53 alterations were the only genomic alterations significantly enriched in metastatic CRCs, which suggests that TP53 alterations are selectively enriched in metastatic disease [3]. In our study, the VAFs of TP53 mutations in metastatic CRCs were significantly higher than those in the primary CRCs, although there was no overall statistical difference in mutation frequency between paired primary and metastatic CRCs. This result could be interpreted to mean that clones with the TP53 mutation expand through sustained tumor growth and metastasis, or acquire an additional genetic hit resulting in loss of heterozygosity.
The role of SMAD4, a downstream regulator in the TGF-β signaling pathway, has been highlighted in CRC. In particular, inactivation of SMAD4 has been associated with late-stage or metastatic CRCs [18]. Recent work has also shown that SMAD4 downregulation may occur in up to 60% of patients with metastatic CRC, which is considerably higher than the incidence of SMAD4 mutations [19]. However, there was no statistical difference in the frequency of SMAD4 mutations between primary and metastatic CRCs in our study.
FBXW7 is a tumor suppressor gene, and its mutation frequency in CRCs has been reported as 6-10% [20][21][22]. It was recently reported that FBXW7 mutations were associated with significantly worse survival in metastatic CRCs [23]. In our study, recurrence occurred significantly earlier in patients with FBXW7-mutant tumors than in patients with wild-type tumors, although overall survival (OS) did not differ according to FBXW7 mutational status.
BRAF mutations were less prevalent in both the primary (2%) and metastatic (3%) CRCs of the present study than in the TCGA (9%) and MSKCC (11%) data. According to a previous review, the prevalence of BRAF-mutated CRC is lower in Eastern Asian countries (0.7-11.4%) than in Western countries (3.7-20.6%) [24].
Only 4.9% of metastatic CRCs displayed an MSI-H genotype/phenotype, a frequency that is lower than that reported for primary CRCs but similar to the MSK cohort (4%) [3]. These findings could be explained by the lower tendency of MSI-H genotype/phenotype tumors to metastasize [25].
One limitation of this study is that we could not analyze differences in the frequency and concordance rate of gene mutations between primary and metastatic CRCs according to MSI status, because the number of MSI-H genotype/phenotype tumors was too small. Future studies with a large cohort of MSI-H metastatic CRCs are therefore needed to further investigate differences in frequency and concordance between the MSS and MSI-H groups. A second limitation is that our study examined only changes in DNA. Associations with clinicopathological factors that are not well explained here may be explained by other aspects of tumor biology, such as methylation and the microenvironment.

Patients
A total of 142 patients with CRC and distant metastasis were enrolled in this study. For ninety-five cases, both primary and metastatic tumor tissues were available. An additional 47 metastatic-only samples were included for mutational profiling of metastatic tumors, and 4 primary-only samples were used as a control for metastatic tumors. H&E-stained slides were reviewed by pathologists, and samples with tumor cellularity ≥50% were included to allow confident variant detection. Clinicopathological information, including age, sex, smoking history, and stage of cancer, was obtained retrospectively by reviewing medical records. Synchronous metastasis was defined as metastatic disease present at the time of, or within 6 months of, the original diagnosis of CRC. Metachronous metastasis was defined as the absence of metastatic disease at the time of the initial diagnosis, with metastatic disease developing later than 6 months after the original diagnosis. The study protocol was approved by the institutional review board of Konkuk University Medical Center (KUH1210052), and written informed consent was obtained from all patients.
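The synchronous and metachronous definitions above amount to a simple date rule. The sketch below is one possible reading of it, assuming that "6 months" means six calendar months and that a metastasis detected exactly on the cutoff date counts as synchronous. Both assumptions, and the function names, are illustrative and are not stated in the protocol.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date by whole calendar months, clamping the day to the month end."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def metastasis_timing(diagnosis: date, metastasis: date) -> str:
    """Label a metastasis as synchronous or metachronous.

    Assumption: the 6-month window is measured in calendar months and the
    cutoff date itself is inclusive.
    """
    cutoff = add_months(diagnosis, 6)
    return "synchronous" if metastasis <= cutoff else "metachronous"


if __name__ == "__main__":
    # Hypothetical dates, for illustration only.
    print(metastasis_timing(date(2015, 1, 10), date(2015, 3, 2)))
    print(metastasis_timing(date(2015, 1, 10), date(2016, 5, 20)))
```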
To identify confident and clinically significant variants, we applied the following filtering criteria: (1) protein-altering variants, including missense and nonsense SNVs, splice site mutations and indels; (2) rare variants, with a minor allele frequency (MAF) ≤0.1% in both the gnomAD total and East Asian populations; and (3) probable pathogenic mutations, with a ClinVar significance of 'Pathogenic', 'Likely pathogenic', or 'Drug response', or an OncoKB annotation of 'Predicted oncogenic', 'Likely oncogenic', or 'Oncogenic'.
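The three criteria above can be expressed as a single predicate applied to each annotated variant. The following is a minimal sketch under stated assumptions: the `Variant` record, its field names and the consequence labels are hypothetical, the MAF threshold is applied to both gnomAD populations, and criterion (3) is satisfied by either a ClinVar or an OncoKB label. It is not the annotation pipeline used in this study.

```python
from dataclasses import dataclass
from typing import Optional

# Labels taken from the filtering criteria in the text.
PROTEIN_ALTERING = {"missense", "nonsense", "splice_site", "indel"}
CLINVAR_PASS = {"Pathogenic", "Likely pathogenic", "Drug response"}
ONCOKB_PASS = {"Predicted oncogenic", "Likely oncogenic", "Oncogenic"}
MAX_MAF = 0.001  # 0.1% expressed as a fraction


@dataclass
class Variant:
    """Hypothetical annotated variant record (field names are assumptions)."""
    gene: str
    consequence: str
    gnomad_maf_total: float
    gnomad_maf_east_asian: float
    clinvar: Optional[str] = None
    oncokb: Optional[str] = None


def passes_filter(v: Variant) -> bool:
    """Return True if the variant meets all three criteria."""
    protein_altering = v.consequence in PROTEIN_ALTERING
    rare = v.gnomad_maf_total <= MAX_MAF and v.gnomad_maf_east_asian <= MAX_MAF
    pathogenic = v.clinvar in CLINVAR_PASS or v.oncokb in ONCOKB_PASS
    return protein_altering and rare and pathogenic


if __name__ == "__main__":
    # Illustrative record only; not a variant from the study data.
    example = Variant("KRAS", "missense", 0.0, 0.0, clinvar="Pathogenic")
    print(example.gene, passes_filter(example))
```

Keeping the label sets as module-level constants makes it straightforward to apply the same rule to public datasets, so that frequencies are compared under identical conditions.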

Comparing with Public Data
To compare the mutational frequencies of primary and metastatic tumors with public data, we downloaded files from cBioPortal. To compare the public data with our results under the same conditions, ClinVar and OncoKB annotations were performed for the two datasets, and the filtering steps based on the criteria described above were applied.

Molecular Findings for Microsatellite Instability (MSI)
We performed an MSI analysis on paraffin-embedded tissues to evaluate MSI status. The MSI status of the tumor samples was determined by using the five-marker Bethesda panel (BAT25, BAT26, D5S346, D2S123 and D17S250) [31]. Polymerase chain reaction (PCR) products were run on a Qsep 100 DNA fragment analyzer (Bioptic Inc., Taiwan, China) and analyzed using Qsep 100 viewer (Bioptic Inc., Taiwan, China). Microsatellite instability was defined by the presence of different sized alleles in tumor DNA compared with the matched normal DNA sample. We classified the results into microsatellite instability-high (MSI-H), microsatellite instability-low (MSI-L) and microsatellite stable (MSS) in tumors according to Bethesda guidelines [32].
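The three-way classification can be summarized as a count of unstable markers. The sketch below assumes the commonly cited Bethesda thresholds (two or more unstable markers out of five for MSI-H, one for MSI-L, none for MSS); the thresholds and the function name are stated here as assumptions, since the text above refers to the guidelines without listing the cutoffs.

```python
from typing import Iterable

# The five-marker Bethesda panel named in the text.
BETHESDA_MARKERS = ("BAT25", "BAT26", "D5S346", "D2S123", "D17S250")


def classify_msi(unstable_markers: Iterable[str]) -> str:
    """Classify MSI status from the markers showing instability.

    Assumed thresholds: >= 2 unstable markers is MSI-H, exactly 1 is MSI-L,
    and 0 is MSS.
    """
    unstable = set(unstable_markers)
    unknown = unstable - set(BETHESDA_MARKERS)
    if unknown:
        raise ValueError(f"markers not in the Bethesda panel: {sorted(unknown)}")
    if len(unstable) >= 2:
        return "MSI-H"
    if len(unstable) == 1:
        return "MSI-L"
    return "MSS"


if __name__ == "__main__":
    # Hypothetical marker calls for three tumors.
    print(classify_msi(["BAT25", "BAT26", "D2S123"]))
    print(classify_msi(["D5S346"]))
    print(classify_msi([]))
```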

Statistical Analysis
The association between mutational status and clinicopathological features was analyzed with the χ2 test or Fisher's exact test. A paired two-sample t-test was used to compare the VAFs of the 95 pairs of primary and metastatic tumors. A two-sample proportion test was used to compare mutational frequencies between groups. Overall survival (OS) was the primary endpoint of this study and was calculated from the date of surgery until the date of death. The Kaplan−Meier method was used to estimate OS. A p-value of less than 0.05 was considered to indicate a statistically significant difference. All analyses were carried out using Rex software version 3.0.3 (RexSoft Inc., Seoul, Korea).
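For readers who wish to reproduce comparisons of mutational frequencies between groups, a two-sample proportion test can be sketched with the Python standard library alone. The version below is a pooled two-sided z-test without continuity correction. This is an assumption: the exact variant implemented in the Rex software is not specified here, and a continuity-corrected test would give slightly larger p-values. The example counts are hypothetical, back-calculated from rounded percentages and cohort sizes rather than taken from the filtered data.

```python
import math


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> tuple:
    """Pooled two-sided z-test for the difference between two proportions.

    x1, x2 are the numbers of mutated samples; n1, n2 are the group sizes.
    Returns (z statistic, two-sided p-value).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: about 53% of 223 samples vs. about 72% of 1134 samples.
    z, p = two_proportion_z_test(118, 223, 816, 1134)
    print(f"z = {z:.2f}, p = {p:.2g}")
```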

Conclusions
In conclusion, our data are significant in that they compare primary and metastatic CRCs in pairs and also compare two large public datasets, TCGA and MSKCC, representative of primary and metastatic CRCs, under the same conditions. To the authors' knowledge, this is the first study to compare the genetic profiles of our cohort with those of the metastatic CRCs from the MSKCC dataset. Comparative sequencing analysis between primary and metastatic CRCs revealed a high degree of genetic concordance in the main driver genes, especially the current clinically actionable genes. Therefore, given the challenges of obtaining adequate samples from metastatic sites, genetic investigation of archived primary tumor samples appears to be sufficient for the application of cancer precision medicine in the metastatic setting.