The Road so Far in Colorectal Cancer Pharmacogenomics: Are We Closer to Individualised Treatment?

In recent decades, survival rates in colorectal cancer have improved greatly due to pharmacological treatment. However, many patients end up developing adverse drug reactions that can be severe or even life threatening, and that affect their quality of life. These remain a limitation, as they may force dose reduction or treatment discontinuation, diminishing treatment efficacy. From candidate gene approaches to genome-wide analysis, pharmacogenomic knowledge has advanced greatly, yet there is still huge and unexploited potential in the use of novel technologies such as next-generation sequencing strategies. This review summarises the road of colorectal cancer pharmacogenomics so far, presents considerations and directions to be taken for further works and discusses the path towards implementation into clinical practice.


Introduction
Colorectal cancer (CRC) is the second leading cause of cancer-related death and the third most commonly diagnosed cancer [1]. Surgical resection is the preferable treatment independently of stage, but chemotherapy is widely used too across stages. There are different chemotherapeutic schemes for CRC treatment (Table 1). Usually, the first line of treatment is based on fluoropyrimidines: 5-fluorouracil (5-FU) or its oral prodrug capecitabine, either alone or in different combinations with other agents, the most common being leucovorin, oxaliplatin (named FOLFOX or XELOX -if capecitabine is used instead of 5FU) or irinotecan (FOLFIRI) [2][3][4][5]. Besides these cytotoxic agents, metastatic CRC (mCRC) treatment may in addition include biological targeted agents to improve patient outcome, such as monoclonal antibodies against vascular endothelial growth factor (VEGF) (bevacizumab), or against epidermal growth factor receptor (EGFR) (cetuximab and panitumumab) ( Table 1) [4].
There are two essential factors to be taken into account when considering efficacy and appropriateness of a treatment: response and toxicity. Response is often evaluated based on overall survival, progression-free survival or response evaluation criteria in solid tumors (RECIST), in the case of unresectable CRC [6]. On the other hand, patients subject to chemotherapy are prone to develop adverse drug reactions (ADRs) that might be severe or even fatal, and have a considerable impact on healthcare and burden. These ADRs can affect the patients' quality of life (even in the long term) and may hinder treatment, due to necessary delays or dose reduction. A study with more than four thousand mCRC patients receiving FOLFOX, FOLFIRI or XELOX saw that 90% of patients had one ADR, and 66% of patients had >1 ADR during the first line of treatment [7]. These toxic events also come with an increased economic burden to resolve them, with haematological toxicities being the most costly to resolve, followed by respiratory, endocrine/metabolic, central nervous system and cardiovascular ones.
Since both response and toxicity events have heterogeneous distributions amongst patients, it has been hypothesised that these ADRs may be caused by underlying genetic variants. Moreover, because chemotherapy agents have only been used since the 1950s, any genomic variants having large effects on toxicity responses have not had time to be washed away by negative selection [8,9]. Moreover, because cancer is usually related to later stages of life and does not affect fitness, purifying selection against these variants is not in place. Therefore, it is feasible that genetic variants having moderate-to-large effects (detectable by classical association studies) could be responsible for the observed variability.
Pharmacogenetics is a science that aims to learn about the inherited inter-variability in response and ADRs after drug exposure. First-generation studies were focused on the analysis of genes with an a priori relationship to drug effect, i.e., those involved mainly in the adsorption, distribution, metabolism and excretion (ADME) of chemotherapeutic agents. Later, these studies started to apply global approaches without a previous functional hypothesis, like genome-wide association studies (GWAS). The Pharmacogenomics Knowledgebase (PharmGKB [10]) is a free database that aggregates, curates, integrates and disseminates the knowledge obtained from these studies regarding the impact of human genetic variation on drug response and toxicity. Other important sources of pharmacogenomic information have also launched from the efforts of The Clinical Pharmacogenetics Implementation Consortium (CPIC), which aims to create, curate and post free detailed gene/drug clinical practice guidelines (https://cpicpgx.org/ (accessed on 29 October 2020)).
In this review, we summarise the available data on CRC pharmacogenomics to date and go beyond the typically discussed candidate gene approaches, to cover genome-wide studies and next-generation sequencing. We also reflect on the necessity of comprehensive works including molecular studies to assess variant functionality, and discuss the limitations towards clinical implementation in the light of cost-effectiveness to health systems. Last but not least, we discuss considerations for further studies towards a routine implementation of personalised medicine strategies in clinical practice.

Chemotherapeutic Agents in CRC Treatment
Chemotherapy based on fluoropyrimidines, specifically 5-FU, has been used for over thirty years now, and is still the backbone of CRC treatment ( Figure 1) [11]. However, there have been reports that show that up to 94% of patients treated with this drug end up developing ADRs, some of which may be severe or life threatening (Table 2) [12]. For instance, some studies have shown that around 40-56% of patients treated with 5-FU develop severe neutropenia, and 10-15% present grade 3-4 diarrhoea [13]. Patients receiving capecitabine have a similar incidence of ADRs, although with less severe neutropenia, but present hand-foot syndrome (HFS) at a high incidence (54%) instead (Table 2) [14].

Figure 2.
Graphic scheme of the genes involved in the ADME of platinum compounds, including oxaliplatin [23]. The glutathione S-transferases (GSTs), a multigene family of enzymes, undertake oxaliplatin detoxification. The solute carriers (SLCs) and adenosine-triphosphate binding cassette (ABC) transporters are responsible for oxaliplatin uptake and efflux in the liver, respectively, and so impact on drug bioavailability and toxicity profile. Further, the nucleotide excision repair (NER) and base excision repair (BER) pathways, which include the ERCC1 and ERCC2, and XRCC1 proteins, respectively, repair the damages cause by this drug. Used with PharmGKB and Stanford University permission (available at https://www.pharmgkb.org/pathway/PA150642262 (accessed on 24 September 2020)).
Irinotecan (CPT-11) is another cytotoxic agent used in the treatment of CRC in combination with 5-FU (FOLFIRI) ( Table 1). FOLFIRI treatments result in better response rates and longer progressionfree survival and overall survival of mCRC patients ( Figure 3) [2,24]. CPT-11 is a semi-synthetic soluble analogue of the natural alkaloid camptothecin [25,26]. Some clinical trials report an ADR incidence for this drug of up to 100% of patients, where common ADRs include diarrhoea, neutropenia and alopecia (Table 2) [16]. Graphic scheme of the genes involved in the ADME of platinum compounds, including oxaliplatin [23]. The glutathione S-transferases (GSTs), a multigene family of enzymes, undertake oxaliplatin detoxification. The solute carriers (SLCs) and adenosine-triphosphate binding cassette (ABC) transporters are responsible for oxaliplatin uptake and efflux in the liver, respectively, and so impact on drug bioavailability and toxicity profile. Further, the nucleotide excision repair (NER) and base excision repair (BER) pathways, which include the ERCC1 and ERCC2, and XRCC1 proteins, respectively, repair the damages cause by this drug. Used with PharmGKB and Stanford University permission (available at https://www.pharmgkb.org/pathway/PA150642262 (accessed on 24 September 2020)).
In case of unresectable CRC, patients may also be given biological targeted agents. Cetuximab and panitumumab bind specifically to the human EGFR protein, which is constitutively expressed in normal epithelial tissues and overexpressed in some cancers like CRC. Some of the pioneer pharmacogenetics studies on treatment efficacy found, however, that because RAS mutations can constitutively activate the response pathway downstream from EGFR, anti-EGFR therapy efficacy is limited to patients' wild type for KRAS and NRAS [4]. These belong to signalling pathways downstream of EGFR, and mutations in these genes may cause EGFR-independent pathway activation, leading to resistance to anti-EGFR treatments [27]. More than 87% of patients receiving cetuximab develop an ADR and are commonly (>25%) prone to develop cutaneous reactions, In case of unresectable CRC, patients may also be given biological targeted agents. Cetuximab and panitumumab bind specifically to the human EGFR protein, which is constitutively expressed in normal epithelial tissues and overexpressed in some cancers like CRC. Some of the pioneer pharmacogenetics studies on treatment efficacy found, however, that because RAS mutations can constitutively activate the response pathway downstream from EGFR, anti-EGFR therapy efficacy is limited to patients' wild type for KRAS and NRAS [4]. These belong to signalling pathways downstream of EGFR, and mutations in these genes may cause EGFR-independent pathway activation, leading to resistance to anti-EGFR treatments [27]. More than 87% of patients receiving cetuximab develop an ADR and are commonly (>25%) prone to develop cutaneous reactions, headache, diarrhoea and infection, whereas patients receiving panitumumab (>20%) will probably have cutaneous reactions, fatigue, nausea and diarrhoea [17,18,25,26]. On the other hand, bevacizumab binding to VEGF blocks the interactions with its receptors on the endothelial cell surface. This interaction allows cell proliferation and angiogenesis, and thus bevacizumab reduces microvascular growth and inhibits metastatic progression. Over 60% of patients receiving bevacizumab develop ADRs, where the most common are hypertension, proteinuria, mucosal bleeding and wound healing problems [4,19].

Pharmacogenetics: Candidate Gene Studies
As we mentioned before, pharmacogenetic studies arose in the context of studying the genetic factors that contribute to ADRs. Initial efforts utilised candidate gene approaches to inspect mainly genetic variation in genes that might have a great influence on the drug pharmacokinetics and pharmacodynamics, and that can alter drug concentration levels, leading to toxicity.

Dihydropyrimidine Dehydrogenase (DPYD)
DPD, encoded by the DPYD gene, is responsible for the vast majority of 5-FU hepatic metabolism and is responsible for the first step and rate-limiting factor in the 5-FU catabolic pathway ( Figure 1). Several single nucleotide polymorphisms (SNPs) have so far been identified in this gene in association with different toxicities [28]. The most studied DPYD variant is rs3918290 (DPYD*2A, IVS14+1G>A), which causes exon 14 skipping and results in a truncated and catalytically inactive protein [29,30]. A study by Toffoli et al. on 603 patients treated with 5-FU-based chemotherapy reported the association of rs3918290 (OR = 8.5, p = 0.008), rs67376798 (OR = 7.8, p = 0.012) and rs55886062 (OR = 6.0, p = 0.131) with general toxicity (Table 3) [28].        a: The risk alleles frequencies were consulted on gnomAD. b: Measure of confidence in the association, according to PharmGKB [10]. c: Associated with changes in enzymatic activity, but with a particular adverse drug reaction (ADR). d: Described for tegafur, a prodrug of 5-FU. e: Described for non-small-cell lung carcinoma. NA: not available. Note: In case of multiple studies, we have chosen a publication used by PharmGKB to support the level of evidence of the referred variant, and the corresponding OR and p-value.

Thymidylate Synthetase (TYMS)
TS, encoded by the TYMS gene, is the main target of fluoropyrimidines and low levels of expression may influence toxicity [86,87]. The two most studied SNPs in TYMS are rs2853542 (5 VNTR 2R/3R) and rs11280056 (3 UTR 6bp ins-del). This gene has been widely studied, but with no conclusive results so far. Some studies have reported a correlation between rs2853542 and 5-FU/capecitabine toxicity, where the haplotype 2R/ins 6-bp was found to be significantly associated with severe toxicity [45,87], but other works could not replicate this association [61]. This might be explained by a work of Rosmarin et al. in 2015, which reported an association of an intronic variant located in the overlapping ENOSF1 gene capable of explaining the toxicity attributed to the two previous TYMS polymorphisms. They discovered that SNP rs2612091 and TYMS 5 VNTR and 3 UTR are in moderate linkage disequilibrium (LD) (r 2 = 0.40 and 0.32, respectively), but after testing for dependency, they concluded that it was the rs2612091 G allele alone that increased the risk of toxicity (p = 0.0021). Although it has been proposed that the ENOSF1 protein could influence TYMS activity, the interaction between these two genes is not yet well understood [41]. Interestingly, genetic variation in TYMS has also been related to response to pyrimidine treatments, with higher levels of TS implicating worse response and poorer overall survival [88,89].

Methylenetetrahydrofolate Reductase (MTHFR)
MTHFR is the other major enzyme involved in 5-FU metabolism. Polymorphisms in this gene (namely rs1801133 and rs1801131) might impact enzyme activity, causing an accumulation of 5,10-MTHF, which increases toxicity [90]. Indeed, a study involving 292 stage II/III colon cancer patients found that the rs1801133 TT genotype was associated with neutropenia (OR = 2.32, p = 0.014) [48]. Another study involving 118 mCRC patients found that the same genotype was associated with diarrhoea (p = 0.02) [91]. However, other studies have not been able to find any association between polymorphisms in this gene and toxicity events [61,62,92,93].

Carboxyl Esterases (CES) and Cytidine Deaminase (CDA)
CES2 is the first enzyme in the conversion of capecitabine to 5-FU, followed by a second step catalysed by CDA ( Figure 1). There have been some attempts to prove the association of polymorphisms on these two genes with ADRs, but there are still no concrete positive results. Ribelles et al. studied 136 patients and showed a trend (p = 0.07) between HFS and CDA SNP rs3215400 [54]. A study including 239 patients found an association of CDA rs2072671 with a high risk of overall toxicity (OR= 1.84, p = 0.029) [53]. Another work including 430 patients linked the CDA rs602950 and CDA rs532545 variants with diarrhoea (OR = 2.3, p = 0.0055, and 2.3, p = 0.0082, respectively) [47]. There have also been some smaller studies on CES polymorphisms and their association with capecitabine toxicity [45,54,94]. CES proteins are also important in the catabolic pathway of irinotecan ( Figure 3) [95]. CES1 rs2244613 was found to be associated with diarrhoea and patients with low CES2 expression are more prone to develop neutropenia or diarrhoea [95][96][97][98].

DNA Repair Genes
DNA repair pathways have been extensively studied in pharmacogenomic studies [99]. A metaanalysis of more than 1000 CRC patients receiving oxaliplatin found a single significant association of the ERCC1 rs11615 C allele with a higher risk of having haematological toxicity in Asian populations (HR = 1.97, p < 0.05) [100]. Boige et al. could not, however, replicate this association, perhaps due to population differences, but did associate the ERCC2 rs13181 C allele with a higher risk of severe haematological toxicity caused by FOLFOX (OR = 2.16, p = 0.01) [62]. A recent study on 596 CRC patients found that ERCC1 rs11615 was significantly associated with stomatitis (p = 0.03) and nausea (p = 0.04), and that ERCC2 rs13181 and rs238406 were associated with thrombocytopenia (p = 0.004 and p = 0.03, respectively) [63]. On the other hand, a study of 517 patients with stage II/III colon cancer concluded that polymorphisms in ERCC1 and XRCC1 did not have a clinically significant association with adverse effects [61]. Further smaller studies could neither confirm the relationship between these variants and toxicity [91,101].

Glutathione S-Transferases (GSTs)
GST enzymes are proteins from a multigene family, and specifically, GSTP1, GSTM1 and GSTT1 are involved in oxaliplatin detoxification (Figure 2). The most studied variations are GSTP1 rs1695 and the complete deletion of the GSTT1 and GSTM1 genes. McLeod et al. tested these on 300 patients receiving FOLFOX in an advanced CRC setting. Patients bearing the GSTM1 null genotype had a 1.7-fold increased risk of having severe neutropenia (p = 0.016), whereas homozygous patients for the rs1695 T allele had higher probability of discontinuing FOLFOX treatment due to neurotoxicity (p = 0.01) [102]. In contrast to these findings, Boige et al. did not find any significant association between these same SNPs and severe neurotoxicity on a study enrolling 349 patients [62]. Ruzzo et al. studied 517 patients and suggested a weak association between the GST-T1/M1 null/null genotype and severe neutropenia (OR = 1.99, p = 0.032) [61], whereas Cecchin et al. analysed 154 patients receiving FOLFOX but could not replicate any markers of neurotoxicity. Interestingly, they suggested that variants other than genetics, such as the biological state of patients or disease stage, may also influence the detoxification pathway, and could therefore be responsible for the FOLFOX-related neurotoxicity [58].  [58,61,103].
In relation to irinotecan-based regimens, Salvador-Martín et al. showed that SNPs rs1128503, rs2032582 and rs1045642 in ABCB1, which are in LD, were associated with haematological and overall toxicity [92]. Others proposed the association of solely ABCB1 rs1128503 (OR = 2.02, p = 0.401) with global toxicity, or of ABCB1 rs1045642 with early toxicity (OR = 3.79, p = 0.098) (not strictly significant), while others did not find any association at all [74,93,95,96]. There have also been some reports on other ABC transporter genes, with conflicting results. For instance, a study on 26 mCRC patients showed that patients with the CC genotype in ABCC5 rs562 or the GG genotype in ABCG1 rs425215 presented higher gastrointestinal toxicity (p < 0.02) [72]. A study including 250 patients with mCRC linked the ABCG2 rs7699188 variant with severe global toxicity (OR = 7.26, p = 0.013) [74].

Uridine Disphosphate Glucuronosyltransferases (UGTs)
UGT1A1 is the main enzyme responsible for SN-38 inactivation, followed by UGT1A7 and UGT1A9. Several groups have studied the influence of UGT polymorphisms on toxicity development. One of the most studied polymorphisms in UGT1A1 is a change in the number of TA repeats (TA) n TAA in the promoter region. The wild-type allele for this polymorphism is (TA) 6 TAA, with (TA) 7 TAA (rs3064744, UGT1A1*28) being frequent in Caucasians, but not in Asian populations (≈30% and ≈10%, respectively). However, rs4148323 (UGT1A1*6) is more frequent in Asian populations comparing with Caucasians (≈14% and ≈1%, respectively). Ando and colleagues reported that patients carrying the UGT1A1*28 genotype were at significantly higher risk of having irinotecan-related severe toxicity (OR = 7.23, p < 0.001) [64]. Innocenti et al. also stated that patients with UGT1A1*28 had more events of severe neutropenia (OR = 9.3, p = 0.001) [67]. Others have also showed a correlation between UGT1A1*28 and neutropenia, diarrhoea and vomiting (p < 0.01) [104][105][106][107]. Additionally, as for TYMS, it has been proven that the UGT1A1 genotype also affects maximum tolerated dose and therefore response [108,109].

Solute Carriers (SLCs)
Reduction or elimination of the function of SLC genes due to genetic variation can lead to a decrease in SN-38 uptake, with further accumulation in plasma, ultimately leading to toxicity [97]. rs2306283 (SLCO1B1*1b) has been shown to cause severe gastrointestinal toxicity, particularly diarrhoea and neutropenia [72,110,111]. A discovery study on 167 mCRC patients receiving irinotecan also revealed a protective effect of the SLCO1B1 rs2291076 T allele against neutropenia but associated the rs2306283 GG genotype with significantly higher neutropenia events. These results were, however, not replicated in a posterior study of 250 mCRC patients [71].

Cytochrome p Gene Family (CYP)
CYP3A4 and CYP3A5 are responsible for the oxidation of irinotecan into the inactive metabolites APC, M4 and NPC. Some researchers have studied the possible association of polymorphisms on these genes and chemo-related toxicity but have not found any positive correlation [68,96,112], probably because over 80% of variants in CYP genes coding regions are very rare and the sample sizes of these studies were not large enough [113]. It has also been suggested that their enzymatic function might be altered by non-genetic factors such as diet, concomitant medications, altered liver function or patient's performance status [114].

Epidermal Growth Factor Receptor (EGFR)
Skin toxicity is the major ADR related to anti-EGFR agents. Parmar et al. studied 109 cancer patients and concluded that skin toxicity was linked to the EGFR rs2227983 GG genotype (OR = 3.24, p = 0.014) [80]. Dahan et al. studied 58 patients treated with third-line cetuximab and irinotecan, and reported a trend between the presence of rs11568315 (CA repeats ≤ 35) and skin toxicity (OR = 2.91, p = 0.058) [81]. Sunakawa et al. studied 77 patients treated with cetuximab in combination with oxaliplatin and also correlated rs11568315 (CA repeats ≤ 19) with skin toxicity [115]. A study on 52 patients treated with cetuximab and FOLFIRI found that EGFR rs712830 was significantly associated with severe global toxicity (OR = 6.13, p = 0.010), but not specifically with skin toxicity. rs712829, rs11568315 (CA repeats cut-off = 17) and rs4444903 were, however, not associated with any toxicity [79]. Another study on 46 mCRC patients receiving XELOX-bevacizumab with or without cetuximab also found no evidence for the association of either rs4444903 or rs11568315 (CA repeats cut-off = 20) with skin toxicity [116].

Vascular Endothelial Growth Factor (VEGF)
Hypertension is the major toxicity derived from anti-VEGF agent treatment. Studies on the relationship of VEGF polymorphisms and bevacizumab-related toxicity have also been controversial. For instance, a study on 89 patients reported a positive link between rs3025039 and hypertension (OR = 0.15, p = 0.022), but a meta-analysis of over 1000 cancer patients did not validate this finding [83,117]. Moreover, some researchers have reported that patients with the rs833061 TT, rs2010963 CC or rs699947 CC genotypes were less prone to hypertension caused by bevacizumab (p < 0.03) [84,85], but Etienne-Grimaldi et al. saw that patients harbouring the rs2010963 CC genotype alone had more toxicity than patients with other genotypes (p = 0.01) [118].

Immunotherapy and Toxicity
Immunotherapy has arisen in the past few years as a promising therapeutic option in many cancers, and has particular relevance in the case of tumours with microsatellite instability (MSI) [119]. Hence, the FDA approved, in 2018, the use of ipilimumab and nivolumab (anti-CTLA-4 and anti-PD1 monoclonal antibodies, respectively) for the treatment of metastatic CRC patients previously treated with standard chemotherapy [120]. In 2020, pembrolizumab (anti-PD-1) was also approved as a first-line treatment of patients with unresectable, MSI-high or mismatch repair-deficient metastatic CRC [121]. Although there have been some studies suggesting the influence of genetic variants on the development of toxicity due to these treatments in other cancer types, to date there is no sufficient data on CRC [122][123][124]. Surely novel data on this will shortly become available for pharmacogenomic studies as more patients undergo immunotherapy treatment.

Genome-Wide Association Studies (GWAS)
Despite the large effect sizes for toxicity variants discovered by candidate gene approaches, chemotherapy-related toxicity is likely complex and multigenic. Therefore, other discovery strategies may be more suitable to inspect genomic variation in a more comprehensive manner. This has been made possible by the increasing availability of higher-throughput technologies at increasingly affordable prices, which has allowed pharmacogenetics to go genomic. In these upcoming sections, we will describe the more recent approaches that have further expanded the knowledge on pharmacogenomics in recent years (Table 4).  GWAS make use of LD inheritance patterns to inspect common genetic variation across the entire genome. The main two advantages of GWAS over candidate gene studies are that they are unbiased by a priori functional knowledge on the variants (which may help in the discovery of other toxicity relevant pathways) and also have the potential to identify variation in regulatory regions such as promoters or enhancers, which have been largely unexplored by candidate gene approaches.
Several GWAS have been performed to inspect chemotherapy-related toxicity in CRC. In the QUASAR2 trial, Rosmarin et al. analysed over 1000 stage II/III CRC patients receiving capecitabine with or without bevacizumab to identify 1456 variants on 25 candidate genes (Table 3) [41]. Fernandez-Rozadilla et al. used 1012 patients in a two-stage study in patients treated with 5-FU and FOLFOX [57] to find a moderate association for the rs10876844 variant and diarrhoea in patients treated with 5-FU. Won et al. also completed a GWAS on 343 Korean patients receiving oxaliplatin-based regimens to identify possible genetic markers associated with chronic oxaliplatin-induced peripheral neurotoxicity (OXCPN) [78]. They found some evidence for an association that was intronic or within 100 Kb of genes related to various neuronal activities. Two subsequent and independent studies by Oguri et al. and Terrazzino et al. tried to validate these findings, but a single association between the FARS2 rs17140129 G allele and OXCPN (OR = 6.5, p = 0.034) was found [125,126]. Lastly, the CAIRO2 trial included 282 advanced or metastatic CRC individuals treated with XELOX plus bevacizumab and cetuximab. They found some novel SNPs to be moderately associated with toxicity (Table 3) [82].
In general, although GWAS present several advantages over candidate gene strategies, there are also some important limitations, some of which could be overcome post hoc. Firstly, there is a lack of replication due to discrepancies in variant frequencies amongst the different populations used between studies, as seen when comparing the works from Won et al. and Terrazzino et al. mentioned above (Asian vs. Caucasian populations, respectively). Further, most of the associated variants are intergenic, which makes it harder to interpret the results directly and design appropriate validatory functional assays. Moreover, because we are evaluating thousands to millions of variants at a time, statistical power is a concern, and adequate study sample sizes are needed [127]. As an illustration, for a GWAS with a sample size of 200 patients, assessing variants with minor allele frequency (MAF) ≥ 5%, and a statistical threshold of 80% power, the OR that we would be able to discover is OR ≈ 2, which reflects a moderate effect.
GWAS are limited to inspecting common variation (i.e., generally over 5% MAF), but it is likely that toxicity variants may be of rarer prevalence [128,129]. Some approaches have been developed to overcome this limitation. For instance, targeted SNP panels can be designed to fine-map regions of interest spanning a large section of the gene or specific to a desired population. As an example, a commercially available array has been designed to include both common and low-frequency variation as well as Mendelian and functional alleles specific to Spanish genomes, which allows for better genotyping of the Spanish population when comparing with the generic global arrays [130]. Moreover, albeit possible, GWAS strategies are not usually suitable for CNV studies, because they demand that the CNV be in high LD with a genotyped SNP [57].
Despite these limitations, GWAS still hold great potential for discovery, given appropriate study conditions. Surely, there are still pathways contributing to toxicity development to be discovered, as proven by the contribution of RPS7 to cetuximab-related toxicity. This gene is normally overexpressed in dermal papilla cells, which makes it reasonable that genetic variants could be associated with skin toxicity [82].

Next-Generation Sequencing (NGS)
NGS, either whole-exome (WES) or whole-genome sequencing (WGS), allows for a more comprehensive identification of novel genetic biomarkers in this regard, and several studies have reported the added value of NGS to identify relevant rare pharmacogenetic variants that would not be detected by other conventional methods (Table 4) [131][132][133][134][135][136].
In 2014, Mizzi et al. compared the data from 482 healthy individuals (data from Genomes Data and the Wellderly Study) obtained either with WGS or SNP array genotyping that included 1936 known pharmacogenomic variants within 231 ADMET genes (Table 5) [131]. Focusing on these genes, the WGS revealed an average of 17,733 variants vs. 249.5 found with the SNP array. In silico analysis with the PROVEAN and SIFT algorithms, which are in silico functional predictors, showed some missense variants likely to be deleterious. Specifically, they found that 254 of the 332 variants in UGT1A1 were novel, of which 31 were functional and 26 had a frequency of <1%. In general, the WGS approach allowed the identification of a significantly higher number of variants compared to the SNP array, which might impact the pharmacological processes.  Another study analysed sequencing data for 146 genes related to pharmacological traits from over 6500 individuals (data from the 1000 Genomes Project (1KGP) and Exome Sequencing Project (ESP)) ( Table 5) [133]. They detected 19,328 single nucleotide variants (SNVs), 62.9% of which were exonic; for example, 6225 and 6258 variants in ABC transporter (22 genes) and SLC genes (49) respectively, and 253 variants in UGTs (16) and GTSs (14). Most of these variants were indeed rare (MAF < 1%; 92.9%) or very rare (MAF < 0.1%; 82.7%)-meaning that they would not be detected by conventional methods-and the majority were missenses (56.2%). The functional impact from rare variants was different across the genes, yet they concluded that rare variants contribute on average 30-40% of the functional variability in the studied pharmacogenes.
Schaller et al. analysed WES and WGS data from 141,456 individuals (data from gnomAD v2.1) and assessed the genetic variability of SLC genes (Table 5) [136]. They detected 204,287 SNVs and indels, of which 56.9% were missenses, and several were loss-of-function variants, such as 2.5% frameshifts, 1.7% stop-gains and 1.5% variations in canonical splice sites. They concluded that each individual presents, on average, 29.7 putatively functional SLC variants, with rare variants contributing 18% of this functional variability.
Following on from the results obtained from their initial GWAS, Rosmarin et al. sequenced the complete DPYD and TYMS coding regions and identified a further novel rare independent DPYD variant (c.1651G>A; p.Ala551Thr). This change was present in a single patient that had presented with grade 4 neutropenia and thrombocytopenia, and was predicted to be "strongly damaging" by in silico predictors (Table 4) [41].
NGS approaches can not only be useful to identify rarer variants but can be an important asset to reveal copy number variations (CNVs). The case in point is the work by Santos et al. that included CNV available data from 2504 whole genomes and 59,898 exomes (data from 1KGP and Exome Aggregation Consortium (ExAC)) and focused on 208 ADME genes (Table 5) [134]. Within these, 201 (97%) genes had a total of 5589 novel CNVs, where 47% were deletions and 54% were duplications. These novel deletions were responsible for >5% of loss-of-function alleles in a considerable number of genes (87,25,49,48,59 and 51 genes in non-Finnish Europeans, Finnish, East Asians, South Asians, Africans and admixed Americans, respectively). This demonstrates the impact that CNV might have on ADME genes, and hence the development of ADRs.
As the conventional screening methods only include common variants, a high number of variants are missed, thus explaining the need for unbiased and more comprehensive approaches. These interesting works emphasise the potential of NGS to detect novel rarer variants or CNV, not only in ADME genes, but in other pathways, which might help to explain the pharmacogenetic variability possibly associated with toxicity caused by chemotherapy.

Functional Assays
Functional assays on candidate variants are essential to ultimately clarify the mechanisms by which the genetic variants exert their effect on ADR development. Pharmacokinetic (PK) studies have been the most used approach to assess the functional impact of toxicity SNPs (Table 4). They have been used for many years now to evaluate enzymatic activity in patients carrying the desired variants, as they measure the level of drug and its metabolites that influence drug bioavailability and could hence lead to the toxicity profile.
By far, the most studied gene in PK studies has been DPYD, and there is an agreement that the DPD protein plays a crucial role in 5-FU metabolism. There are several methods to determine DPD deficiency [30,137]: testing for DPD activity in peripheral blood mononuclear cells, the uracil breath test, the uracil test dose and endogenous DHU/U ratio, or high-performance liquid chromatography (HPLC).
By these means, at least four SNPs in DPYD have been proven deleterious: rs55886062, rs3918290, rs67376798 and rs56038477/HapB3 [30,34,156]. Studies on other variations have so far led to inconclusive or contradicting results [157].
Of late, other approaches have also been used to assess the functionality of pharmacogenetic variants. For instance, Offer et al. proposed the construction of a vector for rapid phenotypic assessment of DPYD variants and their relation with 5-FU sensitivity (Table 6) [29,31]. DPYD constructs were expressed in mammalian cells and the enzymatic activity of the expressed proteins was measured by HPLC and compared to the wild type. By these means, they could confirm that 30 of the variants caused a significant reduction in enzymatic activity. Interestingly, 19 of the variants tested displayed <25% activity. In turn, DPYD rs1801158, rs1801265, rs2297595, rs200687447, rs60139309 and rs114096998 had higher enzymatic activity, and therefore cells expressing these variants were more resistant to 5-FU, which may not confer susceptibility to toxicity development, but may in turn influence response rates.
In 2015, Henricks et al. proposed to assign an activity value (AV) to DPYD alleles, to adjust the initial dose of 5-FU. In this context, fully functional alleles had an AV = 1, reduced activity alleles had an AV = 0.5 and non-functional alleles had an AV = 0 (wild-type AV = 1; rs67376798 and rs56038477 AV = 0.5; and rs3918290 and rs55886062 AV = 0). Based on the AV of both alleles, the gene activity score (AS) is calculated, thus representing the enzymatic phenotype of the patient [30].
For genes other than DPYD, there is much less functional evidence (Table 6). Some research has been conducted on the relation of irinotecan PK variants. These studies were able to significantly associate polymorphisms in ABCC1 and ABCB1 with SN-38 exposure and the glucuronidation ratio (GR)-measured as AUC SN-38G/AUC SN-38 [73,111]. Demattia et al. investigated the possible association between ABCG2 rs7699188 and ABCB1 rs2032582 with irinotecan PK parameters on patients with advanced CRC by measuring plasma concentrations of irinotecan, SN38 and SN38G, but did not find any significant correlation [74]. Toffoli et al. evaluated irinotecan PK in 71 patients with metastatic CRC. They associated severe toxicity with a significantly lower GR (p = 0.01) and an increased biliary index (BI) (p = 0.003), which indicates SN-38 accumulation. Further, they reported a significant correlation between UGT1A1*28 and lower GR (p = 0.01), and higher BI (p = 0.007) [145]. Other works showed that patients with the wild-type genotype had a significantly higher clearance of SN-38 compared to UGT1A1*28 (p < 0.001), and that the homozygous genotype was significantly associated with GR (p = 0.005) and BI (p = 0.014) [151]. Iyer et al. also reported significantly lower SN-38 glucuronidation in patients with UGT1A1*28 (p = 0.001) [105]. Other UGT1A polymorphisms, such as UGT1A1*60 (p = 0.005), UGT1A1*93 (p < 0.0001), UGT1A1*6 (p = 0.037) and UGT1A7*3 (p < 0.02), were also associated with GR and BI [73,146,150].

Cost-Effectiveness Analysis
Besides the need for clear evidence on the functional relevance of a pharmacogenetic biomarker, a proof of cost-effectiveness-that the pharmacogenetic strategy is more effective with an acceptable additional cost or even a cost saving-is crucial to facilitate its introduction into clinical practice and acceptance from healthcare professionals and institutions.
In 2015, Deenen et al. evaluated the safety and costs of upfront DPYD*2A genotyping with individualised dose adjustment treatment for fluoropyrimidines [158]. They showed that genotypeguided dosing represented a reduction in severe toxicity from 73% to 28%. Moreover, dose adjustment based on genotype produced shorter and easier to control toxicities, and a significant reduction in drug-induced death from 10% to 0%. Therefore, they demonstrated that screening for DPYD*2A before treatment could be lifesaving and potentially cost-efficient. Cortejoso et al. complementarily evaluated the costs of genotyping three DPYD variants (rs3918290, rs67376798 and rs55886062) and the management of severe neutropenia caused by fluoropyrimidines. Considering an average cost of management of EUR 3044.40 vs. EUR 6.40 per patient for DPYD testing, they concluded that genotyping is cost-effective if severe neutropenia is prevented in at least 2.1 cases per 1000 treated patients [159]. Given that the combined frequency of these three markers is about 1%, this provides evidence that DPYD testing should be considered by healthcare systems. Murphy et al. further compared the reactive vs. prospective DPYD genotyping of variants rs3918290, rs67376798, rs1801158 and rs55886062. Of the 134 included patients, five carried a DPYD variant and the costs of their hospitalisation were EUR 232,061, whereas the total cost of genotyping prior to treatment for all patients would have been only EUR 23,718. Even if patients still had to endure some ADRs, the cost would have been considerably smaller, making pharmacogenetic analysis again cost-efficient [160]. In 2019, Henricks et al. also compared the costs from prospective DPYD screening (rs3918290, rs67376798, rs55886062 and rs56038477) with no screening on 1103 patients receiving fluoropyrimidine-based therapy. Patients with variants rs67376798 or rs56038477 had a 25% dose reduction, while patients with rs3918290 or rs55886062 had a 50% dose reduction. They concluded that the expected costs of the screening approach were EUR 2599 vs. EUR 2650 for the non-screening approach, representing a cost saving of EUR 51 per patient. These results strongly suggested that upfront DPYD-guided dose individualisation does not result in extra costs, and therefore solidly supports DPYD screening implementation prior to fluoropyrimidine treatment as a standard of care [161]. It also constituted the basis for pharmGKB EMA guideline changes from actionable to recommended.
Gold et al. assessed the safety and costs of testing for UGT1A1*28 before irinotecan treatment [162]. Assuming no treatment efficacy reduction, the average cost saving per patient was EUR 250. Obradovic et al. compared the standard irinotecan dose with dose reduction based on UGT1A1 genotyping, and evaluated the cases of severe neutropenia, the number of life-years gained and the associated costs. They concluded that genotyping with dose reduction in homozygotes was cost-saving in African and Caucasian populations, but not in Asians, given the population frequency of this variant [163]. Another study by Butzke et al. compared severe neutropenia and grade 4 diarrhoea in a similar setting, to find that dosage calculations based on UGT1A1*28 genotypes save about EUR 600 per patient [164]. More recently, Roncato et al. calculated that the costs for toxicity management per patient increased 1.4-fold for heterozygotes and 6-fold for homozygotes compared to wild-type individuals, and they were superior to the costs related to genotyping all patients before treatment [165].

Pharmacogenomic Testing Guidelines
Although, as we have described so far, there is a considerable amount of evidence on the effect of genetic variants on CRC chemotoxicity, translation into clinical practice is yet far from routine implementation. For now, guidelines from leading authorities, including the European Medicines Agency (EMA), the Food and Drug Administration (FDA), the private pharmacogenetic consortia, the CPIC, the Royal Dutch Association for the Advancement of Pharmacy-Pharmacogenetics Working Group (DPWG) and the Spanish Agency for Medicines and Health Products (Agencia Española de Medicamentos y Productos Sanitarios, AEMPS) have only produced a very limited list of recommendations (Table 7) [166][167][168][169][170]. Pharmacogenetic guidelines from CPIC for the administration of fluoropyrimidines recommend that the DPYD metaboliser status (based on variants rs3918290, rs67376798, rs55886062 and rs56038477) is characterised prior to treatment administration. Poor metabolisers (AS 0-0.5) either: (a) receive an alternative drug; or (b) if 5-FU/capecitabine is still considered the better suited option of treatment, it is recommended to strongly reduce the given dose and accompany with close monitoring. For intermediate metabolisers (AS 1-1.5), a 50% dose reduction is recommended [156,167]. On the other hand, the FDA only contraindicates the administration of 5-FU/capecitabine in patients with DPD deficiency, but does not directly recommend screening for DPD deficiency before treatment, neither does it distinguish heterozygous nor homozygous DPD-deficient patients (www.pharmgkb.org/gene/PA145/labelAnnotation (accessed on 07 October 2020)) [170].
With the growing knowledge on CRC pharmacogenomics, more guidelines including other genes/variants will most likely be available in the next coming years. For instance, the ABC transporter genes, like ABCC1 and ABCB1, have been quite studied so far and there is good evidence of their relation to the development of ADRs, both by association studies and functional assays.

Limitations in Pharmacogenomic Studies
In this appraisal, we have presented a comprehensive review of the field of CRC pharmacogenomics, since its early inception to the latest trends. Although remarkable findings have been produced, the road towards widespread clinical implementation is still far from over, and is inherently hindered by some of the limitations that pharmacogenetic analysis encounters. One of the main problems in pharmacogenomic studies is the extensive phenotype heterogeneity. This could be attributable to at least three different factors: (a) heterogeneity in clinical inclusion, i.e., differences in tumour staging and treatment strategies and lines (i.e., the genetic contribution to toxicity may be different in patients that have received FOLFOX as first-line treatment compared to those who have received it as second line); (b) pharmacogenomic data are not kept in a standardised manner, and it is usually hard to find in the patient's clinical record case report forms, including the appropriate scaling, timing and line of treatment, and Eastern Cooperative Oncology Group (ECOG) performance status of the patient (amongst others) should therefore be used to produce robust study designs; and (c) some ADRs, like haematological counts, can be measured quantitatively, whereas others, like diarrhoea, are subject to clinician interpretation. To overcome this, toxicity grading scales such as the Common Terminology Criteria for Adverse Events (CTCAE) should be used across studies [172].
Secondly, the influence of each therapeutical agent alone is hard, if not impossible, to assess, as the great majority of patients undergo combination therapies, and many of the ADRs are shared amongst treatments. This could be due, for instance, to the backbone presence of 5-FU in most settings but could also result from a pleiotropic effect of different drugs.
Thirdly, there has been, in general, a lack of unambiguous association findings. This could be due to the abovementioned phenotypic heterogeneity, but also other factors such as study sample sizes or population stratification issues. For instance, the overwhelming majority of studies reported in this review have been performed exclusively on Caucasian populations, and there are few published works in non-Europeans. Moreover, those that have been published in Asians show considerable differences in the allelic frequencies of the variants. Therefore, validation of findings in cohorts with appropriate statistical power is essential. On this topic, an outstanding example is the Radiogenomics Consortium, which advocates for the standardisation of toxicity data collection derived from radiation treatments. They have published guidelines for STrengthening the Reporting Of Genetic Association studies in Radiogenomics (STROGAR), which allow for multi-institutional approaches towards large-scale radiotherapy patient biorepositories and databanks. Indeed, this consortium has already successfully completed several GWAS of radiotherapy toxicity [173][174][175][176][177].
Fourthly, there are no implemented guidelines for the reporting of pharmacogenetic studies. There have been recent efforts to overcome this, including a publication on the STrengthening the Reporting Of Pharmacogenetic Studies (STROPS). This work produces guidelines to standardise pharmacogenetic reporting. This could be essential for the homogenisation of pharmacogenetic data leading to improved systematic reviewing and meta-analyses, hence improving the power and applicability for pharmacogenetic associations [178].
Overall, the evidence gathered so far has brilliantly supported the relevance of pharmacogenomic testing in personalised medicine approaches. Novel genomic technologies such as GWAS and NGS offer unprecedented and affordable access to genomic information that can be assessed to discover novel pharmacogenomic variants related to toxic ADRs [179,180]. Pharmacokinetic profiling has proven to be useful for the identification of patients that might benefit from modified treatment strategies and might help improve the prediction value of genetic testing. Cost-effective analyses produced so far have validated the thought that the treatment design should be designed based on pharmacogenomic data, and that these strategies are always cost-effective vs. having to palliate toxicity issues.
Nevertheless, widespread testing is still anecdotic including in regulatory guideline recommendations. Researchers must hence make additional efforts to produce sound and relevant data that can be presented to the regulatory agencies to support pre-treatment testing. Surely, we must continue working in this direction towards a more meaningful implementation of pharmacogenomics in the routine clinical practice.

Conflicts of Interest:
The authors declare no conflict of interest.