The Somatic Mutation Landscape of UDP-Glycosyltransferase (UGT) Genes in Human Cancers

Simple Summary: The human UDP-glycosyltransferase (UGT) superfamily is involved in the metabolism of numerous anticancer drugs and endobiotic signaling molecules with pro/anti-cancer activities. Previous studies have shown abundant expression of UGT genes in many human cancers, indicative of the active intratumoral metabolism of drugs and endobiotics through the UGT conjugation pathway. Mutations of UGT genes in tumors that can affect this pathway have not yet been reported. In the present study, our analysis of somatic mutations in 10,069 tumors from 33 different cancer types identified 3427 somatic mutations in UGT genes, over half of which have been predicted to code for variant UGT proteins with no or reduced activity. As a result, somatic mutations of UGT genes may reduce the capacity of cancer cells to metabolize anticancer drugs and pro/anti-cancer endobiotics, and hence, they are likely to alter therapeutic efficacy and cancer growth, highlighting their potential utility as biomarkers predicting therapeutic efficacy and clinical outcomes.

Abstract: The human UDP-glycosyltransferase (UGT) superfamily has a critical role in the metabolism of anticancer drugs and numerous pro/anti-cancer molecules (e.g., steroids, lipids, fatty acids, bile acids and carcinogens). Recent studies have shown wide and abundant expression of UGT genes in human cancers. However, the extent to which UGT genes acquire somatic mutations within tumors remains to be systematically investigated. In the present study, our comprehensive analysis of the somatic mutation profiles of 10,069 tumors from 33 different TCGA cancer types identified 3427 somatic mutations in UGT genes. Overall, nearly 18% (1802/10,069) of the assessed tumors had mutations in UGT genes, with large variation in mutation frequency across different cancer types, ranging from over 25% in five cancers (COAD, LUAD, LUSC, SKCM and UCSC) to less than 5% in eight cancers (LAML, MESO, PCPG, PAAD, PRAD, TGCT, THYM and UVM).
All 22 UGT genes showed somatic mutations in tumors, with UGT2B4, UGT3A1 and UGT3A2 showing the largest number of mutations (289, 307 and 255 mutations, respectively). Nearly 65% (2260/3427) of the mutations were missense, frame-shift and nonsense mutations that have been predicted to code for variant UGT proteins. Furthermore, about 10% (362/3427) of the mutations occurred in non-coding regions (5′ UTR, 3′ UTR and splice sites) that may be able to alter the efficiency of translation initiation, miRNA regulation or the splicing of UGT transcripts. In conclusion, our data show widespread somatic mutations of UGT genes in human cancers that may affect the capacity of cancer cells to metabolize anticancer drugs and endobiotics that control pro/anti-cancer signaling pathways. This highlights their potential utility as biomarkers for predicting therapeutic efficacy and clinical outcomes.


Extracting Individual Somatic Mutations from the MC3 MAF File and Assigning Them to Each of the 33 TCGA Cancer Types
The MC3 MAF file lists all somatic mutations from 10,295 tumors alphabetically and numerically according to the TCGA barcodes of the tumor samples (mc3.v0.2.8.PUBLIC.maf.gz). Therefore, the entries are grouped by tumor sample and not by cancer type. We manually allocated the tumor samples from the MC3 MAF file into each of the 33 TCGA cancer types. Consistent with the primary aim of the TCGA project focusing on the study of primary tumors, the majority of samples were primary tumors; however, many cancer types also contained recurrent (BRCA, COAD, GBM, LGG, LIHC, LUAD, READ and SARC) and metastatic (BRCA, CESC, COAD, LGG, PAAD, PRAD and PCPG) tumor samples. To ensure a consistent analysis of only primary tumors within and between cancer types, all recurrent tumors were excluded from the analysis. Metastatic tumors were also excluded from the analysis for all cancer types except SKCM. The SKCM cohort contained about 75% metastatic tumors and 25% primary tumors [61,62]. In the present study, we assessed metastatic SKCM tumors (364 samples) as a sub-cohort (designated as Metastatic SKCM) as compared with primary SKCM tumors (103 samples) (designated as Primary SKCM) ( Table 1, Table S1).
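The allocation above was done manually by the authors. As a rough illustration only, the primary/recurrent/metastatic filter could be sketched from the sample-type code embedded in each TCGA sample barcode (the first two digits of the fourth barcode field). The function names and the example barcodes below are hypothetical, and treating code 03 (primary blood-derived cancer, as used for LAML) as primary is an assumption of this sketch, not a statement of the authors' procedure.

```python
# Sample-type codes from the fourth field of a TCGA sample barcode.
# Only the codes relevant to the filtering described above are listed.
SAMPLE_TYPES = {
    "01": "primary",     # primary solid tumor
    "02": "recurrent",   # recurrent solid tumor
    "03": "primary",     # primary blood-derived cancer (assumed primary here)
    "06": "metastatic",  # metastatic tumor
}


def sample_type(barcode: str) -> str:
    """Return the tumor class encoded in a sample-level TCGA barcode."""
    fields = barcode.split("-")
    if len(fields) < 4 or len(fields[3]) < 2:
        raise ValueError(f"not a sample-level TCGA barcode: {barcode!r}")
    return SAMPLE_TYPES.get(fields[3][:2], "other")


def keep_for_analysis(barcode: str, cancer_type: str) -> bool:
    """Keep primary tumors, plus metastatic tumors for SKCM only."""
    kind = sample_type(barcode)
    return kind == "primary" or (kind == "metastatic" and cancer_type == "SKCM")


# Hypothetical barcodes, for illustration only.
print(keep_for_analysis("TCGA-XX-0001-01A-11D-0001-08", "BRCA"))  # primary tumor
print(keep_for_analysis("TCGA-XX-0002-06A-11D-0001-08", "SKCM"))  # metastatic SKCM
print(keep_for_analysis("TCGA-XX-0003-06A-11D-0001-08", "BRCA"))  # metastatic, non-SKCM
```

Hypermutated tumors are not identifiable from the barcode and would need a separate per-tumor mutation count.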
Most tumors have a relatively low mutation burden; however, tumors with extremely high numbers of somatic mutations have been reported for many cancers, such as melanoma, lung, endometrial and bladder cancer [49,65-68]. This hypermutation can be caused by genetic defects (e.g., replication repair defects) [69,70], mutagen exposure (e.g., UV light and tobacco smoking) [71,72] and anticancer therapy (e.g., immune checkpoint inhibitors) [73-75]. Hypermutated tumors are very rare in cancers with a low mutation burden and account for only a very small portion of tumors in cancers with a high mutation burden. Therefore, it was necessary to exclude hypermutated tumors from the analysis to avoid their potential influence on the results [49,52]. In the present study, we excluded 53 hypermutated tumors from the analysis, including tumors from cancers with a high (COAD, SKCM, STAD and UCEC) or low (GBM, LGG, CESC, PAAD, PRAD and UCS) mutation load (Table S1). As a result, 878,185 mutations (16,569 mutations on average per tumor) from these hypermutated tumors were excluded from the analysis (Table S1).
Cancers 2022, 14, 5708
Collectively, after having excluded recurrent, metastatic (except SKCM) and hypermutated tumors, we obtained 9705 primary tumors and 364 metastatic SKCM tumors from the MC3 MAF file that together had 2,686,092 somatic mutations (Table 1). Table S1 lists the tumors that were analyzed in this study for each of the 33 TCGA cancer types.
The MC3 MAF file is a tab-delimited text file that contains comprehensive information for each mutation, including the mutated gene (Hugo_Symbol) and the positions of the mutation at the genomic (GRCh37/hg19), cDNA (Ensembl Reference Transcripts) and protein levels (Table S3). The mutations at eleven UGTs (2A3, 2B4, 2B7, 2B10, 2B11, 2B15, 2B17, 2B28, 3A1, 3A2 and UGT8) were clearly identified in the MC3 MAF file by the HUGO gene names (Hugo_Symbol). However, conflicting allocation and mis-annotation were seen in the MC3 MAF file for mutations for the remaining eleven UGT genes (nine UGT1As, UGT2A1 and UGT2A2), primarily due to the exon-sharing genomic structure among UGT1As or UGT2As [1].
The nine UGT1A (1A1, 1A3-1A10) genes have unique exon 1s and a shared set of exons 2-5 [1]. As expected, the MC3 MAF file lists mutations at the unique exon 1s for the respective UGT1As, but mutations within UGT1A1 exon 1 are mis-annotated as intronic mutations for UGT1A8. We recalculated the positions at both the cDNA and protein levels for these mis-allocated mutations based on the UGT1A1 reference sequences (RefSeq: NM_000463, NP_000454) (designated as mutations for 1A1) (Table 2). All mutations within the shared exons 2-5 affect all nine UGT1As, but these mutations are listed in the MC3 MAF file specifically as mutations for UGT1A10 (exons 2-4) or UGT1A4 (exon 5). In the present study, we reassigned these mutations to all nine UGT1As (designated as mutations for 1A E2-5) (Table 2). The DNAJB3 gene [DnaJ (Hsp40) homolog, subfamily B, member 3] is located between the first exons of UGT1A1 and UGT1A3. There are 122 mutations at DNAJB3 that are mis-annotated in the MC3 MAF file as intronic mutations for UGT1A10 (Table S4). We excluded these mutations from the analysis of this study.
Similarly, UGT2A1 and UGT2A2 have different exon 1s and a common set of exons 2-6, and therefore, mutations within exons 2-6 affect both genes [1]. As expected, the MC3 MAF file lists the mutations at the unique exon 1s as mutations for the respective UGT2A1 and UGT2A2; however, the mutations within the shared exons 2-6 are annotated in the MC3 MAF file using a variant UGT2A1 transcript (ENST00000514019, NM_001389565.1) that has the UGT2A2 exon 1 inserted between UGT2A1 exons 1 and 2. This insertion generates an extended 737-aa variant UGT2A1 protein (NP_001376494.1) as compared with the 527-aa wildtype UGT2A1 protein (NP_006789.3). In the present study, we recalculated the positions at both the cDNA and protein levels for these mis-annotated mutations based on the UGT2A1 NCBI reference sequences (RefSeq NM_006798, NP_006789). These mutations affect both UGT2A1 and UGT2A2, and hence they are listed as "2A1/2A2 E2-6" in Table 2.
After identifying the mutations for each UGT gene, we manually assessed whether the annotated positions at the genomic, cDNA and protein levels are accurate and consistent for each of the mutations based on the NCBI GRCh37/hg19 reference sequences. Through this process, we were able to verify the accuracy of annotations for all mutations, but five mutations showed conflicting cDNA and genomic positions, including one mutation from each of three genes (1A4, 1A10, 2B28) and two mutations from UGT1A5 (Table S4). We excluded these five mutations from the analysis of this study.
Collectively, we identified 1802 tumors from 33 TCGA cancer types in the MC3 MAF file that each have at least one mutation in a UGT gene (Table S2). These tumors together have 3427 somatic mutations in UGT genes that were included in the analysis of this study (Tables 1 and 2). Table S3 lists the mutations in UGT genes for each of the 33 TCGA cancer types. Table S4 lists the mutations for each of the 22 UGT genes.
To assess whether mutations in UGT genes are correlated with mutations in the tumor suppressor gene TP53, we determined the numbers of mutations in TP53 for each of the 33 TCGA cancer types (Table S4).
The MC3 MAF file (Variant_Classification column) classifies mutations into at least 13 different types according to the positions and nature of the mutations, including (1) mutations in 5′ or 3′ untranslated regions (5′ UTR and 3′ UTR), (2) mutations in coding regions (translation_start_site, missense, nonsense, silent, nonstop_mutation, frame_shift_del, frame_shift_ins, in_frame_del and in_frame_ins) and (3) mutations within introns and splice sites. The MC3 MAF file includes an assessment of all missense mutations by the SIFT (Sorting Intolerant From Tolerant) algorithm, which uses amino acid sequence homology to predict whether a missense substitution affects protein function, and it classifies missense substitutions as tolerated or deleterious [76,77]. Table 2 lists the number of each type of mutation for each of the 22 UGT genes, including the number of deleterious missense mutations.
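The three-way grouping of mutation types described above can be written down as a simple lookup. This is a minimal sketch: the value spellings follow the public MAF specification as commonly used (e.g., "Missense_Mutation", "Frame_Shift_Ins") and should be checked against the actual file; the group labels and function names are hypothetical.

```python
from collections import Counter

# The 13 Variant_Classification values grouped as in the text.
# Spellings assumed from the public MAF specification; verify against the file.
REGION_BY_CLASS = {
    "5'UTR": "untranslated",
    "3'UTR": "untranslated",
    "Translation_Start_Site": "coding",
    "Missense_Mutation": "coding",
    "Nonsense_Mutation": "coding",
    "Silent": "coding",
    "Nonstop_Mutation": "coding",
    "Frame_Shift_Del": "coding",
    "Frame_Shift_Ins": "coding",
    "In_Frame_Del": "coding",
    "In_Frame_Ins": "coding",
    "Intron": "intron_or_splice",
    "Splice_Site": "intron_or_splice",
}


def region_of(variant_classification: str) -> str:
    """Map one Variant_Classification value to its broad region group."""
    return REGION_BY_CLASS.get(variant_classification, "other")


def tally_regions(classifications) -> Counter:
    """Count mutations per broad region group."""
    return Counter(region_of(c) for c in classifications)


# Illustrative input only, not data from the study.
example = ["Missense_Mutation", "Silent", "3'UTR", "Splice_Site", "Missense_Mutation"]
print(tally_regions(example))
```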
Using the Clustal Omega multiple sequence alignment program from the EMBL-EBI sequence analysis toolkit [78], we identified amino acids that are conserved across the UGT enzyme family and subfamilies, and we annotated mutations that affect conserved amino acids in multiple UGT proteins. To further highlight the mutations (i.e., missense, nonsense, nonstop and small indels) within coding sequences that may affect enzyme function, we mapped them at the cDNA and protein levels to clearly show their distribution throughout the coding regions and potential mutation hotspot regions.

Assessment of Somatic Mutations of UGT Genes in Human Cancer Cell Lines
The Cancer Cell Line Encyclopedia (CCLE) project comprehensively characterizes the molecular profiles of over 1000 human cancer cell lines (Broad, 2019), including the mutation profiles (CCLE_mutations.csv) for 18,784 human genes in 1771 human cancer cell lines using whole exome sequencing [54,79]. The resulting dataset is available from the DepMap Public 22Q2 release via the CCLE DepMap portal (https://depmap.org/portal) (accessed on 1 August 2022). This dataset is further elaborated with additional annotations for every mutation in the cBio cancer genomics portal (cBioPortal) (https://www.cbioportal.org) (accessed on 1 August 2022) [80]. We obtained the mutations of all UGT genes except for UGT2A2 exon 1 from the cBioPortal (Table S5). In the cBioPortal, mutations in the exons 2-6 shared by UGT2A1 and UGT2A2 are annotated using the UGT2A2 reference sequence (ENST00000457664). Mutations in UGT2A2 exon 1 were found in the DepMap dataset, but they are annotated based on the same variant UGT2A1 transcript (ENST00000514019) as described above for the TCGA MC3 MAF file (Table S5). We recalculated the positions at the cDNA and protein levels for these mutations based on the UGT2A2 NCBI reference sequences (RefSeq NM_006798 and NP_006789). Silent mutations are listed in the DepMap portal dataset but are not present in the cBioPortal dataset. Neither the DepMap nor the cBioPortal dataset contains mutations in untranslated regions (5′ UTR and 3′ UTR). Collectively, our analysis of 1568 CCLE cell lines identified 895 mutations in UGT genes (Tables 3 and S5).
1A E2-5: exons 2-5 shared by all nine UGT1As. 2A1/2A2 E2-6: exons 2-6 shared by UGT2A1 and UGT2A2. The number in brackets refers to the number of mutations that were also seen in TCGA tumors.

Statistical Analysis
The potential correlation between the number of mutations in all genes and the number of mutations in UGT genes per tumor across the 33 different TCGA cancer types was assessed by Spearman rank correlation analysis using GraphPad Prism (version 9.1.1) (GraphPad Software, San Diego, CA, USA). A p value of <0.05 was considered statistically significant.
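The analysis itself was run in GraphPad Prism. For readers who want to see what the statistic computes, the following is a minimal pure-Python sketch of the Spearman coefficient (the Pearson correlation of average ranks). The function names and the input values are hypothetical, and the p value is not computed here.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(vx * vy)


# Hypothetical per-cancer-type values: mutations per tumor in all genes
# and in UGT genes. These are placeholders, not values from Table 1.
all_genes = [15, 40, 90, 300, 1200]
ugt_genes = [0.02, 0.05, 0.04, 0.40, 1.50]
print(round(spearman_r(all_genes, ugt_genes), 3))
```

Because the coefficient depends only on ranks, any monotonic relationship between the two per-tumor counts yields a value close to 1, regardless of the scale of the counts.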

Somatic Mutations of Protein-Coding Genes in Human Cancers
Using the MC3 MAF file, our analysis of the mutation profiles of 10,069 tumors identified 2,686,092 somatic mutations within the exonic sequences of human protein-coding genes (Table 1). Table 1 lists the number of tumors assessed as well as the total number of somatic mutations identified for each of the 33 different cancer types. The number of mutations varied greatly across cancer types. Metastatic SKCM and PCPG had the highest (1204) and lowest (15) number of mutations per tumor, respectively (Table 1). These data are consistent with previous studies that have identified cancer types with a high mutation burden (BLCA, COAD, LUAD, LUSC, SKCM, STAD and UCEC) and those with a relatively low mutation burden (KICH, LAML, LGG, MESO, PCPG, PAAD, PRAD, TGCT, THCA, THYM and UVM) [49,65-68]. A genome-wide analysis of the MC3 somatic mutations was recently reported [49]. In the present study, we focused on the analysis of somatic mutations in UGT genes, as described in detail below.

Summary
This section briefly summarizes our overall findings on the somatic mutation landscape of the UGT gene superfamily. Detailed descriptions of the different types of mutations found in individual UGT genes are provided in subsequent sections.
Of the assessed 10,069 tumors, 1802 tumors (17.9%) had at least one UGT gene mutation (Table 1). Together, these tumors had 3427 somatic mutations in UGT genes. Table 1 lists the number of tumors and the total number of mutations in UGT genes for each of the 33 cancer types. Overall, the total number of mutations in UGT genes per tumor varied widely across different cancer types and was positively correlated with the total number of mutations in all genes per tumor across cancer types (Spearman rank correlation analysis: r = 0.939; p < 0.0000001) (Figure 1A). As described below, missense and silent mutations were the two most common types of mutations in UGT genes in TCGA tumors. We showed a positive correlation between the numbers of missense or silent mutations per tumor in UGT genes and in all genes across cancer types (Figure 1B,C). Collectively, these results indicate that the mutation rates of UGT genes in different types of cancers were largely determined by the differing mutation burdens of cancer types, as described in detail below.
TP53 is a tumor suppressor gene that is frequently mutated in TCGA tumors [81]. We showed a positive association between the total numbers of mutations per tumor in the TP53 gene and in UGT genes (Figure 1D). However, it remains to be investigated whether TP53 influences UGT mutations or overall mutation burdens.

Introduction
The human UDP-glycosyltransferase (UGT) superfamily comprises four subfamilies (UGT1, UGT2, UGT3 and UGT8) that code for 22 functional UGT enzymes. These enzymes conjugate small lipophilic compounds with UDP-sugars to generate water-soluble products, thus facilitating their excretion from the body [3]. The nine UGT1 (1A1, 1A3-1A10) and ten UGT2 (2A1, 2A2, 2A3, 2B4, 2B7, 2B10, 2B11, 2B15, 2B17 and 2B28) enzymes primarily use UDP-glucuronic acid to conjugate therapeutic drugs and numerous endogenous (e.g., steroid hormones, bile acids, bilirubin and fatty acids) and exogenous (e.g., carcinogens, dietary constituents and environmental toxins) compounds and are traditionally termed UDP-glucuronosyltransferases [3-5]. By contrast, the UGT3 and UGT8 enzymes use differing UDP-sugars as donors, including UDP-N-acetylglucosamine (UGT3A1), UDP-glucose/UDP-xylose (UGT3A2) and UDP-galactose (UGT8).
The impact of mutations on protein function depends on the positions and nature of the mutations. The MC3 MAF file classified mutations into 13 different types, as described in the Materials and Methods Section [49]. All these types of mutations except "In_Frame_Ins" were found in UGT genes (Table 2). Table 2 lists the number of each type of mutation for each of the 22 UGT genes. Of the 3427 mutations in UGT genes, 3065 mutations were found in coding regions and were subclassified into eight different types of mutations (Table 2). Briefly, nearly a quarter of the mutations (754/3065) were silent mutations (synonymous mutations) that do not alter protein sequences. Nearly two-thirds of the mutations (1998/3065) were missense mutations resulting in amino acid substitutions, approximately 55% (1099/1998) of which were defined by the SIFT algorithm as deleterious amino acid substitutions with a significant impact on UGT function [76]. Nonsense mutations resulting in premature stop codons accounted for about 6% (184/3065) of the mutations. Approximately 4% of the mutations (127/3065) were small deletions and insertions (indels) that code for frame-shifted truncated proteins (frame_shift_del, frame_shift_ins) or variant proteins with small internal deletions (in_frame_del). Finally, one mutation each within the start (UGT1A6) and stop (UGT2B15) codons was also observed. Mutations that introduce premature stop codons may lead to nonsense-mediated mRNA decay or may encode truncated proteins. Truncated UGTs generally have no transferase activity but might act as dominant negative regulators repressing UGT activity [82-84].
Although the MC3 project focused on the analysis of mutations in coding exonic regions of human protein-coding genes, we found that approximately 10% of the mutations in UGT genes (362/3427) occurred in untranslated regions (5′ UTR, 3′ UTR), introns and splice sites. These mutations do not change UGT protein sequences, but they may affect the splicing, stability and translation initiation of UGT transcripts, as described in detail below.
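The counts quoted above can be cross-checked with a few lines of arithmetic. This sketch uses only numbers stated in the text; the count of 2 for start/stop codon mutations assumes one mutation in each codon, which is what makes the coding total come to 3065, and the dictionary keys are labels chosen for illustration.

```python
# Counts of coding-region mutations in UGT genes, as stated in the text.
coding = {
    "silent": 754,
    "missense": 1998,
    "nonsense": 184,
    "small_indels": 127,
    "start_or_stop_codon": 2,  # assumed: one start-codon and one stop-codon mutation
}
noncoding = 362            # UTRs, introns and splice sites
deleterious_missense = 1099

total_coding = sum(coding.values())
total = total_coding + noncoding

for name, count in coding.items():
    print(f"{name}: {count}/{total_coding} = {count / total_coding:.1%}")
print(f"non-coding: {noncoding}/{total} = {noncoding / total:.1%}")
print(f"SIFT-deleterious missense: {deleterious_missense / coding['missense']:.1%}")
```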

Mutations in the UGT1A Subfamily Genes
The UGT1A subfamily contains nine genes (1A1, 1A3-1A10) which have unique first exons and a shared set of exons 2-5 [1]. We found 87 mutations within exons 2-5 and 68 to 96 mutations in the individual exon 1s in TCGA tumors (Tables 2 and S4). More than half of the mutations in exons 2-5 ( Figure 2) and nine unique exon 1s (Figures S1-S9) were missense, nonsense or small indel mutations that result in amino acid substitutions or generate truncated proteins. Mutations in exons 2-5 affect all nine UGT1A enzymes; however, mutations in the first exon only affect the corresponding UGT1A enzyme.
Most mutations in UGT1A genes generally occurred randomly throughout the coding sequences (Figures 2, S1-S9). However, mutation hotspots were also observed. For example, there were 11 different mutations in a 23 bp region between nucleotides 663 and 685 of the UGT1A9 exon 1 ( Figure S8).
The nine unique UGT1A exon 1s encode the N-terminal half (284-288 amino acids) of the UGT1A proteins. Using the Clustal Omega program, we identified 66 conserved amino acids within this region across all nine UGT1As. We identified two conserved amino acids (Pro152 and Arg257, numbered as in the UGT1A1 protein sequence) whose codons were mutated in seven UGT1A genes, generating missense or nonsense mutations (Figure 3A).

Figure 2.
Mutations within the coding region of the shared UGT1A exons 2-5 in TCGA tumors. Data shown are the NCBI reference sequence (4 exons, 738 bp) for UGT1A exons 2-5 with genomic positions (GRCh37/hg19) indicated at the right, and the positions at cDNA (at left) and protein (above sequence) levels for exons 2-4 (defined by the UGT1A10 reference sequences: NM_019075.4 and NP_061948.1) and exon 5 (defined by the UGT1A4 reference sequences: NM_007120.3 and NP_009051.1). Mutations (missense, nonsense and small indels) and the resulting changes at the protein levels are indicated above the reference sequence. Recurrent mutations are indicated by # (twice), $ (three times) and ∧ (six times).
Mutations generally occurred randomly throughout the coding sequences of the three UGT2A genes; however, within the shared UGT2A1/2A2 exons 2-6, exons 2 and 6 appear to have more mutations than the three other exons and may represent mutation hotspots (Figure 4).

Mutations in the UGT3 Subfamily Genes
The UGT3 subfamily comprises UGT3A1 and UGT3A2 [1]. We found 307 and 255 somatic mutations in TCGA tumors in UGT3A1 and UGT3A2, respectively (Tables 2 and S4). We found that 53% of the mutations (165/307) in UGT3A1 and 62% of the mutations (159/255) in UGT3A2 result in amino acid substitutions (missense), premature stop codons (nonsense) or frame-shift truncated proteins (small indels) (Figures S20 and S21, Table 2). Approximately 45% of the missense mutations in UGT3A1 and UGT3A2 were SIFT-defined as deleterious amino acid substitutions with a significant impact on protein function ( Table 2). Most mutations occurred randomly throughout the two UGT3A genes; however, mutation hotspots with multiple mutations clustered together were also observed. Examples of such hotspots include (1) six mutations within an 11 bp region in UGT3A1 (c.1396-1406) ( Figure S20) and (2) six mutations within a 16 bp region in UGT3A2 (c.17-32) ( Figure S21).

Mutations in the UGT8 Gene
We found 140 somatic mutations in UGT8 in TCGA tumors, nearly 70% (97/140) of which were missense, nonsense and small indels that result in amino acid substitutions, premature stop codons and frame-shift truncated proteins, respectively (Table 2, Figure S22). Approximately half (47/89) of the missense mutations were defined by SIFT to be deleterious amino acid substitutions with a significant effect on protein function. Mutations were randomly distributed across the UGT8 gene ( Figure S22).

Mutations in the 5′ UTRs of UGT Genes
The Kozak sequence [GCCGCC(A/G)CCAUGG], in which the A of the start codon AUG is designated as position +1, surrounds the start codon and is critical for translation initiation (Figure 6) [85-87]. An A or G in the −3 position and a G in the +4 position represent the optimal Kozak motif (Figure 6). Variations at all other positions have no or only a weak impact on translation initiation [85]. Nine (i.e., 1A1, 1A3, 1A4, 1A5, 1A6, 2B10, 2B28, 3A1 and 3A2) of the 22 UGT genes have the optimal Kozak motif with an A or G at −3 and a G at +4 (Figure 6). In this study, we found 99 mutations in the 5′ UTRs of UGT genes, 18 of which were located within the Kozak sequence with potential influence on translation initiation (Figure 6, Tables 2 and S6). For example, the conserved G at +4 in UGT2B28 was mutated to T, which might reduce translation efficiency (Figure 6). Although the Kozak sequence does not generally extend beyond the G in position +4, a C at position +5 is highly conserved in eukaryotic genes [88]. Consistent with this, 17 of the 22 UGT genes have a C at position +5 (Figure 6). Furthermore, the presence of a U at position +5 was shown to negate the effect of a G at position +4 [86]. Therefore, mutations that change a U at position +5 to any of the three other bases may enhance translation efficiency. The Kozak sequences of two UGT genes (UGT3A1, UGT2A1) have a G at position +4 and a U at position +5, indicative of a weak element (Figure 6). We found a mutation within the UGT3A1 Kozak sequence that changed the U to C at position +5, thus possibly enhancing translation efficiency (Figure 6). Finally, several UGT genes had mutations at positions −10 (1A1), +7 (1A4, 2B4, 2B17), +8 (2B7), +9 (2B4, 2B7, 2B15, 2B17, 2B28) and +10 (1A5, 2B10) that are not within but are adjacent or close to the Kozak sequence (Figure 6). As previously reported [85-87], these mutations likely have no or only weak effects on translation initiation.
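The position rules described above (A or G at −3, G at +4, and the modifying effect of U at +5) can be expressed as a small check. This is only a sketch of the stated rules, not a predictor of translation efficiency; the function name `kozak_context`, its parameters and the example sequences are hypothetical and are not taken from any UGT gene.

```python
def kozak_context(seq: str, atg_index: int) -> dict:
    """Report the bases at Kozak positions -3, +4 and +5.

    seq is an mRNA or cDNA sequence written 5' to 3'; atg_index is the
    0-based index of the A of the start codon (position +1). There is no
    position 0, so -3 is three bases upstream of the A.
    """
    s = seq.upper().replace("T", "U")
    if s[atg_index:atg_index + 3] != "AUG":
        raise ValueError("atg_index does not point at a start codon")
    if atg_index < 3 or atg_index + 4 >= len(s):
        raise ValueError("sequence must cover positions -3 to +5")
    minus3 = s[atg_index - 3]
    plus4 = s[atg_index + 3]
    plus5 = s[atg_index + 4]
    return {
        "minus3": minus3,
        "plus4": plus4,
        "plus5": plus5,
        # Optimal motif as defined in the text: A/G at -3 and G at +4.
        "optimal": minus3 in "AG" and plus4 == "G",
        # U at +5 was reported to negate the effect of G at +4.
        "u_at_plus5": plus5 == "U",
    }


# Illustrative sequences only.
consensus = "GCCGCCACCAUGGC"          # consensus context, C at +5
plus4_mutant = "GCCGCCACCAUGUC"       # G>U at +4
print(kozak_context(consensus, 9))
print(kozak_context(plus4_mutant, 9))
```

Comparing the reports for the reference and mutated sequence shows directly whether a substitution removes the optimal motif or changes the base at +5.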

Mutations in the 3′ UTRs of UGT Genes
Fourteen of the 22 UGT mRNAs are known to be regulated by at least one miRNA via binding to their 3′ UTRs [89]. In this study, we found 182 somatic mutations in the UGT 3′ UTRs in TCGA tumors (Tables 2 and S7). Mutations within known miRNA target sites may affect miRNA regulation. Examples of such mutations include (1) two mutations in the UGT1A 3′ UTR (*70A > T, *74T > A) within the seed target site that is shared by miR-200a-3p and miR-141-3p (Figure 7A) and (2) two mutations in the UGT2B4 3′ UTR (*83G > T, *83G > A) within the miR-216b-5p seed target site (Figure 7B). As miRNAs regulate target mRNAs primarily via the binding of their seed to the seed target site [89], these mutations are likely to disrupt this binding with a significant impact on miRNA regulation. Furthermore, the pairing of the 3′ sequence of the miRNA to the 5′ sequence of the target site (3′ pairing) can also facilitate miRNA regulation [88]. We found many mutations that are located outside seed target sites but within the 5′ sequences of known miRNA target sites in the UGT2B7 (Figure S23A) and UGT2B15 (Figure S23B) 3′ UTRs. The potential impact of these mutations on miRNA regulation remains to be investigated.

Mutations in the Splice Sites of UGT Genes
Most canonical exons of human genes have a conserved acceptor splice site with the dinucleotide "AG" at the 5′-end and a conserved donor splice site with the dinucleotide "GT" at the 3′-end; therefore, mutations in splice sites, especially those within the dinucleotides AG and GT, can disrupt pre-mRNA splicing, leading to exon skipping or intron inclusion [90]. In this study, we found 26 mutations in the donor splice sites and 19 mutations in the acceptor splice sites of 12 UGT genes (1A10, 2A1, 2A2, 2A3, 2B4, 2B7, 2B10, 2B11, 2B15, 2B28, 3A1 and 3A2) in TCGA tumors (Tables 2 and S4). Of these 45 mutations, 38 occurred at the G base within the AG or GT dinucleotide, which was mutated to A (25 mutations), T (7 mutations) or C (4 mutations). Therefore, these mutations abolished the conserved dinucleotide AG or GT of splice sites and likely disrupted the splicing of the relevant exons (Table S4).
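The dinucleotide rule can be illustrated with a short check. The text describes the sites relative to the exon (AG immediately upstream, GT immediately downstream); the sketch below takes the equivalent intron-centric view, in which a canonical intron begins with the donor GT and ends with the acceptor AG. The function names and the intron sequence are hypothetical.

```python
def canonical_splice_sites(intron: str) -> dict:
    """Check an intron (sense strand, 5' to 3') for GT...AG boundaries."""
    s = intron.upper()
    if len(s) < 4:
        raise ValueError("intron too short to hold both dinucleotides")
    return {"donor_ok": s[:2] == "GT", "acceptor_ok": s[-2:] == "AG"}


def substitute(seq: str, index: int, base: str) -> str:
    """Return seq with the base at the 0-based index replaced."""
    if not 0 <= index < len(seq):
        raise IndexError("index outside sequence")
    return seq[:index] + base + seq[index + 1:]


# Illustrative intron, not a UGT sequence.
intron = "GTAAGTCCCTTTTCAG"
print(canonical_splice_sites(intron))
# G>A at the first intronic base abolishes the donor GT.
print(canonical_splice_sites(substitute(intron, 0, "A")))
# G>T at the last intronic base abolishes the acceptor AG.
print(canonical_splice_sites(substitute(intron, len(intron) - 1, "T")))
```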

Recurrent Mutations in UGT Genes
We found 215 recurrent mutations, accounting for a total of 519 mutations in UGT genes in TCGA tumors (Table S8). Table S8 lists the recurrent mutations for each of the 22 UGT genes, including 163, 31, 11, 6, 3 and 1 recurrent mutations that occurred 2, 3, 4, 5, 6 and 8 times, respectively. Nearly 80% (171/215) were missense, nonsense and small indels that result in amino acid substitutions, premature stop codons and frame-shift truncated proteins, respectively. Three UGT genes (2B4, 3A2 and 2B10) had the largest numbers of recurrent mutations (29, 20 and 15, respectively). Recurrent mutations generally occurred in more than one cancer type. For example, the "Frame_Shift_Del" mutation [c.517delT (Trp173GlyfsTer8)] in UGT1A4 exon 1 was observed in three different types of cancers, including one COAD tumor, two UCEC tumors and five STAD tumors (Table S4). Another "Frame_Shift_Del" mutation [c.1566delA (Arg524GlufsTer22)] in UGT1A exon 5 was also seen in three different types of cancers, including one BRCA tumor, three COAD tumors and two STAD tumors (Table S4). In contrast, some recurrent mutations were restricted to a specific cancer type. For example, the "Frame_Shift_Del" mutation [c.364delT (Ser122GlnfsTer12)] was only seen in three UCEC tumors; the missense mutation [c.463C > T (Pro155Ser)] occurred only in five SKCM tumors (Table S4).
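The recurrence figures above are internally consistent, which can be confirmed with a short calculation. The numbers are those stated in the text; the variable names are chosen for illustration.

```python
# Number of distinct recurrent mutations, keyed by how many times each occurred.
recurrence = {2: 163, 3: 31, 4: 11, 5: 6, 6: 3, 8: 1}

distinct_mutations = sum(recurrence.values())
total_occurrences = sum(times * count for times, count in recurrence.items())
protein_altering_fraction = 171 / distinct_mutations

print(distinct_mutations)                   # distinct recurrent mutations
print(total_occurrences)                    # total mutation events they account for
print(f"{protein_altering_fraction:.1%}")   # missense, nonsense and small indels
```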

Assessment of Associations of UGT Mutations with Clinicopathological Parameters Using the LUAD Cohort
The LUAD cohort had 512 tumors with a total of 243,687 somatic mutations, of which 235 tumors had UGT mutations with a total of 412 somatic mutations in UGT genes (Table 1). The LUAD cohort therefore represents a good model cancer type to assess whether UGT mutations are associated with clinicopathological parameters. All raw data used for analysis in this section are provided in Table S9. We obtained clinicopathological parameters such as tumor stages for 488 tumors and overall survival (OS) times for 495 patients from the TCGA Pan-Cancer Clinical Data Resources (TCGA-CDR), as we recently reported [39]. Figure 8 shows the numbers of tumors with or without UGT mutations at four different tumor stages (I, II, III and IV). Chi-squared tests showed that there was no significant difference in the mutation frequency of UGT genes across different stages (p = 0.25) (Figure 8). This indicates that mutations in UGT genes were not related to tumor stage.
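For readers who wish to reproduce this type of comparison, a chi-squared test of independence on a 2 x 4 table (mutated vs. unmutated tumors across four stages) can be sketched as follows. This is a minimal illustration and not the authors' code; the counts in the usage example are hypothetical placeholders, not the values from Figure 8 or Table S9. With four stages the test has three degrees of freedom, for which the p-value has a closed form.

```python
import math

def chi_squared_2xk(mutated, unmutated):
    """Pearson chi-squared statistic for a 2 x k contingency table."""
    total_mut, total_unmut = sum(mutated), sum(unmutated)
    grand = total_mut + total_unmut
    stat = 0.0
    for m, u in zip(mutated, unmutated):
        column_total = m + u
        for observed, row_total in ((m, total_mut), (u, total_unmut)):
            expected = row_total * column_total / grand
            stat += (observed - expected) ** 2 / expected
    return stat

def p_value_df3(stat):
    """Upper-tail probability of a chi-squared variable with 3 degrees of freedom."""
    return (math.erfc(math.sqrt(stat / 2))
            + math.sqrt(2 * stat / math.pi) * math.exp(-stat / 2))

# Hypothetical counts per stage (I, II, III, IV), for illustration only
mutated = [120, 60, 40, 10]
unmutated = [140, 60, 45, 13]
stat = chi_squared_2xk(mutated, unmutated)
print(round(stat, 3), round(p_value_df3(stat), 3))
```

A p-value above the chosen significance threshold would be read, as in the text, as no detectable difference in mutation frequency across stages.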
We recently reported highly variable expression of UGT genes in LUAD tumors and a lack of expression of one or multiple UGT genes in many LUAD tumors [39]. Mutations in UGT genes might have an impact on clinicopathological parameters only if they occur in tumors that express the corresponding UGT gene. We focused on the analysis of 165 LUAD tumors with a total of 267 mutations in UGT genes, such as missense mutations, nonsense mutations or small indels, that are predicted to encode mutated proteins. We obtained the expression levels (RSEM) of all UGT genes in these tumors, as recently reported [39]. We found that 31% of UGT mutations (83/267) occurred in tumors that were previously classified to have a high expression of the corresponding UGT genes [39] (Table S9). However, this analysis also revealed differences in the overall levels of UGT expression between mutated and unmutated tumor groups (Table S9), suggesting that any comparison of clinicopathological features and clinical outcomes (e.g., survival time) between these groups could be confounded by differing UGT expression levels. For this reason, we did not attempt to perform survival analyses comparing the mutated and non-mutated cohorts within any cancer type.

Mutations in UGT Genes in Human Cancer Cell Lines
After having characterized the mutations of UGT genes in human cancers, we assessed the mutation profiles of UGT genes in 1568 CCLE cell lines. Overall, we found 895 mutations in UGT genes in 502 CCLE cell lines (Tables 3 and S5). Table 3 lists the number of mutations for each of the eight different types of mutations found in CCLE cell lines: (1) translation_start_site, (2) missense, (3) nonsense, (4) frame_shift_del, (5) frame_shift_ins, (6) in_frame_del, (7) nonstop and (8) splice site. Of the 895 mutations, 728 (81%) were missense mutations, 337 (45%) of which were SIFT-defined as deleterious amino acid substitutions with a significant impact on protein function.
Of the 502 CCLE cell lines with mutations in UGT genes, 172 had multiple mutations in UGT genes (Table S5). Briefly, there were 89, 48, 15, 5, 4 and 4 cell lines that had 2, 3, 4, 5, 6 and 7 mutations in UGT genes, respectively. Seven other cell lines that were possibly derived from hypermutated tumors had 10 (COLO792), 11 (HCC2998, SNU1040), 12 (GP5D, MEWO, SNU81) or 15 (SW684) mutations in UGT genes. Multiple mutations within a single cell line were usually distributed across several UGT genes, although some cell lines showed mutations clustered in a single UGT gene (Table S5).
Of the 895 mutations in UGT genes in CCLE cell lines, 150 were recurrent mutations (Table S5). There were 48, 11, 1, 1 and 2 mutations in UGT genes that recurred in 2, 3, 4, 5 and 6 different CCLE cell lines, respectively. The majority of CCLE cell lines with the same recurrent mutations in UGT genes were derived from different types of tumors. For example, the mutation c.518T > G (Leu173Arg) in UGT1A9 was observed in five CCLE cell lines (JHOM1, LUDLU1, CAPAN1, TCCSUP and HT29) that were derived from five different types of cancers (ovarian cancer, non-small cell lung cancer, pancreatic cancer, bladder cancer and colorectal cancer, respectively) (Table S5).
A comparison of the 3427 mutations in TCGA tumors and the 895 mutations in CCLE cell lines identified 114 mutations in UGT genes that were found in both TCGA tumors and CCLE cell lines (Tables 2, 3 and S10). Nearly one-third of these mutations were present in CCLE cell lines derived from the same types of tumors that shared the mutations, suggesting that they might be derived from the parental tumors. Table S10 shows the number of mutations for every UGT gene that occurred in both TCGA tumors and CCLE cell lines. For example, 10 of the 36 mutations in UGT1A7 in the CCLE cell lines were also found in TCGA tumors (Table S10).
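The comparison described above amounts to intersecting two sets of mutation calls. The following Python sketch shows one way this could be done, assuming each call is keyed by gene and cDNA change; the key choice, function names and toy records are assumptions for illustration and do not reproduce the data in Table S10.

```python
from collections import Counter

def shared_mutations(tcga_calls, ccle_calls):
    """Return the set of (gene, cdna_change) keys present in both call sets."""
    tcga_keys = {(gene, change) for gene, change in tcga_calls}
    ccle_keys = {(gene, change) for gene, change in ccle_calls}
    return tcga_keys & ccle_keys

def shared_per_gene(shared):
    """Count shared mutations for each gene."""
    return Counter(gene for gene, _change in shared)

# Toy records with placeholder gene names and changes (not real data)
tcga = [("GENE_A", "c.1A>T"), ("GENE_A", "c.2C>G"), ("GENE_B", "c.5G>A")]
ccle = [("GENE_A", "c.1A>T"), ("GENE_B", "c.5G>A"), ("GENE_B", "c.9T>C")]

shared = shared_mutations(tcga, ccle)
print(sorted(shared))
print(dict(shared_per_gene(shared)))
```

Keying on the gene together with the cDNA change treats the same nucleotide change in two different genes as distinct events; a genomic-coordinate key would be an alternative design.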

Discussion
We recently showed abundant expression of UGT genes in human cancers and their association with clinical outcomes, highlighting the importance of the intratumoral metabolism of drugs and pro/anti-cancer signaling molecules through the UGT conjugation pathway [39]. Somatic mutations in UGT genes in the tumor that could influence this pathway have not yet been reported. In the present study, our assessment of the mutation profiles of 10,069 tumors from 33 TCGA cancer types revealed for the first time the somatic mutation landscape for all 22 UGT genes in human cancers. Briefly, nearly one-fifth of the tumors analyzed had mutations in UGT genes, with a total of 3427 somatic mutations. Most mutations occurred sporadically throughout the coding sequences of UGT genes, but recurrent mutations and mutation hotspot regions were also observed. The impact of mutations on protein function depends on the position and type of mutation. Approximately two-thirds of the mutations in UGT genes in tumors were missense, frame-shift and nonsense mutations that may directly affect UGT function via coding for variant or truncated proteins. However, this direct impact may only occur in the tumors that express the mutated UGT proteins. Our analysis of the LUAD tumors indicates that approximately 31% of these mutations occurred in the tumors that expressed the corresponding UGT genes. Mutations in non-coding regions do not alter protein sequence but may indirectly influence UGT function through modulating the efficiency of translation initiation (mutations within the Kozak sequence in 5′ UTRs), disrupting miRNA regulation (mutations in 3′ UTRs) or altering the pre-mRNA splicing process (mutations in splice sites). Collectively, somatic mutations occurred throughout the exonic sequences of UGT genes with a potential impact on local UGT activity within the tumor through multiple mechanisms.
Cancer genomes acquire somatic mutations during cancer development and progression that are generally classified into driver and passenger mutations [91][92][93][94]. Driver mutations contribute to cancer initiation or promote tumor growth; passenger mutations accumulate through tumor evolution with no or even detrimental effects on tumor growth [93,95]. On average, every cancer genome has 4-5 driver mutations, with the vast majority of mutations being passenger mutations [96]. Genes with driver mutations in at least one cancer type are considered to be cancer driver genes [93,97]. A recent PanCancer and PanSoftware analysis of the MC3 somatic mutations in 9423 tumors from 33 TCGA cancer types identified 299 cancer driver genes [52]. UGT genes were not among these cancer driver genes, and driver mutations in UGT genes have not yet been reported in other similar studies [52,96,98]. However, more than half of our observed somatic mutations in UGT genes are predicted to code for truncated proteins or variant proteins with deleterious amino acid substitutions. Given that UGT proteins dimerize, and truncated inactive forms have been shown to act in a dominant-negative manner, it is possible for even heterozygous mutations of UGT genes to lead to a significant loss of UGT function [82][83][84]. Our findings therefore support the potential role of UGT somatic mutations in modulating cancer growth and treatment, as described in detail below.
Numerous drugs and their active metabolites are UGT substrates, including chemotherapy drugs such as etoposide, epirubicin and irinotecan [5,16,99,100]. Mutations in UGT genes in the tumor that reduce the glucuronidation of anticancer drugs can increase intratumoral drug concentrations, thus potentially enhancing therapy efficacy and inhibiting tumor growth. For example, irinotecan is commonly used for treating colorectal cancer (COAD), and its active metabolite, SN-38, is primarily glucuronidated by UGT1A1 with weak activity from all other UGT1As except UGT1A4 [101][102][103][104][105]. We recently showed the expression of all nine UGT1As at varying levels in COAD tumors, suggesting that there is in situ glucuronidation of SN-38 within the tumor [39]. In the present study, we found seven COAD tumors with a frame-shift [c.1566delA (Arg524GlufsTer22)] or deleterious amino acid substitution mutation in the shared UGT1A  (Table S4). Treatments with irinotecan have been reported for all these six cancer types, suggesting that somatic mutations in UGT1A exons 2-5 could modulate irinotecan efficacy in these cancers [106][107][108][109][110]. As mentioned earlier, germline genetic polymorphisms such as low activity UGT1A1 alleles result in high systemic SN-38 levels due to reduced hepatic clearance and hence increase the risk of hematological and gastrointestinal toxicity following irinotecan administration [18]. This raises the possibility of genotyping tumors for deleterious somatic UGT1A mutations as a strategy to identify patients that may show greater irinotecan efficacy, including at lower doses that reduce the risk of toxicity.
Another example of UGT somatic mutations that may have an impact on drug responses relates to the treatment of estrogen receptor-positive breast cancers with antiestrogens such as tamoxifen (TAM) and aromatase inhibitors such as exemestane (EXE) [111]. The active metabolites of TAM and EXE are 4-OH-TAM and 17-OH-EXE, which are glucuronidated by UGT2B15 and UGT2B17, respectively [112,113]. We found BRCA tumors with deleterious somatic mutations in UGT2B15 and UGT2B17 (Table S4). It is anticipated that these mutations could decrease the glucuronidation of 4-OH-TAM or 17-OH-EXE within the tumor, potentially enhancing drug efficacy and inhibiting breast tumor growth.
In addition to drugs, numerous endogenous (e.g., fatty acids, bile acids, bilirubin and steroid hormones) and exogenous (e.g., dietary constituents, carcinogens) pro/anti-cancer molecules are UGT substrates [5]. For example, estrogens contribute to breast carcinogenesis and promote breast cancer growth [114]. Six UGTs (1A1, 1A3, 1A8, 1A9, 1A10, 2B7) that are expressed in breast cancer have been shown to conjugate estrogens, such as estrone (E1) and 17β-estradiol (E2) [39,115]. In the present study, we found four BRCA tumors with a frame-shift [c.1566delA (Arg524GlufsTer22)] or deleterious amino acid substitution mutation [c.1174G > T (Gly392Cys), c.1175G > T (Gly392Val), c.1326G > A (Met442Ile)] within the shared UGT1A exons 2-5. A further 14 BRCA tumors had similar mutations that specifically affect one of the aforementioned six UGTs (Table S4). It is anticipated that these mutations could reduce the glucuronidation of estrogens within the tumor, thus increasing local estrogen levels and stimulating tumor growth. As mentioned earlier, androgens are implicated in prostate carcinogenesis and promote androgen-sensitive prostate cancer growth. Androgens are primarily inactivated in the prostate by UGT2B15 and UGT2B17 [25,26]. In the present study, we found two mutations in UGT2B15 [c.249A > T (Lys83Asn), c.436T > A (Phe146Ile)] and one mutation in UGT2B17 [c.1193C > G (Ala398Gly)] in PRAD tumors (Table S3). It is anticipated that these somatic mutations could reduce androgen inactivation, thus potentially promoting androgen-sensitive prostate cancer growth.
The impact on the intratumoral metabolism of drugs and pro/anti-cancer signaling molecules is likely to be more profound in tumors with multiple mutations in UGT genes. As described earlier, four cancers with a high mutation burden (SKCM-metastatic, SKCM-primary, LUAD and UCEC) had the largest percentages (46%, 29%, 19% and 10%, respectively) of tumors with two or more mutations in UGT genes. This is particularly true for hypermutated tumors. Among the 53 hypermutated tumors that were excluded from analysis in this study, 46 had ten or more mutations in UGT genes (Table S1). Our preliminary analysis of the MC3 MAF file revealed widespread mutations in other drug-metabolizing enzymes (e.g., CYPs, SULTs and GSTs) and ABC and SLC transporters in TCGA tumors. It is anticipated that multiple concurrent mutations that simultaneously affect UGTs and other drug-metabolizing enzymes and transporters would have a great impact on the capacity of cancer cells to take up, metabolize and dispose of anticancer drugs and other pro/anti-cancer molecules.
Human cancer cell lines have been used as experimental models for the study of cancer biology and therapy for decades [53,54]. Cancer cell lines are also used to study the function and regulation of UGT genes [3,5,89]. Our finding of mutations in UGT genes in over 500 cancer cell lines emphasizes the importance of selecting cell lines that have no mutations in the UGT genes under investigation. For example, UGT2B28 is highly upregulated upon androgen exposure in the ZR751 breast cancer cell line, which would suggest that it is a suitable line to study the biological function of this gene in breast cancer [116]. However, as described earlier, the presence of three deleterious mutations in UGT2B28 in the ZR751 cell line indicates that it is unsuitable for functional studies of this gene.
One limitation of the present study is that the MC3 data derive mainly from primary, treatment-naïve tumors. It is likely that drug treatment results in selective pressure that enriches mutations that provide a growth/survival advantage. It is possible that pre- and post-treatment tumors have different profiles of mutations in UGTs that control intratumoral drug exposure; however, the analysis of such paired pre- and post-treatment datasets is a subject for future study. Another notable consideration is that the levels of UGT expression may vary in tumors with and without UGT mutations. Mutational status and expression level are often treated as independent variables in analyses of clinicopathological features and clinical outcomes. However, we found that the overall expression levels of UGTs were not comparable between the mutated and unmutated groups within the LUAD cohort, suggesting that such analyses should be treated with caution. Future studies could develop approaches to integrate these variables in analyses of clinical outcomes.

Conclusions
In conclusion, our comprehensive assessment of the mutation profiles in 10,069 TCGA tumors and 1568 CCLE cell lines identified 3427 and 895 mutations in UGT genes in human cancers and cancer cell lines, respectively. Over half of the mutations in UGT genes in tumors are predicted to encode truncated proteins or variant proteins with deleterious amino acid substitutions that likely influence the capacity of cancer cells to metabolize anticancer drugs and pro/anti-cancer signaling molecules through the UGT conjugation pathway. As a result, somatic mutations in UGT genes might affect tumor growth and therapeutic efficacy, suggesting their potential role as biomarkers predicting therapeutic efficacy and clinical outcomes. We acknowledge the necessity for future experimental and prospective clinical studies to further validate this hypothesis. Overall, we consider this study an important first step in identifying the mutational profiles of UGTs and other genes associated with drug metabolism and disposition in tumors, which could ultimately aid in the development of personalized cancer therapies.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/cancers14225708/s1, Figure S1: Mutations within the UGT1A1 exon 1 in TCGA tumors; Figure S2: Mutations within the UGT1A3 exon 1 in TCGA tumors; Figure S3: Mutations within the UGT1A4 exon 1 in TCGA tumors; Figure S4: Mutations within the UGT1A5 exon 1 in TCGA tumors; Figure S5: Mutations within the UGT1A6 exon 1 in TCGA tumors; Figure S6: Mutations within the UGT1A7 exon 1 in TCGA tumors; Figure S7: Mutations within the UGT1A8 exon 1 in TCGA tumors; Figure S8: Mutations within the UGT1A9 exon 1 in TCGA tumors; Figure S9: Mutations within the UGT1A10 exon 1 in TCGA tumors; Figure S10: Mutations within the UGT2A1 exon 1 in TCGA tumors; Figure S11: Mutations within the UGT2A2 exon 1 in TCGA tumors; Figure S12: Mutations at the UGT2A3 gene in TCGA tumors; Figure S13: Mutations at the UGT2B7 gene in TCGA tumors; Figure S14: Mutations at the UGT2B10 gene in TCGA tumors; Figure S15: Mutations at the UGT2B11 gene in TCGA tumors; Figure S16: Mutations at the UGT2B15 gene in TCGA tumors; Figure S17: Mutations at the UGT2B17 gene in TCGA tumors; Figure S18: Mutations at the UGT2B28 gene in TCGA tumors; Figure S19: Mutations within the codons of conserved amino acids across the UGT1A and UGT2B family enzymes; Figure S20: Mutations at the UGT3A1 gene in TCGA tumors; Figure S21: Mutations at the UGT3A2 gene in TCGA tumors; Figure S22: Mutations at the UGT8 gene in TCGA tumors; Figure S23: Functional miRNA target sites and somatic mutations at 3′ untranslated regions of UGT mRNAs in TCGA tumors; Table S1: The tumors for each of the 33 different types of TCGA cancers that were included in the analysis of this study; Table S2: The tumors for each of the 33 different types of TCGA cancers that had somatic mutations at UGT genes; Table S3: Somatic mutations at UGT genes for each of the 33 different types of TCGA cancers; Table S4: Somatic mutations in TCGA tumors for each of the 22 UGT genes; Table S5: The CCLE cell lines assessed by this study and the mutations found in these cell lines for each of the 22 UGT genes; Table S6: Mutations at 5′ untranslated regions (UTRs) of UGT genes in TCGA tumors; Table S7: Mutations at 3′ untranslated regions (UTRs) of UGT genes in TCGA tumors. Mutations in functional miRNA seed target sites are in bold; Table S8: Recurrent somatic mutations at UGT genes in TCGA tumors; Table S9: Assessment of associations of UGT mutations with clinicopathological parameters in the LUAD cohort; Table S10: Mutations of UGT genes found in both TCGA tumors and CCLE cell lines.

Conflicts of Interest:
The authors declare no conflict of interest.