Somatic Host Cell Alterations in HPV Carcinogenesis

High-risk human papilloma virus (HPV) infections cause cancers in different organ sites, most commonly cervical and head and neck cancers. While carcinogenesis is initiated by two viral oncoproteins, E6 and E7, increasing evidence shows the importance of specific somatic events in host cells for malignant transformation. HPV-driven cancers share characteristic somatic changes, including apolipoprotein B mRNA editing catalytic polypeptide-like (APOBEC)-driven mutations and genomic instability leading to copy number variations and large chromosomal rearrangements. HPV-associated cancers have recurrent somatic mutations in phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha (PIK3CA) and phosphatase and tensin homolog (PTEN), human leukocyte antigen A and B (HLA-A and HLA-B), and the transforming growth factor beta (TGFβ) pathway, and rarely have mutations in the tumor protein p53 (TP53) and RB transcriptional corepressor 1 (RB1) tumor suppressor genes. There are some variations by tumor site, such as NOTCH1 mutations, which are primarily found in head and neck cancers. Understanding the somatic events following HPV infection and persistence can aid the development of early detection biomarkers, particularly when mutations in precancers are characterized. Somatic mutations may also influence prognosis and treatment decisions.

HPV infection alone is an insufficient cause of carcinogenesis. Most HPV infections become undetectable after a few months and never result in malignancies, with 91% becoming undetectable after two years, although it has been proposed that there may be some level of persistent latent infection that is undetectable by PCR [10,11]. High-risk HPV types persist longer on average than low-risk types [12]. A failure to clear the virus results in viral persistence, but many persistent infections never develop into precancerous lesions [13]. Finally, even advanced precancerous cervical intraepithelial neoplasias grade 3 (CIN3) only progress to invasive cancer in 30% of cases over 30 years [14]. When infections persist over time, somatic mutations may accumulate and contribute to the development of precancerous lesions, and then finally to malignant cancers. Understanding the complete carcinogenic pathways is important for developing new strategies to prevent HPV-associated cancer mortality, both through early detection and through targeted therapies [15,16].
HPV-derived cancers share many carcinogenic features across cancer sites, suggesting that the viral oncoproteins E6 and E7 work similarly at different sites. A previous review on this topic [17] predates recent publications of large genomic data from HPV-driven cervical and head and neck cancers in The Cancer Genome Atlas (TCGA) [2,18]. Here, we review common somatic mutations, copy number alterations, and related pathways identified by TCGA and other recent efforts. While the focus of this review is on somatic changes, genome-wide association (GWAS) studies of cervical and HPV-related head and neck cancers have shown that there is also a heritable component. At both cancer sites, human leukocyte antigen (HLA) variants are among the few consistent, independently replicated findings from GWAS studies [19][20][21].

Mechanisms of HPV-Mediated Mutagenesis
There is a great diversity of HPV genotypes, but only a small subset is carcinogenic; among these, HPV16 alone accounts for 50-90% of HPV-driven cancers depending on the site, with some regional variations [22,23]. Most cancers evaluated in studies included in this review are caused by HPV16, and there may be variations in somatic mutation load and type by HPV genotype that are currently not adequately captured. Two of the eight proteins encoded by the HPV genome, E6 and E7, account for most carcinogenic effects of high-risk HPV types [15]. They promote carcinogenesis in several ways, including creating genomic instability and inhibiting tumor suppressor genes. E6 and E7 directly promote genomic instability, which can result in large chromosomal rearrangements and copy number variations, by interfering with centrosome duplication during mitosis [24,25]. Both oncoproteins interfere with important cellular tumor suppressor pathways: E6 inhibits the p53 tumor suppressor by promoting its proteasomal degradation [26,27], while E7 disrupts the retinoblastoma (Rb) pathway, resulting in uncontrolled activation of the cell cycle and induction of p16 INK4A, a cyclin-dependent kinase inhibitor, through a disrupted feedback loop (Figure 1) [28][29][30]. Theoretically, since HPV oncoproteins are important carcinogenic drivers interfering with several cellular pathways, it could be expected that fewer somatic alterations are required for malignant transformation in HPV-associated compared to non-HPV-associated cancers. There is some evidence of lower mutation load in HPV-positive compared to HPV-negative penile cancers [31]. However, the evidence is inconclusive for head and neck cancers, with one study showing evidence of a reduced somatic mutation load in HPV-positive compared to HPV-negative cancers [3] while the TCGA head and neck study did not find evidence of a difference [2].
In addition to direct viral effects, specific mutation signatures may be overrepresented in HPV-positive cancers due to host-viral interactions. The apolipoprotein B mRNA editing catalytic polypeptide-like (APOBEC) mutation signature in particular is very common in HPV-positive cancers, likely triggered by the host response to HPV infection [32].
Figure 1. The Rb and p53 pathways are disrupted by the human papilloma virus (HPV) oncoproteins E7 and E6, respectively. The HPV E7 protein binds to Rb with high affinity, disrupting its interaction with the transcription factor E2F. This results in the release and activation of E2F, driving expression of S-phase genes and cell cycle progression. P16 INK4A is a cyclin-dependent kinase inhibitor that regulates the cell cycle by inactivating cyclin-dependent kinases involved in Rb phosphorylation. Upregulation of p16 INK4A is induced by HPV-mediated disruption of Rb, leading to the accumulation of p16 INK4A in HPV-transformed cells. The HPV E6 protein inhibits apoptosis by targeting the tumor suppressor protein, p53, for degradation. HPV E6 inhibition of p53 promotes cell proliferation and can lead to genomic instability and the accumulation of somatic mutations. Abbreviations: Rb, retinoblastoma protein; p16 INK4A, cyclin-dependent kinase inhibitor 2A; CDK, cyclin-dependent kinases; E2F, E2F transcription factor; CDC, cell-division-cycle genes; MCM, minichromosome maintenance family.

Genomic Instability
Rates of copy number alterations vary across cancer sites. Cervical cancers average 88 copy number alterations in the TCGA dataset, including 26 amplifications and 37 losses [18]. Focal amplifications of loci containing genes discussed elsewhere in this review in order of frequency include 3q28 (tumor protein p63 (TP63), altered in 77% of samples), 3q24.1 (transforming growth factor beta receptor 2 (TGFBR2), 36%), 10q23.31 (phosphatase and tensin homolog (PTEN), 31%), 18q21.2 (SMAD family member 4 (SMAD4), 28%), and 7p11.2 (epidermal growth factor receptor (EGFR), 17%) [18]. Greater numbers of copy number variations were reported in cervical squamous cell carcinomas than in cervical adenocarcinomas [18]. A review of cervical squamous cell carcinomas from other datasets as well as limited information on HPV-positive vulvar squamous cell carcinomas also showed gains at 3q (55%), losses at 3p (36%), and losses at 11q (33%) [33]. A study of CIN3 lesions and invasive cancers reported an average of 36.3 copy number alterations in cancers, with the most frequent amplification at 3q (50% of cancers and 25% of CIN3) [34]. Notably, this region contains the phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha (PIK3CA) gene, which is the most commonly mutated gene in HPV-driven cancers across sites (see below). Losses were most common in 3p (40% of cancers and 10% of CIN3) [34]. A summary of copy number alterations reported in HPV-driven cancers can be found in Table 1. Figure 2 shows the frequency of chromosomal amplifications and deletions across the whole genome in cervical cancers from TCGA [18].
In HPV-positive head and neck cancers, significant copy number losses have been reported in 22 genes and gains in 65 genes, including RB transcriptional corepressor 1 (RB1) and PIK3CA [37]. The 3q26-28 region is amplified in both HPV-positive and HPV-negative cancers, while 3p deletions are primarily found in HPV-negative head and neck cancers [37].
In penile cancers, greater copy number gains in 15 regions and losses in four regions are seen in HPV-positive compared to HPV-negative cancers [38]. Autosomal copy number variations are most frequently observed on chromosomes 3 and 8, including losses in 3p and gains in 3q, and are also associated with worse prognosis [38]. A small study of HPV-positive anal cancers reported recurrent gains in 17q, 3q, 19p, and 19q [39].
In HPV-driven cancers of the cervix and head and neck, copy number variations often co-localize with sites of viral integration [2,18], a phenomenon that occurs in many HPV-associated cancers, and has been shown to vary by HPV type [40][41][42]. Though the mechanisms by which HPV integrates into the host cell genome are not well understood, these events tend to occur at regions of genomic instability [34,42-45]. It has been proposed that copy number alterations commonly occur in regions of genomic instability, which in turn may promote viral integration in those locations, explaining why viral integration is more common at sites with copy number alterations than expected by chance [34]. Viral integration has also been observed in short regions of HPV and host genome sequence homology (i.e., "micro-homologies"), suggesting a potential role for DNA repair processes to integrate HPV and host cell genomes based on nucleotide sequence similarities [45,46].
Recurrent large chromosomal rearrangements have been reported in 23 locations in cervical cancers in TCGA [18]. One notable recurrent rearrangement is the 16p13 zinc finger CCCH-type containing 7A-breast cancer anti-estrogen resistance 4 (ZC3H7A-BCAR4) fusion, which together with copy number gain of the locus containing BCAR4 (16p13.13, found in 20% of tumors) and duplication detected by whole genome sequencing suggests a potential role of this gene in cervical carcinogenesis [18].
HPV-driven cancers of the cervix, head and neck, and penis share copy number alteration sites, most notably copy number gains in 3q, which in addition to PIK3CA contains the telomerase RNA component (TERC), MDS1 and EVI1 complex locus (MECOM), SRY-box 2 (SOX2), and TP63 genes [18,34,37,38]. It is worth noting that both HPV-positive and HPV-negative cancers display recurrent focal amplifications of this region [2]. Together with the extremely high somatic mutation rate of PIK3CA (see Section 3.2), this supports an important role for PIK3CA in HPV-mediated carcinogenesis.

APOBEC
The APOBEC family of cytosine deaminases causes cytosine to thymine or guanine mutations [47][48][49]. APOBEC3B, a member of this family, causes characteristic mutations that are enriched in many cervical and head and neck cancers [18,35,50-52]. During DNA repair, APOBEC-mediated cytosine deamination can result in characteristic mutational signatures that occur at motifs involving a thymine immediately 5′ to the target cytosine, collectively referred to as "TCW" mutations, where W corresponds to an A or T [52]. APOBEC-mediated mutagenesis is also enriched in HPV-positive subsets of many head and neck cancers [53] as well as in penile cancers [54], suggesting the activation of APOBEC enzymes in HPV-driven cancers across sites.
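The TCW definition above can be made concrete with a short sketch. The function names and the three-base input format are illustrative assumptions, not part of any cited pipeline; the sketch simply checks whether a single-base substitution is a C to T or C to G change with a 5′ thymine and a 3′ A or T, after flipping to the opposite strand when the reference base is a guanine.

```python
# Minimal sketch (hypothetical helper names): classify a substitution as a
# "TCW" APOBEC-type change, where W is A or T.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_tcw_mutation(context, alt):
    """context: three reference bases (5'->3') centered on the mutated base.
    alt: the mutant base, reported on the same strand as context."""
    context = context.upper()
    alt = alt.upper()
    ref = context[1]
    if ref == "G":
        # The deaminated cytosine lies on the opposite strand, so flip both
        # the context and the mutant base before testing the motif.
        context = reverse_complement(context)
        alt = COMPLEMENT[alt]
        ref = "C"
    if ref != "C" or alt not in ("T", "G"):
        return False
    return context[0] == "T" and context[2] in ("A", "T")


if __name__ == "__main__":
    for ctx, alt in [("TCA", "T"), ("TGA", "A"), ("TCG", "T"), ("ACA", "T")]:
        print(ctx, alt, is_tcw_mutation(ctx, alt))
```

Published signature analyses typically also compare the observed TCW count with the count expected from the background frequency of the motif; that enrichment step is omitted here.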
APOBEC activity is responsible for many mutations in genes of the HPV-associated carcinogenesis pathways discussed below, including common PIK3CA point mutations [53]. APOBEC signature enrichment was reported in 150 of 192 exomes in TCGA cervical cancer data, with the fraction of APOBEC signature mutations by gene reproduced in Figure 3 [18].
The APOBEC pathway drives mutations in many cancer sites including cervix, head and neck, bladder, lung, and breast [51,52]. However, APOBEC mutations are likely enriched in HPV-positive cancers due to the role of APOBEC proteins in the host response to the viral infection. The APOBEC3A protein may inhibit HPV infectivity, so upregulation assists in viral clearance and reduces persistence [32], although it has also been suggested that APOBEC3B is likely to be the primary APOBEC involved in HPV-related carcinogenesis because unlike APOBEC3A it is expressed in the nucleus [51]. The APOBEC mutagenesis pathway has also been reported to be upregulated by the HPV oncoprotein E6 [55]. Upregulation of APOBEC proteins in response to viral infection can cause "collateral damage" to the host DNA [56]. However, the exact mechanism of induction of the APOBEC pathway and its contribution to carcinogenesis once activated remain unclear, since it is also found in many cancer types not associated with infectious agents, including breast cancer and ovarian serous carcinoma [57][58][59]. Due to insufficient data from cancer precursors, it is currently not clear at what stage in the carcinogenic process APOBEC mutations start to accumulate and whether APOBEC mutations occur before non-APOBEC mutations.
Figure 3. Apolipoprotein B mRNA editing catalytic polypeptide-like (APOBEC, blue) and non-APOBEC (gray) mutations in significantly mutated genes in TCGA cervical cancer data [18]. Abbreviations: PIK3CA, phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha; EP300, E1A binding protein p300; FBXW7, F-box and WD repeat domain containing 7; PTEN, phosphatase and tensin homolog; HLA-A, human leukocyte antigen A; NFE2L2, nuclear factor, erythroid 2 like 2; ARID1A, AT-rich interaction domain 1A; HLA-B, human leukocyte antigen B; KRAS, KRAS proto-oncogene, GTPase; ERBB3, erb-b2 receptor tyrosine kinase 3; MAPK1, mitogen-activated protein kinase 1; CASP8, caspase 8; TGFBR2, transforming growth factor beta receptor 2; SHKBP1, SH3KBP1 binding protein 1.

Other Mutational Signatures
Cervical cancer, which has an attributable risk for HPV of close to 100% [1], has two primary mutational signatures, classified as signature 1B and signature 2 by Alexandrov et al. [50]. Signature 2 is the above-discussed APOBEC signature. Signature 1B is a common pattern across many cancer sites that is characterized by cytosine to thymine mutations at methylated cytosine-guanine (CpG) sites along the DNA and is associated with age [50]. Other cancers associated with signature 1B include head and neck, the only other HPV-associated cancer characterized by this study, as well as ovarian and endometrial, the other major gynecological cancers [50].

Genes and Pathways
Many somatic mutations overlap across HPV-associated cancer sites. Frequently somatically mutated genes are summarized in Table 2. In the following sections, common mutations are discussed in the context of their respective pathways.

Lack of Mutations in TP53 and RB1
The HPV oncogenic proteins E6 and E7 target the tumor suppressor proteins p53 and pRB, respectively, for degradation [79]. They therefore obviate the need for somatic deactivation of the TP53 and RB1 genes during the carcinogenesis process, and mutations in these genes infrequently occur in HPV-positive cancers compared to corresponding HPV-negative cancers at the same sites (Figure 1).
In cervical squamous cell carcinoma, TP53 mutations have been reported with a frequency of 5% [35]. Although fewer than 1% of cervical squamous cell carcinomas are HPV-negative, one study reported a difference in TP53 mutation status by classifying tumors in the TCGA-CESC data set as "HPV active" (expressing HPV transcripts; 4% TP53 mutation rate) versus "HPV inactive" (not expressing HPV transcripts; 47% TP53 mutation rate and 8% of the total number of HPV-positive samples) [80]. This is consistent with the idea that TP53 inactivation is exceedingly common, and that the TP53 mutation rates are negatively correlated with HPV activity. Vulvar squamous cell carcinoma has an 8-16% TP53 mutation prevalence in HPV-positive tumors versus 30-76% prevalence in HPV-negative tumors, and vulvar intraepithelial neoplasia (VIN) precancerous lesions have a 3% TP53 mutation prevalence in HPV-positive and 21% prevalence in HPV-negative lesions [73,74]. Likewise, TP53 mutations appear to be more prevalent in HPV-negative than in HPV-positive penile squamous cell carcinomas [31].
Numerous studies have reported significantly higher TP53 mutation rates in HPV-negative (52-86%) compared to HPV-positive (0-25%) head and neck tumors [2,3,36,37,72]. A complete absence of TP53 mutations in tumors with high-risk HPV types present has also been found in laryngeal [81] and esophageal [82] cancers. It has been suggested that TP53 inactivation, either through HPV infection or somatic mutation, is nearly ubiquitous in head and neck squamous cell carcinomas, even those that are HPV-negative and therefore must achieve this inactivation via other pathways [3].
Head and neck cancers with wild-type TP53 have a better prognosis than those with TP53 mutations [2,83]. HPV positivity and p16 INK4A expression, which are both related to retention of wild type TP53, are also positively correlated with overall 3-year survival in anal cancers [84]. Evidence in penile cancers is mixed [85][86][87][88].
The Rb pathway controls the cell cycle and regulates growth and proliferation [89]. RB1 mutations are very rare in cervical cancers because HPV E7 activity inactivates Rb tumor suppression activity by disrupting its interaction with the transcription factor E2F, making mutations in this gene unnecessary in HPV-positive cancers [90,91]. RB1 is mutated in 6-24% of HPV-positive head and neck cancers, compared with 4% of HPV-negative head and neck cancers [2,36,37]. Cyclin-dependent kinase inhibitor 2A (CDKN2A) encodes p16 INK4A, an Rb pathway protein which, as described above, is nearly ubiquitously expressed in HPV-positive cancers due to activation of a negative feedback loop triggered by E2F release [92,93]. Overexpression of p16 INK4A is also common in HPV-related precancers, which has led to development of p16 INK4A-based biomarkers for cervical cancer screening and triage [94,95]. CDKN2A is rarely altered in HPV-positive (0%) compared with HPV-negative head and neck cancers (25% mutation rate, frequent alterations in 9p21.3 chromosomal region containing the CDKN2A gene) [2]. An absence of CDKN2A alterations in HPV-positive penile squamous cell carcinomas has also been reported, compared with 16% mutation prevalence and 24% copy number reduction in HPV-negative tumors [31].
The most common PIK3CA mutations occur in "hotspots" E542K and E545K in the helical domain (exon 9) of p110α. Mutations in these sites have been shown to increase phosphatidylinositol 3,4,5-trisphosphate (PIP 3 ) levels, activate downstream effectors such as phosphoinositide-dependent kinase (PDK1) and AKT, and promote cellular transformation. Although the mechanisms by which these mutations activate PI3K signaling are not fully understood, current data suggest these mutations block the inhibitory effect of the p85α regulatory subunit on p110α activity [101]. In HPV-positive head and neck and cervical squamous cancers, mutations in PIK3CA are almost exclusively found in E542K (c.1624G > A) and E545K (c.1633G > A), corresponding to a C to T single base change at a TCW motif on the opposite strand, indicative of APOBEC-induced mutagenesis [35,53,[102][103][104][105]. In contrast, these mutations are less common in HPV-negative head and neck cancers, suggesting that APOBEC activity is the major source of PIK3CA mutations in HPV-driven carcinogenesis. Evidence from a limited number of studies suggests that these mutations may represent a late event in cervical carcinogenesis [63,67,105]; however, a comprehensive deep-sequencing study of cervical precancers has not been conducted.
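The link between the coding-sequence positions and the hotspot codons quoted above can be checked with simple arithmetic. The sketch below uses hypothetical helper names and only a four-entry excerpt of the standard genetic code; it shows that c.1624 and c.1633 are the first bases of codons 542 and 545, that a first-base G to A change turns either glutamate codon into a lysine codon, and that a G to A change on the coding strand reads as C to T on the opposite strand.

```python
# Minimal sketch (hypothetical helper names) relating cDNA positions to codons.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Excerpt of the standard genetic code: glutamate (E) and lysine (K) codons.
CODON_TABLE = {"GAA": "E", "GAG": "E", "AAA": "K", "AAG": "K"}


def codon_of(cdna_pos):
    """Map a 1-based coding-sequence position to (codon number, position in codon)."""
    codon_number = (cdna_pos - 1) // 3 + 1
    position_in_codon = (cdna_pos - 1) % 3 + 1
    return codon_number, position_in_codon


def opposite_strand_change(ref, alt):
    """Express a substitution as it appears on the complementary strand."""
    return COMPLEMENT[ref], COMPLEMENT[alt]


def mutate_first_base(codon, alt):
    """Replace the first base of a codon and translate the result."""
    return CODON_TABLE["" + alt + codon[1:]]


if __name__ == "__main__":
    print(codon_of(1624))                    # (542, 1)
    print(codon_of(1633))                    # (545, 1)
    print(opposite_strand_change("G", "A"))  # ('C', 'T')
    print([mutate_first_base(c, "A") for c in ("GAA", "GAG")])  # ['K', 'K']
```

Whether the flanking bases on the opposite strand complete a TCW motif depends on the reference sequence, which is not reproduced here.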
PTEN is a cell cycle regulator that inhibits rapid cell growth and functions as a tumor suppressor [106]. Signaling of the PI3K pathway is regulated by PTEN through dephosphorylation of PIP 3 (Figure 4) [107]. PTEN mutations are less frequent than PIK3CA mutations but are found in 6-13% of cervical carcinomas and 6-10% of HPV-positive head and neck cancers [2,18,35,36]. High rates of concurrent PIK3CA mutations with PTEN loss have been documented in HPV-positive tumors, ranging from 24 to 56% in head and neck cancers to over 80% in anal cancers [99,108]. In the context of PTEN loss or deficiency, helical mutations in PIK3CA have been shown to induce tumorigenesis through AKT-dependent signaling; whereas in tumors with intact PTEN, helical mutations in PIK3CA have been shown to promote cell growth and transformation through AKT-independent pathways involving PDK1 and its substrate serine/threonine protein kinase family member 3 (SGK3) [109].
Overall, more than 50% of cancers of the cervix and anus have at least one mutation in the PI3K/AKT pathway [110]. Similarly, mutations in this pathway have been reported in 61% of HPV-positive head and neck cancers (and a similar number of HPV-negative head and neck cancers) [2]. The average across all solid tumors was 38%, suggesting that compared with the known driver mutations in other cancers, PI3K pathway alterations are uniquely high in HPV-driven cancers [110]. It is interesting to note that PIK3CA is also commonly mutated in endometrial and some ovarian cancers [111,112], which could make it a hallmark of gynecological cancers as well as of HPV-driven cancers.

Human Leukocyte Antigen
Human leukocyte antigen (HLA) alleles are important components of host cell-mediated immune responses to viral infections and are essential to the major histocompatibility complex (MHC) immune response pathway. HLA-A and HLA-B are MHC class I molecules that present viral antigens on the cell surface to alert the immune system to infection [113] (Figure 5). Germline HLA variants have been associated with cervical cancer and with HPV-positive oropharyngeal cancer susceptibility [19][20][21]. Somatic mutations are found in HLA-A in 8% and HLA-B in 6-9% of cervical squamous cell carcinomas [18,35]. In a small study evaluating cervical cancer cell suspensions, 90% of tumors showed some HLA gene alterations including gene mutations, loss of heterozygosity, and other genetic changes [78]. HLA alterations are found frequently in cervical precancers as well, suggesting that they are an early event in cervical carcinogenesis [114]. HLA-A/B mutations are somewhat more common in HPV-positive (11%) than HPV-negative (7%) head and neck cancers [2,37]. Loss of HLA-A or HLA-B could lead to loss of presentation of tumor antigens and immune cell recognition. One small study reported frequent mutations in the HLA pathway-associated transporter associated with antigen processing (TAP) gene (52%) in cervical carcinomas [115]. However, another candidate gene study failed to replicate this finding [116] and the large cervical cancer studies did not identify recurrent mutations in this gene [18,35]. Given the observed associations of both germline and somatic changes with the antigen presentation pathway, this pathway likely plays an important role in the host response to viral invasion that can alter the probability of persistence and potentially subsequent steps in the carcinogenesis process.
Viruses 2017, 9, x FOR PEER REVIEW 12 of 21
Figure 5. HLA pathway. Proteins undergo proteasomal degradation and the resulting peptides are transported to the endoplasmic reticulum by the TAP complex. There they are bound with MHC Class I into HLA-A or HLA-B and bound to β2-microglobulin. The complex is transported to the plasma membrane, where the peptide antigen is displayed for cytotoxic T-cell recognition. Fractions of cervical and head and neck cancers with each gene mutated are noted [2,18,115,116]. * There are conflicting reports of TAP mutation prevalence [115,116]. Abbreviations: MHC, major histocompatibility complex; HLA, human leukocyte antigen; TAP, transporter associated with antigen processing; HNSCC, head and neck squamous cell carcinoma.

Transforming Growth Factor Beta Pathway
The transforming growth factor beta (TGFβ) pathway inhibits DNA synthesis and plays a tumor suppressor role, although it can also promote cancer progression once carcinogenesis has been initiated [117][118][119]. Inhibition of this pathway by the HPV oncoprotein E7 contributes to early tumor development in HPV-positive cervical and head and neck cancers [120][121][122][123] (Figure 6). Commonly mutated genes in the TGFβ pathway include TGFBR2 (a receptor), CREB binding protein (CREBBP) and E1A binding protein p300 (EP300) (activators), and SMAD4 (a transcription factor and tumor suppressor), and mutations in at least one of these genes have been reported in 30% of cervical squamous cell carcinomas [18]. In contrast, among TGFβ genes, only EP300 was in the top 30 mutated genes in head and neck squamous cell carcinomas [36]. This implies that somatic alterations in TGFBR2, CREBBP, and SMAD4 may be cervical squamous cell carcinoma-specific, although E7-driven expression effects in the TGFβ pathway may still play a role in carcinogenesis in other HPV-positive cancers. SMAD4 downregulation is also associated with HPV-negative head and neck cancers [124], and SMAD signaling pathway alterations have been found in both HPV-positive and HPV-negative tumors [37].
Figure 6. TGFβ pathway. TGFβ binds to TGFBR2 and other receptors to form a complex which becomes phosphorylated. This triggers the phosphorylation of R-SMADs. The phosphorylated R-SMADs form a complex with SMAD4 and are transported into the nucleus, where they promote transcription by binding to promoter regions of the DNA. EP300 and CREBBP are two activators commonly mutated in HPV-driven cancers, and many other activators and repressors also act to regulate this pathway. Fractions of cervical cancers with each gene mutated are noted [18]. Abbreviations: TGFβ, transforming growth factor beta; TGFBR2, TGFβ receptor 2; R-SMAD, receptor-regulated SMAD; EP300, E1A binding protein p300; CREBBP, CREB binding protein.

Notch Pathway
The Notch signaling pathway regulates cellular differentiation. Mutations in the NOTCH1 receptor are found in both HPV-negative (12-26%) and HPV-positive (6-17%) head and neck cancers, albeit somewhat more frequently in HPV-negative tumors, and are not commonly reported in cervical cancers or at other HPV-driven cancer sites [2,36,37,76,77]. NOTCH1 mutation may, therefore, be specific to head and neck carcinogenesis rather than to HPV infection, and NOTCH1 has indeed been reported as a driver gene in oral tumorigenesis independent of HPV status [125]. F-box and WD repeat domain containing 7 (FBXW7) is involved in angiogenesis through regulation of the Notch pathway [126] and is mutated at higher rates in cervical (11-15%) and HPV-positive head and neck (12%) squamous cell carcinomas than in combined head and neck squamous cell carcinomas (HPV status not specified) (5%) [18,35,36].

RAS/EGFR/ERK Pathway
The RAS/EGFR/ERK (retrovirus-associated DNA sequences/epidermal growth factor receptor/extracellular signal-regulated kinases) pathway is involved in cellular proliferation and survival (Figure 3). It consists of a signaling cascade that regulates transcription of genes affecting many functions including differentiation, growth, and senescence, which can contribute to carcinogenesis [127]. KRAS proto-oncogene, GTPase (KRAS) is an oncogene in which mutations are found in 8-23% of cervical adenocarcinomas but rarely in cervical squamous cell carcinomas [18,35,62,75]. The mutation rate of KRAS in head and neck cancers is 6% [37]. In contrast, EGFR is a receptor tyrosine kinase oncogene in the same pathway in which mutations are found in 3-33% of cervical squamous cell carcinomas but rarely in cervical adenocarcinomas [18,62,70,71]. Other genes in this pathway are mutated in fewer than 10% of HPV-positive tumors except for FGFR2 and FGFR3, which have combined mutation rates of 10-17% in HPV-positive head and neck cancers [2,18,35-37]. This is notable because, as kinases, the products of the FGFR genes are potential therapeutic targets [37].

Other Genes
Tumor necrosis factor (TNF) receptor-associated factor 3 (TRAF3) is involved in viral immune responses [128] and was recently reported to have truncating mutations (8%) or deletions (14%) in HPV-positive head and neck cancers [2]. It is not commonly mutated in cervical cancers [18], and it remains to be investigated whether this gene is mutated in HPV-positive cancers at other sites. Other genes differentially mutated in HPV-positive versus HPV-negative head and neck squamous cell carcinomas include E2F1, a cell cycle-related gene more commonly mutated in HPV-positive cancers (19% versus 2%), and FAT atypical cadherin 1 (FAT1) and ajuba LIM protein (AJUBA), two genes involved in differentiation that are more commonly mutated in HPV-negative cancers (32% versus 3% and 7% versus 0%, respectively) [2].
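Comparisons of mutation frequency between two tumor groups, such as those quoted above, reduce to a 2×2 table of mutated versus wild-type counts per group. The sketch below shows one common way to test such a table, a two-sided Fisher's exact test written with the Python standard library only. The choice of test is our assumption, since the text does not state which test the cited studies used, and the counts in the usage example are hypothetical rather than taken from any cohort.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are the two groups; columns are mutated / wild-type counts.
    The p-value sums the probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    if min(a, b, c, d) < 0:
        raise ValueError("counts must be non-negative")
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    if n == 0:
        raise ValueError("table is empty")
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x mutated samples in group 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_obs = prob(a)
    tol = 1 + 1e-9  # guards against floating-point ties
    p = sum(prob(x) for x in range(lowest, highest + 1) if prob(x) <= p_obs * tol)
    return min(1.0, p)


# Hypothetical counts, for illustration only:
# group 1 has 10 mutated and 40 wild-type tumors; group 2 has 2 and 98.
p_value = fisher_exact_two_sided(10, 40, 2, 98)
print(f"two-sided p-value: {p_value:.4g}")
```

Because the test depends on absolute counts, the same pair of percentages can be significant in a large cohort and not in a small one. This matters when comparing rates from studies with few HPV-positive tumors.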

Discussion
While HPV infection is a necessary cause of many cancers, the interplay between the virus and the host cell is what ultimately causes cancers to develop. There are many similarities across sites in the mechanisms and mutations found in HPV-driven cancers, suggesting that mechanisms are likely to be similar in rarer cancers such as penile and vaginal carcinomas in which it is difficult to complete large genomic studies. For example, one recent candidate gene study found no statistically significant differences in gene mutations in any of 48 candidate genes including PIK3CA, EGFR, NOTCH1, and KRAS or copy number alterations in any of six candidate genes across HPV-positive squamous cell carcinomas at four anatomical sites [99]. While HPV-positive cancers share many characteristic mutagenesis mechanisms and somatic mutations, there are also site-specific aspects. The other major gynecological cancers, endometrial and ovarian cancer, share with cervical cancer high rates of PIK3CA mutations and APOBEC and signature 1B mutational signatures. HPV-positive and HPV-negative tumors arising in the head and neck also share properties such as recurrent focal amplifications of the 3q26-28 chromosomal region. Recent data have shown that HPV genetic variation is very common and that HPV variant sublineages influence the risk of different histologic types of cervical precancer and cancer. It will be important to study the interplay between viral genetics and host genomic changes to better understand HPV-driven carcinogenesis [129,130].
Characterizing somatic mutations in HPV-related carcinogenesis could be highly relevant for early detection, prognosis, and treatment. To date, very few studies have attempted to characterize the somatic landscape of precancerous lesions, and none has done so comprehensively [63,74,131]. Several important steps are required to develop early detection assays based on somatic mutations. First, the sequence of somatic mutation events in the transition from precancers to cancers needs to be established. Next, a promising panel of mutations needs to be selected and evaluated in cervical cytology samples. Similar approaches have been evaluated for other gynecological cancers [132].
In addition to early detection, somatic characterization can be important for prognosis and targeted treatment strategies. For example, PIK3CA-mutated cervical cancers have a worse prognosis than cancers with wild-type PIK3CA [61]. Response to treatment varies by mutation site within PIK3CA, with evidence suggesting a greater response to PI3K/AKT/mTOR pathway inhibitors for tumors with the H1047R kinase domain mutation (which is not commonly found in cervical cancers) compared with mutations at other sites [133]. Another prospective therapeutic target is BCAR4, in which amplifications and gene fusions have been found in cervical cancer and which is targeted by lapatinib [18,134]. CD274 and PDCD1LG2 are immunotherapy targets with amplifications reported in cervical cancer [18]. Erb-b2 receptor tyrosine kinase 2 (ERBB2; HER2) and erb-b2 receptor tyrosine kinase 3 (ERBB3; HER3) are mutated in a subset of cervical adenocarcinomas, and these tumors may be susceptible to targeted therapies; PTEN and AT-rich interaction domain 1A (ARID1A) alterations are also potential targets [18]. The PI3K/AKT and TGFβ signaling pathways, at least one of which is altered in over 70% of cervical cancers, are particularly promising because their high prevalence means that targeted therapies may be broadly applicable [18]. The development of somatic marker panels for HPV-driven cancers will enable oncologists to more precisely tailor treatments.