Characterization of Human Papillomavirus 16 from Kinshasa (Democratic Republic of the Congo)—Implications for Pathogenicity and Vaccine Effectiveness

Human Papillomavirus (HPV) type 16 is the main etiological agent of cervical cancer worldwide. Mutations within the virus genome may lead to an increased risk of cancer development and decreased vaccine response, but there is a lack of information about strains circulating in Sub-Saharan Africa. Endocervical cytology samples were collected from 480 women attending a voluntary cervical cancer screening program at Monkole Hospital and four outpatient centers in Kinshasa, Democratic Republic of the Congo (DRC). The prevalence of HPV infection was 18.8% and the most prevalent high-risk types were HPV16 (12.2%) followed by HPV52 (8.8%) and HPV33/HPV35 (7.8% each). HPV16 strains were characterized: 57.1% were classified as C lineage; two samples (28.6%) as A1 and one sample belonged to B1 lineage. HPV33, HPV35, HPV16, and HPV58 were the most frequent types associated with low-grade intraepithelial lesion while high-grade squamous intraepithelial lesions were predominantly associated with HPV16. Several L1 mutations (T266A, S282P, T353P, and N181T) were common in Kinshasa, and their potential effect on vaccine-induced neutralization, especially the presence of S282P, should be further investigated. Long control region (LCR) variability was high with frequent mutations like G7193T, G7521A, and G145T that could promote malignancy of these HPV16 strains. This study provides a helpful basis for understanding HPV16 variants circulating in Kinshasa and the potential association between mutations of LCR region and malignancy and of L1 and vaccine activity.


Introduction
Human Papillomavirus (HPV) infects mucosal and cutaneous epithelia, and can be sexually transmitted [1][2][3][4]. It is described as one of the most common sexually transmitted infections (either skin-to-skin or mucosa-to-mucosa contact). Although mostly asymptomatic and cleared from the body by the immune system, HPV infection can cause disease, the usual manifestations of which are anogenital warts, cutaneous warts, recurrent respiratory papillomatosis, cervical intraepithelial neoplasia (CIN), cervical, vulvar, penile, and anal cancers, and carcinomas of the upper respiratory tract [5,6]. Besides oncogenic HPV activity, cervical cancer onset and development encompass a large variety of genetic and epigenetic changes [7,8]. According to the risk of malignancy, HPV is classified as low-risk

Study Design and Participants
A cross-sectional study was designed to evaluate a screening program of cervical pre-cancerous lesions at Monkole Hospital and four outpatient centers (Kimbondo, Eliba, Moluka, and Gombe) in Kinshasa (DRC). During July 2017, women aged 25-70 years attending HIV Voluntary Counseling and Testing (VCT) or seeking medical care were invited to participate. Pregnant women, those not-sexually initiated, or those with a total hysterectomy were excluded. No patient had previously been diagnosed or treated for HPV disease.

Sample Collection
Four hundred and eighty women were screened for LSIL, including cervical intraepithelial neoplasia grade 1 (CIN1); HSIL, including CIN2/3; and ICC, through visual inspection with acetic acid (VIA) and Lugol according to WHO guidelines [31,32]. In addition, endocervical samples were collected for HPV testing and characterization. Specimens were collected using BD SurePath™ liquid-based Pap test and shipped to Clinica Universidad de Navarra (Pamplona, Spain), where they were stored at 4 • C until further use.

DNA Extraction for HPV16 Characterizatin
DNA was manually extracted using 200 µL of endocervical sample medium with High-Pure Viral Nucleic Acid (Roche) kit following the manufacturer instructions. Briefly, 253 µL of Lysis buffer was prepared (250 µL Binding buffer + 3 µL Poly A carrier DNA) and together with 50 µL of Proteinase K was added to the endocervical sample and incubated 10 min at 72 • C. After a short centrifugation (9300 rpm for 5 s), 100 µL of Binding buffer were added to the former mix. Then, DNA purification was performed using the manufacturer filter columns and centrifugation (9300 rpm for 1 min) after each step. First, 500 µL of Inhibitor Removal Buffer was used for purification and two additional steps with 450 µL of Wash Buffer were carried out. Lastly, for DNA elution 50 µL of Elution Buffer was used and after centrifugation at 9300 rpm/1 min, the purified DNA was stored at −70 • C until use.
Primers previously described for L1 [34] and LCR [23] regions and new designed primers after studying all HPV16 lineages (including all known African variants) were used for amplification (Table 1).
PCR reactions were performed using 5 µL of DNA for each target gene in a final volume of 25 µL reaction mix containing 1.5 pmol of each forward and reverse primers (for L1.1. and L1.2. reactions) or 2 pmol of each forward and reverse primers (for LCR.1. and LCR.2. reactions), 12.5 µL KAPA2G Fast HotStart ReadyMix (2X) (Sigma-Aldrich, Wilmington, MA, USA) and water. All amplifications were carried out a t 95 • C for 5 min and 40 cycles of denaturation at 95 • C for 15 s, annealing at 55 • C for 15 s, extension at 72 • C for 30 s, and a final elongation step at 72 • C for 1 min.

Ethics
The project was approved by the Human Subjects Review Committees at Monkole Hospital/CEFA (Kinshasa, DRC) and University of Navarra (Pamplona, Spain). Informed consent of enrolled participants was obtained. All methods were carried out in accordance with relevant guidelines and regulations.

Results
The mean (SD) age of our population was 44.6 (10.9) years old. Ninety participants within four hundred and eighty screened women were infected by HPV (18.8%), 75% of them HR-HPV. There were no differences in age between patients infected and not infected with HPV. Among those infected, 11 out of 90 women were infected by more than one strain (up to seven different types). The most prevalent HR-HPV type was HPV16 found in 11 women (12.2%) followed by HPV52 (8.9%) and HPV33/HPV35 (7.8% each) ( Figure 1).
All HPV16 positive samples (eleven) were subjected to L1 and LCR sequencing. Six out of eleven (54%) and eight out of eleven (72%) samples were positive after L1 and LCR amplification, respectively. However, after the sequencing reaction, three sequences had to be discarded due to low quality, so the final number of available sequences was five L1 sequences and six LCR sequences, belonging to seven different patients.

HPV16 Lineage Distribution
Phylogenetic analysis of both regions (L1 and LCR) allowed us to obtain the lineage of seven out of eleven HPV16 infections detected among the screened women. The African lineage AFR2 was the most prevalent within this cohort, followed by the European ones. Four samples (57.1%) were classified as C lineage; two samples (28.6%) as A1; and one sample belonged to B1 lineage (Figures 4 and 5).  All HPV16 positive samples (eleven) were subjected to L1 and LCR sequencing. Six out of eleven (54%) and eight out of eleven (72%) samples were positive after L1 and LCR amplification, respectively. However, after the sequencing reaction, three sequences had to be discarded due to low quality, so the final number of available sequences was five L1 sequences and six LCR sequences, belonging to seven different patients.

HPV16 Lineage Distribution
Phylogenetic analysis of both regions (L1 and LCR) allowed us to obtain the lineage of seven out of eleven HPV16 infections detected among the screened women. The African lineage AFR2 was the most prevalent within this cohort, followed by the European ones. Four samples (57.1%) were classified as C lineage; two samples (28.6%) as A1; and one sample belonged to B1 lineage (Figures 4 and 5).  The evolutionary history was inferred using the Neighbor-Joining method [37]. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Maximum Composite Likelihood method [38] and are in the units of the number of base substitutions per site. The analysis involved 25 nucleotide sequences (20 reference strains from GenBank and 5 new sequences from this study). Codon positions included were 1st + 2nd + 3rd + Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 1518 positions in the final dataset. Evolutionary analyses were conducted in MEGAX [35,36]. The evolutionary history was inferred using the Neighbor-Joining method [37]. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Maximum Composite Likelihood method [38] and are in the units of the number of base substitutions per site. The analysis involved 27 nucleotide sequences (21 reference strains from GenBank and 6 new sequences from this study). Codon positions included were 1st + 2nd + 3rd + Noncoding. All positions containing gaps and missing data were eliminated. There was a total of 985 positions in the final dataset. Evolutionary analyses were conducted in MEGAX [35,36].

L1 Genomic Polymorphisms
Seventeen single nucleotide changes were identified among the five L1 sequences obtained. These mutations included 10/17 (58.8%) silent mutations, and 7/17 (41.2%) which led to amino acids substitutions ( Table 2). T266A substitution was present in 80% of all the samples studied for this fragment; then S282P and T353P were present in 3/5 sequences; and L474F in two out of five samples. Finally, H76Y, T176N, and N181T were identified in one patient; these changes are located at the first segment of L1 region, only obtained for two samples (P44 and P114). The evolutionary history was inferred using the Neighbor-Joining method [37]. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Maximum Composite Likelihood method [38] and are in the units of the number of base substitutions per site. The analysis involved 27 nucleotide sequences (21 reference strains from GenBank and 6 new sequences from this study). Codon positions included were 1st + 2nd + 3rd + Noncoding. All positions containing gaps and missing data were eliminated. There was a total of 985 positions in the final dataset. Evolutionary analyses were conducted in MEGAX [35,36].
In addition, two consecutive positions showed nucleotide changes in lineage C: G7435A alone (highly frequent) or combined with G7436C.

Discussion
This study provides the first data on the lineages of HPV16 from Kinshasa (DRC) and constitutes a helpful experimental basis for understanding HPV16 variants in this area, the most prevalent L1 and LCR mutations, and the correlation between mutations of the LCR region and malignancy and L1 region and immunogenicity of the virus. To our knowledge, there are no previous studies that have included a full description of L1 and LCR from HPV16 mutations and variants collected in DRC.
The prevalence of HPV infection within our cohort was 18.8%, similar to that reported before in Kinshasa between 11-28% [17,39]. However, the main HR type found among our participants was HPV16 (12.2%), different from the distribution reported before where HPV68 and HPV35 were the most prevalent HR types detected [17,39].
Considering the presence of HPV types among our participants showing intraepithelial lesions, HPV33 and HPV35 were the most frequent types associated with LSIL, followed by HPV16 and HPV58, while HPV16 was the main virus associated with HSIL. Like our study, previous data from Kinshasa have highlighted an important incidence of HPV35, HPV52, HPV16, and HPV18 among women with LSIL and HSIL; however, HPV68, previously reported, was not predominant in our series [40]. Collected data from Eastern, Southern, and Western Africa stated, like our study, that HPV16, HPV18, HPV35. and HPV52 were the leading types associated with LSIL [19]. The information from Central Africa is limited; we found also a high presence of HPV33 and HPV58 among women with LSIL, uncommon strains in other areas of the continent among LSIL. HSIL were mainly associated in our series with HPV16, as described in other African countries. Other HPV types were also found in HSIL participants where HPV39 has not been found associated with this phase in other countries where data is available [19].
Overall, L1 looks to be a more conserved region, as shown in previous studies, with 17 single nucleotide changes found, versus 35 mutations detected in LCR. Furthermore, mutations in L1 (when they occurred) were present in almost all samples, unlike LCR, where different patterns were found with changes present in one single sample. This may be due to the different functions of each gene, with L1 being a part of the virus assembly, while LCR is involved in regulating the expression of different proteins.
L1 plays a vital role in the efficacy of vaccines against HPV16 that are currently on the market. This major capsid protein sequence is usually well conserved, but mutations can have an impact in the virus sensitivity to antibodies induced by vaccines, especially when they are present on the immune dominant loops [41]. Our samples from Kinshasa showed some mutations related to this interaction, like A797G, T844C, A1057C, and A542C mutations (amino acid substitutions T266A, S282P, T353P, and N181T, respectively).
Ning et al. studied how the changes in the amino acid sequence of the L1 region of HPV16 influenced the susceptibility to antibody-mediated neutralization by creating pseudovirions. Their study showed that amino acid positions 181 (EF loop) or 282 (FG loop) were critical residues involved in the conformational epitopes recognized by monoclonal antibodies (MAbs). Especially the S282P mutation induced decreased reactivity to the virus for 7/13 MAbs studied and could lead to a complete loss of susceptibility to neutralization [42]. In other studies, T266A (FG loop) and T353P (HI loop) were reported to destabilize the L1 monomer structure due to loss of hydrogen bonds caused by those amino acid substitutions [43]. They have also been reported to have a potential impact on antigenicity and recognition by antibodies and T cells [42,44].
Overall, H76Y, T176N, N181T, A266T, T353P, and L474F, all mutations present in our samples from Kinshasa, have been described as variations in immune dominant loops of HPV16 L1, and might affect the efficacy of the vaccines available [45]. These mutations have been previously reported in different variants, the most common being T353P, in 20.1% of samples, followed by H76Y (15.3%), within collections including little information from Africa [42].
Further research needs to be done, because scarce data on L1 mutation prevalence when talking about African lineages are available [28,45,46]. For the rest of the lineages, some of these mutations have shown ethnicity differences; T176N mutation is more prevalent between Asian and American populations (23.4-25.3%), while A266T is more prevalent in the European population (16.4%) [45].
LCR has been described as the most variable region of the HPV16 genome [47][48][49], and it is well known for its role in the development of cervical cancer. Changes in this region can influence not only the transcription of viral oncogenes but the addition/loss of binding sites for transcription factors [23,50]. Nowadays, there are plenty of studies that link the presence of certain mutations with the malignity of HPV16, but the actual role of the specific SNPs and their link to transcriptional factors, as well as the actual role they play in the likelihood of cancer development, are not well established.
Some studies locate this mutation at the binding site of the transcriptional repressor sexdetermining region Y-box 9 protein (SOX9) [23]. This could down-regulate the expression of this gene, which is thought to act as a tumor suppressor in the development of cervical cancer [52].
On the other hand, other studies describe this mutation as involved in the YY1 binding site, a mutation that is found in the majority of patients with cervical cancer, and that may promote the activity of p97, involved in the transcription of other oncogenes. Both mechanisms are believed to have a role in the development of tumors for different cancers [30,53,54].
Other mutations described in the literature, such as G7193T, have been linked to the transcriptional enhancer factor (TEF-1) [51,54] that influences the regulation of the LCR transcription activity and, like YY1, acts as a critical activator protein for p97 activity [55]. Furthermore, the G145T mutation is described in other studies, suspected to affect p53 binding and degradation as well as acting like a B and T cell epitope [44].
According to phylogenetic analysis of L1 and LCR sequences, HPV16 strains detected in our samples from Kinshasa belonged mostly to lineage C (AFR2) (57%), but strains belonging to lineage A1 and B were also found. Cornet et al. described a number of E2/LCR genetic variations in HPV16 that correlate with phylogenetic classification. In this sense, mutations such as C7485A, C7669T, C143G, and G145T were present in samples classified as African lineage. In addition, strains classified as lineage C (AFR2) contained specific mutations such as G7826A, A7837C, G7387C, and G7435A. One sample classified as lineage B1 could be assigned to sublineage AFR1a because of the presence of T7714A and T7293G [27]. Finally, two samples were classified as A1 (EUR) variants and they did not show any of the classic AFR mutations described, but were more similar to the reference sequence.
Our study is subject to a number of potential limitations. The first is that only HPV16 was studied and the sample size of HPV16 detected and sequenced was low; however, the information obtained from L1 and LCR sequences was homogenous and consistent to be able to analyze the genetic polymorphisms of HPV16 in this area. Second, no information about HPV vaccination was obtained from our patients, but unfortunately, a vaccination campaign against HPV has not been initiated in DRC, so no effect from vaccination could be expected. Introducing inexperienced medical providers into the cervical cancer screening campaign may have had a potential bias. However, this suggests that these campaigns might be feasible in the context of amateur observers after short training. Finally, no next-generation sequencing was performed; however, Sanger sequencing is accepted as a reliable tool for HPV typing and polymorphisms detection. Additional sequencing studies should be performed with other HPV types highly prevalent in the DRC, such as HPV18, 33,35,52,53,61, and 68, so that information obtained from them could be representative of the HPV impact in the region.
The study has several notable strengths, including the description of HPV epidemiology among women from DRC showing LSIL and HSIL using a reliable commercial test. This study described for the first time the distribution of HPV16 L1 and LCR genetic variants and polymorphisms in infected women from the Democratic Republic of the Congo. The characterization of HPV16, using molecular and phylogenetic techniques, from subjects in Kinshasa belonging mainly to the general population, confirmed lineage C (AFR2) as the predominant strain in our study population. This work can help to understand the natural history and carcinogenicity of HPV16 genetic variants, as well as enabling the study of potential vaccine escape mutants in Kinshasa.
Finally, further analysis with a larger sample size needs to be performed in order to obtain reliable information about all the HPV16 variants circulating in Kinshasa. As previously reported in other African countries, the genomic diversity of HPV16 variants has been demonstrated and the L1/LCR mutations identified should be further investigated using functional assays [28,46]. This will allow future researchers to assess if the use of prophylactic vaccination will protect this population against the virus, and to find possible mutations in these variants that increase the risk of development of cervical cancer.

Conclusions
In conclusion, the prevalence of HPV infection within women from Kinshasa was 18.8%, with HPV16 as the most common HR type found (12.2%), mainly belonging to lineage C (AFR2). HPV33, HPV35, HPV16, and HPV58 were the most frequent types associated with LSIL, while HSIL were predominantly associated with HPV16. L1 mutations (T266A, S282P, T353P, and N181T) were common among the HPV16 strains detected in Kinshasa, and their potential effect on vaccine-induced neutralization, especially the presence of S282P, should be further investigated. LCR variability of HPV16 in this area was very high with frequent mutations like G7193T, G7521A, and G145T that could promote malignancy.