International Journal of Molecular Sciences
  • Article
  • Open Access

15 November 2025

Genetic Diversity of Selected High-Risk HPV Types Prevalent in Africa and Not Covered by Current Vaccines: A Pooled Sequence Data Analysis

1 Discipline of Biochemistry, School of Agriculture and Science, University of KwaZulu-Natal, Pietermaritzburg 3209, South Africa
2 Tumour Virology, International Centre for Genetic Engineering and Biotechnology (I.C.G.E.B.), 34149 Trieste, Italy
3 Centre for the AIDS Programme of Research in South Africa (CAPRISA), School of Medicine, University of KwaZulu-Natal, Durban 4013, South Africa
4 Discipline of Genetics, School of Agriculture and Science, University of KwaZulu-Natal, Pietermaritzburg 3209, South Africa
Int. J. Mol. Sci. 2025, 26(22), 11056; https://doi.org/10.3390/ijms262211056
(registering DOI)
This article belongs to the Special Issue Future Challenges and Innovation in Gynecological Oncology

Abstract

High-risk human papillomavirus (HR-HPV) types exhibit an uneven global distribution, with types 35, 51, 56, and 59 being more prevalent in Africa yet not covered by current L1-based vaccines. The genetic diversity of HR-HPV oncoproteins in Africa remains poorly characterized, despite their potential as alternative vaccine targets. This study investigates the genetic diversity of HR-HPV types 16, 18, 35, 51, 56, and 59 to inform vaccine development. We analyzed 14,332 sequences from the NCBI Virus database and 222 HPV reference sequences from the Papillomavirus Episteme (PaVE) database using phylogenetic analysis and variant identification. HPV16 and HPV35 exhibited close evolutionary relatedness, which may indicate shared traits relevant to vaccine design, although functional implications remain to be experimentally validated. A key finding of the study was the discovery of novel non-synonymous mutations, including E148K in HPV35 E6, S63C in HPV16 E7 and S495F in HPV18 L1, as well as known oncogenic variants such as L83V (E6) and N29S (E7) in HPV16. These findings highlight significant intra- and inter-type diversity among African HR-HPVs. This study provides new insights into the genetic diversity and evolutionary relationships of underrepresented HR-HPV types. The findings underscore the need for continued genomic surveillance and support efforts to develop region-specific vaccines that include HPV35, 51, 56, and 59 to address gaps in current vaccine coverage and help reduce the burden of HPV-related cancers in Africa.

1. Introduction

Cervical cancer remains a major global health burden, with approximately 702,000 new cases and 370,000 deaths estimated for 2025 []. More than 99% of cervical cancer cases have been linked to high-risk human papillomaviruses (HR-HPVs), which are primarily transmitted through sexual contact [,]. Although the immune system clears around 90% of HPV infections within 6–24 months, approximately 10% of cases progress to chronic infections [,]. These infections lead to oncogenesis and various cancers, including cervical, anogenital, and head-and-neck cancers [,,,].
In 2018, the World Health Organization (WHO) pledged to eliminate cervical cancer as a public health problem, setting targets to be met by 2030. The global elimination strategy is built on three pillars: vaccinating 90% of girls against HPV by age 15, screening 70% of women for cervical cancer by age 35 and again by age 45, and ensuring treatment for 90% of women diagnosed with the disease [].
Because the success of these interventions hinges on targeting the most relevant HPV types, it is critical to understand HPV risk classification. HPVs are categorized as either high-risk or low-risk based on their oncogenic potential. Phylogenetic studies have shown that HR-HPV types, including 16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 68, and 73, belong to the closely related Alpha-papillomavirus genus []. Among these, HPV types 16 and 18 are the most prevalent worldwide, accounting for approximately 70% of all HPV-related cancer cases globally []. The distribution of other HR-HPV types varies across geographic regions [,]. Notably, epidemiological data indicate that HR-HPV types 35, 51, 56, and 59, while common globally, are disproportionately prevalent in Africa compared with other regions [,].
Currently, there is no cure for HPV infections. However, six prophylactic vaccines based on the L1 capsid protein are available []. The first three globally approved vaccines are Gardasil® (HPV6/11/16/18), Cervarix® (HPV16/18), and Gardasil 9® (HPV6/11/16/18/31/33/45/52/58). In 2022, two additional bivalent vaccines, Cecolin® and Walrinvax®, both targeting HPV types 16 and 18, were approved in China [], and a quadrivalent vaccine, Cervavac® (HPV6/11/16/18), was approved in India [,]. All three new vaccines received WHO prequalification in 2024 [].
Despite widespread global use, concerns remain about vaccine efficacy and coverage in diverse populations. This issue is particularly relevant for African populations, which are among the most genetically diverse in the world []. Of the ten most prevalent HR-HPV types among African women, four types, HPV35, 51, 56, and 59, are not covered by current vaccines []. This gap underscores the need for region-specific vaccines to reduce the burden of HPV-related cancers in Africa.
Advances in vaccine technology are creating opportunities to explore new candidates and strategies for HPV prevention []. Current research and clinical trials are investigating vaccines that target the L2 minor capsid protein and the Early (E) proteins []. Among these, the E5, E6, and E7 oncoproteins are of particular interest, as they play critical roles in initiating tumorigenesis and maintaining the malignant state of infected epithelial cells [,]. However, the diversity of these proteins may present challenges for vaccine design. Understanding the genetic variations in HR-HPVs is therefore crucial for developing vaccines with broader and more effective coverage. Although various isolated studies have explored HR-HPV genetic diversity worldwide [,,,,,,,], a comprehensive, systematic analysis of the genetic variability of multiple HR-HPV types, particularly those highly prevalent in Africa, is still lacking.
To address this gap, the present study analyzed pooled sequence data of six HR-HPV types (16, 18, 35, 51, 56, and 59) to characterize the genetic diversity of the E5, E6, and E7 oncoproteins, as well as the L1 major capsid protein. By comparing African and global sequence data, we aimed to identify Africa-specific variants and assess their potential impact on protein function. These insights are critical for guiding the design of next-generation, region-specific HPV vaccines and improving global vaccine strategies.

2. Results

2.1. Sequence Dataset Composition

A comprehensive search of the NCBI Virus Database yielded a total of 13,860 verified DNA sequences of HR-HPV types 16, 18, 35, 51, 56, and 59 from global and African sources (Figure 1). Of these, 1063 were near-complete (≥80%) genomes, while 12,797 were partial genomes or gene sequences. Globally, HPV16 and HPV18 had the highest representation, with 10,733 and 1063 sequences, respectively, followed closely by HPV35 with 1045 sequences. Within the African dataset, the most frequently represented type was HPV16 (432 sequences), followed by HPV35 (263 sequences) and HPV18 (60 sequences) (Figure 1). In contrast, HPV51, HPV56, and HPV59 were markedly underrepresented in both the global and African datasets. Notably, Oceania contributed sequences exclusively for HPV35. The limited number of sequences for HPV51, HPV56, and HPV59 significantly constrains the ability to make robust comparisons between these types and the more well-represented HR-HPVs.
Figure 1. Pooled sequence data analysis of verified HR-HPV nucleotide sequences from the NCBI database. A breakdown of sequences for HPV types 16, 18, 35, 51, 56, and 59, focusing on those most prevalent in African populations. Data are categorized by geographic region, highlighting the distribution of sequences across continents, including near-complete (≥80%) genomes, partial genomes and genes.
Sequence datasets from Africa were further processed to identify and extract full-length gene sequences encoding the E5, E6, E7, and L1 proteins using their open reading frames (from start to stop codons) (Table 1). Gene sizes corresponded closely with those of the respective reference genomes and, consistent with the annotated reference sequences, differed between HR-HPV types. The E5 gene ranged from 222 bp to 252 bp (encoding 73–83 amino acids, excluding the stop codon) in HPV types 16, 18, 35, and 59, but was not detected in HPV types 51 and 56, in agreement with their reference genome annotations (PaVE IDs: HPV51REF and HPV56REF, respectively). This analysis included a total of 220 full-length E5 gene sequences from ten African countries. The E6 gene ranged from 450 bp to 483 bp (149–160 amino acids), with distinct sizes observed for each HPV type. A total of 242 full-length E6 sequences were analyzed from twelve African countries. The E7 gene ranged from 297 bp to 324 bp (98–107 amino acids), with 251 full-length sequences included from eleven African countries. The L1 gene ranged from 1500 bp to 1527 bp (499–508 amino acids), with 249 full-length sequences obtained from ten African countries. In total, 962 full-length gene sequences were analyzed, distributed across HR-HPV types as follows: HPV16 (48 sequences, 5.0%), HPV18 (41 sequences, 4.3%), HPV35 (763 sequences, 79.3%), HPV51 (37 sequences, 3.8%), HPV56 (33 sequences, 3.4%), and HPV59 (40 sequences, 4.2%).
Table 1. Full-length E5, E6, E7, and L1 gene sequences of HR-HPV types in Africa retrieved from the NCBI Virus Database.
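The gene-size and count figures reported above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis; the function and variable names are illustrative, and it assumes that each reported gene length spans the ORF from start codon to stop codon, so that the stop codon is counted in base pairs but not in amino acids.

```python
def protein_length(gene_bp: int) -> int:
    """Amino acids encoded by an ORF of gene_bp nucleotides.

    Assumes gene_bp includes the stop codon, which is not translated.
    """
    if gene_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return gene_bp // 3 - 1


# Gene-size ranges (bp) as reported in the text.
GENE_RANGES_BP = {
    "E5": (222, 252),
    "E6": (450, 483),
    "E7": (297, 324),
    "L1": (1500, 1527),
}

# Full-length sequence counts as reported in the text.
COUNTS_BY_GENE = {"E5": 220, "E6": 242, "E7": 251, "L1": 249}
COUNTS_BY_TYPE = {
    "HPV16": 48, "HPV18": 41, "HPV35": 763,
    "HPV51": 37, "HPV56": 33, "HPV59": 40,
}

total = sum(COUNTS_BY_GENE.values())
shares = {t: round(100 * n / total, 1) for t, n in COUNTS_BY_TYPE.items()}

for gene, (shortest, longest) in GENE_RANGES_BP.items():
    print(gene, protein_length(shortest), protein_length(longest))
print(total, shares)
```

Both tallies (by gene and by HPV type) sum to the same total of 962, and the rounded shares reproduce the percentages given in the paragraph.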

2.2. Phylogenetic Analysis

A total of 222 HPV reference sequences were retrieved and used for the analysis. Figure 2 illustrates the phylogenetic relationships of these sequences, including the HR-HPV consensus sequences of interest (16, 18, 35, 51, 56, and 59). The results show that the HR-HPVs of interest (highlighted in green) are closely related, clustering together within different branches of the same monophyletic clade (outlined in green), with HPV16 and HPV35 forming a distinct sub-cluster.
Figure 2. Maximum likelihood tree topology of 222 HPV reference genomes. Multiple sequence alignments were performed using MAFFT v7.525, and phylogenetic analysis was performed using IQ-TREE v2.0.7 with SH-aLRT/UFboot tests for 1000 replicates. Purple dots indicate branches with support values ≥70%, where larger dots represent relatively higher support values. HPVs 16, 18, 35, 51, 56, and 59 are highlighted in green and cluster within the same monophyletic clade.
We compared the genetic diversity of HR-HPV consensus sequences from different regions of the world, as shown in Figure 3 (further detail is presented in Supplementary Figure S1 and Table S1). Based on the cladograms of regional consensus sequences, HPV16 and HPV51 populations from Asia and Africa were found to be closely related. Similarly, the African HPV56 and HPV59 populations showed a close relationship with their counterparts in South America.
Figure 3. Maximum likelihood tree topologies of HR-HPV consensus sequences from different regions of the world. Multiple sequence alignments were performed using MAFFT v7.525, and phylogenetic analysis was conducted using IQ-TREE v2.0.7 with SH-aLRT/UFboot tests for 1000 replicates. Purple dots indicate branches with support values ≥70%, with larger dots representing relatively higher support values. Branches showing notable clustering are shown in blue.

2.3. Intra-Africa Diversity

The dataset from Africa was strongly biased, with sequences for HPV types 16, 18, 51, 56, and 59 originating only from South Africa or Togo. In contrast, HPV35 sequences exhibited greater diversity, representing a total of ten countries (Figure 4 and Supplementary Figure S2). Most branches on the phylogenetic trees of HR-HPVs 16, 18, 35, 56, and 59 were strongly supported (≥70%), whereas only three branches for HPV51 could be confidently resolved (Figure 4). For HPV35, a closer relationship was observed among the South African, Zimbabwean, and Togolese populations, while Moroccan and Nigerian populations clustered together. Figure 4 illustrates the phylogenetic analysis of HPV35 consensus sequences by country, with a more detailed analysis of individual isolates presented in Supplementary Figure S2.
Figure 4. Maximum likelihood tree topologies of near-complete (≥80%) genomes of HR-HPV isolates from countries across Africa. Multiple sequence alignments were performed using MAFFT v7.525, followed by phylogenetic analysis using IQ-TREE v2.0.7 with SH-aLRT/UFboot tests for 1000 replicates. Purple dots represent branches supported by values ≥ 70%, with larger dots indicating relatively higher support. Branches showing notable clustering in HPV35 are shown in blue.

2.4. Oncoprotein/L1 Variant Analysis

To assess the diversity of HR-HPV oncoproteins in Africa, full-length E5, E6, E7, and L1 genes were extracted from sequences obtained from the NCBI database. Tables 2–5 summarize variant counts, while Supplementary Tables S2–S5 provide detailed non-synonymous variations. Overall, the translated sequences were highly similar to their respective reference sequences. Several polymorphisms occurred at 100% frequency, including I44L in HPV16 E5, H78Y in HPV16 E6, and T266A in HPV16 L1. In HPV16 E5, I65 was mutated in all sequences, with I65V in 10 out of 11 sequences and I65L in one sequence.
Table 2. Summary of E5 Protein Amino Acid Polymorphisms Identified in Africa. Synonymous and non-synonymous SNPs were detected in the E5 gene sequences of HR-HPV types from African populations using the variant-calling tool in Geneious Prime®.
Table 3. Summary of E6 Protein Amino Acid Polymorphisms Identified in Africa. Synonymous and non-synonymous SNPs were detected in the E6 gene sequences of HR-HPV types from African populations using the variant-calling tool in Geneious Prime®.
Table 4. Summary of E7 Protein Amino Acid Polymorphisms Identified in Africa. Synonymous and non-synonymous SNPs were detected in the E7 gene sequences of HR-HPV types from African populations using the variant-calling tool in Geneious Prime®.
Table 5. Summary of L1 Protein Amino Acid Polymorphisms Identified in Africa. Synonymous and non-synonymous SNPs were detected in the L1 gene sequences of HR-HPV types from African populations using the variant-calling tool in Geneious Prime®.
For E7, 21 non-synonymous variants were identified, and L1 proteins exhibited 55 variants across all HR-HPVs. Three novel variants were detected in African populations: E148K in HPV35 E6, S63C in HPV16 E7 and S495F in HPV18 L1. Variants present at <1% frequency were reported descriptively and excluded from statistical testing. Two-tailed Fisher’s exact tests were applied only to variants with interpretable frequencies to determine whether the distribution of HPV protein variants differed significantly across geographic regions, enabling identification of common versus rare variants and the detection of regional clustering. As these analyses were exploratory, no multiple-testing correction was applied, and findings should therefore be interpreted with caution. Several variants displayed notable geographic patterns. In HPV35 E6, W78R clustered in Algeria, Rwanda, and Togo, whereas H98Y was rare but disproportionately present in Mali. In HPV35 E7, E63K was distributed across Algeria, Guinea, Nigeria, Rwanda, and Togo. In HPV35 L1, S348T occurred in eight countries (Algeria, Guinea, Kenya, Mali, Morocco, Nigeria, Rwanda, and South Africa), while S349T was strongly represented in Rwanda compared to other regions.
This analysis highlights both common and rare variants, identifies novel mutations, and reveals geographic clustering patterns that may inform regional vaccine design and enhance HR-HPV surveillance in Africa.

3. Discussion

This study provides a comprehensive analysis of the genetic diversity of six high-risk HPV types (16, 18, 35, 51, 56, and 59), with a focus on African populations compared to global sequences. By analyzing full-length genomes and key viral proteins, E5, E6, E7, and L1, our work identifies Africa-specific variations, including three novel non-synonymous mutations, E148K in HPV35 E6, S63C in HPV16 E7 and S495F in HPV18 L1, which have not been previously reported. While these findings highlight important intra- and inter-type diversity, it is important to recognize that the data are limited by the low number of available sequences for several HR-HPV types and the snapshot nature of database retrieval. Nevertheless, this study provides one of the first consolidated views of HR-HPV protein variation in Africa, establishing a baseline for future research and informing considerations for next-generation, region-specific HPV vaccines.
Previous research has largely focused on HPV16 and HPV18 due to their global prevalence, oncogenic potential, and inclusion in current vaccines []. Our findings suggest that HPV35, which is disproportionately prevalent in Africa, deserves greater attention. Phylogenetic analyses revealed a close evolutionary relationship between HPV35 and HPV16. While this may indicate shared biological traits, any implications for vaccine protection are speculative and require experimental validation. Similarly, evolutionary analyses indicated close relationships between HPV18 and HPV59, and between HPV51 and HPV56. Unlike prior studies that relied on concatenated amino acid sequences of selected proteins [], our analysis of 222 complete HPV reference genomes offers a more comprehensive perspective on these relationships.
The L1 gene, highly conserved across HPV types, remains the primary target for current vaccines [,]. The observed L1 variants in African HPV16, 35, and 59 sequences could potentially influence antigenicity, but further functional studies are needed to determine any impact on vaccine efficacy. Notable variations such as T266A and S282P in HPV16 occur within immunogenic regions, consistent with previous findings in South African isolates that suggest these variants affect T- and B-cell epitopes []. Several L1 variants identified in Africa have been reported in other regions, whereas S495F in HPV18 appears novel. Variations in HPV35 and other HR-HPVs further highlight the importance of continued surveillance to inform vaccine design.
All L1 variations in HPV16 observed in Africa have also been reported at varying frequencies in Europe, Asia, the Middle East, and the Americas [,,]. Similarly, L3M has been detected previously in America and Asia, while T88N, Q273P, and V323I from HPV18 have been reported in Europe [,]. Ahmed et al. also reported two variants (T389S and L475F) that were not observed in the present study. However, S495F in HPV18 represents a previously unreported variation in a South African isolate (OP971042.1). Additionally, S348T in HPV35 has been reported in Brazil [], while the HPV51 variations V264G and G265S were observed in Southwest China [].
In contrast, the E5 protein displayed considerable variability, which may limit its potential as a reliable vaccine target. Literature indicates that E5 is expressed exclusively by Alpha HPVs [,]. We also observed the absence of the E5 open reading frame (ORF) in HPV51 and HPV56, which has not been previously reported, suggesting that E5 may play a non-essential role in HPV-related oncogenesis and may be a poor candidate for therapeutic vaccines.
Variations in oncoproteins may influence HPV oncogenicity. Some, such as L83V in HPV16 E6 and N29S in E7, are associated with precancerous or cancerous phenotypes []. Combinations of multiple variations, including Q14H, H78Y, and L83V in HPV16 E6, may enhance viral immortalization and transformation of cells compared to wildtype oncoproteins []. These findings highlight the importance of investigating HR-HPV variations in Africa and their potential impact on vaccine-targeted antigenicity.
Comparison with the literature shows that several E6 and E7 variants identified in this study have been previously reported in other regions. For example, Q14D and H78Y in HPV16 were found in isolates from Congolese cervical cancer patients [], while L83V in HPV16 has been reported in Europe, South/Central America, and East Asia []. In HPV35, I73V and W78R have been described in Brazil, Germany, and China [,,,], whereas W78N in CAR was not observed in our study []. E148K in HPV35 and S495F in HPV18 are novel to African isolates. Other previously described variants include S100L in HPV51 and S14R and K54N in HPV56 [].
E7 variants in HPV16, such as N29S, have been reported in Korea and China [], while E7Q in Chad has also been identified in CAR []. S63C in HPV16 from South Africa is rare and has not been widely reported, whereas other variants at this position (S63F and S63P) occur in Romania and China [,]. E63K in HPV35 from Africa has recently been reported in China []. A recent study of South African and Mozambican HPV35 isolates found SNPs in E6 and E7, though they did not result in protein variation [].
While short-term vaccine efficacy studies in Africa are encouraging, long-term follow-up is critical due to the 3–20 year latency of HPV-related cancers [,,]. The increasing prevalence of HPV35 and limited data on HPV51, HPV56, and HPV59 highlight the need for region-specific vaccine strategies. Enhanced genomic surveillance could support Africa-centric vaccine development tailored to circulating HPV variants.
This study lays a foundation for future research into Africa-specific HPV vaccines but is subject to several limitations. Sampling bias due to skewed GenBank data results in under-representation of certain regions and HR-HPV types, particularly 51, 56, and 59, limiting robust conclusions. Clinical metadata, including lesion grade, patient age, and temporal data, were not included, restricting interpretation of some findings. Additionally, the limited time window of data extraction introduces snapshot bias.

4. Methods and Materials

4.1. Database Mining and Nucleotide Sequence Retrieval

DNA sequences of HR-HPV types 16, 18, 35, 51, 56, and 59 from both global and African sources were searched and retrieved from the National Center for Biotechnology Information (NCBI) Virus Sequence Database []. The search was conducted using the “Virus/Taxonomy” filter with the following taxonomic identifiers: HPV16 (Human papillomavirus 16, taxid:333760), HPV18 (Human papillomavirus 18, taxid:333761), HPV35 (Human papillomavirus 35, taxid:10587), HPV51 (Human papillomavirus 51, taxid:10595), HPV56 (Human papillomavirus 56, taxid:10596), and HPV59 (Human papillomavirus 59, taxid:37115). Additionally, the search results were filtered by geographic origin using the “Geographic Region” filter to separate African sequences from those collected globally (Figure 5). All retrieved sequences were saved as separate FASTA files between August and October 2024. Complete reference genomes for HR-HPV types 16, 18, 35, 51, 56, and 59 were retrieved from the Papillomavirus Episteme (PaVE) database []. Microsoft Excel Professional Plus 2019 (Redmond, WA, USA) was used to check for duplicate entries. Duplicate entries, including identical accession numbers originating from the same subject or laboratory, were removed prior to downstream analyses. Geographic metadata for each sequence were validated through a two-step process: first, each location was independently verified by two reviewers; second, locations were cross-checked against the accompanying metadata to ensure consistency and accuracy. Only sequences with validated metadata were included in subsequent analyses.
Figure 5. Flow diagram showing the data collection and sorting procedure for the study. Nucleotide sequences of six HR-HPV types (16, 18, 35, 51, 56, and 59) from different regions of the world were retrieved from the NCBI Virus database.
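The duplicate check described in Section 4.1 was carried out in Microsoft Excel. For readers who prefer a scripted equivalent, the sketch below shows one possible way to drop repeated accession numbers from FASTA text. It is an illustration under stated assumptions, not the procedure used in the study: the function names are hypothetical, the accession is assumed to be the first whitespace-delimited token of each header, and the accessions in the example are invented placeholders.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def drop_duplicate_accessions(records):
    """Keep the first record for each accession (assumed: first header token)."""
    seen = set()
    unique = []
    for header, seq in records:
        accession = header.split()[0]
        if accession in seen:
            continue
        seen.add(accession)
        unique.append((header, seq))
    return unique


# Toy input; the accession numbers below are placeholders, not real records.
EXAMPLE = """>ACC0001.1 toy record A
ACGT
ACGT
>ACC0002.1 toy record B
GGCC
>ACC0001.1 toy record A (repeat)
ACGTACGT
"""

records = parse_fasta(EXAMPLE)
unique = drop_duplicate_accessions(records)
print(len(records), len(unique))
```

Keeping the first occurrence is an arbitrary choice made for the sketch; the geographic metadata validation described above would still need to be done separately.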

4.2. Sequence Data Sorting and Annotation

The FASTA files containing the nucleotide sequences were imported into Geneious Prime® (version 2024.0.7, Dotmatics, Boston, MA, USA), a bioinformatics software used for sequence data analysis. Each sequence was manually renamed to include its geographic origin and accession number. The following identifier codes were used to represent each world region: Afr (Africa), Asia (Asia), Eur (Europe), NAm (North America), Oce (Oceania), and SAm (South America). Specific codes were also assigned to represent the African countries from which the sequences originated: Alg (Algeria), Cong (Republic of the Congo), CAR (Central African Republic), Chad (Chad), DRC (Democratic Republic of the Congo), Egy (Egypt), Gab (Gabon), Gamb (The Gambia), Gha (Ghana), Guin (Guinea), Ken (Kenya), Mali (Mali), Maus (Mauritius), Moro (Morocco), Ngra (Nigeria), RSA (Republic of South Africa), Rwa (Rwanda), Tan (Tanzania), Togo (Togo), Tuni (Tunisia), Uga (Uganda), and Zim (Zimbabwe). Any sequence classified as “unverified” in the NCBI database was discarded. Reference genomes were manually annotated in Geneious Prime®, using the information provided in the GenBank files and the locus viewer of the respective reference genomes on the PaVE database [].

4.3. Nucleotide Sequence Alignments and Phylogenetic Analysis

To assess genetic diversity and infer phylogenetic relationships among HR-HPV types, multiple sequence alignments were performed using MAFFT v7.525 with the FFT-NS-2 algorithm and default parameters (gap opening penalty = 1.53; gap extension penalty = 0.00) in Ubuntu (v24.04.2 LTS, Canonical, London, UK). The alignments included all verified near-complete genomes (≥80% reference sequence coverage) for each HPV type, along with their respective reference genomes. Only sequences with ≤15% ambiguous bases (N) were included, and insertions and deletions were treated as gaps in the alignments. Both coding regions and non-coding regions, including the long control region (LCR), were analyzed without masking hypervariable regions, in line with previously published literature []. Multiple sequence alignments were visualized and residues corresponding to gaps in the reference sequence were trimmed using AliView (v1.30, Uppsala University, Uppsala, Sweden) to reduce poorly aligned regions, ensuring consistent treatment of gaps across all sequences. Maximum likelihood phylogenetic trees were constructed using IQ-TREE v2.0.7 (multicore version) with ModelFinder for model selection and SH-aLRT/UFboot tests, each run with 1000 replicates []. The resulting trees were visualized and annotated using the Interactive Tree of Life (iTOL) v7 online tool [].
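The inclusion thresholds and tool settings described in this section can be summarized in a short sketch. This is an illustration rather than the pipeline actually used: the function names and file names are hypothetical, reference coverage is approximated here by the ratio of sequence length to reference length (the text does not specify how coverage was computed), and the command lines are assembled as argument lists without being executed. The flags shown (`--retree 2 --maxiterate 0` for FFT-NS-2, `--op`/`--ep` for gap penalties, `-m MFP` for ModelFinder, `-alrt` and `-B` for SH-aLRT and ultrafast bootstrap) are our reading of the MAFFT and IQ-TREE 2 options that correspond to the stated settings.

```python
def passes_quality_filters(seq, ref_length, min_coverage=0.80, max_n_fraction=0.15):
    """Apply the >=80% coverage and <=15% ambiguous-base (N) thresholds.

    Assumption: coverage is approximated as len(seq) / ref_length.
    """
    seq = seq.upper()
    if not seq or ref_length <= 0:
        return False
    coverage = len(seq) / ref_length
    n_fraction = seq.count("N") / len(seq)
    return coverage >= min_coverage and n_fraction <= max_n_fraction


def mafft_command(fasta_in):
    """Argument list for an FFT-NS-2 alignment with the stated gap penalties.

    MAFFT writes the alignment to standard output.
    """
    return ["mafft", "--retree", "2", "--maxiterate", "0",
            "--op", "1.53", "--ep", "0.0", fasta_in]


def iqtree_command(alignment):
    """Argument list for ModelFinder plus 1000 SH-aLRT and 1000 UFBoot replicates."""
    return ["iqtree2", "-s", alignment, "-m", "MFP",
            "-alrt", "1000", "-B", "1000"]


# Hypothetical file names, for illustration only.
print(" ".join(mafft_command("hpv35_africa.fasta")))
print(" ".join(iqtree_command("hpv35_africa.aln.fasta")))
```

Building the commands as lists keeps them ready for `subprocess.run` if the tools are installed, while leaving the sketch runnable without them.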

4.4. Variation Calling and Amino Acid Translation

All verified sequences retrieved from the NCBI Virus Database for each HPV type were aligned to their respective annotated reference sequences. The genes of interest (L1, E5, E6, and E7) were identified and extracted using the “Extract” tool in Geneious Prime®. Variant calling was performed using the “Find Variation/SNPs” function under the “Annotate and Predict” menu in Geneious Prime®. The following data were recorded for each identified variant: type of variant, frequency, and origin sequence. Incomplete gene sequences lacking identifiable start or stop codons were excluded. Because each HR-HPV type has a different size for the same gene (Table 1), only gene sequences with 100% coverage of the corresponding reference gene, and therefore at least its length, were included in variant calling. The nucleotide sequences were subsequently translated into amino acid sequences to classify mutations as synonymous or non-synonymous.
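The final classification step can be illustrated with a small sketch. Variant calling in the study was done in Geneious Prime®; the code below is not that tool's algorithm but a simplified stand-in that compares two equal-length coding sequences codon by codon using the standard genetic code. The function name and the toy sequences are hypothetical, and insertions, deletions, and ambiguous bases are deliberately not handled.

```python
BASES = "TCAG"
# Standard genetic code, ordered with bases T, C, A, G at each codon position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_variants(ref_cds, sample_cds):
    """Return (label, kind) for every codon that differs from the reference.

    label uses the usual notation, e.g. 'L83V' (reference residue, 1-based
    codon position, sample residue); kind is 'synonymous' or 'non-synonymous'.
    """
    if len(ref_cds) != len(sample_cds) or len(ref_cds) % 3 != 0:
        raise ValueError("sequences must be in-frame and of equal length")
    variants = []
    for start in range(0, len(ref_cds), 3):
        ref_codon = ref_cds[start:start + 3].upper()
        alt_codon = sample_cds[start:start + 3].upper()
        if ref_codon == alt_codon:
            continue
        if ref_codon not in CODON_TABLE or alt_codon not in CODON_TABLE:
            continue  # skip codons containing ambiguous bases
        ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
        kind = "synonymous" if ref_aa == alt_aa else "non-synonymous"
        variants.append((f"{ref_aa}{start // 3 + 1}{alt_aa}", kind))
    return variants


# Toy sequences (not HPV data): Met-Leu-stop with a change in codon 2.
print(classify_variants("ATGTTGTAA", "ATGGTGTAA"))
print(classify_variants("ATGTTGTAA", "ATGCTGTAA"))
```

In the first toy comparison the leucine codon becomes a valine codon, which is reported as non-synonymous; in the second it becomes another leucine codon, which is reported as synonymous.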

4.5. Statistical Analysis

Differences in the geographical distribution of variants were assessed using two-tailed Fisher’s exact tests in GraphPad Prism (v10.6.1, GraphPad Software, San Diego, CA, USA). Variants occurring at <1% frequency were reported descriptively and excluded from statistical testing to avoid over-interpretation. Statistically significant associations were indicated by asterisks (p < 0.05 to p < 0.0001), highlighting variants disproportionately represented in specific regions. No multiple-testing correction was applied, as the analyses were primarily descriptive and exploratory.
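For readers unfamiliar with the test, a two-tailed Fisher’s exact test on a 2 × 2 table can be sketched as follows. The study used GraphPad Prism; the code below is an independent illustration based on the hypergeometric distribution, using the common convention of summing the probabilities of all tables that are no more likely than the observed one. The function name is hypothetical, and the example counts are invented for demonstration and are not taken from the study.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test p-value for the table [[a, b], [c, d]].

    Rows could be, for example, two regions; columns, variant present/absent.
    Integer weights are compared exactly to avoid floating-point ties.
    """
    row1, row2, col1 = a + b, c + d, a + c

    def weight(x):
        # Unnormalized hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    extreme = sum(
        weight(x) for x in range(lowest, highest + 1) if weight(x) <= observed
    )
    return extreme / comb(row1 + row2, col1)


# Invented counts: variant present/absent in region A (8, 2) and region B (1, 5).
p_value = fisher_exact_two_tailed(8, 2, 1, 5)
print(round(p_value, 4))
```

Because no multiple-testing correction was applied in the study, any p-value obtained this way for a single variant should be read as exploratory.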

5. Conclusions

Our findings highlight the genetic diversity of clinically significant HR-HPV types in Africa, particularly HPV16, 18, 35, 51, 56, and 59. Three novel non-synonymous variants, E148K in HPV35 E6, S63C in HPV16 E7 and S495F in HPV18 L1, were identified, expanding the catalog of African HR-HPV diversity. Despite the limited sequence availability and snapshot nature of the data, our results highlight genetic diversity that may be relevant to vaccine design, though functional effects on vaccine efficacy remain to be established. Strengthening genomic surveillance and expanding sequencing efforts in Africa are critical to monitor HPV distribution and emerging variants and to guide the development of effective vaccines. The South African Vaccine Innovation and Manufacturing Strategy (VIMS) project [] provides a key opportunity for collaborative efforts to address these challenges.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijms262211056/s1.

Author Contributions

The study was conceptualized by N.D.M.; N.D.M. and P.P.M. jointly supervised the study and provided critical guidance throughout. The methodology was designed and carried out by B.N., N.D.M. and P.P.M. Software operation was performed by B.N. and N.D.M., and validation was carried out by N.D.M., P.P.M., L.B. and M.T. Formal analysis was performed by B.N., with investigation and data curation conducted by B.N., N.D.M., P.P.M., L.B. and M.T. The original draft of the manuscript was prepared by B.N. and reviewed by N.D.M., P.P.M. and M.T. All authors have read and agreed to the published version of the manuscript.

Funding

N.D.M. would like to thank the National Research Foundation (NRF), with grant number TTK230430100291, and the Poliomyelitis Research Foundation (PRF), with grant number 24/81, for research funding. B.N. would like to thank the PRF for funding, with grant number 24/64.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The datasets analyzed during the current study are available in the National Center for Biotechnology Information (NCBI) repository, https://www.ncbi.nlm.nih.gov/. Accession numbers of sequences analyzed are provided in the Supplementary Materials.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

HR-HPV: high-risk human papillomavirus
NCBI: National Center for Biotechnology Information
ORF: open reading frame
SNP: single nucleotide polymorphism
VIMS: Vaccine Innovation and Manufacturing Strategy

References

  1. WHO. Global Cancer Observatory; WHO: Geneva, Switzerland, 2025; Available online: https://gco.iarc.fr/tomorrow/en/dataviz/isotype?cancers=23&single_unit=50000&types=0 (accessed on 16 July 2025).
  2. Yuan, Y.; Cai, X.; Shen, F.; Ma, F. HPV post-infection microenvironment and cervical cancer. Cancer Lett. 2021, 497, 243–254.
  3. Jary, A.; Teguete, I.; Sidibé, Y.; Kodio, A.; Dolo, O.; Burrel, S.; Boutolleau, D.; Beauvais-Remigereau, L.; Sayon, S.; Kampo, M. Prevalence of cervical HPV infection, sexually transmitted infections and associated antimicrobial resistance in women attending cervical cancer screening in Mali. Int. J. Infect. Dis. 2021, 108, 610–616.
  4. de Sanjose, S.; Brotons, M.; Pavon, M.A. The natural history of human papillomavirus infection. Best Pract. Res. Clin. Obstet. Gynaecol. 2018, 47, 2–13.
  5. Laganà, A.S.; Chiantera, V.; Gerli, S.; Proietti, S.; Lepore, E.; Unfer, V.; Carugno, J.; Favilli, A. Preventing persistence of HPV infection with natural molecules. Pathogens 2023, 12, 416.
  6. Szymonowicz, K.A.; Chen, J. Biological and clinical aspects of HPV-related cancers. Cancer Biol. Med. 2020, 17, 864.
  7. Ribeiro, D.V.; Steffens, S.M.; Fedrizzi, E.N. The impact of the HPV vaccine on the world: Initial outcomes and challenges. Braz. J. Sex. Transm. Dis. 2020, 32, e203204.
  8. Huber, J.; Mueller, A.; Sailer, M.; Regidor, P.-A. Human papillomavirus persistence or clearance after infection in reproductive age. What is the status? Review of the literature and new data of a vaginal gel containing silicate dioxide, citric acid, and selenite. Women’s Health 2021, 17, 17455065211020702.
  9. Bray, F.; Laversanne, M.; Sung, H.; Ferlay, J.; Siegel, R.L.; Soerjomataram, I.; Jemal, A. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA A Cancer J. Clin. 2024, 74, 229–263.
  10. WHO. Accelerating the Elimination of Cervical Cancer as a Public Health Problem: Towards Achieving 90–70–90 Targets by 2030; WHO: Geneva, Switzerland, 2022.
  11. Doorbar, J.; Quint, W.; Banks, L.; Bravo, I.G.; Stoler, M.; Broker, T.R.; Stanley, M.A. The biology and life-cycle of human papillomaviruses. Vaccine 2012, 30, F55–F70.
  12. Bruni, L.; Albero, G.; Serrano, B.; Mena, M.; Collado, J.J.; Gómez, D.; Muñoz, J.; Bosch, F.; de Sanjosé, S. Human Papillomavirus and Related Diseases in Africa: Summary Report 10 March 2023; IARC Information Centre on HPV and Cancer (HPV Information Centre): Barcelona, Spain, 2023.
  13. Bruni, L.; Albero, G.; Rowley, J.; Alemany, L.; Arbyn, M.; Giuliano, A.R.; Markowitz, L.E.; Broutet, N.; Taylor, M. Global and regional estimates of genital human papillomavirus prevalence among men: A systematic review and meta-analysis. Lancet Glob. Health 2023, 11, e1345–e1362. [Google Scholar] [CrossRef]
  14. Bosch, F.X.; Burchell, A.N.; Schiffman, M.; Giuliano, A.R.; de Sanjose, S.; Bruni, L.; Tortolero-Luna, G.; Kjaer, S.K.; Munoz, N. Epidemiology and natural history of human papillomavirus infections and type-specific implications in cervical neoplasia. Vaccine 2008, 26, K1–K16. [Google Scholar] [CrossRef]
  15. Bruni, L.; Albero, G.; Serrano, B.; Mena, M.; Collado, J.; Gómez, D.; Muñoz, J.; Bosch, F.; de Sanjosé, S. Human Papillomavirus and Related Diseases in the World: Summary Report 10 March 2023; IARC Information Centre on HPV and Cancer (HPV Information Centre): Barcelona, Spain, 2023. [Google Scholar]
  16. Ghosh, A.; Chatterjee, S.; Dawn, A.; Das, A. HPV Vaccines–An Overview. Indian J. Dermatol. 2025, 70, 188–200. [Google Scholar] [CrossRef]
  17. Wang, R.; Huang, H.; Yu, C.; Li, X.; Wang, Y.; Xie, L. Current status and future directions for the development of human papillomavirus vaccines. Front. Immunol. 2024, 15, 1362770. [Google Scholar] [CrossRef]
  18. The Lancet Oncology. HPV vaccination in south Asia: New progress, old challenges. Lancet Oncol. 2022, 23, 1233. [Google Scholar]
  19. WHO. WHO Adds an HPV Vaccine for Single-Dose Use; WHO: Geneva, Switzerland, 2024; Available online: https://www.who.int/news/item/04-10-2024-who-adds-an-hpv-vaccine-for-single-dose-use (accessed on 16 September 2025).
  20. Gomez, F.; Hirbo, J.; Tishkoff, S.A. Genetic variation and adaptation in Africa: Implications for human evolution and disease. Cold Spring Harb. Perspect. Biol. 2014, 6, a008524. [Google Scholar] [CrossRef]
  21. Ghattas, M.; Dwivedi, G.; Lavertu, M.; Alameh, M.-G. Vaccine technologies and platforms for infectious diseases: Current progress, challenges, and opportunities. Vaccines 2021, 9, 1490. [Google Scholar] [CrossRef]
  22. Zhang, Y.; Qiu, K.; Ren, J.; Zhao, Y.; Cheng, P. Roles of human papillomavirus in cancers: Oncogenic mechanisms and clinical use. Signal Transduct. Target. Ther. 2025, 10, 44. [Google Scholar] [CrossRef]
  23. DiMaio, D.; Mattoon, D. Mechanisms of cell transformation by papillomavirus E5 proteins. Oncogene 2001, 20, 7866–7873. [Google Scholar] [CrossRef]
  24. Peng, Q.; Wang, L.; Zuo, L.; Gao, S.; Jiang, X.; Han, Y.; Lin, J.; Peng, M.; Wu, N.; Tang, Y. HPV E6/E7: Insights into their regulatory role and mechanism in signaling pathways in HPV-associated tumor. Cancer Gene Ther. 2024, 31, 9–17. [Google Scholar] [CrossRef] [PubMed]
  25. Prado, J.C.; Calleja-Macias, I.E.; Bernard, H.-U.; Kalantari, M.; Macay, S.A.; Allan, B.; Williamson, A.-L.; Chung, L.-P.; Collins, R.J.; Zuna, R.E. Worldwide genomic diversity of the human papillomaviruses-53, 56, and 66, a group of high-risk HPVs unrelated to HPV-16 and HPV-18. Virology 2005, 340, 95–104. [Google Scholar] [CrossRef] [PubMed]
  26. Calleja-Macias, I.E.; Kalantari, M.; Huh, J.; Ortiz-Lopez, R.; Rojas-Martinez, A.; Gonzalez-Guerrero, J.F.; Williamson, A.-L.; Hagmar, B.; Wiley, D.J.; Villarreal, L. Genomic diversity of human papillomavirus-16, 18, 31, and 35 isolates in a Mexican population and relationship to European, African, and Native American variants. Virology 2004, 319, 315–323. [Google Scholar] [CrossRef]
  27. Calleja-Macias, I.E.; Villa, L.L.; Prado, J.C.; Kalantari, M.; Allan, B.; Williamson, A.-L.; Chung, L.-P.; Collins, R.J.; Zuna, R.E.; Dunn, S.T. Worldwide genomic diversity of the high-risk human papillomavirus types 31, 35, 52, and 58, four close relatives of human papillomavirus type 16. J. Virol. 2005, 79, 13630–13640. [Google Scholar] [CrossRef] [PubMed]
  28. Gagnon, S.; Hankins, C.; Money, D.; Pourreaux, K.; Franco, E.; Coutlée, F.; Canadian Women’s HIV Study Group. Polymorphism of the L1 capsid gene and persistence of human papillomavirus type 52 infection in women at high risk or infected by HIV. JAIDS J. Acquir. Immune Defic. Syndr. 2007, 44, 61–65. [Google Scholar] [CrossRef]
  29. Gagnon, S.; Hankins, C.; Tremblay, C.; Forest, P.; Pourreaux, K.; Coutlée, F.; Canadian Women’s HIV Study Group. Viral polymorphism in human papillomavirus types 33 and 35 and persistent and transient infection in the genital tract of women. J. Infect. Dis. 2004, 190, 1575–1585. [Google Scholar] [CrossRef] [PubMed]
  30. Gagnon, S.; Hankins, C.; Tremblay, C.; Pourreaux, K.; Forest, P.; Rouah, F.; Coutlée, F. Polymorphism of human papillomavirus type 31 isolates infecting the genital tract of HIV—seropositive and HIV—seronegative women at risk for HIV infection. J. Med. Virol. 2005, 75, 213–221. [Google Scholar] [CrossRef]
  31. Raiol, T.; Wyant, P.S.; de Amorim, R.M.S.; Cerqueira, D.M.; Milanezi, N.v.G.; Brigido, M.d.M.; Sichero, L.; Martins, C.R.F. Genetic variability and phylogeny of the high-risk HPV-31, -33, -35, -52, and -58 in central Brazil. J. Med. Virol. 2009, 81, 685–692. [Google Scholar] [CrossRef] [PubMed]
  32. Chen, Z.; Schiffman, M.; Herrero, R.; DeSalle, R.; Anastos, K.; Segondy, M.; Sahasrabuddhe, V.V.; Gravitt, P.E.; Hsing, A.W.; Burk, R.D. Evolution and taxonomic classification of human papillomavirus 16 (HPV16)-related variant genomes: HPV31, HPV33, HPV35, HPV52, HPV58 and HPV67. PLoS ONE 2011, 6, e20183. [Google Scholar] [CrossRef]
  33. Harper, D.M.; DeMars, L.R. HPV vaccines–a review of the first decade. Gynecol. Oncol. 2017, 146, 196–204. [Google Scholar] [CrossRef]
  34. Dehghani, B.; Hasanshahi, Z.; Hashempour, T.; Motamedifar, M. The possible regions to design Human Papilloma Viruses vaccine in Iranian L1 protein. Biologia 2020, 75, 749–759. [Google Scholar] [CrossRef]
  35. Oumeslakht, L.; Ababou, M.; Badaoui, B.; Qmichou, Z. Worldwide genetic variations in high-risk human papillomaviruses capsid L1 gene and their impact on vaccine efficiency. Gene 2021, 782, 145533. [Google Scholar] [CrossRef]
  36. Tsakogiannis, D.; Nikolaidis, M.; Zagouri, F.; Zografos, E.; Kottaridi, C.; Kyriakopoulou, Z.; Tzioga, L.; Markoulatos, P.; Amoutzias, G.D.; Bletsa, G. Mutation profile of HPV16 L1 and L2 genes in different geographic areas. Viruses 2022, 15, 141. [Google Scholar] [CrossRef]
  37. Alsanea, M.; Alsaleh, A.; Obeid, D.; Alhadeq, F.; Alahideb, B.; Alhamlan, F. Genetic variability in the E6, E7, and L1 genes of human papillomavirus types 16 and 18 among women in Saudi Arabia. Viruses 2022, 15, 109. [Google Scholar] [CrossRef]
  38. Ahmed, A.I.; Bissett, S.L.; Beddows, S. Amino acid sequence diversity of the major human papillomavirus capsid protein: Implications for current and next generation vaccines. Infect. Genet. Evol. 2013, 18, 151–159. [Google Scholar] [CrossRef]
  39. Jing, Y.; Wang, T.; Chen, Z.; Ding, X.; Xu, J.; Mu, X.; Cao, M.; Chen, H. Phylogeny and polymorphism in the long control regions E6, E7, and L1 of HPV Type 56 in women from southwest China. Mol. Med. Rep. 2018, 17, 7131–7141. [Google Scholar] [CrossRef]
  40. Basukala, O.; Banks, L. The not-so-good, the bad and the ugly: HPV E5, E6 and E7 oncoproteins in the orchestration of carcinogenesis. Viruses 2021, 13, 1892. [Google Scholar] [CrossRef]
  41. Venuti, A.; Paolini, F.; Nasir, L.; Corteggio, A.; Roperto, S.; Campo, M.S.; Borzacchiello, G. Papillomavirus E5: The smallest oncoprotein with many functions. Mol. Cancer 2011, 10, 140. [Google Scholar] [CrossRef]
  42. Li, T.; Yang, Z.; Zhang, C.; Wang, S.; Mei, B. Genetic variation of E6 and E7 genes of human papillomavirus type 16 from central China. Virol. J. 2023, 20, 217. [Google Scholar] [CrossRef]
  43. Bletsa, G.; Zagouri, F.; Amoutzias, G.; Nikolaidis, M.; Zografos, E.; Markoulatos, P.; Tsakogiannis, D. Genetic variability of the HPV16 early genes and LCR. Present and future perspectives. Expert Rev. Mol. Med. 2021, 23, e19. [Google Scholar] [CrossRef] [PubMed]
  44. Boumba, L.M.A.; Assoumou, S.Z.; Hilali, L.; Mambou, J.V.; Moukassa, D.; Ennaji, M.M. Genetic variability in E6 and E7 oncogenes of human papillomavirus Type 16 from Congolese cervical cancer isolates. Infect. Agents Cancer 2015, 10, 15. [Google Scholar] [CrossRef]
  45. He, J.; Li, T.; Cheng, C.; Li, N.; Gao, P.; Lei, D.; Liang, R.; Ding, X. The polymorphism analysis and therapy vaccine target epitopes screening of HPV-35 E6 E7 among the threaten α-9 HPV in Sichuan area. Virol. J. 2024, 21, 213. [Google Scholar] [CrossRef]
  46. Hiller, T.; Stubenrauch, F.; Iftner, T. Isolation and functional analysis of five HPVE6 variants with respect to p53 degradation. J. Med. Virol. 2008, 80, 478–483. [Google Scholar] [CrossRef]
  47. Yuan, H.-B.; Yu, J.-H.; Gan, J.; Qiu, Y.; Yan, Z.-Y.; Xu, H.-H. Genetic variations and carcinogenicity analysis of E6/E7 oncogenes in HPV31 and HPV35 in Taizhou, China. Virol. J. 2025, 22, 35. [Google Scholar] [CrossRef] [PubMed]
  48. Mboumba Bouassa, R.-S.; Avala Ntsigouaye, J.; Lemba Tsimba, P.C.; Nodjikouambaye, Z.A.; Sadjoli, D.; Mbeko Simaleko, M.; Camengo, S.P.; Longo, J.D.D.; Grésenguet, G.; Veyer, D. Genetic diversity of HPV35 in Chad and the Central African Republic, two landlocked countries of Central Africa: A cross-sectional study. PLoS ONE 2024, 19, e0297054. [Google Scholar] [CrossRef] [PubMed]
  49. Lou, H.; Boland, J.F.; Li, H.; Burk, R.; Yeager, M.; Anderson, S.K.; Wentzensen, N.; Schiffman, M.; Mirabello, L.; Dean, M. HPV16 E7 nucleotide variants found in cancer-free subjects affect E7 protein expression and transformation. Cancers 2022, 14, 4895. [Google Scholar] [CrossRef] [PubMed]
  50. Maueia, C.; Murahwa, A.T.; Carulei, O.; Taku, O.; Mbulawa, Z.; Manjate, A.; Valey, Z.O.; Mussá, T.; Williamson, A.-L. Genetic variability in HPV 33 and 35 E6 and E7 genes from South African and Mozambican women with different cervical cytology status. Virol. J. 2025, 22, 234. [Google Scholar] [CrossRef]
  51. Soheili, M.; Keyvani, H.; Soheili, M.; Nasseri, S. Human papilloma virus: A review study of epidemiology, carcinogenesis, diagnostic methods, and treatment of all HPV-related cancers. Med. J. Islam. Repub. Iran 2021, 35, 65. [Google Scholar] [CrossRef]
  52. Schiffman, M.; Castle, P.E.; Jeronimo, J.; Rodriguez, A.C.; Wacholder, S. Human papillomavirus and cervical cancer. Lancet 2007, 370, 890–907. [Google Scholar] [CrossRef]
  53. Murenzi, G.; Mungo, C. Impact of the human papillomavirus vaccine in low-resource settings. Lancet Glob. Health 2023, 11, e997–e998. [Google Scholar] [CrossRef]
  54. Sayers, E.W.; Beck, J.; Bolton, E.E.; Brister, J.R.; Chan, J.; Connor, R.; Feldgarden, M.; Fine, A.M.; Funk, K.; Hoffman, J. Database resources of the National Center for Biotechnology Information in 2025. Nucleic Acids Res. 2025, 53, D20–D29. [Google Scholar] [CrossRef]
  55. Van Doorslaer, K.; Li, Z.; Xirasagar, S.; Maes, P.; Kaminsky, D.; Liou, D.; Sun, Q.; Kaur, R.; Huyen, Y.; McBride, A.A. The Papillomavirus Episteme: A major update to the papillomavirus sequence database. Nucleic Acids Res. 2017, 45, D499–D506. [Google Scholar] [CrossRef]
  56. Liu, Y.; Pan, Y.; Gao, W.; Ke, Y.; Lu, Z. Whole-genome analysis of human papillomavirus types 16, 18, and 58 isolated from cervical Precancer and Cancer samples in Chinese women. Sci. Rep. 2017, 7, 263. [Google Scholar] [CrossRef] [PubMed]
  57. Minh, B.Q.; Schmidt, H.A.; Chernomor, O.; Schrempf, D.; Woodhams, M.D.; Von Haeseler, A.; Lanfear, R. IQ-TREE 2: New models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 2020, 37, 1530–1534. [Google Scholar] [CrossRef] [PubMed]
  58. Letunic, I.; Bork, P. Interactive Tree of Life (iTOL) v6: Recent updates to the phylogenetic tree display and annotation tool. Nucleic Acids Res. 2024, 52, W78–W82. [Google Scholar] [CrossRef] [PubMed]
  59. DSTI. Building a Local Vaccine Capability Critical in the Era of Health Challenges; DSTI: Pretoria, South Africa, 2024. Available online: https://www.dsti.gov.za/index.php/media-room/latest-news/4363-building-a-local-vaccine-capability-critical-in-the-era-of-health-challenges (accessed on 5 February 2025).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.