Distribution of Human Papillomavirus Genotypes among the Women of South Andaman Island, India

Background: Human papillomavirus (HPV) causes various types of cancer in both men and women. Women with HPV infection are at risk of developing invasive cervical cancer. Globally, HPV 16 and 18 are the predominant types. This study aimed to determine the distribution of HPV types in South Andaman. Methods: A cross-sectional study was conducted among women in South Andaman, from whom cervical scrapes were collected after obtaining written informed consent. HPV genotypes were detected using a PCR assay. Sequence analysis was then performed using MEGA11 to identify the genotypes circulating in this territory. Results: Of 1000 samples, 32 were positive for HR-HPV 16 and four were positive for HR-HPV 18. Fifteen HPV genotypes were detected using molecular evolutionary analysis. Six cases were identified with multiple genotypes. The most prevalent genotype was HPV 16, predominantly of lineage A, sub-lineage A2. HPV 18 identified in South Andaman belonged to lineage A (sub-lineages A1 to A5). Discussion: Various HPV types were identified among women in South Andaman. The global burden of cervical cancer is associated with various HPV sub-lineages. The HPV 16 A1 sub-lineage is globally widespread, whereas sub-lineages A1, A2 and D1 prevailed in South Andaman. Conclusions: The HR-HPV types identified in this study highlight the importance of HPV vaccination among women in remote places. These findings will help to strengthen public health awareness programmes and prevention strategies for women in remote areas.


Background
HPV is responsible for most viral infections of the human reproductive tract. Most HPV infections are asymptomatic and self-limiting, but chronic infections can progress to warts or precancerous lesions in the cervical, anogenital or oropharyngeal regions of men and women. Cervical cancer is the most frequent HPV-associated disease. Although the majority of precancerous HPV lesions tend to regress on their own, every woman with HPV infection remains at risk that the infection will become persistent and precancerous, leading to invasive cervical cancer [1].
Cervical cancer is a leading cause of cancer death in women. According to the Global Cancer Observatory (GLOBOCAN) 2020, the worldwide crude incidence rate of cervical cancer was 15.6 per 100,000 women, and the crude mortality rate was 8.8 per 100,000 women. The age-standardized incidence rate of cervical cancer was 13.3 per 100,000 women. In Asia, the incidence rate of cervical cancer was 58.2%. The five-year prevalence of cervical cancer was 59.5% in Asia. In India, the incidence and mortality rates were 16.2% and 9.5%, respectively. The proportion of cervical cancer in India was 7.9 per 100,000 [2]. In India, 22% of women have undergone cervical screening examinations, according to a National Family Health Survey (NFHS) report [3]. According to the World Health Organization (WHO), 99% of cervical cancer cases were associated with high-risk human papillomavirus (HR-HPV) [1]. Among the Indian population, the prevalence of cervical cancer was higher among sex workers in an urban slum in Mumbai and among HIV-positive women. HPV 16 and 18 were observed in 56% of cases in the West Indian region [4,5]. HPV belongs to the family Papillomaviridae and is a small, non-enveloped, circular double-stranded DNA virus. The DNA molecule is approximately 8000 base pairs in size, and the genome has six early (E) regions (E1, E2 and E4-E7) and two late (L) regions (L1 and L2) [6].
Andaman & Nicobar Islands are situated in the southern regions of the Bay of Bengal in the Indian Ocean, closer to Indonesia and Thailand. According to the Census of India (2011), the territory's population was 380,581, and the female population was 177,710 (46.7%). The literacy rate of the Andaman and Nicobar Islands is 77.3% [11].
Our previous study in the Andaman and Nicobar Islands reported the HR-HPV types HPV 16 and 18. It was the first study to identify HPV types in these islands [12]. HPV variants have not so far been studied in detail among the population of this region. The public health system needs to be aware of the circulating variants and local strain patterns of HPV in order to frame recommendations for developing appropriate broad-spectrum vaccines aimed at HR-HPV variants. To our knowledge, this study was the first cross-sectional survey conducted among a large population across the Andaman and Nicobar Islands. The current study aims to identify the variants of HPV among married women in the Andaman and Nicobar Islands.

Study Population
A community-based cross-sectional study was conducted among married women of reproductive age (18-59 years) residing in the South Andaman District of the Andaman and Nicobar Islands, India.

Exclusion Criteria
Women were excluded if they were pregnant or had severe gynaecological bleeding, a hysterectomy, or a previous history of disease, including cancer, warts and other cutaneous manifestations.

Ethical Approval
This study has been approved by the Institutional Human Ethics Committee (IHEC) of the Indian Council of Medical Research-Regional Medical Research Centre (ICMR-RMRC), Port Blair [IEC No: 03/RMRC/29/06/2017].

Sampling and Sample Size
The target population was chosen via cluster sampling, and the sampling units were villages or municipal wards. After stratifying the sampling units into rural and urban strata, the required number of units was selected at random within each stratum. Based on the Andaman population ratio, the study participants were drawn from rural villages and urban wards in a ratio of 2.5:1, yielding a sample of 700 from the rural stratum and 300 from the urban stratum.
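As a rough illustration of the rural/urban allocation described above, the following Python sketch splits a total sample between two strata in proportion to a stated ratio. The function name allocate_by_ratio is hypothetical and is not part of the study protocol. An exact 2.5:1 split of 1000 gives about 714 and 286, so the reported figures of 700 and 300 are presumably rounded targets (they correspond to a 7:3 split).

```python
def allocate_by_ratio(total, rural_parts, urban_parts):
    """Split `total` between two strata in proportion to rural_parts:urban_parts.

    Hypothetical helper for illustration only; not the study's own procedure.
    """
    rural = round(total * rural_parts / (rural_parts + urban_parts))
    urban = total - rural  # remainder goes to the urban stratum
    return rural, urban


# Exact proportional allocation under the stated 2.5:1 ratio
rural_n, urban_n = allocate_by_ratio(1000, 2.5, 1.0)
print(f"2.5:1 split -> rural {rural_n}, urban {urban_n}")

# A 7:3 split, matching the sample sizes reported in the text
rural_n, urban_n = allocate_by_ratio(1000, 7, 3)
print(f"7:3 split   -> rural {rural_n}, urban {urban_n}")
```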

Awareness Programmes
Initially, awareness programmes were conducted in each selected village/ward at the Anganwadi centre or community hall. The health care team (a clinician along with trained nurses) explained health issues such as cervical cancer, its symptoms, and genital hygiene. In addition, the need for the study was explained, and written informed consent was requested before enrolment.

Sample Collection and Storage
The enrolled women were called to the field clinics, which were held in the sub-centres, Primary Health Centres (PHC), Community Health Centres (CHC) and District Hospitals serving the population of the particular village/ward. The cervical scrapes were collected using the standard procedure from the ectocervix or surface of the cervical portion using a cytobrush. Specimens were collected in a tube containing phosphate-buffered saline (pH 8.6) and transported under cold-chain conditions to the laboratory at ICMR-RMRC, Port Blair.

Sample Processing
Once the samples arrived at the laboratory, the specimen tubes were vortexed, cytobrushes were discarded, and tubes were centrifuged to pellet the cells, which were suspended in 1 mL of phosphate-buffered saline. Aliquots of each fresh specimen were made and stored for a short duration at −20 °C until further processing. The analysis of HPV DNA was performed in the molecular biology laboratory of ICMR-RMRC, Port Blair.

DNA Extraction
Total DNA was extracted with the QIAamp DNA Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions. The DNA was eluted in 45 µL of elution buffer.

PCR Assays
The β-globin gene (internal control) was amplified from the isolated DNA to confirm the adequacy of the DNA extractions, as described previously [13].
To confirm HPV infection, the DNA of all the samples was subjected to PCR amplification targeting the L1 consensus gene by a standard procedure reported previously. A result was recorded as positive if a band of the expected amplicon size (450 bp) was observed on agarose gel electrophoresis.

Detection of HPV 16 & 18
Type-specific PCR was performed to detect the predominant genotypes, HR-HPV 16 and 18 [13,14]. In addition, the E6 and E7 genes of HPV 16 and the E6 gene of HPV 18 were amplified to identify the lineages and sub-lineages of HR-HPV 16 and 18 in the South Andaman Islands [14].

PCR Sequencing
DNA sequence analysis was carried out to confirm the HPV types distributed in South Andaman. L1 gene PCR amplicons of all the samples negative for HPV 16 and 18, as well as seven samples confirmed as HPV 16, were subjected to DNA sequence analysis. In addition, the E6 and E7 gene PCR amplicons of HPV 16 and the E6 gene amplicons of HPV 18 were also subjected to DNA sequencing. DNA sequencing was carried out by the Sanger sequencing method with the corresponding primer sets [15].

Phylogenetic Analysis
The DNA sequences were assembled using MEGA11 and aligned with diverse HPV sequences from around the world using ClustalW multiple and pairwise alignment. Genetic distances were calculated at the nucleotide level using the Kimura two-parameter (K2P) substitution model, and phylogenetic trees were reconstructed using the neighbour-joining method. The reliability of the phylogenetic trees was tested by bootstrap resampling with 1000 replications.
Phylogenetic trees depicting the evolutionary relationships between taxonomic groups were generated in this way for the L1 gene of HPVs, the E6 and E7 genes of HPV 16, and the E6 gene of HPV 18 using the Molecular Evolutionary Genetics Analysis software MEGA 11 [16].
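To make the distance measure concrete, the following Python sketch computes the Kimura two-parameter distance between two aligned sequences, d = −(1/2) ln[(1 − 2P − Q) √(1 − 2Q)], where P and Q are the proportions of sites showing transitions and transversions, respectively. It is an illustrative re-implementation and not the MEGA 11 code used in the study; the function name k2p_distance and the choice to skip gaps and ambiguous bases pair by pair are assumptions made for this example.

```python
import math

# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T)
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}
VALID_BASES = set("ACGT")


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned nucleotide sequences.

    Illustrative sketch only. Sites with a gap or ambiguous base in either
    sequence are skipped (an assumption of this example).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in VALID_BASES or b not in VALID_BASES:
            continue
        sites += 1
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1

    if sites == 0:
        raise ValueError("no comparable sites")

    p = transitions / sites    # proportion of transitional differences
    q = transversions / sites  # proportion of transversional differences
    # math.log raises ValueError for very divergent pairs where the
    # argument is not positive; the distance is undefined in that case.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy sequences (not study data): one transition in ten sites
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

Identical sequences give a distance of zero, which is how the K2P values of 0.00 reported for isolates matching a reference sequence should be read.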

Results
All of the cervical samples tested positive for the β-globin gene, indicating that there were adequate cells in the samples. Out of 1000 samples screened, 50 specimens tested positive for HPV L1 gene amplification. Subsequently, type-specific PCR for HR-HPV 16 and 18 identified 32 women positive for HR-HPV 16 and four women positive for HR-HPV 18. DNA sequencing of the L1 region was successful for 24 samples. Molecular evolutionary genetic analysis was performed on sequences from South Andaman together with sequences from around the world, and the pairwise genetic distances to the closely related HPV types are given in Table 1.
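The counts above translate directly into crude prevalence estimates. The Python sketch below computes these proportions and, as an illustration, a 95% Wilson score interval for the overall L1 positivity. The interval is not reported in the study and is shown only to indicate the sampling uncertainty attached to 50 positives out of 1000; the function name wilson_interval is hypothetical, and the calculation ignores the cluster sampling design, which would typically widen the interval.

```python
import math


def wilson_interval(positives, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%).

    Illustrative helper; assumes simple random sampling.
    """
    p = positives / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


# Counts taken from the Results text
total_screened = 1000
counts = {"HPV L1 positive": 50, "HR-HPV 16": 32, "HR-HPV 18": 4}

for label, n in counts.items():
    print(f"{label}: {n}/{total_screened} = {100 * n / total_screened:.1f}%")

low, high = wilson_interval(50, total_screened)
print(f"95% Wilson interval for L1 positivity: {100 * low:.1f}% to {100 * high:.1f}%")
```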
The distribution of multiple HPV genotypes in South Andaman was determined by combining type-specific PCR with DNA sequence analysis. Table 2 lists the genotype, risk group, frequency, and percentage of HPV detected in South Andaman Island. The molecular evolutionary genetic analysis identified HPV types 16, 52, 58, 66, 33, 18, 73, 53, 30, 6, 61, 71, 81, 84 and 87. Both high-risk and low-risk (LR) HPV types were prevalent among the women in South Andaman Island. All the co-infected cases included at least one HR-HPV type. However, the HPV type could not be identified in two samples because the specimens were exhausted by repeated experiments.

Phylogenetic Analysis of E6 Gene
The majority of the HPV types distributed in South Andaman were HR-HPV 16, followed by HR-HPV 18. The distribution of lineages and sub-lineages in South Andaman Island was revealed by phylogenetic analysis of the partial E6 gene of HPV 16 (Figure 2A). Thirteen of the fourteen sequences were associated with lineage A, and the remaining one was associated with lineage D. The majority (11) of the HPV 16 sequences grouped with AF536179, which belongs to sub-lineage A2 (K2P = 0.002%). The pairwise genetic distance between two isolates from South Andaman (AN GP-20 and AN PV-59) and the European isolate (K02718) was 0.000%, indicating that these isolates belong to sub-lineage A1. One isolate from South Andaman (ANMT-19) grouped with HQ644257, which belongs to sub-lineage D1 (K2P = 0.00%).
Analysis of the HPV 18 sub-lineages revealed that the partial E6 gene of HPV 18 was closely related to reference sequences of HPV 18 lineage A. The E6 gene of HPV 18 identified from South Andaman was associated with sub-lineages A1 to A5 (K2P = 0.00%) (Figure 2B). However, the partial E6 region sequenced showed no genetic differences among these sub-lineages, so the specific sub-lineage could not be identified.

Phylogenetic Analysis of E7 Gene
The phylogenetic analysis of the E7 gene of HPV 16 from the South Andaman district showed maximum genetic relatedness with lineage A (Figure 3). Hence the predominant lineage circulating in South Andaman was identified as lineage A. However, the analysis based on genetic distances between the sequences did not show apparent sub-lineage differentiation in the E7 gene, as seen in the E6 gene of HPV 16.

Discussion
The current study describes the diversity of HPV genotypes among women in South Andaman Island. It is essential to understand the spectrum of HPV genotypes because data on their distribution are relevant to vaccine development. HPV 16 and HPV 18 cause more than 70 percent of cervical cancer cases, with the remaining cervical cancers caused by other HR-HPV genotypes [17].
HPV 16 genetic variation may have a significant impact on cervical cancer risk. However, the global burden of cervical cancer associated with various sub-lineages is predominantly driven by past HPV 16 sub-lineage distribution. The HPV 16 A1 sub-lineage was globally widespread, whereas sub-lineages A3 and A4 were common in Asia. Sub-lineages A3 and A4 and lineage D were common in regions such as East Asia and North America.
In Chinese women, sub-lineage A4 was associated with more severe disease and a higher risk of cancer than sub-lineages A1-A3. These lineages were strongly associated with cancer risk [23]. In addition, lineage A of HPV 16 was found to be the prevailing strain in Spain. Lineage D of HPV 16 was linked to a higher risk of CIN3+ and other high-grade lesions [24]. A study conducted in Eastern India revealed the existence of the A1, D1 and D2 sub-lineages, of which A1 was predominant among women with cervical carcinoma [25]. Previous studies in India showed that the HPV 16 A1 (European) sub-lineage was predominant among cervical carcinoma patients compared with D1 (North American) and D2 (Asian-American-1) [15,25]. The current study found sub-lineages A1, A2, and D1 to be prevailing in South Andaman. The majority of the isolates from the current study belonged to the A2 sub-lineage.
One study revealed that the A1 sub-lineage of HPV 18 was predominant in Central Asia, Northern America and Eastern Asia, whereas the A2 sub-lineages of HPV 18 were predominant in Europe, North America, Northern Africa and South/Central Asia. In addition, the B1 and B2 sub-lineages were predominant only in Sub-Saharan Africa, and C lineages were also observed in the African region [26]. HPV 18 sub-lineages found in China ranged from A1 to A7 [27,28]. Another study, from Iran, found that the prevalence of sub-lineage A4 was high compared with other sub-lineages [29]. In Spain, lineage B of HPV 18 was associated with a greater burden of CIN3+ compared with lineage A [24]. In the current study, the South Andaman isolates belonged to lineage A of HPV 18.
The identification of the genetic underpinnings responsible for the distinct carcinogenic properties exhibited by certain lineages of HPV 16 and HPV 18 has the potential to shed light on the intricate interactions between the viral agents and the human host. Such insights hold promise for enhancing our ability to effectively manage HPV infections and mitigate the incidence of cervical cancer.
Diversity in the distribution of HPV types poses a challenge for vaccine strategies. Molecular surveillance of HPV is needed to detect new strains or types emerging among symptomatic and asymptomatic populations in these remote islands. This will help policymakers to implement preventive measures against HPV-associated cervical cancer. Studies on the specificity of HPV variants will be helpful in developing broad-spectrum vaccines aimed at HR-HPV variants.

Conclusions
This is the first community-based cross-sectional study conducted in the Andaman and Nicobar Islands, and it revealed a wide range of HPV genotypes among women in this small island. HPV 16 was the most predominant high-risk type found in the Andaman Islands. Various other high-risk and low-risk types were also identified in this study. Phylogenetic analysis of the E6 gene found lineages A and D of HPV 16 in the Andaman Islands. Moreover, lineage A of HPV 18 was also identified through the phylogenetic analysis. Sequence analysis of the HPV 16 E6 gene revealed that the A2 sub-lineage of HPV 16 was predominant compared with other lineages. The findings of the current study highlight the importance of screening for cervical cancer and of promoting vaccination and vaccine awareness among women living in remote geographical locations. These findings also support the initiation of stronger public health awareness programmes and prevention strategies for the women of the Andaman and Nicobar Islands.

Author Contributions:
The study concept and study design were contributed by M.N. and R.P. Data collection and sample collection were contributed by R.P. Data analysis, interpretation and critical evaluation were all contributed by R.P., M.N., P.V. and H.K. Manuscript writing and correction of the manuscript were by all the authors. The final article was approved by all the authors. All authors have read and agreed to the published version of the manuscript.

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The datasets of the current study are available from the corresponding author upon reasonable request.