Genetic Characterization of a New HIV-1 Sub-Subtype A in Cabo Verde, Denominated A8

Previous molecular characterization of human immunodeficiency virus type 1 (HIV-1) samples from Cabo Verde revealed vast HIV-1 pol diversity, with several subtypes and recombinant forms, 5.2% of which were classified as AU-pol. Thus, the aim of the present study was to improve the characterization of these AU sequences. The genomic DNA of seven individuals infected with HIV-1 AU-pol was subjected to nested PCR amplification of four overlapping fragments, aiming to compose the full-length HIV-1 genome. The final classification was based on phylogenetic trees generated by maximum likelihood and on bootscan analysis. The genetic distances were calculated using MEGA 7.0 software. Complete genome amplification was possible for two samples, and partial genomes were obtained for the other five. The two complete genomes grouped together with a high support value, in a branch separate from the other A sub-subtypes and CRF26_A5U. No recombination was detected by bootscan, leading to the classification of a new sub-subtype A. The intragroup genetic distance of the new sub-subtype A at the complete genome level was 5.2%, and the intergroup genetic distance varied from 8.1% to 19.0% in the analyzed fragments. Our study describes a new HIV-1 sub-subtype A and highlights the importance of continued molecular surveillance studies, mainly in countries with high HIV molecular diversity.


Introduction
Human immunodeficiency virus (HIV) originated from multiple zoonotic transmission events of the simian immunodeficiency virus (SIV) from non-human primates to humans in Central and West Africa, resulting in two types of HIV: HIV-1, which encompasses groups M, N, O, and P [1][2][3], and HIV-2 (groups A-H) [4]. The HIV-1 M group is responsible for the HIV/AIDS pandemic and, currently, phylogenetic analyses based on complete genomes indicate that this group is composed of 10 subtypes (A, B, C, D, F, G, H, J, K, L), as well as circulating (CRFs) or unique (URFs) recombinant forms [5,6]. In 2018, a reclassification of HIV-1 A sub-subtypes was proposed based on phylogenetic analyses carried out with full-genome sequences obtained from public sequence databases. Thus, subtype A was subdivided into six sub-subtypes (A1-A4, A6, A7) [7]. Sub-subtype A5 has not been found in its pure form, but is still part of CRF26_A5U [8].
The HIV pandemic remains a global public health problem, with 38 million people living with HIV (PLWH) in 2019 [9]. Located in West Africa, Cabo Verde has about 2500 (2100-3000) PLWH, which is equivalent to 0.6% of the population between 15 and 49 years old [9]. In a previous study of HIV samples from Cabo Verde, molecular epidemiological data of the pol region revealed a high prevalence of HIV-1 subtype G (36.6%) and CRF02_AG (30.6%), followed by URFs (10.4%), sub-subtype F1 (9.7%), subtype B (5.2%), CRF05_DF (3.0%), subtype C (2.2%), CRF06_cpx (0.7%), CRF25_cpx (0.7%), and CRF49_cpx (0.7%), while all HIV-2 infections belonged to group A. Moreover, a complex profile of drug resistance mutations was observed in almost 48% of the HIV-1-positive individuals from Cabo Verde (CV) under combination antiretroviral therapy (cART), considering any class of antiretroviral drugs [10]. Among the samples classified as URFs, a group corresponding to 5.2% of all classified sequences shared the same AU pol profile and formed a monophyletic cluster with high support [10]. Thus, the aim of the present work was to obtain and characterize HIV-1 full-length genome sequences from these samples, which allowed the description of a new HIV-1 sub-subtype A, here denominated A8.

Study Population
Previous studies from our group recruited a total of 169 individuals living with HIV-1 in Cabo Verde from 2010 to 2011. HIV-1 pol sequences (covering the protease (PR) and partial reverse transcriptase (RT), positions 2253-3251 relative to the HXB2 genome) were available for 134 of them. A highly supported AU-pol cluster of seven sequences, corresponding to 5.2% (7/134) of the classified sequences [10], was the focus of the present study.
The amplified products were purified using the Illustra GFX PCR DNA and Gel Purification Kit (GE Healthcare, Little Chalfont, Buckinghamshire, United Kingdom) and sequenced on an ABI 3130 Genetic Analyzer using the ABI BigDye Terminator v.3.1 Cycle Sequencing Ready Reaction kit (Applied Biosystems, Foster City, CA, USA). The chromatograms were analyzed and edited using the SeqMan software from the DNASTAR Lasergene package (DNASTAR, Madison, WI, USA).
Maximum likelihood (ML) phylogenetic trees were reconstructed with PhyML version 3.0 [12] using the general time reversible (GTR) model of nucleotide substitution. The approximate likelihood ratio test (aLRT) was used to estimate branch support. The reconstructed trees were visualized and edited using FigTree software version 1.4.4 [13]. Reference sequences of HIV-1 group M subtypes (A-D, F-H, J-L), sub-subtypes (A1-A4, A6, A7, F1-F2), and CRF26_A5U were obtained from the Los Alamos HIV database [5]. The sequence data sets were assembled by combining our sequences with the reference sequences.
A Basic Local Alignment Search Tool (BLAST) [14] search was performed in order to identify sequences with high similarity to the studied sequences. Both the complete genome and the pol region were used as queries. The retrieved sequences were included in the phylogenetic analyses.
Recombination analyses were performed using the bootscan method implemented in Simplot v3.5.1 software with the following parameters: 400 nt window, 20 nt increments, and the neighbor-joining (NJ) method under Kimura's two-parameter correction with 100 bootstrap replicates [15].
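To make the window parameters concrete, the following is a minimal Python sketch of a sliding-window scan over a pairwise alignment using a 400 nt window, 20 nt increments, and Kimura's two-parameter distance. It is a simplification and not the Simplot implementation: it reports one distance per window against a single reference instead of building bootstrapped NJ trees, and all function names are hypothetical.

```python
import math

# Purine<->purine and pyrimidine<->pyrimidine substitutions are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned sequences.

    Sites with a gap or ambiguous base in either sequence are skipped.
    Returns None when there are no comparable sites or the formula is undefined.
    """
    valid = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        valid += 1
        if a != b:
            if (a, b) in TRANSITIONS:
                transitions += 1
            else:
                transversions += 1
    if valid == 0:
        return None
    p = transitions / valid      # proportion of transitions
    q = transversions / valid    # proportion of transversions
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        return None
    return -0.5 * math.log(term1 * math.sqrt(term2))


def window_starts(alignment_length, window=400, step=20):
    """0-based start offsets of every full window along the alignment."""
    return list(range(0, alignment_length - window + 1, step))


def sliding_k2p(query, reference, window=400, step=20):
    """K2P distance of query vs. reference per window, as (midpoint, distance)."""
    results = []
    for start in window_starts(len(query), window, step):
        end = start + window
        distance = k2p_distance(query[start:end], reference[start:end])
        results.append((start + window // 2, distance))
    return results
```

In an actual bootscan, each window would instead be resampled 100 times, an NJ tree built per replicate, and the percentage of replicates grouping the query with each reference plotted against the window midpoint.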

Drug Resistance Analysis (DRM)
Analysis of PR/RT resistance mutations was performed through the Stanford HIV Drug Resistance Database website, using the CPR Tool version 9.0 for transmitted drug resistance mutations (TDRM) in naive patients and the HIVdb Program version 6.3.1 for drug resistance mutations (DRM) in treated patients [17,18].

Sociodemographic and Clinical Data
Among the seven Cabo Verdean individuals with HIV-1 AU samples investigated, six were from Santiago Island and one was from Sal Island. Five were female and two were male. Only two patients had a known epidemiological linkage, resulting from mother-to-child transmission (Table 1). Regarding drug resistance mutation profiles, only CV.10.105 and CV.10.115 presented NNRTI DRM, whereas all samples presented the minor PI DRM L10I or L10V.

Genome Amplification and Sequence Analysis
Among these previously classified HIV-1 AU pol samples, it was possible to amplify and sequence the complete genome for two (CV.10.115, positions 864 to 9615, and CV.10.126, positions 413 to 9516 relative to the HXB2 genome), which were obtained from patients without epidemiological linkage. Due to the low amount of biological material available, only partial genome sequences could be obtained for three individuals (CV.10.105, positions 2254 to 6533; CV.11.270, positions 985 to 5565; and CV.11.290, positions 1339 to 5767 relative to the HXB2 genome). For the two remaining individuals, only the initial PR/RT fragment obtained in the original study was investigated (CV.10.164 and CV.11.275, positions 2253 to 3218 relative to the HXB2 genome) (Table 1).
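As a quick check on how much of the genome each sample covers, the HXB2 coordinates above can be turned into fragment lengths. The sketch below is illustrative only: the reference length of 9719 nt for HXB2 (GenBank K03455) is an assumption not stated in the text, and the function names are hypothetical.

```python
# Assumption: HXB2 reference genome length (GenBank K03455), not stated in the text.
HXB2_LENGTH = 9719

# (start, end) coordinates relative to HXB2, as reported in the text.
FRAGMENTS = {
    "CV.10.115": (864, 9615),
    "CV.10.126": (413, 9516),
    "CV.10.105": (2254, 6533),
    "CV.11.270": (985, 5565),
    "CV.11.290": (1339, 5767),
    "CV.10.164": (2253, 3218),
    "CV.11.275": (2253, 3218),
}


def span_length(start, end):
    """Number of positions in an inclusive coordinate range."""
    return end - start + 1


def coverage_table(fragments, genome_length=HXB2_LENGTH):
    """Rows of (sample, length, fraction of reference), longest first."""
    rows = []
    for sample, (start, end) in fragments.items():
        length = span_length(start, end)
        rows.append((sample, length, length / genome_length))
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for sample, length, fraction in coverage_table(FRAGMENTS):
        print(f"{sample}\t{length} nt\t{fraction:.1%} of HXB2")
```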
The ML tree from the complete genomes showed that the two new full-length sequences from Cabo Verde branched together in a highly supported branch, separate from the other sub-subtype A clusters and from CRF26_A5U (Figure 1). Bootscan analysis including all HIV-1 subtypes and A sub-subtypes showed that the majority of the studied genomes presented high similarity among themselves and with no other subtype or sub-subtype. Taking these results together, we could denominate them as a new sub-subtype A: A8 (Figure 2).
These sequences were submitted to BLAST search analysis, which retrieved sequences with up to 89.4% homology with a query cover of 100%. After the pol (2253-3218 of the HXB2 reference) BLAST analyses, the 100 sequences with homology higher than 92.4% were included in the pol alignment and an ML tree was reconstructed. However, only five sequences clustered with a high support value in a monophyletic clade with the new sub-subtype A, A8. One was sequenced in Portugal (PT), one in the United States (US), one in the Democratic Republic of Congo (DC), one in Spain (ES), and one in Sweden (SE). GenBank accession numbers were as follows: GQ398862, JX460184, MH705159, EF380382, and AY165240, respectively (Figure 3).
Those sequences presented BLAST homology of at least 94.83% with a query cover of 100% (PT, 97.1%; US, 97%; DC, 95.97%; ES, 95.24%; and SE, 94.83%) and investigated sequences without origin information.

[Table: genetic distances, intragroup and inter-sub-subtype (A × A8); values not recovered]
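Intragroup and inter-sub-subtype distances of this kind are averages over pairwise distances between aligned sequences. The sketch below shows that bookkeeping in Python under two assumptions: it uses the simple p-distance (proportion of differing sites) because the substitution model applied in MEGA 7.0 is not stated here, and the sequences are short toy strings, not study data. All names are hypothetical.

```python
from itertools import combinations, product


def p_distance(seq_a, seq_b):
    """Proportion of differing sites among positions where both have A/C/G/T."""
    compared = differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differences / compared


def mean_intragroup(group):
    """Mean pairwise distance among all sequences within one group."""
    pairs = list(combinations(group, 2))
    if not pairs:
        raise ValueError("a group needs at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def mean_intergroup(group_a, group_b):
    """Mean pairwise distance over every cross-group pair of sequences."""
    pairs = list(product(group_a, group_b))
    if not pairs:
        raise ValueError("both groups must be non-empty")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Toy aligned sequences for illustration only (not data from the study).
toy_a8 = ["ACGTACGTAC", "ACGTACGTAT"]
toy_a1 = ["ACGAACGAAC"]

intra = mean_intragroup(toy_a8)           # distance within the toy "A8" group
inter = mean_intergroup(toy_a8, toy_a1)   # distance between toy "A8" and "A1"
```

A new sub-subtype is expected to show an intergroup distance clearly larger than its intragroup distance, which is the pattern the toy values reproduce.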

Discussion
The global spread of HIV-1 group M in the second half of the 20th century has led to a complex and constantly changing distribution of subtypes and recombinant forms. Subtype A is responsible for about 10% of HIV-1 infections worldwide and is found mainly in East Africa [19]. The first A sub-subtypes were characterized in 2001, distinguishing A1 and A2 [20]. Until 2016, five A sub-subtypes (A1-A4, A6) had been described, and sub-subtype A5 was found only in the recombinant form CRF26_A5U [7,8,21]. In 2018, a phylogenetic analysis based on public complete genome sequences was carried out and a reclassification of HIV-1 A sub-subtypes was proposed. Thus, subtype A was subdivided into six sub-subtypes (A1-A4, A6, A7), in addition to CRF26_A5U [5,7,8]. Moreover, in 2018, in contrast to the classification adopted by Los Alamos and these studies, a group of researchers stated that, on an evolutionary scale, A3 would be part of the sub-subtype A1 clade [22]. In that study, the subtype A lineage that takes part in CRF36 (composed of CRF01, CRF02, A, G) and CRF37 (composed of CRF01, CRF02, A, G, U) was named A8. However, because Los Alamos designates that lineage simply as A, we designated our sequences as A8 [22].
The HIV-1 A sub-subtypes have dispersed around the world. Inspecting complete genome sequences in the Los Alamos database, it was possible to identify that A1 is the sub-subtype with the largest number (≈80%) of described A sequences and is found mainly in Rwanda and Kenya; A2 and A4 are restricted to the Democratic Republic of Congo; A3 and A7 to Senegal; and sub-subtype A6 is most often found in the Russian Federation [5].
The high diversity of subtype A, besides being reflected in its division into sub-subtypes, is also present in the classification of circulating recombinant forms. Of the 100 CRFs described up to January 2021, subtype A is present in 59, some of which are complex CRFs involving three or more subtypes, or CRF01 or CRF02 combined with other subtypes [4]. Among the circulating recombinant forms, the most widespread in the world are CRF01_AE and CRF02_AG, which are responsible for 5.3% and 7.7% of HIV-1 group M infections, respectively [19].
A recent study showed unique profiles of resistance to reverse transcriptase inhibitors (A62V and G190S in RT) in the sub-subtype A6 circulating in former Soviet bloc countries [23]. Similarly, in sub-subtype A8 we verified the presence of the minor protease inhibitor DRM L10I/V.
The high diversity of HIV-1 described in Cabo Verde may be related to its proximity to and relationships with West African and European countries. Previous studies show strong similarity between HIV-1 subtype G sequences from Portugal and Cabo Verde and propose that historical and recent movements between Angola, Cabo Verde, and Portugal may have played a key role in the origin and dispersion of certain viral clades [24]. Due to its geographical location and investment in tourism, Cabo Verde receives thousands of tourists from all continents, especially Europe. Many foreigners have also settled in Cabo Verde. In 2013, immigrants represented about 3.5% of the total resident population. The majority come from the African continent, 38% from Economic Community of West African States (ECOWAS) countries and 34% from other African countries. This movement of people, whether by tourism or migration, may affect the dispersion and viral diversity of HIV-1 [25].
The continuous monitoring of molecular epidemiology in Cabo Verde and worldwide is extremely important since HIV diversity can impact diagnosis, viral load measures, drug resistance, responses to antiretroviral treatment, pathogenesis, vaccine design, immune response, and viral escape [19].

Conclusions
The proximity to HIV-endemic regions where HIV-1 A sub-subtypes circulate, together with a high level of population mobility, whether by tourism or migration, may have contributed to the emergence of the sub-subtype A8 variant circulating in Cabo Verde. Further studies with recent samples will be relevant to assess the dispersion and the role of HIV-1 sub-subtype A8 in the HIV/AIDS epidemic in Cabo Verde.