HIV-1 Sub-Subtype A6: Settings for Normalised Identification and Molecular Epidemiology in the Southern Federal District, Russia.

Russia has one of the largest and fastest growing HIV epidemics. However, epidemiological data are scarce. Sub-subtype A6 is most prevalent in Russia but its identification is challenging. We analysed protease/reverse transcriptase and integrase sequences, together with epidemiological data, from 303 patients to develop a methodology for the systematisation of A6 identification and to describe the HIV epidemiology in the Russian Southern Federal District. Drug consumption (32.0%) and heterosexual contact (27.1%) were the major reported transmission risks. This study successfully established the settings for systematic identification of A6 samples. Low frequency of subtype B (3.3%) and large prevalence of sub-subtype A6 (69.6%) and subtype G (23.4%) were detected. Transmitted PI- (8.8%) and NRTI-resistance (6.4%) were detected in therapy-naive patients. In therapy-experienced patients, 17.3% of the isolates showed resistance to PIs, 50.0% to NRTIs, 39.2% to NNRTIs, and 9.5% to INSTIs. Multiresistance was identified in 52 isolates, 40 corresponding to two-class resistance and seven to three-class resistance. Two resistance-associated mutations were significantly associated with sub-subtype A6 samples: A62V RT and G190S RT. This study establishes the conditions for a systematic annotation of sub-subtype A6 to normalise epidemiological studies. Accurate knowledge on South Russian epidemiology will allow for the development of efficient regional frameworks for HIV-1 infection management.


Introduction
Considerable progress in the fight against HIV/AIDS has already been made, with rates of new infections declining in almost all world regions. However, Eastern Europe and Central Asia showed an alarming increase in annual infection rates of 29% between 2010 and 2018 [1]. Russia, with over 1.16 million estimated infections by mid-2017 [2,3], is also one of the countries accounting for 89% of all new HIV infections worldwide. However, accurate, global Russian data are scarce. Significantly, even the basic incidence and prevalence estimates generated by the Joint United Nations Programme on HIV/AIDS (UNAIDS) cannot be released for Russia because of political sensitivities. Existing knowledge is based only on a limited number of publications, mostly restricted to the Moscow and St. Petersburg areas [4], and on reports from several Russian administrative agencies, though the latter are available only in the Russian language. Launched in 2014, the UNAIDS Fast-Track strategy outlined plans to step up the HIV response in critical world regions and end AIDS as a public health threat by 2030 (https://www.unaids.org/en/resources/documents/2014/JC2686_WAD2014report). The first step for such an approach must be the identification of the human target population(s), which can differ from region to region. This information, together with the molecular characterisation of the circulating viral strains, will allow for the development of efficient and specific regional frameworks for HIV-1 prevention, treatment, and monitoring.
The current Russian HIV scenario has been shaped by a massive epidemic in Ukraine and Southern Russia in the mid-1990s, which spread a new sub-subtype: the "Russian A1", also referred to as "IDU-A" or "A-FSU" [3,5-10]. Phylogenetics has shown that the "Russian A1" is different from the African A1, leading to its classification as sub-subtype A6 within subtype A [11]. However, detection of A6 is challenging as there is no reference sequence yet, which in turn impedes systematic identification by any of the HIV subtyping tools currently used in routine diagnostics. Published epidemiological studies reporting A6 samples have used different genomic regions as well as different sequences as a reference [7,8,12-15], leading to unsystematic annotation in the GenBank and Los Alamos databases. Therefore, the A6 prevalence worldwide is largely unexplored.
A unique feature further differentiates the HIV-1 epidemiology in the Southern Federal District from the rest of the country: the high prevalence of subtype G isolates [3,15]. This can be retraced to a clonal infection episode of more than 250 children after exposure to non-sterile needles during the years 1988 and 1989. The index case was a child admitted to Elista hospital, vertically infected by his mother who, in turn, was infected by her husband after his military service in Central Africa [3,16-19].
The aim of this study was to provide a much-needed setting for systematic A6 identification as well as comprehensive data on the HIV epidemiology in the Russian Southern Federal District.

Study Design and Participants
Blood samples and clinical data from all patients attending the Russian Southern Federal Centre for HIV Control (Rostov-na-Donu), Rostov State Medical University Hospital, and the Krasnodar Clinical Centre for HIV Control from 1990 to 2015 were collected. This study was retrospective and non-interventional. All samples were initially analysed for diagnostic purposes within the Rostov-na-Donu-Cologne cooperation project (Ethics approval #246). Enrolment criteria were confirmed HIV infection, availability of one blood sample, and provision of written consent. Three hundred twenty-eight patients fulfilled these criteria. The final inclusion criterion for the study was availability of at least the reverse transcriptase (RT) and/or protease (PR) sequence from the blood sample. Three hundred three patients were included in the analyses.
Epidemiological data concerning reported transmission route, date of birth, and sex were collected at the treating sites. Data were anonymised for the study and stored in the Arevir database.
Viral RNA isolation and RT-PCR amplification for protease-reverse transcriptase (PRRT), reverse transcriptase (RT), and integrase (IN) were completed as previously described [20] with subtype-optimized primers. Next-Generation-Sequencing (MiSeq, Illumina) with a 15% cut-off was performed. Initial subtyping was done based on the RT (or PR region when the RT was not available) using the geno2pheno system (http://www.geno2pheno.org/) and Stanford HIVdb PROGRAM (HIVdb; https://hivdb.stanford.edu/hivdb/by-mutations/). Final subtyping of A1 or A-containing recombinant forms was based on phylogenetic analysis.
Four main datasets were used for the phylogeny: i) RUS: 222 RT sequences from our patients, initially predicted as A1 or one of the A-containing recombinant forms 01_AE, 02_AG, or 03_AB; ii) RUS-A6: 211 samples from the RUS dataset classified as A6 through our first phylogenetic analysis; iii) A6-POL-LA: 68 available A6-annotated pol sequences from the Los Alamos database (LA) (GenBank accession numbers in File S1); and iv) REF-LA: 40 subtype reference sequences from LA (Table S1) [21]. Two additional datasets were used for subtype prevalence comparison: v) RU-LA: 8355 LA HIV-1 sequences (any region) from Russia; and vi) FSU-LA: 12510 LA HIV-1 sequences (any region) from the former Soviet Union (FSU) countries (including the 8388 from RU-LA). An overview of all datasets is provided in Table S2. LA searches were performed using the search DB tool (https://www.hiv.lanl.gov/components/sequence/HIV/search/search.html), or the Geography Search Interface (https://www.hiv.lanl.gov/components/sequence/HIV/geo/geo.comp) with "former USSR" or "Russian federation" as selected regions. Problematic sequences (classified as such in LA; https://www.hiv.lanl.gov/components/sequence/HIV/search/help.html#bad_seq) and entries without annotated subtypes were excluded.
Phylogenetic analyses were performed using the Mega 7 software [22]. Multiple alignments of DNA sequences were generated with the Muscle tool, using UPGMB for clustering with a minimum diagonal length of 24 and a maximum of eight iterations [23]. These alignments were used to compute nucleotide pairwise distances and for neighbour-joining (NJ) analysis, using the maximum composite likelihood method in both cases.
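As an illustration of what a nucleotide pairwise distance measures, the following minimal sketch computes the simple p-distance (proportion of differing sites) between two aligned sequences. This is a deliberately simpler stand-in and not the maximum composite likelihood distance computed by Mega 7; the function name and the gap handling are assumptions made for illustration only.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two ALIGNED sequences.

    Hypothetical helper: positions where either sequence has a gap ('-')
    or an ambiguous 'N' are skipped (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue  # skip sites that cannot be compared
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy example with made-up sequences (not study data)
print(p_distance("ACGTACGT", "ACGTACGA"))  # 1 difference over 8 sites
```

Model-based distances such as the one used in the study correct for multiple substitutions at the same site, so they are expected to be slightly larger than the raw p-distance for the same pair of sequences.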
Resistance-associated-mutations (RAMs) and drug susceptibility profiles were analysed with the Stanford HIVdb [24]. Resistance levels of 1-2 were considered susceptible, levels of 3-4 as intermediate resistant, and 5 as fully resistant. Resistance levels 3-5 were considered clinically relevant. Drug class susceptibility was calculated as the highest level of resistance for any of the drugs included within the class.
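The interpretation rules above can be sketched as follows. The function names and the example drug levels are hypothetical; only the level thresholds and the "highest level within the class" rule are taken from the text.

```python
def interpret_level(level: int) -> str:
    """Map a resistance level (1-5) to the categories used in this study."""
    if level in (1, 2):
        return "susceptible"
    if level in (3, 4):
        return "intermediate"
    if level == 5:
        return "resistant"
    raise ValueError(f"unexpected resistance level: {level}")


def is_clinically_relevant(level: int) -> bool:
    """Levels 3-5 were considered clinically relevant."""
    return level >= 3


def class_level(drug_levels: dict) -> int:
    """Drug class level = highest level among the drugs of that class."""
    return max(drug_levels.values())


# Hypothetical NNRTI levels for one isolate (illustrative values only)
nnrti = {"EFV": 5, "NVP": 5, "ETR": 2, "RPV": 3}
level = class_level(nnrti)
print(level, interpret_level(level), is_clinically_relevant(level))
```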
We performed descriptive analyses for patients' sex, year of birth, and transmission route as well as for viral subtypes, circulating recombinant forms (CRFs), RAMs, and drug susceptibility. We present genetic data as median pairwise distances with interquartile range values.
Statistical analysis of categorical variables was performed with Fisher's exact probability test using 2 × 2 contingency tables (http://vassarstats.net/tab2x2.html). All p values were two-tailed. Differences were considered significant at p values < 0.05.
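For readers who wish to reproduce this type of test without the web calculator, a minimal sketch of a two-tailed Fisher's exact test for a 2 × 2 table is given below. It uses one common two-tailed definition (summing the probabilities of all tables that are no more probable than the observed one); the cited calculator may use a different convention, and the counts in the example are invented.

```python
from math import comb


def fisher_exact_two_tailed(a: int, b: int, c: int, d: int) -> float:
    """Two-tailed Fisher's exact p value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties
    total = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )
    return min(1.0, total)


# Invented counts, for illustration only
print(fisher_exact_two_tailed(3, 1, 1, 3))
```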

Baseline Characteristics
Twenty-five samples could not be amplified in the RT and/or PR regions due to low viral load. Therefore, 303/328 patients were included in the final analysis. One hundred fifty (49.5%) individuals were male, 126 (38.4%) female, and 27 (8.9%) unknown (not reported). In terms of the disclosed transmission routes, 97 (32.0%) patients were persons who inject drugs (PWID), 82 (27.1%) were heterosexuals, 76 (25.1%) were from late-1980s nosocomial infections, 32 (10.6%) were vertical infections, 8 (2.6%) were men having sex with men (MSM), and for 8 (2.6%) the transmission route was not reported. Males were significantly more often PWID (42.2% of male infections; p < 0.001), whereas females were more likely to have been infected through heterosexual contact (46.8% of female infections; p < 0.001) (Figure 1A). (Table S3). Nosocomial infections were associated with a birth year between 1981 and 1990 (55.0% in this age group; p < 0.001). Infections in PWID were significantly more frequent in patients born in 1948-1980 (53.0% of all infections in patients in this age group) than in younger patients born in or after 1981 (9.9%; p < 0.001). The heterosexual transmission rate decreased with the patients' age from 53.8% (birth 1948-1960) to 20.0% (1991-2011).

Viral Subtype Distribution
Initial subtyping classified 222 RT samples as subtype A or as subtype A-containing recombinant forms. They were subjected to phylogenetic analysis together with the A6-POL-LA dataset and the A1, 02_AG, 01_AE, and 03_AB reference sequences. The generated NJ tree showed a cluster comprising 211 RUS samples and 66/68 A6-POL-LA sequences; the EF545108 and EU861977 sequences were located between the A6, CRF01_AE, and A1 sequences (Figure 2). This analysis reclassified two initially A1-predicted sequences as 03_AB and one as 02_AG. The 01_AE sample #16020 was reclassified as A6.
The RUS-A6 and A6-POL-LA datasets were used to generate ten different consensus sequences. The consensus sequence generated by MutExt using the high- and low-cut-offs of 70% and 50%, respectively, was selected as the reference consensus sequence (A6_pol_reference consensus sequence; nt. sequence in Supplementary Material, page 1). It showed the lowest maximal pairwise distance (0.034) as well as median distance (0.014; IQR 0.010-0.019) to any A6 sequence (Figure S1). EF545108 and EU861977 were the most distant sequences, with genetic distances to the A6_pol_reference consensus sequence of 0.031 and 0.034, respectively. These two sequences were still closer to A6 than to A1, as the minimal pairwise distances to any A1 reference sequence were 0.069 and 0.062, and the median distances to the four A1 references were 0.077 (0.073-0.083) and 0.070 (0.067-0.074), respectively. Subsequently, the A6_pol_reference consensus sequence was aligned to the REF-LA dataset (Figure S2). The A1 reference sequences were the closest genetically, with a minimal pairwise distance of 0.050 (Table S4). The maximal pairwise distance to our reference consensus sequence for a query sample to be classified as A6 was selected as 0.040. This cut-off significantly discriminated A6 samples in the RUS dataset (p < 0.001).
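The classification rule can be sketched as a simple threshold on the pairwise distance between a query sequence and the A6_pol_reference consensus sequence. The function name is hypothetical, and treating the cut-off as inclusive is an assumption; the distances in the example are those reported above.

```python
A6_CUTOFF = 0.040  # maximal distance to the A6_pol_reference consensus


def is_a6(distance_to_consensus: float, cutoff: float = A6_CUTOFF) -> bool:
    """Classify a query as A6 when its distance does not exceed the cut-off.

    Assumption: the cut-off is inclusive (distance <= cutoff).
    """
    return distance_to_consensus <= cutoff


# Distances to the consensus reported in the text
reported = {
    "EF545108": 0.031,
    "EU861977": 0.034,
    "closest A1 reference": 0.050,
}
for name, dist in reported.items():
    print(name, dist, "A6" if is_a6(dist) else "not A6")
```

With these values, the two most distant A6-annotated sequences fall below the cut-off, while the closest A1 reference sequence lies above it.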
Heterogeneity analysis within the RUS-A6 dataset analysis showed a maximal genetic distance of 0.034, and a median of 0.013 (0.025-0.040). Subtype G isolates showed higher variability, with a maximal pair distance of 0.090 and an overall median distance of 0.042 (0.030-0.053).
There were no significant divergences in the subtype distribution between males and females. On the other hand, differences were found in subtype distribution depending on the transmission route (Figure 3A). Subtype G detection correlated with nosocomial infections (90.8%; p < 0.001) and subtype B infections were significantly associated with MSM (62.5%; p < 0.001). The subtype distribution was independent of patients' date of birth, with the exception of subtype G infections, which significantly correlated with a birth date between 1981 and 1990 (55.0%; p < 0.001; Figure 3B). This subtype was detected only in patients born before 1991 with only one exception: one vertical transmission in 2005 from a patient nosocomially infected in the late 1980s.

Resistance-Associated-Mutations (RAMs) and Drug-Susceptibility Profiles
Two hundred sixty-one PR, 277 RT, and 61 IN sequences were available and screened for RAMs (Table 1; Table S6). Two PI-, six NRTI-, and four NNRTI-RAMs were found to be significantly associated with therapy experience.

Detection of ≥2 NRTI-RAMs correlated with a history of ≥3 NRTIs (p < 0.001) and with a record of AZT monotherapy (p < 0.001). Detection of ≥1 NNRTI-RAMs or major PI-RAMs correlated with a history of ≥1 NRTIs or PIs (p < 0.001 in both cases).
The prevalence of specific RAMs significantly correlated with viral subtype. RAMs I54AV PR, L90M PR, M41L RT, D67N RT, and T215F RT were significantly more frequent in subtype G isolates. Conversely, A62V RT and G190S RT correlated with A6 infections. A62V RT was detected in 60/191 (31.4%) of the A6 sequences. No significant difference in prevalence between therapy-naive (TN; 29/76; 38.2%) and therapy-experienced (TE; 22/75; 29.3%) patients was observed. On the other hand, G190S RT was not detected in any TN patient but was found in 14/75 (18.7%) of the TE patients. Moreover, G190S RT was observed in 16/30 (53.3%) of the efavirenz (EFV)-exposed patients, resulting in a significant association of this RAM in the context of A6 and EFV exposure (p < 0.001). Drug class susceptibility data are provided in Table 2. Transmitted drug resistance (TDR) was detected in TN patients: 8.8% of the isolates showed transmitted resistance to PIs and 6.4% to NRTIs. No TDR to NNRTIs or INSTIs was detected. In TE patients, 17.3% of the PR sequences were clinically resistant to PIs, 50.0% to NRTIs, 39.2% to NNRTIs, and 9.5% to INSTIs. Multiresistance was detected in 52 isolates, 40 corresponding to two-class and seven to three-class multiresistance.

Discussion
This study provides a comprehensive description of the Southern Russian HIV-1 epidemiology using samples obtained from 303 patients attending the major treatment centres in the Russian Southern Federal District. These data complement previous reports conducted in different Russian regions and with different epidemiological parameters [2,4,25]. The limited number of patients and the predominance of older patients in this study compared to others from West/Central Europe and North America (http://www.cdc.gov/hiv/library/reports/hiv-surveillance.html) is attributable to the high HIV unawareness and reduced accessibility to HIV care in Russia.
HIV-1 epidemiology in Russia differs from that in western countries in several aspects. First, females are a vulnerable group in Russia. Moreover, we show that the female proportion of HIV patients in medical care has not significantly decreased over the years in the Southern Federal District. In addition, PWID and heterosexual contacts are the main reported transmission modes in Russia [25], while in western countries, the impact of these two transmission modes is limited. In West/Central Europe and North America, MSM is the most affected group, both in cumulative incidence and new infections, and these infections correlate with subtype B [1]. In Eastern Europe and Central Asia, MSM represent only 6% of new infections, though they also correlate with this subtype [3,26]. Specifically, in Southern Russia fewer than 3% of the patients reported being MSM and this transmission route also correlated with subtype B viruses. Yet, 25% of our samples with unknown transmission route were classified as subtype B, suggesting that they may indeed represent cases of withheld MSM transmission. Underrepresentation of MSM in epidemiological analyses is indeed expected as marked social discrimination complicates their enrolment in clinical studies and, upon inclusion, some of them may report a different transmission route, mostly heterosexual [2,3,26].
One of the major technical problems for Russian epidemiology analysis is the accurate subtyping of isolates. This is a key issue for sub-subtype A6, the most prevalent in FSU countries, but also for other CRFs such as 63_02A, which comprises 10.3% of the FSU sequences deposited in GenBank. These isolates cannot be identified by current bioinformatics subtyping tools. Therefore, our initial efforts were directed at developing a methodology to systematise A6 identification and subsequent annotation in the Los Alamos and GenBank databases. The use of the A6_pol_reference consensus sequence allows for a clear discrimination between A6 and A1 samples based on the pol nucleotide sequence. Our data do not support EF545108 and EU861977 as reference sequences, in spite of their use in a number of previous studies [7,12-15], as they cluster between African A1 and A6 sequences, suggesting that they are actually ancestors of the presently circulating A6 sub-subtype [14]. The subsequent analysis of A6 prevalence in our data set showed a large predominance of this sub-subtype, similar to what is described for other regions in the country. The initial expansion of this sub-subtype took place in PWID networks [3], but it has now extended to all other groups. The A6 identification methodology developed in this work has been transferred to the geno2pheno system team, who are already working to extend the subtyping tool. In the near future, the use of free-access, web-based subtyping tool(s) will enable not only the effortless and straightforward identification of A6 isolates in routine diagnostics, but also the characterisation of their prevalence worldwide. Of note, 4% of the A6-annotated entries in the Los Alamos database are isolates from countries geographically close to the FSU such as Mongolia, Germany, and Turkey, but also from more distant countries like Cyprus, Spain, the Republic of Korea, the U.S.A., and Australia (Table S7).
We acknowledge the limitation that our analyses were based only on the pol region. This region was prioritized because it is the main genomic region analysed in routine diagnostics for subtyping as well as for drug resistance testing. Current experiments in our group are investigating other genomic regions such as the IN gene or the V3 loop of the gp120 subunit of the Env protein to further optimize and facilitate the detection of A6 isolates. In addition, the methodology developed in this work can also be extended to normalise the classification of other relevant subtypes/CRFs not yet detected through subtyping tools.
Exceptional within Russian HIV epidemiology is the large prevalence of subtype G in the Southern Federal District [3,15]. Subtype G strains are otherwise mostly reported from Nigeria and West African countries [27,28] (Table S8). They are rare in western countries, with the exceptions of Spain and Portugal, where they account for up to 12% and 30% of the infections, respectively [29-32]. In Russia, subtype G infections are unusual outside the region covered in this study [33]. The subtype G viruses from the Southern Federal District can be retraced to an episode of clonal infection of more than 250 children after exposure to non-sterile needles in 1988-1989 [3,19]. This event still shapes the current HIV landscape, with a subtype G prevalence of 23% among the isolates detected in this study. Almost-full genome analyses of three isolates from the nosocomially-infected cohort showed genetic similarity to African G isolates and dissimilarity to eight G viruses from other Russian regions, which were related to G/CRF14_BG isolates from the Iberian zone [19]. Importantly, and contrary to the African and South European cases where subtype G viruses are currently being actively transmitted among the population by heterosexual contacts, this study could not detect any subtype G sample outside this nosocomially infected group, suggesting that these viruses may not have entered other transmission networks yet.
Non-B subtypes show differences in disease progression (summarised in [34]) and present genetic signatures and polymorphisms in amino acid residues associated with resistance in subtype B. Subtypes C, F, G, or CRF02_AG show different susceptibility to specific antiretrovirals and may develop RAMs that are not favoured in subtype B strains [35,36]. Our analysis detected two specific RAMs significantly associated with A6 viruses: A62V RT and G190S RT. In subtype B, A62V RT is a compensatory mutation associated with NRTI-class resistance. A62V RT is observed in two unusual mutational patterns: the Q151M complex and the T69SSS insertion complex. A62V RT alone is rare in subtype B TN samples as it reduces the replication capacity of these variants to 50% compared to the wild type (wt) virus [37]. In A6 samples, A62V RT is an endemic polymorphism with similarly high prevalence among TN and TE patients [9,38,39]. These prevalence trends suggest that this mutation probably does not lead to a sharp reduction of replication capacity in the context of sub-subtype A6, though no experimental data to confirm this hypothesis are available yet. What is more, it opens the question of whether NRTI class-resistance patterns may be favoured in the A6 context. The limited size of our dataset hinders any conclusion in this direction. The substitution G190S RT confers high-level resistance against doravirine, efavirenz, and nevirapine and reduces etravirine and rilpivirine susceptibility to levels of still unknown clinical relevance [24]. In non-A6 isolates, K103N RT or Y181C RT are preferentially detected after efavirenz exposure [40-43] while G190S RT is rarely observed, probably due to high costs in replication capacity (in the subtype B context, 20% compared to the wt) [41].
Conversely, G190S development in A6 isolates is significantly higher (up to 30% to 60%) after EFV and/or nevirapine (NVP) exposure and is favoured over the K103N and Y181C resistance pathways [10,38,39,42-44]. These subtype differences may be partially attributable to the nucleotide exchanges required to achieve the amino acid substitution: while one transition and one transversion (GGA to TCA or AGY) are required in non-A6 isolates, only one transition is required (GGC to AGC) in the A6 genetic background [42]. The effect of G190S RT on the replication capacity of A6 isolates is yet to be analysed.
Single-class resistance and multiresistance to all antiretroviral drug classes available in Russia were detected in the present study, threatening the long-term success of antiretroviral therapy programs. In South Russian therapy-experienced patients, resistance levels are comparable to those described for TE, viraemic patients in western countries [45-47]. Whether the prevalence of drug resistance in Russia is currently declining, as described for Western Europe and the U.S.A. [45,46], is still unknown. Our study included too few samples from recent years to draw any conclusion in this matter. Further studies including additional samples from the last decade are required to clarify this question. We also identified transmission of drug resistance. PI-, NRTI-, and even PI + NRTI-resistant strains were detected in TN patients. PI-TDR in South Russia, detected in 9% of the TN patients, seems to be higher than in other European countries [45,48-50], which may be a consequence of the extended use of older PIs such as atazanavir, nelfinavir, or saquinavir compared to western countries, where these drugs have been mostly replaced by darunavir.
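The nucleotide-exchange argument can be checked with a small sketch that counts transitions (purine to purine, or pyrimidine to pyrimidine) and transversions (purine to pyrimidine, or vice versa) between two codons. The helper names are hypothetical; the codons are those mentioned above, with AGC taken as one instance of the serine codon AGY.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def change_type(ref, alt):
    """Return 'transition', 'transversion', or None if the bases match."""
    if ref == alt:
        return None
    pair = {ref, alt}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def count_changes(codon_from, codon_to):
    """Count transitions and transversions between two codons."""
    counts = {"transition": 0, "transversion": 0}
    for ref, alt in zip(codon_from.upper(), codon_to.upper()):
        kind = change_type(ref, alt)
        if kind is not None:
            counts[kind] += 1
    return counts


# Glycine GGC -> serine AGC (A6 background): a single transition
print(count_changes("GGC", "AGC"))
# Glycine GGA -> serine AGC (non-A6 background): one of each
print(count_changes("GGA", "AGC"))
```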
In conclusion, this study enables the systematisation of A6 identification and confirms the predominance of sub-subtype A6 and subtype G in Southern Russia. Additionally, it shows that resistance is circulating in viruses from both TN and TE patients, with the presence of specific RAMs associated with individual subtypes. Efforts to increase the implementation of routine epidemiology and surveillance will further improve therapy efficacy in this region.

Supplementary Materials:
The following are available online at http://www.mdpi.com/1999-4915/12/4/475/s1, Figure S1: Distribution of the pairwise genetic distances between the A6_pol_reference consensus and the A6 samples, Figure S2: NJ tree of A6_pol_reference consensus and HIV-1 subtype reference sequences, Table S1: GenBank accession numbers for the reference sequences, Table S2: Overview of the datasets used in this study, Table S3: Sex distribution based on the year of birth, Table S4: Pairwise distances between A6_pol_reference consensus and the REF-LA dataset, Table S5: HIV-1 subtype and CRF prevalence in GenBank and our Southern Russia samples, Table S6: Number of RAMs per sequence according to drug class, Table S7: Country of origin of HIV-1 sequences (any genomic region) annotated as subclade A6 in the Los Alamos database, Table S8: Country of origin of HIV-1 sequences (any genomic region) annotated as subtype G in the Los Alamos database, File S1: GenBank accession numbers of sequences included in A6-POL-LA dataset.

Gilead Sciences and ViiV Healthcare. M.Z. received speaker's honoraria and travel support from Gilead Sciences, Janssen-Cilag, and ViiV Healthcare.