Pre-Treatment Integrase Inhibitor Resistance and Natural Polymorphisms among HIV-1 Subtype C Infected Patients in Ethiopia

Dolutegravir-based antiretroviral therapy (ART) has been scaled up in many developing countries, including Ethiopia. However, subtype-dependent polymorphic differences might influence the occurrence of HIV-drug-resistance mutations (HIVDRMs). We analyzed the prevalence of pre-treatment integrase strand transfer inhibitor (INSTI) HIVDRMs and naturally occurring polymorphisms (NOPs) of the integrase gene, using plasma samples collected as part of the national HIVDR survey in Ethiopia in 2017. We included a total of 460 HIV-1 integrase gene sequences from INSTI-naïve (n = 373 ART-naïve and n = 87 ART-experienced) patients. No dolutegravir-associated HIVDRMs were detected, regardless of previous exposure to ART. However, we found E92G in one ART-naïve patient specimen and accessory mutations in 20/460 (4.3%) of the specimens. Moreover, among the 288 integrase amino acid positions of subtype C, 187/288 (64.9%) were conserved (<1.0% variability). Analysis of the genetic barrier showed that the Q148H/K/R dolutegravir resistance pathway was less selected in subtype C. Docking analysis of dolutegravir showed that protease- and reverse-transcriptase-associated HIVDRMs did not affect the native structure of the HIV-1 integrase. Our results support the implementation of a wide scale-up of dolutegravir-based regimens. However, the detection of polymorphisms contributing to INSTI resistance warrants the continuous surveillance of INSTI resistance.


Introduction
Following the global increase of pre-treatment drug resistance (PDR) to non-nucleoside reverse transcriptase inhibitors (NNRTIs), the World Health Organization (WHO) recommended the transition from NNRTI to integrase strand transfer inhibitor (INSTI)-based regimens in both treatment-naïve and treatment-experienced patients [1-3]. Several low- and middle-income countries have already transitioned to the dolutegravir (DTG)-based regimen, and many more are in the planning phase, so millions of people living with HIV will soon receive DTG combined with two nucleoside reverse transcriptase inhibitors (NRTIs) as first- and second-line therapies [2,4].
HIV-1 integrase (IN), which comprises 288 amino acids encoded by the 5′-end of the HIV pol (polymerase) gene, plays a vital role in HIV-1 replication by catalyzing two distinct

Study Design
In this study, we used plasma samples collected from HIV-1-infected patients as part of a national HIVDR survey conducted in Ethiopia. A cross-sectional survey was conducted in 2017 among treatment-naïve patients and patients on first- and second-line regimens in selected health facilities from different parts of the country according to the WHO-recommended HIVDR survey [41]. After obtaining written informed consent from each participant, 10 mL of blood was collected by venipuncture for CD4+ T-cell count, viral load, and HIVDR genotyping. Basic demographic and clinical information was also collected during the survey using a standardized questionnaire. Specimens were transported to the Ethiopian Public Health Institute (EPHI) on dry ice for viral load testing and long-term storage at −80 °C. HIV-1 viral load was determined using the Abbott RealTime HIV-1 assay (Abbott Molecular Inc., Des Plaines, IL, USA). Using 1000 copies/mL as a viral load suppression threshold based on the WHO recommendation [42], all samples with a viral load ≥1000 copies/mL were then shipped to the National Institute of Respiratory Diseases-Mexico (INER) laboratory for HIVDR genotyping.
Viruses 2022, 14, 729

HIV-1 Genotyping
Genotyping of the integrase region was performed using an in-house-developed and -validated protocol for IN [43]. Amplicons obtained by the nested PCR method were used for Sanger sequencing using the BigDye technology on the ABI Prism 3730 Genetic Analyzer (Applied Biosystems, Foster City, CA, USA). Sequence assembly and editing were performed using the RECall V 2.0 HIV-1 sequencing analysis tool (University of British Columbia, Vancouver, BC, Canada) [44]. Sequence quality control was performed using the WHO tool (https://sequenceqc-dev.bccfe.ca/who_qc (accessed on 28 June 2021)) and the Quality Control program of the Los Alamos HIV sequence database (https://www.hiv.lanl.gov (accessed on 28 June 2021)).

Subtype Determination Using HIV-1 Integrase Sequences
HIV-1 subtyping was performed using the online automated subtyping tools REGA v3.0 [45], COMET [46], and the jumping profile Hidden Markov Model (jpHMM) [47]. Subtyping was further confirmed by maximum likelihood (ML) phylogenetic tree analysis with the IN reference sequences from HIV-1 subtypes (A-K) and recombinant viruses downloaded from the Los Alamos database (http://www.hiv.lanl.gov (accessed on 3 July 2021)). Multiple sequence alignment was conducted using MAFFT version 7 [48] and was then manually edited using BioEdit V7.0.9.0 [49,50] until a perfect codon alignment was obtained. The ML tree topology was constructed using the online version of PhyML v 3.0 [51] with the GTR+I+Γ nucleotide-substitution model (using the estimated proportion of invariable sites and four gamma categories). A heuristic tree search was performed using the SPR branch-swapping algorithm. Branch support was determined with aLRT-SH (approximate likelihood ratio test, Shimodaira-Hasegawa-like) [52]. Clusters were defined as monophyletic clades with aLRT-SH support ≥0.9. The subtype-resolved ML phylogenetic trees were visualized using the FigTree v1.4.0 program. Sequences that formed a cluster with reference sequences belonging to the same subtype were assigned to that subtype.

HIV-1 Drug Resistance Analysis
INSTI-associated mutations were identified using the Stanford HIV Drug Resistance Database (HIVdB v9.0) (https://hivdb.stanford.edu/hivdb/by-mutations (accessed on 7 July 2021)). INSTI DRMs were categorized as major resistance mutations, accessory resistance mutations, and other mutations according to the Stanford HIV Drug Resistance Database. Major resistance mutations were primarily nonpolymorphic DRMs that caused a significant reduction in INSTI susceptibility, even when they occurred alone. Accessory mutations were nonpolymorphic or minimally polymorphic mutations that caused only low-level reduction of INSTI susceptibility when they occurred alone, but may have augmented resistance and/or restored the fitness of viral mutants with major resistance mutations. The other mutations included highly polymorphic and/or rare nonpolymorphic mutations that may have been weakly associated (uncertain role) with drug resistance. We further extensively investigated all amino acid positions associated with decreased INSTI susceptibility. Samples harboring resistant and/or a mixture of wild-type and resistant amino acids were considered resistant.
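The rule that a mixture of wild-type and resistant amino acids counts as resistant can be illustrated with a minimal Python sketch. This is not the Stanford HIVdb algorithm: the function name `resistant_positions` and the input format are hypothetical, and the mutation table is simply the list of major DTG-associated substitutions quoted in the Results of this paper.

```python
# Major DTG-associated substitutions as listed in the Results of this paper
# (position -> resistant amino acids). Illustration only, not the HIVdb rules.
MAJOR_DTG_DRMS = {
    66: {"K"}, 92: {"Q"}, 118: {"R"}, 138: {"K", "A", "T"},
    140: {"S", "A", "C"}, 148: {"H", "R", "K"}, 155: {"H"}, 263: {"K"},
}


def resistant_positions(calls, drm_table=MAJOR_DTG_DRMS):
    """Return the sorted positions where any observed amino acid is a listed DRM.

    `calls` maps an integrase position to the string of amino acids seen
    there; a mixture of wild-type E and mutant G is written "EG" (hypothetical
    input format).
    """
    hits = []
    for pos, observed in calls.items():
        # A mixture counts as resistant if at least one residue is resistant.
        if set(observed) & drm_table.get(pos, set()):
            hits.append(pos)
    return sorted(hits)


sample = {92: "EG", 140: "G", 148: "QH"}
print(resistant_positions(sample))  # [148]
```

In this toy sample, the Q/H mixture at position 148 is flagged, whereas E92G is not, because only E92Q appears in the DTG-associated list used here.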

HIV-1 Subtype C Integrase Polymorphism and Conservation Analysis
For this analysis, only HIV-1 subtype C sequences were used. Briefly, multiple sequence alignment was conducted using MAFFT version 7 [48] and was then manually edited using BioEdit V7.0.9.0 [49,50] until a perfect codon alignment was obtained. The nucleotide sequences were translated to an amino acid sequence. Then, each amino acid along the 288 IN positions was extensively investigated for the presence of primary mutations and of nonpolymorphic and polymorphic mutations associated with resistance to INSTI. The prevalence of each amino acid at each IN position was determined and compared to the HIV-1 subtype B reference sequence (GenBank accession number: K03455). We defined NOPs as substitutions within the HIV-1 IN that occurred in ≥1% of the sequences for this analysis [6]. The positions with ≥20% substitutions were defined as highly polymorphic, while those with ≤0.5% variability were considered highly conserved.
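The thresholds above can be expressed as a short Python sketch that classifies one alignment column against the reference amino acid. The function name `classify_position` and the label "intermediate" for positions falling between the two thresholds are assumptions made for illustration; gaps and ambiguous residues are not handled.

```python
from collections import Counter


def classify_position(column, ref_aa):
    """Classify one alignment column relative to the reference amino acid.

    Returns (variability in %, list of NOPs, label). Thresholds follow the
    definitions in the text: NOP >= 1%, highly polymorphic >= 20%,
    highly conserved <= 0.5%.
    """
    n = len(column)
    counts = Counter(column)
    # Variability: percentage of sequences that differ from the reference.
    variability = 100.0 * (n - counts.get(ref_aa, 0)) / n
    # NOPs: individual substitutions present in at least 1% of sequences.
    nops = sorted(
        aa for aa, c in counts.items()
        if aa != ref_aa and 100.0 * c / n >= 1.0
    )
    if variability >= 20.0:
        label = "highly polymorphic"
    elif variability <= 0.5:
        label = "highly conserved"
    else:
        label = "intermediate"  # hypothetical label, not used in the paper
    return variability, nops, label


# Toy column: 900 sequences with V and 100 with I at a position where HXB2 has V.
print(classify_position("V" * 900 + "I" * 100, "V"))
```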

Generation of Consensus HIV-1 Integrase Sequence
To comprehensively describe the variability (polymorphism) in the IN sequences, we downloaded global subtype B and C IN sequences that matched the region (positions 4230-5093 relative to the HXB2 clone) from the Los Alamos National Laboratory (LANL) HIV database (https://www.hiv.lanl.gov (accessed on 13 July 2021)). To avoid the overestimation of variant calling and ensure the sequences included in the analysis were from INSTI-naïve patients, only sequences sampled before 2007 (before the FDA approved INSTIs) were used. The quality of all HIV-1 sequences was verified using the online Quality Control program (http://www.hiv.lanl.gov (accessed on 14 July 2021)). Sequences with stop codons, frameshifts, and/or poor quality were removed from the analysis. Only one sequence per patient was retained. For a patient with multiple sequences, the earliest sequence was selected and used. The consensus amino acid sequence for IN was generated for the Ethiopian HIV-1 subtype C, the global HIV-1 subtype B, and the global subtype C sequences using BioEdit V7.0.9.0 [49,50]. For positions where two amino acids occurred at frequencies higher than 30%, both amino acids were represented, and the first letter shown in the consensus represented the most prevalent amino acid.
Furthermore, to assess the impact of previous exposure to ART on IN gene NOPs, the consensus amino acid sequences of IN from the ART-naïve and ART-experienced patients were generated and compared. Similarly, we also compared the consensus amino acid sequences of IN from patients with one or more major HIVDRMs to protease inhibitor (PI), NRTI, and/or NNRTIs (HIVDR group) with those with no major HIVDRMs (no-HIVDR groups) in their corresponding protease/reverse transcriptase (PR/RT) gene.

Genetic Barrier to Integrase Strand-Transfer Inhibitor Resistance
To assess differences in the genetic barrier for evolution of drug-resistance substitutions between subtype C and subtype B, we compared Ethiopian HIV-1 subtype C IN sequences obtained from INSTI-naïve patients and global HIV-1 subtype B sequences obtained from LANL (INSTI-naïve, collected before 2007). We calculated the genetic barrier to INSTI resistance for 10 major INSTI resistance amino acid positions (19 substitutions) using a previously published method [53]. Briefly, we first determined the extent of natural diversity at each selected position in our dataset of Ethiopian HIV-1 subtype C IN sequences and global subtype B IN sequences by identifying all wild-type triplets and their prevalence. Next, we computed the genetic barrier score for each wild-type triplet to evolve to the resistant amino acid at the selected position. The genetic barrier was calculated as the sum of transitions and/or transversions required to evolve to any major drug-resistance substitution. We used a score of 1 for a transition (A↔G and C↔T), 2.5 for a transversion (A↔C, A↔T, G↔C, G↔T), and 0 when no change was needed, as described by Nguyen et al. (2012) [53]. The smallest total score (minimal score) of the transversions and/or transitions required for evolution from the wild-type codon to a resistant codon was used as the genetic barrier.
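The scoring scheme described above (transition = 1, transversion = 2.5, minimum over all codons encoding the resistant amino acid) can be sketched in Python as follows. This is an illustrative reimplementation under those stated rules, not the code used in the study; the function names are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def base_cost(a, b):
    """Score one nucleotide change: 0 (none), 1 (transition), 2.5 (transversion)."""
    if a == b:
        return 0.0
    return 1.0 if (a, b) in TRANSITIONS else 2.5


def genetic_barrier(wild_codon, resistant_aa):
    """Minimal score to mutate `wild_codon` into any codon for `resistant_aa`."""
    targets = [c for c, aa in CODON_TABLE.items() if aa == resistant_aa]
    return min(
        sum(base_cost(a, b) for a, b in zip(wild_codon, target))
        for target in targets
    )


# G140S from the codons reported at position 140 in the Results.
print(genetic_barrier("GGC", "S"))  # 1.0 (GGC -> AGC, one transition)
print(genetic_barrier("GGG", "S"))  # 3.5 (one transition plus one transversion)
```

Weighting each wild-type codon's score by its prevalence in the dataset would then summarize the barrier per subtype.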

Modeling and In Silico Predictions of HIV-1 Integrase and Dolutegravir Interaction
For in silico predictions, 20 randomly selected sequences (10 each from the ART-naïve (PDR) and ART-experienced (ADR) groups) were used. The ART-naïve IN sequences used in our analysis had no HIVDRMs against NRTI, NNRTI, and/or PI in their corresponding PR/RT gene, while the ART-experienced group had one or more HIVDRMs against NRTI, NNRTI, and/or PI. A multiple-sequence alignment of amino acid sequences (without any gap) was made using ClustalW (https://www.genome.jp/tools-bin/clustalw (accessed on 1 November 2021)). An amino acid identity matrix was created with Clustal 12.1 (https://www.ebi.ac.uk/Tools/msa/clustalo (accessed on 1 November 2021)) and visualized using GraphPad Prism 8.
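For gap-free aligned sequences of equal length, a pairwise amino acid identity matrix can be approximated with the short Python sketch below. It is a simplified stand-in for the matrix produced by the Clustal tools, whose exact identity calculation may differ; the function names and the toy sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percentage of identical residues between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def identity_matrix(seqs):
    """Square matrix of pairwise percent identities (row i, column j)."""
    return [[percent_identity(a, b) for b in seqs] for a in seqs]


# Toy example with three 4-residue sequences.
for row in identity_matrix(["MKVL", "MKVI", "MRVI"]):
    print(row)
```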
The crystallographic structure of full-length HIV-1 IN (accession number: 6u8q.pdb) was obtained from the Protein Data Bank (www.rcsb.org (accessed on 2 November 2021)) [54]. To visualize both the PDR and ADR HIV-1 IN, the 6u8q structure was modified at 12 amino acid positions using UCSF Chimera (see Table S2), and a monomer was used in the docking prediction. The structure (6u8q) originally included a DNA fragment and DTG. After removing all ligands, the DNA fragment, and water molecules from the crystal structure, the receptor and ligand (DTG) files were saved separately for further analysis. MGL Tools (Version 1.5.7rc1) was used for creating the .pdbqt files of the receptor and ligands needed for docking with Autodock Vina (Vina) (Version 1.1.2) [55,56]. Ligands were docked to the binding-site cavity at the catalytic site of the HIV-1 IN monomer, using the Cartesian coordinates x = 211.63 Å, y = 205.453 Å, and z = 171.895 Å. The grid box dimensions used for the search space were 50 Å × 40 Å × 40 Å. Docking calculations were performed with an exhaustiveness option of 8 (average accuracy) and an energy range of 3. Validation of the docking method was performed by redocking DTG to the above-mentioned modified crystal structure.
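The docking parameters reported above map onto the options of an Autodock Vina configuration file. The Python sketch below writes such a configuration; it assumes that the reported coordinates are decimal values (211.63, 205.453, 171.895) giving the center of the search box, and the file names `in_monomer.pdbqt`, `dtg.pdbqt`, and `docked.pdbqt` are hypothetical placeholders.

```python
def vina_config(receptor, ligand, center, size,
                exhaustiveness=8, energy_range=3, out="docked.pdbqt"):
    """Build the text of an Autodock Vina configuration file."""
    cx, cy, cz = center
    sx, sy, sz = size
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"center_x = {cx}",
        f"center_y = {cy}",
        f"center_z = {cz}",
        f"size_x = {sx}",
        f"size_y = {sy}",
        f"size_z = {sz}",
        f"exhaustiveness = {exhaustiveness}",
        f"energy_range = {energy_range}",
        f"out = {out}",
    ]
    return "\n".join(lines) + "\n"


# Search box assumed to be centered on the catalytic site (values from the text).
config_text = vina_config(
    receptor="in_monomer.pdbqt",   # hypothetical file name
    ligand="dtg.pdbqt",            # hypothetical file name
    center=(211.63, 205.453, 171.895),
    size=(50, 40, 40),
)
print(config_text)
```

The resulting text could be saved to a file and passed to Vina with its `--config` option.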

Statistical Analysis
Fisher's exact test, the Chi-squared test, and the Mann-Whitney U-test were used to evaluate the statistical differences between groups. p-values ≤ 0.05 were considered statistically significant.
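For readers who wish to reproduce a two-group comparison of proportions, a two-sided Fisher's exact test on a 2 × 2 table can be sketched with the Python standard library alone. The function name is hypothetical, and the sketch uses the common convention of summing the probabilities of all tables that are no more likely than the observed one; the statistical software used in the study is not stated and may differ in detail.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in row 1 given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum all tables at least as extreme (no more probable than the observed one);
    # the small tolerance guards against floating-point ties.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Example call: 3 of 93 versus 11 of 234 sequences carrying accessory mutations.
p = fisher_exact_two_sided(3, 90, 11, 223)
print(round(p, 3))
```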

Results
A total of 460 IN sequences obtained from INSTI-naïve patients were included in the analysis. Among these, 373 sequences were from patients who did not report exposure to any antiretroviral drug at the time of specimen collection (ART-naïve), while 87 sequences were from ART-experienced patients (on NNRTI-based or PI-based regimens) with virological failure (viral load ≥ 1000 copies/mL) while on a first-line (n = 41) or second-line (n = 46) regimen.
The phylogenetic tree in Figure 1 contains a total of 874 sequences, including Ethiopian sequences (n = 460) and integrase reference sequences (n = 414) for HIV-1 subtypes (A-K) and circulating recombinant forms downloaded from the HIV-1 LANL database. An ML tree was constructed using the online version of PhyML v 3.0. The reference sequences from the Los Alamos National Laboratory are in black in the figure. The Ethiopian sequences that cluster with the HIV-1 subtype C reference sequences are in green, while the non-subtype C Ethiopian sequences are in pink.

Prevalence of Major Integrase Strand-Transfer Inhibitor Resistance Mutations
No major DRMs known to be associated with DTG resistance (T66K, E92Q, G118R, E138K/A/T, G140S/A/C, Q148H/R/K, N155H, or R263K) were detected among INSTI-naïve individuals, regardless of previous exposure to ART. However, one (0.22%) sequence from a person without previous ART exposure was found to harbor E92G, a mutation that moderately reduces EVG susceptibility but does not reduce susceptibility to RAL and DTG.

Integrase Strand-Transfer Inhibitor Resistance among Patients on Antiretroviral Therapy
To assess the impact of ART exposure to NRTI, NNRTI, and/or PI on the selection of INSTI-resistance mutations, we further compared the INSTI HIVDRMs from patients with one or more major HIVDR mutations to NRTI, NNRTI, and/or PI (HIVDR group) with those with no HIVDRMs in their corresponding PR/RT genes (no-HIVDR group) (Figure 2).
Briefly, among the total 460 IN sequences used in our analysis, 327 had a corresponding PR/RT gene sequence, of which 234 had no major HIVDRMs (no-HIVDR group), while 93 of the sequences (HIVDR group) had one or more HIVDRMs against the NRTI, NNRTI, and/or PI (see Table S1). No major INSTI HIVDRMs were detected in either of these groups, and there was no significant difference in the presence of accessory mutations with regard to previous ART exposure, nor with regard to DRMs toward other ARVs. Among the HIVDR and no-HIVDR groups, 3.2% (3/93) and 4.7% (11/234) accessory mutations were detected, respectively (p = 0.8); while 4.29% (15/373) and 5.75% (5/87) accessory mutations were detected among ART-naïve and ART-experienced groups, respectively (p = 0.6).
High similarity was also observed when comparing the consensus sequences from ART-naïve and ART-experienced patients, as shown in Figure 3. Similarly, our comparison of the consensus sequences from the HIVDR and no-HIVDR groups also showed high similarity between the two consensus sequences, except at positions K215N, T218L, and R269, where the HIVDR group had one amino acid, while the no-HIVDR group had a mixture of amino acids at positions T215K/N, T218I/L, and R269R/K, respectively (Figure 3).
In Figure 3, the consensus sequence from the ART-naïve sequences (n = 367) is represented as ART_Naive, and that from the ART-experienced sequences (n = 87) is represented as ART_Expo. The consensus sequence from the sequences with no HIVDR mutation in the protease/reverse transcriptase (PR/RT) gene (n = 234) is represented as No_HIVDR, while that from the sequences with one or more major mutations in PR/RT is represented as HIVDR. At positions with more than one amino acid, both are represented. HXB2 represents the consensus HIV-1 subtype B reference sequence from the LANL database (accession number: K03455).

Analysis of the N-Terminal Domain (NTD)
Within the NTD, the Zn-binding motif (H12H16C40C43), which is involved in the multimerization of the IN subunit, stabilization of folding, and interaction with LEDGF/p75, was highly conserved [6]. However, the amino acid positions D10E, S24N, D25E, V31I, and M50I were highly polymorphic (>20.0% variability). We also observed that the residue E10 had been replaced by D (aspartic acid) in 97.8% of sequences, which might be the signature of subtype C (Figure 4).

Analysis of the C-Terminal Domain (CTD)
Within the CTD, the two large stretches of consecutive residues, L241-Q252 and I257-K264, which are involved in the binding of viral and cellular DNA, were found to be highly conserved, except for positions I251 and V259, which were mutated to I251L and V259I in 3.5% and 0.7% of the sequences, respectively. However, the important positions for DNA binding and integrase multimerization (K258, V260, R262, R263, and K264) [6] were fully conserved.
Our analysis also showed that 24 amino acid positions were highly polymorphic (>20.0% variability). Our comparison of the NOPs' distribution with the global subtype B and global subtype C sequences downloaded from LANL showed that the Ethiopian HIV-1 subtype C IN sequences had a high similarity to the global subtype C sequences, but were quite different from the global subtype B sequences, as shown in Figure 5.

Analysis of the Subtype Consensus Integrase Sequences
The consensus IN sequences for the global HIV-1 subtype B and global subtype C were generated using 1884 and 1410 sequences, respectively. Our comparison of the 288-amino-acid alignment of the consensus Ethiopian HIV-1 subtype C with the global HIV-1 subtype C showed high similarity, except for mixtures of amino acids at positions 25E/D, 100Y/F, 124T/A, 136K/Q, 167E/D, 215K/N, and 218I/L in the Ethiopian consensus, and at 50M/I, 72I/V, and 265A/V in the global HIV-1 subtype C consensus sequence. However, the Ethiopian consensus differed from the global subtype B consensus at eight positions with complete amino acid replacement (31, 112, 125, 201, 218, 234, 278, 283), while a mixture of amino acids was detected at positions 11E/D, 72I/V, and 101I/L in the global subtype B consensus sequence and at 24N/S, 25E/D, 100Y/F, 124T/A, 136K/Q, 167E/D, 215K/N, and 269K/R in the Ethiopian subtype C consensus sequence (Figure 6).

Genetic Barrier to Dolutegravir Resistance
In this study, 19 substitutions conferring major resistance to DTG at 10 amino acid positions in the IN (T66A/I/K, E92G, G118R, E138K/A/T, G140S/A/C, Y143R/C/H, S147G, Q148H/R/K, N155H, and R263K) were assessed to explore the genetic barrier to DTG. For each codon, the number of transitions and/or transversions required for an IN drug-resistance-associated substitution was calculated. A total of 1884 global HIV-1 subtype B sequences and 453 Ethiopian subtype C sequences from INSTI-naïve patients were compared for differences in the genetic barrier to INSTI resistance (Table 2).
Overall, the sequence analysis of the two subtypes showed similar predominant codon use at the selected amino acid positions, resulting in a similar minimum score for the genetic barrier to DTG. However, at position 140, the predominant codons in subtype C were GGG (53.6%) and GGA (45.9%). In contrast, in subtype B, GGC (85.0%) was the predominant codon, resulting in a difference in the calculated genetic barrier at this position. For subtype C, two transversions (minimum score of 5) were required to mutate to G140C (GGG/A to TGT/C), while for subtype B, one transversion and one transition (minimum score of 3.5) were required to mutate to G140C (GGC to TGT). Similarly, a two-point mutation (one transversion and one transition; minimum score of 3.5) was required to mutate to G140S (GGG/A to AGT/C) for subtype C, while subtype B required a one-step transition (minimum score of 1) (GGC to AGC).

Impact of Protease and Reverse-Transcriptase Drug-Resistance Mutation on the Structure of HIV-1 Integrase
The effects of HIVDRMs in HIV-1 PR and/or RT on the secondary structure of HIV-1 IN were investigated on 20 randomly selected HIV-1 IN sequences: 10 from ART-naïve (PDR) and 10 from ART-experienced (ADR) individuals. The sequence identity matrix (Figure 7a) showed that all the sequences were more than 92% identical at the amino acid level, and there were no major differences between the two main groups. To study the effects of PR and RT drug-induced resistance on the structure of HIV-1 IN, chain A of the 6u8q structure was modified at 12 positions to represent both the ADR and the PDR sequences (see Table S2). The alignment of the monomers of the PDR and ADR INs did not reveal any differences between the two groups. DTG was successfully docked to both the PDR and ADR IN by Autodock Vina (Figure 7c) with a docking score of −6.5 kcal/mol, at a position similar to that of the original DTG ligand.

Discussion
Overall, our results revealed no major DTG-associated HIVDRMs among INSTI-naïve individuals, regardless of previous exposure to ART. In one individual, the E92EG mutation was found, which moderately reduces EVG susceptibility but has no effect on DTG. However, INSTI accessory mutations and NOPs, which could influence INSTI susceptibility and the genetic barrier to INSTI resistance, were detected. Our polymorphism analysis showed that 64.9% (187/288) of amino acid positions of the HIV-1 subtype C IN sequences from INSTI-naïve individuals were conserved (<1.0% variability). The majority of amino acids involved in key functions of the enzyme (the HHCC and DDE motifs [6,22]) were fully conserved. The genetic barriers to DTG resistance were similar at selected amino acid positions for subtypes B and C, except that subtype C had a higher genetic barrier for the G140C and G140S mutations, highlighting that the Q148H/K/R DTG resistance pathway was selected less in subtype C. Docking analysis of DTG showed that the PR- and RT-associated HIVDRMs did not affect the structure of the HIV-1 IN, supporting the use of DTG as a salvage therapy for patients with resistance to drugs targeting these enzymes.
The absence of major INSTI DRMs among INSTI-naïve patients in our study was consistent with other studies from Africa [58-64], Asia [65-67], and Europe [68-70], showing no or highly infrequent major INSTI mutations among INSTI-naïve patients. Our finding was not unexpected, and was in line with studies from other settings based on samples obtained before the rollout of DTG [71-73]. However, following the wide scale-up of DTG, an increase in DTG resistance has been reported, especially in persons receiving DTG monotherapy [15,19,23-28]. Hitherto, the prevalence of transmitted resistance to DTG has been low [20-23]. Similarly, in Ethiopia, after implementing the test-and-treat strategy, an increased number of patients will be on a DTG-based regimen. Thus, the emergence of INSTI resistance is expected, especially in settings with low access to viral load monitoring, which delays the identification of patients with treatment failure and increases the risk of HIV drug resistance [74].
When present alone, accessory mutations have a minimal effect on INSTI susceptibility, but may serve to augment resistance and/or restore the fitness of viral mutants with major resistance mutations [5,30]. INSTI accessory mutations were detected in 20 (4.4%) of our specimens, and were equally distributed in both ART-naïve and ART-experienced patients. Similar to our findings, different studies [67,72,73,75] revealed that NOPs were common among INSTI-naïve patients. However, the prevalence differed with HIV-1 subtypes or circulating recombinant forms.
E157Q was the most common nonpolymorphic accessory mutation detected in our analysis. It is a natural polymorphism present in 1-10% of untreated individuals, depending on the subtype. It has no effect on susceptibility to INSTIs. However, it may act as a compensatory substitution for R263K-induced resistance to DTG [76]. Q95K was among the other nonpolymorphic accessory INSTI-resistance mutations detected in our study. It has little, if any, effect on susceptibility to INSTIs; however, in the presence of an N155H mutation, it increases INSTI resistance and improves the impaired replication of the virus [77].
L74M/I (2.9%) and M50I (18.8%) were the other polymorphic mutations detected in our study. L74M/I has been reported at levels between 0.5-20% in the untreated population, with a high prevalence in subtypes A, G, and A/G recombinants. It does not decrease INSTI susceptibility alone, but it can contribute to a high-level resistance when occurring with major INSTI-resistance mutations, mainly the Q148H/K/R mutation [24,58,78,79]. Studies in South Africa, Brazil and Europe have also confirmed a low frequency of L74M in INSTI-naïve patients [64,68,80]. M50I can be found in 10-25% of INSTI-naïve patients [81]. M50I alone does not negatively impact integrase strand-transfer activity and HIV replication capacity, but in combination with R263K, it increased resistance to DTG by 15.6-fold [81].
The other nonpolymorphic and polymorphic accessory mutations detected were G163R and T97A, which can contribute to a high-level resistance when occurring with Y143 and N155H major INSTI-resistance mutations [30].
In this study, we characterized the distribution of amino acid variants among the 453 HIV-1 subtype C IN sequences from INSTI-naïve individuals. Our results revealed that 64.9% of HIV-1 IN amino acid positions were conserved (<1.0% variability). The conserved position in the NTD, CCD, and CTD were 60%, 66.0%, and 65.8%, respectively. This was comparable to the study by Rhee et al. (2008) that showed 70% (202/288) of IN amino acid positions of the 1500 sequences obtained from INSTI-naïve (ART-naive or ART-experienced) individuals with different subtypes (<1.0% variability) [5]. Similarly, Hackett et al. (2008) also showed that 65% (187/288) of amino acid positions were conserved after analyzing 1304 HIV-1 sequences from groups M, N, and O IN sequences [82].
In general, our results showed that the majority of amino acids involved in key functions of the enzyme, including the zinc-binding HHCC motif, the multimerization of IN subunits, the binding with the human cellular factor LEDGF/p75 in the catalytic core domain, and the catalytic triad DDE [6,22], were highly conserved. The high conservation might have been due to the absence of INSTI pressure. All of our study participants were INSTI-naïve, and INSTIs were not used in Ethiopia during our sample collection. However, highly polymorphic residues in the NTD, CCD, and CTD regions, which might affect IN-protein function and interfere with INSTI binding, were also observed [22,30]. Further long-term treatment follow-up studies are needed to assess the potential impact of NOPs on the evolution of INSTI resistance and viral fitness under the pressure of INSTIs.
It was also interesting to note that 20.5% (93/453) of our study participants were found to harbor a major HIVDR mutation (transmitted and acquired HIVDR) for NRTI, NNRTI, and/or PI in their corresponding PR/RT gene. However, DRMs directed toward sites other than IN did not have a significant effect on INSTI susceptibility. In line with our findings, different studies have shown that previous NRTI mutations appeared to have no impact on the risk of virological failure in patients switched to DTG with NNRTIs [83][84][85][86]. However, this was in contrast to other studies that showed that previous exposure to NRTI, PI, and/or NNRTI induced mutations or increased polymorphisms in the IN gene, highlighting the functional cooperation between viral IN and RT, and/or a potential coevolution of some of their mutations [9,87]. For instance, studies by Ceccherini et al. (2009, 2010) showed a higher frequency of I84V, M154I, and V165I among ART-treated subtype B patients compared to ART-naïve patients, implying that nonsuppressive ART treatment based on other antiretroviral drug classes (NRTI and/or NNRTI) might induce IN polymorphisms [6,9].
The observed differences between this and previous studies might be due to the number of sequences, range of major/minor mutations, and subtypes included in the analysis. However, the lack of a major INSTI mutation among sequences with multiple mutations in the PR/RT gene and the high conservation of amino acids involved in key functions of the IN enzyme did not support the impact of previous ART treatment on INSTI susceptibility.
Our docking analysis further supported our results, and showed no differences between the HIVDR and no-HIVDR groups. In both groups, DTG was successfully docked at a similar position to the original DTG ligand with the best docking score of −6.5 kcal/mol.
The genetic barrier, a crucial factor in the development of drug resistance, is defined by the cumulative number of resistance-associated mutations (RAMs) required for the virus to escape drug-selective pressure [53]. The variability at the nucleotide level in the IN among the different subtypes could influence the genetic barrier of INSTI drugs. In this study, we explored how the variability between subtypes C and B could affect DTG resistance.
Overall, our analysis of the codon distribution at the selected amino acid positions of HIV-1 subtype C and subtype B revealed a similar genetic barrier for the development of DTG resistance in the two subtypes. The exception was codon position 140, where subtype C had a higher genetic barrier to develop the G140C and G140S mutations compared to subtype B, highlighting a higher genetic barrier for the Q148H/R/K resistance pathway in subtype C. The G140S mutation has been shown to rescue the catalytic defect due to the Q148H mutation, enabling the recovery of viral fitness [88]. A similar high genetic barrier to acquire mutations G140S or G140C has also been described in CRF02_AG compared with subtype B [53,89].
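The codon-level reasoning behind such a comparison can be sketched as follows. The sketch scores the cheapest set of nucleotide substitutions that converts a wild-type codon into any codon encoding the resistant amino acid. The function name `genetic_barrier`, the transition and transversion weights (1 and 2.5), and the example glycine codons GGC and GGA are illustrative assumptions; they are not the scoring scheme or the codon frequencies reported in this study.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
PURINES = {"A", "G"}

def is_transition(a, b):
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    return a != b and ((a in PURINES) == (b in PURINES))

def genetic_barrier(codon, target_aa, ts_cost=1.0, tv_cost=2.5):
    """Return (lowest score, target codon) for reaching `target_aa` from `codon`.

    The weights are assumed values: transitions are scored lower than
    transversions because they occur more readily.
    """
    best = None
    for candidate, aa in CODON_TABLE.items():
        if aa != target_aa:
            continue
        score = 0.0
        for a, b in zip(codon, candidate):
            if a != b:
                score += ts_cost if is_transition(a, b) else tv_cost
        if best is None or score < best[0]:
            best = (score, candidate)
    return best

# Hypothetical wild-type glycine codons at position 140 (not study data).
for wild_type in ("GGC", "GGA"):
    for mutant in ("S", "C"):
        print(wild_type, "-> G140" + mutant, genetic_barrier(wild_type, mutant))
```

Under these assumptions, a GGC codon reaches serine (AGC) through a single transition, whereas a GGA codon needs one transition plus one transversion. This is the kind of difference that a higher genetic barrier at codon 140 reflects.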
This study was comprehensive, and included both treatment-naïve and treatment-experienced (first- and second-line regimens) patients, and will be a benchmark for INSTI DRM monitoring in Ethiopia. However, our analysis was based on the Sanger dideoxy sequencing method, which does not detect drug-resistance minority variants below 20% of the virus population, and might have underestimated the prevalence of INSTI DRMs among our study participants [90].

Conclusions
Our results showed no major clinically relevant INSTI-associated mutations among INSTI-naïve patients regardless of exposure to other antiretroviral agents, supporting the implementation of the wide scale-up of DTG-based regimes in Ethiopia. However, the detection of polymorphisms contributing to INSTI resistance and the expected increased use of DTG-based regimens in Ethiopia warrant the need for continuous surveillance of INSTI resistance. The genetic barrier analysis showed that subtype C had a high genetic barrier to acquiring the G140C and G140S mutations, highlighting that the Q148H/K/R mutation DTG resistance pathway was selected less in subtype C. Moreover, the docking analysis of the dolutegravir showed that protease- and reverse-transcriptase-associated HIVDRMs did not affect the native structure of the HIV-1 integrase.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/v14040729/s1, Table S1: Type of HIV-drug-resistance mutations detected in the HIVDR group (patients with one or more major HIVDR mutation to NRTI, NNRTI, and/or PI) (n = 93), Table S2: Modifications of the 6u8q.pdb HIV-1 integrase structure according to the alignment of both the ADR and PDR sequences.
Data Availability Statement: All the sequences from this study were deposited in the GenBank with accession numbers OM302554-OM303013.