High Level of Pre-Treatment HIV-1 Drug Resistance and Its Association with HLA Class I-Mediated Restriction in the Pumwani Sex Worker Cohort

Background: We analyzed the prevalence of pre-antiretroviral therapy (ART) drug resistance mutations (DRMs) in a Kenyan population. We also examined whether host HLA class I genes influence the development of pre-ART DRMs. Methods: HIV-1 proviral DNA was amplified from blood samples of 266 ART-naïve women from the Pumwani Sex Worker cohort of Nairobi, Kenya, using a nested PCR method. The amplified HIV genomes were sequenced using next-generation sequencing technology. The prevalence of pre-ART DRMs was investigated. Correlation studies were performed between HLA class I alleles and HIV-1 DRMs. Results: Ninety-eight percent of participants had at least one DRM, while 38% had at least one WHO surveillance DRM. M184I was the most prevalent clinically important variant, seen in 37% of participants. DRMs conferring resistance to one or more integrase strand transfer inhibitors were also found in up to 10% of participants. Eighteen potentially relevant (p < 0.05) positive correlations were found between HLA class I alleles and HIV drug-resistant variants. Conclusions: High levels of HIV drug resistance were found in all classes of antiretroviral drugs included in the current first-line ART regimens in Africa. The development of DRMs may be influenced by host HLA class I-restricted immunity.


Introduction
Human immunodeficiency virus 1 (HIV-1) continues to be a major global health threat, with an estimated 37.6 million people living with HIV (PLWH) as of 2020 [1]. The Joint United Nations Programme on HIV/AIDS (UNAIDS) set an ambitious 90-90-90 goal of diagnosing 90% of those living with HIV, providing antiretroviral therapy (ART) to 90% [...] determine pre-treatment HIV-1 drug resistance in treatment-naïve women in a Kenyan cohort and (2) to determine the influence of host HLA class I genes on the development of pre-treatment DRMs. We investigated the distributions of pre-ART DRMs in the pol region of HIV-1 proviral DNAs from ART-naïve women in Kenya using NGS technologies and explored the potential influence of host HLA class I genes on the development of pre-ART DRMs.

Study Participants and Samples
Study participants (n = 266) were all HIV-infected women enrolled between 1987 and 2010 in the Pumwani sex worker cohort, established in the Pumwani district of Nairobi, Kenya. Antiretroviral drugs were not available in this setting before 2003, when PEPFAR was introduced, so participants enrolled before then had not received antiretroviral therapy; participants enrolled later did not receive ART because their CD4 counts were above the treatment criterion of 200 cells/µL. HIV+ blood samples, PBMCs, or buffy coats were collected from all participants at enrollment (n = 266) or at resurvey visits (n = 171). This study was approved by the Ethics Committee of the University of Manitoba and the Ethics and Research Committee of Kenyatta National Hospital, and written informed consent was obtained from all participants.

Amplification and Sequencing of HIV Genome
HIV-1 proviral DNA was amplified using a nested approach [21,22]. The first round of PCR amplified the full HIV genome using the published primers MSF12b (HXB2 location 623-649) and ofm19 (HXB2 location 9632-9604), located in the 5′-LTR and 3′-LTR regions, with the Expand Long Template PCR System (Roche Diagnostics GmbH, Mannheim, Germany) under the recommended conditions. Three sets of PCR primers were used for nested PCRs with the Expand High Fidelity PCR System (Roche Diagnostics GmbH, Mannheim, Germany) to generate 3 overlapping PCR products covering the full HIV-1 genome. This approach was used successfully to amplify full HIV-1 genomes from 300 samples [23]. The amplified PCR products were purified, quantified, and sequenced on the Roche 454 GS FLX platform at 1000-fold coverage.

HLA Genotyping
A sequence-based high-resolution typing method [24,25] was used to genotype three HLA class I genes of all participants. DNA was isolated using the QIAamp DNA Mini Kit and the QIAGEN EZ1 Blood Robot (QIAGEN Inc., Mississauga, ON, Canada). HLA-A, -B, and -C genes were amplified by PCR with gene-specific primers [24,25]. The purified PCR products were sequenced with BigDye cycle sequencing kits (Applied Biosystems, Waltham, MA, USA) using sequence-specific primers and analyzed with an ABI3730 Prism Genetic Analyzer. HLA-A, -B, and -C alleles were assigned using CodonExpress™, a computer software program developed based on a taxonomy-based sequencing analysis [24].

Sequence Analysis
The raw reads generated from the Roche 454 were aligned to HXB2, and only the sequencing reads mapped to the pol region of the HXB2 genome (coordinates: 2294-5096; 2803 bp) were extracted for HIV DRM identification and for consensus sequence generation using HyDRA Web (http://hydra.canada.ca; accessed on 13 July 2020). Default settings were selected, and reads were filtered to have a minimum variant quality score of Q30 and a minimum length of 100 bp. The mutations were determined based on the standard mutation database. The minimum amino acid (AA) frequency for variant calling was 1%, the minimum read depth was 100, and the minimum mutation count was 5; an error rate of 0.0021 was used. To limit the impact of the intrinsic error rate of the Roche 454 [18], only variants with a frequency of at least 2% were selected for analysis. The consensus sequence for [...] [25,26]. The level of drug resistance was classified as high, intermediate, or low using the Stanford HIV DB [25]. A patient-specific database was generated with patient IDs and characteristics including age, country of origin, CD4 counts, HIV subtype, and HIV stage (if available). Each DRM identified per patient was added to the table, and records were merged when multiple samples were available for one patient. If a patient had the same DRM at two different collection times, the mutation was counted once so that the number of people with each variant was portrayed accurately.
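The filtering and per-patient counting rules described above can be illustrated with a minimal Python sketch. It is not the HyDRA Web implementation: the `VariantCall` record, its field names, and the participant IDs are hypothetical, and only the thresholds (read depth of at least 100, mutation count of at least 5, frequency of at least 2%) come from the text.

```python
from dataclasses import dataclass

# Thresholds taken from the text; everything else in this sketch is assumed.
MIN_DEPTH = 100   # minimum read depth at the codon
MIN_COUNT = 5     # minimum number of reads carrying the mutation
MIN_FREQ = 0.02   # minimum within-sample frequency retained for analysis


@dataclass(frozen=True)
class VariantCall:
    """One mutation observed in one sample (hypothetical record layout)."""
    patient_id: str
    sample_year: int
    mutation: str   # e.g. "RT:E138K"
    depth: int      # reads covering the codon
    count: int      # reads carrying the mutation

    @property
    def frequency(self) -> float:
        return self.count / self.depth if self.depth else 0.0


def passes_filters(call: VariantCall) -> bool:
    """Apply the depth, count, and frequency thresholds to one call."""
    return (call.depth >= MIN_DEPTH
            and call.count >= MIN_COUNT
            and call.frequency >= MIN_FREQ)


def drms_per_patient(calls):
    """Merge samples per patient; a DRM seen at several visits is kept once."""
    merged = {}
    for call in calls:
        if passes_filters(call):
            merged.setdefault(call.patient_id, set()).add(call.mutation)
    return merged


def prevalence(per_patient, n_participants):
    """Fraction of participants harboring each DRM."""
    counts = {}
    for mutations in per_patient.values():
        for mutation in mutations:
            counts[mutation] = counts.get(mutation, 0) + 1
    return {m: c / n_participants for m, c in counts.items()}


# Hypothetical example records (not study data).
example_calls = [
    VariantCall("P1", 1996, "RT:E138K", depth=1000, count=50),
    VariantCall("P1", 2003, "RT:E138K", depth=1200, count=300),
    VariantCall("P2", 1996, "RT:E138K", depth=1000, count=10),  # 1% frequency
    VariantCall("P2", 1996, "RT:M184I", depth=80, count=40),    # depth too low
    VariantCall("P3", 1996, "RT:M184I", depth=200, count=4),    # too few reads
]
example_merged = drms_per_patient(example_calls)
example_prevalence = prevalence(example_merged, n_participants=3)
```

Storing the mutations of each patient in a set is what ensures that a DRM detected at two collection times contributes only once to the prevalence.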

Statistical Analysis
Data analysis was conducted using IBM® SPSS® Statistics for Macintosh, V27.0.1.0 (IBM Corp., Armonk, NY, USA) [27]. HLA alleles and HIV DRMs present in more than 1% of participants/samples were included in our analysis. Fisher's exact test was used to examine associations between individual DRMs and HLA class I alleles. Associations with p-values less than 0.05 were considered potentially relevant and were extracted for further characterization. The false discovery rate was controlled at 0.05 to adjust p-values for multiple comparisons using the Benjamini-Hochberg method. Kaplan-Meier survival analysis was carried out to estimate the effect of HLA-restricted DRMs on the time taken for CD4+ T cell counts to decline to below 200 cells/µL (diagnosis of AIDS). The log-rank test was used to compare the time to CD4+ T cell decline to <200 cells/µL between two groups: participants harboring HLA-specific DRMs vs. participants without HLA-specific DRMs. A p-value less than 0.05 was considered significant.
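For readers who wish to reproduce this type of association screen outside SPSS, the two core steps (a two-sided Fisher's exact test on each HLA-by-DRM 2 × 2 table, followed by Benjamini-Hochberg adjustment) can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the authors' analysis code: the function names and the example table are hypothetical, and the two-sided p-value uses the common convention of summing all tables that are no more probable than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows: allele present / absent; columns: DRM present / absent (assumed layout).
    """
    row1 = a + b
    col1 = a + c
    n = a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    lo = max(0, col1 - (n - row1))
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum every table at most as probable as the observed one; the small
    # relative tolerance guards against floating-point ties.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical 2x2 table (not study data): 3 of 4 allele carriers have the
# DRM, versus 1 of 4 non-carriers.
example_p = fisher_exact_two_sided(3, 1, 1, 3)
example_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

An association would then be retained when its adjusted p-value is below the chosen false discovery rate of 0.05.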

Characteristics of the Study Participants
A total of 266 HIV+ women enrolled in the Pumwani cohort were included in the study; among them, 126 (47%) were from different regions of Kenya, 123 (46%) from Tanzania, and 14 (5%) from Uganda. Those from Tanzania and Uganda were from areas around Lake Victoria. Most of the women were Bantu speakers (~93%), and a small number of them were Nilote speakers. None of them had received ART. Participants ranged in age from 18 to 54 years, with a median (Q1, Q3) age of 35 (25, 45) years. The median duration since HIV-1 diagnosis was 7.67 years. The dominant HIV-1 subtype in the pol gene was A1 (n = 150, 56.4%), followed by D (n = 25, 9.4%) and C (n = 14, 5.3%); HIV-1 subtype B was rare (n = 1, 0.4%). The average CD4 count was 295 cells/µL. The characteristics of the study participants are summarized in Table 1.

Pre-ART Drug Resistance Mutations
Among all participants, 98% had at least one DRM with a frequency ≥2% within the viral population, including mutations at highly polymorphic sites and potential APOBEC-mediated mutations. The total number of DRMs in each patient ranged from 1 to 15, with a mean of 6. A total of 58 unique variant positions were identified; 25 (43%) of these were in the reverse transcriptase (RT) region, 18 (31%) were in the protease (PR) region, and 15 (26%) were in the integrase (IN) region. Of those 58 mutations, 42 (72%) were determined to be non-polymorphic and unlikely to be APOBEC-mediated [25]. One hundred and thirty-eight individuals (52%) had one or more DRMs conferring at least potentially low-level resistance to one of the currently available ARV medications (according to Stanford HIV DB [20]), and 42% of participants harbored one or more DRMs to one of the current first-line ARV options used in Kenya and surrounding countries (Table 2) [28,29]. One hundred and [...] Ninety-one percent of the participants harbored an NNRTI-associated variant, making it the drug class with DRMs in the highest number of participants. This was followed by PIs (61%), NRTIs (45%), and INSTIs (40%). Variants conferring resistance to one of the available ARVs (based on Stanford HIV DB or listed as a WHO SDRM [25,26]) were considered to be clinically important variants (Table 2).
The most prevalent clinically important mutations for each drug class, with their prevalence in the study sample, were as follows: M184I (37%) (NRTI), E138AK (15%) (NNRTI), D30N (10%) (PI), and E138K (8%) (INSTI) (Figure 1). The prevalence of these DRMs was very similar among the participants from different countries (Supplementary Table S1). The most prevalent single mutation overall, regardless of clinical importance, was K103E (RT), with a prevalence of 77% (result not shown). K103E frequently co-occurred with K103R, which had a prevalence of 61%. Fifty-eight percent of the study population had both the K103E and K103R variants.

Pre-ART Drug Resistance Associated with HLA Class I Alleles
A total of 65 HLA Class I alleles and 29 HIV DRMs were observed in at least 1% of the study participants.
Of the 65 alleles, 23 (35%) were Class I A, 25 (38%) were Class I B, and 17 (26%) were Class I C. More than 1800 HLA-by-HIV DRM interactions were examined using Fisher's exact test. We looked at the associations with the lowest uncorrected p-values, giving 53 potentially relevant associations (unadjusted p < 0.05). Fourteen of these 53 potential associations involved clinically important DRMs (Table 3). The examination of the effect of the identified clinically important DRMs on disease progression by Kaplan-Meier analysis showed that the E138K_RT mutation is significantly associated with disease progression (p = 0.002) (Figure 2). Sub-Saharan countries with either a 5-10% or >10% HLA allele frequency in their population are listed in Table 3. The HLA-DRM associations involving HLA alleles at >10% frequency in Kenya are A*68:02 with DRMs G190ES (p = 0.008) and IN E138K (p = 0.042), and C*17:01 with DRMs M46I (p = 0.018) and IN E138K (p = 0.021). The most significant association was between the mutation T97A and A*66:01 (p = 6.20 × 10⁻⁷; p = 0.001 after correction for multiple tests), a high-frequency allele in both Kenya and Uganda.
Table 2. Drug class, population prevalence, and level of ARV drug resistance conferred by selected HIV-1 drug-resistant mutations found in the study participants.
We checked whether the same DRMs could be detected in samples collected in different years from the same study participants, on the assumption that if HLA class I-restricted CTL responses drive the development of a specific DRM, the DRM will be maintained in the participant in the absence of other factors such as ART. We observed such examples in our study. For example, E138K, a well-known escape mutation [30], was first identified in participant ML874 (A*68:02+) in the blood sample collected in 1996, and this DRM was also detected in the samples collected in 2003 from the same participant. Moreover, E138A was first identified in participant ML264 (A*68:02+) in a sample collected in 1996, and the DRM was also present in two samples collected from the same patient 7 and 7.5 years later, in 2003. Both patients have the A*68:02 allele that is associated with the DRM. These results support the assumption that HLA class I-restricted CTLs exert selective immune pressure on the RT138 amino acid and maintain E138AK mutations in these individuals over the course of infection.

Discussion
Surprisingly, 98% of ART-naïve participants had at least one detectable HIV-1 variant, 38% of participants harbored at least one WHO SDRM, and 42% had at least a low-level resistance DRM to one of the first-line medications used in Kenya [28]. This is to be compared with previous studies reporting TDR prevalence of 8% in men who have sex with men (MSM) in Coastal Kenya from 2005 to 2017 [29], SDRM prevalence of 11% in four African countries from 2013 to 2019 [31], pre-treatment DRM prevalence of 24% in rural Kenya from 2008 to 2013 [32], and a 9.7% SDRM prevalence in Nigeria from 2013 to 2017 [33]. Importantly, resistance was identified to all major classes of antiretroviral drugs (NNRTIs, NRTIs, PIs, and INSTIs) currently used in Kenya (Table 2). The higher prevalence of pre-ART DRMs in our study is likely attributable to the use of NGS technology, with which a sequence coverage of 1000× was reached, whereas previous studies employed Sanger sequencing or OLA detection of a specific mutation. Eight (44%) of the clinically significant mutations (n = 18) were detected at a frequency of less than 10%, and one (5%) was present at a frequency between 10 and 20%. LADRVs (frequency < 20%) have been associated with an increased risk of virological failure in a dose-dependent manner with respect to mutant load, regardless of medication adherence [34-36]. Our results highlight the importance and utility of NGS and high-coverage sequencing techniques [34,37,38].
RT K103E (77%) and K103R (61%) were the most common variants identified in our study. K103E, listed as a rare variant by Stanford HIV DB [25], can be selected by NNRTI medications but does not seem to confer any reduced susceptibility to them. When present with the variant V179D, which was not found in our participants, K103R can reduce susceptibility to NVP and EFV approximately 15-fold [25]. Knowledge of the high level of K103R polymorphism in this study population is important if population levels of the V179D variant increase. In previous studies, the K103 variant was shown to persist even in participants who tested negative for ARV drugs in their system, eliminating the possibility of acquisition of ARV medications illegitimately and suggesting there is alternative selective pressure maintaining that mutation [39].
The clinically significant mutations with the highest prevalence among the study participants were M184I (37%) and E138AK (15%) on RT, D30N (10%) and M46I (8%) on PR, and E138K (6%) on IN. Each of these variants confers at least low-level resistance to one or more of the currently recommended ARV drugs, except D30N, which confers resistance to NFV, a discontinued PI [20]. Slightly divergent from the guidelines, commonly used medications in Kenya and surrounding countries, as of 2020, include the combined regimens AZT + 3TC + NVP, TDF + 3TC + EFV, and AZT + 3TC + EFV [38]. Twenty-four participants (9%) harbored HIV-1 variants conferring low- to high-level resistance to both EFV and NVP, while 41% had mutations conferring resistance to 3TC. Of note, the presence of the M184I mutation in a patient's quasispecies is not a contraindication to regimens containing 3TC or FTC, because the M184I mutation decreases viral replication fitness and increases viral susceptibility to the other ARV drugs TDF, d4T, and AZT [25]. M184 variants were identified in numerous studies within this region [23,37,38]. Interestingly, the M184V variant was not common (<1%, not shown) in this cohort. It is possible that the combination of M184I with other DRMs, especially E138AK, makes the isoleucine at position 184 less likely to change to the more stable and fit valine variant [35], as the M184I mutation confers high-level reduced susceptibility to FTC/3TC. More importantly, 6% of the participants had the IN mutation E138K, which confers low-level resistance to the newly recommended DTG or, when present with other IN variants, may synergistically lead to intermediate-level resistance against DTG. A higher prevalence of IN-associated DRMs was seen in our study than in the others. The majority of studies in Kenya and surrounding countries have not yet included IN sequence data, and the ones that included IN sequence data did not find a high prevalence of INSTI DRMs [34,37,38].
In a recent study of a Congolese population, T97A, a variant conferring reduced susceptibility to INSTIs when acting synergistically with other INSTI mutations, was identified in 11% of the participants [25]. In our study, 12% of participants harbored the T97A variant (Table 3). R263K, a variant shown to be selected by DTG in vitro, was present in 2.3% of our sample with a mean quasispecies frequency of 34.9% (Figure 1), a startling prevalence considering that DTG is a relatively new drug and newer INSTIs have a high genetic barrier to resistance. Our results provide a rationale for increased surveillance of INSTI in this population and support the WHO's recent recommendations [19].
HIV mutates rapidly to escape host immune responses, and human leukocyte antigen (HLA) class I-restricted CD8+ T cell responses represent a major selective pressure driving and shaping HIV evolution (or mutations) in the absence of ART within the HIV-1 positive population [13,15,40]. As such, pre-ART DRMs could be introduced and maintained by HLA-restricted immunity [18]. In our study, we examined associations between predominant HLA alleles and pre-existing DRMs. Among a total of 53 potentially relevant correlations identified, 14 involved clinically important DRMs, encompassing NNRTIs, PIs, and INSTIs, although most of these associations were very weak. Only the T97A variant showed a significant association with HLA A*66:01 (adjusted p = 0.001) after correction for multiple tests. The failure of the other associations to reach significance is likely due to the reduced statistical power resulting from the low prevalence of these specific HLA alleles and DRMs in the participants. Thus, real associations might be hidden. This is supported by the fact that the E138K_RT mutation is associated with faster CD4 T cell decline in the Kaplan-Meier analysis (p = 0.002) (Figure 2). As E138K_RT is a well-known escape mutation [30], this suggests that HLA class I-restricted CD8+ CTL responses drove the E138K escape mutation. Indeed, E138K_RT is significantly associated with several high-frequency HLA class I genotypes such as A*68:02 (17.99%), B*45:01 (11%), and A*66:01 (7.5%). High-frequency HLA alleles, with >10% allele frequency in the general Kenyan population, that were associated with variants identified in our study include A*68:02 and C*17:01. Both of these HLA class I alleles are associated with the INSTI-related variant E138K (p = 0.048). E138K confers potentially low- to intermediate-level resistance to DTG alone and can act synergistically with other variants to confer greater resistance. A*66:01, a common allele in Kenya and Uganda, was strongly associated with the T97A variant (adjusted p = 0.001).
As mentioned above, T97A can act synergistically with other INSTI mutations to reduce susceptibility to all INSTI medications. Participants with the aforementioned HLA Class I alleles are more likely to harbor these drug-resistant variants, increasing the risk of virologic failure after prescribing medications with potentially reduced efficacy. Most importantly, these HLA alleles are enriched in African populations such as the Ghanaian and South African black populations (Table 3). Pre-treatment HLA testing could therefore provide patient-specific insight into potential DRMs. To our knowledge, this is the first study from Kenya to report HLA-associated HIV-1 variants in the RT, PR, and IN regions of pol in predominantly non-B subtypes. Several studies found associations between HLA alleles and drug-resistant variants, mainly in subtype B viruses from high-income countries [30,41]. More comparable to our study population, McCluskey et al. (2021) performed a study on individuals from Uganda and identified the INSTI variant L74I to be associated with HLA A*02, B*44:15, and C*04:07 in predominantly subtype A1 viruses [42]. In our study, the L74I variant was associated with HLA B*15:01 (p = 0.011), an association also found in subtype B viruses in Swiss and Australian cohorts [43]. Therefore, more studies are necessary to comprehensively analyze the role of HLA class I-restricted CD8+ T cell responses in developing pre-ART DRMs in different populations worldwide.
There are limitations to our study. The sample size of 266 participants is relatively small for examining over 1800 possible HLA-by-DRM associations. In addition, we did not compare the NGS results directly with Sanger sequencing, a currently accepted standard assay. There may be differences in the DRMs identified by the two assays. Although the pyrosequencing technology developed by 454 Life Sciences was the first NGS technology that provided high-throughput and high-quality sequences, Roche terminated support for the 454 FLX system in 2016 because of its relatively higher running cost and lower sequence output compared to other NGS platforms. Moreover, it is important to compare the DRMs identified in this ARV-naïve population with those in populations exposed to ART to evaluate the impact of pre-ART DRMs on the evolution of DRMs under ART.
In conclusion, analysis of viruses from ART-naïve individuals in this study cohort identified high levels of HIV drug resistance to all classes of antiretroviral drugs (NNRTIs, NRTIs, PIs, and INSTIs) included in the current first-line ART regimens in Africa. The development of drug resistance may be influenced by immunological pressure exerted by HLA class I-restricted CTL responses. Our findings suggest that HLA genotyping may help in selecting ARV drugs that are less likely to promote the development of DRMs.
Author Contributions: B.L. and M.L. designed the project, provided supervision, interpreted the data, and participated in the writing of the manuscript. F.A.P. established and supported Pumwani Sex Worker Cohort. J.K. recruited subjects, collected, and processed HIV samples. L.L. processed HIV samples. E.S. and R.S. amplified HIV-1 proviral DNAs and generated sequence data. R.W. analyzed the data, prepared the tables and figures, interpreted the data, and wrote the manuscript. R.B. performed the statistical analysis and interpreted the data. R.B. assisted with the statistical analysis, interpretation, and preparation of the manuscript. All authors have read and agreed to the published version of the manuscript.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: Raw data is available upon request.