Within-Subtype HIV-1 Polymorphisms and Their Impacts on Intact Proviral DNA Assay (IPDA) for Viral Reservoir Quantification

Arikatla, Mohith Reddy; Mathad, Jyoti S.; Reddy, Kavidha; Reddy, Nicole; Ndung’u, Thumbi; Dupnik, Kathryn M.; Lee, Guinevere Q.

doi:10.3390/v17111453

by Mohith Reddy Arikatla ¹, Jyoti S. Mathad ¹,², Kavidha Reddy ³, Nicole Reddy ³, Thumbi Ndung’u ³,⁴,⁵,⁶, Kathryn M. Dupnik ¹,²,† and Guinevere Q. Lee ¹,⁷,*,†

¹ Infectious Diseases Division, Department of Medicine, Weill Cornell Medical College, New York, NY 10065, USA
² Center for Global Health, Weill Cornell Medicine, New York, NY 10065, USA
³ Africa Health Research Institute, Durban 4001, South Africa
⁴ Division of Infection and Immunity, University College London (UCL), London WC1E 6JF, UK
⁵ HIV Pathogenesis Programme, The Doris Duke Medical Research Institute, University of KwaZulu-Natal, Durban 4001, South Africa
⁶ Ragon Institute of Massachusetts General Hospital, Massachusetts Institute of Technology, and Harvard University, Cambridge, MA 02139, USA
⁷ Department of Microbiology and Immunology, Weill Cornell Medical College, New York, NY 10065, USA
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.

Viruses 2025, 17(11), 1453; https://doi.org/10.3390/v17111453

Submission received: 6 October 2025 / Revised: 28 October 2025 / Accepted: 29 October 2025 / Published: 31 October 2025

(This article belongs to the Special Issue Intra-Patient Viral Evolution and Diversity)

Abstract

The Intact Proviral DNA Assay (IPDA) is widely used to quantify genome-intact HIV proviruses in people living with HIV, but viral sequence diversity has been observed to cause assay failures due to primer/probe mismatches. Adapted for subtype C, IPDA-BC is a modified version of the IPDA validated on South African HIV-1 subtype C. India is also impacted by subtype C, but IPDA performance within-subtype across geographical regions is not well studied. We analyzed Indian (IN) and South African (ZA) subtype C sequences in silico, hypothesizing that IPDA-BC may underperform with IN viruses. Primer/probe binding was predicted using three increasingly stringent nucleotide mismatch criteria, whose sensitivity and specificity were evaluated against experimental IPDA outcomes. Phylogenetic analyses confirmed that IN and ZA subtype C sequences form distinct clusters with significant compartmentalization (p < 0.003). Across criteria, up to 6–10% decreases in primer/probe binding were observed in IN versus ZA, with the env forward primer being the most affected. These criteria showed low sensitivity (18–53%) and variable specificity (67–100%) in predicting experimental outcomes. In conclusion, even within subtype, HIV-1 variation across geographical regions may impact IPDA performance, underscoring the need for improved predictive models to guide assay design for global HIV cure research.

Keywords: HIV proviruses; intact proviruses; HIV diversity; IPDA; intra-subtype variation; phylogenetics; PCR; India; subtype C

1. Introduction

The Intact Proviral DNA Assay (IPDA) [1] is a highly sensitive PCR-based molecular technique used to quantify the number of intact HIV proviruses in the cells of people living with HIV. This assay is crucial for HIV cure research because it distinguishes between genetically intact proviruses, which could potentially reactivate and produce infectious virus, and defective proviruses, which lack the ability to replicate [2]. Though the majority of integrated HIV is defective or incomplete, the intact proviruses pose a significant barrier to achieving a cure as they result in viral rebound if antiretroviral therapy (ART) is stopped.

Prior to IPDA, molecular methods like total HIV DNA quantification via quantitative PCR (qPCR) or single-target Droplet Digital PCR (ddPCR) were unable to differentiate between defective and intact proviruses [3], making it difficult to measure the true size of the replication-competent reservoir. Viral outgrowth assays and near-full-genome viral DNA sequencing more accurately assess provirus intactness [3], but these assays are labor-intensive and expensive. Additionally, viral outgrowth assays require a large number of cells, which precludes their use outside of specialized research centers. In contrast, IPDA provides a relatively precise and economical quantification of intact proviral genomes. IPDA has become instrumental for assessing the effectiveness of potential cure strategies aimed at reducing or eliminating the intact reservoirs, especially in more resource-limited settings. As a result, IPDA plays a key role in evaluating the success of potential HIV cure interventions, such as latency-reversing agents and gene therapies, by allowing researchers to monitor changes in the size of the intact reservoir over time in a high-throughput manner.

The original IPDA, hereafter referred to as IPDA-original, was developed based on subtype B HIV-1 sequences [1]. However, HIV-1 is genetically diverse, and even for subtype-B sequences, the IPDA-original primers and probes have a high failure rate due to viral polymorphisms: 28% in one study [4] and 18% in another [5]. Furthermore, multiple studies have shown that non-B HIV-1 subtypes, which make up over 90% of the HIV epidemic worldwide [6], have viral sequence polymorphisms that could lead to primer/probe mismatches and subsequent IPDA failures [4,7,8]. To address this issue, multiple research groups have adapted the IPDA-original for use in other subtypes. For example, a cross-subtype IPDA (CS-IPDA) was developed for subtypes A, B, C, D, and CRF01_AE [9]; a modified version of IPDA called IPDA-A1D was developed for subtype A1 and D [8]; and another modified version of IPDA was developed for subtypes B and C HIV-1 [10], hereafter referred to as IPDA-BC, which we use as an example for evaluation in this study.

IPDA-BC was designed based on 148 reference genomes from the 2018 Los Alamos HIV Sequence Compendium [10]. The primers and probes were bioinformatically validated with 2125 (randomly subset to 752) subtype B sequences derived from the USA SCOPE and OPTION cohorts obtained via the FLIPS near-full-length proviral genome sequencing approach [11], and 697 subtype C HIV-1 sequences derived from 24 South African FRESH (Females Rising through Education, Support, and Health) cohort participants obtained via the FLIP-seq near-full-length proviral genome sequencing approach [12,13]. IPDA-BC was further experimentally validated with Gblocks and five South African subtype C HIV-1 samples from the same FRESH cohort study participants. Unlike IPDA-original, neither the IPDA-BC psi probe nor its primers bind to the viral Major Splice Donor Site (MSD, HXB2 744), although the resulting psi amplicon does include the MSD site. Compared to IPDA-original, IPDA-BC shifts all psi primer and probe targets to more conserved genomic regions shared by both subtypes B and C HIV-1. For env forward, env reverse, and env-hypermutated primers/probes, IPDA-BC retained IPDA-original’s targeted genomic sites. The IPDA-BC env-intact probe introduces two additional bases at the 3’ end of IPDA-original’s env-intact probe, thus making it the same length as the env-hypermutated probe. A detailed comparison of the primers and probes from IPDA-original and IPDA-BC is shown in Supplementary Figure S1.

As mentioned above, IPDA-BC was primarily validated using South African subtype C HIV-1 sequences. However, India is also impacted predominantly by subtype C HIV-1 [14]. Due to the polymorphic nature of HIV and intra-subtype sequence differences across geographical regions, we hypothesize that Indian subtype C HIV-1 will be genetically distinct from South African subtype C HIV-1, and that IPDA-BC will exhibit a higher in silico failure rate for subtype C HIV-1 sequences isolated in India than those in South Africa. If confirmed, this would suggest that assays developed and validated on viral sequence diversity from a specific region should be adapted cautiously when applied elsewhere. This, to our knowledge, is the first study to evaluate whether identical viral subtypes circulating in geographically distinct regions would require further IPDA adaptations.

2. Methods

2.1. Phylogenetic Analysis

For the phylogenetic comparison of subtype C HIV-1 isolated in India (IN) and South Africa (ZA), we downloaded all available subtype C sequences in these regions from the Los Alamos HIV Sequence Database (available online at https://www.hiv.lanl.gov/content/sequence/HIV/mainpage.html (accessed on 18 June 2025)) [14] spanning HXB2 coordinates 400–1400 (covering IPDA-psi) and 7000–8000 (covering IPDA-env) (Figure 1). Sequence sets from each genomic region were subjected to multiple sequence alignment using Muscle (version 3.50.0, available online at https://bioconductor.org/packages/release/bioc/html/muscle.html (accessed on 18 June 2025)) [15] followed by neighbor-joining tree construction using the R package ape (version 5.8.1, available online at https://cran.r-project.org/web/packages/ape/index.html (accessed on 18 June 2025)) [16]. Formal statistical tests for compartmentalization were performed using the Slatkin-Maddison compartmentalization test implemented in the HyPhy package (version 2.5.74, available online at https://github.com/veg/hyphy (accessed on 18 June 2025)) [17].
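To make the logic of the compartmentalization test concrete, the following is a minimal Python sketch of a Slatkin-Maddison-style permutation test. It is not the HyPhy implementation used in this study. It assumes a rooted, strictly bifurcating tree supplied as nested tuples of tip names, counts the minimum number of region changes by Fitch parsimony, and compares that count with counts obtained after shuffling the region labels among the tips. The function names, the tree encoding, and the (hits + 1) / (n_perm + 1) p-value estimate are choices made for this illustration.

```python
import random


def fitch_changes(tree, labels):
    """Return (state_set, n_changes) for a rooted binary tree.

    `tree` is either a tip name (str) or a 2-tuple of subtrees;
    `labels` maps each tip name to its region (e.g. "IN" or "ZA").
    """
    if isinstance(tree, str):
        return {labels[tree]}, 0
    left, right = tree  # assumption: strictly bifurcating tree
    left_states, left_changes = fitch_changes(left, labels)
    right_states, right_changes = fitch_changes(right, labels)
    shared = left_states & right_states
    if shared:
        return shared, left_changes + right_changes
    # No shared state: one inferred migration event at this node.
    return left_states | right_states, left_changes + right_changes + 1


def tip_names(tree):
    """List the tip names of the tree from left to right."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in tip_names(child)]


def slatkin_maddison(tree, labels, n_perm=1000, seed=0):
    """Return (observed_changes, permutation_p_value)."""
    rng = random.Random(seed)
    observed = fitch_changes(tree, labels)[1]
    names = tip_names(tree)
    states = [labels[name] for name in names]
    hits = 0
    for _ in range(n_perm):
        shuffled = states[:]
        rng.shuffle(shuffled)
        permuted = dict(zip(names, shuffled))
        # Count permutations at least as compartmentalized as the data.
        if fitch_changes(tree, permuted)[1] <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy example with hypothetical tips: two perfectly separated clades.
toy_tree = ((("a", "b"), ("c", "d")), (("e", "f"), ("g", "h")))
toy_labels = {t: "IN" for t in "abcd"} | {t: "ZA" for t in "efgh"}
observed_changes, p_value = slatkin_maddison(toy_tree, toy_labels)
```

Fewer inferred migration events than expected under random labeling indicates compartmentalization; a small p-value means that few permutations were as well separated as the observed tree.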

2.2. In Silico Evaluation of the Compatibility of IPDA Primers/Probes per Geographical Region

All available subtype C HIV-1 sequences labeled with the country codes IN (India) and ZA (South Africa), and subtype B HIV-1 sequences labeled with country code US (United States) were downloaded from the Los Alamos HIV Database Search Interface [14] on 18 June 2025. Each of these sequences was queried against the HXB2 reference genome using the NCBI BLAST+ suite (version 2.15.0, available online at https://www.ncbi.nlm.nih.gov/books/NBK131777/ (accessed on 18 June 2025)) [18] and interrogated for the presence of IPDA-original and IPDA-BC primer/probe binding sequences (Supplementary Figure S1) using an in-house R algorithm (available online at https://github.com/guineverelee/HIVprimertestR (accessed on 18 June 2025)). We evaluated three different definitions to predict primer/probe binding failures due to sequence mismatches (Figure 2a). In our first definition (Definition #1), we used the same criteria for primer/probe mismatch as those published by Gaebler et al. [19], which were also used in the development of IPDA-BC [10]. In this definition, primer target regions were split evenly into two halves: a 5’ end and a 3’ end. If the primer target region had an odd number of nucleotides, the middle nucleotide was allocated to the 5’ end. Up to three single-nucleotide mismatches were allowed within the 5’ end, whereas a maximum of one mismatch was allowed in the 3’ end. Only one single-nucleotide mismatch was allowed for the probe region. In our second definition (Definition #2), we increased stringency by failing any sequences with mismatches in the last two bases of the 3’ end of the forward or reverse primers, because the absence of nucleotide mismatches at the 3’ end of a primer target region has been shown to be crucial for successful PCR amplification [20]. All other criteria remained the same as in Definition #1. In our last definition (Definition #3), only sequences with 100% sequence identity to the respective primer/probe were considered a pass. Definition #3 represents the minimum probable fraction of sequences that would successfully yield IPDA signals. All analyses were performed in R. All R scripts are publicly available (see GitHub link above). This script package is not restricted to IPDA evaluation; it can also be applied to any future HIV-1 primer/probe designs against any background population sequence datasets according to Definitions #1–3.
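The three definitions can be expressed compactly in code. The sketch below is an independent Python illustration of the rules as described in this section, not the authors' R package. It assumes that the oligo and the target site have already been aligned without gaps, have equal length, and are both written 5’ to 3’ in the orientation of the oligo; the function names and the example sequences are hypothetical.

```python
def mismatch_positions(oligo, target):
    """0-based positions at which target differs from oligo (ungapped)."""
    if len(oligo) != len(target):
        raise ValueError("oligo and target must have equal length")
    pairs = zip(oligo.upper(), target.upper())
    return [i for i, (a, b) in enumerate(pairs) if a != b]


def predicted_to_bind(oligo, target, kind, definition):
    """Apply Definition #1, #2, or #3 to one oligo/target pair.

    kind is "primer" or "probe"; definition is 1, 2, or 3.
    """
    mismatches = mismatch_positions(oligo, target)
    if definition == 3:
        # Definition #3: 100% identity required.
        return not mismatches
    if kind == "probe":
        # Definitions #1 and #2: at most one mismatch in a probe.
        return len(mismatches) <= 1
    n = len(oligo)
    five_prime_len = (n + 1) // 2  # middle base of an odd length goes 5'
    in_five = sum(1 for i in mismatches if i < five_prime_len)
    in_three = len(mismatches) - in_five
    passes = in_five <= 3 and in_three <= 1
    if definition == 2:
        # Definition #2: no mismatch in the last two 3' bases.
        passes = passes and all(i < n - 2 for i in mismatches)
    return passes


# Hypothetical 11-nt primer and targets (not real IPDA sequences).
primer = "ACGTACGTACG"
target_3prime_mismatch = "ACGTACGTACA"   # one mismatch at the 3' terminus
target_5prime_mismatches = "TGCTACGTACG"  # three mismatches in the 5' half
results = {
    d: predicted_to_bind(primer, target_3prime_mismatch, "primer", d)
    for d in (1, 2, 3)
}
```

In this toy case the target with a single 3’-terminal mismatch passes Definition #1 but fails Definitions #2 and #3, which is the distinction the added stringency is meant to capture.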

3. Results

3.1. Subtype C HIV-1 Isolated from India (IN) and South Africa (ZA) Were Genetically Distinct

From the Los Alamos HIV Sequence Database [14], we retrieved 10 IN and 29 ZA subtype C sequences spanning HXB2 coordinates 400–1400 (covering IPDA-psi), and 746 IN and 8990 ZA subtype C sequences spanning HXB2 7000–8000 (covering IPDA-env). Phylogenetic analyses showed that the IN (subtype C) and ZA (subtype C) isolates separated into their own respective clusters (Figure 1 and Supplementary Figure S2). Significant geographical compartmentalization was confirmed by Slatkin-Maddison compartmentalization tests (psi p < 0.003 and env p < 0.001). These results suggest that a subtype-specific IPDA validated on South African subtype C HIV-1 sequences should be evaluated before being applied to Indian subtype C HIV-1 samples.

3.2. In Silico Analysis Predicted That IPDA-BC Primers/Probes Were 6–10% More Likely to Fail in IN than ZA Samples

As described in Section 2, we used three different definitions to predict whether a primer/probe would bind (Figure 2a). Definition #1 was previously published and has been used in multiple IPDA-related development studies [10,19]. Referring to Figure 2b,c, the IPDA-original design was predicted to perform poorly in the psi probe region, especially in IN (C) and ZA (C) sequences, whereas the IPDA-BC psi probe design would rescue almost all sequences from IN (C), ZA (C), and US (B). Using this definition, IPDA-BC would result in a maximum 6% increase in binding failures if used for IN (C) samples (the env forward primer had the maximum reduction, from 96% in ZA (C) to 90% in IN (C)). Definition #2 is a more stringent version of Definition #1 with an added requirement that the last two bases at the 3’ end of a primer target site must be identical to the primer sequence. As shown in Figure 2d,e, IPDA-BC remained more appropriate for samples across the three geographical regions relative to IPDA-original. Using this definition, IPDA-BC would result in a maximum 6% increase in binding failures if used for IN (C) samples (the env forward primer had the maximum reduction, from 95% in ZA (C) to 89% in IN (C)). Definition #3 is the most stringent and requires 100% sequence identity against each primer or probe. As shown in Figure 2f,g, ≤1% of IN (C) and ZA (C) sequences had complete sequence identity against the IPDA-original design at the psi probe and env forward primer target sites, whereas the IPDA-BC design significantly improved the inferred success rate. Using this definition, IPDA-BC would result in a maximum of 10% more binding failures if used for IN (C) samples (the env forward primer had the maximum reduction, from 63% in ZA (C) to 53% in IN (C)). The use of Definition #3 also revealed the minimum probable fraction of sequences within a population that were associated with binding successes. At minimum, IPDA-BC primers/probes exhibited 100% sequence identity with only 53% of IN (C) (env forward), 63% of ZA (C) (env forward), and 64% of US (B) (env forward) sequences in this analysis. The exact types of mismatch between target sequences and primers/probes are summarized in Supplementary Figure S3a,b.

The analysis shown in Figure 2 was not adjusted for donor-specific over-representation due to multiple sequences derived from the same donor within the query dataset, and query sequences were not screened for mapping abnormalities such as presence of large gaps. Therefore, we then repeated the analyses with a quality filter on the target sequences to remove mappings that were either <50% or >150% of the primer/probe sequence length and sequences that exhibit <50% sequence identity with the respective primer/probe (Supplementary Figure S4a). In addition to the quality filter, we also generated a donor-wise majority consensus sequence in cases where donor information was available (Supplementary Figure S4b). Both additional analyses led to similar conclusions as the first analysis: IPDA-BC showed a significant improvement from IPDA-original for both IN (C) and ZA (C) samples. With the sequence quality filter alone, IN (C) samples were predicted to have a 6–10% reduction in capture relative to ZA (C) samples, whereas applying the sequence quality filter with donor-wise consensus predicted a 2–9% reduction. All plotting data and the number of query sequences analyzed for each geographic region are available in Supplementary Table S1a,b.
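The quality filter and the donor-wise majority consensus described above can be sketched as follows. This is an illustrative Python version under simplifying assumptions, not the authors' R code: identity is computed position by position over the oligo length without gaps, sequences from the same donor are assumed to be pre-aligned and of equal length, and ties at a position are resolved in favor of the base seen first. All function names and example records are hypothetical.

```python
from collections import Counter, defaultdict


def passes_quality_filter(oligo, hit):
    """Keep a mapped target only if its length is 50-150% of the oligo
    length and it shares at least 50% identity with the oligo."""
    length_ratio = len(hit) / len(oligo)
    if length_ratio < 0.5 or length_ratio > 1.5:
        return False
    # Assumption: ungapped, position-wise identity over the oligo length.
    matches = sum(a == b for a, b in zip(oligo.upper(), hit.upper()))
    return matches / len(oligo) >= 0.5


def majority_consensus(sequences):
    """Most frequent base at each column of equal-length sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to equal length")
    # Ties go to the base encountered first (an arbitrary choice here).
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*sequences)
    )


def donor_consensus(records):
    """Collapse (donor_id, sequence) records to one consensus per donor."""
    by_donor = defaultdict(list)
    for donor_id, sequence in records:
        by_donor[donor_id].append(sequence)
    return {d: majority_consensus(seqs) for d, seqs in by_donor.items()}


# Hypothetical records: donor d1 contributes three sequences, d2 one.
toy_records = [
    ("d1", "ACGT"),
    ("d1", "ACGA"),
    ("d1", "ACGT"),
    ("d2", "TTTT"),
]
consensus_by_donor = donor_consensus(toy_records)
```

Collapsing to one sequence per donor prevents heavily sampled donors from dominating the predicted binding rates, at the cost of hiding minority variants within each donor.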

Since IPDA-BC was designed to incorporate both subtype-B and subtype-C-specific primers into the same reaction (Supplementary Figure S1), all the above analyses were performed with the assumption that both B and C primers were present in the ddPCR reaction setup. We performed an additional sub-analysis on a subtype-matched ddPCR reaction setup scenario. Our analyses revealed that a subtype-matched approach further reduced binding success rate in all US (B), IN (C), and ZA (C) (Supplementary Figure S5a–c). For example, according to Definition #3, not using the IPDA-BC psi reverse-B primer for IN (C) and ZA (C) reduced the capture by 11% and 35%, respectively. Similarly, not using IPDA-BC psi reverse-C for US (B) reduced the minimum predicted capture fraction by 15%. All plotting data and the number of query sequences associated with this set of analyses are available in Supplementary Table S2a,b.

3.3. Ability to Distinguish Between Hypermutated and Non-Hypermutated Genomes

Next, we evaluated whether the IPDA-original and IPDA-BC env-hypermutated probe designs were likely able to distinguish between APOBEC-3G/3F-mediated hypermutated proviral genomes versus non-hypermutated ones. By design [1,10], the env-hypermutated probe serves as a competing non-fluorescent probe with nucleotides “A and A” in the designated hypermutation sites as opposed to having “G and G” nucleotides in the env-intact fluorescent probe (Supplementary Figure S1, red fonts). For each geographical region, we quantified the number of query sequences with ≤1 single-nucleotide mismatch that also naturally contain “G and A” or “A and G” in these two positions and would qualify for successful probe binding with both env-hypermutated and env-intact probes according to Definitions #1 and #2. For both Definitions #1 and #2, approximately 2% of IN (C), 8% of ZA (C), and 3% of US (B) sequences had “G and A” or “A and G” genotypes at these positions, representing the fraction of sequences in these populations for which the env-hypermutated probe would be unlikely to function as an effective competitor to distinguish env-hypermutated genomes. Since Definition #3 requires zero mismatches, no queries with “G and A” or “A and G” at these positions would qualify for either of the env probes.
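The reasoning behind this count can be illustrated with a short Python sketch. A target carrying “G and A” or “A and G” at the two hypermutation sites differs by exactly one base from each probe, so under the one-mismatch probe rule of Definitions #1 and #2 it qualifies for both probes, provided it has no other mismatches. The probe sequences below are invented placeholders that differ at two positions; they are not the IPDA probe sequences, and the function names are hypothetical.

```python
def count_mismatches(oligo, target):
    """Number of differing positions between two equal-length sequences."""
    if len(oligo) != len(target):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(oligo.upper(), target.upper()))


def binds_both_probes(intact_probe, hyper_probe, target, max_mismatches=1):
    """True if the target satisfies the probe rule for both env probes,
    i.e. the hypermutated probe cannot act as a clean competitor."""
    return (
        count_mismatches(intact_probe, target) <= max_mismatches
        and count_mismatches(hyper_probe, target) <= max_mismatches
    )


# Placeholder probes: G/G in the intact probe, A/A in the competitor.
intact_probe = "ACTGCAGTCA"
hyper_probe = "ACTACAATCA"

target_gg = "ACTGCAGTCA"  # non-hypermutated genotype
target_ga = "ACTGCAATCA"  # mixed genotype, one mismatch to each probe
target_aa = "ACTACAATCA"  # hypermutated genotype

ambiguous = {
    "GG": binds_both_probes(intact_probe, hyper_probe, target_gg),
    "GA": binds_both_probes(intact_probe, hyper_probe, target_ga),
    "AA": binds_both_probes(intact_probe, hyper_probe, target_aa),
}
```

Setting `max_mismatches=0` reproduces the Definition #3 situation, in which a mixed genotype qualifies for neither probe.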

3.4. Performance Evaluation of the Three Mismatch Definitions Used in This Study

In this study, we based our evaluation of whether a primer or probe would successfully bind to a target sequence on the criteria previously employed for IPDA-BC development (Definition #1), and we further increased the stringency of the passing criteria in Definitions #2 and #3. However, PCR success for HIV amplification is known to depend on additional factors such as the nature of the nucleotide mismatch and its relative position in the primer/probe, as exemplified by the study by Kinloch et al. [4], where IPDA failures were not associated with any obvious mismatch patterns that would collectively predict them. In this context, we evaluated the sensitivity and specificity of Definitions #1, #2, and #3 in predicting experimental IPDA successes and failures using 23 samples with matched IPDA results and viral sequence data previously published by the eCLEAR clinical trial [21] (Supplementary Figure S6). Definitions #1, #2, and #3 showed 53%, 47%, and 18% sensitivity in predicting IPDA experimental success, and 67%, 67%, and 100% specificity, respectively. In a separate experiment, IPDA-BC was applied to South African subtype C HIV-1 samples derived from five FRESH cohort donors (Supplementary Figure S7) and resulted in no failures, as expected. When applied to donor-matched consensus sequences [13], Definitions #1, #2, and #3 showed 60%, 60%, and 0% sensitivity in predicting IPDA-BC experimental success for these samples. Specificity could not be evaluated because none of the samples failed IPDA-BC.
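For clarity, the sensitivity and specificity calculation can be written out as a small Python sketch. Here an experimental IPDA success is treated as the positive class, so sensitivity is the fraction of experimental successes that were predicted to bind, and specificity is the fraction of experimental failures that were predicted to fail. The function name is hypothetical, and the example vectors are illustrative: they only mirror the counts implied for the five FRESH samples under Definition #1 (five successes, 60% predicted to pass), not the actual per-sample results.

```python
def sensitivity_specificity(predicted_pass, observed_pass):
    """Return (sensitivity, specificity); None where undefined.

    predicted_pass: in silico prediction per sample (True = binds)
    observed_pass:  experimental outcome per sample (True = IPDA signal)
    """
    if len(predicted_pass) != len(observed_pass):
        raise ValueError("inputs must have equal length")
    pairs = list(zip(predicted_pass, observed_pass))
    true_pos = sum(1 for p, o in pairs if p and o)
    false_neg = sum(1 for p, o in pairs if not p and o)
    true_neg = sum(1 for p, o in pairs if not p and not o)
    false_pos = sum(1 for p, o in pairs if p and not o)
    n_success = true_pos + false_neg
    n_failure = true_neg + false_pos
    sensitivity = true_pos / n_success if n_success else None
    specificity = true_neg / n_failure if n_failure else None
    return sensitivity, specificity


# Illustrative vectors: all five samples succeeded experimentally and
# three were predicted to pass (sample order is arbitrary).
observed = [True, True, True, True, True]
predicted = [True, True, True, False, False]
fresh_like = sensitivity_specificity(predicted, observed)
```

Because no sample failed experimentally in this example, the specificity denominator is zero and the function returns None for it, matching the statement that specificity could not be evaluated for the FRESH samples.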

4. Discussion

HIV-1 is genetically diverse. Most PCR-based molecular biology assays, sequencing techniques, and bioinformatics methods for viral reservoir characterization in clinical samples developed based on our understanding of the sequence diversity in one HIV-1 subtype (often subtype B) must be adapted for application to non-B HIV-1 subtypes. Importantly, even within a given subtype, viral sequences exhibit significant diversity. In this study, we used subtype C HIV-1 isolated in India versus South Africa as an example to conduct an in silico evaluation of whether an assay such as IPDA-BC, which was validated largely based on South African subtype C HIV-1 strains, would require further assay adaptations before its application to subtype C strains in India. Based on our evaluation, we concluded that the possibility of a signal failure for IPDA-BC would likely marginally increase for Indian subtype C samples. However, our conclusion is limited by the relatively poor predictive values of these three criteria.

Despite the poor predictive values associated with our Definitions #1–3, we nevertheless showed that in a curated, centralized HIV genomics database, Indian subtype C HIV-1 sequences were genetically distinct from South African subtype C HIV-1 sequences, showing an up to 6% decrease in predicted primer/probe capture according to Definitions #1 and #2 and an up to 10% decrease according to Definition #3. Our results suggest that for Indian HIV-1 reservoir studies, instead of adopting the IPDA-BC approach as-is, researchers may need to first survey viral sequence diversity within a cohort of interest by sequencing the virus, then further adapt the primers/probes for the viral polymorphisms found in the respective geographic region. These will need to be periodically re-evaluated because of shifting epidemiology and the emergence of circulating recombinant strains. Globally, circulating recombinant strains account for about a third of HIV infections, followed in prevalence by subtype C at 23% [6]. Subtype C predominates in eastern and southern Africa and is about 90% of the HIV-1 circulating in India [22]. Subtype C can emerge as the dominant strain at regional levels as well, as has been seen in southern Brazil [23]. HIV-1 sequence diversity will likely be a continuous challenge when adapting IPDA and similar PCR-based reservoir quantification assays to different HIV-1 subtypes globally. We expect the same challenges will be faced by similar assays such as ddPCR-based HIV-1 RNA transcript profiling assays [24,25].

Our observations also highlight the importance of using IPDA-BC with both subtype B and C primers in the ddPCR reaction as-is: We showed that modifying IPDA-BC by removing the subtype B primers in the ddPCR reaction mix when applying the assay to subtype C samples reduced the predicted capture success. Likewise, modifying IPDA-BC by removing the subtype C primers in the ddPCR reaction mix when applying the assay to subtype B samples also reduced the predicted capture success.

Based on our findings in this study, the higher rate of sequence mismatches with primers/probes could result in an underestimation of total and intact proviral load. Consequently, there may be a higher proportion of negative results (assay failure) in clinical subtype C samples obtained from India compared to those from South Africa. However, the magnitude of underestimation is likely to vary across donors and geographic regions. For example, even within the same region, proviral sequence diversity is expected to be higher in individuals who experienced prolonged viremia than in those who initiated ART during acute infection (e.g., FRESH cohort [13,26] and the cohort in [27]). This is further complicated by the degree of nucleotide mismatches between the viral strain within each individual and the IPDA-BC primers/probes. Our findings may also have implications for IPDA-derived intact/defective proviral genome ratio values. For instance, if more intact viruses contain env regions with primer/probe mismatches in Indian samples relative to South African samples but this bias is absent in defective genomes, there would be an inflation of the defective/intact proviral genome ratio in Indian samples. To address this concern, large-scale full-length proviral sequencing of Indian samples is required; however, such a database does not currently exist. We therefore urge future studies to generate such data. These considerations reflect the inherent nature of the IPDA design: the presence of double-positive psi and env signals represents a high likelihood of genome-intact proviruses, whereas the absence of a signal may indicate either true absence or primer/probe mismatches [28]. In this context, the assay is best suited for longitudinal monitoring of changes in intact reservoir size within an individual, as reservoir diversity is expected to remain relatively comparable across time points within the same donor on suppressive therapy. Accordingly, IPDA double-positive signals should be interpreted as the minimal estimate of intact proviral load.

Our study is foremost limited by the poor sensitivity and specificity of the prediction criteria we used. As expected, all cases of eCLEAR trial experimental IPDA failures occurred when a target sequence had less than 100% sequence identity relative to the primers/probes (Supplementary Figure S7), supporting that Definition #3 represents the probable minimal passing frequency. Despite the absence of failures in the FRESH cohort (the original samples used to validate IPDA-BC), our in silico Definition #1 prediction achieved only 60% sensitivity. We also recognize that our approach to evaluating the sensitivity and specificity of the in silico Definitions used to predict primer/probe binding may be influenced by the use of viral sequences obtained from pre-ART plasma samples of eCLEAR cohort donors. Specifically, sequences from pre-ART plasma may represent only a subset of the proviral pool during suppressive ART [29]. Furthermore, we used consensus sequences derived from donor-specific viral pools for the sensitivity and specificity evaluations. Since consensus sequences represent the most frequently observed bases at each nucleotide position and may miss minority variants, our sensitivity and specificity estimation should be considered a minimal estimate.

Despite these caveats, our observations revealed the limitations of simple in silico prediction algorithms for PCR success, especially when the target is as genetically diverse as HIV. In this study, we attempted to predict IPDA experimental success based on nucleotide mismatch count, but PCR success is also influenced by a range of other factors such as annealing temperature and ionic concentration [30]. Since some current assays were designed based on Definition #1, our findings highlight the urgent need for the development of better predictive algorithms. Further studies should be conducted using large-scale datasets with viral sequences from diverse HIV-1 subtypes with matching IPDA pass/fail data, followed by machine learning strategies to develop more accurate in silico prediction algorithms.

In addition to the limited sensitivity and specificity values discussed above, our conclusions are further limited by the following factors. First, our findings are limited by the small number of Indian subtype C viral sequences available through the Los Alamos HIV Sequence Database [14], especially at the psi region (Supplementary Table S1). The minimum number of Indian psi sequences analyzed was 73, 73, and 34 in Analyses 1, 2, and 3, respectively, underscoring substantial gaps in our understanding of HIV sequence polymorphisms in India. Second, the dataset available in the Los Alamos HIV Sequence Database may not represent contemporary viral strains circulating in India. Third, we were unable to evaluate the sensitivity and specificity of IPDA-BC on subtype C samples from India due to the lack of IPDA data with paired near-full-length proviral sequence data from which genome intactness can be determined. This limitation prevented us from determining whether IPDA-BC introduces systematic bias or misclassification of intact and defective genomes in Indian samples. To address these gaps, future efforts should prioritize expanding sequencing coverage of underrepresented genomic regions from subtype C HIV in India, as well as generating paired IPDA and proviral sequence data.

In conclusion, primer and probe design will remain a persistent challenge in PCR-based assay development due to the extensive sequence diversity of HIV-1. Although both India and South Africa are affected by subtype C HIV-1, based on the data available in the Los Alamos HIV Sequence Database [14], our findings demonstrate that the circulating strains in these regions are genetically distinct. Assays such as IPDA-BC should thus be applied with caution and appropriate customizations when used in genetically diverse populations.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/v17111453/s1, Figure S1: Relative locations of IPDA-original versus IPDA-BC primer and probe sequences in HXB2 coordinates; Figure S2: Bootstrap polar trees of Indian (IN) and South African (ZA) Subtype C HIV-1 env sequences; Figure S3a: IPDA-BC psi - Sequence diversity in the top ten most prevalent sequences (Analysis #1); Figure S3b: IPDA-BC env - Sequence diversity in the top ten most prevalent sequences (Analysis #1); Figure S4a: Analysis #2: In silico successful binding rate of IPDA-original and IPDA-BC primer/probes against Indian (IN) and South African (ZA) subtype C HIV-1 sequences, with quality control filtering (QC) and without donor-wise consensus sequence generation; Figure S4b: Analysis #3: In silico successful binding rate of IPDA-original and IPDA-BC primer/probes against Indian (IN) and South African (ZA) subtype C HIV-1 sequences, with quality control filtering (QC) and with donor-wise consensus sequence generation; Figure S5a: Analysis 1: Subanalysis of Figure 2; Figure S5b: Analysis 2: Subanalysis of Supplementary Figure S4a; Figure S5c: Analysis 3: Subanalysis of Supplementary Figure S4b; Figure S6: Sensitivity and specificity of the three in silico prediction algorithms used in this study; Figure S7: Sensitivity and specificity of the three in silico prediction algorithms used in this study; Table S1a. Analysis #1: Primary analysis with no sequence quality filter nor donor-wise-consensus; Table S1a. Analysis #2: Secondary analysis with sequence quality filter; Table S1a. Analysis #3: Secondary analysis with sequence quality filter and generation of donor-wise consensus; Table S1b. Analysis #1: Primary analysis with no sequence quality filter nor donor-wise-consensus; Table S1b. Analysis #2: Secondary analysis with sequence quality filter; Table S1b. 
Analysis #3: Secondary analysis with sequence quality filter and generation of donor-wise consensus; Table S2a. Analysis 1: Primary analysis with no sequence quality filter nor donor-wise-consensus; Table S2a. Analysis 2: Secondary analysis with sequence quality filter; Table S2a. Analysis 3: Secondary analysis with sequence quality filter and generation of donor-wise consensus; Table S2b. Analysis 1: Primary analysis with no sequence quality filter nor donor-wise-consensus; Table S2b. Analysis 2: Secondary analysis with sequence quality filter; Table S2b. Analysis 3: Secondary analysis with sequence quality filter and generation of donor-wise consensus.

Author Contributions

Conceptualization, J.S.M., K.M.D. and G.Q.L.; methodology, M.R.A., K.M.D. and G.Q.L.; software, M.R.A. and G.Q.L.; validation, M.R.A., K.R., N.R., K.M.D. and G.Q.L.; formal analysis, M.R.A. and G.Q.L.; investigation, M.R.A., J.S.M., K.M.D. and G.Q.L.; resources, T.N. and G.Q.L.; data curation, M.R.A., K.R., N.R. and G.Q.L.; writing—original draft preparation, M.R.A., K.M.D. and G.Q.L.; writing—review and editing, M.R.A., J.S.M., K.R., N.R., T.N., K.M.D. and G.Q.L.; visualization, M.R.A. and G.Q.L.; supervision, G.Q.L.; project administration, K.M.D. and G.Q.L.; funding acquisition, G.Q.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by NIH grants R21AI150398 (GQL), R01AI162221 (GQL), and UM1AI164565 (GQL).

Data Availability Statement

HIV-1 subtype C sequences from India (IN) and South Africa (ZA) and subtype B sequences from the USA (US) were downloaded from the Los Alamos HIV sequence database [https://www.hiv.lanl.gov (accessed on 18 June 2025)] [14]. All plotting data and analyses presented in this study are included in the article and Supplementary Materials. The experimental IPDA-BC results and the donor-wise consensus sequences used to evaluate the sensitivity and specificity of Definitions #1–3 were obtained from the Supplementary Materials of the eCLEAR trial publication [21]. Viral sequence data associated with the FRESH cohort were previously published in Reddy et al. [13]. Further inquiries can be directed to the corresponding author.

Acknowledgments

We thank Ole Søgaard, Katie Fisher, Zabrina Brumme, and Natalie Kinloch for providing viral sequence data and experimental IPDA data from the eCLEAR trial.

Conflicts of Interest

All authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Bruner, K.M.; Wang, Z.; Simonetti, F.R.; Bender, A.M.; Kwon, K.J.; Sengupta, S.; Fray, E.J.; Beg, S.A.; Antar, A.A.R.; Jenike, K.M.; et al. A Quantitative Approach for Measuring the Reservoir of Latent HIV-1 Proviruses. Nature 2019, 566, 120–125.
2. Ho, Y.C.; Shan, L.; Hosmane, N.N.; Wang, J.; Laskey, S.B.; Rosenbloom, D.I.S.; Lai, J.; Blankson, J.N.; Siliciano, J.D.; Siliciano, R.F. Replication-Competent Noninduced Proviruses in the Latent Reservoir Increase Barrier to HIV-1 Cure. Cell 2013, 155, 540.
3. Lee, G.Q. Chemistry and Bioinformatics Considerations in Using Next-Generation Sequencing Technologies to Inferring HIV Proviral DNA Genome-Intactness. Viruses 2021, 13, 1874.
4. Kinloch, N.N.; Ren, Y.; Conce Alberto, W.D.; Dong, W.; Khadka, P.; Huang, S.H.; Mota, T.M.; Wilson, A.; Shahid, A.; Kirkby, D.; et al. HIV-1 Diversity Considerations in the Application of the Intact Proviral DNA Assay (IPDA). Nat. Commun. 2021, 12, 165.
5. Gaebler, C.; Falcinelli, S.D.; Stoffel, E.; Read, J.; Murtagh, R.; Oliveira, T.Y.; Ramos, V.; Lorenzi, J.C.C.; Kirchherr, J.; James, K.S.; et al. Sequence Evaluation and Comparative Analysis of Novel Assays for Intact Proviral HIV-1 DNA. J. Virol. 2021, 95, e01986-20.
6. Williams, A.; Menon, S.; Crowe, M.; Agarwal, N.; Biccler, J.; Bbosa, N.; Ssemwanga, D.; Adungo, F.; Moecklinghoff, C.; Macartney, M.; et al. Geographic and Population Distributions of Human Immunodeficiency Virus (HIV)-1 and HIV-2 Circulating Subtypes: A Systematic Literature Review and Meta-Analysis (2010–2021). J. Infect. Dis. 2023, 228, 1583–1591.
7. Gunst, J.D.; Højen, J.F.; Pahus, M.H.; Rosás-Umbert, M.; Stiksrud, B.; McMahon, J.H.; Denton, P.W.; Nielsen, H.; Johansen, I.S.; Benfield, T.; et al. Impact of a TLR9 Agonist and Broadly Neutralizing Antibodies on HIV-1 Persistence: The Randomized Phase 2a TITAN Trial. Nat. Med. 2023, 29, 2547–2558.
8. Lee, G.Q.; Khadka, P.; Gowanlock, S.N.; Copertino, D.C.; Duncan, M.C.; Omondi, F.H.; Kinloch, N.N.; Kasule, J.; Kityamuweesi, T.; Buule, P.; et al. HIV-1 Subtype A1, D, and Recombinant Proviral Genome Landscapes during Long-Term Suppressive Therapy. Nat. Commun. 2024, 15, 5480.
9. Cassidy, N.A.J.; Fish, C.S.; Levy, C.N.; Roychoudhury, P.; Reeves, D.B.; Hughes, S.M.; Schiffer, J.T.; Benki-Nugent, S.; John-Stewart, G.; Wamalwa, D.; et al. HIV Reservoir Quantification Using Cross-Subtype Multiplex ddPCR. iScience 2021, 25, 103615.
10. Buchholtz, N.V.E.J.; Nühn, M.M.; de Jong, T.C.M.; Stienstra, T.A.T.; Reddy, K.; Ndung’u, T.; Ndhlovu, Z.M.; Fisher, K.; Palmer, S.; Wensing, A.M.J.; et al. Development of a Highly Sensitive and Specific Intact Proviral DNA Assay for HIV-1 Subtype B and C. Virol. J. 2024, 21, 36.
11. Hiener, B.; Horsburgh, B.A.; Eden, J.-S.; Barton, K.; Schlub, T.E.; Lee, E.; von Stockenstrom, S.; Odevall, L.; Milush, J.M.; Liegler, T.; et al. Identification of Genetically Intact HIV-1 Proviruses in Specific CD4+ T Cells from Effectively Treated Participants. Cell Rep. 2017, 21, 813–822.
12. Lee, G.Q.; Orlova-Fink, N.; Einkauf, K.; Chowdhury, F.Z.; Sun, X.; Harrington, S.; Kuo, H.-H.; Hua, S.; Chen, H.-R.; Ouyang, Z.; et al. Clonal Expansion of Genome-Intact HIV-1 in Functionally-Polarized Th1 CD4 T Cells. J. Clin. Investig. 2017, 127, 2689–2696.
13. Reddy, K.; Lee, G.Q.; Reddy, N.; Chikowore, T.J.; Baisley, K.; Dong, K.L.; Walker, B.D.; Yu, X.G.; Lichterfeld, M.; Ndung’u, T. Differences in HIV-1 Reservoir Size, Landscape Characteristics, and Decay Dynamics in Acute and Chronic Treated HIV-1 Clade C Infection. eLife 2025, 13, RP96617.
14. Los Alamos National Laboratory. Los Alamos HIV Sequence Database. Available online: http://www.hiv.lanl.gov/ (accessed on 20 August 2024).
15. Edgar, R.C. MUSCLE: Multiple Sequence Alignment with High Accuracy and High Throughput. Nucleic Acids Res. 2004, 32, 1792–1797.
16. Paradis, E.; Schliep, K. ape 5.0: An Environment for Modern Phylogenetics and Evolutionary Analyses in R. Bioinformatics 2019, 35, 526–528.
17. Kosakovsky Pond, S.L.; Poon, A.F.Y.; Velazquez, R.; Weaver, S.; Hepler, N.L.; Murrell, B.; Shank, S.D.; Magalis, B.R.; Bouvier, D.; Nekrutenko, A.; et al. HyPhy 2.5—A Customizable Platform for Evolutionary Hypothesis Testing Using Phylogenies. Mol. Biol. Evol. 2019, 37, 295.
18. Camacho, C.; Coulouris, G.; Avagyan, V.; Ma, N.; Papadopoulos, J.; Bealer, K.; Madden, T.L. BLAST+: Architecture and Applications. BMC Bioinform. 2009, 10, 421.
19. Gaebler, C.; Lorenzi, J.; Oliveira, T.; Nogueira, L.; Ramos, V.; Lu, C.; Pai, J.; Mendoza, P.; Jankovic, M.; Caskey, M.; et al. Combination of Quadruplex qPCR and Next-Generation Sequencing for Qualitative and Quantitative Analysis of the HIV-1 Latent Reservoir. J. Exp. Med. 2019, 216, 2253–2264.
20. Simsek, M.; Adnan, H. Effect of Single Mismatches at 3′–End of Primers on Polymerase Chain Reaction. J. Sci. Res. Med. Sci. 2000, 2, 11.
21. Gunst, J.D.; Pahus, M.H.; Rosás-Umbert, M.; Lu, I.N.; Benfield, T.; Nielsen, H.; Johansen, I.S.; Mohey, R.; Østergaard, L.; Klastrup, V.; et al. Early Intervention with 3BNC117 and Romidepsin at Antiretroviral Treatment Initiation in People with HIV-1: A Phase 1b/2a, Randomized Trial. Nat. Med. 2022, 28, 2424.
22. Neogi, U.; Bontell, I.; Shet, A.; de Costa, A.; Gupta, S.; Diwan, V.; Laishram, R.S.; Wanchu, A.; Ranga, U.; Banerjea, A.C.; et al. Molecular Epidemiology of HIV-1 Subtypes in India: Origin and Evolutionary History of the Predominant Subtype C. PLoS ONE 2012, 7, e39819.
23. Souto, B.; Triunfante, V.; Santos-Pereira, A.; Martins, J.; Araújo, P.M.M.; Osório, N.S. Evolutionary Dynamics of HIV-1 Subtype C in Brazil. Sci. Rep. 2021, 11, 23060.
24. Yukl, S.A.; Kaiser, P.; Kim, P.; Telwatte, S.; Joshi, S.K.; Vu, M.; Lampiris, H.; Wong, J.K. HIV Latency in Isolated Patient CD4⁺ T Cells May Be Due to Blocks in HIV Transcriptional Elongation, Completion, and Splicing. Sci. Transl. Med. 2018, 10, eaap9927.
25. Stevenson, E.M.; Terry, S.; Copertino, D.; Leyre, L.; Danesh, A.; Weiler, J.; Ward, A.R.; Khadka, P.; McNeil, E.; Bernard, K.; et al. SARS-CoV-2 mRNA Vaccination Exposes Latent HIV to Nef-Specific CD8+ T-Cells. Nat. Commun. 2022, 13, 4888.
26. Lee, G.Q.; Reddy, K.; Einkauf, K.B.; Gounder, K.; Chevalier, J.M.; Dong, K.L.; Walker, B.D.; Yu, X.G.; Ndung’u, T.; Lichterfeld, M. HIV-1 DNA Sequence Diversity and Evolution during Acute Subtype C Infection. Nat. Commun. 2019, 10, 2737.
27. Palma, P.; Zangari, P.; Alteri, C.; Tchidjou, H.K.; Manno, E.C.; Liuzzi, G.; Perno, C.F.; Rossi, P.; Bertoli, A.; Bernardi, S. Early Antiretroviral Treatment (EART) Limits Viral Diversity over Time in a Long-Term HIV Viral Suppressed Perinatally Infected Child. BMC Infect. Dis. 2016, 16, 742.
28. Lee, G.Q. A Daisy Chain of Inferences: The Role of Single-Cell and Single-Genome Proviral Sequencing in Characterizing HIV-1 Reservoirs. Curr. Opin. HIV AIDS 2025, 20, 512–517.
29. Kinloch, N.N.; Shahid, A.; Dong, W.; Kirkby, D.; Jones, B.R.; Beelen, C.J.; MacMillan, D.; Lee, G.Q.; Mota, T.M.; Sudderuddin, H.; et al. HIV Reservoirs Are Dominated by Genetically Younger and Clonally Enriched Proviruses. mBio 2023, 14, e02417-23.
30. Bustin, S.A.; Mueller, R.; Nolan, T. Parameters for Successful PCR Primer Design. Methods Mol. Biol. 2020, 2065, 5–22.

Figure 1. Indian and South African Subtype C HIV-1 are phylogenetically distinct. HIV-1 subtype C sequences associated with India (IN) and South Africa (ZA) were downloaded from the Los Alamos HIV Sequence Database [14], spanning HXB2 coordinates 400–1400 (covering IPDA-psi; 10 IN and 29 ZA sequences) and 7000–8000 (covering IPDA-env; 746 IN and 8990 ZA sequences). All psi region sequences are represented in the phylogenetic tree. For env, because of the large number of available sequences, the sequence set from each country was bootstrapped 10 times at 100 sequences per set. A representative env tree is shown here (see Supplementary Figure S2 for all bootstrapped env trees). The sizes of the polar phylogenetic trees represent relative genetic diversity. Both trees are rooted to HXB2. IN and ZA sequences are significantly compartmentalized by Slatkin-Maddison tests.
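The env subsampling step described in the Figure 1 caption (10 draws of 100 sequences per country) can be sketched as below. This is a minimal illustration and not the authors' script: it assumes sequences are sampled without replacement within each draw, and the function name `draw_subsamples`, the seeds, and the placeholder sequence IDs are hypothetical.

```python
import random


def draw_subsamples(sequence_ids, n_draws=10, draw_size=100, seed=0):
    """Return n_draws random subsets of draw_size IDs each.

    Assumption: IDs are unique within a draw (sampling without
    replacement); the same ID may appear in more than one draw.
    """
    if len(sequence_ids) < draw_size:
        raise ValueError("fewer sequences available than draw_size")
    rng = random.Random(seed)  # fixed seed so the draws are reproducible
    return [rng.sample(sequence_ids, draw_size) for _ in range(n_draws)]


# Placeholder IDs sized to match the env sequence counts in the caption.
in_ids = [f"IN_{i}" for i in range(746)]
za_ids = [f"ZA_{i}" for i in range(8990)]

in_draws = draw_subsamples(in_ids, seed=0)
za_draws = draw_subsamples(za_ids, seed=1)
```

Each pair of draws (one IN, one ZA) would then be aligned and used to build one tree; that downstream step is outside this sketch.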

Figure 2. In silico successful binding rate of IPDA-original and IPDA-BC primers/probes against Indian (IN) and South African (ZA) subtype C HIV-1 sequences. (a) Three in silico definitions of binding success were examined in this study. Criteria used for Definition #1 are the same as those published by Gaebler et al. [19]. (b–g) For each definition of success, we evaluated the fraction of Indian (IN), South African (ZA), and USA (US) HIV-1 sequences predicted to bind to each of the primers and probes of both IPDA-original and IPDA-BC. Detailed analysis tables are available in Supplementary Table S1.
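The quantity plotted in Figure 2, the fraction of sequences predicted to bind a given primer or probe, can be illustrated with a simple mismatch count. The sketch below is hypothetical: the `max_mismatches` threshold is a stand-in and does not reproduce any of the three definitions used in the study, the sequences are toy examples rather than IPDA oligos, and the target regions are assumed to be pre-aligned and the same length as the oligo.

```python
def count_mismatches(oligo, target):
    """Count positions where oligo and an equal-length target region differ."""
    if len(oligo) != len(target):
        raise ValueError("oligo and target region must have equal length")
    return sum(a != b for a, b in zip(oligo.upper(), target.upper()))


def predicted_binding_rate(oligo, targets, max_mismatches):
    """Fraction of target regions with at most max_mismatches mismatches."""
    if not targets:
        raise ValueError("no target sequences supplied")
    hits = sum(count_mismatches(oligo, t) <= max_mismatches for t in targets)
    return hits / len(targets)


# Toy data only: one perfect match, one single mismatch, one double mismatch.
toy_oligo = "ACGTACGT"
toy_targets = ["ACGTACGT", "ACGTACGA", "TCGTACGA"]

rate = predicted_binding_rate(toy_oligo, toy_targets, max_mismatches=1)
```

A stricter definition corresponds to a lower threshold (or to additional positional rules, for example near the 3′ end), which lowers the predicted binding rate for the same set of sequences.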

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
