Allosteric Integrase Inhibitor Influences on HIV-1 Integration and Roles of LEDGF/p75 and HDGFL2 Host Factors

Allosteric integrase (IN) inhibitors (ALLINIs), which are promising preclinical compounds that engage the lens epithelium-derived growth factor (LEDGF)/p75 binding site on IN, can inhibit different aspects of human immunodeficiency virus 1 (HIV-1) replication. During the late phase of replication, ALLINIs induce aberrant IN hyper-multimerization, the consequences of which disrupt IN binding to genomic RNA and virus particle morphogenesis. During the early phase of infection, ALLINIs can suppress HIV-1 integration into host genes, which is also observed in LEDGF/p75-depleted cells. Despite this similarity, the roles of LEDGF/p75 and its paralog hepatoma-derived growth factor like 2 (HDGFL2) in ALLINI-mediated integration retargeting are untested. Herein, we mapped integration sites in cells knocked out for LEDGF/p75, HDGFL2, or both factors, which revealed that these two proteins in large part account for ALLINI-mediated integration retargeting during the early phase of infection. We also determined that ALLINI-treated viruses are defective during the subsequent round of infection for integration into genes associated with speckle-associated domains, which are naturally highly targeted for HIV-1 integration. Class II IN mutant viruses with alterations distal from the LEDGF/p75 binding site moreover shared this integration retargeting phenotype. Altogether, our findings help to inform the molecular bases and consequences of ALLINI action.


Introduction
People living with HIV (PLWH) are prescribed a cocktail of antiviral compounds, also known as combinatorial antiretroviral therapy (ART), to suppress human immunodeficiency virus 1 (HIV-1) replication. Since the mid-1990s, most ART formulations have comprised three drugs: a backbone of two nucleoside reverse transcriptase (RT) inhibitors (NRTIs) and a third compound from a separate drug class. The nature of the third compound has reflected the history of antiretroviral drug development. Protease inhibitors, which initially filled this niche, were followed by non-nucleoside RT inhibitors [1]. The advent of raltegravir, which was the first integrase (IN) strand transfer inhibitor (INSTI) approved by the US Food and Drug Administration, further impacted ART formulations [2]. Compared to raltegravir, second-generation INSTIs dolutegravir and bictegravir impart higher barriers to the generation of drug resistance [3][4][5][6], and either dolutegravir or bictegravir is now generally prescribed alongside two NRTIs to treat PLWH as well as people switching from non-INSTI-containing regimens [7]. Despite these advancements, resistance to second-generation INSTIs does occur [8][9][10], highlighting the need to develop new anti-IN compounds with novel mechanisms of action.
In addition to its canonical activity, HIV-1 IN harbors a second, non-enzymatic function, wherein it binds genomic RNA to regulate virus particle morphogenesis [34]. IN-RNA binding is required to incorporate the viral ribonucleoprotein complex, which is primarily composed of genomic RNA and nucleocapsid protein, into the conical capsid core. Lacking this, HIV-1 RNA and IN are prematurely degraded in infected target cells, aborting the infectious process [45][46][47]. In vitro, HIV-1 IN tetramers as compared to monomers or dimers avidly bind RNA [47], and a variety of HIV-1 IN mutant proteins have been shown to be defective for RNA binding due to defective IN tetramerization or alteration of amino acid residues that directly contact RNA [47]. The associated replication-defective HIV-1 IN mutant viruses are referred to as class II to delineate them from class I IN mutant viruses that display normal IN-RNA binding and virion particle morphology [34,48]. In addition to IN mutations, allosteric IN inhibitors (ALLINIs) disrupt IN-RNA binding [49].
ALLINIs [50], which are also referred to as LEDGINs for LEDGF site inhibitors [51], NCINIs for non-catalytic IN inhibitors [52], IN-LAIs for IN-LEDGF allosteric inhibitors [53], and MINIs for multimeric IN inhibitors [54], are a promising class of pre-clinical compounds. ALLINIs engage IN at the LEDGF/p75 binding site and accordingly suppress gene-tropic integration when cells are treated with inhibitors at the time of HIV-1 infection [54][55][56]. However, the primary mode of ALLINI action is via aberrant IN hypermultimerization [54,57]. Drug binding to the LEDGF/p75 binding site unveils a secondary IN binding site for a CTD from a separate IN multimer [58,59], the consequences of which generate long-chain, drug-interlinked IN polymers [60]. ALLINIs more potently inhibit HIV-1 maturation as compared to integration presumably because LEDGF/p75 is better positioned to compete for drug binding to IN during the afferent as compared to the efferent phase of virus replication [61].
Our current work was inspired by several unanswered questions concerning the mechanisms and outcomes of ALLINI action. Although genic integration targeting was reduced significantly by exposing cells to ALLINIs at the time of the HIV-1 infection, integration into genes remained enriched compared to LEDGF/p75 knockout (LKO) cells [55] and to random controls [55,56]. To comprehensively address the roles of LEDGF/p75 and HDGFL2 in ALLINI-mediated integration retargeting, we have generated HDGFL2 knockout (HKO) cells and used these alongside previously described LKO cells [62] as well as cells doubly knocked out for both factors [63]. HIV-1 produced in the presence of ALLINIs has been shown to integrate into genes at normal frequencies during the subsequent round of infection [54,64]. However, localization within the nucleus can significantly impact the frequency at which genes are targeted for HIV-1 integration. For example, genes in proximity to nuclear speckles are highly preferred integration targets [18,20,21]. Herein we show that HIV-1 produced in the presence of ALLINIs is deficient for integration into SPAD-associated genes during the subsequent round of virus infection, a trait that was shared by class II IN mutant viruses. Collectively, our findings inform the mechanisms and consequences of ALLINI action, which is germane given recent descriptions of highly potent compounds [65][66][67][68][69][70], some of which have advanced to human clinical trials [66].
Single-round pseudovirus carrying the gene for firefly luciferase, hereafter referred to as HIV-Luc, was prepared by plasmid DNA co-transfection as previously described [19,71,75]. In brief, WT HEK293T cells cultured in 10 cm tissue culture plates were transfected with 15 µg total plasmid DNA (pNLX.Luc.R-:pCG-VSV-G and pNLX.Luc.R-.∆AvrII:pCG-VSV-G ratios of 9:1) using PolyJet transfection reagent as recommended by the manufacturer (SignaGen, Frederick, MD, USA). Cell supernatants at 48 h post-transfection, precleared via centrifugation, were filtered through 0.45 µm filters by gravity flow and concentrated by ultracentrifugation at 4 °C for 2 h at 26,000 rpm using a SW32-Ti rotor. Virus pellets were resuspended in DMEM, aliquoted, and stored at −80 °C. Virus yield was determined using a commercial p24 ELISA kit as recommended by the manufacturer (Advanced Bioscience Laboratories, Rockville, MD, USA). For integration site sequencing experiments, HIV-Luc was treated with Turbo DNase (Thermo Fisher Scientific, Waltham, MA, USA) at the final concentration of 0.08 U/µL prior to infection as described [19].
To determine dose response curves as a function of host factor content, BI-D or dimethyl sulfoxide (DMSO) was added to triplicate wells at the time of HEK293T cell infection essentially as described for a prior INSTI study [75]. In brief, 10^4 cells seeded per well in 96-well plates were infected with 2 ng p24 HIV-Luc in 200 µL for 6-8 h. Virus was removed, and cells were fed fresh growth media containing the same drug concentration or DMSO as present at the start of the infection. At 48 h, cells were washed twice with phosphate-buffered saline and lysed with passive lysis buffer (Promega Corporation, Madison, WI, USA) by freezing plates at −80 °C for 30 min, which was followed by heating at 37 °C for 20 min. Following centrifugation, supernatants were assessed for luciferase activity using a Berthold Technologies luminometer. Effective concentration 50%, 70%, and 95% (respective EC50, EC70, and EC95) values were calculated by fitting data from two independent experiments to a dose response inhibition model (four parameters) in GraphPad Prism 8 (Dotmatics, Boston, MA, USA).
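The relationship between EC50 and the higher effective concentrations can be illustrated with a minimal sketch of a four-parameter inhibition model. This is not the GraphPad Prism fitting routine; the function names are hypothetical, the Hill slope of 1.0 in the usage lines is an assumed placeholder (the fitted slopes are not reported here), and only the 1.5 µM EC50 is taken from the text.

```python
def four_param_response(conc, bottom, top, ec50, hill):
    """Percent infection remaining at drug concentration `conc`.

    Standard four-parameter inhibition curve: the response falls from
    `top` (no drug) towards `bottom` (saturating drug).
    """
    return bottom + (top - bottom) / (1.0 + (conc / ec50) ** hill)


def ec_from_ec50(ec50, hill, fraction):
    """Concentration that yields `fraction` (0-1) of maximal inhibition.

    Follows from inverting the curve above:
    EC_F = EC50 * (F / (1 - F)) ** (1 / hill).
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return ec50 * (fraction / (1.0 - fraction)) ** (1.0 / hill)


if __name__ == "__main__":
    # Hypothetical inputs: EC50 of 1.5 (µM) with an assumed Hill slope of 1.0.
    for fraction in (0.70, 0.95):
        conc = ec_from_ec50(1.5, 1.0, fraction)
        remaining = four_param_response(conc, 0.0, 100.0, 1.5, 1.0)
        print(f"EC{int(fraction * 100)}: {conc:.2f} µM, {remaining:.1f}% infection remaining")
```

Because the conversion depends on the Hill slope, cell type-matched EC70 and EC95 values cannot be inferred from EC50 values alone; the slope has to come from the fitted curve for each cell type.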
ALLINI-treated HIV-Luc was produced by maintaining BI-D or DMSO solvent control throughout the course of HEK293T cell transfection. Following concentration by ultracentrifugation, resuspended virus was ultrafiltered using Amicon Ultra-15 Centrifugal Filter Units with 3 kDa molecular cut-off (Millipore Sigma, Burlington, MA, USA) to remove remaining unincorporated BI-D from the media. The concentration of HIV-Luc in the retentate was determined by p24 ELISA.
For integration site sequencing, 4 × 10^5 cells were infected with 400 ng HIV-Luc in each well of a 6-well plate. Virus-containing media was replaced with fresh media after 6-8 h, and approximately 20% of the cell culture was harvested at 2 d post-infection to determine luciferase activity. For this, cells resuspended in passive lysis buffer were frozen overnight at −80 °C, heated at 37 °C for 30 min, and then centrifuged at 17,500× g for 8 min. Relative light units (RLUs) of cell supernatants were determined in triplicate by luminometer. RLU results were normalized to total protein concentration in the cell extract as determined by the Pierce bicinchoninic acid (BCA) protein assay kit (Thermo Fisher Scientific). The remaining cell culture was lysed for genomic DNA preparation at 5 d from the start of infection.

Preparation of Integration Site Libraries
DNA libraries were prepared by ligation-mediated PCR (LM-PCR) essentially as previously described [76,77]. In brief, isolated genomic DNA (2-10 µg) was digested with MseI and BglII restriction endonucleases overnight at 37 °C. Following purification, the DNA was ligated overnight at 12 °C to asymmetric linkers containing 5′-TA overhangs for compatibility with MseI-digested ends. Following purification, the DNA was subjected to semi-nested PCR using primers that anneal to the U5 end of HIV-1 DNA and the linker. The linker-specific primer and the second round U5-specific primer were megaprimers that contained additional sequences for Illumina clustering and sequencing. Purified LM-PCR products were subjected to 150 bp paired-end Illumina sequencing at the Dana-Farber Cancer Institute Molecular Biology Core Facilities (Boston, MA, USA).

Integration Site Determination and Mapping
Illumina raw reads were scanned, and integration sites were determined as described previously [19,76,77,78]. In brief, U5 and linker-specific sequences were trimmed from Illumina raw read1 and read2, respectively. Trimmed reads containing host DNA were aligned to human genome build hg19 downloaded from the UCSC server (http://genome.ucsc.edu (accessed on 1 July 2018)) by BWA-MEM aligner with paired-end option [79]. Aligned reads were filtered by SAMtools [80] and converted into BED format as described previously [19,78]. Raw Illumina sequences for viruses produced in the presence of ALLINI CX014442, accessed using Sequence Read Archive (SRA) number SRP157991 [64], were downloaded to bioinformatically determine sites of HIV-1 integration.
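The final step of this pipeline, collapsing aligned reads into unique integration sites in BED format, can be sketched as follows. This is an illustration under stated assumptions rather than the published workflow: the function names and the tuple layout are hypothetical, coordinates are assumed to be 0-based and half-open as in BED, and the virus-host junction is assumed to sit at the leftmost aligned base for plus-strand alignments and at the rightmost aligned base for minus-strand alignments.

```python
def unique_sites(alignments):
    """Collapse aligned read1 records into unique integration sites.

    `alignments` is an iterable of (chrom, start, end, strand) tuples with
    0-based, half-open coordinates (hypothetical layout for this sketch).
    Assumption: the junction is the first aligned base on '+' and the last
    aligned base on '-'.
    """
    sites = set()
    for chrom, start, end, strand in alignments:
        pos = start if strand == "+" else end - 1
        sites.add((chrom, pos, strand))
    return sorted(sites)


def to_bed_lines(sites):
    """Format sites as 1-bp BED6 records: chrom, start, end, name, score, strand."""
    return [f"{chrom}\t{pos}\t{pos + 1}\t.\t0\t{strand}" for chrom, pos, strand in sites]


if __name__ == "__main__":
    # Hypothetical alignments: two reads share one junction on chr1.
    reads = [
        ("chr1", 100, 180, "+"),
        ("chr1", 100, 150, "+"),
        ("chr2", 500, 560, "-"),
    ]
    for line in to_bed_lines(unique_sites(reads)):
        print(line)
```

Counting each junction once, regardless of how many reads support it, keeps PCR amplification from inflating the apparent integration frequency at any one position.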
Integration sites were analyzed by BEDtools (command intersect) [81] to assess HIV-1 provirus distribution with respect to human genome annotations such as RefSeq genes and SPADs [18][19][20][21]. Results were compared to random integration control (RIC) values, which were generated in silico in two different ways to match the genome shearing strategy used. RIC values based on digestion with MseI and BglII enzymes were described previously [19], while RIC values for DNA sonication, which Vansant et al. [64] used for shearing, were based on previously described fragments [20] that herein were mapped to hg19 using BEDtools. SPAD-associated RefSeq genes were identified by BEDtools [81] by quantifying the overlap between gene and SPAD coordinates; non-overlapping genes were termed SPAD-non-associated genes. For each gene set, percent integration was determined by BEDtools [81].
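The logic of the gene stratification can be sketched in plain Python. The analysis itself was run with BEDtools intersect; the code below is only an illustrative stand-in with hypothetical function names and toy coordinates, and it assumes that any overlap of at least 1 bp marks a gene as SPAD-associated (the overlap criterion is not specified above).

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def split_genes_by_spad(genes, spads):
    """Partition (chrom, start, end) genes into SPAD-associated and non-associated."""
    associated, non_associated = [], []
    for gene in genes:
        chrom, start, end = gene
        hit = any(c == chrom and overlaps(start, end, s, e) for c, s, e in spads)
        (associated if hit else non_associated).append(gene)
    return associated, non_associated


def percent_in_intervals(sites, intervals):
    """Percent of (chrom, pos) sites that fall inside any interval."""
    if not sites:
        return 0.0
    hits = sum(
        1
        for chrom, pos in sites
        if any(c == chrom and s <= pos < e for c, s, e in intervals)
    )
    return 100.0 * hits / len(sites)


if __name__ == "__main__":
    # Toy coordinates, for illustration only.
    genes = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
    spads = [("chr1", 150, 400)]
    sites = [("chr1", 120), ("chr1", 550), ("chr1", 900), ("chr2", 150)]
    spad_genes, other_genes = split_genes_by_spad(genes, spads)
    print("all genes:", percent_in_intervals(sites, genes))
    print("SPAD-associated genes:", percent_in_intervals(sites, spad_genes))
    print("SPAD-non-associated genes:", percent_in_intervals(sites, other_genes))
```

This nested scan is quadratic and suits only small examples; genome-scale data call for sorted or indexed interval structures, which is what BEDtools provides.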

Statistical Analyses
Statistical significance in virus infection assays was assessed using a two-tailed, equal-variance t test in Excel. Differences in integration site usage between samples were determined using Fisher's exact test in Python. P values less than 0.05 were generally considered to be statistically relevant.
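For readers unfamiliar with the test, a two-sided Fisher's exact test on a 2 × 2 table of integration counts can be sketched with the standard library alone. This is a generic textbook formulation, not the script used for the analyses; the function name is hypothetical and the counts in the usage line are invented for illustration. In practice a library routine such as scipy.stats.fisher_exact would be a common choice.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one. Integer
    arithmetic is used throughout, so no floating-point tolerance is needed
    when comparing table probabilities.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def weight(k):
        # Unnormalized probability of seeing k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k)

    observed = weight(a)
    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    extreme = sum(w for w in map(weight, range(k_min, k_max + 1)) if w <= observed)
    return extreme / comb(total, col1)


if __name__ == "__main__":
    # Invented counts: sites inside vs outside genes for two samples.
    p_value = fisher_exact_two_sided(820, 180, 650, 350)
    print(f"p = {p_value:.3g}")
```

With very large site counts the exact sum becomes slow, and extremely small p values can underflow to zero in floating point, so this sketch is best read as an explanation of the calculation rather than a production tool.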

Research Strategy
ALLINIs can perturb different aspects of HIV-1 replication. When present in target cells during the early phase of infection, ALLINIs can inhibit the overall level of integration and suppress the frequency at which genes are targeted [54,55,56,82]. ALLINIs engage the HIV-1 IN CCD dimer interface at a location coincident with LEDGF/p75 and HDGFL2 binding [36,40,51,53,54,61,82,83,84,85], but the roles of these virus-host interactions in ALLINI-mediated integration retargeting have not been systematically investigated. To address this information gap, we created HKO (for HDGFL2 knockout) cells and infected these alongside isogenic WT, LKO, and DKO HEK293T cells (Figure 1A) [62,63]. When present in virus producer cells during the late phase of replication, ALLINIs inhibit particle maturation and the resulting eccentric particles are defective for reverse transcription during the subsequent round of virus infection [52,53,54,61,85,86,87]. Although such virions reportedly retain normal gene targeting frequencies [54,64], we have further investigated this aspect of ALLINI action by stratifying genes based on established targeting preferences [18,20,21].
In the following sections, integration site data is presented in graphic and table formats. Statistical outcomes of virus infection and integration sample comparisons are presented as supplementary tables.

The Roles of LEDGF/p75 and HDGFL2 in ALLINI-Mediated HIV-1 Integration Retargeting
We have utilized BI-D, which is a prototypical quinoline ALLINI [44,61] previously shown to reduce genic HIV-1 integration targeting in HEK293T cells [55]. BI-D EC50 values to inhibit the early versus late phase of HIV-1 replication are ~2.4 µM [44,53] and 0.9 µM [61], respectively, and previous work showed that LEDGF/p75 depletion significantly increased BI-D's potency to inhibit the early phase of infection [44,61]. In order to ascertain drug effects on integration site targeting across our isogenic set of HEK293T cells, we accordingly first derived BI-D dose response curves for each cell type. WT and HKO cells were infected with single-round HIV-Luc reporter virus in the presence of a BI-D concentration range that varied from ~0.16 µM to 20 µM, while LKO and DKO cells were treated with an appropriately adjusted concentration range that varied from ~0.02 µM to 5 µM. The level of infection at each drug concentration was percent normalized to parallel cell cultures that were infected in the presence of the DMSO solvent control. As expected, BI-D potency noticeably increased in cells lacking LEDGF/p75, with calculated EC50 values of 1.5 µM, 1.26 µM, 0.16 µM, and 0.33 µM in WT, HKO, LKO, and DKO cells, respectively (Figure 1B). From these data, cell type-matched BI-D EC70 and EC95 values were calculated.
We next determined sites of HIV-1 integration in cells infected in the presence of cell type-adjusted EC70 and EC95 levels of BI-D and compared these results to infections conducted under baseline conditions (in the presence of DMSO). To address data reproducibility, we report side-by-side results of two independent integration site sequencing experiments. In the absence of drug, HKO, LKO, and DKO cells supported about 61%, 9%, and 3% of the level of infection supported by WT HEK293T cells, respectively (Figure 2A, DMSO). Infections conducted in the presence of EC70 and EC95 BI-D concentrations expectedly reduced these baseline values by ~70% and 95%, respectively (Figure 2A; see Table S1 for comprehensive statistical comparisons).
HIV-Luc at baseline integrated into a human gene in WT and HKO cells about 82% of the time, which expectedly decreased significantly to about 65% (p < 10^−26) and 55% (p < 10^−54) genic integration in LKO and DKO cells, respectively (Figure 2B, Tables 1 and S2). In the presence of BI-D, HIV-Luc integrated into genes about 66% to 70% of the time in WT and HKO cells, which were highly significant differences from baseline (p < 10^−14). While EC70 BI-D concentration did not noticeably affect the basal level of gene-tropic integration in LKO cells, EC95 BI-D reduced this significantly, to about 56% (p ≤ 0.006). This level of genic integration targeting was notably similar to the level observed in DKO cells under basal infection conditions (p ≥ 0.7). While EC95 BI-D further reduced genic integration by two to three percentage points in DKO cells, these differences did not attain statistical significance versus the basal DKO cell condition (p = 0.3). Based on these data, we conclude that inhibition of IN binding to LEDGF/p75 and HDGFL2 in large part accounts for the ability of ALLINIs such as BI-D to retarget integration away from genes when the drugs are present during the early phase of HIV-1 infection.
Figure 2 (caption fragment): See Table 1 for plotted values (the graph on the left is replicate 1 data) and Table S2 for pairwise statistical comparisons. *, p < 0.05; **, p < 0.001; ***, p < 0.0001 (gray asterisks, versus matched DMSO control; black asterisks, versus matched WT cell condition; red asterisks, versus random integration control [RIC; dotted line]).

ALLINI Treated Virions Are Defective for Integration into SPAD-Associated Genes
The requirement for specific gene KOs restricted the infection experiments in the prior section to HEK293T cells. In the following sections, drug-treated viruses were used to infect WT cell types. To address the generality of the integration targeting phenotypes, T cell lines were included in these experiments. Virions produced in the presence of 600 nM BI-D, which equates to an ~EC95 concentration under this condition of drug exposure [61], were initially used to infect HEK293T cells and Jurkat T cells in the absence of any added BI-D in the cell culture media. BI-D-treated HIV-Luc infected HEK293T and Jurkat T cells at approximately 6.6% and 3.3%, respectively, of the level of infection observed with viruses produced in the presence of DMSO (Figure 3A).
HIV-Luc produced in the presence of BI-D integrated into genes at lower frequencies than did the DMSO control virus (Figure 3 and Table 2). Across experiments and cell types, these differences amounted to ~1.1 to 2.6%, which in all cases failed to attain statistical significance (p ≥ 0.06; Table S3). These data are consistent with prior reports that HIV-1 made in the presence of ALLINI GS-B [54] and CX014442 [64] displayed baseline frequencies of genic integration targeting.

Figure 3 (caption fragment): See Table 2 for plotted values and Table S3 for pairwise statistical comparisons. ***, p < 0.0001 versus random integration control (RIC).
We recently determined that SPAD-associated genes are some of the most highly targeted genes for HIV-1 integration in the human genome [21]. We accordingly next stratified the genes that were targeted for integration in Figure 3 into SPAD-associated versus SPAD-non-associated, and replotted integration into these gene subsets (Figure 4). As expected, HIV-1 highly favored SPAD-associated genes for integration: although these genes comprise but 3.3% of the human genome, nominally one-third of all integration events occurred within them (Figure 4A,C, Table 2; p < 10^−300, Table S3). Integration into SPAD-non-associated genes was also enhanced compared to random, though this approximate 1.2-fold enrichment paled in comparison to the >10-fold enrichment for integration into SPAD-associated genes (Figure 4B,D, Table 2).
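The fold-enrichment figures quoted above follow from dividing the observed percent integration by the random expectation. A minimal sketch, using the rounded values from the text (one-third of integrations versus 3.3% of the genome) rather than the exact Table 2 entries, and a hypothetical function name:

```python
def fold_enrichment(observed_percent, random_percent):
    """Ratio of observed percent integration to the random expectation."""
    if random_percent <= 0:
        raise ValueError("random_percent must be positive")
    return observed_percent / random_percent


if __name__ == "__main__":
    # Rounded values from the text: about one-third of sites vs 3.3% expected.
    print(f"{fold_enrichment(100 / 3, 3.3):.1f}-fold")
```

With these rounded inputs the ratio comes out just above 10, in line with the >10-fold enrichment reported for SPAD-associated genes.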
HIV-Luc made in the presence of BI-D displayed statistically significant ~4% to 8% reductions in integration into SPAD-associated genes in both HEK293T cells and Jurkat T cells (Figure 4A,C, Table 2; p ≤ 0.017, Table S3). Reciprocally, integration into SPAD-non-associated genes witnessed ~2% to 6% upticks in HIV-1 integration site targeting, which in three of four cases were statistically significant differences (Figure 4B,D, Tables 2 and S3). This inverse relationship follows the observation that bulk genic integration was unaffected under these conditions (Figure 3).
Figure 4 (caption fragment): See Table S3 for detailed statistical analyses.
To ascertain the generality of these findings, we next reanalyzed previously reported integration site data from viruses that were produced in the presence of ALLINI CX014442 [64]. In this study, SupT1 cells were infected with HIV-1 following exposure to a CX014442 concentration range that varied from 31.25 nM to 250 nM, which spanned from less than the compound's EC50 to greater than its EC90 (respective 69 nM and 114 nM values [88]). As per the original paper [64], we report integration sites for infections initiated with 1-to-20 and 1-to-40 dilutions of virus (Table 3 and Figure 5).
CX014442-treated virus integrated into human RefSeq genes similarly to the baseline DMSO-treated virus, though we do note a statistically significant ~4.5% reduction (p = 0.004) in genic integration for virus made in the presence of 125 nM compound that was diluted 40-fold prior to infection (Figure 5A, Tables 3 and S4). In a largely dose-dependent manner, by contrast, integration into SPAD-associated genes was consistently and significantly decreased by CX014442 treatment. For example, preexposure to 125 nM and 250 nM CX014442 reduced HIV-1 integration into SPAD-associated genes by ~8% to 10% (p ≤ 0.0001) (Figure 5B and Table S4). Akin to the results observed for BI-D (Figure 4B,D), these changes were accompanied by meaningful upticks in integration into SPAD-non-associated genes (Figure 5C, Tables 3 and S4).
Table 3 footnotes: 1, shown are the number of unique sites and % integration into RefSeq genes, SPAD-associated genes (SPADg), and SPAD-non-associated genes (SPAD-nong); 2, -, virus produced in the presence of DMSO; 3, RIC, random integration control for shearing by sonication.
Figure 5 (caption fragment): (B) Same as in panel A except that percent integration into SPAD-associated genes was mapped. (C) Same as in panel A except that percent integration into SPAD-non-associated genes is shown. Asterisks show statistical differences versus matched RIC (red) or DMSO (black) controls. *, p < 0.05; **, p < 0.001; ***, p < 0.0001; see Table S4 for detailed statistical analyses.

Class II HIV-1 IN Mutant Viruses Are Defective for Integration into SPAD-Associated Genes
ALLINI-treated viruses and class II HIV-1 IN mutant viruses share some commonalities, including disruption of IN binding to genomic RNA and defective virus particle morphogenesis [34,47,49,52,87], the consequences of which lead to premature IN and RNA degradation and abortive reverse transcription [34,45,46,47]. We accordingly next analyzed a small number of class II IN mutant viruses to see how their integration targeting preferences compared to WT viruses made in the presence of ALLINIs.
The class I and class II monikers are generally reserved for HIV-1 IN mutant viruses that display little-to-no residual infectivity. However, characterization of reverse transcription phenotypes can help to distinguish class I versus class II characteristics of partially defective IN mutant viruses as well. For example, if normal DNA synthesis is accompanied by an increase in 2-long terminal repeat circle formation, the mutant is categorized as class I. If by contrast the infection defect is accompanied by a DNA synthesis defect, then the mutant is typed as class II [73,89,90]. Because our goal was to map integration sites, our experimental strategy required residual levels of HIV-1 infectivity. Moreover, because LEDGF/p75 binding is mediated via the IN NTD and CCD [36,39], we selected partially infectious IN CTD mutant viruses, including E246A and K236A/E246A [73]. As a control, we in parallel analyzed the class II IN CCD mutant virus W131D [38]. Trp131 maps to the LEDGF/p75 binding site [36][37][38] and the W131D substitution partially disrupted IN-LEDGF/p75 binding [38].
In replicate experiments, IN mutant viruses K236A/E246A, E246A, and W131D infected HEK293T cells at ~3.4%, 19.8%, and 1.9% of the level of WT HIV-Luc, respectively (Figure 6A). Frequencies of IN mutant K236A/E246A and E246A viral integration into RefSeq genes varied from the WT by at most 0.8%, which were across the board statistically insignificant differences (Figure 6B, Tables 4 and S5). The approximate 10-12% reductions in RefSeq gene integration observed for the IN W131D mutant virus by contrast were highly significant (p < 10⁻³⁰).
Figure 6. (A) … luciferase values (determined for technical triplicate samples within each experiment) were percent normalized to WT HIV-Luc, which was set to 100%. *, p < 0.01 (also see Table S1). (B) Percent integration in RefSeq genes in HEK293T cells for the infections shown in panel A; replicate 1 data is plotted to the left. Asterisks show differences versus matched RIC (red) and WT HIV-Luc (black) controls. ***, p < 0.0001; see Table S5 for detailed statistical analyses.
¹ Shown are the number of unique sites and percent integration into RefSeq genes, SPAD-associated genes (SPADg), and SPAD-non-associated genes (SPAD-nong). ² RIC, random integration control for MseI/BglII digestion.
In contrast to total RefSeq genes, the IN CTD class II mutant viruses were generally defective for integration into SPAD-associated genes. In the case of the double mutant virus, whose residual ~3.4% infectivity mirrored the infectivity of BI-D-treated virus (Figure 3A), both experimental replicates revealed highly significant reductions (p < 10⁻⁵), with reciprocal upticks in integration into SPAD-non-associated genes (Figure 7A,B, Tables 4 and S5; p < 10⁻⁴). The E246A IN mutant virus revealed this same phenotype in one of two experimental replicates. Thus, while the frequency of SPAD-associated gene integration was statistically indistinguishable from the WT in replicate 2 samples, the significant ~3% reduction for the E246A IN mutant virus in replicate 1 (p = 3.6 × 10⁻⁶) was accompanied by a significant uptick in integration into SPAD-non-associated genes (p = 5.2 × 10⁻⁵). As expected from the overall reduction in RefSeq gene targeting, the IN W131D mutant virus was also defective for integrating into SPAD-associated genes, with one of two replicates revealing a significant uptick in integration into SPAD-non-associated genes (Figure 7A,B, Tables 4 and S5).
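Comparisons of this kind reduce to asking whether the fraction of unique integration sites falling in a gene set differs between two samples. The sketch below shows one common way to test this, a two-sided Fisher's exact test on a 2 × 2 table of site counts. It is an assumption that such a test suits these data; the test actually used is specified with the statistical analyses in Table S5. The function names are hypothetical, and the site counts in the usage note are invented purely for illustration.

```python
from math import exp, lgamma


def _log_comb(n: int, r: int) -> float:
    """Natural log of the binomial coefficient C(n, r)."""
    return lgamma(n + 1) - lgamma(r + 1) - lgamma(n - r + 1)


def _log_hypergeom_pmf(k: int, row1: int, row2: int, col1: int) -> float:
    """Log probability that k of the col1 'in gene set' sites fall in sample 1."""
    return (_log_comb(row1, k) + _log_comb(row2, col1 - k)
            - _log_comb(row1 + row2, col1))


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p value for the table [[a, b], [c, d]].

    a, b: sites inside / outside the gene set for sample 1 (e.g., mutant)
    c, d: sites inside / outside the gene set for sample 2 (e.g., WT)
    """
    row1, row2, col1 = a + b, c + d, a + c
    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    log_p_obs = _log_hypergeom_pmf(a, row1, row2, col1)
    total = 0.0
    for k in range(k_min, k_max + 1):
        log_p = _log_hypergeom_pmf(k, row1, row2, col1)
        # Sum every table that is no more probable than the observed one;
        # the small tolerance absorbs floating-point noise between tied tables
        if log_p <= log_p_obs + 1e-7:
            total += exp(log_p)
    return min(1.0, total)
```

For example, `fisher_exact_two_sided(1800, 8200, 2100, 7900)` would compare a hypothetical mutant sample with 1800 of 10,000 sites in the gene set against a hypothetical WT sample with 2100 of 10,000. Working in log space keeps the calculation stable at the site counts typical of integration site libraries.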

Discussion
Our work informs important aspects of ALLINI functionality. First, competition for binding of LEDGF/p75 and HDGFL2 host factors to the IN CCD appears to account in large part for the ability of these compounds to suppress genic integration targeting during the early phase of HIV-1 replication (Figure 2). Due to this retargeting, proviruses formed in the presence of ALLINIs are transcriptionally compromised [56,65], leading to the suggestion that ALLINIs may help to suppress the formation of the replication-competent latent HIV reservoir [91]. Although it is currently uncertain how these IN inhibitors might be effectively deployed as "block-and-lock" agents, our work nonetheless helps to explain the biology underlying the integration retargeting phenotype.
Proviruses formed via ALLINI-treated virions are also compromised transcriptionally [64], and our work additionally clarifies that genic integration targeting is disrupted for viruses exposed to ALLINIs during the late phase of HIV-1 replication. Although integration into bulk RefSeq genes was largely unperturbed, integration into SPAD-associated genes, which are naturally highly preferred for HIV-1 integration [21], was disrupted (Figures 3-5). These downturns were moreover generally met with significant upturns in integration into the reciprocal gene set, i.e., SPAD-non-associated genes. Somewhat unexpectedly, we observed this same phenotype for the class II IN CTD K236A/E246A double mutant virus, and for one of two experimental replicates with its more infectious single missense mutant E246A variant (Figures 6 and 7). Although the molecular basis of the integration retargeting phenotype shared by ALLINI-treated and class II IN mutant viruses is unclear, different possibilities can be envisioned.
HIV-1 IN in solution exhibits concentration-dependent tetramerization (reviewed in [34]) and IN tetramers as compared to lower order monomer/dimer forms avidly bind RNA in vitro [47]. Although not investigated directly for the K236A/E246A or E246A mutants, prior work revealed that K236E mutant IN protein was primarily dimeric, and that the associated mutant viral IN was defective for genomic RNA binding [47]. We accordingly suspect that both K236A/E246A and E246A mutant INs would also be defective for IN tetramerization and RNA binding in virions, though perhaps proportionally less so than the highly defective K236E IN mutant (<1% residual infectivity [73]). Although LEDGF/p75 binding stabilizes IN tetramers [39,92], there is little reason to suspect loss of LEDGF/p75 binding as a contributing factor to IN CTD mutant viral integration retargeting. First, the NTD and CCD mediate LEDGF/p75 binding, with no apparent role for the CTD [39]. Second, the integration retargeting phenotype of the W131D control virus, whose IN is nominally defective for LEDGF/p75 binding [38], was notably different from the CTD mutant viruses (Figures 6 and 7).
One limitation of our study is that the most interesting phenotypes were observed under conditions of severe HIV-1 restriction imposed by comparatively high ALLINI doses (Figures 2, 4 and 5) or IN mutations (Figure 7). Under these conditions, just a small percentage of the virus population, compared to controls, remains active. ALLINIs and class II IN mutations elicit strikingly similar eccentric HIV-1 particles with the viral ribonucleoprotein complex located outside of electron-lucent cores [52][53][54]61,[85][86][87], and such virions are apparently replication-defective due to premature degradation of IN and HIV-1 RNA in nascently infected cells [45][46][47]. Misshapen/deformed cores with associated electron density are formed at near equal frequencies in WT, ALLINI-treated, and class II mutant virions [87]. If such viruses are infectious, their malformed capsid lattices could potentially lead to premature uncoating in the nucleus, the consequences of which could arrest nuclear penetration and lead to integration into genes more distal from nuclear speckles than would otherwise occur during baseline infection. Image-based assays that track positions of intranuclear uncoating [93,94] may help to lend evidence in support of such a model. CPSF6 has been described as a master regulator of intranuclear HIV-1 penetration [95] and, perhaps not surprisingly, HIV-1 integration into SPAD-associated genes is for the most part obliterated in CPSF6 knockout cells [21].
Although the differences were not statistically significant, we consistently observed minor reductions in integration into RefSeq genes in DKO cells in the presence of the EC95 concentration of BI-D (Table 1 and Figure 2). Although it may be tempting to speculate that ALLINI treatment imparts aberrant IN hyper-multimerization in the absence of LEDGF/p75 and HDGFL2 during the early phase of HIV-1 infection, binding to nucleic acid shields IN from the ALLINI-induced effect [50,82], indicating this may not be at play in DKO cells, which support normal levels of reverse transcription [44]. Moreover, LEDGF/p75 and HDGFL2 are the only members of the HDGF family that harbor an IBD [40]. Based on the ALLINI-IN contacts that drive aberrant IN hyper-multimerization [58][59][60], it seems possible that ALLINIs could also disrupt host factor interactions with the IN CTD. Numerous host factors, including the histone acetyltransferase EP300, have been shown to bind the IN CTD [96]. Additional research is required to determine if EP300 or perhaps other CTD-binding host factors play roles in HIV-1 integration targeting and, if so, whether these effects are disrupted by ALLINI treatment.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/v14091883/s1, Table S1: p values for infection experiments; Table S2: p values for integration sites in WT and knockout HEK293T cells; Table S3: p values for BI-D-treated virus integration sites; Table S4: p values for CX014442-treated virus integration sites;

Conflicts of Interest:
The authors declare no conflict of interest.