Darunavir-Resistant HIV-1 Protease Constructs Uphold a Conformational Selection Hypothesis for Drug Resistance

Multidrug resistance continues to be a barrier to the effectiveness of highly active antiretroviral therapy in the treatment of human immunodeficiency virus 1 (HIV-1) infection. Darunavir (DRV) is a highly potent protease inhibitor (PI) that is often effective when drug resistance has emerged against first-generation inhibitors. Resistance to darunavir does evolve, but it requires 10–20 amino acid substitutions. The conformational landscapes of six highly characterized HIV-1 protease (PR) constructs that harbor up to 19 DRV-associated mutations were characterized by distance measurements with pulsed electron paramagnetic resonance spectroscopy, namely double electron–electron resonance (DEER), also known as pulsed electron double resonance (PELDOR). The results show that the accumulated substitutions alter the conformational landscape compared to PI-naïve protease: the semi-open conformation is destabilized as the dominant population, and open-like states become prevalent in many cases. A linear correlation is found between values of the DRV inhibition parameter Ki and the open-like to closed-state population ratio determined from DEER. The nearly 50% decrease in occupancy of the semi-open conformation is associated with reduced enzymatic activity, characterized previously in the literature.


Mass Spectrometry Analysis
Mass spectrometry was performed on all spin-labeled samples after the final purification steps to confirm near-complete labeling and the expected protein mass. Data were collected on an Agilent 6220 ESI-TOF mass spectrometer (Santa Clara, CA) equipped with an electrospray source operated in positive ion mode. Agilent ESI Low Concentration Tuning Mix was used for mass calibration over a range of m/z 100–2000. Samples were prepared in acidified acetonitrile (0.5% formic acid), and 1 μL was injected into the electrospray source at a rate of 100 ml min⁻¹. Optimal conditions were a capillary voltage of 4000 V, a source temperature of 350 °C and a cone voltage of 60 V. The TOF analyzer was scanned over an appropriate m/z range with a 1 s integration time. Data were acquired in continuum mode until acceptable averaged data were obtained. ESI results were collected for all samples, and complete spin labeling of the proteins was confirmed by the anticipated masses before proceeding to DEER data collection.

HIV-1 PR amino acid sequence summary
HIV-1 PR sequence summary. The sequence of PI-naïve subtype B is given as the reference. This sequence is based on the LAI sequence with the following substitutions shown in bold: C67A, C95A, Q7K, L33I, L63I, D25N and K55C. DRV1-6 sequences are shown with drug-pressure-selected substitutions in bold and underlined.

CW EPR spectra
CW EPR spectra were recorded for each sample before and after DEER analysis to ensure sample quality. For DRV1 and DRV3 at pH 5.0, the CW line shapes differed dramatically from those of the other samples, in particular PI-naïve subtype B, and appeared similar to spectra observed previously when exploring the impact of salt concentration on WT (Bs). 1 We attribute this broadened spectrum to some form of soluble aggregate (see DLS data below). The solution was not cloudy upon inspection, so no precipitate was forming; however, at higher pH values, broadened spectra were obtained. This effect is reversible: the spectra in SI-2 (B) show that upon addition of DRV (at pH 5.0) the spectrum of DRV3+DRV resembles that expected for a well-behaved dimer in solution (spectrum of Bsi at pH 5.0). In addition, our lab has performed a series of solution NMR experiments on various HIV-1 PR constructs, 8-10 so we know how to prepare a homogeneous, well-behaved sample and can conclude that the spectra shown in (A) are representative of some solution aggregate.

Dynamic Light Scattering
Dynamic light scattering (DLS) measurements were performed on an ALV/CGS-3 four-angle compact goniometer system (Langen, Germany) equipped with a 22 mW HeNe linearly polarized laser operating at a wavelength of λ = 632.8 nm. Fluctuations in the scattering intensity were measured with an ALV/LSE-5004 multiple-tau digital correlator and analyzed via the intensity autocorrelation function,

$$g_2(q,\tau) = \frac{\langle I(t)\, I(t+\tau) \rangle}{\langle I \rangle^{2}},$$

where ⟨I⟩ is the average scattering intensity, I(t) is the scattering intensity at time t, and τ is the delay time. The correlation functions at 90° were deconvoluted using a regularized inverse Laplace transform (CONTIN analysis), which yields a distribution of decay rates, Γ_i, by

$$g_1(q,\tau) = \int G(\Gamma)\, e^{-\Gamma \tau}\, d\Gamma,$$

where g_1(q,τ) is the normalized electric field autocorrelation function and G(Γ) is the distribution of decay rates. The mutual diffusion coefficient, D_m, for a particular species in the distribution is determined by

$$D_m = \frac{\Gamma_i}{q^{2}},$$

where q is the magnitude of the scattering vector.

DLS data reveal that larger aggregates form in solution for DRV1 and DRV3 at pH 5.0, the pH at which previous DEER, CW EPR and NMR investigations of various HIV-1 PR constructs have been performed. At lowered pH (4 and 3 for DRV3, and 2.8 for DRV1), the DLS results are similar to those obtained for subtype B. DLS also shows that at pH 5.0 the addition of inhibitor (DRV) to DRV3 shifts the size distribution to the profile seen for subtype B with DRV. This result is consistent with the CW EPR results, which show a narrowed line shape upon addition of inhibitor. Although at pH 5.0 the addition of DRV to DRV1 shifts the profile to smaller sizes (Fig SI-4D), the size distribution is still larger than that observed with subtype B. Upon dropping the pH to 2.8, the size distribution of unbound DRV1 matches that of subtype B, but the distribution profile upon addition of DRV is altered from what has been observed previously. All DEER data for addition of DRV and CaP2 were collected at pH 5.0 to aid protein stability.
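The conversion from a CONTIN decay rate to a diffusion coefficient described above can be illustrated with a minimal sketch. The laser wavelength (632.8 nm) and the 90° scattering angle come from the text. Everything else is an assumption made only for illustration: the refractive index (1.33, water-like buffer), the temperature and viscosity, the example decay rate, and the use of the Stokes-Einstein relation to turn D_m into a hydrodynamic radius, which this section does not state.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # exact SI value


def scattering_vector(wavelength_nm, angle_deg, refractive_index):
    """Magnitude of the scattering vector, q = (4*pi*n/lambda) * sin(theta/2), in 1/m."""
    wavelength_m = wavelength_nm * 1e-9
    half_angle = math.radians(angle_deg) / 2.0
    return (4.0 * math.pi * refractive_index / wavelength_m) * math.sin(half_angle)


def diffusion_coefficient(decay_rate_per_s, q_per_m):
    """Mutual diffusion coefficient, D_m = Gamma / q^2, in m^2/s."""
    return decay_rate_per_s / q_per_m ** 2


def hydrodynamic_radius_m(diffusion_m2_per_s, temperature_k, viscosity_pa_s):
    """Stokes-Einstein radius (an assumption here, not taken from the text)."""
    return BOLTZMANN_J_PER_K * temperature_k / (
        6.0 * math.pi * viscosity_pa_s * diffusion_m2_per_s
    )


# Instrument values from the text: 632.8 nm HeNe laser, 90 degree detection.
# Assumed: refractive index of 1.33 for an aqueous buffer.
q = scattering_vector(632.8, 90.0, 1.33)

# Hypothetical decay rate for one species in the CONTIN distribution.
gamma = 4000.0  # 1/s, illustrative only
d_m = diffusion_coefficient(gamma, q)

# Assumed: 298.15 K and a water-like viscosity of 8.9e-4 Pa*s.
r_h = hydrodynamic_radius_m(d_m, 298.15, 8.9e-4)

print(f"q   = {q:.3e} 1/m")
print(f"D_m = {d_m:.3e} m^2/s")
print(f"R_h = {r_h * 1e9:.1f} nm")
```

Because D_m scales linearly with the decay rate, slower decay rates in the CONTIN distribution map to larger apparent sizes, which is how aggregate formation shows up in the size distributions discussed above.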

DEER Data Analyses and Summaries
The summary of relative percentages of subtype B has been published previously. 2,3 All DEER data were processed to generate a background-corrected dipolar modulation curve and a distance profile using DeerAnalysis2019. 4 The validity of each population contributing less than 20% to the total population was tested by suppressing the population of interest, generating a theoretical echo curve, and comparing the generated theoretical echo curve to the background-corrected echo curve using the DEERconstruct program. 5-8 Results are shown in Supporting Information Figures SI-5 to SI-23 for each construct investigated here in unbound form and upon addition of DRV and CaP2. For DRV1 and DRV3, effects of sample pH are also shown.

Figures S5 to S23 share the same panel layout: A) Background-corrected dipolar evolution curve after the long-pass filter in DeerAnalysis (black) and the simulated curve from DEERconstruct (gray); B) Raw dipolar evolution curve and background, with the signal-to-noise ratio (S/N) shown inset, where the signal is the DEER modulation depth and the noise is twice the standard deviation of the noise curve; C) The corresponding distance profile generated via TKR analysis by DeerAnalysis (black) and the theoretical curve generated from the Gaussian reconstruction by DEERconstruct (gray); D) The individual Gaussian functions used in the reconstruction; E) Frequency domain spectrum; F) L-curve derived from the TKR fit to obtain the optimal regularization parameter, which is plotted in red.

Markers used in panel C: asterisks indicate that peaks are within the suppression range; "+" indicates that the peak is presumed to be an artifact of processing, as it is near the lower limit of the generally accepted range that is measurable using DEER; "U" indicates an unassigned peak at a distance far longer than the 41-42 Å wide-open states. The markers that appear in each figure are listed with its caption.

Figure S5. DEER data for apo HIV-1 PR DRV1, pH 2.8. Markers in C: "+".
Figure S6. DEER data for CaP2-bound HIV-1 PR DRV1, pH 5.0. Markers in C: "+".
Figure S7. DEER data for DRV-bound HIV-1 PR DRV1, pH 5.0. Markers in C: "+".
Figure S8. DEER data for apo HIV-1 PR DRV2, pH 5.0. Markers in C: "+".
Figure S9. DEER data for CaP2-bound HIV-1 PR DRV2, pH 5.0. Markers in C: asterisks, "U".
Figure S10. DEER data for DRV-bound HIV-1 PR DRV2, pH 5.0. Markers in C: asterisks, "+".
Figure S11. DEER data for apo HIV-1 PR DRV3, pH 5.0. Markers in C: "+".
Figure S12. DEER data for apo HIV-1 PR DRV3, pH 3. Markers in C: asterisks, "+".
Figure S13. DEER data for CaP2-bound HIV-1 PR DRV3, pH 5.0. Markers in C: asterisks, "+".
Figure S14. DEER data for DRV-bound HIV-1 PR DRV3, pH 5.0. Markers in C: asterisks, "+".
Figure S15. DEER data for apo HIV-1 PR DRV4, pH 5.0. Markers in C: asterisks, "+".
Figure S16. DEER data for CaP2-bound HIV-1 PR DRV4, pH 5.0. Markers in C: asterisks, "+".
Figure S17. DEER data for DRV-bound HIV-1 PR DRV4, pH 5.0. Markers in C: asterisks, "+".
Figure S18. DEER data for apo HIV-1 PR DRV5, pH 5.0. Markers in C: asterisks, "+".
Figure S19. DEER data for CaP2-bound HIV-1 PR DRV5, pH 5.0. Markers in C: "+".
Figure S20. DEER data for DRV-bound HIV-1 PR DRV5, pH 5.0. Markers in C: asterisks.
Figure S21. DEER data for apo HIV-1 PR DRV6, pH 5.0. Markers in C: "+".
Figure S22. DEER data for CaP2-bound HIV-1 PR DRV6, pH 5.0. Markers in C: "+".
Figure S23. DEER data for DRV-bound HIV-1 PR DRV6, pH 5.0. Markers in C: asterisks, "+".

The data analysis proceeds first by TKR analysis of the DEER echo curve to give a distance profile. DeerAnalysis2019 provides an estimate of error based upon choosing an optimal regularization parameter from an L-curve (panel F in Figs SI-5-23). This profile is then fit to a linear combination of Gaussian functions using DEERconstruct (Casey et al. 2015 Methods in Enzymology). When using DEERconstruct, the user chooses initial parameters for peak positions. Typically, these parameters are free to vary, yet we choose initial parameters based upon our model, in which the semi-open distance of ~36 Å was determined from modelling of X-ray structures, MD simulations and original data on subtype B, and the closed distance was determined analogously to be 33 Å. Wide-open and curled/tucked distances come from MD simulations and EPR data. Our software allows for peak picking based upon the maximum value seen.
Clearly, in cases where there is a broad distribution there is more error or ambiguity. In those cases, 33 Å and 36 Å are set as the initial values and allowed to vary only slightly (by 0.5 Å and 1 Å, respectively) based upon the structural model from X-ray data. We typically also restrict the breadth of the "closed" state, as we have "control" data for many non-drug-resistant constructs that show a rather narrow width, with FWHM ranging from 4 to 6 Å. A broad width likely indicates heterogeneity of that conformational state. The error reported here is 3× the standard deviation from three separate fitting approaches with DEERconstruct for a given regularization parameter, in which the initial parameter values were altered. The closed conformation is the best defined, as we have numerous data sets where this conformation is obtained from protease with inhibitor.
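The bookkeeping behind the Gaussian reconstruction described above can be sketched as follows. This is not DEERconstruct itself; it is a minimal illustration that assumes each conformational state is represented by one Gaussian (center, peak amplitude, FWHM), that the relative population of a state is its share of the total Gaussian area, and that the reported error is 3 times the sample standard deviation over three fits. All numerical values and function names below are hypothetical.

```python
import math
import statistics


def gaussian_area(amplitude, fwhm):
    """Area under a Gaussian given its peak amplitude and full width at half maximum."""
    return amplitude * fwhm * math.sqrt(math.pi / (4.0 * math.log(2.0)))


def relative_populations(components):
    """Percent population of each state from its share of the total Gaussian area.

    components maps a state name to (center_angstrom, amplitude, fwhm_angstrom).
    """
    areas = {
        state: gaussian_area(amplitude, fwhm)
        for state, (_center, amplitude, fwhm) in components.items()
    }
    total = sum(areas.values())
    return {state: 100.0 * area / total for state, area in areas.items()}


def center_within_tolerance(center, nominal, tolerance):
    """Check that a fitted center stayed within the allowed shift from its initial value."""
    return abs(center - nominal) <= tolerance


# Three hypothetical fits of one distance profile, started from different initial values.
fits = [
    {"curled/tucked": (30.1, 0.10, 3.0), "closed": (33.2, 0.35, 5.0),
     "semi-open": (36.4, 0.40, 4.5), "wide-open": (41.0, 0.15, 4.0)},
    {"curled/tucked": (30.0, 0.11, 3.2), "closed": (33.0, 0.33, 5.2),
     "semi-open": (36.1, 0.42, 4.4), "wide-open": (41.3, 0.14, 4.1)},
    {"curled/tucked": (29.8, 0.09, 3.1), "closed": (33.4, 0.36, 4.8),
     "semi-open": (36.8, 0.39, 4.6), "wide-open": (41.5, 0.16, 3.9)},
]

populations = [relative_populations(fit) for fit in fits]

# Mean population and 3 x STD across the three fits (sample standard deviation assumed).
closed_values = [p["closed"] for p in populations]
closed_mean = statistics.mean(closed_values)
closed_error = 3.0 * statistics.stdev(closed_values)

# Tolerances from the text: closed 33 Å within 0.5 Å, semi-open 36 Å within 1 Å.
centers_ok = all(
    center_within_tolerance(fit["closed"][0], 33.0, 0.5)
    and center_within_tolerance(fit["semi-open"][0], 36.0, 1.0)
    for fit in fits
)

print(f"closed population: {closed_mean:.1f} ± {closed_error:.1f} %")
print(f"centers within tolerance: {centers_ok}")
```

Using the area rather than the peak height matters when widths differ between states: a broad, low peak can carry more population than a narrow, tall one.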