Discovery and Mechanistic Investigation of Piperazinone Phenylalanine Derivatives with Terminal Indole or Benzene Ring as Novel HIV-1 Capsid Modulators

HIV-1 capsid (CA) performs multiple roles in the viral life cycle and is a promising target for antiviral development. In this work, we describe the design, synthesis, assessment of antiviral activity, and mechanistic investigation of 20 piperazinone phenylalanine derivatives with a terminal indole or benzene ring. Among them, F2-7f exhibited moderate anti-HIV-1 activity with an EC50 value of 5.89 μM, which was slightly weaker than the lead compound PF74 (EC50 = 0.75 μM). Interestingly, several compounds showed a preference for HIV-2 inhibitory activity, represented by 7f with an HIV-2 EC50 value of 4.52 μM and nearly 5-fold increased potency over anti-HIV-1 (EC50 = 21.81 μM), equivalent to PF74 (EC50 = 4.16 μM). Furthermore, F2-7f preferred to bind to the CA hexamer rather than to the monomer, similar to PF74, according to surface plasmon resonance results. Molecular dynamics simulation indicated that F2-7f and PF74 bound at the same site. Additionally, we computationally analyzed the ADMET properties for 7f and F2-7f. Based on this analysis, 7f and F2-7f were predicted to have improved drug-like properties and metabolic stability over PF74, and no toxicities were predicted based on the chemotype of 7f and F2-7f. Finally, the experimental metabolic stability results of F2-7f in human liver microsomes and human plasma moderately correlated with our computational prediction. Our findings show that F2-7f is a promising small molecule targeting the HIV-1 CA protein with considerable development potential.


Introduction
Acquired immunodeficiency syndrome (AIDS) is a series of syndromes characterized by T cell immune deficiency caused by human immunodeficiency virus (HIV) infection [1]. Since the first case of AIDS was reported in 1981, approximately 40 million people have died of AIDS worldwide [2]. The rapid spread of AIDS and its high mortality rate threaten human health and social development. HIV comprises two types: HIV-1 is the predominant pathogen, but the danger of HIV-2 infection is increasing, and HIV-2 has been detected globally [3,4]. Current combination antiretroviral therapy (cART) and, in light structure-activity relationships (SARs) were established. We then performed surface plasmon resonance (SPR) and molecular dynamics simulation to investigate the mechanism of action of the representative compounds. Furthermore, the ADMET properties of the representative compounds and PF74 were computationally predicted. Finally, we experimentally assessed the metabolic stability of F2-7f in human liver microsomes (HLMs) and human plasma.
However, the substitution of R3 with Br or OH resulted in a marked decrease in efficacy, as in compounds 7c, F2-7c, 7e, and F2-7e, indicating that a slight change in the substituents targeting the CTD of CA had a considerable effect on the anti-HIV activity of the compounds.
Notably, within each of the two series, the compounds with more fluorine atoms at the terminal ring had the most potent anti-HIV-1 activity, such as F2-7f and F2-7j. It is speculated that the F atom forms a hydrogen bond with the surrounding key amino acid residues. In addition, the anti-HIV activity of series I was generally better than that of series II, indicating that the indole derivatives targeting the NTD-CTD interface were more conducive to antiviral activity than the aniline derivatives. Preliminary SAR analysis of the newly synthesized compounds revealed that the linker of the PF74 derivatives and the substituent components were particularly important for anti-HIV activity, which could serve as guidance for further research exploring the NTD-CTD interface. Overall, the study of these compounds showed that the alteration of substituents had a significant impact on antiviral activity and selectivity, and F2-7f represented a promising lead compound for further optimization.

Surface Plasmon Resonance (SPR) Assay on CA Protein
Next, we selected the representative compound F2-7f for direct binding evaluation with monomeric and hexameric HIV-1 CA protein using SPR. We utilized the previously reported SPR method to test the binding affinity and off-rates of F2-7f to CA monomers and CA hexamers, using the lead compound PF74 as an internal control [30].
The SPR results are shown in Figures 2 and 3 and Table 3, providing evidence for direct binding to both monomeric and hexameric CA proteins. Based on the equilibrium dissociation constant (KD), both compounds preferred to bind to the CA hexamer rather than to the CA monomer. The affinity of PF74 for CA (hexamer: KD = 0.159 ± 0.041 µM; monomer: KD = 3.410 ± 1.310 µM) was greater than that of F2-7f (hexamer: KD = 7.203 ± 1.101 µM; monomer: KD = 16.063 ± 1.316 µM), which was consistent with their relative antiviral potency in vitro (PF74, EC50 = 0.75 ± 0.33 µM; F2-7f, EC50 = 5.89 ± 2.03 µM). In addition, for CA hexamers, the koff value of F2-7f was approximately 8.5 times higher than that of PF74, indicating faster dissociation. A previous study found that an increase in koff correlated with a decrease in antiviral activity [31], which suggests that the lower affinity and faster off-rate of F2-7f for HIV-1 CA might be the reason for its reduced anti-HIV-1 activity. Therefore, the SPR experiments indicated that these newly synthesized compounds could be defined as HIV-1 CA modulators.
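The hexamer preference described above can be made explicit by taking ratios of the reported dissociation constants. The following is a minimal sketch rather than part of the original analysis: it uses only the mean KD values quoted in the text, ignores the reported uncertainties, and the function names are illustrative.

```python
# Mean K_D values (in µM) quoted in the SPR paragraph; the ± errors are ignored here.
KD_UM = {
    "PF74": {"hexamer": 0.159, "monomer": 3.410},
    "F2-7f": {"hexamer": 7.203, "monomer": 16.063},
}


def hexamer_preference(kd):
    """Fold preference for the hexamer: K_D(monomer) / K_D(hexamer)."""
    return kd["monomer"] / kd["hexamer"]


def fold_weaker(compound, reference, form):
    """How many times weaker `compound` binds than `reference` to one CA form."""
    return KD_UM[compound][form] / KD_UM[reference][form]


for name, kd in KD_UM.items():
    print(f"{name}: hexamer preference = {hexamer_preference(kd):.1f}-fold")

print(f"F2-7f vs PF74 (hexamer): {fold_weaker('F2-7f', 'PF74', 'hexamer'):.1f}-fold weaker")
print(f"F2-7f vs PF74 (monomer): {fold_weaker('F2-7f', 'PF74', 'monomer'):.1f}-fold weaker")
```

On these numbers, PF74 favors the hexamer by roughly 21-fold, whereas F2-7f favors it by only about 2.2-fold, so the difference between the two compounds is most pronounced at the hexamer.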

Molecular Dynamics (MD) Simulation with F2-7f-Bound HIV-1 CA Hexamer
The most active molecule, F2-7f, was selected to investigate its binding to the ligand site of HIV-1 CA. Figure 4A shows the root mean square deviation (RMSD) values of HIV-1 CA residues relative to the first frame of the MD simulation. The figure shows that the protein deviated substantially from the X-ray structure and adopted diverse conformations. The root mean square fluctuation (RMSF) of the amino acids was examined to identify the residues that deviated (Figure 4B). Most amino acids deviated substantially from the X-ray structure, indicating that HIV-1 CA adopted multiple conformations during the MD simulation. This deviation suggested that HIV-1 CA had different binding modes with F2-7f. To investigate the potential binding modes of F2-7f at the binding site, the RMSD values of F2-7f were calculated, as shown in Figure 4C. The figure shows that F2-7f deviated from the docked conformation and clustered into different conformations. The MD trajectory was clustered based on F2-7f to explore its interactions with the binding site. The clustering procedure returned 10 clusters, one of which was dominant. Figure 5 shows the representative structure of the most populated cluster. In this representative structure, HIV-1 CA adopted a folded conformation that differed from the X-ray structure, but the binding site was at the same location as in the X-ray structure. The representative structure of the dominant cluster was examined to determine the interactions between HIV-1 CA and F2-7f. Methoxybenzene was embedded between Leu56 and Lys70, where it could be involved in aromatic-aliphatic hydrophobic interactions. In addition, the benzene ring of methoxybenzene could be involved in an ion-induced dipole interaction with the charged nitrogen of Lys70. 3,5-Difluorobenzene was involved in hydrophobic interactions with Ala105. Thr107 formed a hydrogen bond with the oxygen atom of the amide (24.5%), as shown in Figure 5.
Glu71 formed a hydrogen bond with the indole ring of F2-7f in 20% of the MD simulation, which might explain why F2-7f was the most potent compound in this study. F2-7f and PF74 bound at the same site, but the conformation of F2-7f was partially changed relative to that of PF74, resulting in the absence of some key interactions, which might explain its reduced antiviral activity.
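The trajectory metrics used above (RMSD relative to a reference frame, per-atom RMSF, and hydrogen bond occupancy) can be illustrated with a short, dependency-free sketch. This is not the authors' protocol, which is given in the Supplementary Materials. It assumes that the frames have already been superimposed on the reference, applies a simple donor-acceptor distance cutoff of 3.5 Å as the hydrogen bond criterion (an assumption), and uses made-up coordinates and distances purely for demonstration.

```python
import math


def rmsd(frame, reference):
    """RMSD between two equally sized lists of (x, y, z) coordinates (no fitting)."""
    if len(frame) != len(reference):
        raise ValueError("frames must contain the same number of atoms")
    squared = sum(
        (a - b) ** 2 for p, q in zip(frame, reference) for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(frame))


def rmsf(trajectory):
    """Per-atom RMSF about each atom's time-averaged position."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    values = []
    for i in range(n_atoms):
        mean = [sum(f[i][k] for f in trajectory) / n_frames for k in range(3)]
        msf = sum(
            sum((f[i][k] - mean[k]) ** 2 for k in range(3)) for f in trajectory
        ) / n_frames
        values.append(math.sqrt(msf))
    return values


def hbond_occupancy(distances, cutoff=3.5):
    """Fraction of frames whose donor-acceptor distance (Å) is within the cutoff."""
    return sum(d <= cutoff for d in distances) / len(distances)


# Hypothetical toy trajectory: 3 frames, 2 atoms (not data from the study).
toy_trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 4.0)],
]
# Hypothetical per-frame donor-acceptor distances in Å.
toy_distances = [3.0, 4.2, 3.4, 5.0, 2.9]

print(rmsd(toy_trajectory[1], toy_trajectory[0]))
print(rmsf(toy_trajectory))
print(hbond_occupancy(toy_distances))
```

An occupancy such as the 20% reported for the Glu71 hydrogen bond corresponds to the last function returning 0.20 for the relevant donor-acceptor pair, although the exact geometric criterion used in the study may differ from the one assumed here.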

Drug-Like Properties and Metabolic Stability
One of the main drawbacks of PF74 is its low metabolic stability and poor drug-like property profile, which limits its clinical use. Therefore, we evaluated selected compounds from this study for their drug-like properties and metabolic stability and compared them with PF74 (Figure 6). We utilized in silico prediction of drug-like metrics as implemented in the oral non-central nervous system (CNS) drug profile in StarDrop 7 software (Optibrium, Ltd., Cambridge, UK) [32]. This profile consists of several models, and a probabilistic scoring algorithm combines the model predictions in the oral non-CNS drug profile into an overall score. Scores range from 0 to 1, with 0 suggesting extremely non-drug-like and 1 suggesting the perfect drug.
Based on this analysis, F2-7f and especially 7f displayed improved aqueous solubility compared to PF74, as judged by the logS values. This can contribute to improved overall bioavailability. F2-7f and 7f showed improved oral non-CNS drug profile scores, primarily due to improved solubility and low plasma protein binding compared to PF74. The poor metabolic stability of PF74 limits its use in clinical applications [31]. For orally administered drugs, first-pass metabolism in the intestinal wall and in the liver via the portal circulation can limit compound concentrations in the bloodstream. Therefore, we next sought to investigate in silico whether our compounds had improved predicted metabolic stability over PF74. We employed a computational analysis first demonstrated to be an accurate indicator of metabolic stability by the Cocklin group [33–35]. We utilized the P450 module in StarDrop 7 software (Optibrium, Ltd., Cambridge, UK) to predict each compound's major metabolizing cytochrome P450 isoforms using the WhichP450™ model. Subsequently, we predicted the affinity to that isoform using the HYDE function in SeeSAR (BioSolveIT GmbH, Sankt Augustin, Germany) [34,36,37]. The results of this analysis are shown in Figure 7.
The main metabolizing isoform for all compounds, including PF74, is the CYP3A4 isoform [12] (Figure 7A). In addition, F2-7f and 7f were also predicted to be metabolized to a greater extent than PF74 by the 2D6 isoform and were within the high-affinity category for 2D6, according to the analysis in StarDrop.
We next investigated the predicted metabolic lability of our compounds and PF74 with the CYP3A4 isoform by comparing the overall composite site lability (CSL) score and number of labile sites. The CSL score reflects the overall efficiency of metabolism of the molecule by combining the labilities of individual sites within the compound. The number of labile sites did not differ significantly between our compounds and PF74; however, the CSL score indicated greater metabolic stability for PF74 (Figure 7B). The CSL score and number of labile sites assume that all compounds bind with similar affinity to the CYP3A4 isoform, but other factors, such as compound reduction rate and actual binding affinity to the CYP3A4 isoform, can also influence metabolic stability. In addition, intrinsic compound properties, such as size and lipophilicity, can also influence affinity. Therefore, we performed predictive binding affinity calculations using the hydrogen bond and dehydration (HYDE) energy scoring function in SeeSAR 12.1 (BioSolveIT GmbH, Sankt Augustin, Germany) [38] using the structure of human CYP3A4 bound to an inhibitor (PDB ID 4D78) [39]. The HYDE scoring function in SeeSAR provides a range of affinities, including an upper and lower limit. We used the lower limit as the affinity predictor to compare F2-7f, 7f, and PF74 (Figure 7C), which gave predicted affinities of 657 µM for F2-7f, 179 µM for 7f, and 2 nM for PF74. Although F2-7f and 7f had less favorable CSL scores and F2-7f had two labile sites (the terminal methoxy and a carbon between the two F atoms within the difluorobenzene), the much weaker predicted CYP3A4 affinities of F2-7f and 7f might compensate for the higher CSL scores and number of labile sites.
Combining the results from these predictions (CSL scores, labile sites, and predicted CYP3A4 affinity), this analysis indicated that compounds F2-7f and 7f have metabolic stability similar to that of PF74.
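To put the predicted CYP3A4 affinities on a common energy scale, they can be converted to approximate binding free energies with the standard relation ΔG = RT ln(Kd). The sketch below is illustrative only: it treats the HYDE lower-limit estimates quoted above as if they were dissociation constants and assumes a temperature of 298.15 K, neither of which is stated in the text.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)
TEMP_K = 298.15       # assumed temperature; not stated in the text

# Predicted CYP3A4 affinities (lower limit of the HYDE range), converted to mol/L.
PREDICTED_AFFINITY_M = {"F2-7f": 657e-6, "7f": 179e-6, "PF74": 2e-9}


def delta_g(kd_molar, temp_k=TEMP_K):
    """Approximate binding free energy in kcal/mol from a dissociation constant."""
    return R_KCAL * temp_k * math.log(kd_molar)


def fold_weaker(compound, reference="PF74"):
    """Ratio of predicted affinities; values above 1 mean weaker predicted binding."""
    return PREDICTED_AFFINITY_M[compound] / PREDICTED_AFFINITY_M[reference]


for name, kd in PREDICTED_AFFINITY_M.items():
    print(f"{name}: dG = {delta_g(kd):.1f} kcal/mol, "
          f"{fold_weaker(name):.0f}-fold vs PF74")
```

Under these assumptions, the predicted CYP3A4 binding of F2-7f and 7f is weaker than that of PF74 by roughly five orders of magnitude, which is the basis for the compensation argument made above.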

Genotoxicity and Hepatotoxicity
To obtain a toxicity profile for the lead compounds, we included genotoxicity and hepatotoxicity endpoints in our multiparameter optimization for F2-7f, 7f, and PF74 using the Derek Nexus module within StarDrop 7 software. Derek Nexus utilizes a knowledge- and rule-based expert system for semi-quantitative estimations of DNA-reactive moieties within molecules. Neither the lead compounds nor PF74 showed any concerning likelihood of genotoxicity or hepatotoxicity (Figure 8). To evaluate the accuracy of the in silico prediction, we included positive controls: ethyl methanesulfonate (EMS) [40] and lumiracoxib [41], which are known to have in vivo genotoxic and hepatotoxic effects, respectively.

Metabolic Stability in the Presence of Human Liver Microsomes and Human Plasma
Equipped with the computational predictions, we next performed metabolic stability assays in human liver microsomes (HLMs) and human plasma. First, testosterone, diclofenac, and propafenone, which have moderate metabolic stability, were selected as control drugs, and we tested the metabolic stability of F2-7f and PF74 in HLMs. As shown in Table 4, F2-7f and PF74 were both rapidly metabolized in HLMs, each with a half-life of 0.5 min. The intrinsic clearance (CLint) of F2-7f was slightly lower than that of PF74 (2759.1 and 2862.5 µL/min/mg, respectively). The results of the human plasma stability assay for F2-7f and PF74 are shown in Table 5. After 120 min of incubation, the residual amount of the original F2-7f decreased to 86.9%, and the residual amount of the original PF74 decreased to 85.2%, indicating that the stability of F2-7f in human plasma was slightly improved compared to that of PF74. This concurred with the lower plasma protein binding prediction probability for F2-7f (PPB90, Figure 6C). Overall, the experimental data moderately correlated with the computational prediction, and improving the metabolic stability of PF74-like small molecules remains an urgent issue for future optimization efforts.
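The relationship between the reported half-lives and intrinsic clearance values can be checked with the standard substrate-depletion relation CLint = (ln 2 / t1/2) / [microsomal protein]. The sketch below is an illustration, not the authors' calculation. It assumes first-order decay and a microsomal protein concentration of 0.5 mg/mL, which is not stated in this section and was chosen because it is a common assay condition; the actual conditions are in the Supplementary Materials.

```python
import math

ASSUMED_PROTEIN_MG_PER_ML = 0.5  # assumption; not given in the text


def clint_from_half_life(t_half_min, protein_mg_per_ml=ASSUMED_PROTEIN_MG_PER_ML):
    """Intrinsic clearance (µL/min/mg) from a first-order half-life in minutes."""
    k = math.log(2) / t_half_min          # elimination rate constant, 1/min
    return k * 1000.0 / protein_mg_per_ml  # 1000 µL per mL of incubation


def half_life_from_clint(clint, protein_mg_per_ml=ASSUMED_PROTEIN_MG_PER_ML):
    """Half-life (min) implied by an intrinsic clearance in µL/min/mg."""
    k = clint * protein_mg_per_ml / 1000.0
    return math.log(2) / k


def half_life_from_remaining(fraction_remaining, minutes):
    """Half-life (min) implied by the fraction remaining, assuming first-order decay."""
    return -math.log(2) * minutes / math.log(fraction_remaining)


print(clint_from_half_life(0.5))              # HLM, t1/2 = 0.5 min
print(half_life_from_clint(2759.1))           # F2-7f in HLMs
print(half_life_from_clint(2862.5))           # PF74 in HLMs
print(half_life_from_remaining(0.869, 120))   # F2-7f in plasma
print(half_life_from_remaining(0.852, 120))   # PF74 in plasma
```

With the assumed protein concentration, both reported CLint values correspond to half-lives that round to 0.5 min, consistent with Table 4 as described. The plasma figures imply half-lives of several hours for both compounds if decay is taken to be first-order.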

Chemistry
All melting points (mp) of the new compounds were determined on a micro melting point apparatus. 1H NMR and 13C NMR spectra were obtained in DMSO-d6 on a Bruker AV-400 spectrometer or Bruker AV-600 spectrometer using tetramethylsilane (TMS) as the internal reference. Chemical shifts were reported in δ values (ppm) and J values were expressed in hertz (Hz). Thin layer chromatography (TLC) for monitoring reactions or purifying products was performed on silica gel GF254 or Huanghai HSGF254, 0.15-0.2 mm, respectively. Spots were visualized with iodine vapor or by irradiation with UV light (λ = 254 nm or λ = 365 nm). Mass spectrometry (MS) was carried out using a Standard G1313A LC autosampler instrument. Flash column chromatography was performed on a column packed with silica gel 60 (200-300 mesh). Solvents were of reagent grade and, if needed, were purified and dried using standard methods. Rotary evaporators were used to concentrate the reaction solutions under reduced pressure. The solvents and reagents, including dichloromethane, TEA, and methanol, were obtained from Sinopharm Chemical Reagent Co., Ltd. (SCRC, Shanghai, China) and were of AR grade. The key reactants, including 4-methoxy-N-methylaniline (CAS: 5961-59-1), N-(tert-butoxycarbonyl)-L-phenylalanine (CAS: 13734-34-4), and N-(tert-butoxycarbonyl)-3,5-difluoro-L-phenylalanine (CAS: 205445-52-9), were purchased from Shanghai Haohong Scientific Co., Ltd. (Shanghai, China). The purity of all target compounds was analyzed by high-performance liquid chromatography (HPLC) and was >95%.

Procedure for the Synthesis of Intermediates
The procedure for the synthesis of key intermediates is provided in the Supplementary Materials [29,30].

General Procedure for the Synthesis of 7 (a-f) and F2-7 (a-f)
The corresponding substituted indoleacetic acids (1.2 eq.) were first dissolved in 20 mL dichloromethane (DCM), and then HATU (1.5 eq.) was added. The reaction mixture was stirred at 0 °C for 0.5 h. Then, the key intermediate 6 or F2-6 (1 eq.) was added dropwise to the solution, and DIEA (2 eq.) was added. The reaction mixture was allowed to return to room temperature and kept there for 3 h. After monitoring the reaction by TLC, the excess solvent was evaporated under reduced pressure, and the residue was redissolved in water and extracted with DCM (3 × 20 mL). Subsequently, the organic layers were combined and washed with saturated sodium bicarbonate (3 × 20 mL), dried over anhydrous Na2SO4, filtered, and concentrated under reduced pressure to obtain the corresponding crude products. These products were purified by flash column chromatography to provide target compounds 7 (a-f) and F2-7 (a-f).

General Procedure for the Synthesis of 7 (g-j) and F2-7 (g-j)
The corresponding substituted indoleacetic acids (1.2 eq.) were first dissolved in 20 mL DCM, and then HATU (1.5 eq.) was added. The reaction mixture was stirred at 0 °C for 0.5 h. Then, the key intermediate 6 or F2-6 (1 eq.) was added dropwise to the solution, and DIEA (2 eq.) was added. The reaction mixture was allowed to return to room temperature and kept there for 3 h. After monitoring the reaction by TLC, the excess solvent was evaporated under reduced pressure, and the residue was redissolved in water and extracted with DCM (3 × 20 mL). Subsequently, the organic layers were combined and washed with saturated sodium bicarbonate (3 × 20 mL), dried over anhydrous Na2SO4, filtered, and concentrated under reduced pressure to obtain the corresponding crude products. These products were purified by flash column chromatography to provide target compounds 7 (g-j) and F2-7 (g-j).

In Vitro Anti-HIV Assay with MT-4 Cells
The protocol for the in vitro anti-HIV assay with MT-4 cells is provided in the Supplementary Materials.

Analysis of Binding to HIV-1 CA Proteins via Surface Plasmon Resonance
The protocol for analysis of binding to CA proteins via surface plasmon resonance is provided in the Supplementary Materials.

Molecular Dynamics Simulation
The protocol for the molecular dynamics simulation is provided in the Supplementary Materials.

In Silico ADMET Analysis
The protocol for the in silico ADMET analysis is provided in the Supplementary Materials.

Metabolic Stability in Human Liver Microsomes
The protocol for the metabolic stability assay using human liver microsomes is provided in the Supplementary Materials.

Metabolic Stability in Human Plasma
The protocol for the metabolic stability assay using human plasma is provided in the Supplementary Materials.

Conclusions
The multiple key roles of HIV-1 CA in the viral life cycle make it a novel and attractive target for drug design. However, due to the structural plasticity of CA, especially between interprotomer pockets, rational drug design and optimization remain challenging. To address this issue, we thoroughly explored the crystallographic information available for the PF74-CA complex, focusing on key residues Gln67, Glu71, Tyr169, and Lys182 of the NTD-CTD interface as new binding sites for structural optimization to obtain compounds with improved antiviral activity and metabolic stability. In this study, we designed and synthesized 20 piperazinone phenylalanine derivatives with a terminal indole or benzene ring, preliminarily verifying the HIV-1 CA protein as the target of these compounds via SPR and MD simulation. According to our SAR studies, compounds with higher numbers of fluorine atoms at the terminal benzene of the two series showed the most potent anti-HIV-1 activity, as shown by F2-7f (EC50 = 5.89 µM) and F2-7j (EC50 = 13.74 µM). Mechanistically, our MD simulation results were consistent with this finding, as compounds with terminal F atoms, such as F2-7f, had a 20% probability of forming a hydrogen bond with key residue Glu71. However, we acknowledge that the terminal electron-withdrawing F atoms introduce metabolic lability at the benzene ring, as our in silico analysis predicted.
Nevertheless, the anti-HIV-1 activities of the newly synthesized compounds were lower than that of PF74 (EC50 = 0.75 µM), which might be due to a partial change in their binding modes to CA, resulting in the loss of two key hydrogen bonds between the compound and Asn57. Therefore, in our follow-up work, we will employ a conformational restriction strategy to preserve the conformation of the phenylalanine core and thereby maintain its critical interactions. Notably, the newly synthesized compounds displayed significant anti-HIV-2 activity, with compound 7f (EC50 = 4.52 µM) comparable to PF74 (EC50 = 4.16 µM).
We used a previously reported workflow to predict the ADMET properties of representative compounds 7f and F2-7f. Compared with PF74, F2-7f exhibited an improved oral non-CNS drug profile score but no significant improvement in metabolic stability. Moreover, the knowledge-based in silico prediction indicated that the compounds should not induce genotoxicity or hepatotoxicity. Finally, we experimentally verified the in silico metabolic stability results using HLM and human plasma stability assays. Although lacking significant metabolic improvements, F2-7f represents a new chemotype with the potential for further modifications to improve anti-HIV potency and metabolic stability.