Virtual Screening Strategy to Identify Retinoic Acid-Related Orphan Receptor γt Modulators

Molecular docking is a key method used in virtual screening (VS) campaigns to identify small-molecule ligands for drug discovery targets. While docking provides a tangible way to understand and predict the protein-ligand complex formation, the docking algorithms are often unable to separate active ligands from inactive molecules in practical VS usage. Here, a novel docking and shape-focused pharmacophore VS protocol is demonstrated for facilitating effective hit discovery using retinoic acid receptor-related orphan receptor gamma t (RORγt) as a case study. RORγt is a prospective target for treating inflammatory diseases such as psoriasis and multiple sclerosis. First, a commercial molecular database was flexibly docked. Second, the alternative docking poses were rescored against the shape/electrostatic potential of negative image-based (NIB) models that mirror the target’s binding cavity. The compositions of the NIB models were optimized via iterative trimming and benchmarking using a greedy search-driven algorithm or brute force NIB optimization. Third, a pharmacophore point-based filtering was performed to focus the hit identification on the known RORγt activity hotspots. Fourth, free energy binding affinity evaluation was performed on the remaining molecules. Finally, twenty-eight compounds were selected for in vitro testing and eight compounds were determined to be low μM range RORγt inhibitors, thereby showing that the introduced VS protocol generated an effective hit rate of ~29%.


Introduction
Molecular docking has an established role in protein structure-based drug discovery. However, often default scoring functions cannot identify the bioactive binding poses and therefore cannot successfully identify the active ligands in virtual screening (VS). Unfortunately, this happens even when the docking sampling has worked out satisfactorily [1][2][3]. A straightforward way to overcome this persistent docking scoring problem would be to rescore the already generated docking poses using a fast and accurate scoring method.
Negative image-based rescoring (R-NiB) is a protein cavity-based docking rescoring methodology that ranks docked compounds based on their shape and the electrostatic potential (ESP) complementarity with the protein's binding cavity [4]. For this purpose, mirror images of the protein's binding cavity known as negative image-based (NIB) models are generated using the cavity detection and filling software PANTHER [5]. The cavitybased docking rescoring, which is performed with the ultrafast similarity comparison algorithm ShaEP [6], has been shown to improve the yields with several docking algorithms allows it to bind a co-repressor peptide instead of the co-activator. Co-repressor binding modulates the RORγt function in a manner that the physiological effects are the opposite of the agonist-bound RORγt. Consequently, the genes responsible for inflammation are downregulated, and anti-inflammatory genes are upregulated in the anti-inflammatory effect [18,27,30,31]. ment mainly of psoriasis with RORγt modulators have emerged, highlighting the poten-tial and growing demand of effective RORγt inhibitors for therapeutic human use [27][28][29].
RORγt-mediated inflammatory response initiates when an agonist is bound into the ligand-binding domain (LBD; Figure 1, left), stabilizing helix 11 positioning. This leads to a conformational change of the adjacent helix 12 that further stabilizes the allosteric binding site (ABS) and enables binding of a co-activator peptide. The activated RORγt shifts to the cell nucleus and binds as a monomer to the specific DNA sequences activating gene transcription responsible for inflammation. Antagonists block RORγt activation by preventing co-activator binding to the ABS. Inverse agonists stabilize ABS into a conformation that allows it to bind a co-repressor peptide instead of the co-activator. Co-repressor binding modulates the RORγt function in a manner that the physiological effects are the opposite of the agonist-bound RORγt. Consequently, the genes responsible for inflammation are downregulated, and anti-inflammatory genes are upregulated in the anti-inflammatory effect [18,27,30,31]. On the left is a 3D-structure of the ligand-binding domain of retinoic acid receptor-related orphan receptor gamma t (RORγt; PDB: 5NTW, cartoon representation) with a co-crystallized inverse agonist (CPK model with green carbons). On the right, the H-bond (orange dashed line) formed between His479 and Tyr502 (magenta and pink stick carbons, respectively) in helices 11 and 12 (magenta and pink cartoons, respectively) is shown. The atom coloring for O, N, S, and F is set to red, blue, yellow, and light green, respectively. Water (WAT) is shown as a red sphere.
Modulation of the hydrogen bonding (H-bonding) between His479 in helix 11 and Tyr502 in helix 12 is crucial for determining whether the helix 12 conformation favors either co-activator or co-repressor binding [32]. Some inverse agonists facilitate co-repressor binding by preventing the inter-helix H-bond formation via steric obstruction. Alternatively, as a basis for inverse agonism, the movement of helix 12 has also been suggested to follow the release of a water molecule that is trapped into an unfavorably hydrophobic niche (Figure 1, right) [32].
Both structure-based and ligand-based VS approaches have been successfully employed to identify novel compounds for targeting RORγt in the literature [33][34][35][36][37][38][39]. Initial hits have demonstrated micromolar activities in the AlphaScreen assay and a cell-based reporter gene assay and, after optimization, a few compounds have displayed signifi- Figure 1. Retinoic acid receptor-related orphan receptor gamma t ligand binding domain. On the left is a 3D-structure of the ligand-binding domain of retinoic acid receptor-related orphan receptor gamma t (RORγt; PDB: 5NTW, cartoon representation) with a co-crystallized inverse agonist (CPK model with green carbons). On the right, the H-bond (orange dashed line) formed between His479 and Tyr502 (magenta and pink stick carbons, respectively) in helices 11 and 12 (magenta and pink cartoons, respectively) is shown. The atom coloring for O, N, S, and F is set to red, blue, yellow, and light green, respectively. Water (WAT) is shown as a red sphere.
Modulation of the hydrogen bonding (H-bonding) between His479 in helix 11 and Tyr502 in helix 12 is crucial for determining whether the helix 12 conformation favors either co-activator or co-repressor binding [32]. Some inverse agonists facilitate co-repressor binding by preventing the inter-helix H-bond formation via steric obstruction. Alternatively, as a basis for inverse agonism, the movement of helix 12 has also been suggested to follow the release of a water molecule that is trapped into an unfavorably hydrophobic niche (Figure 1, right) [32].
Both structure-based and ligand-based VS approaches have been successfully employed to identify novel compounds for targeting RORγt in the literature [33][34][35][36][37][38][39]. Initial hits have demonstrated micromolar activities in the AlphaScreen assay and a cell-based reporter gene assay and, after optimization, a few compounds have displayed significantly enhanced RORγt inhibition with half maximal inhibitory concentration (IC 50 ) values at the nanomolar level [35,36]. Previously, NIB screening was used to discover novel RORγt ligands, where pIC 50 values ranged from 4.9 (11 µM) to 6.2 (590 nM) [37]. Molecular docking and/or molecular dynamics simulations have been successfully used to explore how agonist and inverse agonist compounds bind with the RORγt LBD [27,30,40,41]. Likewise, further optimization of absorption, distribution, metabolism, and excretion (ADME) and physicochemical properties with guidance from molecular simulations and modeling have assisted in improving compounds to achieve the level and duration of target engagement required for efficacy in the clinic [33].
Overall, these prior results provide a promising starting point for discovering new potent small-molecule RORγt inhibitors using the novel VS protocol pioneered in this study.

Shape-Focused Pha Modeling and Docking Rescoring
Two NIB models of the RORγt binding cavity (Models Ia and IIa; Figure 2) were generated with PANTHER [5] and optimized using the BR-NiB method [13] before their docking rescoring usage ( Figure 2). ment required for efficacy in the clinic [33].
Overall, these prior results provide a promising starting point for discovering new potent small-molecule RORγt inhibitors using the novel VS protocol pioneered in this study.

Shape-Focused Pha Modeling and Docking Rescoring
Two NIB models of the RORγt binding cavity (Models Ia and IIa; Figure 2) were generated with PANTHER [5] and optimized using the BR-NiB method [13] before their docking rescoring usage ( Figure 2).

Figure 2.
Negative image-based models of the RORγt binding cavity. The original cavity-filling negative image-based (NIB) models, Model Ia (PDB: 5NTW [32], cyan transparent surface) and Model IIa (PDB: 5VB6 [42]; yellow transparent surface), were trimmed with brute force negative image-based optimization (BR-NiB) using the complete set of active ligands from the ChEMBL database yielding Models Ib and IIb, respectively. Alternatively, the models were also optimized using only a subset of the ligands yielding to Models Ic and IIc (transparent surfaces with electrostatic potential). The protein 3D structures (blue/orange cartoon models) used in the NIB model generation with PANTHER are shown in complex with the co-crystallized ligands (CPK models with green/magenta carbons). The NIB models (transparent surfaces) are composed of neutral (C; gray), negatively charged (O; red), and positively charged (N; blue) cavity atoms. The  [32], cyan transparent surface) and Model IIa (PDB: 5VB6 [42]; yellow transparent surface), were trimmed with brute force negative imagebased optimization (BR-NiB) using the complete set of active ligands from the ChEMBL database yielding Models Ib and IIb, respectively. Alternatively, the models were also optimized using only a subset of the ligands yielding to Models Ic and IIc (transparent surfaces with electrostatic potential). The protein 3D structures (blue/orange cartoon models) used in the NIB model generation with PANTHER are shown in complex with the co-crystallized ligands (CPK models with green/magenta carbons). The NIB models (transparent surfaces) are composed of neutral (C; gray), negatively charged (O; red), and positively charged (N; blue) cavity atoms. The projected H-bonds (orange dotted lines) of these cavity atoms with residues (cyan/orange stick models) lining the binding cavity are shown. Hydrogens are omitted for clarity.
Out of the box, in the PANTHER-generated NIB models there is typically an excess of the cavity atoms that decrease their ability to find active ligands in docking rescoring usage ( Figure 2). In BR-NiB, the effect of each cavity atom forming the NIB model, whether charged (N/O) or neutral (C), is subjected to systematic evaluation and editing in an automated greedy search [13]. The effect of each cavity atom or point in the NIB model is analyzed by removing them one by one and then testing the docking rescoring ability of each new model variant. This iterative process continues from generation to generation until the enrichment evaluation of the model does not show improvement. The numerical analysis of each model was done using Boltzmann-Enhanced Discrimination of Receiver Operating Characteristic with alpha value 20 (BEDROC20) [15,43] as the target metric.
In theory, the optimized NIB models include key features from both the active ligands included in the training set and the inverted cavity itself. Moreover, the optimized NIB models should perform better in recognizing active ligands from inactive decoys than the non-optimized NIB models as suggested by the training set performance. However, because the training set composition affects the model optimization, both a full set and a subset containing only the lowest-ranked active RORγt compounds from the ChEMBL database [44][45][46] were used in the NIB model optimization. This dual approach was done to potentially avoid limiting the ability of the optimized models to recognize chemically diverse RORγt inhibitors in docking-based VS.
The BR-NiB processing significantly improved the effectiveness of the NIB models in docking rescoring usage ( Figure 3). The BR-NiB-optimized models, describing the optimal shape/ESP features for filtering active RORγt ligands, were used to rescore the flexible docking poses of SPECS compounds. In practical terms, the docked SPECS compounds play a dual role in this study, first, they act as the decoy molecules in the NIB model optimization, and, second, the best ranked compounds were considered VS hits.

Pharmacophore Filtering Focuses Compound Selection to Inverse Agonists
Five important binding regions in the LBD of RORγt were identified ( Figure 4) based on a literature search [30,32] and visualization of available ligand-bound RORγt X-ray crystal structures.
The first two regions are important for the stabilization of inverse agonist-binding conformation: (1) Two PHA points were defined at sites occupied by His479 and Tyr502, which can act as either H-bond acceptors or donors, and accordingly, non-hydrogen atoms (N, O, S, F, Cl, Br, or I) were sought; (2) Leu324 and Phe388 are important for forming hydrophobic or stacking effects with the binding ligand [30], and therefore, compounds not possessing any aromatic atom within 4.5 Å of Leu324 or Phe388 were excluded. Similarly, the third region (3) included Phe378 and compounds not possessing any aromatic atom within 5.2 Å of the residue were excluded. The fourth region (4) included His323 and the structural water molecule (Wat770), which can be either H-bond acceptors or donors [32,48]. Hence, the PHA filter points to exclude the compounds that did not show a possible H-bonding atom (simplified as N, O, S, F, Cl, Br, or I) within 4.0 Å of His323 or Wat770 were generated. Additionally, hydrophobic Phe377 sterically limits the fourth region. The fifth region (5) included Arg367, which acts as an H-bond donor. Therefore, a PHA filter point requiring an H-bond acceptor (demonstrated as heavy atoms N, O, S, F, Cl, Br, or I) was used.
The PHA filtering, performed with SDFCONF [49], was fine-tuned by adjusting the optimal radii for the PHA points, which would discard as many decoy compounds as possible with a relatively low number of active compounds being discarded. Eventually, PHA filtering radii were determined to be 4.0 Å for region 1, 4.5 Å for region 2, 5.2 Å for region 3, 4.0 Å for region 4, and 4.0 Å for region 5. With these optimized settings (Figure 4), the PHA filtering excluded 83% of the SPECS compounds (with unknown activity) while only 37% of known active compounds were discarded. The discarded active compounds were mostly of low activity, suggesting that the combined BR-NiB and PHA filtering could assist in discovering compounds with a RORγt inverse agonism profile.

Selecting Compounds for Experimental Testing
First, the top 1% of the best-ranked SPECS compounds from BR-NiB (Models Ib, c and IIb, c; Figure 2) were selected for further consideration. Next, the molecules were screened with the PHA points defined from the RORγt ligand binding site using SDFCONF [49] ( Figure 4). Third, the compounds were filtered by logP (< 5.5), and to avoid unspecific binding, the potential PAINS compounds were filtered out using Canvas. Fourth, Gibb's free energy of binding was estimated with MM/GBSA calculations for the remaining compounds, discarding compounds with predicted binding energy > −95 kcal/mol. Finally, the binding complexes were evaluated visually to choose the most promising molecules for in vitro testing. In the manual selection of the compounds, special attention was paid to the diversity of their scaffold or core structures. Favorable steric packing at the binding cavity, as well as the number and quality of the formed interactions, were regarded to be vital. In particular, the disruption of the H-bond formation between His479 and Tyr502 was considered important, as this is favorable for inverse agonism [30]. Altogether, twenty-eight compounds were selected for experimental testing ( Figure 5).  1) The SPECS 10 mg compound library set, which is composed of one hundred sixty-nine thousand and one hundred compounds, was flexibly docked and generated one million eight hundred seventy thousand and seventy entries for both protein structures. This number includes alternative tautomers, enantiomers, and ten docking poses for each compound referred here as duplicates (*).
(2) After BR-NiB processing, one thousand six hundred and ninety-one top-ranked SPECS compounds were extracted for each optimized model (4 × 1691), yielding six thousand seven hundred and sixty-four entries. This number again includes duplicates. (3) Point-based pharmacophore and PAINS filtering was performed for the remaining compounds and, Figure 5. The docking and shape-focused filtering protocol used in virtual screening. (1) The SPECS 10 mg compound library set, which is composed of one hundred sixty-nine thousand and one hundred compounds, was flexibly docked and generated one million eight hundred seventy thousand and seventy entries for both protein structures. This number includes alternative tautomers, enantiomers, and ten docking poses for each compound referred here as duplicates (*).
(2) After BR-NiB processing, one thousand six hundred and ninety-one top-ranked SPECS compounds were extracted for each optimized model (4 × 1691), yielding six thousand seven hundred and sixty-four entries. This number again includes duplicates. (3) Point-based pharmacophore and PAINS filtering was performed for the remaining compounds and, furthermore, logP filtering was done for the remaining compounds, yielding one thousand and seventy-nine entries with duplicates. (4) Next, one hundred top-ranked compounds were obtained based on the binding free energy calculations for all remaining compounds. (5) Twenty-eight compounds were selected for purchase after visual inspection. (6) Eight of the in vitro tested compounds were determined active at the low-micromolar range, yielding a hit rate of 29%.

Novel Virtual Screening Strategy Yielded an Effective Hit Rate of 29%
The full RORγt inverse agonists appear to provide the most significant anti-inflammatory effect [27][28][29], and therefore, the discovery focused on finding potent novel full RORγt inverse agonists. A total of ten compounds (9, 10, 13-15, 19, 21, and 24-26; Figures 6 and 7) were identified to inhibit RORγt at some level. The estimated IC 50 values ranged from 1.1 µM to 10.7 µM (Table 1; Figure 6). Compounds 19 and 9 were the most potent inverse agonists of RORγt with an IC 50 of 1.1 µM and 1.7 µM, respectively (Table 1). For compounds 10 and 21, the reported IC 50 values should be considered only rough estimates, as the minimum signal intensity was not properly reached within the measured concentration range. Compounds 2 and 17 were non-soluble despite their predicted logP values indicating sufficient solubility (1.69 and 3.22 for compound 2 and 2.93 and 4.75 for compound 17, as reported by the vendor or predicted by QikProp, respectively). The IC 50 value of ursolic acid (positive control) was 185 nM, which was determined as an average of four duplicate measurements. Omitting one duplicate measurement with a high error provided an IC 50 value of 157 nM, which is similar to previously reported values ( Figure 6) [36,37]. furthermore, logP filtering was done for the remaining compounds, yielding one thousand and seventy-nine entries with duplicates. (4) Next, one hundred top-ranked compounds were obtained based on the binding free energy calculations for all remaining compounds. (5) Twenty-eight compounds were selected for purchase after visual inspection. (6) Eight of the in vitro tested compounds were determined active at the low-micromolar range, yielding a hit rate of 29%.

Novel Virtual Screening Strategy Yielded an Effective Hit Rate of 29%
The full RORγt inverse agonists appear to provide the most significant anti-inflammatory effect [27][28][29], and therefore, the discovery focused on finding potent novel full RORγt inverse agonists. A total of ten compounds (9, 10, 13-15, 19, 21, and 24-26; Figures  6 and 7) were identified to inhibit RORγt at some level. The estimated IC50 values ranged from 1.1 μM to 10.7 μM (Table 1; Figure 6). Compounds 19 and 9 were the most potent inverse agonists of RORγt with an IC50 of 1.1 μM and 1.7 μM, respectively (Table 1)   Structural similarities between the active hit compounds and previously known RORγt inverse agonists or antagonists were studied by 2D molecular fingerprint comparisons (Table 1). Compound 24 was the only compound with a maximum Tanimoto coefficient > 0.2 due to the (7-methyl-3-oxo-5H- [1,3] thiazolo[3,2-a] pyrimidin-2(3H)ylidene)methyl core that was also present within known ligands. Compound 19 contains a 4-(sulfonyl-phenyl)-quinolinecarboxamide, which was also detected in the set of known ligands, resulting in a slightly elevated maximum Tanimoto coefficient (0.171) for the most active VS hit. No close matches were found for compound 9, which was the second most potent VS hit, although the 3-[(2-chlorobenzyl)oxy] benzaldehyde moiety of 9 had structural analogs among known ligands. Overall, most novel hit compounds had low Tanimoto coefficients (<0.15) compared to known RORγt ligands, indicating low structural similarity with previously known RORγt modulators.   Physicochemical and ADMET properties were calculated for the eight compounds that had well-defined experimental IC 50 values (Table 2). Generally, the active hit compounds displayed drug-like properties as evaluated by the Lipinski's rule of five [50]. All of the compounds had a single violation of the rule of five due to molecular weight moderately exceeding 500 Da. A second violation was caused by predicted logP > 5.0 for four compounds. For compounds 13 and 14, QikProp predicted slightly higher logP values than Canvas. Compounds 9 and 19, the two most active VS hits, contained no potentially reactive groups that could cause toxicity in vivo, and displayed sufficient properties related to human oral absorption and cell permeability. The ADMET predictions together with the molecular similarity analysis encourages the use of these compounds as starting points for the design of structurally novel and potent RORγt inverse agonists. Both structure-based and ligand-based VS approaches have been successfully employed to identify novel compounds for RORγt [33][34][35][36][37][38][39]. For example, we have screened RORγt using molecular docking and NIB screening [37]. Both VS techniques mostly suggested the same compounds for experimental testing, supporting the viability of the in silico methods in molecular discovery. In Rauhamäki et al. eleven of the thirty-four experimentally tested molecules were verified as RORγt inverse agonists, yielding an effective hit rate of 32% [37], matching the results of this study. One of the first studies to use VS to find novel RORγt modulators was performed by Damm-Ganamet et al. using homology models of the LBD of RORγt in VS, and RORα co-crystal structures in recognizing important interactions across various chemical series. Of the one thousand seven hundred and fifty-seven hits prioritized by the VS, sixty-five diverse hits were active in experimental testing, resulting in a hit rate of 4.1% [39]. Zhang et al. used molecular docking to identify novel RORγt compounds and demonstrated that thirteen of the twenty-four tested molecules showed inhibitory activity [36]. Similarly, Song et al. utilized molecular docking and 3D shape similarity searching and identified eight of the twenty-eight tested compounds as RORγt inverse agonists [35]. In both studies, a relatively high concentration of 50 µM was used in the AlphaScreen assay, and the activity limit was set to 50% [36] or 35% [35] inhibition. In the study by Tan et al. two of the fifteen compounds suggested by molecular docking had therapeutic effects as shown in experimental studies [34]. Lugar et al. used VS to identify a promising scaffold for a starting point for the design of potent RORγ inhibitors [33]. Recently, Wu et al. utilized a new type of network-based VS to study RORγt modulators. Seven of the seventy-two compounds were confirmed to be low micromolar (IC 50 from 0.1 µM to 4.97 µM) RORγt inverse agonists by in vitro experiments, resulting in a hit rate of 9.7% [38]. These studies show that VS can be used both successfully and diversely to identify novel RORγt compounds. Furthermore, a comparison to other VS studies using RORγt as a target shows that our methodology yields excellent hit rates.

Possible Binding Modes of the Most Active Inverse Agonists 9 and 19
The predicted binding modes of the active compounds suggest many common features among them. Both compounds 9 and 19 bind across the entire cavity of the ligand-binding pocket (Figure 8). Morpholine groups of 9 accept H-bonds from Arg367 and Glu379. Similarly, a methoxy group of 19 accepts an H-bond from Arg367 and in the middle of 19 the amide group donates an H-bond to the main chain carbonyl oxygen of Phe377. Likewise, both form an H-bond with His479; in 9 an ether group accepts an H-bond from the epsilon position nitrogen of His479 and in 19 an isoxazole accepts an H-bond from the delta position nitrogen of His479. Thus, the length of the whole cavity is covered by H-bonds that stabilize the compounds to their places.
There are several favorable hydrophobic interactions between the amino acids lining the binding cavity and the active compounds. The two aromatic ring systems of 9 are packed against several residues: Trp317, Cys320, Leu324, Met365, Phe378, Phe388, Leu391, Leu396, Ile397, Ile400, Leu483, and Tyr502. In 19, the aromatic ring is packed with Leu324, Met365, Val376, Phe378, Phe388, and Phe401. Additionally, the anisole and 3-methylquinoline are surrounded by Leu287, Leu292, Ala327, Val361, Met365, Ala368, and Phe377. Furthermore, the 5-methyl in the isoxazole group contributes to the binding by interacting with Trp317, Cys393, Leu396, and Ile397. Furthermore, the shape of compounds 9 and 19 matches closely the reciprocal shape of the RORγt binding cavity (Figure 8). However, there are no obvious H-bonding partners for the nitrogen-rich system at the core of 9 and the sulfonamide group of 19 in the predicted binding pose. It is possible that careful modification in these areas could improve their binding with RORγt even further. Amino acids surrounding the nitrogen linker in 9 and the sulfonamide in 19 are hydrophobic, and thus, the introduction of hydrophobic moieties at these positions may be worthwhile.
binding pocket (Figure 8). Morpholine groups of 9 accept H-bonds from Arg367 and Glu379. Similarly, a methoxy group of 19 accepts an H-bond from Arg367 and in the middle of 19 the amide group donates an H-bond to the main chain carbonyl oxygen of Phe377. Likewise, both form an H-bond with His479; in 9 an ether group accepts an H-bond from the epsilon position nitrogen of His479 and in 19 an isoxazole accepts an H-bond from the delta position nitrogen of His479. Thus, the length of the whole cavity is covered by Hbonds that stabilize the compounds to their places. Figure 8. The proposed binding modes of compounds 9 and 19 with RORγt. On the top, both compounds 9 and 19 (ball-and-stick model with magenta/green carbons) show a high level of shape similarity with the BR-NiB-optimized Model Ib (transparent surface with electrostatic potential). Below, both of these compounds (gray shape-filling surface shown in the background) are also predicted to form H-bonds (orange dotted lines) with the key residues lining the RORγt binding cavity. Below, both of these compounds (gray shape-filling surface shown in the background) are also predicted to form H-bonds (orange dotted lines) with the key residues lining the RORγt binding cavity. For clarity, all hydrogens are omitted except for non-polar hydrogens of the compounds. See Figure 2 for further interpretation.
The notable difference between compounds 9 and 19 is that the chlorophenyl ring in 9 protrudes between the His479-Tyr502 pair and disrupts their H-bonding. The His479 is forced to adopt a different side chain conformation, which is stabilized by the formation of either an H-bond with the ether group of 9 or the backbone carbonyl of Leu475. The disruption of the His479-Tyr502 interaction and loss of the aromatic ring stacking of His479 with Phe506 could decrease the stability of the helix 12 active conformation. However, the energy loss of losing the aromatic ring stacking can be at least partly covered by hydrophobic interactions formed when the His479 packs between Met358 and Leu396. The predicted binding mode of 19 showed a smaller conformational change in contrast to the complete re-orientation of His479 with 9. Nevertheless, with 19, the His479 side chain adopted a flipped orientation that was less optimal for the formation of the His479-Tyr502 H-bond (Figure 1).

Protein Structures and Preparation
RORγt structures with a full orthosteric inverse agonist bound were obtained from the Protein Data Bank (PDB; https://www.rcsb.org/; obtained 1 June 2019) [51,52]. The structures were superimposed using VERTAA in the BODIL molecular modeling environment [53], and the validity of side-chain conformations in the ligand-binding site was checked against the electron densities with COOT (version 0.8.9.1) [48]. Two protein 3D structures were selected for VS (PDB IDs: 5NTW [32] and 5VB6 [42]). While 5NTW is a good representative of most inverse agonist bound RORγt structures, 5VB6 presents an exceptional conformation state of the LBD. This unique state is explained by the crystallization process, during which RORγt was covalently tethered to a cofactor peptide stabilizing its structure thermodynamically, which in turn provides higher conformational flexibility. Hence, focusing on 5VB6 may provide new atomic insight into the RORγt inverse agonism [42]. The two structures were prepared with Protein Preparation Wizard in Maestro (Schrödinger Release 2018-2; Schrödinger, LLC, New York, NY, USA, 2018; https://www.schrodinger.com) for molecular docking and NIB modeling. Protonation states were assigned using PROPKA with pH 7.4 ± 0.0, and added hydrogens were optimized with the OPLS3 force field [54].

Ligand Structures and Preparation
RORγt inverse agonists used in the NIB model training with BR-NiB were acquired from the ChEMBL database [46] (obtained 10 June 2019; http://dude.docking.org). Only those full orthosteric inverse agonists with reported IC 50 values of traceable data origin were included. If several activity measurements existed for the same compound, the most potent IC 50 value was selected. The set comprised 191 active compounds. The SPECS 10 mg compound library, consisting of approximately 170,000 compounds (SPECS, The Netherlands; www.specs.net; obtained 28 June 2019), was used as decoy compounds in VS model optimization (25%) and in actual VS to identify RORγt inverse agonists (100%). This SPECS library was selected for VS due to its suitable size for testing various methods, reasonable pricing, and variability of chemistry. Both ChEMBL and SPECS molecules were processed with LIGPREP (Schrödinger Release 2018-2: LigPrep, Schrödinger, LLC, New York, NY, 2018) to generate possible tautomers and enantiomers with OPLS3 charges [54] at pH 7.4 ± 0.0.

Nib Models
NIB models of the LBD for 5NTW (Model Ia) and 5VB6 (Model IIa) were constructed with PANTHER [5] (version 0.18.19; http://www.medchem.fi/panther/). NIB models were generated as described earlier [55]. Here, the FCC packing method with a filler radius of 0.85 Å and box radius of 20 Å was used. All cavity points that were positioned further than 4.5 Å from the protein were automatically removed. These NIB models served as input models for the greedy search optimization done with BR-NiB (see below).

Molecular Docking and Rescoring
A set of molecular docking and scoring methods were compared to discover the most optimal way of identifying active ligands in VS. The best result was acquired when the ChEMBL compounds were docked with PLANTS [56] (version 1.2; http://www.tcd.unikonstanz.de/plants_download/) and rescored (-noOptimization) with ShaEP [6] (version 1.3.0) utilizing a NIB model. In PLANTS, the binding site radius was set to 12 Å, and the output docking conformations were clustered with a root-mean-square deviation (RMSD) of 2 Å. The utilized scoring function was ChemPLP, and the search speed was set to speed1. The selected docking and scoring methods did not yield a high correlation coefficient, which can be explained by the heterogeneous nature of the dataset originating from different studies. Consequently, there was broad dispersion among the IC 50 values of the RORγt inverse agonists as the values typically varied multi-fold for the same compound. Like the ChEMBL compounds, the SPECS compounds were also docked with PLANTS and rescored with ShaEP using the NIB models.

Optimization of Nib Models
The original PANTHER-generated NIB models were optimized using the BR-NiB method [13], ShaEP [6], and ROCKER [15]. The BR-NiB code is available online via GitHub free-of-charge (https://github.com/jvlehtonen/brutenib). In BR-NiB, BEDROC20 [43] was used as the target metric. Randomly selected one-fourth of the SPECS compounds (~50,000) were used as decoy compounds, and all RORγt inverse agonists from ChEMBL (191 compounds) were utilized as active compounds. The optimization was performed for Models Ia and IIa, which, in turn, generated Models Ib and IIb, respectively ( Figure 2).
Two additional BR-NiB-optimized NIB models were generated using training sets in which the best-ranked ChEMBL molecules from the first round of optimization were excluded. In other words, those ChEMBL compounds ranked among the best 1% of decoy compounds for Models Ib and IIb were removed. Finally, the original NIB models (Models Ia and IIa) were optimized a second time utilizing the new sets of active compounds yielding Model Ic and Model IIc, respectively ( Figure 2).

Pharmacophore Point Filtering
PHA point-based filtering or screening was executed using SDFCONF [49]; (https://sites. utu.fi/sdfconf/) for the top 1% best SPECS compounds pushed forward by each optimized NIB model (Models Ib, c and IIb, c) to obtain a set of potentially active compounds. In SDFCONF, the screening was performed using five important binding regions of the RORγt-LBD (Figure 4). During screening, a compound passed the PHA filter if it overlapped or fitted into regions 1, 2, 3, and 4 or fitted regions 2, 3, 4, and 5. In addition, the compound had to contain a heavy atom within a 4 Å radius from either Tyr502 or His479. The PHA filtering was performed separately for all four groups of SPECS compounds that were ranked among the best 1% based on the ShaEP rescoring.

Compound Selection
ALogP values were calculated with Canvas [57,58] (Schrödinger Release: 2018-2: Schrödinger, LLC, New York, NY, USA, 2018) for compounds that passed both the BR-NiB and SDFCONF screens. Those compounds with an estimated logP > 5.5 were excluded. The RORγt-LBD is very lipophilic as reflected by the neutrality of NIB models (Figure 2), and thus, lipophilicity may be essential for binding. This observation justified the selection of a relatively high logP threshold value. Potential PAINS compounds were removed with PAINS-filtering (PAINS1-3) in the Canvas program.
Binding free energy was calculated for the remaining compounds using PRIME [59] MM/GBSA (Schrödinger Release: 2018-2: Prime, Schrödinger, LLC, New York, NY, USA, 2018) to exclude the compounds presenting weak predicted binding energies. The predictions were done individually to the docked compounds in complex with either 5NTW or 5VB6 protein structures. The following options were used: (1) VSGB solvation model and OPLS3 (Optimized Potentials for Liquid Simulations) force field were applied, (2) use input partial charges ON, use implicit membrane OFF, and (3) distance from ligand = 4 Å, sampling method: minimize. The compounds presenting binding energies > −95 were excluded. The remaining molecules were visually evaluated, considering the energyminimized complexes provided by PRIME MM/GBSA. As a result, after careful visual inspection with preference to core diversity, 28 compounds were selected for experimental testing.

Experimental Testing
The activity of the 28 selected molecules was determined at 5 µM concentration using the Human RAR-related Orphan Receptor Gamma Reporter Assay System in a 96-well format (Indigo Biosciences, State College, PA, USA). The assay utilizes human reporter cells designed to express high levels of human RORγ hybrids of both isoforms 1 and 2 and is capable of quantifying inverse agonism and agonism. The assay was performed as a single data point analysis, and 10 molecules exhibiting the strongest activity were selected for further analysis. The same kit was used for determining the IC 50 values for the most promising molecules. Concentrations of the tested molecules varied from 10 µM to 8 nM, using 1/3-fold decrements within the range of 6 µM to 8 nM. Otherwise, the kit protocol was followed, and the analysis was done similarly as in a prior study [37]. The IC 50 value of ursolic acid was determined four times as positive control for the experiments.

Molecular Similarity Analysis
The structural similarity of the in vitro hits with known active compounds for RORγt was analyzed to estimate the structural novelty of the VS hits. All compounds having IC 50 data for RORγt were downloaded from the ChEMBL database in SMILES format (2587 compounds, obtained 8 February 2023). 64-bit linear molecular fingerprints were generated for all compounds with Canvas in Maestro (Schrödinger Release: 2022-1: Schrödinger, LLC, New York, NY, USA, 2022). Daylight invariant atom types were used. The Tanimoto coefficient, which describes fingerprint overlap, was used as the similarity metric. Completely different or similar molecules get a Tanimoto coefficient of 0 or 1, respectively. Low values of the Tanimoto coefficient (0-0.2) were considered to reflect structural novelty.   Figure 3 was prepared using ROCKER [15] (http://www. medchem.fi/rocker). Figure 4 (lower panel) was prepared using PyMOL (version 2.5.0, Schrödinger, LLC). Figure 6 was prepared using GraphPad Prism version 8.4.2 (GraphPad Software, LLC, Boston, MA, USA). The 2D structures in Figure 7 were prepared with 2D Sketcher in Maestro (Schrödinger Release 2018-2; Schrödinger, LLC, New York, NY, USA, USA, 2018).

Conclusions
This study implemented a novel VS protocol, innovatively combining the use of flexible molecular docking with the cavity-based rescoring methodology BR-NiB, PHA filtering, and MM/GBSA calculations, to discover RORγt inverse agonists. The atomic compositions of the NIB models, which were used at the initial stage in shape/ESP-based docking rescoring, were trimmed using the greedy search method BR-NiB. Previously, this shape-focused PHA modeling technique had only been benchmark tested using multiple drug targets and datasets with random training/test set divisions. By combining the usage of BR-NiB with point-based PHA filtering and MM/GBSA calculations, an effective hit-rate of 29% was achieved with limited experimental testing. In total, eight of the twenty-eight tested small molecules inhibited RORγt at ≤10 µM level. Importantly, compound 19 inhibited the receptor with an IC 50 value of 1.1 µM. The verified hits represent different structural cores, which show the structural flexibility and diversity of the RORγt inverse agonism and the ability of the introduced VS protocol to facilitate scaffold hopping in docking-based VS campaigning. Overall, this work, along with the previous benchmarking and prospective drug discovery studies, demonstrates that VS protocols built on the NIB methodology can be readily and effectively utilized with various target proteins.

Data Availability Statement:
The data that support the findings of this study are available from the corresponding author upon reasonable request.