A Modular Approach to Triazole-Containing Chemical Inducers of Dimerisation for Yeast Three-Hybrid Screening

The yeast three-hybrid (Y3H) approach shows considerable promise for the unbiased identification of novel small molecule-protein interactions. In recent years, it has been successfully used to link a number of bioactive molecules to novel protein binding partners. However, despite its potential importance as a protein target identification method, the Y3H technique has not yet been widely adopted, in part due to the challenges associated with the synthesis of the complex chemical inducers of dimerisation (CIDs). A modular approach using potentially “off the shelf” synthetic components was developed and allowed the synthesis of a family of four triazole-containing CIDs, MTX-Cmpd2.2-2.5. These CIDs were then compared using the Y3H approach, with three of them giving a strong positive interaction with a known target of compound 2, TgCDPK1. These results showed that the modular nature of our synthetic strategy may help to overcome the challenges currently encountered with CID synthesis and should contribute to the Y3H approach reaching its full potential as an unbiased target identification strategy.


Introduction
The yeast three-hybrid (Y3H) approach shows considerable promise for the identification of novel small molecule-protein interactions [1-17]. In recent years, this unbiased approach has linked a number of bioactive molecules to novel protein binding partners, generating new biological hypotheses that have been investigated further using alternative experimental techniques [2,3,6,9,13]. For example, Johnsson discovered using Y3H that sulfasalazine, a drug used against inflammatory bowel disease, inhibits tetrahydrobiopterin biosynthesis and consequently nitric oxide production through its interaction with sepiapterin reductase [6]. More recently, Cornish used Y3H to identify PDE6D as a novel protein target of anecortave acetate, an intraocular pressure-lowering agent used in the treatment of glaucoma [18]. At the heart of the Y3H approach is an ingenious system to screen for potential binding proteins [13,15,19-22]. The screen is carried out in yeast cells, as the successful formation of a ternary complex consisting of the bait, the target protein and a chemical inducer of dimerisation (CID) results in yeast cell growth via the activation of the required yeast reporter gene (see Figure 1 for details). The availability of the CID, a compound that contains the bioactive molecule of interest, a linker unit and typically methotrexate (MTX), is essential.

Figure 1. Schematic representation of the Y3H system used in this study showing the ternary complex between the chemical inducer of dimerisation (CID) and the two fusion proteins containing the activation and DNA-binding domains of the transcription factor (AD and DBD, respectively). The CID consists of: (i) methotrexate (shown in purple), which binds to the dihydrofolate reductase DHFR-DBD fusion protein; (ii) a flexible linker unit (black line) and (iii) the bioactive molecule of interest (blue hexagon), which binds to the target protein-AD fusion. Successful formation of the ternary complex results in expression of the reporter genes (e.g., LEU2) that enables the yeast cell to grow in the absence of the amino acid.
Despite its potential importance as a protein target identification method, the Y3H technique has not yet been widely adopted, in part due to the challenges associated with the synthesis of the high molecular weight (>1,000 Da) and relatively complicated CIDs. For example, we recently reported the use of Y3H to identify TgBRADIN as a target of compound 2 [23]. TgBRADIN is a previously unknown negative regulator of the tachyzoite to bradyzoite differentiation pathway of the apicomplexan parasite Toxoplasma gondii. In that study we used a CID that contained a PEG linker unit, MTX-Cmpd2.1 (Figure 2A), in line with the majority of the existing literature [3,6,11,24]. MTX-Cmpd2.1 was prepared in a total of 21 steps, with seven of the steps being required just to make the linker unit.

Figure 2. (A) The PEG-containing CID MTX-Cmpd2.1; (B) the erlotinib-based CID as described by Chidley et al. [6].
In the course of our Y3H project, we became interested in preparing additional CIDs that contained the same bioactive molecule (compound 2) and methotrexate but that differed in the length of the linker. The question of whether linker length affects the ability of a CID to dimerise proteins has been previously examined in three-hybrid systems, but with conflicting results [19,25,26]. For example, the length of the linker appeared to have no effect on the dimerisation of FK506-binding proteins leading to signal transduction [26]. Similarly, Cornish saw high levels of transcriptional activation induced in the Y3H system using dihydrofolate reductase and the glucocorticoid receptor, regardless of CID structure and linker length [19]. In contrast, Amara et al. found dramatic effects of the linker length when comparing the ability of a series of four CIDs to dimerize tandem FKBP fused either to Fas proteins leading to apoptosis or to the domains of a transcription factor allowing reporter gene transcription [25]. The discrepancies between these studies might be explained by the relative strength of the interactions, the distance between the two proteins required to exert their effects or steric hindrance imposed by the proteins. Unfortunately these variables are extremely difficult to anticipate when designing a CID for use in Y3H-based drug target identification, suggesting that the development of a versatile synthetic strategy to enable the rapid generation of related CIDs (families of CIDs) may be advantageous. We therefore decided to establish a route to families of compound 2-based CIDs that differed only in the linker unit using potentially "off the shelf" reagents.
In 2011, Johnsson reported the synthesis of a 1,2,3-triazole-containing CID based on the clinically approved drug erlotinib (Figure 2B). Use of this CID in Y3H enabled the identification of the binding partner oxysterol-binding protein-related protein 7, the first non-kinase target identified for this drug [6].
Based on this literature precedent, we investigated the use of the copper-catalysed Huisgen 1,3-dipolar cycloaddition reaction between an azide and an alkyne to generate CIDs in a modular fashion. Here we report the application of this approach to the rapid synthesis of a family of four CIDs with varying linker lengths, MTX-Cmpd2.2-2.5 (Scheme 1). A comparison of our CIDs MTX-Cmpd2.1-2.5 in the Y3H approach is also described. Notably, lower background growth was observed with some of our new triazole-containing CIDs than with our original PEG-containing CID, MTX-Cmpd2.1.

Results and Discussion
An outline of the modular approach that was adopted to target MTX-Cmpd2 CIDs is shown in Scheme 1. The planned synthesis involved the coupling of two key components: the compound 2-based alkyne 3 (where n is variable) and the tBu-MTX-azide 4 (where m is variable). Components 3 and 4 could be accessed by the synthesis of a precursor to compounds 2 and 5, various PEG-based linker units 6 and 7 and tert-butyl methotrexate (tBu-MTX, 8 [23]) (Scheme 1).

Scheme 1. Modular approach to MTX-Cmpd2 CIDs enabling components to be mixed and matched as required.

To determine whether the synthetic plan would work and to assess whether a triazole-containing linker could be tolerated in this system, initial studies focused on the synthesis of MTX-Cmpd2.2 (n = 1 and m = 2), which was close in structure to the original CID MTX-Cmpd2.1. 3D representations of MTX-Cmpd2.1 and MTX-Cmpd2.2 (generated using low level computational methods inspired by the report of Lu et al. [27]) suggested that the planned change in the linker system would give CID MTX-Cmpd2.2, with little significant impact on the overall length of the CID despite the fact that the linker unit in MTX-Cmpd2.2 contains an additional atom. Again in agreement with the work of Lu et al. [27], the predicted extended conformation of MTX-Cmpd2.1 was linear, whereas that for MTX-Cmpd2.2 was not (Figure 3). The required sulfone 5 was successfully synthesised in gram quantities as reported previously by us and others [9,23,28]. The required aminoalkyne linker 6a was then prepared in multi-gram quantities starting with selective propargylation of diethylene glycol 9a. Subsequent tosylation of the remaining alcohol functionality [29] followed by treatment with NaN3 in the presence of TBAI afforded the corresponding azidoalkyne [30], which was reduced under Staudinger reduction conditions using solid phase triphenylphosphine to give 6a (Scheme 2A) [24]. The linker 6a was then reacted with sulfone 5 at 110 °C using microwave irradiation to afford 3a (Scheme 2A).

Synthesis of the tBu-MTX-Azides 4
tert-Butyl methotrexate 8 was prepared in gram quantities according to literature methods [23]. The aminoazide linker 7b was synthesised in an analogous manner to aminoalkyne linker 6a. Triethylene glycol 9b was converted to the ditosylated analogue [29] and treated with NaN3 to afford the corresponding diazide [30] (Scheme 2B). Staudinger reduction of one of the azide groups of the diazide in the presence of 1 equivalent of PPh3 and 1 N HCl afforded a pure sample of 7b following an acid-base work-up [31]. Aminoazide 7b was then coupled to tBu-MTX 8 to give 4b (Scheme 2B). With components 3a and 4b in hand, coupling via the Huisgen 1,3-dipolar cycloaddition reaction (an example of a click reaction [32-35]) using copper(II) sulphate and sodium ascorbate was attempted (Scheme 3). Despite achieving the required transformation under these reaction conditions, problems were initially encountered in isolating the required tert-butyl-protected CID in a pure form, given its high molecular weight and polarity. After extensive optimisation of the purification procedure, tBu-MTX-Cmpd2.2 was purified using column chromatography on normal phase silica gel eluting with a mixture of DCM, MeOH and an aqueous NH4OH solution. Subsequent treatment of tBu-MTX-Cmpd2.2 with TFA in the presence of thioanisole provided MTX-Cmpd2.2 (Scheme 3).

Scheme 3. Huisgen 1,3-dipolar cycloaddition and deprotection to MTX-Cmpd2.2.

Y3H results with CIDs MTX-Cmpd2.1 and MTX-Cmpd2.2
The biological activities of MTX-Cmpd2.1 and MTX-Cmpd2.2 were assessed using our standard Y3H growth assays with yeast expressing T. gondii calcium-dependent protein kinase 1 (TgCDPK1) fused to the activation domain. TgCDPK1 has previously been identified as a target of compound 2 [9,28]. Empty vector (the AD vector without TgCDPK1 and containing a stop codon immediately downstream of the multiple cloning site) was used as a negative control. Gratifyingly, both CIDs showed a robust Y3H interaction with TgCDPK1 based on LEU2 reporter activation in 48 hour growth assays (Figure 4A), consistent with the view that the Y3H system can tolerate incorporation of a 1,2,3-triazole ring in the linker unit. Interestingly, at longer time points MTX-Cmpd2.2 showed less background growth in the empty vector control than MTX-Cmpd2.1 (Figure 4B), suggesting that the use of the triazole-containing CID MTX-Cmpd2.2 may lead to a reduction in the number of false positive hits associated with a Y3H screen by decreasing the background growth of yeast expressing non-interacting targets. Given the observation that incorporation of a triazole ring in the linker unit was tolerated, and to test the modular nature of our approach, it was decided to try to further optimise the interactions between MTX and the DHFR-DBD fusion protein and between compound 2 and the TgCDPK1-AD fusion protein by preparing CIDs with: (i) an alternative positioning of the triazole ring and (ii) a modified linker length. The synthesis of MTX-Cmpd2.3-2.5 was achieved rapidly by mixing and matching the components 3 and 4 that contained varying numbers of PEG units (n and m, respectively) and that had been prepared separately in gram quantities (Scheme 3).

Synthesis and Analysis of Additional MTX-Cmpd2 CIDs
To determine whether the positioning of the triazole ring affected the interaction of the CID with its target protein, MTX-Cmpd2.3 was rapidly prepared by the reaction of 3b and 4a (Schemes 2 and 3). MTX-Cmpd2.3 was then compared to MTX-Cmpd2.2 and MTX-Cmpd2.1 in the Y3H system by evaluating the activation of the reporter genes LEU2 (growth assay; Figure 5A) and LacZ (β-galactosidase assay; Figure 5B). The two triazole-containing CIDs behaved similarly in both assays (compare MTX-Cmpd2.2 and MTX-Cmpd2.3 in Figure 5A and 5B), demonstrating that the position of the triazole ring has little, if any, effect on the interaction of the CID with the two fusion proteins. In the LEU2 reporter assay, the triazole-containing CIDs tended to show less growth at 48 h than MTX-Cmpd2.1, but all three supported similar growth at 72 h (Figure 5A). In the more quantitative LacZ reporter assay, the triazole-containing CIDs showed less reporter activation than MTX-Cmpd2.1, but with similar dose-response curves (Figure 5B). The triazole-containing CIDs once again showed less non-specific reporter activation with the empty vector than MTX-Cmpd2.1 (Figure 5B). With confirmation that changes in the positioning of the triazole-PEG linker unit were compatible with this Y3H assay, the influence of linker length was investigated. Two new CIDs, MTX-Cmpd2.4 and MTX-Cmpd2.5, were prepared from 3b and 4c, and from 3c and 4d, respectively (Scheme 3). The intermediate length CID, MTX-Cmpd2.4, gave a positive Y3H interaction with the TgCDPK1-AD fusion protein in the growth assay, but the longest CID of the series, MTX-Cmpd2.5, failed to support a robust Y3H response; this was particularly evident after 72 h (Figure 6A,B). A similar result was observed in the LacZ reporter assay (Figure 6C).
MTX-Cmpd2.4, therefore, appears to be the optimal size for the ternary complex with TgCDPK1, as this CID consistently supported slightly (though not statistically significantly) better growth than MTX-Cmpd2.2 and MTX-Cmpd2.5 (Figure 6A,B). CIDs with a linker unit longer than that present in MTX-Cmpd2.4 are likely to be suboptimal because of the entropic cost that must be paid to correctly locate the two functional domains of the transcription factor. In an analogous way to the observations reported here for successful Y3H interactions, variations in a linker unit that maintains the two domains in a fusion protein have been shown to influence significantly the appropriate separation and folding of each domain [36]. In both cases it seems likely that the separation provided by the linker unit not only allows correct folding of the two domains in the fusion protein (or the DNA binding and activation domains in the Y3H system) but also affects the overall stability of the complex by changing its hydrophobicity profile [37]. Whilst small linkers restrict the conformational space of the individual domains, longer linkers may be more exposed to the solvent, resulting in the inherent properties of the linker unit, such as its hydrophobicity or secondary structure, potentially coming into play. These could in turn affect operationally important parameters such as CID solubility, uptake or stability. It seems likely that the optimal linker length will change depending on the small molecule-protein pair being studied, and therefore the ability to prepare families of CIDs relatively quickly will be important for the applications of Y3H (for example in the detailed study of specific interactions between a bioactive molecule under study and a target by defining important residues in a binding site) [38-40].
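
As an illustrative aid only, the component pairings reported above can be tabulated in a short Python sketch. The pairings (3a with 4b, 3b with 4a, 3b with 4c, 3c with 4d) are taken from the text. The mapping of the letter suffixes a-d to 1-4 PEG units is an assumption made here for illustration (only n = 1 for 3a and m = 2 for 4b are stated explicitly), and the function names are hypothetical.

```python
# Pairings of alkyne 3 and azide 4 exactly as reported in the text.
REPORTED_PAIRINGS = {
    "MTX-Cmpd2.2": ("3a", "4b"),
    "MTX-Cmpd2.3": ("3b", "4a"),
    "MTX-Cmpd2.4": ("3b", "4c"),
    "MTX-Cmpd2.5": ("3c", "4d"),
}

# ASSUMPTION (not stated in the text): the letter suffix encodes the number
# of PEG units. Only n = 1 (3a) and m = 2 (4b) are given explicitly.
ASSUMED_PEG_UNITS = {"a": 1, "b": 2, "c": 3, "d": 4}


def peg_units(component: str) -> int:
    """Return the assumed PEG unit count for a component label such as '3a'."""
    return ASSUMED_PEG_UNITS[component[-1]]


def describe(cid: str) -> dict:
    """Summarise the alkyne/azide pairing and assumed PEG counts for one CID."""
    alkyne, azide = REPORTED_PAIRINGS[cid]
    n, m = peg_units(alkyne), peg_units(azide)
    return {"cid": cid, "alkyne": alkyne, "azide": azide,
            "n": n, "m": m, "total_peg": n + m}


def rank_by_total_peg() -> list:
    """List the CIDs from shortest to longest assumed linker."""
    return sorted(REPORTED_PAIRINGS, key=lambda c: describe(c)["total_peg"])


if __name__ == "__main__":
    for name in rank_by_total_peg():
        print(describe(name))
```

Under this assumed mapping, MTX-Cmpd2.2 and MTX-Cmpd2.3 carry the same total number of PEG units and differ only in where the triazole sits, while MTX-Cmpd2.5 has the longest linker, which is consistent with the description given above.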

General
Thin layer chromatography (TLC) analysis was performed using glass plates coated with silica gel (with fluorescent indicator UV254). Developed plates were air dried and analysed under a UV lamp (254/365 nm). Flash chromatography was performed using silica gel (40-63 µm, Fluorochem). Low resolution (LR) and high resolution (HR) electrospray mass spectral (ES-MS) analyses were acquired by electrospray ionisation (ESI), electron impact (EI) or chemical ionisation (CI). These were acquired within the School of Chemistry, University of St Andrews. Nuclear magnetic resonance (NMR) spectra were acquired at room temperature on either a Bruker Avance 300 (1H, 300.1 MHz; 13C, 75.5 MHz), a Bruker Avance II 400 (1H, 400.1 MHz; 13C, 100.6 MHz), a Bruker Avance 500 (1H, 500 MHz; 13C, 125.7 MHz) or a Bruker Avance III 500 (1H, 500.1 MHz; 13C, 125.7 MHz) spectrometer and in the deuterated solvent stated. All NMR spectra were acquired using the deuterated solvent as the lock. Coupling constants (J) are quoted in Hz and are recorded to the nearest 0.1 Hz. The following abbreviations are used: s, singlet; d, doublet; t, triplet; m, multiplet and br, broad. Chemical shifts are expressed as δ in units of ppm. 13C-NMR spectra were recorded under the same conditions and solvents using the PENDANT sequence mode. Data processing was carried out using the TOPSPIN 2 NMR program (Bruker UK Ltd).

General Procedure A: Synthesis of Compound 2-based alkyne 3
Sulfone 5 [23] was added to a solution of aminoalkyne 6 (500 mg, 3 equiv.) in CH3CN (7 mL). The reaction mixture was then irradiated in the microwave for 1 hour at 100 °C (PSI ~ 50). The reaction mixture was then concentrated in vacuo to give an oil which was purified by column chromatography (DCM/MeOH: 98/2 to 95/5).
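
The stoichiometry in General Procedure A can be checked with a short calculation. The following Python sketch converts the stated mass of aminoalkyne 6 into mmol and derives the amount of sulfone 5 implied by the 3 equiv. excess. The molecular weight used for 6a (143.18 g/mol, for C7H13NO2) is an assumption based on the structure implied by its synthesis from diethylene glycol; it is not given in the text, and the function names are hypothetical.

```python
# ASSUMPTION: molecular weight of aminoalkyne 6a, taken as C7H13NO2.
# This value is not stated in the text and is used for illustration only.
ASSUMED_MW_6A = 143.18  # g/mol


def mmol_from_mass(mass_mg: float, mw_g_per_mol: float) -> float:
    """Convert a mass in mg to an amount in mmol (mg divided by g/mol)."""
    return mass_mg / mw_g_per_mol


def limiting_mmol(excess_mmol: float, equivalents: float) -> float:
    """Amount of the limiting reagent when the other is used in excess."""
    return excess_mmol / equivalents


if __name__ == "__main__":
    amine_mmol = mmol_from_mass(500.0, ASSUMED_MW_6A)  # 500 mg of 6a
    sulfone_mmol = limiting_mmol(amine_mmol, 3.0)      # 6 is used at 3 equiv.
    print(f"aminoalkyne 6a: {amine_mmol:.2f} mmol")
    print(f"sulfone 5 (1 equiv.): {sulfone_mmol:.2f} mmol")
```

For the other aminoalkynes 6 the same functions apply with the appropriate molecular weight substituted.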

General Procedure D: Boc Deprotection of tBu-MTX-Cmpd2
Thioanisole (30 µL) was added to a solution of tBu-MTX-Cmpd2 followed by TFA (30 µL). The resulting mixture was then stirred at room temperature overnight before being concentrated in vacuo to give a brown solid. This was then suspended in cyclohexane to remove any traces of thioanisole and the resulting mixture concentrated in vacuo to give a yellow/pale orange film.

... screening campaigns. The modular nature of the synthetic strategy used here will help to overcome the CID synthesis challenges currently encountered and should contribute to the Y3H approach reaching its full potential as an unbiased target identification strategy.