Structural Analysis of the 42 kDa Parvulin of Trypanosoma brucei

Trypanosoma brucei is a unicellular eukaryotic parasite that causes African sleeping sickness in humans. The recently discovered trypanosomal protein Parvulin 42 (TbPar42) plays a key role in parasite cell proliferation. Homologues of this two-domain protein are found exclusively in protozoan species. TbPar42 exhibits an N-terminal forkhead associated (FHA) domain and a peptidyl-prolyl-cis/trans-isomerase (PPIase) domain, connected by a linker. Using NMR and X-ray analysis as well as activity assays, we report the structures of the individual domains of TbPar42, discuss their intramolecular interplay, and give initial hints as to potential cellular functions of the protein.


Introduction
Trypanosomes are unicellular eukaryotic parasites belonging to the class Kinetoplastea. Infection with trypanosomes causes various diseases in humans, including Chagas disease (Trypanosoma cruzi) and sleeping sickness (Trypanosoma brucei). Trypanosoma brucei is transmitted to its human host by the bite of a tsetse fly carrying the parasite in its salivary gland. After transmission, the parasite can enter the cerebral fluid through the blood vessel walls, causing the life-threatening effects of sleeping sickness. In the blood stream, T. brucei cells proliferate and are taken up again by the bite of a tsetse fly, where the parasite cells (procyclic state) in the intestine switch their energy metabolism from anaerobic glycolytic to aerobic processes and migrate to the salivary gland. Once the parasite has entered the central nervous system of its human host, therapy against sleeping sickness is challenging.
Recently, the nuclear protein TbPar42 (molecular mass of 42 kDa) was identified; its RNAi-induced knockdown in procyclic cells of T. brucei inhibits proliferation of the parasite, suggesting a key role of the protein in cell growth [1,2]. Orthologous proteins are found in T. cruzi and Leishmania [2], as well as in Chlamydomonas [3]. Protein sequence alignment identified TbPar42 as a two-domain protein, exhibiting an N-terminally flanked forkhead associated (FHA) domain and a putative peptidyl-prolyl-cis/trans-isomerase (PPIase) domain [4] connected by a linker region. FHA domains are generally known as phosphopeptide recognition modules [5], although the TbPar42 FHA

Nuclear Magnetic Resonance (NMR) Spectroscopy
All spectra were recorded at 27 °C on a 700 MHz Ultrashield NMR spectrometer (Bruker, Rheinstetten, Germany) equipped with a triple-resonance (1H/13C/15N) cryoprobe. The NMR samples contained 200-600 µM of unlabeled or labeled protein, dissolved in potassium phosphate buffer. For the full-length TbPar42 sample and the linker construct, 2 mM dithiothreitol was added. Proton resonances were calibrated using 2,2-dimethyl-2-silapentane-5-sulfonate as an internal standard. 15N and 13C resonances were calibrated according to the IUPAC procedure [12,13]. Spectra needed for the assignment were recorded using standard Bruker library pulse sequences (except for the 1H-15N-HSQC-nuclear Overhauser enhancement spectroscopy (NOESY)). Spectra were processed with Topspin 3.0 (Bruker) and analyzed with CcpNmr Analysis [14]. For the assignment of the backbone and aliphatic side chain 1H, 13C, and 15N resonances, a set of 2D and 3D spectra (1H-15N-HSQC, 1H-13C-HSQC, 1H-15N-TOCSY-HSQC, HNCACB, CBCACONH, HNCA, HN(CO)CA, HNCO, HN(CA)CO) was recorded and analyzed. Aromatic side chain 1H and 13C chemical shifts were determined using two-dimensional 1H-13C-HSQC, 1H-1H-correlation spectroscopy (COSY), TOCSY, and NOESY spectra (in H2O and D2O), and 3D 1H-13C-TOCSY-HSQC and 1H-13C-NOESY-HSQC spectra. Interproton distance constraints were obtained from the 2D 1H-homonuclear NOESY spectrum and the 3D 1H-15N-NOESY-HSQC. Hydrogen bond donors were derived from 1H-15N-HSQC spectra by proton/deuteron exchange experiments, after the protein was lyophilized and dissolved in 100% D2O. Acceptors were initially determined from homologous structures and, where applicable, were corrected during the calculation procedure. Hydrogen bonds were set as distance restraints of 1.7-2.6 Å for H-O and 2.7-3.5 Å for N-O. Torsion angle restraints were calculated with TALOS using sequence and chemical shift data as input [15].
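The hydrogen bond restraints described above can be illustrated with a minimal sketch. It assumes the quoted bounds are in ångström and uses a hypothetical in-memory representation (`DistanceRestraint`, `hbond_restraints`); it does not reproduce the restraint file format of any particular structure calculation program, and the residue numbers in the usage note are placeholders.

```python
from typing import List, NamedTuple


class DistanceRestraint(NamedTuple):
    """One lower/upper distance bound between two atoms (hypothetical layout)."""
    atom_a: str   # e.g. "300.H" = amide proton of residue 300
    atom_b: str   # e.g. "296.O" = carbonyl oxygen of residue 296
    lower: float  # lower bound, assumed to be in angstrom
    upper: float  # upper bound, assumed to be in angstrom


# Bounds quoted in the text for each hydrogen bond.
H_O_BOUNDS = (1.7, 2.6)
N_O_BOUNDS = (2.7, 3.5)


def hbond_restraints(donor_res: int, acceptor_res: int) -> List[DistanceRestraint]:
    """Return the two distance restraints that encode one backbone H-bond."""
    return [
        DistanceRestraint(f"{donor_res}.H", f"{acceptor_res}.O", *H_O_BOUNDS),
        DistanceRestraint(f"{donor_res}.N", f"{acceptor_res}.O", *N_O_BOUNDS),
    ]
```

Calling `hbond_restraints(300, 296)` yields one H-O and one N-O restraint for a single donor/acceptor pair, so each hydrogen bond contributes two distance restraints to the calculation.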
Structure calculation from a random template structure was performed with the CYANA 2.1 software package, including automated NOE assignment, simulated annealing, and torsion angle dynamics algorithms [16]. An ensemble of 60 structures was calculated, and the 10 structures with the lowest CYANA target function value [17] were selected. The secondary structure elements of the TbPar42 domains were defined on the basis of NOE connectivities [18], calculated distances, and H-bonding, as well as chemical shift indices [19-21]. Peptides for NMR studies were purchased from CASLO (Lyngby, Denmark).

NMR Titration Experiments
To 440 µM 15N-labeled NterFHA or PPIase in 50 mM potassium phosphate buffer, pH 6.26, 90%/10% (v/v) H2O/D2O, the peptide Suc-Ala-pThr-Pro-Ala-NH2, dissolved in the same buffer, was added stepwise to a final concentration of 13 mM. After each step, a 1H-15N-HSQC or SOFAST spectrum was recorded on a 700 MHz Ultrashield NMR spectrometer at 27 °C. The spectra were analyzed with the CcpNmr Analysis software, and the chemical shift perturbations were calculated as described in Reference [22]. Residues with chemical shift perturbations ≥0.015 ppm (PPIase) or ≥0.04 ppm (NterFHA) were mapped onto the corresponding protein structure using PyMOL [23]. For titrations of the peptide Ala-Glu-Ala-pThr-Glu-Ala-Xaa (Xaa representing Asp, Ile, Val, Ser, or Gln) to the NterFHA domain, the protein concentration was 250 µM and the peptides were titrated to a final concentration of 8 mM.
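The exact perturbation formula of Reference [22] is not reproduced in this text. As an illustration only, the following sketch uses a commonly applied weighted Euclidean distance between the 1H and 15N shift changes; the 15N weighting factor of 0.2 and the function names are assumptions, not the authors' stated procedure. The thresholds are the ones quoted above.

```python
import math
from typing import Dict, Tuple

Shift = Tuple[float, float]  # (1H shift in ppm, 15N shift in ppm)


def combined_csp(delta_h: float, delta_n: float, n_weight: float = 0.2) -> float:
    """Weighted combined 1H/15N chemical shift perturbation in ppm.

    n_weight scales the 15N change; 0.2 is an assumed, commonly used value.
    """
    return math.sqrt(delta_h ** 2 + (n_weight * delta_n) ** 2)


def perturbed_residues(free: Dict[str, Shift], bound: Dict[str, Shift],
                       threshold: float, n_weight: float = 0.2) -> Dict[str, float]:
    """Return residues whose combined perturbation reaches the threshold."""
    hits = {}
    for residue, (h_free, n_free) in free.items():
        if residue not in bound:
            continue  # resonance not assigned in the bound spectrum
        h_bound, n_bound = bound[residue]
        csp = combined_csp(h_bound - h_free, n_bound - n_free, n_weight)
        if csp >= threshold:
            hits[residue] = csp
    return hits
```

With `threshold=0.015` for the PPIase or `threshold=0.04` for NterFHA, the returned dictionary lists the residues that would be highlighted on the structure.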

EXSY Experiment
EXSY spectra were recorded for a highly concentrated peptide sample (2.4 mM Suc-Ala-pThr-Pro-Ala-NH2 in 25 mM potassium phosphate buffer, pH 6.26, 100% D2O) at 27 °C on a 700 MHz Ultrashield NMR spectrometer, in the presence and absence of 50 µM hPin1-PPIase or TbPar42-PPIase. The pulse sequence was noesygpph19 from the standard Bruker library, with a mixing time of 400 ms, a relaxation delay of 1 s, and 16 scans.

Relaxation Measurements
The longitudinal (R1) and transverse (R2) relaxation rates were estimated from signal intensities in a series of 1H-15N-HSQC spectra with varying relaxation delays. Ten delay times were used to determine R1 (20, 100, 140, 180, 200, 300, 500, 750, 1000, and 1400 ms), whereas nine evolution times (10, 30, 50, 70, 90, 110, 130, 170, and 190 ms) were used to estimate R2. R1 and R2 are the reciprocals of the relaxation times T1 and T2, which were calculated from the intensity measurements for each residue according to Reference [24]. The 1H-15N-NOE for each amino acid was determined from the ratio of signal intensities of a saturated and an unsaturated spectrum. The two spectra were extracted from a single interleaved data set, recorded with a modified hsqcnoef3gpsi pulse sequence. All measurements were recorded at 27 °C on a 700 MHz Ultrashield NMR spectrometer.
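As a rough illustration of how a relaxation rate is obtained from such an intensity series, the sketch below assumes a mono-exponential decay I(t) = I0·exp(-R·t) and fits it by linear regression on ln(I). This is a simplification chosen for the example; the procedure of Reference [24] may differ (for instance by non-linear fitting with error estimates). The function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def fit_relaxation_rate(delays_s: Sequence[float],
                        intensities: Sequence[float]) -> Tuple[float, float]:
    """Fit ln(I) = ln(I0) - R*t by least squares; return (R in 1/s, I0)."""
    n = len(delays_s)
    logs = [math.log(i) for i in intensities]  # requires positive intensities
    mean_t = sum(delays_s) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in delays_s)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(delays_s, logs))
    slope = sxy / sxx
    rate = -slope                       # R1 or R2, depending on the experiment
    i0 = math.exp(mean_y - slope * mean_t)
    return rate, i0


def het_noe(intensity_saturated: float, intensity_unsaturated: float) -> float:
    """Heteronuclear NOE as the saturated/unsaturated intensity ratio."""
    return intensity_saturated / intensity_unsaturated
```

The delays must be passed in seconds (for example 0.02 to 1.4 s for the R1 series), so that the returned rate is in s⁻¹ and its reciprocal is the corresponding relaxation time.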

Sample Preparation for TbPar42-PPIase Crystallization Experiments
For crystallization, TbPar42-PPIase (264-383) was purified by size exclusion chromatography using a Superdex™ 75 16/60 pg column (GE Healthcare, Freiburg, Germany) equilibrated with 50 mM Tris/HCl, pH 7.0, 150 mM NaCl, 1 mM DTT. The fractions containing pure TbPar42-PPIase were pooled and concentrated to 15-20 mg/mL. Crystallization conditions were screened by the hanging drop vapor diffusion method, mixing 2 µL reservoir solution with 2 µL protein solution and incubating at 20 °C. Crystals used for structure determination grew at 20 °C in drops with 23% PEG 4000 and 100 mM sodium acetate as the reservoir solution. Prior to freezing in liquid nitrogen, crystals were soaked in 50% PEG 4000 for cryoprotection.

Data Collection and Structure Determination
Data sets were collected at 100 K on beamline P13 at PETRA III of the Deutsches Elektronen-Synchrotron (Hamburg, Germany) using a Pilatus 6M detector. The data were processed using the XDS suite and converted into appropriate formats by XDSCONV [25,26]. The TbPar42-PPIase structure was solved by molecular replacement in MOLREP [27], using human Pin1 as a search model (PDB: 1PIN, residues 50-163). Refinement was done with REFMAC [28] and PHENIX [29]. For manual model building, the molecular graphics program COOT was employed [30]. The final model was deposited in the PDB with the accession code 6GMP.

PPIase Activity
A protease-coupled isomerase assay [31,32] was used to measure the activity of the PPIase towards peptides with a Suc-Ala-Xaa-Pro-Phe-pNA scaffold (pNA, paranitroaniline; Xaa represents pSer, pThr, Arg, Lys, Glu, Asp, Phe, Gln, or Ser), synthesized by CASLO or ChinaPeptides. The peptides were dissolved in 0.5 M LiCl in 2,2,2-trifluoroethanol to a concentration of 15 mM, followed by overnight incubation. The protease α-chymotrypsin (35 µM) was pre-incubated with 200 nM or 10 µM TbPar42-PPIase (20 nM hPin1-PPIase was used as a positive control) for 5 min in PBS buffer, pH 6.8, at 10 °C. For the mutants, 2 µM and 10 µM protein were used. Directly after the addition of 75 µM peptide, the reaction was followed by monitoring the absorbance of the cleaved pNA at 390 nm. The reaction rate constant of the protease and the cis-isomer content of the peptide were determined from the uncatalyzed reaction for each substrate. The observed curves were analyzed with GraphPad Prism 5.04, using a bi-exponential fit.
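The bi-exponential model behind such progress curves can be sketched as follows. In a protease-coupled assay, the fast phase is usually attributed to cleavage of the trans isomer already present, and the slow phase to cis-to-trans isomerization. The parameter names and the estimate of kcat/KM from the difference between catalyzed and uncatalyzed slow-phase rate constants (valid only for substrate concentrations well below KM) are assumptions for illustration, not the fitting model configured by the authors in GraphPad Prism.

```python
import math


def biexponential(t: float, a_inf: float, a_fast: float, k_fast: float,
                  a_slow: float, k_slow: float) -> float:
    """Absorbance at time t for a two-phase release of pNA.

    a_fast/k_fast: amplitude and rate of the protease-limited (trans) phase.
    a_slow/k_slow: amplitude and rate of the isomerization-limited (cis) phase.
    """
    return a_inf - a_fast * math.exp(-k_fast * t) - a_slow * math.exp(-k_slow * t)


def catalytic_efficiency(k_obs: float, k_uncat: float,
                         enzyme_conc_molar: float) -> float:
    """Estimate kcat/KM (1/(M*s)) from slow-phase rate constants in 1/s."""
    return (k_obs - k_uncat) / enzyme_conc_molar
```

For an inactive isomerase, the slow-phase rate constant in the presence of enzyme equals the uncatalyzed one, so `catalytic_efficiency` returns zero within experimental error.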

NMR Measurements of Full-Length TbPar42 (1-383)
Due to the high molecular weight of TbPar42 (1-383), structural investigations of the protein by NMR spectroscopy were only possible using 2H-13C-15N isotopically labeled protein harvested from E. coli grown in deuterium oxide. Although nearly 96% of all expected amide resonances were present in the 1H-15N transverse relaxation-optimized spectroscopy (TROSY) spectra (Figure 1A), the low protein yield upon recombinant production hampered a complete structural analysis. We therefore decided to dissect the full-length protein into separate domains, which were more easily amenable to NMR spectroscopy. These domains were primarily defined using sequence alignment and similarity searches (BLAST). After several rounds of trials and optimization (expression levels, NMR spectra appearance), three feasible domain constructs of TbPar42 (the N-terminal extended FHA domain (1-177; NterFHA), the putative linker region (172-266; Loop_FHA_PPIase), and the putative catalytic domain (264-383; TbPar42-PPIase)) were separately cloned, produced, and further analyzed (Figure 1B).
Table S1. Ten structures with the lowest energy conformations are presented in Figure 2A.

Secondary Structure Elements of TbPar42-PPIase
The PPIase is built up of four α-helices and four β-strands (Figure 2A,B). The first N-terminal β-strand comprises amino acids Arg267 to Val274, and is followed by a long loop, which extends from Lys275 to Arg296. The adjacent α-helix 1 (Ser297 to His312) is separated by a small stretch of residues from α-helix 2 (Leu320-Phe330). Within the following thirteen residues, α-helix 3, which comprises only four residues (Gly334-Lys337), was identified by chemical shift data. Because this helix lacks a typical helix NOE pattern and has fast exchanging amide protons, it is more likely an α-helical turn. The second β-strand from Gly343 to Val345 is connected to α-helix 4 (Gly353 to Phe359) by a short loop. β-strand 3 is formed from Val370 to Thr372, and is followed by a short loop that turns into the fourth β-strand (Gly375-Glu381). Analysis of 15N relaxation data reflecting the motion of the backbone N-H vector (Figure 2B) indicated that all secondary structure elements are rigid, and even the extended PPIase loop regions (β1-α1 and β2-α4) lack significant internal mobility.

The Tertiary Structure of TbPar42-PPIase Comprises the Typical Parvulin Fold
The PPIase adopts a globular fold, which is characteristic of parvulin-type proteins [33]. The central β-sheet, composed of four antiparallel β-strands (β3-β4-β1-β2), is surrounded by four α-helices (Figure 2A). The dipole axes of these helices align parallel with the C-N orientation vector of the β-sheet. According to the DALI program, the solution structure of the PPIase domain of human parvulin hPin1 (PDB-ID: 1NMW), an isomerase acting on phosphorylated substrates, has the highest similarity to the TbPar42-PPIase (Z-score of 13.9; Cα r.m.s.d.: 1.58 Å).
A comparison of both structures (Figure 3A,B) indicates that well-conserved residues of Pin1-like parvulins, e.g., the histidine motif (His59 and His157 in hPin1; His271 and His377 in TbPar42), as well as a characteristic cysteine within the catalytic cleft (Cys113 in hPin1; Cys333 in TbPar42), are located at the same spatial positions within both folds. The hPin1 Ser154 (Figure 3A), described as a part of the catalytic tetrad [34], is replaced by Leu374 in TbPar42 (Figure 3B). In the catalytically active but non-phosphorylation-specific parvulin hPar14, a phenylalanine (Phe120) is found at this position (Figure 3C). Thr372, the counterpart of which in Par14 was associated with the catalytic network [35], is also conserved in the TbPar42-PPIase structure. Residues Leu122, Met130, and Phe134 of Pin1, which are located at the concave side of the β-sheet core, constitute a hydrophobic binding pocket for the substrate's proline ring moiety. At the same spatial positions, Leu342 and Phe354 are found in the TbPar42-PPIase structure, whereas the methionine is replaced by a tyrosine (Tyr350; Figure 3D). Phosphorylation-specific parvulins exhibit an extended β1-α1 loop, with a basic amino acid cluster important for the binding of the phosphate group of substrates. A homologous loop is also present in the structure of TbPar42-PPIase (Figure 3E).

Figure 3. ... 1PIN), with its bound Ala-Pro dipeptide in orange. Hydrophobic residues of the proline-binding pocket are represented as sticks in blue for TbPar42, and in beige for hPin1. (E) Sequence alignment of the β1-α1 loop between TbPar42 (blue) and hPin1 (black). The phosphate group binding residues of hPin1 are framed (red). A full alignment of both PPIase domains is found in Figure S3.

TbPar42-PPIase Lacks Catalytic PPIase Activity
Although the PPIase domain of TbPar42 adopts a parvulin fold and reflects structural elements of Pin1-type proteins [1], it is catalytically inactive.
The protein failed to compensate for the loss of ESS1 function in yeast (as hPin1 does), and also lacked isomerase activity against a pThr-containing substrate, which was used to verify the activities of TbPin1 and AtPin1 [1]. We confirmed this lack of activity of TbPar42-PPIase by exchange spectroscopy (EXSY) experiments using an alternative substrate (Figure 4A,B), and by isomerase activity studies using a protease-coupled assay (Figure 4C). In the latter experiment, a considerable set of substrates containing pSer or pThr residues preceding proline was used to track activity. In addition, peptides containing negatively charged (Glu, Asp), positively charged (Arg), or uncharged (Ser, Gln, Phe) residues at the Xaa-Pro motif were assayed to test a putative sequence dependency of TbPar42's activity towards non-phosphorylated motifs (Figure 4C, right). However, no cis/trans isomerase activity could be detected against any of these substrates. From the current data, we conclude that the PPIase domain of TbPar42 seems to have no enzymatic function within the protein. Although TbPar42 seems to be catalytically inactive, titration experiments with a Suc-Ala-pThr-Pro-Ala-NH2 peptide showed chemical shift perturbations in the 2D 1H-15N heteronuclear single quantum coherence (HSQC) spectra of the PPIase domain (Figure 4D). Thus, the substrate was capable of binding to the domain. Residues affected by peptide addition were in close proximity to the proposed proline binding site (Thr265, Gly334, Val345, Thr349, Glu352, Gly353, Phe359, Glu373), but KD values were in the millimolar range (~10 mM).

What Is the Structural Origin of the Lack of Catalytic Activity of TbPar42-PPIase?
The structural fold and the spatial organization of amino acids in the putative catalytic center of TbPar42-PPIase resemble the typical active site of a Pin1-type parvulin isomerase. However, although binding of the model peptide Suc-Ala-pThr-Pro-Ala-NH2 seems to occur at the catalytic cleft close to the proline binding pocket (Figure 4D), no isomerase activity could be detected under the tested conditions (Figure 4A,C). Admittedly, the dissociation constant of the peptide was only in the millimolar range, but this is not a sufficient argument to explain the lack of activity, as comparably weak binding affinities have been reported for active parvulins and their substrates [36,37]. Thus, insufficient binding of substrates may not be the origin of catalytic inactivity.
More reasonably, the lack of activity may originate from structural alterations within the catalytic cleft or within the phosphoryl group binding loop when compared to active orthologous or homologous PPIases [38,39]. Goh et al. mentioned that the conservation of residues in the β2-α4 region of TbPar42, which is part of the substrate-binding pocket, is low when compared to homologous regions of other parvulins. According to our structural model of TbPar42-PPIase, there are two significant alterations of residues in the catalytic core region of the protein when compared to hPin1. Position 350 of the proline binding pocket is taken by a tyrosine rather than a methionine, and position 374 of the catalytic site is occupied by a leucine (serine in hPin1). To examine the role of these residues in the activity of the PPIase, we mutated these positions in TbPar42 (TbPar42-PPIase Tyr350Met, TbPar42-PPIase Leu374Ser) and hPin1 (hPin1 Met130Tyr, hPin1 Ser154Leu), and tested the catalytic efficacy of the mutants in a protease-coupled assay (Figure 5). As hPin1 Met130Tyr was still active (Figure 5C), this substitution is not expected to be responsible for the lack of activity in TbPar42-PPIase. This was in accordance with a still inactive TbPar42-PPIase Tyr350Met mutant (Figure 5A). As expected [35], the mutant hPin1 Ser154Leu showed no catalytic activity (Figure 5D). However, no gain of function was observed for the TbPar42-PPIase Leu374Ser mutant.
Besides residues in the catalytic cleft and proline-binding pocket, three basic amino acids (basic triad) residing in the β1-α1 loop of Pin1-type parvulins are functionally important in hPin1. By binding to the phosphoryl group of the substrate, these residues serve as anchor points for the bond rotation occurring in the isomerization step (reviewed in Reference [9]). Thus, not only the absence of this basic triad, but also changes in its arrangement and its relation to the spatial organization of residues in the catalytic center or in the proline-binding pocket may abolish catalytic activity. In TbPar42, the triad of these putative phosphoryl-binding residues (Lys275, Arg280, and Arg281) is conserved with respect to sequence position. Despite this, and in contrast to TbPin1 [40] and other Pin1-type parvulins, no significant chemical shift perturbations were detected in an HSQC spectrum for resonances of the basic triad of the β1-α1 phosphoryl group binding loop of TbPar42-PPIase (Figure 4) upon addition of a phosphorylated peptide. This indicates the absence of an interaction of the triad with the phosphate group of the peptide. We wondered whether structural differences in the arrangement of this loop relative to the catalytic active site residues would prevent binding of the phosphoryl group. Goh and coworkers suggested that an insertion in the β1-α1 region of the protein may affect catalytic activity. To uncover discrepancies in the spatial arrangement, we compared the overall fold of TbPar42-PPIase to those of the known phosphate-binding relatives TbPin1 and hPin1.
As we wanted to focus on the structural orientation of the side chains of the basic triad, and only a few NMR constraints (NOEs) exist for the side chain atoms of these residues, we complemented our structural data by crystallizing the TbPar42-PPIase and solving its structure by X-ray analysis (PDB code: 6GMP; Table S2; see also Figures S4 and S5). After superimposition of the catalytic core residues (Figure 6), the side chains of the basic triad do not differ between the proteins with respect to position and orientation. This excludes the arrangement of the phosphoryl group binding residues as the reason for the absence of activity in TbPar42. Thus, the structural reason for the absence of activity remains unknown. To our knowledge, no parvulins whose PPIase domains lack catalytic activity have been discovered so far in higher eukaryotic organisms. However, the bacterial PPIase domain of the periplasmic chaperone PpiD [41], as well as one PPIase domain of SurA (Par1) [42], also lack catalytic activity. Similar to the PPIase domain of TbPar42, the isomerase activity of PpiD "could not be generated by substitutions at the peptide binding site" [41]. In the case of SurA Par1, the domain was found to cooperate with the second domain Par2 in substrate folding reactions.

NMR Structure Calculation of the N-Terminal Extended FHA Domain
The solution structure of the NterFHA domain of TbPar42 was calculated using 2667 distance NOEs, 262 torsion angle restraints, and 42 hydrogen bonds (Table S3). An overlay of the 10 final lowest energy structures is shown in Figure 7A. The ensemble is deposited in the Protein Data Bank (PDB code: 2N84), while chemical shift assignments were deposited in the BMRB database with the accession number 25834. Although all 10 structures of the N-terminal extended FHA domain (1-177) fit the experimental restraints, the r.m.s.d. value for the backbone atoms of the ensemble was 12.5 Å. It dropped to 0.33 Å when the flexible terminal residues (1-33) (Figure 7B) and the adjacent three helices α1 (Glu33 to Asn37), α2 (Ile42 to Val45), and α3 (Pro48-Tyr57) were excluded from the calculation, indicating an excellent structural resolution of the FHA domain (residues 58-173). Including α3 in the calculations increased the r.m.s.d. value to 0.8 Å for the backbone atoms of the ensemble, while the r.m.s.d. value increased to 1.62 Å when α1 and α2 were additionally included (Table S3).

Secondary Structure Elements of the N-Terminal Extended FHA Domain (1-177)
The FHA domain of TbPar42 was analyzed together with the N-terminus of the protein. The low number of long-range NOEs per residue, as well as low R2 amide relaxation values and negative hetNOE data of the first 30 residues within the N-terminus, point towards a random coil conformation of this segment (Figure 7B). Based on distances and typical NOE connectivities, three well-characterized helices α1 to α3 (Glu33-Asn37, Ile42-Val45, Pro48-Tyr57) adjacent to the unstructured region were detected. Via a long loop (Phe58-Ala70), helix α3 is connected to an array of 11 sequentially arranged β-strands. The first strand, β1, extends from Cys71 to Arg77. The two following strands, β2 and β3, are both four residues long (Leu82 to Gly86 and Phe92-Gly96), whereas β4 covers just two residues (Tyr103 to Val104). β5 starts at Ala115 and extends to His120. β6 (Cys125-Asp130) is connected to β5 via a turn (Gly121-Arg124). The chemical shift index indicates a helix between strands β6 and β7 (Val137-Leu139), but, due to the fast exchanging amide protons and the missing helical NOE connections, the stretch (Leu131-Gly136) corresponds rather to a turn region. Hydrogen bonds to the other strands helped to identify β-strands β8 (Asn142-Arg143) and β9 (Pro149-Leu15). The last two strands extend from Gly155 to Phe160 (β10) and from Val166 to Leu171 (β11). The residues following the last β-strand (Gly172-Ser177) exhibit random coil character.

Tertiary Structure of TbPar42-NterFHA 1-177
Like other known FHA structures, the FHA domain of TbPar42 adopts the typical β-sandwich fold, which is formed by two twisted β-sheets consisting of five (β2-β1-β11-β10-β9) and six (β4-β3-β5-β6-β7-β8) β-strands, respectively (Figure 3B). In both β-sheets, the adjoining β-strands are oriented antiparallel to each other, with the exception of β4, which runs parallel to the neighboring β3. According to 15N relaxation data, the loops connecting β-strands 4/5 and 6/7 exhibit a higher flexibility compared to the other loops or turns within the FHA structure. Except for the 30 N-terminal residues and a few residues at the very C-terminal end, the structure has an overall rigid character. In contrast to the compact β-stranded fold, only a few long-range NOE restraints emerge from the helical part of the protein (Glu33 to Tyr57). Therein, Gln54 of α3 exhibits more than 40 NOE connectivities (e.g., to Phe93 of β3 and to residues Val100, Cys101, and Asp102 within the β3-β4 loop). According to these interactions, α3 is tightly attached to the β-stranded FHA core, in contrast to α1 and α2, for which such NOE contacts are absent.

TbPar42-NterFHA 1-177 Binds pThr-Pro Moieties within Peptides
As FHA domains are known as phospho-threonine recognition and binding modules [5], we examined the affinity of the 15N-labeled NterFHA for the phosphorylated model peptide Suc-Ala-pThr-Pro-Ala-NH2 by performing NMR titration experiments. Within the recorded 2D 1H-15N HSQC spectrum, the presence of the peptide affected the chemical shifts of resonances from residues in several loop regions of the FHA core (Figure 8A). Perturbed chemical shifts were observed for residues in the β6-β7 loop (Ser133-Gly136), for isoleucine 144 in the β8-β9 loop, for serine 112 and surrounding residues in the β4-β5 loop (Glu106, His107, Ile110-His114), for arginine 97 and serine 98 in the β3-β4 loop, and for serine 163 in the β10-β11 loop (Figure 3D). Structural and sequence alignments with other FHA domains indicate that the perturbation mainly affects resonances of residues that are either essential for direct binding to phosphopeptide ligands or for the maintenance of the binding surface (Figure 8B). Arginine 97 from the β3-β4 loop as well as serine 112 from the β4-β5 loop are expected to form a hydrogen bond network with the phosphate group of the ligand's pThr moiety. The conserved glycine 96 at the end of β3 and histidine 114 adjacent to β5 are most likely crucial for the structural stability of the binding site [43]. A significant chemical shift perturbation was also observed for Asn135 and surrounding residues in the β6-β7 loop. In some FHA domains, such as that of Kanadaptin [44], this position is occupied by a histidine residue. The asparagine/histidine is important for interactions with residues flanking the pThr in the ligand [45]. Although the TbPar42-FHA binds the above-mentioned phosphorylated model peptide, the binding affinity is only in the low millimolar range (~4 mM).
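To illustrate how a dissociation constant of this magnitude can be derived from titration data, the following minimal Python sketch computes a combined 1H/15N chemical shift perturbation and fits a 1:1 binding isotherm under fast exchange. It is an illustration under stated assumptions, not the fitting procedure used for the data reported here: the 15N scaling factor of 0.14, the grid-search fit, and all concentrations and shift values are hypothetical.

```python
import math


def combined_csp(d_h, d_n, alpha=0.14):
    """Combined 1H/15N chemical shift perturbation in ppm.

    alpha down-weights the 15N shift difference; 0.14 is an assumed,
    commonly used value, not one taken from this study.
    """
    return math.sqrt(d_h ** 2 + (alpha * d_n) ** 2)


def bound_fraction(p_tot, l_tot, kd):
    """Fraction of protein bound for 1:1 binding (same unit for all inputs)."""
    b = p_tot + l_tot + kd
    return (b - math.sqrt(b * b - 4.0 * p_tot * l_tot)) / (2.0 * p_tot)


def fit_kd(p_tot, ligand_concs, csps, kd_grid):
    """Grid search over KD; for each KD the best maximal shift (d_max)
    follows from linear least squares through the origin."""
    best = None
    for kd in kd_grid:
        f = [bound_fraction(p_tot, l, kd) for l in ligand_concs]
        d_max = sum(x * y for x, y in zip(f, csps)) / sum(x * x for x in f)
        sse = sum((d_max * x - y) ** 2 for x, y in zip(f, csps))
        if best is None or sse < best[0]:
            best = (sse, kd, d_max)
    return best[1], best[2]


# Hypothetical, noise-free titration: 0.2 mM protein, KD = 4 mM, d_max = 0.12 ppm.
protein_mM = 0.2
ligand_mM = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
observed = [0.12 * bound_fraction(protein_mM, l, 4.0) for l in ligand_mM]

kd_grid = [i / 10 for i in range(1, 201)]  # 0.1 to 20.0 mM in 0.1 mM steps
kd_fit, d_max_fit = fit_kd(protein_mM, ligand_mM, observed, kd_grid)
print(f"KD = {kd_fit:.1f} mM, d_max = {d_max_fit:.3f} ppm")
```

With a KD in the millimolar range, the protein is far from saturated at typical ligand concentrations, so the fitted maximal shift and the KD are strongly correlated; real data with noise would therefore give a considerably less certain estimate than this synthetic example suggests.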
Tight binding of FHA domains to phosphorylated motifs greatly depends on the ligand's residues flanking the phosphorylated threonine, particularly on the third position (pT+3) following pThr [43]. According to this so-called "pT+3" rule, FHA domains can be divided into pTxxD or pTxxI/V (x representing any amino acid) binding modules. To test whether TbPar42-FHA has a preference for a certain residue at this position, we generated synthetic peptides and tested their binding affinity towards the FHA domain by performing NMR titration experiments. As the proposed peptide-binding surface of the FHA is positively charged (Figure 8C), we used the sequence Ala-Glu-Ala-pThr-Glu-Ala-Xaa as the peptide scaffold and generated derivatives by altering position Xaa (Asp, Ile, Val, Ser, or Gln). Addition of these peptides to the protein perturbs NMR resonances of residues located in the same regions as observed on binding of Suc-Ala-pThr-Pro-Ala-NH2 but, in addition, induces chemical shift changes for resonances of amino acids in the β10-β11 loop region. However, no significant improvement in binding affinity compared to the Suc-Ala-pThr-Pro-Ala-NH2 peptide could be measured (Table S4).
Figure 8. (C) Electrostatic surface of the NterFHA33-177 generated with YASARA using the Particle Mesh Ewald method. The intensity of the electrostatic potential is gradually colored from dark red (negative) over grey (neutral) to dark blue (positive), representing energy levels from −150 to +150 kJ/mol. The orientation of the structure is as in (B).

The Isolated Linker Region between the NterFHA and PPIase Domain Is Flexible and Unfolded
As we were able to completely assign the amide resonances in the 1H-15N-HSQC spectra of the NterFHA 1-177 (Figure 9B) and PPIase 264-383 (Figure 9A) domains, we expressed and purified the 13C-15N-labeled linker region (residues 172-266) connecting the two domains. The signal dispersion in the 1H-15N-HSQC spectrum, as well as the chemical shift index analysis of the linker, indicate that the region is mostly random coil (Figure 9C). The backbone amide groups of the N-terminal stretch Pro174-Lys178 of the linker are absent in the 1H-15N-HSQC spectrum, resulting in 96.6% assignment completeness. NMR shift data indicate a long α-helix extending from Arg179 to Val197 (Figure S1). Upon calculation of a Rosetta model (Figure S2) [46], five putative transient helices are predicted within this linker region.

There Is No Interaction between the Domains of TbPar42
Due to the high molecular weight of the protein, we analyzed the domains of TbPar42 as isolated constructs. However, the structures and activities of these isolated domains may be modulated by inter-domain interactions within the protein. To investigate such putative modulations, the 2D-TROSY spectrum of triple-labeled 2H-13C-15N-TbPar42 protein was superimposed with the assigned 1H-15N-HSQC spectra of the isolated constructs recorded under identical conditions (Figure 9D).
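The logic of such a spectral comparison can be sketched in a few lines of Python: a peak of an isolated construct counts as matched if the full-length spectrum contains a peak within a small tolerance in both dimensions. This is only an illustration of the principle; the tolerances of 0.02 ppm (1H) and 0.2 ppm (15N), the peak labels, and all shift values below are hypothetical and are not taken from the spectra discussed here.

```python
def match_peaks(reference, query, tol_h=0.02, tol_n=0.2):
    """Split reference peaks into those with and without a query peak
    lying within the given 1H and 15N tolerances (ppm).

    reference: dict mapping a peak label to a (1H, 15N) shift pair
    query: list of (1H, 15N) shift pairs from the full-length spectrum
    """
    matched, unmatched = [], []
    for label, (h, n) in reference.items():
        hit = any(
            abs(h - qh) <= tol_h and abs(n - qn) <= tol_n
            for qh, qn in query
        )
        (matched if hit else unmatched).append(label)
    return matched, unmatched


# Hypothetical peak lists (ppm); labels are placeholders, not assignments.
isolated_domain = {
    "peak_a": (8.31, 121.4),
    "peak_b": (7.95, 115.2),
    "peak_c": (8.60, 125.0),  # e.g., a residue at a construct terminus
}
full_length = [(8.32, 121.5), (7.95, 115.1), (8.75, 126.3)]

matched, unmatched = match_peaks(isolated_domain, full_length)
print(f"{len(matched)} matched, {len(unmatched)} shifted or missing")
```

Peaks that fall outside the tolerance are the candidates for interface residues or for residues affected by the altered chain termini of the isolated constructs.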
Interaction of domains should lead to significant chemical shift changes in the full-length protein when compared to its isolated domains. However, apart from a few terminal residues of the isolated constructs, 353 signals from all three HSQC spectra perfectly matched the TROSY spectrum, indicating that there are neither interactions between the individual domains nor conformational rearrangements within the linker region under the tested buffer conditions.
(... as well as in Chlamydomonas [3] and Dictyostelium [2]). The FHA domain of TbPar42 seems to be involved in the binding of phosphorylated target proteins, as shown by NMR titration experiments with model ligand peptides. The dissociation constants of the tested peptides, however, were only in the millimolar range, most likely indicating that the optimal binding sequence for this protein domain remains to be elucidated. Nevertheless, binding occurred in regions found to interact with target proteins in homologous and related FHA structures (Figure 8A,B). As the TbPar42 protein is unique to parasitic protozoan organisms and no interaction partners have been published so far, we can only speculate about its functional role by relying on similar and orthologous domains and proteins in other organisms.
In order to find orthologous proteins for the FHA domain of TbPar42, we ran a DALI search, which returned the highest Z-scores (13 and 12.2), as well as the highest sequence identity (37%), for the FHA domains of the proteins Kanadaptin (human) and DAWDLE (DDL; Arabidopsis thaliana). Both representatives are nuclear-localized proteins. While almost nothing is known about the function of Kanadaptin, DDL appears to act in multiple developmental processes such as growth of root and shoot, as well as in floral morphogenesis and fertility [47]. DDL was reported to stabilize the interaction between the protein Dicer-like 1 (DCL1) and a pre-miRNA [48], and is therefore involved in the biogenesis of small RNAs and in gene silencing. Its human orthologue, Smad nuclear interacting protein 1 (SNIP1), has been attributed a similar function [48]. Interestingly, the human parvulin hPar14 was also found to be involved in ribosome biogenesis and RNA processing [49], and it binds to double-stranded nucleic acids as well. In addition to the cellular functions known for DDL in Arabidopsis, human SNIP1 is an inhibitor of the TGF-β and NF-κB signaling pathways, competing with the TGF-β canonical signaling protein Smad4 and the NF-κB transcription factor p65/RelA for binding to the transcriptional coactivator p300 [50,51]. Notably, human parvulins also act in these signaling pathways. The hPin1 protein has been demonstrated to be involved in a variety of these signaling events, e.g., in the canonical (Smad signaling) and non-canonical (Ras/ERK and PI3K/Akt) pathways of TGF-β (as reviewed in References [52,53]) and of NF-κB [54]. In addition, hPar14 was found to act in insulin-activated PI3K-Akt signaling (reviewed in Reference [9]). The fact that human parvulins are involved in cellular events in which orthologous proteins of the parvulin TbPar42 are also involved may point towards a role of TbPar42 in TGF-β and NF-κB signaling, as well as in RNA biogenesis and stability. These activities may be important for Trypanosoma to control host defense mechanisms and immunity.

TbPar42 May Interact as a Protein Recruitment Platform
The domains of TbPar42 are lined up like pearls on a string. They seem to act independently of each other, or they may interact only after binding to a yet unknown binding partner. Following this hypothesis, TbPar42 would function as a specific dynamic recruitment platform, a scaffold protein that allows weakly associating proteins to be engaged in higher-order assemblies [55]. Such assemblies allow a cell to gain spatiotemporal control over protein activity. Many cellular events, such as signal transduction cascades or gene activation and its control processes, involve such transient buildup of higher-order protein complexes. SNIP1 and the related eukaryotic parvulin proteins such as hPar14 and hPar17 have already been demonstrated to be involved in such events and assemblies [49,56-58]. An alternative explanation for the absence of domain interaction involves post-translational modification (PTM) of the protein, especially of its linker region. A NetPhos 3.1 data search [59] predicted a plethora of phosphorylation sites within this region (Figure 10). In the N-terminus, only one such site (Thr32 by CKII) was found.
Most of the detected phosphorylation sites in the linker are at the beginning of or within the helical regions predicted in our Rosetta model (Figure S2). Post-translational modifications often act in stabilizing transient helices, or in preventing the formation of such secondary structure elements. The amount and sequential arrangement of PTMs within a mainly unstructured large linker region (according to our NMR data) characterize this part of the protein as an intrinsically disordered region (IDR) [60,61]. Such regions play a general regulatory role in signaling and controlling pathways [62-65].
Notably, all kinases (except cdc2) which, according to the NetPhos prediction, execute the post-translational modifications of the TbPar42 linker region (Figure 10) are involved in Wnt signaling [66,67] and are active in cytoskeleton organization, mitotic regulation, neuronal patterning, and cell fate decision. Many of these events have been demonstrated to be influenced by the parvulins Pin1 and Par14/17 [9,68,69] in human cells. The IDR character of TbPar42 and the involvement of relatives from other organisms in protein assemblies seem to support the recruitment platform hypothesis for this protein.
Supplementary Materials: The following are available online at http://www.mdpi.com/2218-273X/9/3/93/s1, Table S1: Structural statistics for the deposited ensemble of 10 lowest energy PPIase structures, Table S2: Structural statistics for the deposited PPIase structure solved by X-ray analysis, Table S3: Structural statistics for the deposited ensemble of 10 lowest energy NterFHA structures, Table S4: Mean KD values after titration of Ala-Glu-Ala-pThr-Ala-Glu-Xaa-peptides and Suc-Ala-pThr-Pro-Ala-NH2 to NterFHA of TbPar42, Table S5: Oligonucleotides used as forward and reverse primers for PCR-amplification of the constructs. Figure S1: NMR shift data of the linker region of TbPar42 (A) and secondary structure predictions, Figure S2: CS Rosetta model of the TbPar42 linker region, Figure S3: Sequence alignment of the PPIase domains of TbPar42 and hPin1, Figure S4: NMR structure (PDB-ID: 2N87) of the PPIase domain of TbPar42, Figure S5: Comparison of X-ray structure (PDB-ID: 6GMP) and NMR structure (PDB-ID: 2N87) of the PPIase domain of TbPar42.