Human African trypanosomiasis, or sleeping sickness, is endemic in some of the poorest countries of Central Africa. The causative agent of the disease is Trypanosoma brucei, an extracellular eukaryotic flagellate parasite [1]. According to the World Health Organization (WHO), only four drugs are registered to treat the disease. However, their use is complicated by toxicity and the risk of parasite resistance [3]. Discovering and manufacturing drugs can be exorbitantly expensive; computational methods offer an alternative route that can help circumvent this problem [4].
T. brucei processes its mitochondrial pre-mRNA into translatable mRNA through post-transcriptional RNA editing that relies heavily on uridylate insertion and deletion [5]. The mitochondrial DNA takes the form of maxicircles and minicircles. The sequence of pre-edited mRNAs is modified by a multi-protein complex called “the editosome”, directed by guide RNAs (gRNAs). These gRNAs are encoded by the minicircles and serve as templates for modifying the pre-mRNA [6]. The partial complementarity of a gRNA with the pre-mRNA defines the sites at which uridylate nucleotides (U) are inserted or deleted. This process is repeated multiple times with different gRNAs, resulting in translatable mRNA [5].
Poly(A) polymerases (PAPs) and terminal uridylyl transferases (TUTases) are enzymes that catalyze the transfer of a nucleotide (adenylate and uridylate, respectively) to a hydroxyl group acceptor [8]. Both enzymes are members of a distinct nucleotidyl transferase superfamily called DNA polymerase beta, or Pol Beta, and share a signature helix-turn motif hG[G/S]X9-13Dh[D/E]h (X signifies any amino acid; h signifies a hydrophobic amino acid) [10]. In most cases, a triad of acidic residues binds the two divalent metal ions required for catalysis. The chemical mechanism is also conserved throughout these polymerases: the 3′-hydroxyl group of the RNA primer attacks the α-phosphate of uridine triphosphate (UTP), releasing pyrophosphate without forming a covalent intermediate. One divalent metal cation (typically Mg2+) facilitates this reaction by lowering the affinity of the 3′-hydroxyl for its hydrogen, while the second metal cation helps to stabilize the pyrophosphate leaving group. Moreover, structural analysis of several enzymes in the nucleotidyl transferase family demonstrates a conserved N-terminal polymerase domain topology: a five-stranded mixed beta-sheet flanked by two or three alpha-helices [10]. So far, the structures of only four T. brucei TUTases have been elucidated (Figure 1); we focus on these enzymes and compare them in several respects throughout this study.
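As an illustration of the signature motif hG[G/S]X9-13Dh[D/E]h described above, the following minimal Python sketch scans a protein sequence for candidate matches. The hydrophobic residue set (AVILMFWY), the function name find_motif, and the toy sequence are assumptions made for illustration; they are not taken from this study, and a different definition of "hydrophobic" would change the hits.

```python
import re

# Assumption: the residues treated as "hydrophobic" (h) in the motif.
HYDROPHOBIC = "AVILMFWY"

# h G [G/S] X(9-13) D h [D/E] h, wrapped in a lookahead so that
# overlapping candidate matches are also reported.
MOTIF = re.compile(
    rf"(?=([{HYDROPHOBIC}]G[GS][A-Z]{{9,13}}D[{HYDROPHOBIC}][DE][{HYDROPHOBIC}]))"
)


def find_motif(sequence):
    """Return (start_index, matched_segment) for each candidate motif."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(seq)]


# Hypothetical toy sequence, not a real TUTase sequence.
toy_sequence = "MK" + "LGS" + "A" * 10 + "DVDL" + "KK"
hits = find_motif(toy_sequence)
print(hits)
```

A match found this way is only a sequence-level candidate; confirming a functional metal-binding site would still require structural or biochemical evidence.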
Uridylylation catalyzed by mitochondrial RNA editing TUTase 1 (RET1) takes place at the 3′ oligo(U) tails of gRNAs, as well as on ribosomal RNAs (rRNAs) and some mRNAs. Moreover, RET1 has been shown to have high substrate affinity for single-stranded RNAs [11]. Studies of the recombinant protein from the related parasite Leishmania tarentolae concluded that RET1 oligomerizes and can add hundreds of uridylates to unstructured RNAs longer than 10 nucleotides. In vivo studies, on the other hand, found that the U-tails of both gRNAs and rRNAs are limited to approximately 15 nucleotides, indicating that the processivity of this enzyme is controlled [11]. Investigations have shown that the majority of RET1 exists in a complex called the mitochondrial 3′ processome, which is responsible for the recognition, uridylylation, and exonucleolytic processing of gRNA precursors, along with U-tail addition to mature gRNAs [12]. Crystal structures of T. brucei RET1 revealed the nucleotidyl transferase bi-domain as well as the RNA Recognition Motif (RRM) and functionally important C2H2 zinc finger domains (Figure 1).
RNA editing TUTase 2 (RET2), a TUTase also found in the mitochondrial extract of T. brucei, is a subunit of the RNA-editing core complex (also known as “the editosome”) and is responsible for gRNA-specified U addition to precleaved double-stranded RNA during mRNA editing [13]. Despite the differences in biological role between TbRET1 and TbRET2, the two TUTases share 24% sequence similarity across their N-terminal catalytic domain (NTD) and C-terminal domain (CTD) [5]. In TbRET2, a 107-residue-long middle domain (MiD) is inserted between two β-strands at the C-terminus of the NTD (Figure 1B). The MiD is a structural feature unique to TbRET2 among all members of the nucleotidyl transferase superfamily. This domain extends into the solvent while interacting with the CTD. Moreover, the MiD shows a surface charge consistent with a potential role in RNA binding and is responsible for RET2-MP81 binding within the editosome complex. TbRET1 contains a similar domain called the RNA Recognition Motif (RRM) domain (Figure 1). Although the RRM and MiD differ in primary structure, their positions are conserved and lie away from the catalytic site. Deleting the MiD from TbRET2 or the RRM from TbRET1 inhibits enzymatic activity, suggesting a role in protein folding or RNA substrate binding.
The smallest of the TUTases, TbTUT4, serves as a minimal model for this class of nucleotidyl transferases [8]. It is also an RNA-dependent, U-specific nucleotidyl transferase. It shares 30% sequence identity with TbRET2; consistent with this similarity, the NTD and CTD of TbTUT4 also form a spherically shaped bi-domain. In addition, TbTUT4 lacks any auxiliary domain like those found in TbRET1 (C2H2 Zn finger) or TbRET2 (MiD). TbRET2 and TbTUT4 differ in their biological functions, domain composition, sub-cellular localization, and RNA substrate specificity [8]. Mitochondrial TbRET2 is active on single-stranded and double-stranded RNA, while cytosolic TbTUT4 accepts exclusively single-stranded RNA as a substrate.
Another mitochondrial TUTase, T. brucei Mitochondrial Editosome-Like Complex Associated TUTase 1 (TbMEAT1), adds further complexity to the structure–function relationships of the TUTases described above. Similar to TbRET2, TbMEAT1 associates with a protein complex resembling the editosome [13]. Within this complex, TbMEAT1 effectively replaces the U-insertion subcomplex of the editosome, which consists of REL2, MP81, and RET2. Moreover, this enzyme is exclusively U-specific and capable of U insertion into both single-stranded and double-stranded RNA. Despite TbMEAT1’s low sequence identity with TbRET2 (12%) and TbTUT4 (14%), it still adopts a bi-domain architecture that forms a deep cleft containing an active site similar to those of the TUTases mentioned above (Figure 1). A unique domain in TbMEAT1 is the bridge domain (BD), which replaces the unstructured loop regions in TbRET2 and TbTUT4 [13]. Moreover, several active site residues that are common among trypanosomal TUTases are replaced in TbMEAT1 with either similar residues or residues with altered charge or polarity [13].
To summarize, all T. brucei TUTases mentioned above contain an NTD and a CTD and share a bi-domain topology. This bi-domain forms a catalytic cleft with a common UTP-binding scheme in each enzyme. A divalent metal ion coordinates the triphosphate moiety of UTP along with three invariant aspartates. Additionally, amino acids surrounding the uracil base provide further stabilization through water-mediated or direct hydrogen bonding, in addition to base-stacking interactions provided by a Tyr (Phe in TbMEAT1) residue. TbRET1, TbRET2, and TbMEAT1 are found in the mitochondria, unlike the cytosolic TbTUT4. All of these enzymes except TbTUT4 have been shown to be essential for T. brucei viability (see Table 1 for an overview of the differences). The biological role of TbTUT4 has yet to be elucidated.
Although we understand the biochemical function of these biomolecules, a detailed description of how protein dynamics determines their biological role remains a scientific challenge [14]. Most enzymes undergo a series of conformational changes before and/or after they interact with their substrate. Experimental techniques used to probe these dynamical systems are generally limited in their spatiotemporal resolution, and most report ensemble averages. Computational simulations address this drawback by providing atomic scale spatial and femtosecond temporal resolution of molecular systems [16]. The data provided by computation improve our mechanistic understanding of conformational change, leading to insights not accessible by experiments.
In this work, we compared various aspects of the active sites of these four TUTases to support future structure-based drug design efforts. The four TUTases we analyzed have a conserved catalytic pocket topology, making this pocket an attractive target for trypanocide development. We utilized POVME, a computational tool that characterizes the shape and size of protein pockets. With this program, we can quantitatively compare binding-pocket volumes as a function of time, information that is not accessible to experiments. From the results, we quantitatively compare the TUTases and provide data to guide future drug discovery efforts.
MD simulations and POVME analysis provided a means to quantitatively compare how the active site shape and volume differ among the TUTases. Upon inspecting the different cluster centroids, we found that rigid-body side chain translations, rather than side chain rotations, account for most of the active site changes.
The clustering algorithm generated 8 and 6 clusters for TbTUT4 and TbMEAT1, but only 3 and 2 clusters for TbRET1 and TbRET2, respectively. This suggests that the TbTUT4 and TbMEAT1 trajectories sample relatively more diverse structures that modify the shape of the active site. Several factors contribute to TbTUT4’s pocket shape heterogeneity, in particular the loop region connecting β-strands β1 and β2 (Figure 2D) and the flexible region connecting β4 and β5. The flexible loop within the CTD, which consists of residues 275 to 291, also exhibits large displacements when all centroids of TbTUT4 are compared. TbMEAT1, on the other hand, exhibited an increasing solvent accessible surface due to separation between β3 and β4. When the transiently open active site regions of the TUTases (designated by green surfaces) are compared in Figure 2, TbMEAT1 contains a unique extended region between these two β-strands. The loop connecting α1 and α2 also modulates the active site volume through a significant translation towards the center of the pocket, as observed in Cluster 15. This is quantitatively supported by the fact that Cluster 15 has the lowest average pocket volume of the TbMEAT1 clusters. In TbMEAT1, the loop connecting β1 and β2 (Figure 2C) did not show much flexibility compared to the other TUTases.
In TbRET1, the loop joining β1 and β2 (Figure 2A) is flexible enough to modulate the volume of the active site. We also observed a side pocket forming between β4 and β5 in TbRET1 (Cluster 5). The fact that the TbRET1 trajectory generated only three clusters implies that the enzyme samples a less diverse conformational space, indicating that side pocket formation is not a rare event. This side pocket could be exploited to develop more selective drugs that are potent against TbRET1. Moreover, this enzyme exhibits the greatest pocket volume, making it an even more attractive target for inhibitor design. It should also be noted that TbRET1 was simulated longer than the other enzymes only because of the availability of resources. To verify that the larger volume observed was not due to the longer simulation time, we calculated the average volume over the first 50 ns of the TbRET1 MD copies instead of the entire 250 ns. It was found to be 2507.03 ± 137.44 Å³, still the greatest among the four TUTases.
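The time-window check described above can be sketched as follows. This is a minimal illustration, assuming the per-frame pocket volumes have already been extracted into a list paired with frame times in nanoseconds; the function name window_stats and all numerical values are hypothetical and are not data from this study.

```python
from statistics import mean, stdev


def window_stats(times_ns, volumes, t_max_ns):
    """Mean and sample standard deviation of volumes with time <= t_max_ns."""
    selected = [v for t, v in zip(times_ns, volumes) if t <= t_max_ns]
    if len(selected) < 2:
        raise ValueError("need at least two frames inside the time window")
    return mean(selected), stdev(selected)


# Hypothetical frame times (ns) and pocket volumes, for illustration only.
times_ns = [10, 20, 30, 40, 50, 100, 150, 200, 250]
volumes = [2400.0, 2500.0, 2600.0, 2450.0, 2550.0,
           2700.0, 2300.0, 2650.0, 2350.0]

early_mean, early_sd = window_stats(times_ns, volumes, 50)
full_mean, full_sd = window_stats(times_ns, volumes, 250)
print(f"first 50 ns: {early_mean:.2f} +/- {early_sd:.2f}")
print(f"full 250 ns: {full_mean:.2f} +/- {full_sd:.2f}")
```

If the truncated and full-length averages agree within their spread, the longer simulation time is unlikely to explain a larger mean volume.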
TbRET2 shows the fewest clusters, implying the least conformational heterogeneity. A side pocket can be seen forming between α1 and α1. In addition, the side pocket seen in TbRET1 between β4 and β5 is also observed in TbRET2, but with a smaller volume.
The observed side pocket in TbRET1 offers an opportunity to design specific and potent inhibitors (Figure 4A), since this centroid belongs to the most populated cluster (Cluster 1), which contains 76% of the TbRET1 simulation data. This suggests that most conformations sampled by the enzyme contain this additional cavity.
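A cluster population such as the 76% quoted above is simply the fraction of trajectory frames assigned to that cluster. The sketch below shows this bookkeeping under the assumption that each frame carries one integer cluster label; the function name cluster_populations and the toy labels are hypothetical and do not reproduce the clustering output of this study.

```python
from collections import Counter


def cluster_populations(assignments):
    """Map each cluster label to its fraction of frames, largest first."""
    if not assignments:
        raise ValueError("no frame assignments provided")
    counts = Counter(assignments)
    total = len(assignments)
    return {label: n / total for label, n in counts.most_common()}


# Hypothetical per-frame cluster labels, for illustration only.
frame_labels = [1] * 76 + [5] * 14 + [3] * 10
populations = cluster_populations(frame_labels)
for label, fraction in populations.items():
    print(f"Cluster {label}: {fraction:.0%} of frames")
```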
Additionally, we inspected the residues surrounding this side pocket region and compared the different centroids. Our POVME data suggest that this pocket forms not because of side chain rotations but because of backbone translation. For example, the largest side chain group within this loop region (residues 303 to 310), the indole group of Trp 305, showed a displacement of 2.79 Å, measured between the pyrrole-ring nitrogen positions in the compared centroids. We also evaluated the druggability of this site using FTMap [23]. The FTMap program identifies the most energetically favorable binding regions by flooding the surface of the protein with small probe molecules and computing their interaction energies. Probe molecules were found to bind this region, further supporting the idea that future efforts should focus on exploiting this region for inhibitor design.
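The atomic displacement reported above for Trp 305 is a Euclidean distance between the positions of the same atom in two structures. A minimal sketch follows, assuming both centroid structures have already been superimposed on a common reference frame; the function name atom_displacement and the coordinates are hypothetical and are not taken from the simulations.

```python
import math


def atom_displacement(coord_a, coord_b):
    """Euclidean distance (same units as the input) between two 3D points."""
    if len(coord_a) != 3 or len(coord_b) != 3:
        raise ValueError("coordinates must be (x, y, z) triples")
    return math.dist(coord_a, coord_b)


# Hypothetical coordinates (in angstroms) of one atom in two aligned centroids.
position_in_centroid_1 = (12.0, 4.5, -3.0)
position_in_centroid_2 = (13.0, 6.5, -1.0)

displacement = atom_displacement(position_in_centroid_1, position_in_centroid_2)
print(f"displacement: {displacement:.2f} angstroms")
```

Without prior superposition, the distance would mix whole-protein motion with local loop movement, so the alignment assumption matters for interpreting the number.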