Sarcocystis neurona Protein Kinases: Computational Identification, Evolutionary Analysis and Putative Inhibitor Docking

The apicomplexan parasite, Sarcocystis neurona causes the degenerative neurological equine protozoal myeloencephalitis (EPM) disease of horses. Due to its host range expansion, S. neurona is an emerging threat that requires close monitoring. In apicomplexans, protein kinases (PKs) have been implicated in a myriad of critical functions such as host cell invasion, cell cycle progression and host immune responses evasion. Here, we used various bioinformatics methods to define the kinome of S. neurona and phylogenetic relatedness of its PKs to other apicomplexans. Further, three-dimensional (3D) homology models for selected S. neurona putative PKs were constructed and evaluated for inhibitor docking. We identified 92 putative PKs clustering within the AGC, CAMK, CK1, CMGC, STE, TKL, aPK and OPK groups. Although containing the universally conserved PKA (AGC group), S. neurona kinome was devoid of PKB and PKC, but contained the six apicomplexan conserved CDPKs (CAMK group). The OPK group was represented by ROPKs 19A, 27, 30, 33, 35 and 37, but was devoid of the virulence-associated ROPKs 5, 6, 18 and 38. Two out of the three S. neurona CK1 enzymes had high sequence similarities to T. gondii TgCK1-α and TgCK1-β and the Plasmodium PfCK1. Docking of four inhibitors onto homology models of putative ROP27 and PKA indicated that inhibition of S. neurona PKs is feasible, but needs to be experimentally tested. The essentiality of apicomplexan PKs makes the elucidation of S. neurona kinome a key milestone for development of novel therapeutics for EPM.


Introduction
Equine protozoal myeloencephalitis (EPM) is an infectious, progressive, degenerative neurological disease of horses caused by the apicomplexan parasite, Sarcocystis neurona [1]. To complete its life cycle, this heteroxenous parasite requires a reservoir host (i.e. opossums; Didelphis virginiana, D. albiventris) and an aberrant (horses) or intermediate host (cats, skunks, raccoons and sea otters) [2]. Opossums become infected upon ingestion of sarcocysts containing hundreds of bradyzoites, which undergo gametogony that sporulate into mature oocysts, which are then shed in the faeces. Upon ingestion by the intermediate or aberrant hosts, the oocysts release the environmentally resistant sporozoites, which chronically parasitize the neural and inflammatory cells of the host's central nervous system (CNS). Clinical EPM symptoms depend on which part of the CNS is parasitized; parasitization generally results in abnormal gait, dysphagia and muscle atrophy in affected horses [3].
The intracellular nature of S. neurona and its ability to evade the host's immune surveillance [4] makes EPM treatment expensive, lengthy and challenging. Traditionally, clinical treatment of EPM Here, we used a genome-wide approach to define the kinome of S. neurona and determined the relatedness of the putative PKs to those reported in other apicomplexans. We also tested the docking of some specific inhibitors to a selection of some of the putative PKs. Defining the S. neurona kinome is not only important in providing insights into the parasite's biology, but also identification of potential novel drug targets that can be used to clear chronic S. neurona infections, reduce parasite survival, augment the host's anti-parasitic immune responses, or as candidates for development of EPM vaccines.

Sarcocystis neurona encodes 92 putative kinases
To date, at least 15 apicomplexan genomes (coccidians, gregarines, hemosporidians and piroplasmids) have either been fully sequenced or partially annotated [35]. In the current study, we conducted an exhaustive genome-wide search of the newly sequenced S. neurona genome [36], and identified 92 putative PKs ( Table 1). The identified PKs contained the characteristic PK (IPR000719) or PK-like (IPR011009) domains and three conserved amino acids constituting the catalytic triad (Lys30, Asp125, Asp143). The PKs had sizes ranging between 152 and 6544 amino acids and relative molecular weights of between 15.94 to 671.51 kDa. Majority of the PKs had isoelectric point (pI) greater than 7.0, implying that the PKs have low turnover rates since in general, acidic proteins are thought to be degraded more rapidly than neutral or basic proteins [37].
Assignment of S. neurona PK groups was accomplished by sequence clustering using Blast2GO [62], and by BLASTp searches in the Kinase.com database [14]. Out of the eleven known PK groups [12][13][14], S. neurona PKs segregated into the AGC (n=9), CAMK (n=20), CK1 (n=3), CMGC (n=19), STE (n=2), TKL (n=6), aPK (n=2) and OPK (n=31) ( Table 1). Apart from the 31 OPKs that do not fit to the major kinase groups, the CAMK and CMGC groups, whose members are essential for the parasite's host cell invasion [38] and differentiation (via cell-cycle regulation) [39], respectively, had the highest number of PKs underlying the importance of these processes in the parasite. Unlike in some parasites (e.g. P. falciparum) that lack STEs and TKLs [40], S. neurona contains PKs in the two classes. Table 1. Description of the 92 putative PKs identified in the kinome of S. neurona. The putative PKs could be classified into eight groups. The amino acid coordinates of the conserved PK domains in the protein sequences, and the PK homologies to other apicomplexan PKs are shown in columns 7 to 12.

Description of the putative protein kinases (PKs) in the genome of S. neurona
Description of protein kinase (PK) homologies (BLASTp) Protei n ID ᵃ

The AGC group
The numbers of apicomplexan AGCs range from four (in B. bovis) to 15 (in T. gondii) [15]. Based on our Blast2GO annotations and BLASTp homology searches against the kinome database, five out of the nine S. neurona AGCs (SRCN_3339, SRCN_3990, SRCN_5165, SRCN_5610 and SRCN_1312) were homologs to the universally conserved PKAs that are found in N. caninum and T. gondii (see Table 1). The PKAs are essential for completion of schizogony (asexual reproduction) in Plasmodium parasites [42]. Further, S. neurona contains a putative PKG (SRCN_4518) which shows high homology (92%) to the T. gondii TgPKG1 ( Table 1); PKGs are in essential apicomplexans [43].

The CMGC group
The CMGC is the largest PK group in apicomplexans; CMGC numbers range from 15 in B. bovis to 23 in P. vivax [15], which is within the range we identified in the S. neurona kinome in our study (i.e. 19 CMGCs; See Table 1). Notable of these were the two GSK homologs (SRCN_1731 and SRCN_1732). This finding is similar to what has been observed in Plasmodium parasites in which two GSK-3 enzymes have been reported, both of which are essential for the parasite [47]. Homology searches showed considerable sequence similarity (51% and 41% for SRCN_1731 and SRCN_1732, respectively) to the PfGSK-3 enzymes (data not shown). Notably, eight of the 19 CMGCs in S. neurona were CDKs, including CDK7 (SRCN_4674, SRCN_2759 and SRCN_761), CDK10 (SRCN_895) and CDK11 (SRCN_977). Available data show that CDKs are essential in P. falciparum [35]. We also identified two putative MAPK homologs (SRCN_4209 and SRCN_5365), and ERK7 (SRCN_6472) (see Table 1), a result which is comparable to the two MAPKs in the kinome of P. falciparum [15].

The OPK group
The apicomplexan-specific OPKs are a tight cluster of PKs without clear relation to any of the other major PK groups. Notable of these are ROPKs which have high sequence divergence and have been thought to be largely restricted to T. gondii [48], which has a total of 34 members spread in over 40 distinct sub-families [27]. Although their diversification in apicomplexans is poorly understood, some ROPKs are key virulence factors in T. gondii [27]. At least nine putative ROPKs could be identified in S. neurona, including ROPK19A (SRCN_6184), ROP27 (SRCN_3247), ROP30 (SRCN_2076), ROP33 (SRCN_7082 and SRCN_7086), ROP35 (SRCN_2183, SRCN_2123, SRCN_7083 and SRCN_4410) and ROP37 (SRCN_7084), implying that the ROPKs are not restricted to T. gondii. Although largely presumed to be inactive, ROPKs are implicated in the regulation of the host transcription [48], and their presence in S. neurona may support the hypothesis that the ROPKs have a unique activation mechanisms in their regulatory functions that facilitate apicomplexan pathogenesis [35,49]. Other notable OPKs included two parasite-specific eukaryotic initiation factor-2 (elF2) kinases (elF2K-C [SRCN_1606] and elF2K-B [SRCN_4503]), four NEKs (SRCN_4528, SRCN_2630, SRCN_286 and SRCN_3151) and four ULKs (SRCN_3444, SRCN_3669, SRCN_6812 and SRCN_6157) ( Table 1). The elF2Ks are conserved in apicomplexans, and are important for the induction of parasite differentiation into the bradyzoites cysts, which are clinically important [29].

The STE group
The STEs are poorly represented in apicomplexans, and although most apicomplexans have one or two STE genes per genome, some parasites such as C. parvum are reported to harbor up to six STEs [15,17]. Our results suggest that S. neurona has at least one putative STE ( Table 1). STEs are thought to function in MAPK pathway cascades despite the fact that this pathway is absent in apicomplexans. The small repertoire of apicomplexan STEs is in contrast to that reported in other parasites such as trypanosomatids, in which these enzymes regulate the length of the flagella [50].

The TKL group
Apicomplexans harbor a maximum of seven TKL-coding genes, which makes it notable that we identified six putative TKLs in S. neurona ( Table 1). In Plasmodium, six TKLs are conserved, and are essential in the asexual proliferation of the parasites, hence the TKLs are thought to be ideal drug targets [22]. Two of the six S. neurona putative TKLs had considerable sequence similarities to the Plasmodium TKLs, including SRCN_3466 (36% similar to P. malariae TKL1) and SCRN_1435 (49% similar to P. ovale TKL3) (data not shown).

The aPK group
The aPKs share sequence similarity to the OPKs. Although apicomplexan kinomes are generally thought to be devoid of aPKs, at least four genes in T. gondii are thought to encode these enzymes, the products of which are hypothesized to be part of the Ovoid Mitochondrial Cytoplasmic (OMC) Complex [51], a composite assembly of organelles observed only in growing tachyzoites of T. gondii, but its functions are currently unclear. Our analysis revealed two putative MEK-related aPKs in S. neurona (see Table 1). A representative of an MEK-related aPK (PfPK7) is reported to be expressed during Plasmodium infections in humans [52].

Evolution of S. neurona protein kinases
We phylogenetically deciphered evolutionary relationships among the various PK groups in the S. neurona kinomes with the PKs in other apicomplexans, which revealed that the S. neurona kinome has slightly fewer AGCs (n=9) compared to the kinomes of T. gondii (n=11), N. caninum (n=13) and H. hammondi (n=15). In general, the phylogenetic clustering of the S. neurona AGCs mirrored the homologies of these enzymes to those of the three apicomplexans used in this study (Figure 1; compare with Table 1).
Sequence analysis of S. neurona AGCs revealed significant divergence with only ~30% sequence similarity. Two S. neurona AGCs (SRCN_5610 and SRCN_3990) clearly cluster with T. gondii PKAs TGME49_028420 and TGME49_015670 [53] (Figure 1). Moreover, SRCN_5610 shares ~60% full length sequence identity with its ortholog. Further, two other S. neurona putative PKAs (SRCN_5165 and SRCN_3339), which also distinctly clustered with their T. gondii orthologs (TGME49_026030 and TGME49_086470, respectively) (Figure 1), contained the characteristic PKA's 'GxGxxG' motif harboring a glycine triad that structurally forms a hairpin around the ATP binding pocket [53]. It is also notable that the single putative PKG (SRCN_4518) distinctly clustered with its T. gondii PKG ortholog (TGME49_111360) (Figure 1). The SRCN_3339, SRCN_5165, SRCN_5610 and SRCN_4518 possessed an additional functional domain; the AGC-kinase C-terminal domain that contained two of the three conserved phosphorylation sites in AGCs (data not shown). These conserved sites serve as phosphorylation-regulated switches to control both intra-and inter-molecular interactions [54]. A striking feature in S. neurona is that like in T. gondii, it lacks PKB and PKC. The putative PDK1 (SRCN_1312) clustered with the T. gondii PDPK (TGME49_068210) [53].
Majority of the putative CMGCs identified in S. neurona clustered with robust bootstraps with the conserved CMGCs in T. gondii, N. caninum and H. hammondi (Figure 3). Notable were the clustering of the CDKs, including CDK7 (SRCN_4674, SRCN_2759 and SRCN_761), CDK10 (SRCN_895) and CDK11 (SRCN_977). In addition, comparing the phylogenetic clustering of the CDKs in Figure 3 and the annotations presented in Table 1, the S. neurona kinome also appeared to have putative CDK5 (SRCN_4801, SRCN_1104 and SRCN_6346), all of which were supported by robust bootstraps. It should be noted that CDKs are amongst the principle molecular switches that regulate cell cycle progression in the apicomplexan parasites [56]. Also notable was the clustering in the same clade of three MAPKs (SRCN_6472, SRCN_4209 and SRCN_5365) and two GSKs (SRCN_1731 and SRCN_1732) (Figure 3). In the OPK family, SRCN_4528 and SRCN_2630 are putative NEKs given their clustering with T. gondii NEK kinases TGME49_119700 and TGME49_094260, respectively (Figure 4). The S. neurona SRCN_108 and SRCN_3444 are putative ULKs due to their clustering with their T. gondii ULK orthologs (TGME49_035750 and TGME49_040630, respectively). SRCN_3669 clustered with TGME49_066950, a TBC-domain (Tre-2/Bub2/Cdc16) containing kinase. Further, there seemed to be a species-specific expansion of S. neurona Aurora kinases, SRCN_1606, SRCN_6157 and SRCN_2403 which cluster with T. gondii aurora kinase TGME49_003010; the clade containing these kinases however was not supported by robust bootstrap values (see Figure 4). SRCN_286 is a putative Wee kinase given that it clusters with Wee kinases from T. gondii (TGME49_073690) as well as from H. hammondi (HHA_273690; compare Figure 4 with Table 1). The clustering of putative S. neurona ROPKs (Figure 4) was also notable. For instance, SRCN_7082 clustered with T. gondii ROP33 (TGME49_001130), implying that it is a ROP33 ( Table 1). SRCN_6184 clustered with T. gondii ROP19A (TGME49_042240) and ROP38 (TGME49_042110). It should however be noted that the sequence similarities with the ROP38 was low, implying that this could be a ROP19A (based on annotations presented in Table 1). Other putative S. neurona ROPKs included SRCN_3247, which clustered with ROP27 (TGME49_113330), SRCN_2076 with ROP25 (TGME49_002780). Here, it is notable that the clustering of SRCN_2076 the T. gondii ROP25 was supported with low bootstrap value, potentially meaning that based on the annotations (Table 1), SRCN_2076 is an ortholog to ROP30. SRCN_2123 and SRCN_2183 clustered with ROP35 (TGME49_104740), while SRCN_7084 clustered with ROP37 (TGME49_094560). Notably, from the phylogenetic tree in figure 4, SRCN_7083 and SRCN_4410 are probably duplicated forms of ROP34 given their clustering with T. gondii TGME49_040090. There was also indication of species-expansion of the S. neurona putative ROP35 (SRCN_2123 and SRCN_2183; see Figure 4). Although annotations (Table 1) showed that SRCN_4310 was a putative ROP33 (39% sequence similarity to H. hammondi ROP33), this enzyme clustered with a hypothetical protein (TGME49_039380) of T. gondii (Figure 4). This discrepancy between annotation and phylogeny could be due to differences in the T. gondii. Another apparent discrepancy was the annotation of SRCN_4503 as a putative eIF2K-B (74% sequence similarity to the T. gondii TgCatPRC2 [IF2K-B]; see Table 1); but phylogeny showed clustering of this protein with a putative T. gondii ROP30 (TGME49_027010) (see Figure 4). Note that in this figure, the phylogeny bootstraps were low in the segregation of SRCN_4503 with ROP30. Moreover, SRCN_3151 and SRCN_3075 both segregated with TGME49_018550, a PIK3R4 kinase-related protein; from the phylogeny, SRCN_3151 and SRCN_3075 probably represent duplications. SRCN_6572 falls in the same clade with TGME49_004100, an eIF2 kinase IF2K-C, and TGME49_111510, an eIF2 kinase IF2K-B. NEKs are involved in cell cycle regulation, while Aurora kinases play pivotal roles in endodyogeny, duplication rate and parasite virulence [28]. Taken together, the presence of a variety of ROPKs in S. neurona is interesting given the fact that in T. gondii, ROPKs are key virulence factors [57].

Structural modeling
Using I-TASSER, we constructed the three-dimensional (3-D) homology models for SRCN_3247 using 4B6L, 4C0T, 1KOB, 4ZHX, 4IC7 and 4CZU as templates, and SRCN_5165 using 1CDK, 1SMK, 4WIH, 4Z84, 3PFQ, 4LQS and 1RDQ as templates. SRCN_3247 and SRCN_5165 are putative ROP27 and PKA respectively. From the predicted models for each of the sequences, the model with the highest C-score, a score that estimates the quality of the models produced, was used. The best SRCN_3247 model had a C-score of -0.46 and an estimated TM-score of 0.65 ( Figure 5A) while the best SRCN_5165 model had a C-score of 1.6 and a TM-score of 0.94 ( Figure 5B). Evaluation of the models' quality and suitability for docking using PROCHECK revealed that SRCN_3247 and SRCN_5165 had 97.4% and 99.7% of the residues in the favored/allowed regions, respectively (Supplementary Figure 1A&B). The SRCN_3247 model contained 9 beta-sheets, 11 alpha and 5 3,10 helices while SRCN_5165 contains 9 beta-sheets, 11 alpha and 6 3,10 helices. Using the improved Protein Block Alignment (iPBA) webserver [58], structural superposition of the models to the top-ranking templates was performed for prediction of pair-wise structural alignment. The structural similarity score was measured using root mean square deviation (RMSD) between both Cα atom of the modeled protein and the corresponding template. The RMSD for the alignment of SRCN_3247 and 4B6L was 0.82 Å for the 276 aligned residues (Figure 6A) while that for SRCN_5165 and 1CDK was 0.46 for 335 aligned residues (Figure 6B). The obtained RMSD values indicated that in both cases, the models and the templates had similar folds.  Here, it should be noted that the presence of a binding pocket does not necessarily imply that a target protein is druggable. However, the binding volume is one of the important cavity properties that influence the druggability of a particular target protein, of which most of druggable proteins have volumes of between 500-1000 Å3 [60].

Molecular docking of SRCN_3247 and SRCN_5165 with PK inhibitors
After the structural modeling, the binding modes of known kinase inhibitors to SRCN_3247 and SRCN_5165 was assessed by molecular docking of the chemical ligands into the predicted active site pockets of the modeled structures. For the docking analyses, the following four inhibitors were used: -BAY61-3606, an ATP-competitive inhibitor of mammalian spleen tyrosine kinase (Srk) [61]; Flavopiridol, a synthetic flavonoid that inhibits a wide range of cyclin-dependent kinases [62]; purvalanolB-B (PurB) is a potent CDK inhibitor [63]; and Piceatannol, a naturally occurring hydroxylated analogue of resveratrol that inhibits Syk kinase [64]. With a binding affinity of -8.4 kcal/mol, the overall binding pose of BAY61-3606 in SRCN_5165 is shown in Figure 8A. The binding affinity of flavopiridol to SRCN_5165 was -9.3 kcal/mol (Figure 8B). The binding energies for purB and piceatannol to SRCN_5165 were -6.1 kcal/mol (Figure 8C), and -6.0 kcal/mol (Figure 8D), respectively. For SRCN_3247, BAY-61-3606 bound to the active site with an affinity of -7.2 kcal/mol ( Figure  9A), while the affinities for flavopiridol, purB and piceatannol were -7.9 kcal/mol (Figure 9B), -6.9 kcal/mol (Figure 9C), and -7.3 kcal/mol (Figure 9D), respectively. Taken together, by structural modeling and docking analysis, the above results indicate that the putative S. neurona PKA and ROP27 could be considered as drug targets. The data from the modeling and docking could pave way for further laboratory experiments to design potential drugs against EMP.

Discussion
We used different bioinformatics approaches to identify and classify the PKs of S. neurona, and docked four well-known inhibitors into modelled structures of two PK representatives. The critical roles performed by PKs make them attractive targets for novel therapeutics, and as such, the current study provides a rich source of potential drug targets, vaccine candidates or diagnostic proteins for developing new treatment and diagnostics strategies. PKs are considered viable drug targets because their catalysis mechanism and overall structure are conserved. Because small molecules can bind to the PKs' catalytic clefts [65], many kinase inhibitors (e.g. imatinib, trastuzumab and lapatinib) have been developed to treat various human diseases [66].
The kinomes of apicomplexans range from 35 PKs (in Babesia bovis) to 135 PKs (in T. gondii) [35]. We identified a total of 92 putative PKs in the kinome of S. neurona, compared to the PKs reported in the kinomes of T. gondii (n=135), N. caninum (n=130) and H. hammondi (n=124) [15]. Although the total number of S. neurona PKs appeared markedly reduced compared to that of its close coccidian relatives (T. gondii and N. caninum [27]), taken as a percentage of total genome size, the proportion of S. neurona PKs is comparable to the 2% observed in humans [13] and other coccidians [27]. The contraction of the S. neurona kinome could be attributed to genome compaction, which occasionally offsets lineage-specific expansions of specific gene families. Notably, genome contraction is a common mode of genomic evolution in intracellular parasites, including apicomplexans [67,68]. As such, the evolution of PKs may be in tandem to the overall genomic adaptive strategies of these parasites.
Using a hierarchical scheme based on the major PK groups, the S. neurona kinases could be classified and phylogenetically clustered into the various PK families. A complement of nine putative AGC kinases was identified in S. neurona, which is reduced compared with that of T. gondii, N. caninum and H. hammondi. Despite this potential gene loss, seven of the nine AGCs (SRCN_5165, SRCN_5610, SRCN_3339, SRCN_4249, SRCN_3990, SRCN_5430 and SRCN_1312) had orthologs in T. gondii, N. caninum and H. hammondi. In agreement with the observation that PKA is conserved in apicomplexans [27], two PKAs (SRCN_5610 and SRCN_3990) were identified in S. neurona. In eukaryotes, PKA signaling regulate various cellular responses such as DNA replication [69], cell growth and metabolism [70] and gene transcription by phosphorylating transcription factors such as the cAMP Response Element-Binding Protein (CREB) [71]. In T. gondii, increases in cytosolic cAMP levels activate PKA to trigger the developmental switch from the rapidly proliferating tachyzoites to the quiescent bradyzoites [72]. Additionally, two other S. neurona AGCs, (SRCN_5165 and SRCN_3339), were putative PKAs given that they contained the characteristic GxGxxG motif found in PKA [53]. The PKAs are attractive targets for the disruption of S. neurona growth. Notably, based on orthology, S. neurona contains a single putative PKG (SRCN_4518) that distinctly clustered with T. gondii PKG (TGME49_111360).
The CMGCs, comprising of CDKs, MAPKs, GSKs, and CLKs coordinate a wide range of cellular functions in different species. For instance, members of the CDK subfamily are major coordinators of cell division in both mitosis and meiosis while the MAPKs subfamily is crucial in regulating cell proliferation, cell differentiation and cell death in eukaryotes [56,78]. By both annotations and phylogenetic analyses, we identified four putative CDKs sub-family members; CDK5 (SRCN_4801 and SRCN_6346), CDK7 (SRCN_2759, SRCN_4674 and SRCN_761), CDK10 (SRCN_895) and CDK11 (SRCN_977). The finding of CDKs in S. neurona suggests that this parasite's cell cycle regulation could be CDK-dependent and perhaps similar to that of higher eukaryotes [79]. The identification of three putative MAPKs (SRCN_4209, SRCN_6472 and SRCN_5365) in S. neurona points to the existence of MAPK regulated transduction pathway(s) in this pathogen. Similar to its T. gondii ortholog (TgMAPK1), which is a p38α MAPK homolog [80], SRCN_4209 may be potentially involved in parasite proliferation/stage differentiation, stress response and manipulation of the host immunity to enhance virulence. On the other hand, SRCN_6472 and SRCN_5365 may augment the roles of SRCN_4209 in the parasite. In T. gondii MAPK1/ERK7 is involved in intracellular proliferation [81]. We also identified two putative GSKs (SRCN_1731 and SRCN_1732). A genome-wide gene knockout approach in P. falciparum demonstrated PfGSK-3 to be critical for schizogony of the parasite [82]. Other S. neurona CMGC kinases identified include CLK (SRCN_1479), PRP4 (SRCN_2845), DYRK (SRCN_1611) CK2 (SRCN_6427) and SRPK (SRCN_1236).
ROPKs are secreted by T. gondii into the host cell and play roles in adhesion, motility and manipulation of immune responses [83]. We identified at 10 putative ROPK sub-family members in the S. neurona kinome; -ROP38 (SRCN_6184), ROP33 (SRCN_7082), ROP27 (SRCN_3247), ROP25 (SRCN_2076), ROP37 (SRCN_7084), ROP34 (SRCN_4410 and SRCN_7083), ROP35 (SRCN_2183), SRCN_2123 and ROP23 (SRCN_6812). It has been recently demonstrated that T. gondii ROP21 and ROP27 play a role in a constitutive pathway based on their localization in the PV and cyst matrix [84]. Moreover, T. gondii ROP35 has been shown to play a crucial role in chronic infection [85]. Although the S. neurona genome is more than twice the size of other coccidians whose genomes sequenced so far (e.g. Toxoplasma and Neospora), it has a considerably reduced number of ROPKs that nevertheless may have vital roles in the parasite's virulence. Specifically, S. neurona is devoid of ROP5, ROP16, ROP18 and ROP38 that have been shown to confer virulence and alter the host's cellular signaling pathways [81]. Putatively therefore, S. neurona ROPKs may have multiple roles in the survival of the parasite. In search of drug targets against S. neurona, the reduced ROPKs with possible multiple roles and absent in the vertebrate host are thus attractive candidates.
We explored the possibility of repositioning existing PK inhibitors as lead compounds in the development of new drugs for EPM, an approach that would substantially lower the cost of drug development. The lead compounds found via repositioning could be modified to increase efficacy. Molecular docking studies were therefore conducted to explore the potential binding of four inhibitors to the structural models of SRCN_3247 and SRCN_3314, representatives of putative S. neurona ROP27 and PKA, respectively. Probing the S. neurona kinome with small-molecule has many potential advantages. For instance, given that most kinase inhibitors in therapeutic use target multiple kinases with similar binding pockets [65], the promiscuity of such agents lead to polypharmacology [86] that would help minimize drug resistance. However, our modeling and docking analysis are only predictive, meaning that experimental validations are required.

Conclusion
The kinome of S. neurona contain members of the major classes of PKs, including AGC, CMGC, GSK, CAMK, CK, TKL, aPKs and several PKs in the OPK family. Similar to other apicomplexans, S. neurona kinome is devoid of PKC and the conventional TKs, but appears to have reduced MAPK family members. S. neurona kinome did not also have some of the ROPKs that have been implicated in the virulence of T. gondii. Known kinase inhibitors can be used as starting scaffolds in the search for drugs acting against EPM.

Genome-wide identification of putative S. neurona PKs
The predicted S. neurona proteome was downloaded from Toxoplasma Genomics Resource database (Release 28; version May 2016) [41]. A hidden Markov model (HMM) profile of signature PK domains obtained from the Kinomer database v 1.0 [87] was used to search for S. neurona kinases using HMMER v 3.1b2 [88]. The sequences having PK domain (IPR011009) or PK-like domain (IPR000719) were considered as putative kinases. Annotation of the putative kinase sequences was performed by blast search against the non-redundant (nr)-NCBI protein and UNIPROT/SWISSPROT databases at an e-value of 0.0001. Gene ontology (GO) mapping was performed using Blast2GO v 3.3 [89]. The molecular weight (Mw) and isoelectric point (pI) were obtained using ExPASy compute pI/Mw tool [90]. Motifs analysis was performed with the MEME Suite v 4.11.2 [91]. The parameters were as follows: number of repetitions, any; maximum numbers of motifs, 30; and the optimum motif widths, between 6 and 200 residues.

Phylogenetic analysis
Phylogenetic trees were constructed to decipher the orthologous and paralogous relationships of S. neurona kinases. Sequences of S. neurona kinase subfamilies obtained using Blast2Go were aligned with their homologs in T. gondii [41], H. hammondi and N. caninum using MUSCLE [92]. The alignments were subsequently manually edited in Jalview [93]. Phylogenetic reconstruction was undertaken using the maximum likelihood program PhyML 3.0 [94] and RAxML v 8.0 [95] and the Bayesian inference program MrBAYES v 3.2 [96]. For PhyML, the LG substitution model was selected assuming an estimated proportion of invariant sites and 4 gamma-distributed rate categories to account for rate heterogeneity across sites. The gamma shape parameter was estimated directly from the data. Robustness of internal branches was evaluated using 100 bootstraps. MrBayes was run for 5, 000,000 generations with two runs and four chains in parallel and a burn-in of 25%. Obtained trees were rendered with the Interactive Tree of Life server (iTOL) [97].

Structural modeling
The amino acid sequences of the target kinases, a putative ROP27 (SRCN_3247) and a PKA (SRCN_5165) (see Table 1), were retrieved ToxoDB [30]. Structural modelling was carried out using two approaches. The first approach was modelling by satisfaction of spatial restraints as implemented in MODELLER [98]. For this approach, template structures were obtained by threading target sequences in the PHYRE database [99] and selecting top ranked template structures for model building. The second modelling approach was the iterative threading assembly refinement implemented in I-TASSER [100]. The template structures for SRCN_3247 were 4B6L, 4C0T, 1KOB, 4ZHX, 4IC7 and 4CZU, while those for SRCN_5165 were 1CDK, 1SMK, 4WIH, 4Z84, 3PFQ, 4LQS and 1RDQ. The quality of the models was further evaluated using PROCHECK [101] to ensure suitability for carrying out docking studies. PROCHECK outputs a Ramachandran plot that shows the number (%) of residues mapping to the favored, allowed, generally allowed and disallowed regions of the plot. Structural superpositioning was performed using the improved Protein Block Alignment (iPBA) webserver [58]Results were visualized in the PyMOL Molecular Graphics System [102].

Molecular docking
Docking calculations were carried out using the AutoDock Suite [103]. To compute the free energy of binding on each protein model, essential hydrogen atoms and solvation parameters were added. Dockings were performed with grid boxes sufficient to cover the entire protein of interest using parameters tabulated in Table 2. Four PK inhibitors, BAY61-3606, Flavopiridol, Purvalanol B and Piceatannol were docked onto SRCN_3247 and SRCN_5165 and their overall binding poses evaluated in terms of binding energies. All grid boxes were formed at spacing size of 1.0 Angstrom. Following the docking procedure, the resulting PDBQT output files were visualized using PyMOL [102] and all ligand conformations were analyzed for their binding affinities.

Supplementary Materials
Supplementary Figure S1: Ramachandran plots generated using PROCHECK to evaluate the robustness three-dimensional models SRCN_5165 (Panel A) and SRCN_3247 (Panel B), which are putative PKA and ROP27, respectively, in S. neurona kinome. Here, the red, yellow, light yellow, white regions indicate the favored, allowed, generously allowed and disallowed regions, respectively. The Phi and Psi are dihedral angles that determine torsion of peptide bonds.