4. J-Domain Proteins
HSP40s, also known as J-domain proteins (JDP), DNAJ-like or J proteins, are essential partners for HSP70 chaperones [
16]. These proteins are grouped together because they contain a J-domain, whose prototype was defined in the
Escherichia coli DnaJ protein [
17]. JDPs can bind to substrate polypeptides by themselves, and their J-domain promotes ATP hydrolysis by the HSP70 protein, favoring the binding of polypeptides by the HSP70 [
18]. Remarkably, but not surprisingly given their role in determining substrate specificity, HSP40s outnumber HSP70 family members, whether in a whole cell or within individual cellular compartments [
19]. For instance, six HSP40s have been found in
E. coli, 22 in
Saccharomyces cerevisiae, 41 in humans [
20], 49 in
Plasmodium falciparum [
21] and 69 in
L. major [
7]. In contrast, only three distinct DnaK genes exist in
E. coli; six distinct HSP70s are present in
P. falciparum, nine in
Leishmania and ten in humans [
7].
As mentioned above, the prototypical and founding member of this superfamily is the
E. coli DnaJ protein [
17], whose structure and functions have been elucidated in great detail [
22]. DnaJ contains four structural domains: an N-terminal J-domain, followed by a Gly/Phe (G-F)-rich domain, a Zn²⁺-finger domain and a less-conserved C-terminal domain. The J-domain comprises approximately 70 amino acids that fold into four α-helices (I–IV). A further structural feature of the J-domain is a highly conserved His-Pro-Asp (HPD) tripeptide motif in the loop between helices II and III; this motif is essential for the stimulation of HSP70 ATP hydrolysis [
23]. It is believed that the J-domain interacts only with the ATP-bound HSP70 conformation, at the interface between the NBD and SBD moieties. The HPD motif then contacts key residues of the HSP70 ATP catalytic site, remodeling the NBD lobes to orient the catalytic residues into a position optimal for ATP hydrolysis. Furthermore, the J-domain interacts with residues of the HSP70 SBD, promoting high affinity for the HSP70 ADP-bound state and efficient trapping of substrates [
10,
22]. Moreover, many J-proteins directly bind substrates, favoring the specific interactions of HSP70 with particular polypeptides and linking in turn the HSP70 functions to particular cellular processes [
2]. Apart from the common J-domain, which is the distinctive feature of HSP40s, the members of this class are structurally very divergent. Different HSP40s interact with particular members of the HSP70 family; hence, HSP40s contribute to the multi-functionality of the HSP70 machinery by specifying the target substrates and cellular processes in which the HSP70 chaperone activity is required [
20,
24].
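As a minimal illustration of how the conserved HPD tripeptide described above could be located in a protein sequence, the following Python sketch scans a one-letter amino acid string for the motif. The function name and the toy peptide are hypothetical and are not taken from any database; finding an HPD tripeptide alone does not establish the presence of a J-domain, which would normally be confirmed with a domain-profile search.

```python
import re


def find_hpd_motifs(protein_seq: str) -> list[int]:
    """Return the 1-based start position of every His-Pro-Asp (HPD) tripeptide."""
    seq = protein_seq.upper()
    # HPD cannot overlap with itself, so a plain non-overlapping scan is sufficient.
    return [match.start() + 1 for match in re.finditer("HPD", seq)]


# Toy peptide (made up for illustration only, not a real J-domain sequence).
toy_peptide = "MAKEILGVSKYHPDRNQGD"
print(find_hpd_motifs(toy_peptide))
```

A sequence without the tripeptide returns an empty list, which is the situation used below to define type IV proteins.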
Based on their similarity to the domain architecture of DnaJ, HSP40s have been grouped into four classes (
Figure 2). Type I proteins share the four characteristic domains of DnaJ: the N-terminal domain constituted by the J-domain; a glycine-phenylalanine (G-F)-rich linker segment; two β-sandwich C-terminal domains, which contain four repeats of the CxxCxGxG type (zinc finger-like region); and a dimerization domain involved in binding to the client proteins. Examples of proteins belonging to this class are the yeast Ydj1 and the human DnaJA1-4. Type II proteins share the J-domain, the G-F-rich linker and the C-terminal substrate binding domain but lack the zinc-finger domain; examples are the yeast Sis1 and the human DnaJB-1, -4 and -5 [
24]. The G-F-rich region is also involved in determining the specificity of HSP40s for target proteins [
25]. Additionally, in class II JDPs, the G-F-rich region is thought to be involved in an autoinhibitory mechanism, in which the G-F region initially blocks J-domain binding to HSP70 [
15]. Type III proteins are heterogeneous in sequence and share only the J-domain with DnaJ; instead, they often contain domains involved in specifying target interaction or sub-cellular localization [
20]. As mentioned above, within the J-domain, there exists the highly conserved HPD motif; however, for some J-domain containing proteins, a strict HPD sequence is not found. To denote this feature, Louw and coworkers proposed a fourth group (type IV) of HSP40s to include those proteins lacking the HPD sequence in their J-domains [
26]. Unlike type I and type II HSP40s, in which the J-domain always has an N-terminal location, in type III proteins the J-domain can be in any position along their sequence. It has been suggested that type I and type II HSP40s form complexes (dimers or tetramers) and are able to interact promiscuously with nascent polypeptides; also, they recognize mis-folded or aggregated proteins and cooperate with HSP70s in protein disaggregation [
10,
27]. In contrast, type III HSP40s would have evolved either to interact specifically with a limited number of HSP70 substrates or to act directly as bait that recruits an HSP70 to particular cellular locations [
10,
20]. Regarding type IV HSP40s, some authors have questioned whether they should be considered members of the JDP family or only JDP-like proteins [
28]. Nevertheless, they should be taken into account to understand the evolutionary history of the family and its functional diversification. Moreover, for some HSP40s, the maintenance of their co-chaperone functions in the absence of a canonical J-domain has been reported [
29].
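The four-class scheme described above can be summarized as a simple decision rule. The Python sketch below is only an illustration under stated assumptions: the `JdpFeatures` record and `classify_jdp` function are hypothetical names, the domain flags are assumed to come from a prior domain annotation, and the absence of the HPD motif is assumed to take precedence over the other domains (since some type IV proteins also carry a Zn-finger domain). The position of the J-domain and the C-terminal domain are ignored.

```python
from dataclasses import dataclass


@dataclass
class JdpFeatures:
    """Domain annotation of one candidate protein (all flags are assumed inputs)."""
    has_j_domain: bool
    has_hpd_motif: bool
    has_gf_region: bool
    has_zinc_finger: bool


def classify_jdp(features: JdpFeatures) -> str:
    """Assign a protein to type I-IV following the scheme summarized in the text."""
    if not features.has_j_domain:
        return "not a JDP"
    # Type IV: a J-domain lacking the canonical HPD tripeptide.
    if not features.has_hpd_motif:
        return "type IV"
    # Type I: J-domain, G-F-rich region and zinc-finger region, as in DnaJ.
    if features.has_gf_region and features.has_zinc_finger:
        return "type I"
    # Type II: G-F-rich region present but no zinc-finger region.
    if features.has_gf_region:
        return "type II"
    # Type III: only the J-domain is shared with DnaJ.
    return "type III"


print(classify_jdp(JdpFeatures(True, True, True, False)))
```

Under these assumptions, a protein with a J-domain, an HPD motif and a G-F-rich region but no zinc finger is reported as type II.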
5. Appraisal and Updating of the Compendium of HSP40s in L. infantum
In a previous work, we found 69 different HSP40s to be encoded in the
L. major genome [
7]. At that time (2015), the
L. major genome was the best assembled one for the genus
Leishmania; however, in 2017, an improved genome assembly was attained by the combination of second- and third-generation sequencing methodologies for the
L. infantum genome [
30]. Hence, in this review, we analyzed the
L. infantum genome for genes encoding J-domain-containing proteins. All 69 proteins previously identified as HSP40 family members in
L. major were also found in the
L. infantum genome. Moreover, three new members were uncovered, amounting to a total of 72 different JDPs (
Figure 3 and
Table 1). Following the nomenclature proposed by Folgueira and Requena [
31], they were named J75 (JDP75), J76 (JDP76) and J77 (JDP77). Thus,
Leishmania possesses one of the largest HSP40 families among the organisms in which this family has been studied. To our knowledge, a larger HSP40 family has only been found in pepper, which contains 76 annotated HSP40 genes in its genome [
32]. According to the presence/absence of the prototypical DnaJ-domains (
Figure 2), the 72
Leishmania JDPs could be grouped into the four established classes (
Table 1). Thus, eight
L. infantum HSP40 proteins belong to the type I category, since they contain all the typical domains found in the prototypical DnaJ molecule; these are J2, J3, J4, J27, J32, J45, J46 and J50. Another 18 HSP40s belong to type II, as they have a Gly/Phe-rich region close to the J-domain but lack a zinc-binding domain. The largest category (type III) comprises 38 proteins containing only the J-domain. Of note, the protein J30 (gene LINF_070013700), as currently annotated at TriTrypDB, lacks the J-domain; however, its ortholog in
Trypanosoma brucei (Tb927.8.1010) contains the J-domain in its N-terminal moiety. We therefore analyzed the LINF_070013700 transcript sequence and found that the gene was mis-annotated: the coding sequence can be extended by 783 nucleotides at its 5′-end. The newly predicted protein would thus be 261 amino acids longer and would contain the complete J-domain [
33]. The
L. infantum J10 protein (gene LINF_170010900) was found to be N-terminally truncated relative to its
L. major ortholog, and, as a consequence, the J-domain is incomplete, although the HPD motif is present; in this case, no evidence of a mis-annotation of the coding sequence was found [
34]. Within the type IV HSP40 group, six proteins were found to be encoded in the
L. infantum genome: J31, J47, J66, J68, J75 and J77. Of these, J47, J66 and J75 also have the Zn-finger domain (
Table 1).
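The relationship between the 783-nucleotide 5′ extension of the J30 coding sequence and the 261 additional amino acids follows directly from the triplet nature of the genetic code. A minimal Python check is given below; the function name is hypothetical, and the extension is assumed to be in frame with the annotated coding sequence.

```python
def added_residues(extension_nt: int) -> int:
    """Number of amino acids gained by an in-frame coding-sequence extension."""
    if extension_nt < 0:
        raise ValueError("extension length must be non-negative")
    if extension_nt % 3 != 0:
        # An extension that is not a multiple of three would shift the reading frame.
        raise ValueError("extension is not a whole number of codons")
    return extension_nt // 3


print(added_residues(783))  # 783 nt / 3 nt per codon = 261 residues
```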
Among the members of the
Leishmania HSP40 family, as observed in other organisms, there is little sequence conservation beyond the characteristic J-domain. A phylogenetic analysis was conducted with the 72 JDPs to determine possible evolutionary relationships (
Figure 3). Only type I HSP40s (except J32) grouped in the same branch of the tree, but the low bootstrap values supporting most of the branches did not allow clear evolutionary relationships among them to be defined. Exceptions are the pairs J45 and J46 (bootstrap 99%) and J6 and J7 (96%), proteins that probably resulted from a recent duplication of an ancestral gene. Proteins of types II, III and IV show marked divergences in their structure and sequence; even the J-domain (typically found in the N-terminal region in DnaJ and type I HSP40s) is located in the middle of the sequence or even in the C-terminal region in the proteins J20, J21, J25, J28, J29, J35, J41, J42, J51, J52, J53, J64, J65, J67 and J75 (
Table 1). This atypical location of the J-domain has also been reported for HSP40s in other organisms [
37]. A remarkable feature is that
Leishmania J14 contains two J-domains.
Leishmania protists belong to the Euglenozoa phylum, which represents one of the earliest extant branches of the eukaryotic lineage [
38]. Thus, the enormous evolutionary distance separating
Leishmania from the model eukaryotes makes it difficult to establish straightforward orthologous relationships between
Leishmania proteins and the human or yeast ones; in fact, in databases, around half of the
Leishmania genes are annotated as coding for proteins of unknown functions because no clear orthologous proteins exist in model eukaryotes. Nevertheless, based on the hypothesis that the identification of potential orthologues in the well-characterized human and/or yeast proteomes would give clues about the functions of the
Leishmania proteins, we performed a detailed search of sequence conservation with the 72
L. infantum HSP40s against protein databases found in the servers PantherDB (
http://www.pantherdb.org/, accessed on 22 April 2022), Expasy (
https://prosite.expasy.org/, accessed on 22 April 2022), InterPro (
https://www.ebi.ac.uk/interpro/, accessed on 22 April 2022) and UniProt (
https://www.uniprot.org/blast/, accessed on 22 April 2022). As a result, a possible orthologous relationship between the
Leishmania type I J2, J3 and J4 (these three proteins may share an evolutionary origin, see
Figure 3) and the members DnaJA-1, -2 and -4 of the human HSP40 family was deduced. Similarly, J2 may be an ortholog of the
S. cerevisiae mas5 protein (YDJ1 gene), which is involved in mitochondrial protein import [
39]. Also,
Leishmania type I-JDPs J45, J46 and J50 have substantial sequence conservation with the human DnaJA2 and
S. cerevisiae SCJ1 proteins, which are located at the endoplasmic reticulum in those organisms [
40,
41].
Within the group of type IV HSP40s, J66 has a probable evolutionary relationship with
L. infantum type I-JDPs and particularly with J2 (
Figure 3). Thus, J2 and J66 may be considered paralogs, and in turn both would be orthologs to human DnaJA2. Another type IV-JDP, J47, shares sequence conservation with the type I-JDP J27 (
Figure 3), and both are possible orthologs to human DnaJA3 and
S. cerevisiae Mdj1 proteins, which are located in the mitochondrion [
42,
43].
J6 and J7, belonging to the type II group, might be orthologs of the human DnaJB4, DnaJB5 and DnaJB1 HSP40s and of the
S. cerevisiae Sis1 protein, which is required for nuclear migration during mitosis and initiation of translation [
44,
45]. J10 (and to some extent J34) may be an ortholog of Sec63, an essential HSP40 protein involved in post-translational translocation of proteins across the endoplasmic reticulum [
46,
47]. Other endoplasmic reticulum-located proteins, human DnaJC3/ERdj6 and
S. cerevisiae JEM1 [
48,
49], might be the orthologs of
Leishmania J53. Moreover, we postulate that
Leishmania J13 is an ortholog of human DnaJC24, a protein involved in diphthamide biosynthesis, a post-translational modification of histidine that has been found in translation elongation factor 2 [
50]. A possible ortholog of J16 would be the well-known human zuotin/DnaJC2 protein. Zuotin is a component of the ribosome-associated complex involved in maintaining nascent polypeptides in a folding-competent state [
51]. The type I-JDP J27 may be an ortholog of the human DnaJA3/Tid1 protein, a mitochondrial molecular chaperone [
42,
52]. A plausible mitochondrial location of
Leishmania J31 is suggested by its sequence similarity with human DnaJC11 [
53]. J36 might be an ortholog of the human DnaJC20/HscB and
S. cerevisiae JAC1 proteins, which are involved in iron–sulfur cluster biosynthesis [
54,
55].
Leishmania J68 might be a component of the mitochondrial Tim translocase because of its structural and sequence similarities with human DnaJC15 and
S. cerevisiae Pam18 proteins [
56]. Human DnaJC21 and
S. cerevisiae JJJ1 are involved in rRNA biogenesis [
57], and they may be orthologs of
Leishmania J32. A histone chaperone function may be suggested for
Leishmania J33 based on its sequence similarity with human DnaJC9 and
Schizosaccharomyces pombe C1071.09c proteins [
58]. J59 is a really long protein (2451 amino acids) that might be orthologue of the human DnaJC13/RME-8 protein, which is involved in endosome organization and regulation [
59]. In addition to all possible orthologs mentioned above, another several
Leishmania HSP40s share remarkable sequence identity with genes coding for HSP40s in fungi and plant species. Nevertheless, these are not commented here, as they remain still uncharacterized, and no information about their functional roles is available.
According to structural features and the putative orthologs identified (see above), some
Leishmania HSP40s could be associated with distinct sub-cellular locations such as the mitochondrion (J27, J31, J36, J47 and J68), the endoplasmic reticulum (J10, J22, J34, J45, J46, J53, J66 and J72), the flagellar pocket (J11, J51 and J54), endosomes (J59) and the nucleus (J14, J33 and J56) (
Table 2). Of note, several
Leishmania HSP40s (J2, J3, J6, J8, J27, J50 and J54) have been localized in the
L. donovani glycosomes following a proteomic approach [
60] (
Table 2). Glycosomes are specialized peroxisomes, existing in
Leishmania and other trypanosomatids, that contain key enzymes involved in the glycolytic pathway and purine salvage [
61]. Moreover, at least 23 HSP40s in
Leishmania may play functional roles in cellular membranes as they possess transmembrane domains, including members from group I (J45, J46), group II (J5, J19, J22, J28, J44, J53 and J54) and group III (
Table 1). In these locations, putative functions for these
Leishmania JDPs may include co-translational chaperone assistance, transport across organelle membranes and unfolding/refolding after transport from one cellular compartment to another. For example, J22 is a possible membrane protein related to the endoplasmic reticulum (ER) membrane-located
S. pombe pi041 and human DnaJB12 proteins that are involved in ER-associated degradation of misfolded proteins [
62].
In addition, TPR (tetratrico-peptide repeat) domains have been found in eight JDP proteins: J51, J52, J53, J67 and J76 contain two or more TPR domains each, whereas a sole TPR domain is present in J42, J65 and J76 (
Table 1). The TPR domain has been found to be a docking site that interacts with the EEVD motif present at the C-termini of some members of the HSP70 family and at the C-terminus of the HSP83/90 protein [
73,
74].
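As a simple illustration of the docking motif mentioned above, the following Python sketch tests whether a protein sequence ends with the EEVD tetrapeptide. The function name and the example peptides are hypothetical and serve only to show the check; they are not real HSP70 or HSP83/90 sequences.

```python
def has_c_terminal_eevd(protein_seq: str) -> bool:
    """Return True if the sequence ends with the EEVD tetrapeptide."""
    # Drop surrounding whitespace and an optional trailing stop symbol ('*').
    seq = protein_seq.strip().upper().rstrip("*")
    return seq.endswith("EEVD")


# Made-up peptides: the motif counts only when it sits at the very C-terminus.
print(has_c_terminal_eevd("MSTGPTIEEVD"))
print(has_c_terminal_eevd("MEEVDKLS"))
```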
Despite the outstanding number of HSP40s existing in
Leishmania, none of these proteins have been biochemically characterized to date, and consequently, nothing is known about the potential HSP70-HSP40 partnerships and their role in the
Leishmania life cycle. However, evidence of their existence and of their stage-specific expression levels is now available from transcriptomics and proteomics data. Thus, out of the 72 putative HSP40s identified in
Leishmania, 37 have been experimentally reported in proteomic studies of the
Leishmania promastigote [
62,
66,
72] and/or amastigote stages [
67,
70]. Moreover, some of them have been found in the proteomes derived from specific sub-cellular fractions of the parasite such as the glycosome [
60] and the flagellum [
72]. A list of the experimentally detected HSP40s in
Leishmania proteomes is included in
Supplementary Materials, Table S1. For example, J2 protein has been experimentally reported in
L. infantum and
L. donovani promastigotes and amastigotes [
60,
63,
66], in extracellular vesicles of
L. infantum [
68] and in the
L. braziliensis secreted proteome [
75]. Remarkably, J2 has been found to be phosphorylated on Serine-89, and the phosphorylation ratio of this residue increased 3.5-fold after 2.5 h of promastigote-to-amastigote differentiation, reaching an increase of up to 22-fold in fully differentiated amastigotes [
76]. Such a dramatic increase in phosphorylation suggests a relevant role for this protein in the differentiation process from the promastigote to the amastigote stage. Phosphorylation on serine residues is the most common post-translational modification in HSP40s (
Supplementary Materials, Table S1) and may be related to a general regulatory mechanism induced during the stress response [
77]. Other relevant protein modifications found in HSP40s are acetylation and methylation. Particularly relevant may be the N-terminal acetylation on threonine-39 of J7 [
78], because this modification would change its chemical properties and have marked biological consequences on protein function and cellular localization [
64]. In addition, methylation on glutamic acid-134 in J6 and J50 proteins would increase the hydrophobicity of these proteins [
79].