A Molecular Analysis of the Aminopeptidase P-Related Domain of PID-5 from Caenorhabditis elegans

A novel protein, PID-5, has been shown to be a requirement for germline immortality and has recently been implicated in RNA-induced epigenetic silencing in the Caenorhabditis elegans embryo. Importantly, it has been shown to contain both an eTudor and aminopeptidase P-related domain. However, the silencing mechanism has not yet been fully characterised. In this study, bioinformatic tools were used to compare pre-existing aminopeptidase P molecular structures to the AlphaFold2-predicted aminopeptidase P-related domain of PID-5 (PID-5 APP-RD). Structural homology, metal composition, inhibitor-bonding interactions, and the potential for dimerisation were critically assessed through computational techniques, including structural superimposition and protein-ligand docking. Results from this research suggest that the metallopeptidase-like domain shares high structural homology with known aminopeptidase P enzymes and possesses the canonical ‘pita-bread fold’. However, the absence of conserved metal-coordinating residues indicates that only a single Zn2+ may be bound at the active site. The PID-5 APP-RD may form transient interactions with a known aminopeptidase P inhibitor and may therefore recognise substrates in a comparable way to the known structures. However, loss of key catalytic residues suggests the domain will be inactive. Further evidence suggests that heterodimerisation with C. elegans aminopeptidase P is feasible and therefore PID-5 is predicted to regulate proteolytic cleavage in the silencing pathway. PID-5 may interact with PID-2 to bring aminopeptidase P activity to the Z-granule, where it could influence WAGO-4 activity to ensure the balanced production of 22G-RNA signals for transgenerational silencing. Targeted experiments into APPs implicated in malaria and cancer are required in order to build upon the biological and therapeutic significance of this research.


Introduction
Germ cells are required to transmit genetic information from generation to generation. Maintaining a high degree of germ-cell genome stability is therefore integral to the preservation of genetic material [1]. Exogenous and endogenous sources of DNA damage, including the movement of transposable elements (TEs), pose a constant challenge to germ-cell integrity. TEs are abundant, self-propagating DNA sequences that can be inserted into new locations within the genome [2]. Although TEs are evolutionary drivers, they pose an intrinsic threat to the faithful transmission of genetic information, where excessive damage results in germline abnormalities and ultimately organism sterility [3].
Many organisms have evolved protective mechanisms against TE activity, including an RNA silencing mechanism termed the Piwi/piRNA (Piwi-interacting RNA) pathway. When this pathway is lost in Caenorhabditis elegans, sterility is not immediately visible, with only a limited set of TEs reactivated [4]. This unique effect can be attributed to the action of the worm-specific argonaute proteins (WAGOs), which act in an additional small RNA pathway.
Figure caption (partial): (G) 22G RNAs are loaded onto worm-specific argonaute proteins (WAGOs) and are transported to the nucleus for epigenetic silencing. (H) The 22G RNA/WAGO complex initiates silencing through deposition of heterochromatic marks at the target in a way that lasts several generations in a process termed RNA-induced epigenetic gene silencing (RNAe). The way in which the silencing becomes PRG-1-independent is currently unknown. Adapted from Ref. [5].
Maternally provided piRNAs can initiate RNAe and maintain silencing in the absence of additional piRNAs [4]. Although the mechanisms of RNAe establishment and maintenance are unknown, recent findings from a "piRNA-induced silencing defective" (Pid) mutation screen have identified a protein, PID-2, as being involved in this process, as well as two PID-2-interacting proteins termed PID-4 and PID-5. It has been proposed that these proteins are integral to the effective inheritance of this long-term silencing. They also affect Z-granule homeostasis and the level of 22G RNAs [4]. Whilst both PID-4 and PID-5 contain eTudor domains, PID-5 possesses an additional domain related to aminopeptidase P (APP).
The APP family (EC 3.4.11.9) (cytosolic APP-1, cell surface APP-2 and mitochondrial APP-3) comprises conserved metalloproteases that catalyse the cleavage of N-terminal residues from peptide substrates with a proline residue at the P1' position [6]. The pyrrolidine ring of the proline side chain confers conformational rigidity and thus resistance to hydrolysis, with specific peptidases required to cleave the Xaa-Pro peptide bond. These enzymes are widely distributed in mammals, contributing to protein homeostasis and the prevention of replication-related genome instability [7]. They have also been identified as targets for developing novel antimalarial drugs as they promote parasitic survival [8].
The PID-5 structure was predicted using the machine learning-based algorithm AlphaFold2, and the domain relating to APP (PID-5 APP-RD) is shown in Figure 2 (green colour).
There is evidence that the PID-5 APP-RD may be a product of a recent segmental gene duplication event in which the region of DNA containing the pid-4 and app-1 genes has been repeated within the protein. This assessment is based on the adjacent location of these genes in the C. elegans genome (Figure 3). A recombination event may have resulted in development of the pid-5 gene, but its presence through evolution suggests an adaptive function has been achieved.
Biomolecules 2023, 13, 1132
The biological role of PID-5 is not fully characterised, raising questions over APP-RD functionality. However, due to the role of PID-5 in gene regulation and germline immortality, it has been suggested that the presence of this domain may influence N-terminal proteolysis in RNAe. Placentino et al. [4] hypothesised that the additional domain may bind APP-1 substrates without N-terminal cleavage, or alternatively, heterodimerise with C. elegans APP-1. Heterodimerisation could provide APP-1 activity in granules containing PID-5 or prevent APP-1 homodimerisation that may be inhibitory if C. elegans APP-1 dimerisation is critical for activity.
The three-dimensional structure of cytosolic APP (APP-1) has been determined previously by the application of X-ray crystallography to four organisms, namely Caenorhabditis elegans, Homo sapiens, Plasmodium falciparum and Escherichia coli. A detailed analysis of these known structures has provided insight into the possible function of APP-RD in PID-5, as they share a common catalytic mechanism. Performing a comparison of these structures using different bioinformatics techniques and undertaking a critical analysis of the available data enabled us to perform functional annotation by analogy. These analyses provide further understanding of the potential role of the PID-5 APP-RD in RNAe.
To address the Placentino et al. [4] hypotheses, this study focuses on the following key areas: (1) The assessment of the homology between the PID-5 APP-RD predicted structure and the available APP-1 crystal structures. (2) A comparison of metal composition across the metalloproteases and the catalytic mechanism. (3) A comparison of interactions, with a known APP-1 inhibitor used to assess potential substrate binding. (4) The evaluation of potential dimerisation interface interactions to assess the viability of PID-5 APP-RD homodimerisation and heterodimerisation with C. elegans APP-1.

PID-5 and APP-1 Structures and Homology
The predicted structure of PID-5 (Q9GUI6) was obtained from the AlphaFold2 Protein Structure Database 2.0 (https://alphafold.ebi.ac.uk (accessed on 7 February 2023)) [9]. APP-1 structures for Caenorhabditis elegans (CeAPP-1), cytosolic human XPNPEP1 …

Apstatin Inhibitor Interactions
A docking simulation of the PID-5 APP-RD predicted structure with apstatin was performed using GOLD software (2022.3.0) [15]. The coordinates and geometry of apstatin were generated in Chem3D. The apo-CeAPP-1 (PDB code: 4S2R) and apstatin-bound CeAPP-1 (PDB code: 4S2T) structures were compared to assess any conformational changes that occurred upon binding. The PID-5 APP-RD residues (Glu929 and Arg941) equivalent to those identified as dynamic in CeAPP-1 were defined as flexible. Residues Glu967 and His932 were marked as deprotonated to mimic physiological conditions. A total of 6 poses were generated using the ChemScore function.

PID-5 APP-RD and APP-1 Share High Amino Acid Sequence and Structural Homology
A comprehensive analysis of the sequence and structure conservation between known APP-1 structures and PID-5 APP-RD was performed. High percentages of similarity are indicative of functional relationships and therefore may provide information on the role of PID-5. In the absence of an experimentally solved structure for PID-5, the AlphaFold2-predicted structure (Figure 2) was used for comparison to crystal structures of CeAPP-1, HuAPP-1, PfAPP-1 and EcAPP-1.
The confidence levels of the PID-5 model were analysed prior to undertaking structural interpretation. AlphaFold2 provides confidence estimates per residue, using a predicted local distance difference test (pLDDT) that uses the local distance difference test Cα (lDDT-Cα) to estimate the level of agreement of the predicted structure with known experimental structures flagged during the multiple sequence alignment (MSA) step of the AlphaFold2 algorithm [19]. The APP-RD of PID-5 returned scores in the 'Confident' to 'Very high' model confidence range (90 > pLDDT ≥ 70 and pLDDT ≥ 90, respectively), where scores above 70 indicate a good backbone prediction [20]. Several residues fell within the 'Low' confidence category (70 > pLDDT ≥ 50). These residues were Asn526, Ser587, Ser992-Gln996 and Glu950-Asn955 (Figure 2C, coloured yellow), and they were interpreted with caution in subsequent analyses.
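The confidence bands described above can be expressed as a small lookup. The following is a minimal sketch, not the procedure used in this study: the function names are hypothetical, the per-residue scores in the example are invented for illustration, and the 'Very low' label for pLDDT < 50 follows the AlphaFold database convention rather than anything stated in the text.

```python
def plddt_band(score: float) -> str:
    """Map a per-residue pLDDT score (0-100) to its confidence band."""
    if not 0.0 <= score <= 100.0:
        raise ValueError(f"pLDDT must lie between 0 and 100, got {score}")
    if score >= 90.0:
        return "Very high"
    if score >= 70.0:
        return "Confident"
    if score >= 50.0:
        return "Low"
    # Assumption: band label taken from the AlphaFold database convention.
    return "Very low"


def residues_needing_caution(plddt_by_residue: dict, threshold: float = 70.0) -> list:
    """Return residue labels whose pLDDT falls below the backbone-confidence threshold."""
    return sorted(res for res, score in plddt_by_residue.items() if score < threshold)


# Hypothetical scores, invented purely to exercise the functions.
example_scores = {"Asn526": 62.1, "His928": 95.3, "Glu950": 58.4, "Glu967": 91.0}
flagged = residues_needing_caution(example_scores)
```

In practice the scores would be read from the B-factor column of the AlphaFold2 model file, where pLDDT values are stored.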
An MSA was performed to assess the similarity of the PID-5 APP-RD to known APP-1 structures at the sequence level. Areas of high conservation across all structures were revealed, particularly within the putative active site of PID-5 APP-RD (residues 908-947) (Figure 4). High conservation was also seen between residues 958-982 and 859-890.
The MSA output was used to create a phylogenetic tree to assess the evolutionary relationships between the structures (Figure 5). CeAPP-1 was returned as the closest relative to PID-5 APP-RD.
A pairwise sequence alignment was performed to quantify the percentage of perfectly conserved amino acids and those with conserved physicochemical properties in PID-5 APP-RD, relative to the homologous structures (Table 1). The CeAPP-1 sequence ranked the highest, with 41.4% shared identity with and 61.4% similarity to PID-5 APP-RD.
Table 1. Percentage sequence identity and similarity of known APP-1 structures to PID-5 APP-RD using the EBLOSUM62 matrix. Pairwise sequence alignment was performed through the EMBL-EBI EMBOSS Needle tool. The associated PDB codes are shown for the homologous APP-1 structures. Identity shows the percentage of perfectly conserved amino acid residues and similarity shows the percentage of amino acids with similar physicochemical properties.

PDB Code    Identity (%)    Similarity (%)
Structural comparison can provide information beyond sequence comparison and provide additional insights into functional conservation [13]. A DALI structural search was performed to compare the PID-5 APP-RD predicted structure to the crystal structures of its homologues (Table 2). Each result obtained a Z-score > 20, which confirmed that PID-5 APP-RD is homologous to all analysed structures [22]. The search returned CeAPP-1 as the closest structural homologue, with a Z-score of 47.9.
Table 2. Pairwise comparison of PID-5 APP-RD to CeAPP-1, HuAPP-1, PfAPP-1, and EcAPP-1 using DALI. PID-5 APP-RD (AlphaFold2 DB: Q9GUI6, residues Ser452-Ile1061) was used as the query structure and compared against the known APP-1 structures (one-against-many comparison). Structures in complexes with the APP-1 inhibitor apstatin are also shown.

Rank    Protein    Z-Score    RMSD (Å)    ID (%)    Chain
Superposition of the two structures revealed the extent of the sequence conservation and 3D homology (Figure 6A), with PID-5 APP-RD returning the highest sequence and structural similarity to CeAPP-1.
The 'pita-bread' fold aminopeptidases share the C-terminal peptide fold of two ααβββ repeats, which provides the structural foundation for catalysis [23]. The conservation of this fold was seen in the PID-5 APP-RD predicted structure (Figure 6B). As the fold is conserved, the substrate binding modes and catalytic mechanisms are also likely to be somewhat similar [24].
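The percentage identity reported for the pairwise alignments can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: the denominator is taken to be the full alignment length with gapped columns included, which is how EMBOSS Needle reports identity as far as we are aware; the function name and the two toy aligned fragments are hypothetical and are not the sequences analysed here. Similarity is not computed because it additionally requires a substitution matrix such as EBLOSUM62.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns holding the same residue in both sequences.

    Both inputs are rows of one pairwise alignment, so they must have equal
    length; '-' marks a gap and never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    # Assumption: gapped columns are kept in the denominator.
    return 100.0 * matches / len(aligned_a)


# Toy aligned fragments, invented for illustration only.
toy_identity = percent_identity("HETGH-LD", "HGTGHALD")
```

Here six of the eight alignment columns match, so the toy example gives 75% identity.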

PID-5 APP-RD May Bind a Single Zinc Ion
The metalloaminopeptidase classification of APPs results from the presence of one or two divalent metal ions at the active site, which is central to the catalytic mechanism [25]. Metal ions can coordinate a water molecule that acts as a nucleophile in the hydrolysis of the peptide bond. All APP-1 members have dinuclear metal centres, with CeAPP-1 known to coordinate Zn2+ [26] and with HuAPP-1, PfAPP-1 and EcAPP-1 known to bind Mn2+ [27][28][29]. The two metal binding sites (MA and MB) have differential affinities, with the MA ion being more tightly bound [23]. A comparison of the conserved metal-coordinating residues from the homologous proteins with the PID-5 APP-RD predicted structure revealed the feasibility of metal ion binding in the PID-5 APP-RD (Table 3).
Table 3. Metal-coordinating residues of APP-1 homologues and PID-5 APP-RD. The conserved metal-coordinating residues from the homologous structures are separated into the two distinct metal binding sites. The equivalent residues in the PID-5 APP-RD predicted structure (AlphaFold2 DB: Q9GUI6) are aligned. Residues shown in green text are conserved across all five structures. Residues shown in blue are conserved across the homologues only. Black text highlights the residues which are not conserved. PDB codes in parentheses: CeAPP-1 (4S2R), HuAPP-1 (3CTZ), PfAPP-1 (5JQK) and EcAPP-1 (1WL9).


PID-5 APP-RD    CeAPP-1
The spatial arrangement of the metal-coordinating residues was analysed to assess the potential for metal coordination in PID-5 APP-RD (Figure 7). On the basis of the amino acid sequence alignment and structural comparison, the conserved metal-ion-coordinating residues were defined as DEEH for MA and DDE for MB. The interactions of the first glutamic acid side chain and the imidazole nitrogen atoms from the histidine in MA were conserved in PID-5 APP-RD (E967 and H932). The other two metal-coordinating residues for MA were identified as aspartic acid and glutamic acid. The equivalent residues in PID-5 APP-RD are asparagine (N868) and glycine (G981), respectively. Asparagine may form a coordinate covalent bond to stabilise the metal ion in the active site, but any interactions from the glutamic acid carboxylate group are lost.
In the second metal-binding site (MB), the identity of the three coordinating residues was not conserved. Two negatively charged aspartic acid side chains in the homologues (Figure 7A-D) were shown to be equivalent to the uncharged asparagine and glutamine residues in PID-5 APP-RD (Figure 7E). The third coordinating residue was the MA glutamic acid, which was equivalent to glycine in PID-5 APP-RD. Although a lone pair from the asparagine and glutamine carboxamide groups may interact with a metal through a coordinate covalent bond, the combined interactions from these mutated residues are likely to be inadequate to coordinate a metal ion at MB.
The analysis of the potential interactions from the mutated residues supports the hypothesis that PID-5 APP-RD could coordinate a single metal ion at MA (Figure 7E). The inspection of the residues within the putative metal-binding site revealed both zinc and manganese ions to be biochemically plausible.
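The residue-by-residue comparison of the two metal sites can be summarised as a simple tally. The sketch below is illustrative only: the data structures and function name are hypothetical, the residue pairings are those described in the text above (DEEH for MA and DDE for MB), and the MB equivalents in PID-5 APP-RD are given as one-letter codes only.

```python
# Each pair is (conserved residue in the homologues, equivalent in PID-5 APP-RD).
# MA site (DEEH): D -> N868, E -> E967, E -> G981, H -> H932.
MA_SITE = [("D", "N"), ("E", "E"), ("E", "G"), ("H", "H")]
# MB site (DDE): both Asp become Asn/Gln, and the Glu shared with MA becomes Gly.
MB_SITE = [("D", "N"), ("D", "Q"), ("E", "G")]


def conserved_count(site: list) -> int:
    """Count positions where PID-5 APP-RD retains the homologue residue."""
    return sum(1 for homologue, pid5 in site if homologue == pid5)


ma_conserved = conserved_count(MA_SITE)  # positions retained at MA
mb_conserved = conserved_count(MB_SITE)  # positions retained at MB
```

The tally gives two of four positions retained at MA and none of three at MB, which mirrors the argument that only MA is a plausible metal site. A count of this kind ignores side-chain chemistry, so it does not capture the possible contribution of the asparagine carboxamide discussed above.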
Further investigation conducted using AlphaFill supports this hypothesis (Table 4). Where AlphaFold2-generated models lack ligand interactions and cofactors, AlphaFill can be used to consider these additional relationships [14]. Homologous structures are compared, and any molecules and ions that have been experimentally shown to be present are assessed in the AlphaFill interface, resulting in an enhanced predicted model. AlphaFill returned five potential ions, three of which were consistently modelled as mononuclear metal centres at MA (Zn, Mn, and Co), consistent with the hypothesis that a single metal ion is coordinated (Table 4).
Table 4. AlphaFill small molecule transplant results. The 'transplanted' compound is shown with the percentage sequence identity between PID-5 APP-RD and the reference PDB-REDO entry and its PDB code. The global root-mean-square deviation (RMSD) values between the structurally aligned PID-5 APP-RD of the AlphaFold2-predicted structure (AlphaFold2 DB: Q9GUI6, Ser452-Ile1061) and the donor structure Cα atoms are shown. Local RMSD (local structural alignment of backbone atoms within 6 Å) and Transplant Clash Score (TCS) (which represents the van der Waals overlap between the inserted ions and the protein binding site within 4 Å) are shown as quality indicators [14]. Scores highlighted in red represent medium-confidence results.
The local RMSD for zinc from CeAPP-1 (PDB code: 4S2R) was > 0.64 Å and was therefore flagged as medium confidence by AlphaFill. However, the low TCS was indicative of good transplant reliability. Manganese transplanted from HuAPP-1 (PDB code: 3CTZ) returned local RMSD and TCS scores in the high confidence bracket. The coordination geometry of the additional transplants was assessed. In the reference structures, the ions were either coordinated by more acidic residues, or were positioned away from the active site, and the cobalt transplant (PDB code: 1WN1) had lower coordinating residue homology than the manganese and zinc transplants.

Compound Identity (%) PDB.Chain Global RMSD (Å) Local RMSD (Å) TCS
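The RMSD values used as quality indicators for the transplants follow the standard root-mean-square deviation over paired atoms. The sketch below is a simplified illustration and not AlphaFill's implementation: it assumes the two coordinate sets are already superimposed and paired atom by atom, the function names are hypothetical, and the coordinates are invented. The 0.64 Å cutoff is the local RMSD value quoted in the text above.

```python
import math


def rmsd(coords_a: list, coords_b: list) -> float:
    """Root-mean-square deviation (in the input units) between paired 3D points.

    Assumption: the structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must contain the same number of atoms")
    if not coords_a:
        raise ValueError("coordinate sets are empty")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def exceeds_local_rmsd_cutoff(value: float, cutoff: float = 0.64) -> bool:
    """True when a local RMSD is above the cutoff quoted in the text."""
    return value > cutoff


# Invented coordinates: every atom is displaced by exactly 1 Å along z.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
donor = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
toy_rmsd = rmsd(model, donor)
```

A real comparison would first superimpose the backbone atoms within the chosen radius of the transplanted ion before applying this formula.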
To assess potential structural differences in manganese and zinc coordination, the structures of the human APP isoforms were studied. Interestingly, cytosolic HuAPP-1 binds manganese, whereas the membrane-bound HuAPP-2 has been characterised as zinc-binding [30]. The superposition of HuAPP-1 (PDB code: 3CTZ) onto HuAPP-2 (AlphaFold2 DB: O43895) showed the complete amino acid identity of the metal-coordinating residues. Due to the amino acid conservation, the environment in which the protein resides may dictate which metal ion binds [30]. CeAPP-1 is a cytosolic Zn2+-binding protein and PID-5 is also cytosolic. Additionally, PID-5 is localised in the perinuclear region and in P granules [4]. Elemental tomography of C. elegans showed manganese to be predominantly localised to the intestine, whereas zinc was shown to be more widely distributed, and importantly, present in the gonads and the developing embryo [31].
Mononuclear zinc metalloproteases have a conserved active-site signature HExxH motif, where the two histidine residues contribute to the zinc coordination [32]. CeAPP-1 and HuAPP-2 have a HGTGH motif, whereas PID-5 APP-RD has the standard mononuclear Zn2+ metalloprotease motif (HETGH) between His928 and His932, satisfying the requirement for zinc binding. However, His928 is positioned ~8.95 Å (as measured from His928 NE2) away from the AlphaFill-positioned zinc, whereas the equivalent coordinating residue in CeAPP-1 (His496) is positioned ~4.5 Å from Zn2+. The rotation of Arg979 in PID-5 APP-RD from the current position in the predicted structure would allow His928 to move closer to the proposed Zn2+ ion.
In conclusion, our analysis predicts that PID-5 APP-RD could coordinate one divalent metal atom at MA. The extent of homology with zinc-binding CeAPP-1, the distribution of Zn2+ compared to Mn2+ in C. elegans and the presence of the classical HExxH motif suggest that the MA metal will be a Zn2+ ion.
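The HExxH signature discussed in this section can be located in a sequence with a simple pattern search. The following is a minimal sketch: the function name and the flanking residues of the two fragments are hypothetical and chosen only so that the motif starts at position 928 in the first fragment; only the HETGH and HGTGH motifs themselves are taken from the text.

```python
import re

# H, E, any two residues, H.
HEXXH = re.compile(r"HE..H")


def find_hexxh(sequence: str, first_residue: int = 1) -> list:
    """Return (residue number of the first His, matched motif) for each HExxH hit.

    first_residue is the sequence number of sequence[0], which lets a
    fragment keep the numbering of the full-length protein.
    """
    return [
        (match.start() + first_residue, match.group())
        for match in HEXXH.finditer(sequence)
    ]


# Hypothetical fragments; only the five-residue motifs come from the text.
pid5_like = find_hexxh("AVHETGHLL", first_residue=926)
ceapp1_like = find_hexxh("AVHGTGHLL", first_residue=926)
```

The first fragment returns a single hit at position 928 with the motif HETGH, whereas the HGTGH fragment returns no hit because glycine replaces the glutamic acid at the second position.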

PID-5 APP-RD Could Bind the APP-1 Inhibitor Apstatin
The 'pita-bread' aminopeptidases have analogous catalytic mechanisms that are highlighted by similarities in enzyme inhibitor binding. Apstatin is a non-hydrolysable peptide analogue that is known to be an APP-1 inhibitor through substrate mimicry (Figure 8) [24].
Figure 8. The structure of the APP-1 inhibitor apstatin. Apstatin is annotated with its own naming convention of the oxygen and nitrogen atoms for clarity in analysis. The hydroxyl group of the N-terminal (2S,3R)-3-amino-2-hydroxy-4-phenyl-butanoic acid (O1) binds both metal ions [8]. The two proline residues accommodate the enzyme's subsite specificity for the P1' Pro and P2' Pro, and the amino acid amide Ala-NH2 was designed based on the higher binding affinities of tetrapeptides [33].
Apstatin-bound APP-1 structures are available for CeAPP-1 (PDB code: 4S2T, 2.15 Å), PfAPP-1 (PDB code: 5JR6, 2.3 Å) and EcAPP-1 (PDB code: 1N51, 2.3 Å). A total of 12 residues within the active site are conserved across the APP-1 domains analysed here, of which 5 are conserved in PID-5 APP-RD. In comparison to its most closely related homologue, CeAPP-1, PID-5 APP-RD shares 7 identical residues and has 11 'similar' residues (Table 5).
Table 5. Apstatin binding site residues of known APP-1 structures and the predicted PID-5 APP-RD structure. Table of identified apstatin-interacting residues for CeAPP-1 (PDB code: 4S2T), PfAPP-1 (PDB code: 5JR6) and EcAPP-1 (PDB code: 1N51), with equivalent residues aligned and corresponding residues in PID-5 APP-RD identified (AlphaFold2 DB: Q9GUI6, S452-I1061). Apstatin from the inhibitor-bound CeAPP-1 structure was modelled into HuAPP-1 in the absence of an apstatin-bound crystal structure. Residues in green text are conserved across all structures and residues in blue are conserved across three structures. Residues in black text are not conserved across the structures. Residues listed in bold text have been specifically referenced in the published literature. Interactions detailed for CeAPP-1:apstatin were used to map many of the other structures' residues, as a discrepancy in the level of detail was observed in the literature. Those residues mapped from structural superimposition of another structure are shown in normal print.

PID-5 APP-RD
The 5 perfectly conserved apstatin-interacting residues may therefore be involved in apstatin binding for PID-5 APP-RD. To assess the potential interactions, the PID-5 APP-RD predicted structure was superimposed with the apstatin-bound crystal structures (Figure 9). His928 and Arg979 may contribute to the hydrophobic pocket that, in the homologous structures, is occupied by the P1' proline of the substrate, and may facilitate its correct orientation. Ile828 may contribute to the hydrophobic pocket occupied by the P1 phenylalanine. Residue Glu967 may function to activate the nucleophilic water molecule [34] and His932 may form a hydrogen bond, as seen in CeAPP-1:apstatin.

H932    H487    E536    E690    E406    E537
Figure 9. Putative PID-5 APP-RD binding site residues with apstatin orientations from the homologous APP-1 crystal structures. The amino acid side chains of PID-5 APP-RD, which may contribute to apstatin binding, are shown (green) (AlphaFold2 DB: Q9GUI6). The active site residues were identified from superimposition with known APP-1:apstatin crystal structures. The blue, orange …
The conformational changes that occur upon binding were also considered by a comparison of the apo-CeAPP-1 (PDB code: 4S2R) and apstatin-bound CeAPP-1 structures (PDB code: 4S2T). In CeAPP-1, Arg505 flips 180° to interact with apstatin, with no other conformational changes observed. Potential steric clashes were identified that could prevent apstatin from binding, especially given the minimal conformational change. The binding pose of CeAPP-1 apstatin with PID-5 APP-RD revealed four amino acid residues in close proximity to the ligand (Figure 9). To assess the potential dynamics of PID-5 APP-RD and the potential binding residues, GOLD was used to dock apstatin into the putative binding site [15]. Arg941 was marked as flexible for docking due to its proximity to apstatin, enabling 34 rotamers to be assessed from the GOLD library. Glu929 corresponds to a glycine residue in CeAPP-1 and was therefore also marked as flexible, with 8 rotamers allowed.
The docking returned 6 solutions, with the highest scoring solution displaying a binding pose in a similar orientation to CeAPP-1:apstatin for the P1 phenylalanine (Figure 10A). A total of 10 interactions were predicted (Figure 10B), although the potential hydrogen bond from Glu950 must be treated with caution due to 'Low' AlphaFold2 confidence.
The positioning of docked apstatin further from the metal binding sites relative to CeAPP-1:apstatin could be attributed to the proposed absence of any interactions with M_B, as apstatin coordinates both divalent metal ions in the homologous structures.
The predicted hydrogen-bonding interactions were compared to the known crystal structures of CeAPP-1, PfAPP-1, and EcAPP-1 in a complex with apstatin, and it was shown that all the homologues utilise at least 6 interactions for apstatin binding (Supplementary Table S1). Although mutated residues (Table 5) meant that the other equivalent hydrogen-bonding interactions in the homologues were lost, a comparison of the docking results to the interactions present in CeAPP-1:apstatin suggests that some residues may form compensatory interactions (Supplementary Table S2).
The PID-5 APP-RD docking results suggest that five amino acids could be used to form 8 hydrogen bonds with apstatin (Figure 10B), compared to the seven amino acids in CeAPP-1 that form 13 hydrogen bonds [26]. The occurrence of fewer potential hydrogen-bonding interactions may have also contributed to the difference in the docked position relative to CeAPP-1:apstatin.
Successful GOLD docking suggests apstatin binding is plausible for PID-5 APP-RD, although the smaller number of potential interactions would be expected to result in weaker binding.
A PDBePISA-generated ΔiG (observed solvation free energy gain) p value of 0.176 suggested that the CeAPP-1 interface had high hydrophobicity compared to average structures, and this was suggestive of a biological dimer. Previous research also showed detectable amounts of monomeric (2.1%) and tetrameric (9.6%) CeAPP-1 species following analytical ultracentrifugation [26]. PDBePISA was used to assess the dimer interface interactions of CeAPP-1, returning 16 hydrogen bonds and 4 salt bridges (Table 6).
Table 6. Conserved dimer interface residues. The dimer interface residues for CeAPP-1 (PDB code: 4S2R) with the equivalent residues in PID-5 APP-RD (AlphaFold2 DB: Q9GUI6) aligned. Hydrogen-bonding interactions and salt bridges between CeAPP-1 Chain Q and P are listed. The equivalent potential interactions in the modelled PID-5 APP-RD homodimer are shown in the latter two columns. Potential heterodimer interactions can be inferred through the central two columns and the outer two columns. Residues in green text are conserved in both CeAPP-1 and PID-5 APP-RD and blue text highlights similar residues. Backbone interactions are indicated by an asterisk.

Previous immunoprecipitation-mass spectrometry (IP-MS) experiments showed an enrichment of APP-1 with PID-5 that could be due to CeAPP-1:PID-5 heterodimerisation [4]. PID-5 APP-RD was superimposed onto a monomer of CeAPP-1 to assess the likelihood of PID-5 APP-RD heterodimerisation. Two copies of the PID-5 APP-RD predicted structure were also superimposed onto the individual CeAPP-1 monomers in order to model homodimerisation (Table 6).
Most of the interface interactions could be conserved, with the exception of two salt bridges, as CeAPP-1 Lys136 is equivalent to Gln583 in PID-5. However, hydrogen-bonding interactions may still form. The side chain interaction of CeAPP-1 Thr468 would also be lost, as the equivalent residue is Ile913 in PID-5 APP-RD. Nevertheless, the high residue conservation suggests that both the heterodimeric and homodimeric structures could satisfy the observation that, on average, 5-10 hydrogen bonds are seen per 1000 Å² of protein interface [16].
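The 5-10 hydrogen bonds per 1000 Å² rule of thumb can be turned into simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and because the interface areas are not stated in this passage, it works backwards from the 16 hydrogen bonds reported for the CeAPP-1 interface to the range of interface areas that would be consistent with the rule.

```python
def hbond_density(n_hbonds, interface_area):
    """Hydrogen bonds per 1000 square angstroms of interface."""
    return n_hbonds * 1000.0 / interface_area

def consistent_area_range(n_hbonds, low_per_1000=5.0, high_per_1000=10.0):
    """Interface areas (square angstroms) for which n_hbonds falls within
    the quoted 5-10 hydrogen bonds per 1000 square angstroms."""
    min_area = n_hbonds * 1000.0 / high_per_1000
    max_area = n_hbonds * 1000.0 / low_per_1000
    return min_area, max_area

# 16 hydrogen bonds were reported for the CeAPP-1 dimer interface.
print(consistent_area_range(16))   # (1600.0, 3200.0)

# Hypothetical interface area, used only to show the density calculation.
print(hbond_density(16, 2000.0))   # 8.0
```

The same check can be applied to a modelled homodimer or heterodimer once its predicted hydrogen-bond count and buried interface area are known.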
CeAPP-1 and PID-5 APP-RD display a high conservation of charged residues at the dimer interface that may provide favourable electrostatic interactions between the monomers and contribute to the stability of a potential heterodimer (Figure 11).
[Figure caption, continued] ... in white represent close-to-neutral residues. The hydrophobicity of the surface is shown for (C) CeAPP-1 and (D) PID-5 APP-RD. Pale brown areas highlight hydrophobic patches as generated by the GRID-type hydrophobic potential in CCP4mg.
To further explore dimerisation, we predicted the structures of the heterodimer (Figure 12) and homodimer using AlphaFold2. The solvent-accessible area buried upon dimer formation was calculated by PDBePISA (Table 7) for the predicted structures [16].
Table 7. PDBePISA dimer interface analysis for CeAPP-1 homodimer (PDB code: 4S2R), and the AlphaFold2-predicted PID-5 APP-RD homodimer and heterodimer with CeAPP-1. The predicted PID-5 APP-RD was used in the dimer models (AlphaFold2 DB: Q9GUI6, residues S452-I1061). The interface area is also shown as a percentage of the total solvent-accessible area buried upon dimer formation.
The identification of potentially conserved hydrogen-bonding interactions and salt bridge formation, in addition to the demonstrated association between PID-5 and APP-1 [4], suggests that PID-5 APP-RD and CeAPP-1 heterodimerisation is feasible. A surface diagram of the AlphaFold2-predicted heterodimer highlights the large potential interface area (Figure 13).
Previous research identified a tryptophan residue as vital for dimerisation in HuAPP-1: when a Trp477-to-glutamic-acid point mutation blocked homodimerisation, only 6% of the wild-type activity was maintained [27]. This residue is conserved in CeAPP-1 (Trp475) and PID-5 (Trp920), and all three residues are positioned in the same locality and orientation. As this tryptophan residue is essential in the second closest homologue to PID-5 APP-RD and is present in both CeAPP-1 and PID-5, Trp920 may perform the role of CeAPP-1 Trp475 in the heterodimer, which may help to facilitate dimerisation and maintain APP-1 activity. However, it is important to note that PfAPP-1 and EcAPP-1 do not have the tryptophan residue and still form dimers.
Figure 13. Space-filling representation of the predicted PID-5 APP-RD and CeAPP-1 heterodimer. The heterodimer of PID-5 APP-RD (AlphaFold2 DB: Q9GUI6, S452-I1061) and CeAPP-1 (PDB code: 4S2R) was predicted by AlphaFold2. The spacefill rendering was performed by PDBePISA, where the CeAPP-1 monomer is shown in dark blue, with dimer interface residues in red, PID-5 APP-RD in light blue, and dimer interface residues in green.
Although the evidence suggests heterodimerisation is possible, the effect of such interactions remains unclear, as the relationship between dimerisation and CeAPP-1 activity is currently unknown. Additionally, the potential substrates for the CeAPP-1:PID-5 APP-RD heterodimer are unconfirmed, although WAGO-4 has been identified as a potential substrate [4].

AlphaFold2 and APP-1 Enabled Homology-Based Annotations
The AlphaFold2 structure prediction enabled the analysis of the PID-5 APP-RD in the absence of experimental data. Due to the high level of sequence and structural similarity of PID-5 with known APP-1s, strong analogies could be drawn between the structures, allowing us to infer the characteristics of PID-5 through association. The assumption that the predicted model is representative of the crystal structure underlies this study, although it should be acknowledged that AlphaFold2 has been recognised for its predictive accuracy and that the per-residue reliability estimates provided confidence in the analysis.
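The per-residue reliability estimates mentioned above are pLDDT scores, which AlphaFold DB model files in PDB format store in the B-factor column. The following is a minimal sketch of how such scores might be read and binned, assuming fixed-column PDB ATOM records. The function names are hypothetical, the band labels follow the AlphaFold DB colour legend as an assumption, and the toy records use invented scores rather than values from the Q9GUI6 model.

```python
def plddt_band(plddt):
    """Bin a pLDDT score using the commonly quoted thresholds (90, 70, 50)."""
    if plddt > 90:
        return "Very high"
    if plddt > 70:
        return "Confident"
    if plddt > 50:
        return "Low"
    return "Very low"

def residue_plddt(pdb_text):
    """Map (chain, residue number, residue name) to pLDDT, read from the
    B-factor field (columns 61-66) of each CA atom record."""
    scores = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            key = (line[21], int(line[22:26]), line[17:20].strip())
            scores[key] = float(line[60:66])
    return scores

def toy_ca_line(serial, resname, chain, resseq, plddt):
    """Build a fixed-column CA record with dummy coordinates (demo only)."""
    return (f"ATOM  {serial:>5}  CA  {resname:>3} {chain}{resseq:>4}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{plddt:6.2f}")

# Invented residues and scores, for illustration only.
toy = "\n".join([
    toy_ca_line(1, "ALA", "A", 10, 62.5),
    toy_ca_line(2, "GLY", "A", 11, 95.1),
])

for (chain, resseq, resname), score in residue_plddt(toy).items():
    print(chain, resname, resseq, score, plddt_band(score))
```

Applied to a real model file, a listing like this makes it straightforward to flag residues whose predicted interactions should be treated with caution.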

Mononuclear Zinc Binding PID-5 APP-RD Is Probably Catalytically Inactive
This comparative structural study suggests PID-5 APP-RD may bind a single Zn2+ at M_A, but it was not possible to consider the stability of the potential coordination complex in this report. The distance of the first histidine residue of the HExxH motif (His928) may be too great to facilitate a stable Zn2+ coordination, and non-specific metal binding may occur as a result. As PID-5 APP-RD may also bind Mn2+, it may sense the concentration of different ions, with metal selectivity potentially having a role in modulating its activity. Indeed, the metal ion content of both HuAPP-1 and EcAPP-1 has been shown to vary under different cell culture conditions [27,35]. Furthermore, it has been reported that the mutation of the conserved metal-coordinating residues Asp260 and Asp271 in EcAPP-1 results in a catalytically inactive enzyme [34]. Additionally, the mutation of the active site residue His243 into alanine in EcAPP-1 abolished any catalytic activity [34]. The equivalent residue to His243 in PID-5 APP-RD via structural superimposition is Ala839. This evidence suggests that a mononuclear zinc-bound PID-5 APP-RD would be catalytically inactive, and could perhaps be considered a 'pseudo-APP' molecule. Experimental evidence is required to confirm whether a single metal does coordinate.

Apstatin Binding Is Likely to Be Transient in PID-5 APP-RD
An IC50 value of 20.2 ± 1.2 µM was shown for PfAPP-1 [28], but apstatin is a relatively poor APP-1 inhibitor. This is particularly true of its capacity to inhibit HuAPP-1, where a crystal structure of the protein-ligand complex is unavailable. This study suggests that apstatin binding to PID-5 APP-RD is more transient than in the homologous structures and therefore would be a very weak interaction. However, in vitro studies are required to confirm this. A peptide-PID-5 APP-RD complex may not be stable and, based on the current evidence, CeAPP-1 would outcompete PID-5 APP-RD for substrate. PID-5 APP-RD may therefore not function to bind and lock substrates without cleavage, as was suggested in the first Placentino et al. hypothesis [4]. An alternative role may be more probable, supporting the heterodimerisation hypothesis.

Would Heterodimerisation Result in APP-1 Inactivity?
Placentino et al. [4] identified APP-1 as an interactor of PID-5, and this study showed that heterodimer formation is plausible based on the high level of sequence conservation at the dimer interface. Placentino et al. [4] hypothesised that heterodimerisation may function to prevent APP-1 homodimerisation and activity, or to bring APP-1 activity into PID-5-positive locations. Although the relationship between dimerisation and APP-1 activity is unknown, the structural homology of CeAPP-1 and PID-5 APP-RD, particularly at the dimer interface, indicates that the heterodimer could maintain the correct fold for activity.
Confirming the relationship between dimerisation and APP-1 activity would help to determine whether heterodimerisation with a likely catalytically inactive PID-5 results in APP-1 activity, and would hence enable the two hypotheses to be distinguished.

Biological Implications
PID-5 is known to interact with PID-2, with a proposed role in regulating RdRP and other factors that bring about heritable silencing [4]. PID-4 has also been shown to be a PID-2 interactor, and both PID-4 and PID-5 contain eTudor domains. A lack of the characteristic aromatic cage and acidic amino acids suggests that they will not bind symmetrically dimethylated arginines and may instead use these domains to bind PID-2 non-simultaneously. Tudor domains are known to function as 'adaptors' [36], and therefore the role of PID-4/5 may be to act as mediators of the downstream effectors PRMT-5 and APP-1, respectively, for the modification of RNAe proteins.
This study shows that the heterodimerisation of APP-1 with PID-5 is plausible, connecting N-terminal proteolysis to RNAe. Whilst PID-4 may bring PRMT-5 activity to PID-2 for potential arginine modifications, PID-5 could bring APP-1 activity to the Z-granules and act on substrates such as WAGO-4 [4]. This argonaute protein interacts with 22G-RNAs and is a known promoter of heritable epigenetic silencing with ZNFX-1 [37]. It has been suggested that ZNFX-1 maintains epigenetic signals at the 3' region of the target mRNA by redistributing RdRPs, as argonaute proteins tend to target the 5'-end of mRNA [38]. ZNFX-1 and WAGO-4 may therefore work together to maintain a balanced 22G-RNA population and an even distribution of epigenetic signals [39]. WAGO-4 has a proline-rich (22.9% of the first 70 residues) N-terminal domain that comprises three proline dipeptides in the first 25 residues. Since two adjacent prolines of apstatin and bradykinin, a nine-amino-acid peptide substrate, are well accommodated by the S1' and S2' subsites of CeAPP-1 [40], it is conceivable that the dipeptide motifs of WAGO-4 may be important for any interaction with the PID-5 APP-RD.
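The proline statistics quoted above can be reproduced for any sequence with a few lines of code. The sketch below is illustrative only: the function names are hypothetical and the example sequence is an invented toy peptide, not the WAGO-4 N-terminus. As a consistency check on the quoted figure, 22.9% of 70 residues corresponds to about 16 prolines (16/70 ≈ 22.9%).

```python
def proline_fraction(seq, window=70):
    """Fraction of proline residues in the first `window` residues."""
    head = seq[:window].upper()
    return head.count("P") / len(head) if head else 0.0

def count_motif(seq, motif="PP", window=25):
    """Count occurrences of `motif` in the first `window` residues.
    Overlapping matches are counted, so 'PPP' contributes two 'PP' hits."""
    head = seq[:window].upper()
    step = len(motif)
    return sum(1 for i in range(len(head) - step + 1) if head[i:i + step] == motif)

# Invented toy peptide (NOT the WAGO-4 sequence).
toy_seq = "MPPSAPPGTPPK"

print(proline_fraction(toy_seq))   # 0.5
print(count_motif(toy_seq))        # 3

# 16 prolines in 70 residues gives the percentage quoted in the text.
print(round(16 / 70 * 100, 1))     # 22.9
```

Whether overlapping dipeptides should be counted separately is a design choice; a non-overlapping count would be obtained with `head.count(motif)` instead.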
Placentino et al. [4] showed that pid-2 mutants lost 22G-RNAs from the 5'-end, suggesting a relationship between PID-2 and WAGO-4 activity. Additionally, ZNFX-1 and WAGO-4 form the independent Z-granule [41] where PID-2 is also located. By drawing together the evidence from the results reported here and that available in the literature, it is possible to address the hypothesis that, if the heterodimer has APP-1 activity, PID-5 may interact with PID-2 for the regulated modification of the N-terminus of WAGO-4. This may control WAGO-4 stability and activity and contribute to the balanced inheritance of epigenetic information with ZNFX-1.
As PID-4/5 may be localised at the periphery of P granules [4], this hypothesis supports the idea that these proteins enable conservation between the perinuclear components and contribute to the spatial-temporal regulation of RNAe components.

Therapeutic Translational Research
The potential biological application of the current work is also seen in Plasmodium falciparum (Pf), a lethal human malaria-causing Plasmodium species [8]. Increasing resistance to the antimalarial artemisinin in Pf is a significant concern. PfAPP-1 has been identified as key to parasitic survival and therefore could be a promising therapeutic target, but apstatin has been shown to be only a weak inhibitor of PfAPP-1, and more potent inhibitors are not currently available. However, if dimerisation is shown to be crucial to PfAPP-1 activity, a dimerisation block may be of interest therapeutically. Comparative studies such as this contribute to the determination of binding site differences, which is key to designing specific inhibitors to target common catalytic centres/mechanisms.
Inhibitor design is also of interest for HuAPP-1. Studies by Silva et al. [7] demonstrated increased replication-related DNA damage with knocked-down HuAPP-1. The relationship between HuAPP-1 and genome stability suggests that HuAPP-1 inhibitors may be of therapeutic importance for treating malignancies as a lack of HuAPP-1 would result in DNA replication errors, genome instability and ultimately cell death. The narrow specificity of APPs due to the requirement for proline as the second residue makes them an attractive therapeutic target. The specific mechanisms of APP-1 action need to be further studied and enhanced inhibitors can potentially be designed and incorporated into chemotherapeutic or immunotherapy regimes.

Conclusions
Based on the evidence from the present detailed bioinformatic analysis, we state the following hypotheses: (a) PID-5 APP-RD will share high structural homology with known APP-1 enzymes; (b) PID-5 APP-RD will bind a single Zn2+ ion and is likely to be catalytically inactive and therefore classed as a 'pseudoenzyme'; (c) PID-5 APP-RD will bind apstatin transiently and might also interact with the Pro-rich N-terminal domain of WAGO-4; and (d) PID-5 APP-RD will heterodimerise with CeAPP-1. This study supports the hypothesis that PID-5 APP-RD may function to heterodimerise with CeAPP-1. Ultimately, in order to confirm or refute these hypotheses, specific biological experiments must be designed to address the above points.
Author Contributions: A.C.L. performed all the detailed analysis reported in this work, wrote the initial manuscript and edited the manuscript. K.S.G. supervised the study, analysed the data and edited the manuscript. R.E.I. analysed the data and edited the manuscript. K.R.A. conceptualised the work, supervised the study, analysed the data and edited the manuscript. All authors have read and agreed to the published version of the manuscript.