Structure–Function Relationship Study of a Secretory Amoebic Phosphatase: A Computational-Experimental Approach

Phosphatases are hydrolytic enzymes that cleave the phosphoester bond of numerous substrates containing phosphorylated residues. The typical classification divides them into acid or alkaline depending on the pH at which they have optimal activity. The histidine phosphatase (HP) superfamily is a large group of functionally diverse enzymes characterized by having an active-site His residue that becomes phosphorylated during catalysis. HP enzymes are relevant biomolecules due to their current and potential application in medicine and biotechnology. Entamoeba histolytica, the causative agent of human amoebiasis, contains a gene (EHI_146950) that encodes a putative secretory acid phosphatase (EhHAPp49), exhibiting sequence similarity to histidine acid phosphatase (HAP)/phytase enzymes, i.e., branch-2 of HP superfamily. To assess whether it has the potential as a biocatalyst in removing phosphate groups from natural substrates, we studied the EhHAPp49 structural and functional features using a computational-experimental approach. Although the combined outcome of computational analyses confirmed its structural similarity with HP branch-2 proteins, the experimental results showed that the recombinant enzyme (rEhHAPp49) has negligible HAP/phytase activity. Nonetheless, results from supplementary activity evaluations revealed that rEhHAPp49 exhibits Mg2+-dependent alkaline pyrophosphatase activity. To our knowledge, this study represents the first computational-experimental characterization of EhHAPp49, which offers further insights into the structure–function relationship and the basis for future research.


Introduction
Phosphatases are enzymes that cleave the phosphoester bond of various substrates (e.g., proteins, lipids, and sugars) containing phosphorylated residues [1][2][3]. According to the pH at which they have optimal catalysis, the typical classification divides them into acid or alkaline phosphatases [4].
The histidine phosphatase (HP) superfamily (InterPro: IPR029033) is a large group of functionally diverse enzymes sharing a conserved active site that includes a His residue, which becomes phosphorylated during catalysis. This superfamily comprises two branches that share limited sequence similarity. Branch-1 (InterPro: IPR013078) involves a wide variety of enzymes, including fructose-2,6-bisphosphatases and phosphoglycerate mutases. Branch-2 (InterPro: IPR000560) contains mainly acid phosphatases and phytases [5]. HP enzymes are relevant for biomedicine and biotechnology due to their current and potential applications. For instance, the human prostatic acid phosphatase is a biomarker with clinical significance for prostate cancer [6], and phytases have potential as biocatalysts for sustainable agriculture and animal nutrition [7][8][9].
Proteomic analysis of isolated amoebic phagosomes showed the active expression of a wide variety of secretory enzymes, including acid phosphatases [13]. Among these, the 49.3-kDa histidine acid phosphatase (HAP)/phytase-like protein encoded by EHI_146950 (hereafter EhHAPp49) caught our attention due to an apparent lack of information about its precise function. Here, we studied the structure-function relationship of EhHAPp49 using a computational-experimental approach to gain knowledge about its catalytic capabilities and to assess its potential as a biocatalyst for removing phosphates from natural substrates in sustainable agriculture or animal nutrition.

E. histolytica Encodes a HAP/Phytase-Like Protein: EhHAPp49
Bioinformatic analysis of the EhHAPp49 primary structure provided the first insights into its structure-function relationship and the initial leads about its enzymatic abilities. As expected, the combined BLAST/CD-Search/InterPro analyses confirmed that it shares significant identity with phosphatases of the HAP/phytase family [5]. Furthermore, a restricted BLAST search of the Entamoebidae database returned several genes encoding homologous proteins (Figure 1A). The polypeptide sequence analysis also revealed a putative N-terminal signal peptide (Met1-Cys18), which suggests protein targeting to the secretory pathway, and a phosphatase domain (Glu19-Gln418), which includes six highly conserved residues (Arg41, His42, Arg45, Arg152, His334, and Asp335) as found in the catalytic core of functional homologs (Figure 1B).
Automatic prediction of the tertiary structure of EhHAPp49 by homology-based modeling offered additional insights into its three-dimensional (3D) conformation. Using the EhHAPp49 polypeptide as the query sequence, a restricted BLAST search of the Protein Data Bank (PDB) repository returned the crystal structures of three protein counterparts as suitable templates (E-value < 10^−14) for the 3D modeling: (1) the human PAP (prostatic acid phosphatase): 2HPA, 1CVI, 1ND5, and 1ND6 [14][15][16], (2) the human LAP (lysophosphatidic acid phosphatase type 6): 4JOB, 4JOC, and 4JOD [17], and (3) the Legionella pneumophila HAP: 5CDH [18]. The top five 3D-models generated by the Modeller multi-template approach met the expected structural benchmarks: a low normalized discrete optimized protein energy (zDOPE) score and a Ramachandran plot showing more than 85% of residues in the most favored regions. After refinement using a molecular dynamics (MD)-based method, MolProbity analysis validated the structural accuracy of the best 3D-model for EhHAPp49 (Figure 2A), scoring 2.09 (71st percentile) for protein geometry and 2.79 (98th percentile) for all-atom contacts, which exceeded the benchmark (values >66th percentile are good scores). Moreover, the Ramachandran plot showed that 98.8% of all residues were in the allowed regions, with 93.0% in favored regions (Figure 2B).

Homology-based modeling also predicted disulfide bond formation, a post-translational modification that allows the proper folding and stabilization of numerous secretory proteins [19]. Two disulfide bonds (Cys196-Cys413 and Cys386-Cys394) are predicted, each showing the spatial proximity required for cysteine residue pairing (Cβ-Cβ distance ≤ 4.5 Å) [20]. In addition to stabilization, this structural feature suggests that the native conformation of EhHAPp49 depends on oxidative folding, a cellular process catalyzed by endoplasmic reticulum-resident foldases, such as protein disulfide isomerases [21].
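The Cβ-Cβ proximity criterion for cysteine pairing can be checked directly on model coordinates. The following is a minimal sketch of that check; the coordinates are hypothetical placeholders (not taken from the EhHAPp49 model), and the names CB_CUTOFF, cb_distance, and can_pair are illustrative.

```python
import math

CB_CUTOFF = 4.5  # Å, Cβ-Cβ distance threshold cited in the text [20]

def cb_distance(a, b):
    """Euclidean distance (Å) between two Cβ atoms given as (x, y, z)."""
    return math.dist(a, b)

def can_pair(a, b, cutoff=CB_CUTOFF):
    """True if two cysteine Cβ atoms satisfy the proximity criterion."""
    return cb_distance(a, b) <= cutoff

# Hypothetical Cβ coordinates (Å), NOT taken from the EhHAPp49 3D-model.
cys_a = (12.0, 8.5, 3.2)
cys_b = (14.1, 10.9, 4.0)
cys_far = (25.0, 8.5, 3.2)

print(round(cb_distance(cys_a, cys_b), 2), can_pair(cys_a, cys_b))
print(round(cb_distance(cys_a, cys_far), 2), can_pair(cys_a, cys_far))
```

In practice, the coordinates would be read from the Cβ atom records of the refined model; a distance check alone does not establish that a bond forms, only that the geometry permits it.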

rEhHAPp49 Is an Active Enzyme
Recombinant production of EhHAPp49 and analysis of hydrolytic activity against three common phosphatase substrates provided the first experimental evidence of its enzymatic function. The EHI_146950/EhHAPp49 sequence served as a template to design the specific primers used to amplify the fragment encoding the mature polypeptide (Asp14-Gln418). After the expected amplicon (1239 bp) was obtained by high-fidelity PCR, a plasmid engineering approach allowed us to clone this product and construct pQEhHAP-Myc22 (Figure 8), which enabled the IPTG-inducible cytosolic expression of rEhHAPp49 as a soluble Myc-tagged protein. Transformed E. coli SHuffle cells harboring pQEhHAP-Myc22 worked as efficient factories for consistent rEhHAPp49 production. Bacterial lysates provided sufficient soluble protein for standard purification by chromatographic procedures (IMAC and gel filtration). The high degree of purity (>95%, as judged by SDS-PAGE analysis) showed the reliability of rEhHAPp49 production.
A quantitative determination of the enzymatic activity against three reference phosphatase substrates (pNPP, phytic acid, and sodium pyrophosphate (Na-PPi)) established the catalytic capabilities of rEhHAPp49 under three pH conditions (Table 1). Surprisingly, the enzyme showed negligible HAP/phytase activity at pH 5.0 (acidic conditions), followed by a complete loss at pH 7.0 and 9.0 (neutral and alkaline conditions). In contrast, it exhibited considerable pyrophosphatase (PPase) activity under acidic conditions, with a significant increase under neutral and alkaline conditions (p < 0.001).

rEhHAPp49 Exhibits Pyrophosphatase Activity
The quantitative determination of PPase activity in rEhHAPp49-assisted reactions with increasing concentrations of Na-PPi (0 to 0.5 mM) under three pH conditions (5.0, 7.0, and 9.0) established the effect of substrate concentration on enzyme kinetics. As expected, increasing the Na-PPi concentration boosted the reaction rate, reaching the plateau at 100 µM and exhibiting a maximum velocity at pH 9.0 (Figure 3). Furthermore, the estimated KM and kcat values confirmed that rEhHAPp49 shows catalytic efficiency under alkaline conditions (Table 2).
A subsequent evaluation of PPase activity in rEhHAPp49-assisted reactions conducted at different pH (2.0-11.0) or temperature (30-76 °C) conditions established the effect of these environmental variables on enzyme activity and stability. As suspected, rEhHAPp49 showed optimal activity at pH 9.0 and ≥50% activity within pH 7.6-10.3 (Figure 4A), implying that the ionization state of the catalytic residues favors enzymatic activity [22]. Furthermore, the high activity (≥90%) retained after 14 h at pH values within the 8.0-11.0 range (Figure 4B) confirmed enzymatic stability under these alkaline conditions. In contrast, the low activity (≤15%) detected under acidic conditions (pH ≤ 5.0) indicated that the protonation of protein residues destabilizes the active site, leading to decreased enzymatic activity [23]. On the other hand, rEhHAPp49 showed optimal activity at 50 °C and suboptimal activity (≥90%) at temperatures within the 44-58 °C range (Figure 4C). Furthermore, the high activity (≥70%) retained after 30 min at temperatures <55 °C confirmed the thermal stability of the enzyme (Figure 4D). In contrast, the loss of activity at higher temperatures suggested irreversible heat-induced inactivation.

The Active-Site Entrance of EhHAPp49, an Apparent Molecular Sieve
The automatic prediction of the active site conformation followed by a computational analysis provided further insights into the structure-function relationship of EhHAPp49. As expected, the 3D-model showed that it folds similarly to functional homologs. Still, the surface representation offered a better perspective of both the substrate-binding pocket and the active-site entrance. While the pocket retained a typical topography, the entrance shaped a narrow gap ( Figure 6A), compared to those of EcAppA ( Figure 6B), the bacterial phytase [24,25]. This observation suggested that a molecular sieving process could be involved in the substrate selectivity shown by rEhHAPp49.

A supplementary analysis by molecular docking simulations supported the latter remark. As suspected, the EhHAPp49 active-site entrance acted as a selective sieve, allowing the passage and stable docking of PPi (binding energy, −6.82 kcal/mol) but not of the other ligands (pNPP and phytic acid). Given this theoretical finding, experimental studies will be crucial to test the suggested hypothesis (i.e., that the active-site entrance plays a structural role in EhHAPp49 substrate selectivity); for instance, site-directed mutagenesis followed by the analysis of enzyme function [26].

EhHAPp49 Is a Non-Canonical Phosphatase
So far, our findings suggest an apparent discrepancy between the observed and expected function for EhHAPp49. In brief, the enzymatic characterization showed that it lacks the HAP/phytase activity predicted by the bioinformatic analysis; instead, it exhibits a PPase-like activity. Due to this feature and the absence of previous reports showing a similar catalytic transition for any other amoebic enzyme involved in phosphate metabolism, we propose that EhHAPp49 represents a non-canonical phosphatase.
Furthermore, supplementary analysis of the substrate-enzyme interactions provided additional insights into the EhHAPp49 active-site structure. Soluble pyrophosphatases (S-PPases) usually require 3-4 metal ions for maximal activity (depending on the pH and enzyme family), while most residues in the active site have supportive roles (as the reactions they catalyze proceed without an enzyme-phosphate intermediate) [28,36,37]. In addition to the already demonstrated dependence on magnesium ions, the hypothetical ability of EhHAPp49 to bind PPi through numerous non-covalent interactions with active-site residues (Arg41, His42, Arg45, Trp48, Arg152, Tyr245, His334, and Asp335; Figure 7), as predicted by molecular docking simulations, suggests that its active-site structure contains the molecular requirements to support PPi binding and achieve consistent PPase activity. However, it remains uncertain whether the associated catalytic reaction involves forming an enzyme-phosphate intermediate, as occurs in typical HP enzymes but not in S-PPases.
As a final thought, and based on its biochemical nature (i.e., secretory protein), we suggest that EhHAPp49 could be involved in specialized extracellular processes required for membrane interactions associated with the E. histolytica phagocytic activity. Previous studies on the protein structure and function of a pseudophosphatase, called Cf60, secreted by the slime mold Dictyostelium discoideum support this hypothesis. Remarkably, Cf60 has structural similarity to HAP/phytase enzymes (branch-2 of the HP superfamily), but it lacks acid phosphatase activity. Nonetheless, as a component of the counting factor (CF, a 450-kDa protein complex secreted by D. discoideum cells), it regulates the size of multicellular structures. Furthermore, functional analysis based on gene disruption suggested that Cf60 is essential for early development [38,39]. Given these observations, cellular studies will be essential to test the suggested hypothesis and determine the functional role of EhHAPp49 in the pathobiology of E. histolytica. Likewise, experimental approaches aimed at gaining insights into the structure and function of bioinformatically identified EhHAPp49 protein homologs will be critical to support the hypothesis of a novel class of phosphatases, likely specific to the phylum Amoebozoa.
Figure 7. Schematic representation (2D) of residues in the substrate-binding site of EhHAPp49 exhibiting interactions with inorganic pyrophosphate. Color codes: hydrogen bonds, green dashes; salt bridges, red dashes/arcs; carbon, black; oxygen, red; nitrogen, blue; phosphorus, pink; ligand bonds, purple; protein bonds, gray.

Materials
Reagents for bacteriological culture media were obtained from Becton, Dickinson and Company (Franklin Lakes, NJ, USA). PCR amplification biochemicals, kits for DNA isolation, and materials for 6xHis-tagged protein purification were obtained from Qiagen (Germantown, MD, USA). Enzymes for standard cloning were obtained from New England Biolabs (Ipswich, MA, USA). Chemicals for protein analysis by SDS-PAGE and immunoblotting were obtained from Bio-Rad Laboratories (Hercules, CA, USA). Unless otherwise specified, all other materials were obtained from Sigma-Aldrich (St. Louis, MO, USA) or GE Healthcare Bio-Sciences (Pittsburgh, PA, USA).
Table 3 shows the Escherichia coli strains, bacterial plasmids, and synthetic primers used throughout this study. E. coli ER2738 was the host strain for plasmid DNA propagation, and E. coli SHuffle was the expression strain for recombinant protein production (i.e., rEhHAPp49). Bacterial cells were cultured in Luria-Bertani (LB) broth at 37 °C, with constant shaking (300 rpm). Selection of stably transformed bacteria was performed by antibiotic resistance, using ampicillin (150 µg/mL) or chloramphenicol (15 µg/mL) as required. The pBluescript SK(−) and pBAD33 plasmids were used for standard molecular cloning [47], while pQE30 was the expression plasmid for the recombinant protein production. Synthetic oligonucleotides (Eurofins Genomics LLC, Louisville, KY, USA) were used as primers for PCR amplification.

EhHAPp49 PCR Amplification
The E. histolytica HM1:IMSS strain was grown to ~90% confluence in TYI-S-33 medium [48]. The gDNA of 2.5 × 10^6 amoebic cells was purified using a QIAamp® DNA Mini Kit (Qiagen). An Expand™ High Fidelity PCR System (Sigma-Aldrich) was used to amplify the amoebic gene fragment coding for the EhHAPp49 mature polypeptide (UniProt C4M8S6: Asp14-Gln418), with EHHAP[F/R] as the gene-specific primer set. The PCR cycling conditions were an initial denaturation step (2 min at 94 °C), followed by 10 cycles of exponential amplification involving 20 s at 94 °C, 20 s at 50 °C, and 90 s at 72 °C, then 25 cycles of exponential amplification involving 20 s at 94 °C, 20 s at 50 °C, and 110 s at 72 °C, and a final elongation step (7 min at 72 °C). Agarose gel electrophoresis was applied to analyze the amplicon (1239 bp). After that, a QIAquick® PCR Purification Kit (Qiagen) was used to purify the PCR product (i.e., EhHAPp49).
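The cycling program described above can be written down as data, which makes the two amplification blocks explicit and gives a quick estimate of the programmed run time. This is only an illustrative sketch: the PROGRAM structure and the hold_time_seconds helper are hypothetical names, and ramping between temperatures is ignored, so the actual run takes longer.

```python
# Cycling program from the text: (number of repeats, [hold times in seconds]).
PROGRAM = [
    (1, [120]),           # initial denaturation: 2 min at 94 °C
    (10, [20, 20, 90]),   # 10 cycles: 94 °C, 50 °C, 72 °C
    (25, [20, 20, 110]),  # 25 cycles with a longer 72 °C step
    (1, [420]),           # final elongation: 7 min at 72 °C
]

def hold_time_seconds(program):
    """Sum of programmed hold times; temperature ramping is not included."""
    return sum(repeats * sum(steps) for repeats, steps in program)

total = hold_time_seconds(PROGRAM)
print(f"{total} s = {total / 60:.1f} min")
```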

Construction of Recombinant Plasmids Harboring EhHAPp49
BamHI/XhoI sticky-end cloning of EhHAPp49 into pBPelB-BHX-Myc (Supplementary Figure S2) was the approach used to construct pBPelB-EhHAP-Myc (Supplementary Figure S3a). Following proper digestion of both insert and vector (and successive purification of restriction products), sticky-end ligation was catalyzed by T4 DNA ligase (New England Biolabs) in a typical reaction mix. After heat shock-induced transformation of chemically competent bacteria and subsequent selection of transformants, performed following standard protocols, the recombinant plasmid was isolated by DNA purification using a QIAprep® Spin Miniprep Kit (Qiagen).
XbaI/HindIII sticky-end cloning of PelB-EhHAPp49-Myc into pBAD33 [49] was the consecutive approach to construct pBAD-PelB-EhHAP-Myc (Supplementary Figure S3b). With M13_RV/H3MYCR as the primer pair and pBPelB-EhHAP-Myc as a template, the insert (1475 bp) was amplified by high-fidelity PCR, using both reaction system and cycling conditions as defined previously. After proper digestion-purification of both insert and vector, the subsequent molecular protocols (i.e., ligation of sticky ends, selection of transformed bacteria, and purification of plasmid DNA) were performed by standard methods, as mentioned above.
Each construct was selected by endonucleolytic analysis, followed by verification of the cloned fragment through DNA sequencing.

Construction of the Recombinant Plasmid Expressing EhHAPp49
BamHI/HindIII sticky-end cloning of EhHAPp49-Myc into pQE30 (Qiagen), a bacterial vector that allows high-level expression of His-tagged recombinant proteins, was the approach used for constructing pQEhHAP-Myc22 (Figure 8). The Expand™ High Fidelity PCR System was used to amplify the insert (1630 bp), with pBAD-PelB-EhHAP-Myc as the template and BAD_[FW/RV] as the primer pair. Molecular cloning, from insert/vector digestion to DNA sequencing, was conducted by standard protocols.
Figure 8. Schematic representation of the pQEhHAP-Myc22 plasmid, which encodes EhHAPp49 as a 6xHis/Myc-tagged cytosolic protein. ColE1 ori (yellow) functions as the autonomous replication sequence, and the encoded β-lactamase (orange) as a selection marker (AmpR). As a transcriptional unit, the promoter and terminator, T5/lacO (blue) and λ t0 (red), regulate the gene expression of rEhHAPp49 through an IPTG-inducible system.
CelLytic® B reagent (Sigma-Aldrich), supplemented as recommended by the supplier (100 U/mL benzonase, 0.2 mg/mL lysozyme, and 1X protease inhibitor cocktail), was used to lyse the bacterial cells. After suspension in this reagent (5 mL), cells were disrupted by sonication (10 cycles: 30 s ON, 30 s OFF) in an ice bath. The protein extraction process was completed by slow shaking for 10 min. Two consecutive centrifugations isolated the soluble fraction: a mid-speed run (9300× g; 15 min; 10 °C) to remove the cell debris and a high-speed run (16,000× g; 15 min; 10 °C) to separate the fine sediment.
The expression product (rEhHAPp49) was purified through immobilized-metal affinity chromatography (IMAC) and desalted using a Sephadex G-25 column. Ni-NTA agarose [...]
Production of rEhHAPp49 was monitored by routinely performing a typical SDS-PAGE analysis [50] (Supplementary Figure S4), and protein concentration was determined using the Bradford colorimetric assay [51].

EhHAPp49 Enzyme Activity Assays
The enzymatic function of rEhHAPp49 was studied using three phosphatase substrates (as reference compounds): p-nitrophenyl phosphate (pNPP), phytic acid, and sodium pyrophosphate (Na-PPi). Standard activity assays were performed under acidic, neutral, and alkaline conditions. A 100 mM solution of the respective buffer, Na-acetate (pH 5.0; acidic) or Tris-HCl (pH 7.0 or 9.0; neutral or alkaline), was used to set each condition. The hydrolytic reactions were allowed to proceed for 60 or 180 min at 37 °C. Reaction mixes (200 µL final) and specific settings for each activity assay (i.e., phosphatase, phytase, and pyrophosphatase) were as follows.

Phosphatase Activity Assay
The enzyme (1 µM rEhHAPp49) and cofactor (5 mM MgSO4) were incubated for 15 min at 37 °C before adding the substrate (10 mM pNPP). After a 60-min period of enzyme-assisted hydrolysis, the reaction was stopped by rapidly mixing with 100 µL of 1.2 N NaOH. After that, the supernatant was isolated through centrifugation at 16,000× g for 1 min. An enzyme-free reaction (i.e., uncatalyzed) was used as a blank to subtract the background. The absorbance (A415) was promptly measured and used to determine the p-nitrophenolate (pNP) concentration with a standard curve. The amount of pNP (µmole) released per minute per µmole of rEhHAPp49 defined the specific phosphatase activity.
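Because the specific activity is defined as µmole of product per minute per µmole of enzyme, it reduces to a ratio of concentrations divided by the reaction time when both concentrations refer to the same volume. The sketch below illustrates the arithmetic; the function name, the dilution_factor parameter, and the example reading are assumptions introduced for illustration, and whether a dilution correction applies depends on how the standard curve was prepared.

```python
def specific_activity(product_uM, enzyme_uM, minutes, dilution_factor=1.0):
    """µmol product released per minute per µmol enzyme (units: min^-1).

    product_uM: product concentration read from the standard curve.
    dilution_factor: corrects for volume added after the reaction (e.g.,
    stop solution); 1.0 means no correction. Assumption: after correction,
    both concentrations refer to the same reaction volume.
    """
    if enzyme_uM <= 0 or minutes <= 0:
        raise ValueError("enzyme concentration and time must be positive")
    return product_uM * dilution_factor / (enzyme_uM * minutes)

# Hypothetical reading: 40 µM pNP measured after adding 100 µL of stop
# solution to a 200 µL reaction (factor 300/200), 1 µM enzyme, 60 min.
print(specific_activity(40.0, 1.0, 60.0, dilution_factor=1.5))
```

The same function applies to the Pi-based assays by passing the Pi concentration as the product.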

Phytase Activity Assay
Before adding the substrate (1.5 mM phytic acid), the enzyme/cofactor mix was prepared and treated as described above. After 180 min of enzyme-assisted hydrolysis, the reaction was immediately cooled in an ice bath and stopped by thoroughly mixing with 100 µL of 6% trichloroacetic acid. The supernatant was then separated by centrifugation at 16,000× g for 15 min (10 °C). A Pi-background reaction, in which the substrate was added immediately after the stopping solution, functioned as a blank. The malachite green (MG) colorimetric assay [52], detailed below, was used to determine the Pi concentration. The amount of Pi (µmole) released per minute per µmole of rEhHAPp49 defined the specific phytase activity.

Pyrophosphatase Activity Assay
Before adding the substrate (0.5 mM Na-PPi), the enzyme/cofactor mix was prepared and treated as previously stated. After a 60-min period of enzyme-assisted hydrolysis, the subsequent steps (from reaction stopping to Pi determination) were as detailed in the phytase assay. The amount of Pi (µmole) released per minute per µmole of rEhHAPp49 defined the specific PPase activity.

MG Colorimetric Assay
The MG color reagent was freshly prepared, in the amount required for the assay, by mixing the following components: 20 parts of 0.13% MG (in 3.1 M H2SO4), 5 parts of 7.8% ammonium molybdate, and 1 part of 5.2% Tween-20. After standing for 30 min at room temperature, the mix was centrifuged at 16,000× g for 10 min to remove the fine sediment. For the assay, 25 µL of MG reagent and 100 µL of supernatant (from the phytase or PPase activity assay) were thoroughly mixed, and color development was allowed for exactly 10 min at room temperature. Absorbance (A650) was immediately measured and used to determine the Pi concentration with a standard curve.
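Reading a concentration from a standard curve amounts to fitting a line to the absorbance of known standards and inverting it for the sample. The following is a minimal sketch assuming a linear response over the working range; the standard concentrations and absorbance values are hypothetical, and fit_line and concentration_from_absorbance are illustrative names.

```python
def fit_line(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def concentration_from_absorbance(absorbance, slope, intercept):
    """Invert the standard curve to get concentration from absorbance."""
    return (absorbance - intercept) / slope

# Hypothetical Pi standards (µM) and their A650 readings.
standards_uM = [0.0, 10.0, 20.0, 40.0, 80.0]
a650 = [0.05, 0.15, 0.25, 0.45, 0.85]

slope, intercept = fit_line(standards_uM, a650)
sample_uM = concentration_from_absorbance(0.35, slope, intercept)
print(round(slope, 4), round(intercept, 4), round(sample_uM, 2))
```

Samples whose absorbance falls outside the range covered by the standards should be diluted and re-measured rather than extrapolated.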

Characterization of the EhHAPp49 PPase Activity
The effect of substrate on enzyme kinetics was assessed by conducting the PPase assay at different concentrations of Na-PPi (0-0.5 mM) under acidic, neutral, and alkaline conditions (established as previously mentioned). Reaction mixes (200 µL final) included 0.1 µM rEhHAPp49 and 5 mM MgSO4. Kinetic parameters, KM and kcat, were determined by fitting the data (i.e., enzyme velocity against substrate concentration) to a nonlinear least-squares regression model using the Michaelis-Menten equation.
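For readers who wish to reproduce this kind of fit, the sketch below estimates Vmax and KM by nonlinear least squares using a plain Gauss-Newton iteration, and derives kcat as Vmax divided by the enzyme concentration. It is a stand-in written in pure Python, not the routine used in this study (GraphPad Prism), and the data are synthetic values generated from assumed parameters (Vmax = 2.0 µM/min, KM = 20 µM), not measurements.

```python
def mm_rate(s, vmax, km):
    """Michaelis-Menten rate: v = Vmax * [S] / (KM + [S])."""
    return vmax * s / (km + s)

def fit_mm(s_values, v_values, max_iter=100, tol=1e-12):
    """Least-squares estimate of (Vmax, KM) by Gauss-Newton iteration."""
    # Points with [S] = 0 do not constrain either parameter, so skip them.
    data = [(s, v) for s, v in zip(s_values, v_values) if s > 0]
    # Starting guesses: largest observed rate, and the [S] whose rate is
    # closest to half of it.
    vmax = max(v for _, v in data)
    km = min(data, key=lambda p: abs(p[1] - vmax / 2))[0]
    for _ in range(max_iter):
        a11 = a12 = a22 = b1 = b2 = 0.0
        for s, v in data:
            denom = km + s
            r = v - vmax * s / denom        # residual
            j1 = s / denom                  # d(rate)/d(Vmax)
            j2 = -vmax * s / denom ** 2     # d(rate)/d(KM)
            a11 += j1 * j1
            a12 += j1 * j2
            a22 += j2 * j2
            b1 += j1 * r
            b2 += j2 * r
        det = a11 * a22 - a12 * a12
        if det == 0.0:
            break
        d_vmax = (b1 * a22 - b2 * a12) / det
        d_km = (a11 * b2 - a12 * b1) / det
        vmax += d_vmax
        km += d_km
        if abs(d_vmax) < tol and abs(d_km) < tol:
            break
    return vmax, km

# Synthetic, noise-free example (NOT experimental data).
substrate_uM = [5.0, 10.0, 25.0, 50.0, 100.0, 250.0, 500.0]
rates = [mm_rate(s, 2.0, 20.0) for s in substrate_uM]

vmax_fit, km_fit = fit_mm(substrate_uM, rates)
enzyme_uM = 0.1  # enzyme concentration stated in the text
kcat_fit = vmax_fit / enzyme_uM  # min^-1 when rates are in µM/min
print(round(vmax_fit, 4), round(km_fit, 4), round(kcat_fit, 2))
```

With noisy real data, an undamped Gauss-Newton iteration can fail to converge from poor starting values; a dedicated fitting package is the safer choice for reported parameters.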
The effect of pH on enzyme activity was evaluated by performing the PPase assay as previously stated but using distinct buffers to establish the pH conditions: Gly-HCl (pH 2.0-3.5), Na-acetate (pH 4.0-5.0), MES-NaOH (pH 5.5-6.5), Tris-HCl (pH 7.0-9.0), and Gly-NaOH (pH 9.5-11.0). The optimum pH for activity was established by fitting the data (i.e., PPase activity against pH) to a dose-response curve using a bell-shaped Gaussian distribution model. The effect of pH on enzyme stability was assessed by incubating rEhHAPp49 under different pH conditions (2.0-11.0) for 14 h (4 °C) before assaying the PPase activity. The pH thresholds for stability were determined by fitting the data (i.e., PPase activity against pH) to a dose-response curve using a sigmoidal distribution model.
The effect of temperature on enzyme activity was assessed by performing the PPase assay in the 30-76 °C range using the reaction settings described above, but under alkaline conditions (pH 9.0). The optimum temperature for activity was determined by fitting the data (i.e., PPase activity against temperature) to a dose-response curve using a bell-shaped Gaussian distribution model. The effect of temperature on enzyme stability was assessed by preincubating rEhHAPp49 under distinct thermal conditions (30-76 °C) for 30 min (pH 9.0) before assaying PPase activity. The temperature thresholds for stability were determined using the method described earlier. A MultiGene™ Thermal Cycler (Labnet International, Inc.; Edison, NJ, USA) was used to control the temperature.
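The bell-shaped fits used to locate the optimum pH and temperature can be approximated with the sketch below. It is a simplified stand-in for a nonlinear Gaussian fit, not the procedure used in the study: the logarithm of a Gaussian is a parabola, so the sketch fits a quadratic to ln(activity) by linear least squares and converts the coefficients back to amplitude, optimum, and width. This weights the data differently from a direct nonlinear fit and requires strictly positive activities. All names and values are hypothetical.

```python
import math


def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def _solve3(m, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = _det3(m)
    solution = []
    for col in range(3):
        mk = [row[:] for row in m]
        for row in range(3):
            mk[row][col] = rhs[row]
        solution.append(_det3(mk) / d)
    return solution


def fit_gaussian_log_quadratic(x_vals, y_vals):
    """Estimate (amplitude, optimum, sd) of y = A*exp(-0.5*((x-mu)/sd)**2)."""
    pairs = [(x, y) for x, y in zip(x_vals, y_vals) if y > 0]
    if len(pairs) < 3:
        raise ValueError("need at least three points with positive activity")
    x0 = sum(x for x, _ in pairs) / len(pairs)      # center x for stability
    xs = [x - x0 for x, _ in pairs]
    logs = [math.log(y) for _, y in pairs]
    power = [sum(x ** k for x in xs) for k in range(5)]
    rhs = [sum(l * x ** k for x, l in zip(xs, logs)) for k in range(3)]
    normal = [[power[i + j] for j in range(3)] for i in range(3)]
    c0, c1, c2 = _solve3(normal, rhs)
    if c2 >= 0:
        raise ValueError("data are not bell-shaped (no interior maximum)")
    optimum = x0 - c1 / (2.0 * c2)
    sd = math.sqrt(-1.0 / (2.0 * c2))
    amplitude = math.exp(c0 - c1 ** 2 / (4.0 * c2))
    return amplitude, optimum, sd


# Synthetic activity profile across pH 7.0-11.0 (hypothetical values).
ph_values = [7.0 + 0.5 * i for i in range(9)]
activity = [100.0 * math.exp(-0.5 * ((ph - 9.2) / 0.9) ** 2) for ph in ph_values]

amp, opt, width = fit_gaussian_log_quadratic(ph_values, activity)
print(f"amplitude={amp:.1f}, optimum pH={opt:.2f}, sd={width:.2f}")
```

The same function applies to the temperature profile by passing temperatures in place of pH values.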
The effect of magnesium ions on PPase activity was evaluated by conducting the enzymatic reaction under optimal conditions with increasing concentrations of MgSO4 (0-5 mM). The EC50 value was established by fitting the data (i.e., PPase activity against the log10 value of MgSO4 concentration) to a dose-response curve using a sigmoidal distribution model.

Data Analysis
All data represent the mean (±standard error) of three independent experiments. One-way analysis of variance (ANOVA) with a post-hoc Tukey test was used for multiple comparisons. GraphPad Prism® 4.0 for Windows (San Diego, CA, USA) was the computational package used for all statistical analyses.
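For orientation, the summary statistics named above can be sketched in plain Python. The example computes the mean and standard error for one group and the F statistic of a one-way ANOVA; it does not compute p-values or the Tukey post-hoc test, which need distribution functions outside the Python standard library. The function names and triplicate values are hypothetical.

```python
import math


def mean_and_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicates")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical triplicate activities under three conditions.
conditions = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
for values in conditions:
    m, sem = mean_and_sem(values)
    print(f"mean={m:.2f} ± {sem:.2f}")
print("F, df_between, df_within:", one_way_anova_f(conditions))
```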

In Silico Analysis of the EhHAPp49 Ligand-Binding Site
IntFOLD (https://www.reading.ac.uk/bioinf/IntFOLD/, accessed on 1 October 2020), an integrated interface for protein structure and function prediction [53,54], was used to model the EhHAPp49 active site and estimate the presumed protein-ligand interactions. FunFOLD was used to predict the ligand-binding site residues, using ligand-containing structures (from the PDB database) as templates and the default settings for homology-based modeling [55,56], whereas FunFOLDQA was used to assess the 3D-model accuracy [57]. The top-ranked model of the EhHAPp49 active site (saved as a PDB file) was used as the receptor for molecular docking simulations with ArgusLab (Planaria Software LLC; Seattle, WA, USA) [58]. Ligands (pNPP, phytic acid, and PPi) were simulated based on 3D structures (SDF files) retrieved from PDB (https://www.rcsb.org/search/advanced, accessed on 5 October 2020) and converted to Mol files with OpenBabel 3.0 [59]. ArgusDock was used as the shape-based algorithm for flexible ligand docking with default settings for geometry optimizations and energy calculations.
PyMOL (Schrödinger, LLC; New York, NY, USA) and UCSF Chimera were the interactive molecular graphics systems used for structural analysis, whereas LigPlot+ [60] and PLIP [61] were visualization tools used to analyze the protein ligand-binding site.

Conclusions
Here, we studied the structural and biochemical features of the amoebic enzyme EhHAPp49 using a computational-experimental approach. Bioinformatic analyses of genomic and proteomic data confirmed its similarity to HAP/phytases (branch-2 of the HP superfamily). We engineered a bacterial plasmid for cytosolic expression of the recombinant enzyme (rEhHAPp49) after induction with IPTG. A standard protocol for protein production in E. coli cells yielded a suitable amount of soluble, active rEhHAPp49 for the biochemical characterization. Based on the enzymatic characterization and supported by supplementary in silico studies, we determined that EhHAPp49 is a non-canonical phosphatase that exhibits non-typical PPase activity. Overall, our findings provide additional knowledge about the structure and function of EhHAPp49 and offer a basis for future research. For instance, the preference for other organic or linear inorganic polyphosphate substrates, such as thiamine pyrophosphate, adenosine (di/tri)phosphate, or tripolyphosphate, should be assessed to establish the precise protein function (i.e., enzyme specificity). Furthermore, as a closing thought, it is reasonable to hypothesize that EhHAPp49 could belong to a novel class of phosphatases, likely specific to the phylum Amoebozoa.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author, without undue reservation, to any qualified researcher.