The Structural Basis of the Binding of Various Aminopolycarboxylates by the Periplasmic EDTA-Binding Protein EppA from Chelativorans sp. BNC1

The widespread use of synthetic aminopolycarboxylates, such as ethylenediaminetetraacetate (EDTA), as chelating agents has led to their contamination in the environment as stable metal–chelate complexes. Microorganisms can transport free EDTA, but not metal–EDTA complexes, into cells for metabolism. An ABC-type transporter for free EDTA uptake in Chelativorans sp. BNC1 was investigated to understand the mechanism of the ligand selectivity. We solved the X-ray crystal structure of the periplasmic EDTA-binding protein (EppA) and analyzed its structure–function relations through isothermal titration calorimetry, site-directed mutagenesis, molecular docking, and quantum chemical analysis. EppA had high affinities for EDTA and other aminopolycarboxylates, which agrees with structural analysis, showing that its binding pocket could accommodate free aminopolycarboxylates. Further, key amino acid residues involved in the binding were identified. Our results suggest that EppA is a general binding protein for the uptake of free aminopolycarboxylates. This finding suggests that bacterial cells import free aminopolycarboxylates, explaining why stable metal–chelate complexes are resistant to degradation, as they are not transported into the cells for degradation.


Introduction
Aminopolycarboxylate chelators (APCs) are amine-containing polycarboxylic acids that are used as metal chelators. Ethylenediaminetetraacetic acid, better known as EDTA, is the most widely used chelator in science, industry, medicine, and consumer goods due to its ability to chelate metals to form stable, water-soluble metal-chelate complexes [1,2]. The stability of metal-EDTA complexes leads to EDTA's persistence and accumulation in the environment, making it a significant anthropogenic pollutant [3][4][5]. Concerns about EDTA's potential to mobilize heavy metals, and radionuclides in particular, have led many countries to regulate its use [2,6,7]. Besides EDTA, similar APCs with more specialized applications exist. Both 1,2-bis(2-aminophenoxy)ethane-N,N,N′,N′-tetraacetic acid (BAPTA) and ethylene glycol-bis(β-aminoethylether)-N,N,N′,N′-tetraacetic acid (EGTA) are used as selective calcium chelators; ethylenediamine-N,N′-bis(2-hydroxyphenylacetic acid) (EDDHA) is used to increase the bioavailability of iron for plant fertilization; diethylenetriaminepentaacetic acid (DTPA) is used in numerous applications, such as a contrast agent in magnetic resonance imaging; N-(2-hydroxyethyl)ethylenediamine-N,N′,N′-triacetic acid (HEDTA) is used as an iron-based herbicide; and nitrilotriacetate (NTA) is often used in laundry detergents. Their release into water supplies may also contribute to persistent APC pollution. Environmentally friendlier alternatives like the naturally occurring EDDS ((S,S)-ethylenediamine-N,N′-disuccinic acid) exist [8], and others are under development [9], but the effectiveness and affordability of EDTA have so far ensured its continued use.
One promising method for removing environmental EDTA contamination is bioremediation. For this purpose, multiple bacterial species have been identified that can subsist on EDTA as a sole source of carbon, nitrogen, and energy [10][11][12][13][14]. Being related, the EDTA-degrading bacteria were assigned to the novel genus Chelativorans [15]. In Chelativorans species, the genes encoding an ATP-binding cassette (ABC)-type transporter system for EDTA uptake and EDTA-degrading enzymes are co-located in a single operon [16]. Chelativorans sp. BNC1 (formerly Mesorhizobium sp. BNC1), isolated from industrial sewage [17], uses the ABC transport system to take up EDTA. Inside the cell, EDTA is catabolized [18]. In the first step, an FMNH₂-dependent EDTA monooxygenase, EmoA, together with its partner NADH:FMN oxidoreductase, EmoB, oxidizes EDTA to ethylenediamine-N,N′-diacetate (EDDA) [16,19,20]. Next, iminodiacetate oxidase (IdaA) oxidizes EDDA to ethylenediamine (ED) [21,22]. EmoA and EmoB also oxidize nitrilotriacetate (NTA) to iminodiacetate, and IdaA oxidizes the latter to glycine [16,21]. Each oxidative cleavage produces a glyoxylate molecule, which is used as a carbon source, and the ethylenediamine can be used as a nitrogen source [15]. The EDTA transporter system is composed of a periplasmic binding protein, EppA, and a type I ABC-type importer consisting of a heterodimer of its transmembrane domain components, EppB and EppC, and a dimer of its nucleotide-binding domain component, EppD. By sequence comparison, EppA belongs to the periplasmic binding protein PBP2 NikA/DppA/OppA-like superfamily, which is a family in the Class II Cluster C PBPs [18]. Class II Cluster C PBPs contain two large polypeptide lobes connected via flexible tethers, allowing them to undergo a large and reversible conformational change, known as the "Venus fly-trap" model, in which ligand binding in the cleft between the two lobes induces a closed conformation [23,24].
EppA binds free EDTA, but not metal-EDTA complexes, which restricts Chelativorans sp. BNC1 to using only weak metal-EDTA complexes that can dissociate to free EDTA [18]. It is therefore imperative to determine the biophysical mechanism of EppA's binding specificity before any improvements to its binding capabilities can be engineered. We have been delineating the underlying substrate specificity, catalytic mechanism, and molecular interactions of the key metabolic enzymes (EmoA [20], EmoB [19], and IdaA [22]) in the EDTA-degradation pathway of Chelativorans sp. [16]. To understand the first step of EDTA catabolism by Chelativorans sp. BNC1 and how it may act as a gatekeeper for all enzymes downstream, here we report structural characterization of EppA and thermodynamic characterization of its binding of EDTA and other APCs.

Structure of EppA
EppA crystallized without its ligand EDTA in the tetragonal space group P4₃2₁2 with one molecule in the asymmetric unit (Figure 1a). Soaking the crystals with EDTA lowered their symmetry from P4₃2₁2 to P2₁2₁2₁, thereby doubling the molecules in the asymmetric unit to two (Figure 1b); however, no EDTA molecules could be placed unambiguously into the orthorhombic structure after refining the model with EDTA in the ligand-binding cleft. Instead, two sulfate ions from the ammonium sulfate in the crystallization solution were present in the cleft of EppA. For both structures, all but the first four N-terminal residues (Gln27 through Leu30) of the mature protein (Gln27 through Glu563) were visible, and the C-terminal His-tag from pET30a(+) was visible through its first three His residues.

Figure 1.
Arrow and rectangles are helices and β-strands, respectively. Domain I is aquamarine, Domain IIa is blue, Domain IIb is red, and Domain III is tan. The yellow and green strands, which belong to Domain IIa and IIb, respectively, form the hinge. (b) Structure of EppA in the P2₁2₁2₁ space group. The dashed line shows the intermolecular interface between the two monomers, which are related by non-crystallographic 2-fold symmetry. (c) Topology diagram of EppA. (d) EppA's electrostatic potential surface on a scale of −12.5 kT/e (red) to +12.5 kT/e (blue), with white at 0 kT/e. The two sulfates in the ligand-binding site are shown as red (oxygen) and yellow (sulfur) ball-and-stick models.
Based on the relatively weak (e.g., no large, hydrophobic intermolecular interfaces) crystallographic and non-crystallographic protein-protein interactions in both the P4₃2₁2 and P2₁2₁2₁ crystal lattices, EppA is monomeric, which we confirmed by calculating a molecular mass of 63 kDa for the single chromatographic peak observed by analytical size-exclusion chromatography (Figure 2).

Figure 2.
Size-exclusion chromatography-multiangle light scattering profile of EppA. The blue trace is the differential refractive index and the red markers are the calculated molar mass (kDa) at that given elution volume. EppA was monomeric, with an average mass of 63 kDa.
The tertiary structure of EppA was bilobate and composed of three domains, denoted here by the convention used for periplasmic binding proteins as Domains I, II, and III. Domain I (Gln27 through Ile256) and Domain III (Ser303 through Leu530) were contiguous domains, whereas Domain II, the smallest of the three, was split into two non-contiguous parts: Domain IIa (Ile257 through Pro302) and Domain IIb (Pro531 to the C-terminus at Glu563). The last eight residues of Domain IIa (Gly295 to Pro302) and first eight residues of Domain IIb (Pro531 to Glu538) each formed a β-strand in an anti-parallel fashion, making a two-strand hinge that links together EppA's two lobes (Figure 1a). This two-strand hinge, along with the β₂β₁β₃βₙβ₄ structure of Domain III's core β-sheet (Figure 1b), classified EppA as a Class II Cluster C periplasmic binding protein [25,26]. Domain I had a structural motif similar to Domain III, consisting of two β-sheets surrounded by loops, α-helices, and a smaller two-strand β-sheet between the first and second sheet (Figure 1b). Translation-libration-screw (TLS) [27] analysis of EppA's structure supported the domain assignments by showing that a two-group partition contained Domains I and IIb for the first group and Domains III and IIb for the second group. Increasing the number of TLS groups partitioned Domain III/IIb further and eventually began to partition Domain IIa.
Through comparison to other Class II Cluster C PBPs, and with support from molecular docking of EDTA (Section 2.3), the interface of Domains I and III was established as the putative ligand-binding cleft, with one end of the cleft capped by Domain II. In its observed state, this binding cleft had a volume of 688 Å³ and was solvent accessible, as shown by the presence of 35 water molecules occupying the extended binding cleft. The cleft was hydrophilic and electrostatically positive (Figure 1d), being lined by the sidechains of Thr55, Arg56, Asn69, Asn70, Ala71, Val72, Arg74, Asn152, and Tyr155 from Domain I; Gln278 and Gln549 from Domain IIa and IIb, respectively; and Tyr415, Asn459, Tyr460, Phe461, Ser462, Gln464, Lys470, Arg480, Gln481, and Tyr483 from Domain III. Domain I and Domain III each contributed two of the four cationic sidechains, whose total 4+ formal charge counterbalances that of EDTA⁴⁻. Highlighting the counterbalancing charges, electron densities for two sulfate ions were clearly observed to be bound electrostatically in the ligand-binding cleft by Arg56 (water-mediated), Arg74, Lys470, and Arg480. Arg143 bound a third sulfate ion near the ligand-binding cleft. Of the four cationic residues identified by molecular docking, Arg480 was the only residue situated outside of the binding cleft, within a short loop (Arg477 through Gln481). The loop had two conformers: Conformer A had a refined occupancy of 60% and was held in place outside of the binding cleft by Asp447, and Conformer B had a refined occupancy of 40% and was oriented toward the binding cleft.
The EppA structure, depicted via its atomic displacement putty radius (Figure 3a), has three important regions with high isotropic atomic displacements: Domain IIa and two sections of Domain III. One section of Domain III is proximal to Domain IIa; the second is distal to Domain IIb but connected to the proximal section by helix-15. Notable in Domain IIa was helix-8, a long 3₁₀ helix (Gln282 through Leu294), which was capped on both ends by glycine residues (Gly279 and Gly295) and interacted with the hinge via van der Waals forces (by inspection). Importantly, a short loop before helix-8 had two residues, Asn278 and Gly279, which showed alternate conformers in the electron density map. Asn278 had two alternate sidechain conformers, one with its sidechain pointing out of the ligand-binding site, and the second pointing into the ligand-binding site within hydrogen bonding distance of Tyr415, which itself was held in place by two proline residues (Pro412 and Pro413) and Tyr460. The φ and ψ angles of the peptide bond linking Asn278 and Gly279 rotated by 179° and 50°, respectively, between the two conformers (Figure 3b). This suggests that the sidechain conformation of Asn278 is dependent on ligand binding, and by switching conformers, it flips its peptide bond with Gly279, thereby rotating helix-8 in a mechanism analogous to that in NikA from Escherichia coli (PDB IDs 1UIV and 1UIU) (Figure 3c) [28].

Isothermal Titration Calorimetry
It was previously reported that EppA binds free EDTA [18]; EppA's ligand-binding abilities were investigated further by using isothermal titration calorimetry (ITC) to analyze a wider range of metal chelates (Figure 4). Results showed that EppA binds free EDTA and EGTA, with dissociation constants (Kd) of 9.52 nM and 169 nM, respectively (Table 1). Both ligands bound exothermically and with an increase in entropy, implying that ligand binding and/or closure of the ligand-binding site upon ligand binding is accompanied by solvent/solute release. Binding of EDTA by EppA was more favorable than the binding of EGTA entropically (29 vs. 16.40 cal mol⁻¹ K⁻¹), but less favorable enthalpically (−2.283 vs. −4.340 kcal mol⁻¹). EppA also bound EDDA, albeit weakly (Kd = 0.588 μM), but it did not bind decameric polyaspartate (Asp10) (Table 1), suggesting that it did not bind anionic oligopeptides.
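The internal consistency of the reported values can be checked with the standard relations ΔG = RT ln Kd and ΔG = ΔH − TΔS. The sketch below is illustrative only: the temperature of 298.15 K is an assumption (the measurement temperature is not stated in this section), and the function and variable names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal mol^-1 K^-1
T_ASSUMED = 298.15     # ASSUMPTION: temperature in K, not stated in the text


def dg_from_kd(kd_molar, temp_k=T_ASSUMED):
    """Binding free energy (kcal/mol) from a dissociation constant in M."""
    return R_KCAL * temp_k * math.log(kd_molar)


def dg_from_dh_ds(dh_kcal, ds_cal, temp_k=T_ASSUMED):
    """Binding free energy (kcal/mol) from dH (kcal/mol) and dS (cal/mol/K)."""
    return dh_kcal - temp_k * ds_cal / 1000.0


# Values reported in the text: Kd (M), dH (kcal/mol), dS (cal/mol/K)
ligands = {
    "EDTA": (9.52e-9, -2.283, 29.0),
    "EGTA": (169e-9, -4.340, 16.40),
}

for name, (kd, dh, ds) in ligands.items():
    print(f"{name}: dG from Kd = {dg_from_kd(kd):.2f} kcal/mol, "
          f"dG from dH - T*dS = {dg_from_dh_ds(dh, ds):.2f} kcal/mol")
```

Under the assumed temperature, the two routes to ΔG agree closely for both ligands, and the entropic term −TΔS accounts for the larger share of ΔG for EDTA.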


Molecular Docking
To find the location and conformation of EDTA, EDDA, and EGTA when bound by EppA, and to determine whether EppA bound other aminopolycarboxylates, the ligands were docked into EppA's structure by using AutoDock Vina [30]. To measure the significance of Arg480's two alternate conformers on ligand binding, docking was performed for both alternate conformers. The best docked pose of each ligand was chosen with similarity to EDTA and charge neutralization criteria in mind, along with the free energy of binding (∆Gb) rankings for both conformers of Arg480 (∆Gb,A and ∆Gb,B) (Table 2). In most cases, the binding was more favorable with the B conformer.
As expected, most of the top EDTA conformers bound in the putative ligand-binding cleft. Of EDTA's four carboxylate arms, two carboxylates were docked at the same position as the two sulfate ions whose total −4 charge neutralizes the four cationic residues (Arg56, Arg74, Lys470, and Arg480) in EppA's ligand-binding cleft (Figure 5a), giving weight to the docking results. In particular, when docked into EppA with Arg480 in Conformer A, the carboxyl groups on one end of EDTA were positioned near Arg56 and Arg74, and on the other end, one carboxyl group was positioned near Lys470, while the fourth arm was free, giving a ∆Gb of −6.4 kcal mol⁻¹. When docked into EppA with Arg480 in Conformer B (Figure 5b), each of the carboxyl arms was positioned near one of the four cationic residues, which increased the magnitude of ∆Gb to −6.7 kcal mol⁻¹, indicating that Conformer B of Arg480 is important for binding (Figure 5c). Accompanying the ionic binding in both cases, Asn69's Nδ and Tyr460's phenol were within hydrogen bonding distance of one of EDTA's carboxylate arms, and the backbone amide of Ser462 was within hydrogen bonding distance of another carboxylate arm.
EGTA, which is larger than EDTA but likewise has four carboxyl groups and was shown by ITC to bind to EppA, docked with a ∆Gb of −6.0 kcal mol⁻¹ for Conformer A and −6.1 kcal mol⁻¹ for Conformer B (Figure 5d). Unlike EDTA, EGTA's best conformer interacted ionically with only Lys470 and Arg480 due to steric clashing from EGTA's larger size. Other stabilizing interactions were hydrogen bonding with Asn69 and Gln481.
NTA, because of its small size and having only three carboxylates, interacted with only two of EppA's four cationic sidechains at once, either Arg56 and Arg74 (∆Gb of −5.2 kcal mol⁻¹) or Arg74 and Lys470 (∆Gb of −5.0 kcal mol⁻¹), when Arg480 was in Conformer A. When docked into EppA with Arg480 in Conformer B (Figure 5e), NTA interacted with Arg74, Lys470, and Arg480, strengthening binding to a ∆Gb of −6.5 kcal mol⁻¹. EDDA, despite being longer than NTA, could only interact with two of the four cationic residues due to having only two carboxylate arms (Figure 5e), binding with a weak ∆Gb of −5.2 kcal mol⁻¹ for Conformer A and an even weaker −4.9 kcal mol⁻¹ for Conformer B, making it unlikely that EDDA can close the ligand-binding site. EDDS docked similarly to EDTA (Figure 5e) with a ∆Gb of −6.1 kcal mol⁻¹ for Conformer A of Arg480 and −6.7 kcal mol⁻¹ for Conformer B. HEDTA, similar to EDTA in size but having one carboxylate replaced by a hydroxymethyl group, interacted with Arg56, Arg74, and Lys470 with a ∆Gb of −6.1 kcal mol⁻¹ when Arg480 was in Conformer A (Figure 5e). When Arg480 was in Conformer B, HEDTA's three carboxylates were positioned near either Arg74, Lys470, and Arg480 or Arg56, Arg74, and Lys470 with the same ∆Gb of −6.1 kcal mol⁻¹.
BAPTA, the largest chelator analyzed here, was able to interact with all four cationic residues (Figure 5f) with a ∆Gb of −6.9 kcal mol⁻¹ for both of Arg480's conformeric states. DTPA, having five carboxylates and being larger in size than EGTA, bound with a ∆Gb of −6.6 kcal mol⁻¹ for Conformer A of Arg480 and −6.7 kcal mol⁻¹ for Conformer B (Figure 5f). EDDHA, similar to EDTA but larger and less flexible because one carboxylate on each end is replaced by a phenolate, interacted with Arg74 through a carboxylate on one end and Lys470 and Arg480 with the carboxylate on its other end in both of Arg480's conformeric states (Figure 5f) with the same ∆Gb of −7.0 kcal mol⁻¹. The phenolate closest to Lys470 and Arg480 was within hydrogen bonding distance to the backbone amide of Ser462, while the second phenolate was pointed toward solvent.
Despite the more favorable ∆Gb of some of the larger chelators, their larger size may inhibit closure of the ligand-binding site. Since molecular docking uses a relatively low level of theory, does not take into account the entropic effects of solvent/solute displacement by ligand binding, and does not accurately model the closure of the ligand-binding cleft, the ∆Gb rankings of the chelators were approximations of ligand binding that neglected the full binding mechanism.
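To give a sense of how approximate the docking energies are, a docked ∆Gb can be converted into the dissociation constant it would imply through Kd = exp(∆G/RT) and set beside the Kd measured by ITC. This is a rough sketch rather than part of the reported analysis: the temperature of 298.15 K is an assumption, and the helper name `kd_from_dg` is hypothetical.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal mol^-1 K^-1
T_ASSUMED = 298.15     # ASSUMPTION: temperature in K, not stated in the text


def kd_from_dg(dg_kcal, temp_k=T_ASSUMED):
    """Dissociation constant (M) implied by a binding free energy in kcal/mol."""
    return math.exp(dg_kcal / (R_KCAL * temp_k))


# Docked free energies for EDTA reported in the text (kcal/mol)
edta_docking = {"Conformer A": -6.4, "Conformer B": -6.7}
KD_ITC_EDTA = 9.52e-9  # Kd measured by ITC for EDTA (M), from the text

for conformer, dg in edta_docking.items():
    kd = kd_from_dg(dg)
    print(f"EDTA, Arg480 {conformer}: implied Kd = {kd:.2e} M "
          f"({kd / KD_ITC_EDTA:.0f}-fold weaker than the ITC value)")
```

Under these assumptions, the docked energies correspond to micromolar rather than nanomolar affinities, which is in line with the caution above that the ∆Gb values are best read as relative rankings.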

The Electrostatic Potentials of the Ligand-Binding Pocket
Electrostatic potential surfaces generated for EppA by the classical Adaptive Poisson-Boltzmann Solver (APBS) method [31] show that EppA's ligand-binding pocket has a relatively positive electrostatic potential throughout, making it difficult to infer a specific, electrostatics-dependent binding mechanism. An alternative in this situation is an electronic structure approach to electrostatic potential surfaces, which shows that, on a quantum chemical level, EppA's ligand-binding pocket is relatively neutral but spotted with positive electrostatic potential at Arg56, Arg74, Lys470, and Arg480 (Figure 6a). All the APC ligands in their deprotonated forms had negative electrostatic potential surfaces (Figure 6b-g). The configuration of Arg56, Arg74, Lys470, and Arg480 binding the extended form of EDTA in our docking simulations explains why EppA only binds free EDTA.

Structural Homologs of EppA and Evolutionary Conservation
To identify homologs of EppA and correlate their sequences with known ligand specificities, the amino acid sequence of EppA was used to perform a similarity search against the deposited crystal structures in the Protein Data Bank (PDB) using position-specific iterative BLAST (PSI-BLAST) [32]. EppA showed low sequence identities to other PBPs (Table 4), and it did not align well with them (Figure S3). Among the ten highest identities, the highest was only 22.7% for LpqW (PDB: 2GRV) from Mycobacterium smegmatis, while the lowest was only 17.9% for a chitin oligosaccharide binding protein (CosBP) (PDB: 1ZTY) from Vibrio cholerae. Sequence similarities were higher than identities (38.8% for LpqW and 33.3% for CosBP), but were relatively flat at ~33% for most periplasmic binding proteins (Table 4). The structural homology by pairwise secondary structure superposition of the top 10 PSI-BLAST results with EppA is given in Table S1. Ranking homology by DALI [33] Z-scores (Table 5) showed that the most similar 3D structures were NikZ from Campylobacter jejuni (4OET) and AppA from Bacillus subtilis (1XOC), both having Z-scores of 34.4 (Table 4). Comparing the Z-scores to their respective identity and similarity from PSI-BLAST (Figure S4) again shows little correlation between sequence homology and 3D structure. The disconnect could be in part due to the difference between a given PBP's open and closed forms. The low sequence identity but relatively high similarity in 3D structures suggests a common evolutionary history, as they all belong to the Class II Cluster C PBP NikA/DppA/OppA-like superfamily. The structural homology by pairwise secondary structure superposition of the top 10 DALI results with EppA is given in Table S2.
Of the four cationic residues identified by molecular docking as being important for ligand binding, Arg56 was identified by ConSurf [34][35][36][37][38] as being a highly conserved residue with a normalized conservation score of −1.019 and a conservation binning of 8, showing preference for only Arg or Lys in homologous structures. Arg74, Lys470, and Arg480 were highly variable, having conservation scores of 1.255, 0.861, and 0.917 and conservation binning of 1, 2, and 2, respectively (Table S3). At their equivalent positions in homologous structures, Arg74, Lys470, and Arg480 preferred small residues (Ala, Gly, Ser), amides (Asn and Gln), and charged residues (Asp, Glu, Arg, and Lys). These analyses indicated that EppA's ability to bind EDTA has evolved primarily through adopting Arg74, Lys470, and Arg480, while Arg56 was likely adventitious.

Discussion
Based on its three-domain, bilobate structure, the β₂β₁β₃βₙβ₄ configuration of the core β-sheet of Domain III, and the two-β-strand hinge, EppA is a Class II Cluster C periplasmic binding protein [25,39]. For Clusters A, B, D, E, and F of PBPs, Domains I and III are relatively symmetric in size and shape, whereas the two domains of Cluster C PBPs such as OppA, NikA, and EppA are significantly asymmetric (Figure 7). This asymmetry among the Cluster C PBPs is possibly reflected in the heterodimeric composition of their ABC transporter's transmembrane region, consisting of two different proteins (EppB and EppC), while the ABC importers associated with other PBP clusters are often homodimeric transmembrane proteins [40].
Due to the high ammonium sulfate concentrations (1 to 2 M) in all of the conditions in which EppA crystallized, sulfate ions occupied EppA's binding cleft, preventing formation of an EppA-EDTA complex. When 100 mM EDTA was added to the crystals, it caused the crystals to melt. To determine EppA's EDTA-binding mechanism, we resorted to molecular docking. Molecular docking suggests that EppA binds its ligands by salt-bridging interactions between the ligands and the cationic sidechains of Arg56, Arg74, Lys470, and Arg480. The site-directed mutations R56A, R74A, K470A, and R480A in EppA resulted in a 17-, 26-, 3-, and 4-fold reduction in EDTA binding, respectively (Table 3), suggesting that Arg56 and Arg74 are more critical in substrate binding than Lys470 and Arg480. Further, the electronic structure analysis supports that EppA's ligand-binding pocket is relatively neutral, with positively charged Arg56, Arg74, Lys470, and Arg480 that directly interact with the negatively charged carboxylic groups of EDTA (Figure 6a).
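The fold reductions for the four alanine mutants can be put on an energy scale with ∆∆G = RT ln(fold change). The sketch below is only an illustration under two assumptions that are not stated in the text: that each "fold reduction in EDTA binding" corresponds to the same fold increase in Kd, and that the temperature is 298.15 K. The function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal mol^-1 K^-1
T_ASSUMED = 298.15     # ASSUMPTION: temperature in K, not stated in the text


def ddg_from_fold(fold_change, temp_k=T_ASSUMED):
    """Free energy penalty (kcal/mol) for a given fold increase in Kd."""
    return R_KCAL * temp_k * math.log(fold_change)


# Fold reductions in EDTA binding reported in the text for each mutant.
# ASSUMPTION: each value is treated as the fold increase in Kd.
mutants = {"R56A": 17, "R74A": 26, "K470A": 3, "R480A": 4}

for mutant, fold in mutants.items():
    print(f"{mutant}: {fold}-fold weaker, ddG ~ {ddg_from_fold(fold):.2f} kcal/mol")
```

On this reading, the R56A and R74A substitutions each cost well over 1 kcal mol⁻¹, whereas K470A and R480A cost less than 1 kcal mol⁻¹, matching the ranking drawn from the fold changes.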
The short loop (Arg477 through Gln481) with Arg480 outside of the binding cleft may function as a gate. Conformer A positioned Arg480's guanidinium sidechain too far (~6 Å) from the docked EDTA, whereas Conformer B positioned Arg480's sidechain within a reasonable distance for binding (~3 Å) (Figure 5c). The apparent conformational flexibility of the loop may allow Arg480 to serve as an actuator for closing the ligand-binding cleft, as Arg480 in Conformer B appeared to interact with one of the bound sulfates. Since the cationic sidechains of Arg56 and Arg74 from Domain I and of Lys470 from Domain III lie closely within the binding cleft, EDTA likely binds to them first, followed by Arg480 in Domain III when the short loop is in Conformer B, an event that would close the binding site in a manner consistent with the Venus flytrap mechanism, as observed in other periplasmic binding proteins [23,24].
Strong metal-EDTA chelates cannot bind to EppA for two reasons. First, metal-EDTA complexes are spherical, compact molecules that cannot span the gap between Arg56, Arg74, Lys470, and Arg480 in the binding pocket of EppA. Docking simulations show the best docked MgEDTA reaching only 3.67 Å from Lys470, with the three other residues even farther away (~5 Å) (Figure S5). Second, there are no apparent means by which EppA can interact with a metal-EDTA complex's metal center. NikA is the binding protein required for the uptake of Ni²⁺, and it binds FeEDTA(H₂O), suggesting that a natural metallophore is required to complex with Ni²⁺ before its uptake. When binding FeEDTA(H₂O), NikA uses its Arg97 and Arg137 sidechains to bind one carboxylate each with reasonable intermolecular distances (~2.8 Å), and it interacts with the metal center by means of a π-cation interaction with Trp398 of Domain III [41,42]. By structural superposition, the π-cation interaction appears to be responsible in part for the closure of NikA's ligand-binding site. No Trp sidechains are within EppA's binding site, making the existence of a homologous π-cation interaction in EppA unlikely, thereby explaining why it cannot bind strong metal-EDTA complexes.
In summary, EppA's affinity for EDTA, EDDA, and EGTA, together with its putative affinity for other APC chelators as indicated by molecular docking, suggests that it is a general binding protein for aminopolycarboxylates. We speculate that EppA's original function might have been to bind naturally occurring aminopolycarboxylates, such as ethylenediaminedisuccinate [43], considering that EDTA was first synthesized only in 1935 [44]. Binding of ligands by PBPs is a prerequisite for their import to the bacterial cytoplasm by the PBP's cognate ABC transporter. Of the aminopolycarboxylates, DTPA and NTA are also substrates for EmoA [45], and EDDS is a substrate of both the bacterium BNC1 and a related EDTA-degrading bacterium Chelativorans multitrophicus DSM 9103 [8,15]; therefore, EppA may participate in transporting EDTA as well as other aminopolycarboxylates into Chelativorans sp. for biodegradation. In the case of weak metal-chelate complexes, EppA likely facilitates dissociation of the weak chelates by using its cationic residues to weaken the carboxylate-mediated metal-chelate bonds and bind the carboxylates, opening up EDTA to its extended conformation and releasing the metal. Since EppA facilitates the uptake of only free synthetic and natural aminopolycarboxylates, stable metal-chelate complexes are not subject to EppA-dependent uptake for biodegradation in the cytoplasm, explaining the recalcitrant nature of aminopolycarboxylates in natural environments.

Site-Directed Mutagenesis
R56A, R74A, K470A, and R480A EppA mutants were generated by site-directed mutagenesis of the wild-type EppA gene (GenBank: ABG63228.1) using the standard Phusion protocol. All primers were ordered from Invitrogen (Carlsbad, CA, USA). The stability of the mutant proteins was supported by molecular mechanics optimizations and by the fact that their chromatographic elution profiles during purification were consistent with that of wild-type EppA.

Molecular Mass Determination
In total, 200 µg of EppA was injected onto a Yarra 3u SEC-2000 (Phenomenex; Torrance, CA, USA) size-exclusion column and eluted isocratically with EppA assay buffer. The 280 nm absorbance, laser light scattering, and differential refractive index were measured in tandem by a 280 nm UV detector (Agilent Technologies 1260 Infinity II), a DAWN HELEOS II 8+ (Wyatt Technology; Santa Barbara, CA, USA), and an Optilab T-rEX (Wyatt Technology; Santa Barbara, CA, USA), respectively. The molecular mass of EppA was calculated in ASTRA 7.1.4.8 (Wyatt Technology; Santa Barbara, CA, USA) by Zimm fitting of the measured light scattering intensities.

Crystallization
Initial crystallization trials using sparse matrix screening from Anatrace (Maumee, OH, USA) and Hampton Research (Aliso Viejo, CA, USA) were set up by a Phoenix RE (Art Robbins Instruments; Sunnyvale, CA, USA). The best screening solution was optimized and used for all crystal growth. Crystals were grown using the hanging drop vapor diffusion method with 1.5 µL of EppA (625 µM in 20 mM MOPS, 0.15 M NaCl, 0.05 g dL⁻¹ NaN₃, pH 7.2) mixed with 1.5 µL of mother liquor (0.1 M Tris, 2.0 M (NH₄)₂SO₄, pH 8.0) over a 500 µL reservoir of mother liquor at 4 °C. Crystals finished growing within two months.
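The starting composition of the hanging drop follows from simple dilution of the two solutions. The sketch below computes it, assuming ideal mixing with additive volumes; the helper `mix_drop` and the dictionary keys are illustrative names only.

```python
def mix_drop(protein_stock, precipitant_stock, v_protein_ul, v_precipitant_ul):
    """Return component concentrations immediately after mixing the drop.

    Each stock is a dict mapping a component name to its concentration in that
    stock; units are carried in the key and are not converted.
    Assumption: ideal mixing with additive volumes.
    """
    total_ul = v_protein_ul + v_precipitant_ul
    drop = {}
    for name, conc in protein_stock.items():
        drop[name] = drop.get(name, 0.0) + conc * v_protein_ul / total_ul
    for name, conc in precipitant_stock.items():
        drop[name] = drop.get(name, 0.0) + conc * v_precipitant_ul / total_ul
    return drop


# Stock compositions as stated in the text.
PROTEIN_STOCK = {"EppA (uM)": 625.0, "MOPS (mM)": 20.0, "NaCl (M)": 0.15}
MOTHER_LIQUOR = {"Tris (M)": 0.1, "(NH4)2SO4 (M)": 2.0}

if __name__ == "__main__":
    for name, conc in mix_drop(PROTEIN_STOCK, MOTHER_LIQUOR, 1.5, 1.5).items():
        print(f"{name}: {conc:g}")
```

With equal 1.5 µL volumes, every component starts at half its stock concentration, so the drop begins at 312.5 µM EppA and 1.0 M ammonium sulfate before equilibrating against the 2.0 M reservoir.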

Structure Determination
Crystallographic data were collected at the Advanced Light Source (Beamline 5.0.2) and integrated, reduced, and scaled using HKL2000 [46]. The structure of the periplasmic binding protein TM1223 from Thermotoga maritima (PDB ID: 1VR5) was used as a template in SWISS-MODEL (Computational Structural Biology Group at the SIB Swiss Institute of Bioinformatics at the Biozentrum, University of Basel; Basel, Basel-City, Switzerland) [47][48][49][50] to generate a homology model of EppA. Of all possible templates, TM1223 had the third highest sequence identity via PSI-BLAST (20.0%) and the second highest GMQE score from SWISS-MODEL (0.58), making it the best combination of both among all homology models. After deleting all of the homology model's sidechains, its two core β-sheets of Domain I and core β-sheet of Domain III were used as an input model along with the P4₃2₁2 dataset for molecular replacement in PHENIX (PHENIX Industrial Consortium; Berkeley, CA, USA) [51]. After obtaining initial phases from the core β-sheet model, sections of the sidechain-free homology model were fitted to the electron density where appropriate using COOT (Biomedical Campus, Cambridge, UK) [52] and refined in PHENIX. This process was performed iteratively to build a successively more complete partial model as phases improved, until the model was complete enough to be extended by PHENIX AutoBuild [53], after which the rest of the model was built by hand. Iterative adjustment and refinement of the AutoBuild solution were performed in COOT and PHENIX, respectively. TLS groups were identified by the TLSMD web server [27] after the isotropic atomic displacement parameters had sufficiently converged. Crystallographic coordinates and structure factors have been deposited in the Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) with PDB IDs of 6WM6 for the tetragonal space group and 6WM7 for the orthorhombic space group [54,55].
Refinement statistics are listed in Table 5.

Structure Analysis
PDB2PQR (Battelle Memorial Institute, Columbus, OH, USA) [56] was used to prepare crystallographic/homology model coordinates for whole-model electrostatic potential surface calculations. Electrostatic potentials were then calculated for the prepared models by APBS (Battelle Memorial Institute, Columbus, OH, USA) [31] and mapped onto their respective solvent-excluded molecular surfaces using the MSMS package [57] in UCSF Chimera (Resource for Biocomputing, Visualization, and Informatics at University of California, San Francisco; San Francisco, CA, USA) [58]. The binding cleft volume (with respect to the solvent-excluded surface) was calculated using the CASTp 3.0 web server (University of Illinois at Chicago, Chicago, IL, USA) [59]. Backbone torsion comparisons of the open and closed forms of TM1223 (1VR5 and 4PFT, respectively) were made by extracting torsion angles from their respective structures after refinement against their deposited structure factors to correct for geometric errors. Evolutionary conservation was analyzed by submitting the coordinates of EppA to the ConSurf server (Tel Aviv University, Tel Aviv, Israel) [34][35][36][37][38]. For ConSurf, the multiple sequence alignment was built using MAFFT, homologues were collected from the UNIREF90 database by two iterations of PSI-BLAST (E-value: 0.0001), and conservation scores of homologues were calculated using a Bayesian method.
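Comparing backbone torsions between two structures requires handling the periodicity of angles, since φ or ψ values of 179° and −179° differ by only 2°. The text does not state how the differences were computed; the following is one minimal sketch, with hypothetical function names and made-up example angles.

```python
def torsion_difference(angle_a, angle_b):
    """Signed difference angle_a - angle_b in degrees, wrapped to [-180, 180)."""
    return (angle_a - angle_b + 180.0) % 360.0 - 180.0


def compare_torsions(open_form, closed_form):
    """Return {residue: (delta_phi, delta_psi)} for residues present in both forms.

    Each input maps a residue number to a (phi, psi) tuple in degrees.
    """
    shared = sorted(set(open_form) & set(closed_form))
    return {
        res: (
            torsion_difference(open_form[res][0], closed_form[res][0]),
            torsion_difference(open_form[res][1], closed_form[res][1]),
        )
        for res in shared
    }


if __name__ == "__main__":
    # Made-up angles for illustration only; not taken from 1VR5 or 4PFT.
    open_form = {10: (-60.0, 179.0), 11: (-120.0, 130.0)}
    closed_form = {10: (-65.0, -179.0), 11: (-90.0, 135.0)}
    for res, (dphi, dpsi) in compare_torsions(open_form, closed_form).items():
        print(f"residue {res}: dphi={dphi:+.1f}, dpsi={dpsi:+.1f}")
```

Residues with large wrapped differences are candidates for hinge positions, whereas a naive subtraction would flag spurious 358° changes at the ±180° boundary.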

Isothermal Titration Calorimetry
Isothermal calorimetric titrations were performed in a MicroCal iTC200 (Malvern Panalytical Ltd.; Malvern, UK). All titrations were performed at 25 °C and stirred at 750 rpm. Ligand solutions (0.5 mM for EDTA, EGTA, MgEDTA, and MgEGTA; 1.5 mM for CaEDTA, SrEDTA, BaEDTA, and the transition metal EDTA chelates; and 2.5 mM for the lanthanide EDTA chelates) were injected into the calorimetric cell containing 100 µM EppA in assay buffer as either sixteen injections (one 0.8 µL injection followed by fifteen 2.47 µL injections) for free EDTA, free EGTA, and the Mg chelates, or twenty injections (one 0.8 µL injection followed by nineteen 1.8 µL injections) for all other chelates. EppA-ligand binding curves were corrected for heats of dilution by subtracting reference titrations of ligands at the same concentrations into buffer without EppA. Data were fitted to a single-site model as implemented in the Origin 7 MicroCal Data Analysis software package (Malvern Panalytical Ltd.; Malvern, UK) and then plotted in Excel.
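To illustrate what a single-site model predicts for such a titration, the sketch below simulates the heat of each injection for 1:1 binding. It is a simplified overflow model, not the exact expression implemented in the Origin package. The 200 µL cell volume, the association constant, and the enthalpy are assumed values for illustration; only the injection schedule and concentrations come from the text.

```python
import math


def complex_conc(m_total, x_total, k_assoc):
    """Equilibrium concentration (mol/L) of MX for 1:1 binding, M + X <-> MX."""
    b = m_total + x_total + 1.0 / k_assoc
    return (b - math.sqrt(b * b - 4.0 * m_total * x_total)) / 2.0


def injection_heats(k_assoc, delta_h, cell_conc, syringe_conc,
                    injections_ul, cell_volume_ul=200.0):
    """Heat of each injection, in the energy unit of delta_h (given per mole).

    Simplification: each injection of volume v displaces an equal volume of
    cell contents, so both totals are diluted by (1 - v/V0) before the injected
    ligand is added. cell_volume_ul is an assumed nominal value.
    """
    v0_litre = cell_volume_ul * 1e-6
    m_tot, x_tot, mx_prev = cell_conc, 0.0, 0.0
    heats = []
    for v_ul in injections_ul:
        frac = v_ul / cell_volume_ul
        m_tot = m_tot * (1.0 - frac)
        x_tot = x_tot * (1.0 - frac) + syringe_conc * frac
        mx = complex_conc(m_tot, x_tot, k_assoc)
        # Heat released = enthalpy x newly formed complex in the cell volume.
        heats.append(delta_h * v0_litre * (mx - mx_prev * (1.0 - frac)))
        mx_prev = mx
    return heats


# Schedule from the text for free EDTA: one 0.8 uL then fifteen 2.47 uL injections.
SCHEDULE_UL = [0.8] + [2.47] * 15

if __name__ == "__main__":
    # Hypothetical K (M^-1) and delta_h (cal/mol); not fitted values from the paper.
    simulated = injection_heats(1e6, -10.0e3, 100e-6, 0.5e-3, SCHEDULE_UL)
    for i, q in enumerate(simulated, start=1):
        print(f"injection {i:2d}: {q * 1e6:8.3f} ucal")
```

Fitting measured heats then amounts to adjusting the association constant, the enthalpy, and a stoichiometry factor until the simulated heats match the dilution-corrected data.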
BAPTA, DTPA, EDDA, EDDHA, EDDS, EDTA, EGTA, HEDTA, and NTA, in their fully ionized forms, were generated from their SMILES codes by Phenix eLBOW [68] and optimized in Gaussian 09 at the CAM-B3LYP level of theory with double-ζ correlation-consistent basis sets (cc-pVDZ) that were augmented for carbon, nitrogen, and oxygen atoms. The lowest energy conformation of each ligand was found by relaxed potential energy surface scans around relevant dihedral angles and was confirmed by frequency analysis of the optimized structure. A single-point calculation was then run on each optimized structure at the CAM-B3LYP level of theory with triple-ζ correlation-consistent basis sets (cc-pVTZ) that were augmented for carbon, nitrogen, and oxygen atoms [69,70]. Electrostatic potential surfaces were generated using the same method as described for EppA's ligand-binding cleft.

Molecular Docking
QM-optimized BAPTA, DTPA, EDDA, EDDHA, EDDS, EDTA, EGTA, HEDTA, and NTA were docked into EppA by AutoDock Vina [30]; ligands and grids were prepared for docking using AutoDock Tools [71]. Each ligand was docked into EppA with Arg480 in Conformer A and in Conformer B, and EppA was treated as a rigid receptor in every run. EppA's EDTA binding site was found by blind docking of EDTA into a whole-protein search grid. The lowest energy binding position was then used to center a 15 Å × 20 Å × 15 Å grid at the coordinates (26.372 Å, 53.932 Å, 67.442 Å), into which EDTA was docked again. BAPTA, DTPA, EDDA, EDDHA, EDDS, EGTA, HEDTA, and NTA were docked into the same grid centered on the same coordinates.
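For reference, the focused search grid described above can be written as an AutoDock Vina configuration file. The sketch below generates one per ligand and conformer. The file names are hypothetical placeholders, and the mapping of the 15 Å × 20 Å × 15 Å dimensions onto the x, y, and z axes in that order is an assumption, since the text does not state the axis order explicitly.

```python
# Grid centre and dimensions (in angstroms) as reported in the text.
GRID_CENTER = (26.372, 53.932, 67.442)
GRID_SIZE = (15.0, 20.0, 15.0)  # assumed to be ordered x, y, z

LIGANDS = ["BAPTA", "DTPA", "EDDA", "EDDHA", "EDDS", "EDTA", "EGTA", "HEDTA", "NTA"]
CONFORMERS = ["A", "B"]  # the two Arg480 loop conformers


def vina_config(receptor, ligand, out, center=GRID_CENTER, size=GRID_SIZE):
    """Build the text of a Vina config file for a rigid-receptor docking run."""
    lines = [f"receptor = {receptor}", f"ligand = {ligand}", f"out = {out}"]
    for axis, value in zip("xyz", center):
        lines.append(f"center_{axis} = {value}")
    for axis, value in zip("xyz", size):
        lines.append(f"size_{axis} = {value}")
    return "\n".join(lines) + "\n"


def all_configs():
    """Return {(ligand, conformer): config text} for every docking run."""
    configs = {}
    for conformer in CONFORMERS:
        for ligand in LIGANDS:
            # Hypothetical file names; the actual names are not given in the text.
            configs[(ligand, conformer)] = vina_config(
                receptor=f"eppa_conformer_{conformer}.pdbqt",
                ligand=f"{ligand.lower()}.pdbqt",
                out=f"{ligand.lower()}_conformer_{conformer}_docked.pdbqt",
            )
    return configs


if __name__ == "__main__":
    print(all_configs()[("EDTA", "B")])
```

Each generated text can be saved to a file and passed to Vina with its `--config` option; keeping one shared grid definition ensures that all nine ligands are docked into the same search volume.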