Molecular Insights into Substrate Binding of the Outer Membrane Enzyme OmpT

: The enzyme OmpT of the outer membrane of Escherichia coli shows proteolytic activity and cleaves peptides and proteins. Using molecular dynamics simulations in a fully hydrated lipid bilayer on a time scale of hundreds of nanoseconds, we draw a detailed atomic picture of substrate recognition in the OmpT-holo enzyme complex. Hydrogen bonds and salt bridges are essential for maintaining the integrity of the active site and play a central role for OmpT in recognizing its substrate. Electrostatic interactions are critical at all stages from approaching the substrate to docking at the active site. Computational alanine scanning based on the Molecular Mechanics Generalized Born Surface Area (MM-GBSA) approach conﬁrms the importance of multiple residues in the active site that form salt bridges. The substrate ﬂuctuates along the axis of the β -barrel, which is associated with oscillations of the binding cleft formed by the residue pairs D210-H212 and D83-D85. Principal component analysis suggests that substrate and protein movements are correlated. We observe the transient presence of putative catalytic water molecules near the active site, which may be involved in the nucleophilic attack on the cleavable peptide bond of the substrate.


Introduction
OmpT is a bacterial outer membrane enzyme of Escherichia coli that owes its name to the fact that its expression is temperature dependent [1]. The enzymatic function of OmpT is to digest peptides and proteins, preferably by cleavage of a pair of adjacent basic amino acids [2]. This proteolytic activity is probably involved in DNA excision repair [3] and in the degradation of antimicrobial peptides [4]. OmpT is part of a unique family of integral outer membrane proteases (Omptin). Such omptins exist, for instance, in Salmonella enterica and Yersinia pestis [5,6]. To identify the active site residues that maintain the proteolytic activity of E. coli OmpT, site-directed mutagenesis was performed. Initially, residues S99 and H212 were identified as essential active site residues [7]. Later, catalytic residues D83, D85, D210, and H212 were shown to strongly affect enzymatic activity [8].
The structure of OmpT was determined at 2.6 Å resolution by an X-ray diffraction study, which showed that it contained a ten-stranded antiparallel β-barrel extending far from the lipid bilayer into the extracellular space [9]. The proteolytic site is located in a groove consisting of the extracellular end of the β-barrel. The residue pairs D83-D85 and H212-D210 are located on opposite sides of the active site pocket,~5 Å apart. The structure suggests that OmpT should not be classified as a serine protease, because the putative catalytic residues S99 and H212 are too far apart (~9 Å) compared to other serine proteases. An alternative proteolytic mechanism was proposed in which a water molecule between D83 and H212 could perform the nucleophilic attack on the cleavable peptide bond. Subsequently, the structure of the related plasminogen activator Pla from Yersinia pestis was determined by an X-ray diffraction study with a resolution of 1.9 Å and the enzyme activity

Structural Drift and Stability of the Enzyme-Substrate Complex
To assess the stability of the ARRA-OmpT complex, we observed its fluctuations and its structural drift with respect to the initial structure of the production MD simulation (see Figures 1A and A1 in Appendix A). At the end of the simulation, the root mean square deviation (RMSD) of the Cα-structure reaches a plateau of just over 2 Å, while the RMSD of the β-barrel is~0.5 Å lower, consistent with previous reports of its stability [13,20]. The coil Catalysts 2023, 13, 214 3 of 18 and turn regions, which comprise large extracellular loops, exhibit the highest structural drift. Secondary structural analysis confirmed the structural integrity of the ARRA-OmpT complex as described in Appendix A1 (see Figure A1B), with the β-barrel and turn regions maintaining their structure throughout the simulation. Overall, these results indicate the formation of a stable ARRA-OmpT complex. Next, we compare the conformational fluctuations in apovs. holo-OmpT. Figure 1A shows the root mean square Cα-fluctuations (RMSFs) as a function of the number of residues in the last 20 ns of the holo simulation compared with the 20 ns simulation of the apo enzyme. Overall, both forms of the enzyme exhibited a very similar fluctuation pattern, suggesting that the presence of the substrate does not significantly affect the dynamics of the rigid barrel scaffold. The pattern reflects a classic alternating fluctuation scheme often observed for such β-barrels, in which the barrel region exhibits minimal dynamics, and the connecting loops or coils show increased flexibility. In fact, the most pronounced comparative fluctuations are observed in the loop regions, while the β-barrels consistently exhibit low fluctuations, with loop L3 showing a broader fluctuation pattern compared to all of the others. Some subtle differences can be observed near the loop regions, but cannot be interpreted without expanding the sampling timescale. Loops L1 to L3 have different conformations in both simulations, as shown in Figure 1B.
tion (see Figure 1A and Figure A1 in Appendix A). At the end of the simulation, the ro mean square deviation (RMSD) of the Cα-structure reaches a plateau of just over 2 while the RMSD of the β-barrel is ~0.5 Å lower, consistent with previous reports of stability [13,20]. The coil and turn regions, which comprise large extracellular loops, e hibit the highest structural drift. Secondary structural analysis confirmed the structu integrity of the ARRA-OmpT complex as described in Appendix A1 (see Figure A1 with the β-barrel and turn regions maintaining their structure throughout the simu tion. Overall, these results indicate the formation of a stable ARRA-OmpT compl Next, we compare the conformational fluctuations in apo-vs. holo-OmpT. Figure shows the root mean square Cα-fluctuations (RMSFs) as a function of the number of r idues in the last 20 ns of the holo simulation compared with the 20 ns simulation of t apo enzyme. Overall, both forms of the enzyme exhibited a very similar fluctuation p tern, suggesting that the presence of the substrate does not significantly affect the d namics of the rigid barrel scaffold. The pattern reflects a classic alternating fluctuati scheme often observed for such β-barrels, in which the barrel region exhibits minim dynamics, and the connecting loops or coils show increased flexibility. In fact, the m pronounced comparative fluctuations are observed in the loop regions, while the barrels consistently exhibit low fluctuations, with loop L3 showing a broader fluctuati pattern compared to all of the others. Some subtle differences can be observed near t loop regions, but cannot be interpreted without expanding the sampling timesca Loops L1 to L3 have different conformations in both simulations, as shown in Figure 1

Substrate Mobility towards a Stable Pose within the Active Site
To better understand how OmpT recognizes the R301 residue of the ARRA su strate, we analyzed its position near the presumed binding site, which includes residu D85 and D97, as a function of time ( Figure 2A). Using two distances, s1 between ato R301@CZ and D97@ CG and s2 between atoms R301@CZ and D85@ CG, we plotted t corresponding potential of the average force in Figure 2B. The minimum of the free e ergy is at (5 Å, 4.5 Å). At the beginning of the simulation, the conformation is far fro this minimum, and is at (9.8 Å, 6.9 Å), while the final structure settles at (4.3 Å, 4.3 Å The pronounced energy minimum on the free energy contour map indicates a significa , and L3 loops on the secondary structure of the enzyme, as seen from above the membrane. The numbers of the residues are given for reference.

Substrate Mobility towards a Stable Pose within the Active Site
To better understand how OmpT recognizes the R301 residue of the ARRA substrate, we analyzed its position near the presumed binding site, which includes residues D85 and D97, as a function of time ( Figure 2A). Using two distances, s1 between atoms R301@CZ and D97@ CG and s2 between atoms R301@CZ and D85@ CG, we plotted the corresponding potential of the average force in Figure 2B. The minimum of the free energy is at (5 Å, 4.5 Å). At the beginning of the simulation, the conformation is far from this minimum, and is at (9.8 Å, 6.9 Å), while the final structure settles at (4.3 Å, 4.3 Å). The pronounced energy minimum on the free energy contour map indicates a significant interaction between R301 and both D97 and D85. We will see later that these two distances make the largest contribution to the binding affinity of R301. In addition to salt bridges, side-chain:side-chain hydrogen bonds were observed in the R300-E27 and R300-D208 residue pairs ( Figure 2C). The R301-D97 and R301-D85 interactions were maintained by tight salt bridges ( Figure 2D). interaction between R301 and both D97 and D85. We will see later that these two distances make the largest contribution to the binding affinity of R301. In addition to salt bridges, side-chain:side-chain hydrogen bonds were observed in the R300-E27 and R300-D208 residue pairs ( Figure 2C). The R301-D97 and R301-D85 interactions were maintained by tight salt bridges ( Figure 2D).

Figure 2.
(A) Superposition of two conformations of the R301 residue of the ARRA-OmpT complex at 0 ns (brown) and at 100 ns (blue). The neighboring residues D85 and D97 of OmpT are in close contact with R301 and are highlighted in red. (B) The two-dimensional free energy landscape as a function of distances s1 and s2 (defined in the text) for the 100 ns simulation. The asterisk (*) and cross (+) indicate the distances for the first and last simulation snapshots, respectively. (C) Time series of the number of hydrogen bonds between selected active site residues (E27 and D208) of OmpT and basic residue R300 of the ARRA peptide. Hydrogen bonds were calculated with a donor-acceptor distance of 3.5 Å and an angle of 35° between donor-H-acceptor positions. (D) Time series of closest contact distances of salt bridges between selected active site residues (D97 and D85) of OmpT and basic residue R301 of the ARRA peptide. The distance of salt bridges was calculated using the smallest native distance between two residues.

Active Site Modes of Motion Assist Biological Function
To better understand the structural fluctuations of the ligand within the active site, we calculated the time series of the distances between the catalytic dyad and the substrate. As in our previous work [18], we defined two distances δ and ε from the center of mass of the ARRA peptide, representing the "horizontal" distance δ with respect to the center of mass of the D210-H212 residue pair and the "vertical" distance ε with respect to the center of mass of the entire β-barrel, respectively. The two residue pairs D210-H212 and D83-D85 define the barrel opening and their width has a direct influence on the δ-distance. The interactions of the "catalytic" water with the ARRA peptide observed The two-dimensional free energy landscape as a function of distances s1 and s2 (defined in the text) for the 100 ns simulation. The asterisk (*) and cross (+) indicate the distances for the first and last simulation snapshots, respectively. (C) Time series of the number of hydrogen bonds between selected active site residues (E27 and D208) of OmpT and basic residue R300 of the ARRA peptide. Hydrogen bonds were calculated with a donor-acceptor distance of 3.5 Å and an angle of 35 • between donor-H-acceptor positions. (D) Time series of closest contact distances of salt bridges between selected active site residues (D97 and D85) of OmpT and basic residue R301 of the ARRA peptide. The distance of salt bridges was calculated using the smallest native distance between two residues.

Active Site Modes of Motion Assist Biological Function
To better understand the structural fluctuations of the ligand within the active site, we calculated the time series of the distances between the catalytic dyad and the substrate. As in our previous work [18], we defined two distances δ and ε from the center of mass of the ARRA peptide, representing the "horizontal" distance δ with respect to the center of mass of the D210-H212 residue pair and the "vertical" distance ε with respect to the center of mass of the entire β-barrel, respectively. The two residue pairs D210-H212 and D83-D85 define the barrel opening and their width has a direct influence on the δ-distance. The interactions of the "catalytic" water with the ARRA peptide observed near residues H212 and D83 (see below) affect the ε-distance. Therefore, these two distances can be used to describe "active" enzyme conformations. Figure 3A,B show that in the first 30 ns of the simulation, the δ-distance increases from 8.2 Å (δ1) to 9.5 Å (δ2), whereas the ε-distance decreases from 23 Å (ε1) to 22 Å (ε2). At these initial stages of the simulation, vibration of the ligand in the horizontal and vertical directions results in a change in the binding cleft associated with increased penetration of the ARRA peptide within the β-barrel. At this stage, the change in δand ε-gaps is clearly correlated. From Figure 3B, we can further see that ε1 and ε2 represent two main width distributions occurring during the simulation period. These two principal degrees of opening capture the dominant oscillation of the ligand along the barrel axis. We performed a principal component analysis to investigate whether this gap oscillation is correlated with the movements of the OmpT protein. Figure 3C shows that the first largest eigenvector (V1) is significantly correlated with the ε-distance, with the two most populated regions corresponding to ε1(*) and ε2(+). The movement of the first eigenvectors is visualized using a "porcupine" plot and indicates obvious conformational changes in the ligand binding and active site regions.
near residues H212 and D83 (see below) affect the ε-distance. Therefore, these two distances can be used to describe "active" enzyme conformations. Figure 3A,B show that in the first 30 ns of the simulation, the δ-distance increases from 8.2 Å (δ1) to 9.5 Å (δ2), whereas the ε-distance decreases from 23 Å (ε1) to 22 Å (ε2). At these initial stages of the simulation, vibration of the ligand in the horizontal and vertical directions results in a change in the binding cleft associated with increased penetration of the ARRA peptide within the β-barrel. At this stage, the change in δ-and ε-gaps is clearly correlated. From Figure 3B, we can further see that ε1 and ε2 represent two main width distributions occurring during the simulation period. These two principal degrees of opening capture the dominant oscillation of the ligand along the barrel axis. We performed a principal component analysis to investigate whether this gap oscillation is correlated with the movements of the OmpT protein. Figure 3C shows that the first largest eigenvector (V1) is significantly correlated with the ε-distance, with the two most populated regions corresponding to ε1(*) and ε2(+). The movement of the first eigenvectors is visualized using a "porcupine" plot and indicates obvious conformational changes in the ligand binding and active site regions.

Active Site Intactness Emphasizes a Tight Complex
Previous studies have shown that the catalytic triad of several hydrolasesincluding OmpT-is conserved [17]. To assess the integrity of the active site of OmpT in the holo simulation, we plotted the distances between atoms CG @D210 and CG @H212

Active Site Intactness Emphasizes a Tight Complex
Previous studies have shown that the catalytic triad of several hydrolases-including OmpT-is conserved [17]. To assess the integrity of the active site of OmpT in the holo simulation, we plotted the distances between atoms CG @D210 and CG @H212 (d1) and be-Catalysts 2023, 13, 214 6 of 18 tween atoms CG @D83 and CG @H212 (d2) analogous to the analysis presented in previous work. The lowest free energy in the corresponding landscape ( Figure A2 in Appendix A) is in the range of (4.1 Å, 11 Å). This observation indicates that OmpT forms a tight active site conformation throughout the simulation and is consistent with previous results. Due to the presence of the substrate, the d2 spacing is larger than that previously reported for apo-OmpT. Residues D83, D210, and H212 play an important role in maintaining this configuration in the active site.

Key Interactions Emerge within the Active Site
Residues D83, D85, D210, H212, and S99 in the active site of OmpT are stabilized by a network of hydrogen bonds and salt bridges with water molecules and surrounding residues [9,13], and by favorable contacts. The main contacts of the ARRA-OmpT complex are shown in Figure 4 and described in Table 1. The persistent salt bridge for residue pair H212-D210 and hydrogen bonding for D85-D97 confirm the importance of residues D85, D97, D210, and H212 in maintaining the active site [8]. Interestingly, the presence of the substrate significantly weakens the H101-D83 interaction. The stabilizing contacts between residues D83-S99 and an occasionally forming H101-S99 hydrogen bond (not shown) provide an atomistic explanation for the loss of enzyme activity after mutation of S99 to alanine [8]. In addition, residues R300 and R301 belonging to the substrate are in contact with these residues. The key contacts in the active site show a similar pattern in both the holo and apo forms of OmpT (Table 1).
Catalysts 2023, 13, x FOR PEER REVIEW 6 of 20 (d1) and between atoms CG @D83 and CG @H212 (d2) analogous to the analysis presented in previous work. The lowest free energy in the corresponding landscape ( Figure  A2 in Appendix A) is in the range of (4.1 Å, 11 Å). This observation indicates that OmpT forms a tight active site conformation throughout the simulation and is consistent with previous results. Due to the presence of the substrate, the d2 spacing is larger than that previously reported for apo-OmpT. Residues D83, D210, and H212 play an important role in maintaining this configuration in the active site.

Key Interactions Emerge within the Active Site
Residues D83, D85, D210, H212, and S99 in the active site of OmpT are stabilized by a network of hydrogen bonds and salt bridges with water molecules and surrounding residues [9,13], and by favorable contacts. The main contacts of the ARRA-OmpT complex are shown in Figure 4 and described in Table 1. The persistent salt bridge for residue pair H212-D210 and hydrogen bonding for D85-D97 confirm the importance of residues D85, D97, D210, and H212 in maintaining the active site [8]. Interestingly, the presence of the substrate significantly weakens the H101-D83 interaction. The stabilizing contacts between residues D83-S99 and an occasionally forming H101-S99 hydrogen bond (not shown) provide an atomistic explanation for the loss of enzyme activity after mutation of S99 to alanine [8]. In addition, residues R300 and R301 belonging to the substrate are in contact with these residues. The key contacts in the active site show a similar pattern in both the holo and apo forms of OmpT (Table 1).

Quantitative Analysis of Substrate Recognition
We used computational alanine scanning to quantify the contribution of key residues to molecular recognition in the ARRA-OmpT complex ( Figure 5). We selected the first snapshot at 0 ns and ten snapshots during the last 10 ns of the simulation to assess the differences in binding free energies ∆∆G * binding between the mutant and wild-type ARRA-OmpT complex. Figure 5 shows that the R300A mutant leads to a loss of 20.3 kcal/mol in the ∆∆G * binding scan result in the initial structure. This observation highlights the central role that R300 plays in recognition by the OmpT enzyme. Residues E27 and D208 are located within 4 Å of the R300 residue. Initially, their mutation is already largely unfavorable at 19.0 and 14.7 kcal/mol, indicating strong binding affinity. These interactions either strengthen or remain at similar levels during the simulation, as confirmed by the values of 34.2, 14.2, and 19.4 kcal/mol in the ∆∆G * binding scan over the last 10 ns structures. The side-chain:side-chain hydrogen bonds between the R300-E27 and R300-D208 residue pairs appear to maintain these interactions throughout the simulation (Figure 2).
One interesting phenomenon should be highlighted: a change in the free energy of binding of the R301 residue of the ARRA peptide. This residue contributed 7.3 kcal/mol during the initial ∆∆G * binding scanning before the MD simulation. This value increases to 12.1 kcal/mol after 100 ns of simulation, which is most likely related to R301 vibrating toward residues D85 and D97 ( Figure 2). These two acidic residues strongly interact with R301 during most of the simulation ( Figure 2B), resulting in a loss of 9.6 and 8.9 kcal/mol when they mutate to alanine. It was hypothesized that residues M81 and I170, located at the bottom of the hydrophobic cleft [9], recognize hydrophobic residues of the ligand. To investigate this possibility, we located the A299 residue of the ARRA peptide near M81 and I170 in our docked conformation. Alanine scanning results showed that residues M81 and I170 do not contribute significantly to binding affinity, with low values of 0.5 and 1.4 kcal/mol before and −0.03 and 0.7 kcal/mol after molecular dynamics sampling. They may require more extended hydrophobic regions on the substrate to interact. Residues D83 and H212 were important for maintaining key contacts of the active site residues (Figure 4). D83 clearly contributes to the free energy of the bond in the final 100 ns structure. However, residue H212 does not directly contribute to the energy of ligand recognition. One interesting phenomenon should be highlighted: a change in the free energy of binding of the R301 residue of the ARRA peptide. This residue contributed 7.3 kcal/mol during the initial ∆∆ * scanning before the MD simulation. This value increases to 12.1 kcal/mol after 100 ns of simulation, which is most likely related to R301 vibrating toward residues D85 and D97 ( Figure 2). These two acidic residues strongly interact with R301 during most of the simulation ( Figure 2B), resulting in a loss of 9.6 and 8.9 kcal/mol when they mutate to alanine. It was hypothesized that residues M81 and I170, located at the bottom of the hydrophobic cleft [9], recognize hydrophobic residues of the ligand. To investigate this possibility, we located the A299 residue of the ARRA peptide near M81 and I170 in our docked conformation. Alanine scanning results showed that residues M81 and I170 do not contribute significantly to binding affinity, with low values of 0.5 and 1.4 kcal/mol before and −0.03 and 0.7 kcal/mol after molecular dynamics sampling. They may require more extended hydrophobic regions on the substrate to interact. Residues D83 and H212 were important for maintaining key contacts of the active site residues (Figure 4). D83 clearly contributes to the free energy of the bond in the final 100 ns structure. However, residue H212 does not directly contribute to the energy of ligand recognition.

Free Energy Profiles from Umbrella Sampling MD Simulations Characterize the Driving Force for Association
We compared the absolute free energy of binding of the ARRA and AKKA ligands to the OmpT enzyme. We considered the Cα atoms in the middle of the enzyme perpendicular to the OmpT barrel axis as a reference group and pulled the ligand (tensile group) away from OmpT along the z-axis direction ( Figure 6A). The free energy recording the COM distance between the pulling and reference groups is shown in Figure 6A. The snapshots extracted in the 2.8 nm, 3.7 nm, and 4.4 nm regions of the COM distance

Free Energy Profiles from Umbrella Sampling MD Simulations Characterize the Driving Force for Association
We compared the absolute free energy of binding of the ARRA and AKKA ligands to the OmpT enzyme. We considered the Cα atoms in the middle of the enzyme perpendicular to the OmpT barrel axis as a reference group and pulled the ligand (tensile group) away from OmpT along the z-axis direction ( Figure 6A). The free energy recording the COM distance between the pulling and reference groups is shown in Figure 6A. The snapshots extracted in the 2.8 nm, 3.7 nm, and 4.4 nm regions of the COM distance represent the initial, intermediate, and final conformational state of the AKKA-OmpT complex for the umbrella sampling simulations ( Figure 6B). As with the AKKA-OmpT complex, the free energy value increases from −4.8 kcal/mol to 7.7 kcal/mol as the COM spacing changes from 2.8 nm to 3.7 nm. In this process, the energy mainly contributes to the ligand AKKA leaving the negatively charged pocket formed by residues E27, D85 and D97 ( Figure 6B). Then, the free energy increases to 10.9 kcal/mol when the COM distance reaches 4.4 nm. At this stage, the free energy is used to break the salt bridge between the ligand and the two negatively charged residues D85 and D97. In general, the free energy for the ARRA-OmpT complex shows a similar pattern compared to AKKA-OmpT. The difference of~2 kcal/mol free energy between these two complexes indicates the strong binding ability of AKKA when recognizing OmpT. We analyzed the evolution of salt bridges during the process in more detail, as shown in Supplementary Figure S1. The side chains in AKKA, especially at position 301, form somewhat stronger and longer-lasting interactions. This difference is particularly visible at about 3.5 nm, where two strong salt bridges are formed by K310 in AKKA, while the corresponding interactions in ARRA are already significantly weakened. the strong binding ability of AKKA when recognizing OmpT. We analyzed the evolution of salt bridges during the process in more detail, as shown in Supplementary Figure  S1. The side chains in AKKA, especially at position 301, form somewhat stronger and longer-lasting interactions. This difference is particularly visible at about 3.5 nm, where two strong salt bridges are formed by K310 in AKKA, while the corresponding interactions in ARRA are already significantly weakened.

A Search for Putative Catalytic Water Molecules Yields Two Candidate Poses
Previous reports [9,13] suggest that the catalytic water molecules between residues D83 and H212 in the active site perform a nucleophilic attack on the cleavable peptide bond of the substrate. In the final complex (Figure 7), we indeed observe two water molecules near the cleavable CO-NH peptide bond of R300. The oxygen atoms of these water molecules have the possibility of attacking the carbon atom CO-NH due to their

A Search for Putative Catalytic Water Molecules Yields Two Candidate Poses
Previous reports [9,13] suggest that the catalytic water molecules between residues D83 and H212 in the active site perform a nucleophilic attack on the cleavable peptide bond of the substrate. In the final complex (Figure 7), we indeed observe two water molecules near the cleavable CO-NH peptide bond of R300. The oxygen atoms of these water molecules have the possibility of attacking the carbon atom CO-NH due to their small distance < 4 Å. To characterize the presence of such water molecules, we looked for two types of configurations during the simulation: (1) W1 molecules were defined as being within 4 Å of the carbon atom of R300 and simultaneously within 4 Å of the H212 residue. (2) W2 molecules were defined as being within 4 Å of the carbon atom of R300 and simultaneously within 4 Å of the D83 or D85 residues. Figure 7 shows that both types of water molecules are present during most of the simulation time and configurations sometimes alternate between them. We look for these two types of water molecule in the ARRA-OmpT and AKKA-OmpT complex embedded in the POPE membrane during the 10 ns umbrella sampling simulation. These two types of water molecule are also found in this environment. Our observation of the simulation trajectory shows that the water molecules in these two configurations can move far into the extracellular region of OmpT due to strong hydrogen bonding interactions. However, the orbital distribution of the ARRA substrate observed in extended Hückel calculations previously reported in [13] and depicted in the Supplementary Materials Figure S2, indicates that the cleavable peptide is the region most likely to be attacked. This result suggests that water molecules are present that can play a catalytic role by attacking the CO-NH cleavable peptide bond of the R300 residue via its oxygen atom. This observation is a prerequisite for subsequent quantum mechanical calculations required to evaluate the reactivity of water molecules in these positions.
in the ARRA-OmpT and AKKA-OmpT complex embedded in the POPE membrane during the 10 ns umbrella sampling simulation. These two types of water molecule are also found in this environment. Our observation of the simulation trajectory shows that the water molecules in these two configurations can move far into the extracellular region of OmpT due to strong hydrogen bonding interactions. However, the orbital distribution of the ARRA substrate observed in extended Hückel calculations previously reported in [13] and depicted in the Supplementary Materials Figure S2, indicates that the cleavable peptide is the region most likely to be attacked. This result suggests that water molecules are present that can play a catalytic role by attacking the CO-NH cleavable peptide bond of the R300 residue via its oxygen atom. This observation is a prerequisite for subsequent quantum mechanical calculations required to evaluate the reactivity of water molecules in these positions.

Lipid-Protein Interactions May Specifically Anchor OmpT within the Membrane
The positioning of the enzyme in the membrane is probably stabilized by finely tuned interactions with the membrane environment. The current simulation on a time scale of 100 ns is not long enough to reliably analyze such interactions, but we provide a preliminary analysis in the Supplementary Materials, Section S2.9, to show the properties that can be characterized, such as the area per lipid, to assess convergence (Supplementary Figure S3), the density distribution in the membrane (Supplementary Figure  S4

Lipid-Protein Interactions May Specifically Anchor OmpT within the Membrane
The positioning of the enzyme in the membrane is probably stabilized by finely tuned interactions with the membrane environment. The current simulation on a time scale of 100 ns is not long enough to reliably analyze such interactions, but we provide a preliminary analysis in the Supplementary Materials, Section S2.9, to show the properties that can be characterized, such as the area per lipid, to assess convergence (Supplementary Figure S3), the density distribution in the membrane (Supplementary Figure S4), the number of interactions of DMPC lipids surrounding the ARRA-OmpT complex as a function of time (Supplementary Figure S5), the lateral diffusion coefficients (Supplementary Table S1), and the dominant lipid/protein interactions (Supplementary Table S2).

Discussion
We simulated the ARRA-OmpT complex embedded in a DMPC lipid bilayer during a 100 ns all-atom molecular dynamics simulation to capture the protein-ligand dynamics and investigate the substrate recognition of the enzyme. Since the protein barrel scaffold is very robust and stabilizes rapidly, this time scale was sufficient to capture the substrate dynamics within the binding pocket. Convergence of such proteins can already be observed at a 10 ns timescale, as described in [21], except for loop regions. Compared with the previous simulation of apo OmpT [13] and with a coarse-grained simulation of the ARRA-OmpT holo enzyme complex [18], the current simulation sheds additional light on the mechanism of substrate recognition. Structural drift analysis showed that, by and large, the protein remains close to the crystal structure. The loop and beta barrel regions exhibited the highest and lowest fluctuations, respectively, which is similar to a previous MD simulation of OMPLA [11], another outer membrane enzyme. Secondary structure analysis confirmed this stability. Binding of the ARRA substrate to OmpT was maintained throughout the simulation period by hydrogen bonds and strong salt bridges.
Previous crystallization experiments [9] and point mutation experiments [7,8] have shown that residues D83, D85, H210, and H212 in the active site are essential for enzymatic activity. The active site remains intact during simulation in the presence of substrate. The results of the alanine mutation scans confirm the importance of these residues for binding to the short ARRA substrate mimic. Residues H210 and H212 form strong hydrogen bonds that can maintain active site stability. This is consistent with previous comparative analyzes of the active site of several related hydrolases [17].
Mutagenesis experiments [7] have shown that S99 is very important for enzymatic activity. Simulations of the OmpT apo enzyme [13] have shown that S99 can transiently form hydrogen bonds with residue D83. In the present simulation and in a previous coarsegrained simulation [18] of the complex, the S99 residue forms a stable hydrogen bond with D83 throughout the simulation, highlighting its importance for enzymatic activity. In follow-up studies, it would be of interest to compare E. coli OmpT to its equivalent omptin in Yersinia pestis, as many similarities exist between the two proteins [5].
The final structure of the ARRA-OmpT complex may provide a clue to a possible binding mechanism between the ARRA substrate and the OmpT enzyme. Throughout the 100 ns simulation, we observe that strong interactions form between the residue pairs E27-R300 and D208-R300. The importance of these three residues for substrate recognition was suggested by alanine scanning. These interactions were previously observed in simulations [18] and mutagenesis experiments [8]. Our binding pose appears to be qualitatively consistent with a proposed docking model of the inhibitor aprotinin [19] and offers a starting point for further computational studies concerning protein-protein interface design [22].
Residue R301 moves toward residues D85 and D97 during the simulation, although it starts from a position far from the active center. These residues form strong interactions and contribute to the free binding energy of the complex. This result is consistent with those of a previous mutagenesis experiment [8]. As mentioned previously, the substrate in the active site is aligned by electrostatic interactions. This observation is confirmed by Supplementary Video S1, which shows that the electrostatic field lines emanate precisely from the active site cleft and guide the approach of a putative substrate. We have not examined this feature in detail here, but it would lend itself to further visual inspection as described in [23,24] in future studies. The importance of electrostatics is also echoed by the reported analysis of the major salt bridges of the enzyme, be it in the formed complex or upon extraction during the umbrella sampling runs.
As a hydrolase, OmpT exhibits several differences compared to the traditional hydrolase mechanism. It is generally accepted that trypsin uses a negatively charged active site to digest Arg and Lys substrate residues [25,26]. The structure of the trypsin-vasopressin complex [27] shows that R8 of vasopressin is essential for binding the P1 site, and makes direct contact with the D189 recognition residue of trypsin. Our ARRA-OmpT model suggests that OmpT binds the basic substrate residues via its acidic side chains and that the electrostatic interaction energy drives the ligand into the catalytic site.
If OmpT were a conventional serine protease, it would be expected to use a directly interacting serine residue to digest the substrate. In the ARRA-OmpT complex, S99 is an obvious candidate. It stabilizes the active site mainly by forming a hydrogen bond with D83. However, it is unlikely to act directly on the ARRA substrate because the distance between S99 and the cleavable peptide bond averages 9.4 Å in the simulation. This has previously been cited as a reason why OmpT should not be classified as a member of the serine protease family [9].
Water molecules instead of serine were thought to be involved in the catalytic mechanism. In our final complex, some water molecules located between residues H212 and D83 were observed near the cleavable CO-NH bond. The distance between these water molecules and the peptide bond is stable <4 Å during the simulation, and it was assumed that the water molecules play a catalytic role by participating in the nucleophilic attack on the carbon atom of the peptide bond. The catalytic water molecules detected in the active site during the 100 ns simulation may reflect this earlier assumption [9,13].
The effects of protein-lipid interactions on lipid mobility were analyzed throughout the simulation. OmpT has previously been shown to insert into the membrane with a slight tilt [13,28], presumably due to some specificity of its lipid interactions. In our preliminary analysis, the "border" and "free" lipid molecules exhibited differences in both qualitative and quantitative terms. The mobility of the "border" lipids was limited compared with the "free" lipids. These results reflect previous observations on the OmpA-DMPC system [29], KcsA-POPC [29], and OprF-DMPC [30]. For a more thorough and specific study of proteinlipid interactions and dynamics, coarse-grained approaches that can account for larger scale effects such as crowding are more appropriate [31] and start to take into account protein conformational dynamics to some extent [32]. We can perform such calculations in follow-up studies, but automated approaches such as MemProtMD [33] already provide a good resource for comparing lipid interactions for all available omptin structures.

Generation of Atomic Models
The atomic coordinates of OmpT used for this study were obtained from the Protein Data Bank [34] (PDB code: 1I78). The substrate docking procedure has been described previously [13], and the appropriate candidates for the ARRA-OmpT complex were determined by a combination of the AUTODOCK program [35], YASARA software [36], and STC software [37]. The protonation state of OmpT was assigned based on pKA calculations using the WHATIF program [38]. The simulation was set up for pH7, with residues E27 and H212 predicted to be protonated, H101 and H268 to be neutral, all other residues in their standard protonation states. The titration curves for these four residues are reported in Supplementary Figure S6 and pKa values are reported in Supplementary Table S3. In this work, the numbering of residues in the ARRA-OmpT complex is consecutive, indicating that the first residue of the ARRA peptide in the complex is actually designated A299. The ARRA-OmpT complex was inserted into a pre-equilibrated, hydrated dimyristoylphosphatidylcholine (DMPC) bilayer using previously described methods [39]. We chose this lipid type to be compatible with our previous work [13] and to allow direct comparison. It should be noted that E. coli has about 70% POPE lipids in its cell membranes, so POPE would represent a more biologically natural environment. On the other hand, the literature suggests a relative insensitivity of OMPs to lipid composition [40]. Na+ and Cl − ions were then added at a concentration of 0.1 M.

Molecular Dynamics Simulations
Simulations were performed using the GROMACS software package [41,42]. Consistent with the reference simulations of the apo form described in [13], with which we performed comparisons, identical force fields were used to represent proteins and lipids [43] in combination with the SPC water model [44]. Simulations were performed using the NPT ensemble. Temperature was maintained at 310 K using a thermostat [45] with τ = 0.1 ps. Pressure was maintained at 1 bar in all three dimensions by anisotropic coupling of the x, y, and z components using a barostat [45] with τ = 1 ps and a compressibility of 4.5 × 10 −5 bar −1 . The time step for integration was 2 fs. The electrostatic interactions were calculated using the Particle-Mesh Ewald (PME) algorithm [46], and the van der Waals forces were treated with a limiting distance of 10 Å. The LINCS algorithm [47] was used to constrain all bond lengths. A production run with a duration of 100 ns was then performed. The results were analyzed using the standard software tools of the GROMACS package [41,42]. Visualization and manipulation of conformations was performed using the programs VMD [48] and UnityMol [49]. Statistical and data analysis was performed using the statistical software package R [50].

Calculation of Relative Binding Free Energies by MM-GBSA
To further analyze the binding affinity of the ARRA-OmpT complex, we extracted the initial snapshot at 0 ns and ten snapshots during the last 10 ns of the simulation to evaluate the binding free energy. For each snapshot, we performed 600 ps of MD using the Amber 11 program [51] and the GB approach. This method has been shown to be reliable for estimating binding free energy, as indicated previously [16,52,53]. In this study, the implicit generalized Born solvation model was used (igb = 2). The temperature was fixed at 300 K. Unbound interactions were truncated at a distance of 12 Å. The ff99 force field (Parm99) [54] was used throughout the energy minimization and MD simulations. In the MM-GBSA implementation of Amber 11.0, the free energy of the A + B → AB bond was calculated using the following thermodynamic cycle: Here, T is the temperature, S is the entropy of the solute, ∆Ggas is the interaction energy between A and B in the gas phase, and ∆G A solv , ∆G B solv , and ∆G AB solv are the free solvation energies of A, B, and AB, which are estimated using a GB surface method [55,56]. That is, ∆G AB solv = ∆G AB GBSA + G AB GB + ∆G AB SA , and so on. ∆G GB and ∆G SA are the electrostatic and nonpolar terms, respectively. The binding, angular, and torsional energies represent the intramolecular energy ∆E intra of the complex, while ∆E elec and ∆E vdw represent the electrostatic and van der Waals interactions between receptor and ligand. Assuming a constant contribution of −T∆S for each docked complex, we refer in the discussion to ∆G * binding for ∆G binding + T∆S. The relative free energy of binding ∆G * binding was calculated using MM-GBSA for post-processing snapshots from the MD trajectories. The computational alanine scan method in MM-GBSA [57] was used to analyze the importance of residues contributing to the recognition mechanism. The key residues were mutated to alanine and then the difference in binding free energies between mutant and wild-type complexes was calculated based on the MM-GBSA approach. The calculated results were compared with the experimental data.

Umbrella Sampling to Calculate Potential of Mean Force Profiles
The simulation box for the umbrella sampling simulations contained the ARRA-OmpT (or AKKA-OmpT) complex, 214 POPE lipids, 14836 TIP4P water molecules, 28 chloride ions, and 31 sodium ions. The change in lipids to POPE was unintentional, yet we did not repeat the simulations with DMPC lipids, because the literature suggests a relative insensitivity of OMPs to lipid composition [40]. Note that the ARRA-OmpT pose was extracted from the 84 ns equilibrium run MD and the AKKA-OmpT pose was generated by mutation of two Arg residues from ARRA to Lys. The force field parameters for POPE were taken from GROMOS-87 [43], and the other parameters were described above. After a 3 ns equilibrium run MD for the simulation box, the ARRA ligand (group A) was pulled away from OmpT over 800 ps, using a spring constant of 1000 kJ −1 ·mol −1 ·nm −2 . The Cα atoms in the center plane perpendicular to the OmpT axis were held as an immobile reference (group B). The center of mass distance (COM) between group A and B was monitored and the region of interest from 2.8 nm to 4.4 nm was divided into 55 equidistant windows (each 0.03 nm wide). For each window simulation, 100 ps equilibrium at constant pressure (NPT) and 1 ns production MD were simulated. The Nose-Hoover thermostat [58,59] was used to maintain temperature and the Parrinello-Rahman barostat [60,61] was used to isotropically control pressure. Data were collected for 1 ns and the last 600 ps of each run was used to create the PMF. The unbiased PMF was calculated using the weighted histogram analysis method (WHAM) [62]. Errors were estimated by performing the above analysis in three trajectory blocks with a length of 200 ps, and taking the average of the block PMFs.

Conclusions
This study provides a detailed molecular model of the ARRA-OmpT complex that is quantitatively characterized by computational alanine scanning and therefore can be used as a reference for better rationalization of this system. Residues D83, D85, H210 and H212 are located in the active site and a complex network of hydrogen bonds and salt bridges is formed to maintain the stability of these residues. The hydrogen bonding of S99-D83 reflects the importance of the S99 residue for enzymatic activity. In addition, our model shows that R300 can interact with residues E27 and D208 throughout the simulation, while R301 interacts with D85 and D97 only at the end of the simulation. The importance of these residues contributing to substrate recognition is confirmed and quantified by computational alanine scanning. Our study shows that the oscillation of the substrate along the β-barrel axis is associated with the motions of the binding cleft formed by the two pairs of residues D210-H212 and D83-D85. Principal component analysis suggests that substrate oscillation is correlated with protein motions. Further studies describe the catalytic water positions between residues H212 and D83. The oxygen atom of water is assumed to perform the nucleophilic attack on the carbon atom of the cleavable peptide bond. Thus, the model remains compatible with water-mediated attack, an atypical catalytic mechanism, and provides a plausible starting point for subsequent quantum mechanical studies. Analysis of the protein-lipid interaction shows that free lipids exhibit high lateral diffusion compared to bound lipids, highlighting the fluid character of the bilayer. We take a brief look at the ligand approach phase using umbrella sampling and find that AKKA is about 2 kcal/mol stronger compared to ARRA. Overall, the crucial importance of electrostatic interactions becomes apparent at several stages.

Supplementary Materials:
The following supporting information can be downloaded at: https://www. mdpi.com/article/10.3390/catal13020214/s1, Figure S1: Closest contact distances of salt bridges between selected active site residues (E27, D210, D208, D85 and D97) of OmpT and basic residues at positions 300 and 301 of the AKKA (A) and ARRA (B) peptides as a function of the z reaction coordinate used during umbrella sampling for the extraction of the substrate from the active site. The distance of salt bridges was calculated using the smallest native distance between two residues; Figure S2: Result of an extended Hückel calculation, a simplified method for calculating the electronic structure of conjugated molecules, for the ARRA peptide in a docked formation, showing that the LUMO is located at the carbon atom of the central C=O in the peptide bond; Figure S3: Area per lipid relaxation over time for the DMPC bilayer. The plots were obtained from the trajectories of (A) DMPC+ARRA/OmpT, and (B) DMPC+ OmpT [63]; Figure S4: Density profiles of membrane components (headgroups, glycerol ester and acyl chains) along the membrane normal direction. The plots obtained from the trajectories of (A) DMPC+ARRA/OmpT, and (B) DMPC+ OmpT are shown; Figure S5 Figure S6: Titration curves for the four residues E27, H101, H212 and H238 with a predicted non-standard ionization state at pH7; Table S1: Lateral diffusion coefficients for lipids in the bilayer; Table S2: Dominant lipid/protein interactions [64][65][66]; Table S3: Results for pKa calculations of OmpT in vacuo ("Bare") or in a dielectric membrane region ("Slab"); Video S1: Animation of OmpT electrostatic field lines [24,67]. Funding: This research was funded by the "Initiative d'Excellence" program from the French State, grants "DYNAMO", ANR-11-LABX-0011, and "CACSICE", ANR-11-EQPX-0008). M.B. thanks Sesame Ile-de-France for co-funding the display wall used for data analysis.

Data Availability Statement:
The data presented in this study, especially the molecular dynamics trajectory, are available upon request from the corresponding author. The data are not publicly available at this time because further efforts are underway in the authors' laboratory to make the molecular dynamics data widely available and to establish appropriate best practices, which are ongoing. Once this goal is achieved, the relevant data from this study will be deposited and added to the corresponding author's dataset list permanently available at doi: 10.6084/m9.figshare.21779627.

Acknowledgments:
Computations were performed using HPC resources from LBT-HPC. We thank Geoffrey Letessier for technical and HPC support.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Appendix A
Here we provide complementary analysis results that support the core results presented in the main text. First, we look at simulation stability in terms of structural drift and secondary structures as illustrated in Figure A1. Specifically, the intactness of the catalytic triad was assessed by plotting the distances between atoms CG @D210 and CG @H212 (d1) and between atoms CG @D83 and CG @H212 (d2). The corresponding sampling and free energy landscape is shown in Figure A2. Figure A1. (A) Root mean square deviation (RMSD) of the ARRA-OmpT complex with respect to the initial structure for the carbon alpha atoms of all residues (red), the coil and turn structural elements (blue), and the β-barrel (green) during the 100 ns production simulation. (B) Temporal evolution of the secondary structure elements. The beta sheets, turns, and coil regions are shown in yellow, green, and white, respectively. Markers β1 to β10 represent the ten different beta sheets of OmpT.
Specifically, the intactness of the catalytic triad was assessed by plotting the distances between atoms CG @D210 and CG @H212 (d1) and between atoms CG @D83 and CG @H212 (d2). The corresponding sampling and free energy landscape is shown in Figure A2.
Catalysts 2023, 13, x FOR PEER REVIEW 17 of 20 Figure A2. Projection of the free energy landscape representing free energy basins as a function of key distances d1 and d2 characterizing the active site center. The landscape's free energy basin is concentrated around a narrow conformation of the active site.