Simulated Docking Predicts Putative Channels for the Transport of Long-Chain Fatty Acids in Vibrio cholerae

Fatty acids (FA) play an important role in biological functions, such as membrane homeostasis, metabolism, and as signaling molecules. FadL is the only known protein that uptakes long-chain fatty acids in Gram-negative bacteria, and this uptake has traditionally been thought to be limited to fatty acids up to 18 carbon atoms in length. Recently however, it was found Vibrio cholerae has the ability to uptake fatty acids greater than 18 carbon atoms and this uptake corresponds to bacterial survivability. Using E. coli’s FadL as a template, V. cholerae FadL homologs vc1042, vc1043, and vca0862 have been computationally folded, simulated on an atomistic level using Molecular Dynamics, and docked in silico to analyze the FadL transport channels. For the vc1042 and vc1043 homologs, these transport channels have more structural accommodations for the many rigid unsaturated bonds of long-chain polyunsaturated fatty acids, while the vca0862 homolog was found to lack transport channels within the signature beta barrel of FadL proteins.


Introduction
In Gram-negative bacteria, the transport of exogenous long-chain fatty acids (LCFA) across the outer membrane leaflet is mediated by FadL [1][2][3]. FadL's ability to acquire LCFAs grants versatility in carbon source utilization, providing a selective advantage for survival. In the case of Vibrio cholerae, the causative agent of cholera, this bacterial robustness may have ecological and medical implications. In this paper, we will discuss the importance of bacterial FA synthesis and uptake. Then, we will compare novel structural models of V. cholerae FadL homologs with that of E. coli's, highlighting both conservation and divergence in the proteins.
Fatty acids (FAs) are molecules with a carboxylic acid head group and an aliphatic tail group of varying length and saturation. FAs are used primarily as building blocks for cell membranes, but also supply energy and can be used as signaling molecules [4]. Fatty acids can be acquired from exogenous sources as well as being synthesized de novo. However, many organisms (such as Homo sapiens) require specific exogenous sources of FAs for specific metabolic functions [4]. In humans, this can be immune system regulation, blood clotting, neurotransmitter biosynthesis, cholesterol metabolism, and phospholipids for the brain and the retina [5].
In nature, plants typically have a limited synthesizing capacity that produces polyunsaturated fatty acids (PUFAs) up to only 18 carbons (many plants are still capable of monounsaturated and unsaturated FAs for waxes and seed storage lipids) [6]. However, plants are generally the only producers of n-3 (ω-3) and n-6 (ω-6) where the first unsaturated carbon starts on the 3rd or 6th carbon from the tail methyl group. Oddly, there are some for comparison and elucidation of the uptake dynamics of the transmembrane protein FadL. In this paper, we study the predicted structure and functions of several V. cholerae FadL homologs (v1042, vc1043, and vca0862) using Molecular Dynamics and perform comparisons to the E. coli (b2344) FadL homolog.

Selecting the Vibrio cholerae Homologs
E. coli M1655 FadL (accession number: NP_416846) was used as input for a BLAST [25] homolog search against all available sequenced V. cholerae strains of the pathogenic O1 and O139 serogroups. The search algorithm settings were set at 100 max target sequences, short queries, an expect threshold of 10, a BLOSUM64 scoring matrix, with gap costs determined by Existence: 11 Extension: 1, and a Conditional composition score matrix adjustment. No filters or masks were used to analyze the results.
The resulting proteins were reduced to unique sequences and vc1042, vc1043, and vca0862 (accession numbers: WP_000856207, WP_001061938, and WP_000966057, respectively) were selected based on prevalence.

Generating the FadL Tertiary Structures
The selected V. cholerae sequences were removed of the predicted signal peptide sequences, folded using the I-TASSER [26] standalone version, and compared to the known crystal structure of E. coli's FadL from the RCSB database (PDB ID: 1T16) [27]. The C-score determined by I-TASSER (on a scale from −5 to 2) for vc1042, vc1043, and vca0862 were 0.58, 0.70, and −1.08, respectively. The resulting I-TASSER predicted structures can be seen in the Supplementary Material Figure S1. To validate the I-TASSER folded structures the homolog 111 sequences were then folded with AlphaFold [28] using the Colab server [29]. The resulting AlphaFold structures can be found in Figure S2. The resulting AlphaFold structures had PLDDT values typically above 90, indicating a very good prediction model. The lower scores were primarily still within acceptable tolerances, but these were typically found in the extracellular loops. A comparison between the I-TASSER and AlphaFold structures was performed by aligning each homolog and calculating the RMSD using VMD. The resulting aligned structures can be seen in Figure S3 and the RMSD for each homolog can be seen in Table S3. The extracellular loops appear to generate the largest difference between the I-TASSER and AlphaFold generated structures, and because these are suspected to have little involvement in the transport channels, the structures generated appear to have a good agreement. MolProbity [30] was used for a Ramachandran analysis of the I-TASSER and AlphaFold generated structures seen in Table S2.

Generating the Membrane System
The resulting computationally folded I-TASSER FadL structures, in addition to the E. coli b2344 1T16 crystal structure, were each placed into a membrane using the CHARMM-GUI Membrane Builder [31]. The V. cholerae homologs' membrane had an outer leaflet of V. cholerae type 1 Lipid A, Core A, and 15 O1 O-antigen units. The E. coli outer leaflet was composed of E. coli type 1 Lipid A, Core R1, and 3 E. coli O1 O-antigen units (with 5 sugars per O unit). Both types had an inner leaflet of 67% phosphatidylethanolamine (PE) and 33% phosphatidylglycerol (PG). The structure of each of these molecules can be seen in the Supplementary Material, Figure S4.

Equilibrating the Membrane Systems
The resulting simulation constraints generated by the CHARMM-GUI were then used in conjunction with NAMD [32] and CHARMM36 force fields [33]. During simulations, Langevin dynamics were used to maintain constant temperature (310 K) and pressure (1 atm). The simulations were sized as 80 Å × 80 Å × 140 Å and a flexible cell boundary was chosen for an anisotropic membrane system. A cutoff of 12 Å was used along with a particle mesh Ewald [34] for electrostatic interactions. All equilibrations used a timestep of 2 fs and nonbonded frequency and full electrostatics calculated at every step. Each of the four protein systems were equilibrated for a minimum of 250 ns. The RMSD of the equilibration run for each protein tested can be seen in Figure S5. A Ramachandran plot was made to detect conformational outliers and determine a Z-score using MolProbity [30] of the homolog structures at 0 and 201 ns to determine the differences between structure of the and the initial and membrane equilibrated structures. The outlier residues found were sparse both spatially and numerically in both instances, indicating that the structures were of good agreement. Additionally, the Z-scores were within acceptable tolerances, where typically the absolute value being less than 2 (Table S2 in the Supplemental Material).

The Docking of the FadL Proteins
50 frames of the equilibrated FadL trajectories were taken (one every nanosecond) starting at 201 ns into the equilibration and ending at 250 ns. For each frame, the protein was isolated and aligned with respect to the 1T16 position. An array of 10 ligands (Table S1) were 152 then blindly docked to each of the frames generated from equilibration using AutoDock 153 [35] (in addition to 50 repeated instances of the 1T16 crystal structure) with the purpose to 154 map the channels and binding sites. The 80 × 80 × 120 AutoDockTools gridbox binding region 155 (closer to a 40 Å × 40 Å × 60 Å box) was restricted to the upper extracellular region of the 156 FadL proteins encasing the majority of the FadL proteins. To maintain the cis structures of the FAs, the unsaturated double bonds of the ligands were kept rigid during docking. Each docking used a genetic algorithm with a population size of 150, a maximum number of evaluations of 2,500,000, and a maximum of 27,000 generations.
To test the viability of AutoDock, we also docked 50 repeated instances of 1T16 using GNINA [36] with the same array of 10 ligands used with AutoDock. The autobox was produced using the entire protein, and the exaustiveness was set to 64 to account for the large search space. The default CNN was used for rescoring the poses, and 10 poses were generated per ligand per frame.
The nodal cluster analysis was performed by using an in-house mean shift algorithm script written in VMD's TCL to locate the positions of the highest density of docked FAs. The mean shift algorithm propagated spheres (or nodes) with a radius of 7 Å every 7 Å in the x, y, and z directions, thus covering the entire volume of the system. The coordinates of any FA heavy atoms found within the sphere were averaged and the next iteration of the sphere started at the averaged coordinates. If a sphere did not have any heavy atoms, it was removed. The iterations stopped when the sphere movement stopped, usually around 10 iterations total. Overlapping sphere locations were combined and outliers, generated from stray docked FAs, were removed except in the case of Node 3 for homolog vc1043 because the node was located in the S3 kink.
FAs were ascribed to the nodes based on proximity. Because the nodes represent docking locations of ambiguous size, the nodal spheres' radius were increased iteratively and any FA within the node's iterative radius was ascribed to that node.

Docking Showed Viability between the Crystal Structure and Simulated results
Currently, there are no experimentally determined X-ray or NMR structures of the Vibrio cholerae FadL homologs, consequentially to find the structures of vc1042, vc1043, and vca0862, the protein sequences were computationally folded using I-TASSER to predict their native tertiary structures. Simulations were initiated from the computationally predicted structures with the Molecular Dynamic (MD) simulator NAMD along with the known structure of E. coli's FadL b2344 henceforth referred as b2344. Various conformations of the equilibrated structures were docked with AutoDock using an array of fatty acids shown in Figure 1. To test the accuracy of the simulated docking, we compared the detergent binding from our simulated docking and the experimentally determined binding locations from van den Berg's crystallography study [27]. Specifically, using the E. coli FadL b2344 crystal structure (1T16) was docked using AutoDock with LDAO and C8E4 as the ligands. The resulting simulated docking locations were compared with the experimentally determined bindings for LDAO and C8E4 molecules found in the 1T16 PDB structure. Figure 2 examines a frame of the resulting dockings showing a preference of the AutoDock bindings sites to primarily be the locations that were bound experimentally [27]. However, the selected docking of the S3 kink binding site (residues highlighted in green) contains C8E4 molecules where in the original crystal structure C8E4 molecules were restricted to the low affinity binding site in the L3 and L4 extracellular loops. This is likely due to the methodology of van den Berg, where LDAO and C8E4 competed for binding during the protein washing phase [27], while in simulated docking there was no binding competition. Figure 3 shows the original b2344 1T16 crystal structure with the native detergents outlining the low affinity binding site, the high affinity binding site, and the S3 kink [27]. The AutoDock binding within the 1T16 crystal structure in conjunction with the mean shift algorithm did find that the docked FAs were located in the low affinity binding site (Node 1), the high affinity binding site (Node 2), and the S3 kink (Node 3).

Figure 2.
Docking of 1T16 with LDAO and C8E4 (lines) compared with the original LDAO and C8E4 (solid) placement from X-ray crystallography. While this is only one frame, the LDAO from docking tends to be strongly correlated to the crystallography bound LDAO, found in the high affinity binding site. The docked C8E4 was found to correspond to the crystallography bound C8E4 in the low affinity binding site, but the docked C8E4 was also found to bypass the transport channels and appear docked around the crystallography bound LDAO found the S3 kink. This occurrence is likely an effect of AutoDock's ligand placement algorithm and non-competitive docking). The L3 loop is colored red, the L4 loop is colored orange, and the S3 kink is colored green for reference. (Left) the original detergents from the 1T16 X-ray structure and locations of the low affinity binding site, the high affinity binding site, and the S3 kink as described by van den Berg [27]. (Center) The location of the nodes found by the mean shift algorithm from computationally docked FAs, the 1T16 crystal structure bound detergents are also shown to validate nodal locations. (Right) Nodal locations and a snapshot of FA clusters from docking that the mean shift algorithm used to generate the nodes showing relatively good correlation.

AutoDock Docking Compared to GNINA Docking
We compared the generated dockings of AutoDock with that of a similarly run GNINA docking for the experimentally found 1T16 structure ( Figure 4). The nodal clusters were compared and found to be in good agreement with the high affinity binding site and S3 kink. There was some disagreement in the low affinity binding site locus (Node 1) being set closer to the extracellular space with GNINA than AutoDock, however, both these sites yielded ligand binding results that mirrored the X-ray crystallography results. . Both sets of dockings appear to find the important nodes of the S3 kink and the high and low affinity binding sites. GNINA's docking did reveal a preference for the dockings to be located closer to the extracellular space than AutoDocks, but the C8E4's binding from experimentation are still in close proximity.
A closer look at the ligand binding results shows a satisfactory level of agreement between the two methods. While in different proportions, typically, the longer chained ligands tend to be be found less in the high affinity binding site (Node 2) and more in the other sites ( Figure 5). However, GNINA tends to propagate more of the tested ligands in the high affinity binding site (Node 2) than AutoDock overall. This can likely be attributed to differences in the sampling method for each approach. One noteworthy difference from the GNINA docking was that, out of the 500 poses for the C8E4 ligands, none were found bound to the low affinity binding site. This is somewhat surprising as C8E4 was originally bound to the low affinity binding site (Node 1) in the 1T16 crystallography structure. This could possibly be due to the marked differences in the structure of the extracellular loops between the two methods. These differences could be particularly impactful for C8E4 due to non-favorable interactions between its hydrophilic headgroup and the hydrophobic residues of the low affinity binding site. Figure 5. Charts of the percent of each ligand in proximity to each node between the AutoDock and GNINA dockings. Node 1 represents the low affintity binding site, Node 2 the high affinity binding site, and Node 3 the S3 kink. There were a total of 500 poses per ligand type docked.

Nodal Cluster Analysis Shows Ligand Preference for Certain Binding Sites
The mean shift nodal analysis was performed for each of the FadL and V. cholerae and the equilibrated b2344 homolog structures, resulting in the nodal locations seen in Figure 6. The nodes from equilibrated b2344, Figure 6A, shows almost identical nodal locations as the X-ray 1T16 structure Figure 3. A small difference between the high affinity binding site where Node 2 tended to be closer to the center of the FadL beta barrel in the equilibrated structure). Exception aside, this demonstrates the NAMD equilibration b2344 FadL structure tended to retain important structures during the simulation.

The Nodal Analysis Shows Preference for Certain Node Loci
The FAs were categorized based on proximity, with each docked FA being prescribed one node per frame. With 10 FAs per type per frame and 50 frames, 5000 FAs were assigned to each FadL homolog-giving a reasonable statistical model. The resulting docked locations were summarized in Figure 7. For the 1T16 test case, LDAO had a strong affinity for the high affinity binding pocket (Node 2) with 86.0% of the LDAO molecules docked appearing in or around the high affinity binding site. The other small molecules such as 16:1, 18:2, 18:3α, and 18:3γ also showed clustering in the Node 2 region (42.2%, 17.8%, 23.4%, and 23.2%, respectively). This is reasonable due to the original 1T16 structure having a LDAO molecule bound to the high affinity binding site [27] (the Node 2 locus), where other similarly shorter chained FAs could also fit into the open pocket. Interestingly, the S3 kink (Node 3) tended to have more docked FAs than the low affinity binding site (Node 1) which may be due to the tubular cavity of the S3 kink region, providing more surface area for FAs to bind to than the more open low affinity binding site. Amongst all the FAs tested, the average docking binding energy for Nodes 1 and 3 of 1T16 were −8.975 and −8.976 kcal/mol, respectively, indicating a very close average binding energy. Examining 18:2 in specific, the binding energy of 18:2 with Node 1 was better (−9.29 versus −8.97 kcal/mol), but AutoDock propagated more 18:2 FAs on Node 3. This is likely due to the AutoDock algorithm finding it more difficult to dock the Node 1 area due to a smaller binding channel, even if the binding channel has a better binding energy. Another example of this phenomena is 22:6 having the best overall binding energy when it was found in the high affinity binding site (−12.65 versus −9.73 and −10.11 kcal/mol), however, this only occurred with 1.4% of 22:6 dockings because the high affinity binding site was originally bound to LDAO-A shorter carbon chained molecule.
The equilibrated b2344 docking revealed that the high affinity binding site had fewer dockings than the other sites. This indicates that due to the vacancy of FAs during simulation, that the high affinity binding site was smaller and did not dock many FAs. Further investigation revealed that small positional changes in the high affinity binding pocket residues-particularly ALA153, ILE155, and LEU200 impeded the binding pocket channel, and greatly reducing the ability for FAs to fit in the binding pocket. Unlike the high affinity binding site, the S3 kink did appear to have comparable binding between the crystal struc-ture bound with detergents and the equilibrated structure. Additional investigation of the RMSD (data not shown) between the two S3 kink structures showed very small structural differences (<1.3 Å ) between the two molecules, indicating that there does not appear to be a conformational change in the S3 binding pocket, but instead a shift in the gated channel between the high affinity site and S3 kink as proposed by van den Berg [27]. The size and saturation of the FA did have an effect on the docking. Typically, the longer the FA carbon chains and more unsaturated, the affinity for Node 1 was increased and the affinity for Node 3 was decreased -mirroring the 1T16 dockings.
The vc1042 docking revealed a visible main channel. It is predicted that the FAs move from Node 1 to Node 2 to Node 4 and then to Node 3, the S3 kink ( Figure 6D). The clustering of FAs did not show much preference for any one of the four nodes with the exception of Node 3, where the totaled percent FAs located at Nodes 1, 2, 3, and 4 were 17.5%, 26.2%, 35.0%, and 21.3%, respectively. As expected of a V. cholerae homolog there was no discernable difference in FA tail length or saturation which is reasonable due to V. cholerae's ability to uptake long-chain fatty acids.
In vc1043's docking, Nodes 1 and 2 were in close proximity to one another as seen in Figure 6F, the major difference between the two being Node 2 is the locus of the high affinity binding site in the E. coli homolog. vc1043 showed a strong favoritism for Node 2 with a total of 88.3% of all FAs appearing in the Node 2 region. Figure 6F illustrates the FA's tendency to funnel around Node 2. Only 0.6% of FAs were found in the S3 kink region (Node 3), alluding to a conformational mechanism to allow passage of the FA.
The docking of vca0862 showed a high affinity for the outer portion of the S3 kink (Node 4). This is unexpected based on the premise that FAs travel through the beta barrel in E. coli. Node 4 does tend to have a more pronounced indention making docking more ideal than some other locations; however, the docking did not factor in the lipopolysaccharide (LPS) which encompassed the outer perimeter of the FadL beta barrel, which would leave little room for FAs. AutoDock's current atom limit prevents a system with LPS included. These results indicate that the vca0862 beta barrel was not in an open conformation for the 50 frames used for docking. This suggests that a conformational change may be necessary in order to allow a fatty acid to pass through. It is important to note that the lower C-score in the I-TASSER [26] folding, −1.08, may have a part in this resulting barrel structure, however, additional computational foldings and Molecular Dynamic runs using different programs (data not shown) were performed to test the viability of alternatively folded vca0862 sequences; none of which had any notable differences in the structure.

Eqiulibrated E. coli b2344 Fatty Acids show Similar Channels as the X-ray Crystal Structure 1T16
To determine any important residues in the transport of FAs, the docked homologs were searched for any residues within 3 Å of each of the docked FAs. These residues were agglomerated, and each residue found was counted for recurrences. The resulting table (Table 1) shows the twenty residues that were found to interact with the docked FAs most often.
For the E. coli b2344 homolog dockings, the residues found most frequently were those of the low affinity binding site and the S3 kink. This was expected, as the nodal analysis determined that the majority of FAs were docked in the Node 1 and 3 regions. While not in the same proportions, many of the same residues were found for both the b2344 1T16 and the equilibrated b2344 structures. Residues PRO253, ILE254, PRO255, and PHE315 have a reoccurring presence in the low affinity binding sites for both structures Figure 8A,C. Residues GLY2, LEU5, PRO54, VAL56, ALA74, GLY103, LEU104, ALA105, PRO362, and ARG366 are commonly found in the S3 kink region. The majority of these residues are nonpolar except for the polar glycines GLY2 and GLY103 and the positively charged arginine, ARG366. The arginine headgroup faces towards the S3 kink pocket indicating an affinity for carboxyl groups of FAs which is confirmed by the number of FA carboxyl headgroups in the proximity of ARG366 during docking. This could indicate an orientation of FA with the tail group facing the outlet before egress of FA through the S3 kink. The RMSD for the heavy atoms of these residues tends to be between 1.1Åand 1.7Å, alluding to a stable S3 kink structure even with the difference of a bonded LDAO in the S3 kink of 1T16.

V. cholerae vc1042 Docking Generates a Well Defined Transport Channel
The V. cholerae homolog vc1042 residues were primarily centered around the predicted transport channel Figure 9B. This channel tends to start from between the 5th and 6th extracellular loops ( Figure S6B). The FA is predicted to bypass the N-terminal hatch (residues 1 through 5), and then past the N-terminal hatch through the S3 kink opening. The N-terminal hatch in the docking does not appear to restrict transport in our model as it has been predicted for E. coli b2344 [27,37], as the docked FAs revealed a continuous channel starting from the extracellular space around the 5th and 6th extracellular loops and ending at the S3 kink. This is somewhat unexpected as, generally, an FA transport protein would have some selection mechanism specific to FAs. The residues that line the predicted channel are primarily hydrophobic, with a few exceptions (GLN4, HSD79, THR244, THR331, GLU336, and ARG399) which are all hydrophilic (ARG339 also having a positive charge). These residues are placed periodically throughout the channel in such a way that it could be the FA headgroup's attraction to these residues that guide the movement of the FA through the channel-possibly in a orientation specific manner. The predicted channel appears to end at the S3 kink in the same manner as the E. coli homolog. A similar computational study involving the docking of the bottom half of the homologs (data not shown) revealed that there was a discontinuity from the main channel to any docking channels found in the bottom of the protein reinforcing the hypothesis that the S3 kink is the FA egress point [37]. Oddly, the predicted channel for vc1042 shares a similar overlap of the E. coli homolog's high affinity binding site location and N-terminal hatch domain, but interestingly, the vc1042 pathway bypasses the predicted hatch domain pathway used in E. coli. It is interesting that the original pathway (through the high affinity binding site location, and then through a tunnel created by a conformational mechanism occurring with the N-terminal hatch domain) may still exist somehow in the V. cholerae homolog. Whether or not this N-terminal hatch pathway is vestigial or is functional has yet to be determined at this time computationally or experimentally.  shown with its relation to the displayed residues. The L3 loops is colored red, the L4 loop is colored orange, and the S3 kink is colored green for reference.

V. cholerae vc1043 Transport Channel Has a Discontinuity Revealing Multiple Predicted Pathways
V. cholerae homolog vc1043 was found to have a very large flat channel that seems to funnel FAs to the N-terminal hatch domain ( Figure 10). The entry of the channel can be seen in the supplemental material ( Figure S6C) where the undefined low affinity binding region is again found between the base of the extracellular loops. The channel leads to the N-terminal hatch opposite the S3 kink. The N-terminal hatch domain rests in the same position as the E. coli 1T16 structure and the channel overlaps the general area of the high affinity binding. This could indicate an evolutionary adaptation to combine the low affinity and the high affinity binding sites found in b2344 favoring a more direct pathway, but leaving the mechanisms of the N-terminal hatch domain-which would play the same role for the vc1043 homolog as it does for the E. coli homolog. This would require the N-terminal domain to act as a hatch that opens and closes for FA transport, unfortunately this mechanism at the atomistic level has not been elucidated for accurate prediction. Alternatively, the vc1043 channel spans further down than the transposed high affinity binding site ending below the N-terminal hatch and opposite the S3 kink. This could be an alternate pathway that follows the vc1042 pathway, but with some selection mechanism to cross the remainder of the channel. Again, this proposed pathway has yet to be substantiated from experimentation or computational study. The docked FAs did not appear in the S3 kink pore, likely due to LYS130 from the fourth beta strand, S4, positioned parallel to the S3 kink that appears to be attracted to GLU50, SER106, and ASN107 as well as the backbone oxygens of the S3 kink residue GLY109. This attraction causes LYS130 to fill the S3 transport pore and prevent docking (and possibly FA transport). This could be a selection mechanism that may determine the resulting FA position or FA type. The channel of vc1043 was found to be composed generally of hydrophobic residues. The exceptions to this are GLN4, THR127, TYR298, and SER338 which are hydrophilic, and ARG163 and LYS296 which are positively charged. Previously, it was postulated that the hydrophilic residues in vc1042 guided the FAs through the channel through hyrdophilic residue interactions, but vc1043 hydrophilic residues are restricted to the top of the conical channel. Therefore, there are no hydrophilic residues present in the vacant channel to guide the polar FA headgroup through the channel. It is predicted that the polar head groups bind to the hydrophilic residues at the top of the channel for alignment. Directional positioning of the FA is yet to be determined experimentally or in any in silico study, but FA orientation may play an important role with a positively charged lysine (LYS130) residue blocking the S3 kink pore. shown with its relation to the displayed residues. The L3 loops is colored red, the L4 loop is colored orange, and the S3 kink is colored green for reference.

V. cholerae vca0862 Docking Reveals a Transport Channel External to the Beta Barrel
The docking of V. cholerae homolog vca0862 revealed that the majority of docking sites did not occur within the computationally folded and equilibrated beta barrel structure of the FadL protein, but rather along the outer barrel primarily around the S3 kink ( Figure 11B). This appears to be due to the substantial lack of open space for FAs to be docked on the inner portion of the beta barrel. Oddly, there seems to be a pathway from between the L3 and L4 loops that goes down the side of the protein and to the outside of the S3 kink as shown in the supplemental material ( Figure S7). The results implicate that any molecule of similar size to a FA would be able to make its way through the side channel unless there was some interplay with the interface of the LPS and lipid bilayer to create some sort of selectivity mechanism. The channel between the outer portion of the S3 kink and the predicted initial binding sites between the L3 and L4 extracellular loops tends to close off depending on the L3 and L4 conformations. These L3 and L4 conformations may be the selectivity mechanism that this homolog uses to ensure the uptake of FAs instead of bactericidal compounds. Many of the docked FA were found within the S3 kink, where the internal cavity of the S3 kink would be vestigial if the FAs are transported to the predicted egress point without entry of the FA into the FadL beta barrel structure. Unless the s3 kink has an orientation mechanism to help FAs diffuse passively through the membrane. This vestigial S3 kink cavity agrees with the secondary docking of the bottom portion of the protein, where no FA pathways were found from the S3 kink to the periplasmic end of the FadL protein. shown with its relation to the displayed residues. The L3 loops is colored red, the L4 loop is colored orange, and the S3 kink is colored green for reference.

Simulations and Docking Agree the S3 Kink Is the Fatty Acid Point of Egress
To verify that the S3 kink is the egress point first suggested by Hearn et al. [37], the membrane layer location after equilibration was checked for the possibility of membrane diffusion. The resulting lipid bilayer headgroups or the polar heavy atoms of the LPS were shown in relation to the D3 kink pore (Figure 12). This pore was typically found at the upper portion of the LPS polar region indicating a strong affinity for the polar headgroups of the FA with the polar LPS residues, indicating a good possibility for assimilation into the LPS bilayer and passive diffusion into the periplasmic space. Additional docking studies (not shown) using the bottom portion of the FadL structures also revealed no such pathways through the lower portion of the N-terminus loop that fills the lower beta barrel for either V. cholerae or E. coli.

Docking Energies Show that Binding is Stronger in the Presence of Fatty Acids
Ten docking conformations were produced per ligand creating a total of 100 docking conformations per FadL homolog frame. With 250 frames docked, 25,000 conformations were generated overall. The best conforming (lowest energy) are shown in Table S4. Similarly, the averaged docking energies by FA are given in Table S5.
Simulated docking results indicate that the original crystal structure E. coli 1T16 tended to have the most energetically favorable docking with respect to overall average as well as the best individual FA docking conformations. This is likely because the 1T16 structure was generated with the FadL protein bound with LDAO and C8E4 in the structure when the PDB was generated, giving it the specific conformation needed for strong binding. It is also apparent that the docking energies are more favorable for the longer chain FAs (with exception to C8E4 which has a total length of 21 heavy atoms). This result is likely due to the fact that longer FA chains provide more surface area for binding. However, many of the longer chained polyunsaturated FAs tend to have a hairpin tail due to the cis unsaturated portions. The uptake of these longer chained FAs would likely require some internal mechanisms for FA uptake that compensates for these rigid sections of the FAs, although these compensation mechanisms have not been found computationally.

Discussion
The atomistic study of FadL homologs reveals the structure and transport channels of V. cholerae homologs vc1042, vc1043, and vca0862. The E. coli controls showed agreeable results when compared the X-ray crystallography study [27]. All homologs tended to share similarity in their low affinity binding site locus at the base of the L3 and L4 extracellular loops, but the structure of extracellular loops themselves tended to deviate from the original template. The reason for the deviations has has yet to be determined, but it may have an effect on nascent protein passage through the cell membrane, localization of the protein in the membrane with respect to the LPS, or as guide for FAs into to the transport channel.
The equilibration of b2344 in comparison to the X-ray structure (1T16) shows that there are conformational shifts in the binding sites for FAs that are closed without the presence of FAs. vc1043 shows this similarity, where it appears the presence of FA are required for the span between the observed channel and the S3 kink domain to activate and allow passage. The trajectories over 50 nanoseconds gives a wide range of conformations for the FA to propagate on the proteins, and for neither b2344 or vc1043 to reveal a contiguous channel through docking emphasizes the importance of the FA-protein interaction. This is not the case with vc1042, where the channel is fully expressed without FAs being present. The N-terminal hatch residues ALA1, GLY2, PHE3, and GLN4 were conserved throughout all of the homologs, as well as their tertiary structures and positions. These residues may play a part in conformational changes that allow selective passage of FAs through the cell [27,37]. However, the channels presented for the V. cholerae homolog vc1042 suggests the possibility that this structure can be vestigial in the transport of FAs, but retain the conserved sequence as part of the signal peptide sequence [38], the protease recognizing the signal sequence ALA-GLY-PHE-GLN as part of the processing site. This is subject to the protease responsible for the cleavage, which is has not been discovered at this time.
The resulting structure of vca0862 shows that the channel is on the exterior of the beta barrel. This may be due to the folded structure, but multiple foldings and equilibrations across different software did not predict any significant differences in the structure. Investigation using RefSeq [39] on FA uptake revealed that V. cholerae strains that have these homologs all contain a copy of vc1042, vc1043, (chromosome I) and vca0862 (chromosome II) (data not shown). vca0862 has not been shown in any studies to be the definitive protein responsible for FA transport, with many bioinformatic searches of FadL neglecting vc1042 and vc1043. Additionally, in studies with gene expression of V. cholerae strain N16961 between in vitro and in vivo, the expression of vca0862 was low compared to the other FadL sequences [40]. The relatively low expression of vca0862 and the lack of a suitable channel for FA transfer may indicate that this protein may be a result of a loss of function adaptation [41]. Interestingly the data presented by Xu et al. reveals that expression of vc1042 increases in vivo (in rabbits) in comparison to in vitro (growth in LB). The inverse was true for vc1043 with a reduction in expression in vivo. The effect of FA concentration in the lumen as opposed to the FA devoid LB, may be a selection mechanism for vc1042 which appears to have a larger, more complete, and less selective channel.
The atomistic structures of Vibrio cholerae FadL homologs were found and analyzed. The channels of these homologs bring to light the complex nature of biological systems and the diverse machinery that is used to adapt to environmental conditions. While each homolog has unique characteristics, the exact nature of each homolog is still unknown, and additional studies into these characteristics will shed light on the diversification and expanded uptake capacity for not only Vibrio species, but the growing list of Gram-negative bacteria demonstrating fatty acid utilization versatility.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/biom12091269/s1, Figure S1: I-TASSER folded structures. (A) the E. coli b2344 1T16 crystal strcuture compared to (B) vc1042 (C), vc1043, and (D) vca0862. L3 and L4 loops are colored red and orange respectively, and the S3 kink is green for reference. (E-H) are views of the N-terminal hatch domain (blue) for E. coli b2344,vc1042, vc1043, and vca0862 respectively. Figure S2: AlphaFold generated structures. The structures are colored by pLDDT scaling from 100 (blue) to 90 (white). Any residues that were below 90 were colored red. The only residues that had a pLDDT value of less than 80 were in the extracellular loops with the exception of ALA1 in homolog vc1043 with a pLDDT of 78.75. Figure S3: Aligned backbone structures generated by I-TASSER (green) and AlphaFold (blue). The red line indicates the approximate location of the start of the extracellular loops. Figure S4: 2-D representations of constituents of the V. cholerae membranes. Figure S5: RMSD of equilibrated systems over the 250 ns trajectory. Figure S6: Surface view of FA transport channel entrance points (shown in purple) for (A) b2344, (B) vc1042, (C) vc1043 FadL proteins after equilibration. The general protein surface is colored by secondary structure and the L3 and L4 loops are colored red and orange respectively. This perspective is opposite of the S3 kink. Figure S7. Surface view of FA exterior pathway (shown in purple) for vca0862 FadL protein after equilibration. The general protein surface is colored by secondary structure, the S3 kink is colored green, and the L3 and L4 loops are colored red and orange respectively. Table S1: Fatty acids tested during docking.

Funding:
We thank the support of NSF REU #1852042 and #2149956, and the internal support from the Department of Computer Science and Engineering and the Office of the Vice Chancellor of Research at the University of Tennessee at Chattanooga.