Asymmetric Cryo-EM Structure of Anthrax Toxin Protective Antigen Pore with Lethal Factor N-Terminal Domain

The anthrax lethal toxin consists of protective antigen (PA) and lethal factor (LF). Understanding both the PA pore formation and LF translocation through the PA pore is crucial to mitigating and perhaps preventing anthrax disease. To better understand the interactions of the LF-PA engagement complex, the structure of the LFN-bound PA pore solubilized by a lipid nanodisc was examined using cryo-EM. CryoSPARC was used to rapidly sort particle populations of a heterogeneous sample preparation without imposing symmetry, resulting in a refined 17 Å PA pore structure with 3 LFN bound. At pH 7.5, the contributions from the three unstructured LFN lysine-rich tail regions do not occlude the Phe clamp opening. The open Phe clamp suggests that, in this translocation-compromised pH environment, the lysine-rich tails remain flexible and do not interact with the pore lumen region.


Introduction
The lethality of anthrax, a zoonotic disease and bioterrorism agent, is due to the anthrax toxin. This tripartite toxin consists of protective antigen (PA), lethal factor (LF; a mitogen-activated protein kinase kinase protease), and edema factor (EF; an adenylate cyclase) [1]. After secretion from Bacillus anthracis, the 83 kDa PA (PA83) binds to its target host cell receptor, either capillary morphogenesis protein 2 (CMG2) or tumor endothelium marker-8 (TEM8) [1][2][3][4]. PA83 is cleaved by proteases, resulting in 20 kDa and 63 kDa fragments. PA63 then self-associates to form a heptameric PA prepore that can associate with up to three molecules of LF or EF [5]. Octameric PA prepores may also assemble in solution, governed by LF or EF binding to PA63 monomers clipped in solution [6]. Receptor-bound assembled complexes

Cryo-EM Sample Preparation of PA Pore with Three LFN Bound
With the recent publication of the cryo-EM PA pore structure at pH 5.0 [17], the logical, but challenging, next step in understanding anthrax toxin pore formation and translocation involves determining how bound LF influences the conformation of the PA pore. An atomic resolution structure of LFN-bound PA pore would give molecular insight into the nuances of this interaction. To solve the cryo-EM structure of the LFN-PA pore, several obstacles must be overcome, including the aggregation propensity of the pore, the asymmetry of the LFN-PA complex, and the orientational preferences of complexes on EM grids.
We previously published a methodology to assemble LFN-PA pore complexes while avoiding aggregation by immobilizing PA pores before solubilizing the hydrophobic tip with lipid bilayer nanodiscs [22][23][24][25]. After immobilization, the PA prepores were transitioned into pores using a urea/37 °C pulse methodology, exposing the aggregation-prone pore tip. The nanodisc formed around the hydrophobic pore tip while the complex was immobilized [22,23,25,26]. A schematic of this methodology is shown in Figure 1A. Our previous low-resolution LFN-PA-nanodisc structures were reconstructed from samples frozen on perforated carbon containing a thin carbon layer over holes [23]. There were a number of caveats limiting the structural analysis of that preparation. Most importantly, large-diameter nanodiscs (approximately 400 Å) were generated and required the use of thicker ice. In addition, LFN-PA-nanodisc complexes interacted with the carbon layer, resulting in complexes preferring a side view orientation, which displays the long axis of the heptameric PA pore, rather than allowing for more diverse orientations, including top views. Although these LFN-PA-nanodisc complexes were inherently structurally asymmetric (symmetry mismatch of seven PA subunits to a maximum of three LFN bound), their structures were generated by imposing sevenfold symmetry, which resulted in smearing of the LFN-bound density. This, coupled with the sample-induced constraints (thin carbon backing, thick ice, and Fresnel fringe effects for the sharp nanodisc-protein interface), diminished the contrast of the protein. These constraints also interfered with the visualization of the PA β-barrel in the reconstruction.
To obtain a better-defined LFN-PA-nanodisc complex structure, these sample preparation issues had to be overcome. For better contrast, samples were frozen on simple perforated carbon grids without a thin carbon layer in order to achieve greater orientational diversity and were imaged with a JEM 2200FS electron microscope (60,000× magnification) (JEOL, Peabody, MA, USA). A representative micrograph with high defocus for better contrast (for visualization, not reconstruction purposes) with individual particles highlighted with red circles is shown in Figure 1B. Low-dose, low-defocus conditions were used to collect images for 3D reconstruction. Notably, the nanodiscs for these samples were significantly smaller than those in the previous samples. The nanodisc size was dependent on the length of time that LFN-PA-nanodisc complexes were immobilized as well as on rotation of the sample tube. Under non-ideal conditions, the pre-nanodisc micelles may merge, generating larger nanodisc diameters. Interestingly, larger nanodiscs often resulted in multiple PA pore-inserted nanodisc complexes (e.g., sometimes four PA pores inserted into one large nanodisc). These larger nanodiscs were attributed to longer dialysis times that consistently resulted in merging of pre-nanodisc micelles. Reducing the time of incubation, ensuring adequate detergent dialysis with Bio-Beads, and constant rotation during formation yielded smaller nanodiscs within the expected diameter range (100-150 Å) containing a single PA pore (Figure 1B).

Single-Particle Analysis of LFN-PA-Nanodisc Complexes
Initial classification analysis using SPARX [27] revealed heterogeneity in the dataset with one, two, or three LFN bound to PA pores (Figure 2A-C). The release of LFN-PA-nanodisc complexes from the bead surface into solution also resulted in the release of non-complexed LFN, which was then able to bind released complexes, leading to particles with multiple binding events. This led to subsets of PA having one, two, or three LFN bound. This inherent heterogeneity in LFN binding stoichiometry made 3D reconstruction difficult. Initially, this limited particle dataset could only be used to obtain a model by imposing C7 symmetry during reconstruction using EMAN2.1 and RELION (Figure 2D).

While PA alone has C7 symmetry, LFN-bound PA in a saturated (three LFN bound) or subsaturated binding ratio possesses only C1 symmetry. The recent successful high-resolution reconstruction of the PA pore at pH 5.0 by Jiang et al. [17] was accomplished using, primarily, top and side view orientations that were generated by taking advantage of a grid adherence platform. In that sample preparation, the prepore adhered to the carbon layer through its receptor binding interface and the pore transition was accomplished by adjusting the solution to pH 5.0. Since the pore itself has an axis of sevenfold symmetry, the variable positioning of the side views of the PA pore on the carbon layer was sufficient to cover most of the orientational space needed to obtain the first high-resolution structure (2.9 Å) of the anthrax toxin pore translocon [17]. With LFN-PA-nanodisc complexes, the nanodisc insertion procedure permits orientational diversity, which is critical for obtaining a structure without imposing sevenfold symmetry. A direction distribution map, analogous to an Euler angle map, confirmed that the LFN-PA-nanodisc particles adopted diverse orientations (Figure 3). This diverse distribution is crucial for acquiring the asymmetric LFN-PA-nanodisc structures, since the imposition of sevenfold symmetry during 3D reconstruction distorts the density of any bound LFN (Figure 2D). CryoSPARC is well suited to rapidly obtaining unbiased, reproducible, and reliable ab initio 3D models even when extensive sample heterogeneity is present [28,29]. For example, Ripstein et al. [30] reexamined their previous cryo-EM data of the Thermus thermophilus V/A-ATPase using cryoSPARC and determined that their ATPase sample was actually populated by multiple conformations that were previously unresolved, resulting in new mechanistic insights.
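The effect of imposing sevenfold symmetry on an asymmetric complex can be illustrated with a toy calculation. The sketch below is not the reconstruction procedure used here; it assumes a simplified ring of seven candidate LFN sites (one per PA-PA interface) with a hypothetical arrangement of three occupied sites, and averages that ring over all seven rotations. Function names are illustrative only.

```python
from fractions import Fraction


def rotate(ring, k):
    """Return the ring rotated by k positions."""
    k %= len(ring)
    return ring[-k:] + ring[:-k]


def impose_cn(ring):
    """Average a ring over all n rotations, mimicking imposed Cn symmetry."""
    n = len(ring)
    copies = [rotate(ring, k) for k in range(n)]
    return [Fraction(sum(c[i] for c in copies), n) for i in range(n)]


# Hypothetical occupancy: 1 = LFN present at that interface, 0 = empty.
asymmetric = [1, 0, 1, 0, 1, 0, 0]
symmetrized = impose_cn(asymmetric)

# Every site ends up with the same fractional occupancy (3/7),
# so the three discrete LFN densities are smeared around the ring.
print([str(v) for v in symmetrized])
```

In this simplified picture the total occupancy is conserved, but no site retains a full LFN density, which is consistent with the distorted LFN density described for the C7-symmetrized model.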
To separate the heterogeneous LFN-PA-nanodisc particles, an initial 2D classification was performed on the 30,696 particles, with bad classes removed by eye (Figure 4A). An ab initio classification with four groups was then performed on the remaining 18,806 good particles (Figure 4B). Four groups were chosen since two LFN can bind to PA at neighboring binding sites or with an empty binding site between them, resulting in 1LFN, 2ALFN, 2BLFN, or 3LFN bound. Group 2 was the most highly populated group identified by the cryoSPARC stochastic gradient descent (SGD) ab initio model generation, with three distinct and equal LFN densities (Figure 4B). Further 2D classification was performed on all four groups to assess the quality of particles within each group (Figure 4C). Group 3 contained several highly populated classes showing sharp sevenfold symmetric top and bottom views. Particles in Groups 1 and 4 did not result in clear classes and were discarded (Figure 4C, top and bottom panels). Since the top and bottom view classes in Group 2 were underrepresented, all particles from Group 2 (4560) and particles from the good classes in Group 3 (1159) were combined. A homogeneous refinement was run with the Group 2 ab initio model and the combined good particle set (Figure 4D).
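The choice of four ab initio groups can be rationalized by counting the distinct ways LFN can occupy a heptameric pore. The following is a minimal sketch under a stated assumption: each LFN is modeled as contacting two adjacent PA protomers on a seven-membered ring, no protomer is shared by two LFN, and arrangements related by rotation are treated as identical. The function names are hypothetical and this is not part of the cryoSPARC workflow.

```python
from itertools import combinations


def covered(starts, n):
    """Protomers contacted when each LFN spans protomers i and i+1 (mod n)."""
    cells = []
    for s in starts:
        cells += [s % n, (s + 1) % n]
    return cells


def canonical(starts, n):
    """Rotation-invariant label: the smallest sorted tuple over all rotations."""
    return min(tuple(sorted((s + r) % n for s in starts)) for r in range(n))


def distinct_arrangements(n_protomers, n_lfn):
    """Non-overlapping LFN placements on a ring, counted up to rotation."""
    seen = set()
    for starts in combinations(range(n_protomers), n_lfn):
        cells = covered(starts, n_protomers)
        if len(set(cells)) == len(cells):  # no protomer shared by two LFN
            seen.add(canonical(starts, n_protomers))
    return sorted(seen)


counts = {k: len(distinct_arrangements(7, k)) for k in range(1, 5)}
print(counts)  # {1: 1, 2: 2, 3: 1, 4: 0}

# Particle bookkeeping quoted in the text: Group 2 plus good Group 3 classes.
combined_particles = 4560 + 1159
print(combined_particles)  # 5719
```

Under this model there is one arrangement with a single LFN, two with two LFN (neighboring, or separated by a free protomer), and one with three LFN, giving four expected populations. A fourth LFN cannot be accommodated on a heptamer.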
The homogeneous refinement resulted in a 17 Å 3LFN-PA pore model from 5719 particles. Figure 5 shows the Fourier shell correlation (FSC) curve used to calculate the resolution. The resulting reconstruction was not biased by outside models or symmetrization operations. The β-barrel pore of PA was not prominent in the ab initio model but became more apparent upon cryoSPARC refinement. The bulge in the β-barrel of the final model was also seen in the cryo-EM structure of the PA pore alone, where this hydrophobic region of the outer barrel bound lipids, resulting in the accumulation of additional density [17].
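For readers unfamiliar with how a resolution value is read off an FSC curve, the sketch below converts the first Fourier shell that falls below a cutoff into a resolution in Å. The 0.143 cutoff, the box size, the pixel size, and the example curve are all assumptions chosen for illustration; the text does not state the criterion or the sampling used for the 17 Å map.

```python
def resolution_from_fsc(fsc, box_size_px, pixel_size_A, threshold=0.143):
    """Resolution (in Å) at the first shell where the FSC drops below threshold.

    fsc[i] is the correlation in Fourier shell i (shell 0 is the DC term).
    Shell i corresponds to a spatial frequency of i / (box_size_px * pixel_size_A),
    so its resolution is box_size_px * pixel_size_A / i.
    """
    for i in range(1, len(fsc)):
        if fsc[i] < threshold:
            return box_size_px * pixel_size_A / i
    return None  # curve never drops below the threshold in the sampled shells


# Hypothetical FSC curve and sampling, for illustration only.
toy_fsc = [1.0, 0.9, 0.6, 0.3, 0.1, 0.0]
toy_resolution = resolution_from_fsc(toy_fsc, box_size_px=60, pixel_size_A=2.0)
print(toy_resolution)  # 30.0
```

A finer estimate would interpolate between the two shells that bracket the cutoff; the simple first-crossing rule is kept here for clarity.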
As can be seen in the 2D classification (Figure 4C, second panel), side view images reveal variation in either nanodisc size or electron density. This resulted in a lack of nanodisc structure in the final electron density map. The irregular density at the bottom of the pore tip in the final structure can be attributed to either the presence of nanodisc or free lipid binding to exposed hydrophobic residues. As mentioned previously, the decrease in nanodisc density appears to be due to extended dialysis times during micelle-to-nanodisc collapse. The decreased nanodisc size did not diminish our ability to reconstruct LFN-PA pore complexes, particularly in the PA pore cap and the initial extension of the β-barrel.


Constructing Samples with Highly-Populated Singly-Bound LFN-PA for Cryo-EM
The heterogeneity of this sample preparation was due to the stepwise assembly of LFN-PA complexes, shown above in Figure 1A. LFN was immobilized onto thiol sepharose beads, then PA prepore was added, binding to the LFN. The bulkiness of PA relative to LFN blocked PA from binding to multiple LFN. After LFN-PA-nanodisc complexes were formed on the beads, they were released into solution. Any unbound LFN was also released and, due to its high affinity for PA, bound to open binding sites of PA (Figure 1A). To obtain a larger, more homogeneous LFN-bound PA pore particle set, the protocol was modified by pre-incubating LFN with PA prepore in a 1:2 ratio to ensure a higher population of singly-bound LFN-PA. A schematic of the updated protocol is shown in Figure 6A. As proof of principle for future structure determinations, an initial cryo-EM screen of complexes isolated with this new protocol was performed. Figure 6B shows a representative screening image collected on an F30 twin TEM (FEI, Hillsboro, OR, USA) at 39,000× nominal magnification and a pixel size of 3 Å on the specimen. 2D class averaging with SPARX (side views shown in Figure 7) showed that the majority of the classified populations had single LFN densities. As with all preparations using the immobilized construction of LFN-PA pore complexes, the elution volume is easily adjusted to obtain a sufficient concentration of particles on the grid for automated screening with a high-powered microscope equipped with a direct electron detector.
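Why a substoichiometric LFN:PA ratio should favor singly-bound complexes can be sketched with a deliberately simple occupancy model. This is an illustration, not a measured distribution: it assumes the 1:2 ratio means an average of 0.5 LFN per heptameric prepore, that binding events are independent and essentially irreversible, and that the number of LFN per pore is Poisson-distributed with anything at or above three lumped into the saturated state.

```python
import math


def occupancy_distribution(mean_lfn_per_pore, max_sites=3):
    """Probabilities of 0, 1, ..., max_sites LFN per pore (toy Poisson model)."""
    probs = [
        math.exp(-mean_lfn_per_pore) * mean_lfn_per_pore ** k / math.factorial(k)
        for k in range(max_sites)
    ]
    probs.append(1.0 - sum(probs))  # counts at or above saturation
    return probs


def singly_bound_fraction(mean_lfn_per_pore):
    """Fraction of LFN-bearing pores that carry exactly one LFN."""
    probs = occupancy_distribution(mean_lfn_per_pore)
    return probs[1] / (1.0 - probs[0])


half = singly_bound_fraction(0.5)  # assumed reading of the 1:2 LFN:prepore ratio
one = singly_bound_fraction(1.0)   # hypothetical 1:1 ratio, for comparison
print(round(half, 3), round(one, 3))
```

In this toy model, lowering the LFN:prepore ratio raises the share of singly-bound complexes among all LFN-bearing pores, which is qualitatively in line with the predominance of single LFN densities in the 2D class averages.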

Molecular Dynamics Flexible Fitting of 3LFN-PA Pore Model into the 17 Å Cryo-EM Density Map
The refined 17 Å cryo-EM model of 3LFN-PA-nanodisc generated by cryoSPARC has several interesting asymmetric features (Figure 8). As mentioned previously, there are three LFN densities. A molecular dynamics flexible fitting (MDFF) of 3LFN-PA pore docked three LFN, in pink, magenta, and purple, in between subunit interfaces of PA, as was seen previously in the prepore crystal structure of 4LFN-8PA and confirmed by 15 Å cryo-EM structures using the complete LF-PA prepore structure [16,20]. Previous work has shown the N-terminal tail of LFN feeds into the pore lumen and interacts with the Phe clamp. A cross-section of the model, shown in Figure 9, reveals that the narrowing of the pore lumen is consistent with the positioning of the Phe clamp region in the MDFF model. Curiously, this pH 7.5 low-resolution triply-bound LFN-PA pore structure shows an open pore region, in contrast to the closed densities observed for the previous lower-resolution, sevenfold symmetrized structures [22].

A comparison of the MDFF atomic structure filtered to 17 Å with the 17 Å cryo-EM-derived 3LFN-PA pore structure showed surface details that were visually indistinguishable (Figure 10). For example, the top view of the cryo-EM 3LFN-PA structure showed that LFN has a distinctive bean shape (Figure 10A). A top view of the space-filling PDB structure of LFN bound to the prepore also had this characteristic shape [16]. A small protrusion from the PA pore cap where LFN is absent was also present in both models. Unlike the MDFF structure, the domain 4 regions of the cryo-EM-derived structure are not equal in density, suggesting that these regions are dynamic, as was previously observed by Jiang et al. [17]. It is also important to note that not all surface regions in the cryo-EM reconstruction are filled by the MDFF analysis. For example, the β-barrel bulge that is due to lipid binding is not reproduced in the fit structure, since such a bulge in the highly stable β-barrel is energetically unfavorable.
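The comparison described here was visual. One common way to put a number on the agreement between a filtered model map and an experimental map is a voxel-wise correlation coefficient; the sketch below shows that calculation on flattened voxel lists. It is offered only as an illustration of the idea, was not part of the analysis reported here, and the toy voxel values are hypothetical.

```python
import math


def map_correlation(map_a, map_b):
    """Pearson correlation between two equally sized maps given as flat voxel lists."""
    if len(map_a) != len(map_b) or not map_a:
        raise ValueError("maps must be non-empty and of equal size")
    n = len(map_a)
    mean_a = sum(map_a) / n
    mean_b = sum(map_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(map_a, map_b))
    var_a = sum((a - mean_a) ** 2 for a in map_a)
    var_b = sum((b - mean_b) ** 2 for b in map_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant map")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical voxel values standing in for a filtered model map and an EM map.
model_map = [0.0, 0.2, 0.9, 1.0, 0.4, 0.1]
em_map = [0.1, 0.3, 0.8, 1.1, 0.5, 0.0]
print(round(map_correlation(model_map, em_map), 3))
```

Regions present in only one map, such as the lipid-associated β-barrel bulge, would lower such a score locally even when the overall shapes agree.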

Discussion
Atomic resolution cryo-EM is a rapidly evolving structural method that can be applied to examine the atomic consequences of LFN interactions with the PA pore. The ability to generate soluble, lipid-stabilized LFN-PA pore structures, even in this low-resolution model, is a critical first step in demonstrating that we can obtain structural snapshots of this complex.

Sample Preparation of Highly Pure Complexes
One of the main thrusts of this work has been to demonstrate that we can routinely obtain highly pure engagement complexes (multiply- or singly-bound LFN) using an immobilization bead-based protocol and nanodisc technology, without using columns to purify the final complexes [23,25] and while minimizing detergent influences on structure [31,32]. Even at 17 Å resolution, the variability of the domain 4 densities for the LFN-PA pore indicates this region is intrinsically flexible [17], ruling out the possibility that this flexibility is due to grid adherence constraints. Although it is possible that the insertion of the tip region into an authentic lipid bilayer (e.g., a nanodisc) may result in more ordered structures, better nanodisc resolution is required to make this assessment [33][34][35]. Previously, protein-bilayer interactions in nanodiscs have been noted to result in extended β-barrel protein structures (approximately two residues per strand) compared with detergent-solubilized structures [35].

Initial Cryo-EM Model of 3LFN-PA Pore
The cryo-EM density map structure was created without imposing symmetry or biasing towards an initial input model using the cryoSPARC ab initio reconstruction and subsequent refinement procedures. This 17 Å 3LFN-PA pore model showed three distinct LFN densities. In agreement with what was observed previously, the LFN densities are positioned between two protomer interfaces of the PA pore [16,20]. The main contact points are on the crest of the pore and in the α-clamp. Only three LFN are able to bind to a heptameric pore, leaving one protomer without any direct LFN contacts.
A cross-section through the EM density map showed the location of the pore opening, complete with the narrowing of the pore lumen. An MDFF fit starting from the atomic resolution pore structure with LFN bound positions this narrowing region at the Phe clamp loop region and preserves the opening at the Phe clamp annulus. While the number of particles and the resulting resolution of this current cryo-EM density map do not allow us to definitively define structural details of the pore lumen, it would be of interest to determine if the pore remains in a more open configuration at pH 7.5 when one or three LF monomers are bound. This further highlights the need to obtain high-resolution structures of the PA pore with one or more LF bound to determine if the Phe clamp region remains more open under these conditions. As mentioned previously, the presence of interfering electrostatic interactions appears to lead to a more open pore structure. Notably, this open pore diameter has been suggested by Das and Krantz to be necessary in order to accommodate α-helical regions during translocation at pH 5.0. These atomic resolution structures will be key to determining if varying ratios of LF bound (i.e., one vs. three) induce significant structural asymmetry (variable positioning of the Phe clamp) or concerted symmetry (all open) on the PA pore structure.
It is not uncommon to observe both small- and large-scale symmetry breakage of ordered oligomers induced by protein-protein interactions. For example, structures of protein substrate and nucleotide interactions with GroEL, a tetradecameric ring chaperonin protein, show very discernible asymmetric adjustments due to protein substrate interactions [36,37], as well as ATP binding and hydrolysis [38]. A more dramatic demonstration of ligand-induced distortion of symmetry is observed for the ATP-bound vs. ADP-bound ATPase unfolding machinery of the valosin-containing protein-like ATPase (VAT) recently resolved by cryo-EM [39]. In this instance, the hexameric structure was dramatically distorted in the presence of ADP and appeared to coincide with its ATP/ADP conformational switching mechanism to provide a conformational platform that unfolds proteins prior to degradation.
It would be of great interest to compare singly bound and multiply bound LF N -PA pore structures under different pH conditions in order to discern any distinct structural differences that may result from the pH environment. Observing these different states of the engagement complex (pH 5.0 vs. pH 7.5, 1 LF N vs. 3 LF N ) would be useful in determining the position of the Phe clamp loop region and potentially defining unstructured regions of the LF N that may become structured upon binding to the pore prior to translocation at pH 5.0. Existing crosslinking studies by the Collier group indicate that this interaction is present at pH 5.0 [13]. Thus, there is precedent for this interaction, and cryo-EM structure collection experiments at pH 5.0 are currently underway. In all cases, given the intrinsic stability of the extended β-barrel at pH 5.0 and pH 7.0, it is highly unlikely that the β-barrel region will be structurally altered when LF N binds to the PA pore cap region. Rather, the more flexible parts of the PA pore (i.e., the cap region, Phe clamp region, etc.) will be highly susceptible to LF N -induced conformational changes. How LF structurally impacts translocation and pore formation may be manifested through long-range allosteric effects.

Conclusions
Understanding both PA pore formation and LF translocation through the PA pore is crucial to mitigating, and perhaps preventing, anthrax disease. To better understand the interactions between LF N and the PA pore, the structure of LF N -bound PA pore was examined using cryo-EM. The 17 Å structure of PA pore with 3 LF N bound was the result of pore immobilization, nanodisc solubilization, ab initio modeling, and refinement. In this pH 7.5 structure, the contributions from the three unstructured LF N lysine-rich tail regions do not occlude the Phe clamp opening, indicating these flexible tails remain unstructured and unresolved. The next structures to examine are the LF N -PA pore complexes at pH 5.0 to determine if the unstructured LF N-terminal tails interact with the Phe clamp.

Protein Expression and Purification
Recombinant wild-type (WT) PA was expressed in the periplasm of Escherichia coli BL21 (DE3) and purified by anion exchange chromatography [40] after activation of PA with trypsin [41]. QuikChange site-directed mutagenesis (Stratagene) was used to introduce mutations into the plasmid (pET SUMO (Invitrogen)) encoding a truncated recombinant portion of lethal factor, LF N . LF N E126C was expressed as His 6 -SUMO-LF N , which was later cleaved by SUMO (small ubiquitin-related modifier) protease, revealing the native LF N E126C N-terminus [41]. Membrane scaffold protein 1D1 (MSP1D1) was expressed from the pMSP1D1 plasmid (AddGene) with an N-terminal His-tag and was purified by immobilized Ni-NTA affinity chromatography as previously described [42].

Formation of LF N -PA-Nanodisc Complexes
Heterogeneous LF N -PA-nanodisc complexes were formed and purified as previously described [22,25]. In brief, E126C LF N was immobilized by coupling E126C LF N to activated thiol sepharose 4B beads (GE Healthcare Bio-Sciences, Pittsburgh, PA, USA) in Assembly Buffer (50 mM Tris, 50 mM NaCl, pH 7.5) at 4 °C for 12 h. One hundred microliters (100 µL) of 0.2 µM heptameric WT PA prepore was then added to 50 µL of LF N bead slurry. Beads were washed three times with Assembly Buffer to remove any unbound PA prepores. The immobilized LF N -PA prepore complexes were then incubated in 1 M urea (Thermo Fisher Scientific, Waltham, MA, USA) at 37 °C for 5 min to transition the PA prepores to pores. After three more washes with Assembly Buffer, pre-nanodisc micelles (2.5 µM MSP1D1, 162.5 µM 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC) (Avanti, Alabaster, AL, USA) in 25 mM Na-cholate (Sigma-Aldrich, St. Louis, MO, USA), 50 mM Tris, and 50 mM NaCl) were added and bound to the aggregation-prone hydrophobic transmembrane β-barrel of PA. The micelles were collapsed into nanodiscs by removing Na-cholate using dialysis with Bio-Beads (Bio-Rad, Hercules, CA, USA) as previously described [43]. Soluble complexes were released from the thiol sepharose beads by reducing the E126C LF N -bead disulfide bond using 50 mM dithiothreitol (DTT) (Goldbio, St. Louis, MO, USA) in Assembly Buffer. To select for LF N -PA-nanodisc complexes, the released complexes were then incubated with Ni-NTA resin (Qiagen, Germantown, MD, USA). The His-tag on the MSP1D1 construct bound to the resin. Complexes were eluted from the Ni-NTA using 200 mM imidazole (Sigma-Aldrich, St. Louis, MO, USA) in Assembly Buffer. Assembled complexes were initially confirmed using negative-stain TEM.
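The stoichiometry implied by the concentrations quoted above can be checked with a short calculation. The following Python sketch uses only the values stated in the protocol; the function names are hypothetical, and the figure of two MSP1D1 copies per nanodisc is an assumption based on the commonly described nanodisc architecture, not a value reported in this section.

```python
# Concentrations and volumes quoted in the protocol above.
MSP1D1_UM = 2.5        # scaffold protein, micromolar
POPC_UM = 162.5        # lipid, micromolar
CHOLATE_UM = 25_000.0  # 25 mM Na-cholate expressed in micromolar
PA_PREPORE_UM = 0.2    # heptameric PA prepore, micromolar
PA_VOLUME_UL = 100.0   # volume of prepore added to the beads, microliters

# Assumption (not stated in the text): two MSP1D1 belts wrap one nanodisc.
MSP_PER_DISC = 2


def molar_ratio(a_um: float, b_um: float) -> float:
    """Molar ratio a:b for two concentrations given in the same units."""
    return a_um / b_um


def picomoles(conc_um: float, volume_ul: float) -> float:
    """Amount in pmol, since (micromol/L) x (microliter) = picomol."""
    return conc_um * volume_ul


lipid_per_msp = molar_ratio(POPC_UM, MSP1D1_UM)
lipid_per_disc = lipid_per_msp * MSP_PER_DISC
cholate_per_lipid = molar_ratio(CHOLATE_UM, POPC_UM)
heptamer_pmol = picomoles(PA_PREPORE_UM, PA_VOLUME_UL)
monomer_pmol = heptamer_pmol * 7

print(f"POPC per MSP1D1:      {lipid_per_msp:.0f}")
print(f"POPC per nanodisc:    {lipid_per_disc:.0f} (assuming 2 MSP1D1 per disc)")
print(f"Cholate per POPC:     {cholate_per_lipid:.0f}")
print(f"PA heptamer added:    {heptamer_pmol:.0f} pmol")
print(f"PA 63 monomers added: {monomer_pmol:.0f} pmol")
```

Under these inputs the lipid-to-scaffold ratio is 65:1, and the detergent is in large molar excess over lipid, which is consistent with the micelles remaining soluble until the cholate is removed.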
Homogeneous 1LF N -PA pore complexes were produced using a modified protocol where E126C LF N and PA were incubated in solution at a ratio of 1LF N :2PA prior to immobilization to reduce the number of complexes with multiple bound LF N . In this particular instance, affinity purification with Ni-NTA resin was omitted to minimize sample loss and homogeneous samples were still obtained.

Image Analysis and 3D Reconstruction
The 650 raw micrographs obtained at Baylor were evaluated using EMAN2.1 [44]. At the early evaluation stage, around 250 of these micrographs were rejected due to either gross contamination or charging artifacts visible in the Fourier transforms. A total of 30,696 particles were manually boxed out using the e2boxer.py routine of EMAN2.1 with a box size of 224 × 224 pixels. The data evaluated with EMAN2.1 and RELION showed a heterogeneous population of single, double, and triple LF N -bound PA. Due to this heterogeneity, it was difficult to use earlier versions of RELION with this smaller dataset to produce a model without imposing C7 symmetry. The approximately 30,000 particles were reevaluated using cryoSPARC (version 0.5). First, 2D class averaging was performed (Figure 5A). Bad classes (e.g., unrecognizable densities, density envelopes smaller than predicted, etc.) were visually identified and discarded. Using the remaining 18,806 good particles, an ab initio reconstruction using the cryoSPARC SGD was carried out to computationally purify the dataset into subsets containing one, two, or three bound LF N . This computation was performed with the following settings: four groups, a group similarity factor of 0.2, and 10-fold the default iterations.
The SGD algorithm allows for ab initio structure determination that is insensitive to initial model inputs. An arbitrary computer-generated random initialization model improves over many noisy model iterations. Each step is based on the gradient of the approximated objective function obtained with a random selection of a small batch of initial particles. These approximate gradients do not exactly match the "overall optimization objective" (best ab initio model) but through multiple rounds, the derived models gradually approach this maximum. As stated by Punjani, Brubaker, and colleagues, "the success of SGD is commonly explained by the noisy sampling approximation allowing the algorithm to widely explore the space of all 3D maps to finally arrive near the correct structure" [28,29]. In contrast to using the entire dataset for initial model reconstruction, cryoSPARC samples random subsets of the images during its rapid iteration processes.
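The idea of improving a random initial model through noisy, small-batch gradient steps can be illustrated with a toy example. The Python sketch below is not the cryoSPARC algorithm: it ignores particle orientations, the contrast transfer function, and multiple classes, and it uses a simple least-squares objective chosen only for illustration. All names and parameter values are hypothetical. It shows only that gradients computed from random subsets of noisy images are enough to move an arbitrary starting model toward the underlying signal.

```python
import numpy as np


def sgd_average(images, n_iters=500, batch_size=16, step=0.1, seed=0):
    """Estimate a common signal from noisy copies using minibatch SGD.

    Toy objective: 0.5 * mean_i ||model - image_i||^2.
    This is an illustrative stand-in, not cryoSPARC's likelihood.
    """
    rng = np.random.default_rng(seed)
    n_images, n_pixels = images.shape
    # Arbitrary random initialization, independent of the data.
    model = rng.normal(size=n_pixels)
    for _ in range(n_iters):
        # Random small batch of images for this step.
        idx = rng.choice(n_images, size=batch_size, replace=False)
        # Approximate gradient of the objective from the batch only.
        grad = model - images[idx].mean(axis=0)
        model = model - step * grad
    return model


# Synthetic data: one hidden 1D "structure" observed 2000 times with heavy noise.
data_rng = np.random.default_rng(1)
truth = np.sin(np.linspace(0.0, 2.0 * np.pi, 64))
images = truth + data_rng.normal(scale=2.0, size=(2000, 64))

estimate = sgd_average(images)
correlation = np.corrcoef(estimate, truth)[0, 1]
print(f"correlation with hidden signal: {correlation:.3f}")
```

Each step sees only 16 of the 2000 images, so individual gradients are noisy, yet the estimate ends up strongly correlated with the hidden signal regardless of the random starting point.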
The ab initio model with three clearly resolved LF N densities contained the largest percentage of particles (44.9%). The second most populated class (20.1%) appeared to contain one prominent LF N density with the hint of a second bound LF N , but more particles are required to achieve definition (Figure 11B, column 1). After the ab initio model was generated, a homogeneous refinement was performed with 100 additional passes using the cryoSPARC branch-and-bound maximum likelihood optimization algorithm. The final cryo-EM map resolution was estimated to be 17 Å based on Fourier shell correlation (FSC) with a cutoff of 0.5 (Figure 5). Ab initio group 1, with the second highest percentage (20.1%), had one LF N density at a lower volume threshold. However, further processing of the potential single-bound LF N revealed added density on the PA pore cap from a mixture of one and two LF N populations (Figure 11C, column 1). More particles are needed to populate this distribution before definitive single or double LF N -bound structures can be obtained.
Figure 11. CryoSPARC data analyses parsed out heterogeneous LF N -PA-nanodiscs: (A) image projection of heterogeneous ab initio reconstruction with four groups; the largest group, with 44.9% of particles, corresponds to 3LF N ; (B) ab initio 3D models (side views); and (C) homogeneous refinements of ab initio group 1 and group 2. Group 2 refined to an 18 Å model of 3LF N -PA from 4732 particles. Group 1 clearly shows missing density in the cap region and will need more particles to determine whether this structure contains sub-saturated populations (i.e., one or two LF N bound) of LF N bound to the PA pore structure or whether this group will split further into separate one vs. two LF N -bound populations.
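The FSC-based resolution estimate quoted above can be sketched in a few lines. The following Python example is a minimal illustration of how an FSC curve is computed from two independently reconstructed half maps and how a resolution is read off at the 0.5 cutoff. It is not the cryoSPARC implementation; the function names, the synthetic Gaussian test maps, and the 2.0 Å voxel size are assumptions made for the example, since the pixel size is not given in this section.

```python
import numpy as np


def fourier_shell_correlation(half1, half2):
    """FSC per integer-radius shell for two cubic maps of identical shape."""
    if half1.shape != half2.shape:
        raise ValueError("half maps must have the same shape")
    n = half1.shape[0]
    f1 = np.fft.fftshift(np.fft.fftn(half1))
    f2 = np.fft.fftshift(np.fft.fftn(half2))
    # Integer shell index of every voxel, measured from the zero frequency.
    coords = np.indices(half1.shape) - n // 2
    radius = np.sqrt((coords ** 2).sum(axis=0)).round().astype(int)
    n_shells = n // 2
    mask = radius < n_shells
    shells = radius[mask]
    cross = np.bincount(shells, weights=(f1 * np.conj(f2)).real[mask],
                        minlength=n_shells)
    power1 = np.bincount(shells, weights=(np.abs(f1) ** 2)[mask],
                         minlength=n_shells)
    power2 = np.bincount(shells, weights=(np.abs(f2) ** 2)[mask],
                         minlength=n_shells)
    return cross / np.sqrt(power1 * power2)


def resolution_at_threshold(fsc, box_size, voxel_size_angstrom, threshold=0.5):
    """Resolution (in angstroms) at the first shell where FSC < threshold."""
    below = np.nonzero(fsc < threshold)[0]
    if below.size == 0 or below[0] == 0:
        return None  # curve never crosses, or crosses at the zero frequency
    # Shell k corresponds to spatial frequency k / (box_size * voxel_size).
    return box_size * voxel_size_angstrom / below[0]


# Synthetic half maps: the same Gaussian blob plus independent noise.
n = 32
noise_rng = np.random.default_rng(0)
grid = np.indices((n, n, n)) - n // 2
signal = np.exp(-(grid ** 2).sum(axis=0) / (2.0 * 3.0 ** 2))
half_a = signal + noise_rng.normal(scale=0.1, size=signal.shape)
half_b = signal + noise_rng.normal(scale=0.1, size=signal.shape)

fsc = fourier_shell_correlation(half_a, half_b)
resolution = resolution_at_threshold(fsc, box_size=n, voxel_size_angstrom=2.0)
print("FSC by shell:", np.round(fsc, 2))
print("Resolution at FSC = 0.5 (angstroms):", resolution)
```

The curve stays near 1 at low spatial frequency, where both half maps share signal, and falls toward 0 where only independent noise remains; the reported resolution is the reciprocal of the spatial frequency at the crossing.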
The cryoSPARC 3D reconstruction software tool (Structura Biotechnology, Toronto, ON, Canada) was run on a single workstation (Nova 2 Model: 2 × NVIDIA Titan Xp GPU, Intel Xeon E5-1630v4 (4-core 3.7 GHz CPU), 64 GB DDR4-2400 RAM, Intel 1.2 TB SATA solid state drive for runtime cache, and 4 × 4 TB Seagate SATA HDDs) purchased from Silicon Mechanics (Bothell, WA, USA) and housed in the Fisher Laboratory. One of the main advantages of using cryoSPARC in combination with this computer system is the reduced computational time. What once took days or weeks of computation now takes only minutes or hours [29]. For example, as this paper was being written, the latest version of cryoSPARC was released (upgrade from 0.41 to 0.5). All data collected at Baylor were reanalyzed with the newer version as a test for reproducibility in the span of 4 h (from reevaluating 2D classification, removing poor particles, etc.), and the final output ab initio models, reevaluated 2D class averages from separated populations, and refined structures were reproduced using the single workstation described above. The use of SGD algorithms to generate ab initio models is now being beta tested or implemented in other software packages.

Molecular Dynamics Flexible Fitting of 3LF N -PA
A molecular model was fit into the cryo-EM density map using molecular dynamics flexible fitting (MDFF) methods [45], which apply an additional potential derived from the density map to the molecules. The starting molecular model was built by rigid docking of three LF N (PDB 3KWV) onto the PA pore cap (PDB 3J9C). The cryo-EM density map and initial molecular model were spatially aligned in Sculptor [46,47]. The density map was then converted from MRC to Situs format for compatibility with the Visual Molecular Dynamics (VMD) software suite. The atomic model and density map files were prepared for MDFF fitting in VMD by the typical MDFF tutorial progression [47,48]. The model was minimized for 2000 steps and then simulated for 50 ps at 300 K in vacuum. The grid-scaling factor, which controls the relative strength of the MDFF potential, was set to 0.3. Figure 10 compares the 17 Å filtered MDFF structure with the 17 Å cryo-EM derived structure to show distinct similarities in surface topologies [46,48,49].
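To make the role of the grid-scaling factor concrete, the Python sketch below evaluates the density-derived potential in the form given in the original MDFF description: the potential equals the scaling factor wherever the density is below a threshold and decreases linearly to zero at the density maximum. This is an illustrative re-implementation, not the VMD or NAMD code; the function name, the threshold of 0, and the example density values are assumptions.

```python
import numpy as np


def mdff_potential(density, grid_scale=0.3, threshold=0.0):
    """Per-voxel MDFF-style potential derived from an EM density map.

    V = grid_scale * (1 - (phi - threshold) / (phi_max - threshold))
        where phi >= threshold, and V = grid_scale elsewhere.
    Low potential at high density pulls atoms into the map.
    """
    density = np.asarray(density, dtype=float)
    phi_max = density.max()
    if phi_max <= threshold:
        raise ValueError("threshold must be below the maximum density")
    scaled = 1.0 - (density - threshold) / (phi_max - threshold)
    return grid_scale * np.where(density >= threshold, scaled, 1.0)


# Hypothetical density samples: solvent noise, threshold, mid, and peak values.
example_density = np.array([-1.0, 0.0, 0.5, 1.0])
example_potential = mdff_potential(example_density, grid_scale=0.3)
print(example_potential)
```

With a grid-scaling factor of 0.3, the potential ranges from 0.3 in regions at or below the threshold to 0 at the density peak, so a larger factor steepens the gradient that steers atoms toward high-density regions relative to the force field terms.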