Crystal Structures of the Clostridium botulinum Neurotoxin A6 Cell Binding Domain Alone and in Complex with GD1a Reveal Significant Conformational Flexibility

Clostridium botulinum neurotoxin A (BoNT/A) targets the soluble N-ethylmaleimide-sensitive factor attachment protein receptor (SNARE) complex by cleaving synaptosomal-associated protein of 25 kDa (SNAP-25). Cleavage of SNAP-25 results in flaccid paralysis due to inhibition of synaptic transmission at the neuromuscular junction. This activity has been exploited to treat a range of diseases associated with hypersecretion of neurotransmitters, with formulations of BoNT/A commercially available as therapeutics. Generally, BoNT activity is facilitated by three essential domains within the molecule: the cell binding domain (HC), the translocation domain (HN), and the catalytic domain (LC). The HC, which consists of an N-terminal (HCN) and a C-terminal (HCC) subdomain, is responsible for BoNT’s high target specificity, forming a dual-receptor complex with synaptic vesicle protein 2 (SV2) and a ganglioside receptor on the surface of motor neurons. In this study, we have determined the crystal structure of the botulinum neurotoxin A6 cell binding domain (HC/A6) in complex with GD1a and describe the interactions involved in ganglioside binding. We also present a new crystal form of wild type HC/A6 (crystal form II) in which a large ‘hinge motion’ between the HCN and HCC subdomains is observed. These structures, along with a comparison to the previously determined wild type crystal structure of HC/A6 (crystal form I), reveal the degree of conformational flexibility exhibited by HC/A6.


Introduction
Clostridium botulinum neurotoxin (BoNT) is renowned as the most potent toxin known to humans [1]. It is the causative agent of botulism; thankfully, outbreaks of this deadly disease are incredibly rare [2]. Botulism causes flaccid paralysis by inhibiting acetylcholine release at the neuromuscular junction (NMJ) due to cleavage of a SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) protein required for neurotransmitter release. Considering the low incidence of botulism, there has been no need for mass vaccination against BoNT; indeed, such measures would be undesirable [3] due to its increasing use as a therapeutic for the treatment of hyper-muscular and glandular disorders [3][4][5]. Historically, BoNT has been categorised into seven immunologically distinct serotypes (BoNT/A-/G); however, with the emergence of mosaic (BoNT/DC, /CD, /FA, /HA) and BoNT-like proteins (BoNT/Wo and BoNT/En [6][7][8]), guidelines on BoNT nomenclature have been introduced to limit confusion within the literature [9]. The serotypes are further divided into subtypes based on minor differences in amino acid sequence, which have been associated with significant variation in toxicity across BoNT/A subtypes [10][11][12][13][14].
BoNT is expressed by Clostridium botulinum as a single polypeptide from a bont gene cluster [15] and is cleaved post-translationally by either a host or an endogenous protease into the active di-chain [16,17]. Non-toxic-non-hemagglutinin protein (NTNH) and neurotoxin associated proteins (NAPs) are co-expressed with BoNT and together they form the progenitor neurotoxin complex, which protects BoNT during its passage through the digestive system and into the bloodstream [18]. BoNT consists of three domains, each with a specific role in the mechanism of toxicity. Firstly, the cell binding domain (HC), which consists of an N-terminal (HCN) and C-terminal (HCC) subdomain, utilises the HCC to bind to both a protein (e.g., SV2 for BoNT/A) and a ganglioside (e.g., GT1b or GD1a) on the surface of motor neurons. BoNT is then internalized into an endosome via the endocytic pathway, where the acidic environment is believed to cause conformational changes within the translocation domain (HN) that grant entry of the catalytic domain (LC) into the cytosol. Upon entry, the LC, a Zn2+ dependent endopeptidase, can cleave its target SNARE protein (SNAP-25 for BoNT/A), preventing vesicular fusion and neurotransmitter release [19].
Gangliosides consist of a hydrophilic oligosaccharide moiety containing sialic acid and a hydrophobic lipid tail that is embedded in the cell membrane of most vertebrate cells [20]. They are more abundant on the surface of nerve cells and have been implicated in a range of biochemical processes such as cell-cell recognition and signal transduction [21]. GD1a (Figure 1A) constitutes one of the four major gangliosides (GM1, GD1a, GT1b and GD1b [22]) that make up 80-90% [20] of all gangliosides. The oligosaccharide moiety contains six monosaccharide units, of which the three terminal units (sialic acid, galactose, and N-acetylgalactosamine) have been shown to form direct hydrogen bonding interactions with the HCC subdomain of BoNT subtypes (/A1 to /A5) [23][24][25][26][27]. The terminal units are conserved among GD1a and GT1b and both have been identified as binding partners of BoNT/A1 [28].
Here, we report the high-resolution structure of BoNT/A6 cell binding domain in complex with GD1a (HC/A6:GD1a), and a new crystal form of HC/A6 [HC/A6 (crystal form II)]. A detailed analysis of these two structures along with a previous HC/A6 structure [HC/A6 (crystal form I)] [29] reveals the interactions that occur across the HC/A6:GD1a interface, and the conformational flexibility of HC/A6. The structural information presented here may aid the development of novel BoNT-based therapeutics and our understanding of BoNT function.
Figure 1 (partial caption). The dashed blue inset shows how GD1a fits into the electron density (Fo-Fc, contoured at 3σ level) in the ganglioside binding site (GBS) and surrounding residues. The SV2 binding site is indicated by the black dashed oval. The dashed magenta inset highlights the conformational changes associated with GD1a binding by comparison of HC/A6 (crystal form I) (grey) with HC/A6:GD1a (orange), where the loop of residues 1269-1277 widens upon binding and Phe 1278 flips towards the Sia 5 moiety of GD1a ganglioside. The large arrows indicate the widening of the loop as measured by the change in Cα distance between R1269 and T1277.

New Crystal Form of HC/A6
During the screening of HC/A6 co-crystallisation with GD1a we identified the presence of two crystal forms in one drop. One form was identical to a previously published HC/A6 structure, which we refer to as HC/A6 (crystal form I; PDB code 6TWO [29]) and which belonged to the space group P212121 with unit cell dimensions of a = 39.54 Å, b = 105.59 Å, and c = 112.41 Å [29]. The overall fold is near-identical to other HC/A subtypes, consisting of an N-terminal 14 β-strand 'jelly roll' fold and a C-terminal 'β-trefoil'. The GBS also has high structural similarity to HC/A5 [29] Figure 1B), hence GD1a binding was not observed. HC/A6 (crystal form I) has a higher solvent content than HC/A6 (crystal form II) (46.87% and 33.58%, respectively) and the interface between the HCN and HCC subdomain is larger (858.7 Å² and 461.1 Å², respectively).
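Solvent content values of this kind are conventionally estimated from the Matthews coefficient. The following is a minimal sketch of that arithmetic for an orthorhombic cell, using the crystal form I cell dimensions quoted above. The molecular weight (51 kDa) and the single molecule per asymmetric unit are assumptions made purely for illustration and are not stated in the text; the constant 1.23 is the usual approximation for proteins.

```python
def orthorhombic_volume(a, b, c):
    """Unit cell volume (cubic angstroms) for a cell with all angles at 90 degrees."""
    return a * b * c


def matthews_coefficient(cell_volume, mol_weight_da, n_sym_ops, n_mol_asu=1):
    """Matthews coefficient V_M in cubic angstroms per dalton."""
    return cell_volume / (mol_weight_da * n_sym_ops * n_mol_asu)


def solvent_fraction(v_m):
    """Approximate solvent fraction from V_M (1.23 is the usual protein constant)."""
    return 1.0 - 1.23 / v_m


# Crystal form I cell from the text; P212121 has 4 symmetry operators.
volume = orthorhombic_volume(39.54, 105.59, 112.41)
# ASSUMPTION: ~51 kDa construct and one molecule per asymmetric unit.
v_m = matthews_coefficient(volume, 51000.0, 4)
print(f"V_M = {v_m:.2f} A^3/Da, solvent = {100 * solvent_fraction(v_m):.1f}%")
```

The result depends directly on the assumed molecular weight, so it should only be read as a plausibility check against the reported values.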

Crystal Structure of HC/A6 in Complex with GD1a Oligosaccharide
The crystal structure of the HC/A6:GD1a complex was determined at 1.5 Å resolution by molecular replacement in the monoclinic space group P21 with one molecule in the asymmetric unit (Table 1). There was clear electron density at the expected ganglioside binding site (GBS) into which 5 of the 6 GD1a monosaccharides could be modelled (Figure 1C, blue).
The overall fold of HC/A6 did not change upon binding to GD1a, as indicated by a low RMSD (0.81 Å for Cα atoms) between the HC/A6:GD1a and HC/A6 (crystal form I) (PDB code 6TWO [29]) structures. However, at the GBS, residue Phe 1278 rotates towards Sia 5 and the loop 1269-1277 appears to widen by 4 Å (measured by the difference in Cα position between the Arg 1269 and Thr 1277 residues) (Figure 1C, magenta). This was observed previously for HC/A2, HC/A3, and HC/A5. Furthermore, HC/A6 forms seven hydrogen bonds with GD1a at Sia 5, Gal 4, and GalNAc 3 through six residues that are conserved among all BoNT/A subtypes (Figure 2) [23][24][25][26][27]. The HC/A6:GD1a interface is most similar to that of HC/A5, with a total of seven hydrogen bonds to the three monosaccharides GalNAc 3, Gal 4 and Sia 5 [27].
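The two measurements used above, a Cα RMSD between superposed structures and a change in the Cα-Cα distance across a loop, can be illustrated with a short sketch. It assumes the two coordinate sets have already been superposed and matched residue by residue; the coordinates shown are hypothetical toy values, not taken from the deposited structures.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of already superposed (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def loop_widening(start_ref, end_ref, start_bound, end_bound):
    """Change in end-to-end distance of a loop between a reference and a bound state."""
    return math.dist(start_bound, end_bound) - math.dist(start_ref, end_ref)


# HYPOTHETICAL coordinates (angstroms), for illustration only.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
bound = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0)]
print(rmsd(reference, bound))
print(loop_widening((0.0, 0.0, 0.0), (6.0, 0.0, 0.0),
                    (0.0, 0.0, 0.0), (10.0, 0.0, 0.0)))
```

In practice the superposition step itself (for example a least-squares fit on Cα atoms) would be carried out first by a structure analysis package.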

HC/A6 (Crystal Form II) Reveals a Large Hinge-Rotation between HCN and HCC Subdomains
Superimposition of HC/A6 (crystal form I) (PDB code 6TWO [29]) and HC/A6:GD1a structures with HC/A6 (crystal form II) (RMSD values of 2.78 and 2.92 Å, respectively, for Cα atoms) revealed a misalignment in Cα positioning across the entire molecule (Figure 3A). The program DynDom [30] revealed a large hinge rotation of ~16.8° in HC/A6 (crystal form II) when compared to HC/A6 (crystal form I) and HC/A6:GD1a (Figure 4). To date, this is the largest "hinge motion" in subdomain orientation observed among BoNT/A subtype structures [25,27], and it suggests a high degree of flexibility between the HCN and HCC subdomains. The biological implication of this hinge-rotation has not yet been determined; however, it has been previously suggested that it may aid in orientating the HN towards the membrane in preparation for translocation [23].
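A hinge rotation angle can be recovered from the rotation matrix that relates the orientation of one subdomain in two structures, using the relation angle = arccos((trace(R) - 1) / 2). The sketch below shows only this final step and is not a description of how DynDom works internally; the helper that builds a test rotation about the z axis is hypothetical and exists only to exercise the function.

```python
import math


def rotation_angle_deg(r):
    """Rotation angle in degrees encoded by a 3x3 rotation matrix (nested lists)."""
    trace = r[0][0] + r[1][1] + r[2][2]
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


def rotation_about_z(angle_deg):
    """Hypothetical helper: rotation matrix for a given angle about the z axis."""
    t = math.radians(angle_deg)
    return [
        [math.cos(t), -math.sin(t), 0.0],
        [math.sin(t), math.cos(t), 0.0],
        [0.0, 0.0, 1.0],
    ]


# A matrix built for a 16.8 degree rotation should give that angle back.
print(f"{rotation_angle_deg(rotation_about_z(16.8)):.1f}")
```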

Structural Comparison of HC/A6 in the Presence/Absence of GD1a
To determine the local structural differences between the HC/A6 (crystal forms I & II) and HC/A6:GD1a, the HCC and HCN subdomains were superimposed independently (Figure 3B,C); this revealed two regions of conformational flexibility within the HCC subdomain (Figure 3B). The first is within the SV2 binding site (1139-1157), where residues 1150-1152 in both the HC/A6:GD1a and HC/A6 (crystal form I) structures form a β-sheet with residues 1003-1006. However, for the HC/A6 (crystal form II) structure, this β-sheet could not be modelled due to disorder, possibly because the large hinge motion between the two subdomains separates those residues. This appears to be consistent with residues 1152-1157 being rotated away from the HCN subdomain when compared to HC/A6:GD1a and HC/A6 (crystal form I), accompanied by a flip at Tyr 1155 (Figure 3B, magenta). This likely contributes to the large difference in interface surface area observed between the HCN and HCC subdomains. The flexibility of the SV2 binding site might act to promote anchoring of HC/A6 to SV2, as the binding loop is able to extend outward away from the protein and sample more conformational space.
The second structural difference is within a polar/charged loop (residues 1225-1236) positioned close to the GBS (Figure 3B). This loop adopts alternative conformations in HC/A6 (crystal form I) and HC/A6 (crystal form II) but is only partially modelled in HC/A6:GD1a (Figure 3B, blue). A comparison of residues Cys 1235, Lys 1236, and Cys 1280 reveals an alternating bridging interaction (Figure 3B, blue). For the HC/A6 (crystal form II) structure there is a continuation of electron density between the Nζ of Lys 1236 and Sγ of Cys 1280 residues (Figure 3D), whereas in the HC/A6:GD1a structure Cys 1280 forms a disulphide bridge with Cys 1235, and the structure of HC/A6 (crystal form I) shows neither (Figure 3, blue). This is consistent with similar observations in HC/A2 and HC/A5 [25,29], and such Lys-O-Cys bridging interactions are believed to be widespread in protein structures but under-reported [31]. It is likely that these different bridging interactions allow the loop to adopt multiple conformations; however, the biological significance of the Cys-Lys bridge remains unclear at present.

B-Factor Analysis
To further assess the conformational flexibility of HC/A6 we used the program BANΔIT (a tool for the normalisation and analysis of B-factor profiles [32]) to produce a raw (Figure 5A) and normalised (Figure 5B) B-factor plot for the HC/A6:GD1a and HC/A6 (crystal form II) structures. B-factors quantify the relative motion of individual atoms within a crystal structure, where an increase in B-factor indicates an increase in motion, providing insights into the flexible regions of a protein. The raw plot shows higher B-factors overall for the HC/A6 (crystal form II) structure, indicating more flexibility across the entire molecule compared to the HC/A6:GD1a structure. For the latter, the HCC subdomain is more flexible than the HCN (Figure 5B). This is consistent with the observed conformational flexibility of the HCC subdomain, perhaps to accommodate binding to both SV2 and a ganglioside on the surface of motor neurons.
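Normalisation makes B-factor profiles from different crystals comparable, since raw values depend on resolution and refinement. As a minimal sketch, a plain z-score (subtract the mean, divide by the standard deviation) is assumed below; this is one common choice and is not claimed to be the method BANΔIT applied here. The input values are hypothetical.

```python
from statistics import mean, pstdev


def normalise_b_factors(b_factors):
    """Z-score normalisation: zero mean and unit (population) standard deviation."""
    mu = mean(b_factors)
    sigma = pstdev(b_factors)
    if sigma == 0:
        raise ValueError("B-factors are all identical; z-score is undefined")
    return [(b - mu) / sigma for b in b_factors]


# HYPOTHETICAL per-residue B-factors (square angstroms), for illustration only.
raw = [12.0, 15.0, 18.0, 30.0, 45.0]
for value, z in zip(raw, normalise_b_factors(raw)):
    print(f"{value:5.1f} -> {z:+.2f}")
```

After normalisation, residues with positive scores are more mobile than the average for that structure, which allows regions such as the two subdomains to be compared across datasets.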


Protein Expression and Purification
The cell binding domain (residues 871-1296) of BoNT/A6 (HC/A6) was previously cloned into the pJ401 vector [29] and subsequently transformed into BL21 (DE3) star cells. Expression cultures were grown at 37 °C until an OD600 of 0.6. The culture was induced with 1 mM IPTG and incubated overnight at 16 °C. Cells were harvested by centrifugation and the pellet was resuspended in 50 mM Tris pH 7.4, 0.5 M NaCl. Cells were then lysed by homogenisation in a cell disrupter and the lysate was clarified by centrifugation. HC/A6 was captured using Ni2+ affinity chromatography and eluted with a 0-0.5 M imidazole linear gradient (in 50 mM Tris pH 7.4, 0.5 M NaCl). HC/A6 was further purified by gel filtration in 50 mM Tris pH 7.2, 150 mM NaCl using a GE Superdex 200 column. HC/A6 was concentrated to 1 mg/mL and flash frozen in liquid nitrogen for storage at −20 °C until required for crystallisation.

X-ray Crystallography
HC/A6 was concentrated to 6 mg/mL and incubated with 5 mM GD1a for 1 h at room temperature prior to setting up crystallisation screens using the sitting drop vapour diffusion method in Swissci Intelli Well plates (High Wycombe, UK) at 16 °C. Crystals were identified in the PACT premier and BCS screens supplied by Molecular Dimensions (Rotherham, UK). The best HC/A6:GD1a crystals grew in 0.2 M NaCl, 0.1 M HEPES pH 7.0, 20% w/v PEG 6000. HC/A6 crystals (crystal form II) grew in 0.1 M magnesium acetate tetrahydrate, 0.1 M MES pH 6.5, 12% w/v of 50% PEG smear medium (4.55% v/v PEG 400, 4.55% v/v PEG 500 MME, 4.55% v/v PEG 600, 4.55% w/v PEG 1000, 4.55% w/v PEG 2000, 4.55% w/v PEG 3350, 4.55% w/v PEG 4000, 4.55% w/v PEG 5000 MME, 4.55% w/v PEG 6000, 4.55% w/v PEG 8000, 4.55% w/v PEG 10,000), 10% v/v ethylene glycol. Crystals were mounted using a cryo-loop and flash frozen in liquid nitrogen. A total of 7200 images were collected on I04 at Diamond Light Source (Didcot, UK) at 0.1° for 0.01 s per image. Indexing and integration of X-ray diffraction data were performed using DIALS [33]. Subsequent data processing was performed using the CCP4 software suite [34]. Data reduction and merging were performed in AIMLESS as part of CCP4 [34]. The initial phases were determined by molecular replacement using Phaser [35]. HC/A6:GD1a was determined using HC/A6 (crystal form I) as a single search model. HC/A6 (crystal form II) was determined using the HCN and HCC subdomains as separate search models from the HC/A6 (crystal form I) structure (PDB code 6TWO [29]). Both structures were refined using REFMAC [36] and Phenix [37] and molecular modelling was performed in COOT [38]. The structures were validated using Molprobity [39] and PDB validation. All figures were produced using CCP4mg [40].
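The data collection and PEG smear figures quoted above can be cross-checked with simple arithmetic. The sketch below computes the total rotation sweep, the total exposure time (assuming no dead time between images, which is an assumption not stated in the text) and the sum of the eleven smear components; the function names are illustrative only.

```python
def total_rotation_deg(n_images, step_deg):
    """Total crystal rotation covered by a sweep of n_images at step_deg each."""
    return n_images * step_deg


def total_exposure_s(n_images, exposure_s):
    """Total exposure time, ignoring any detector dead time between images."""
    return n_images * exposure_s


def smear_total_percent(component_percents):
    """Combined concentration of a PEG smear from its component percentages."""
    return sum(component_percents)


sweep = total_rotation_deg(7200, 0.1)         # 7200 images at 0.1 degrees
exposure = total_exposure_s(7200, 0.01)       # 0.01 s per image
smear = smear_total_percent([4.55] * 11)      # eleven PEGs at 4.55% each
print(f"sweep = {sweep:.1f} deg, exposure = {exposure:.1f} s, smear = {smear:.2f}%")
```

The eleven components sum to roughly the nominal 50% of the smear stock, and the sweep corresponds to two full rotations of the crystal.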

Conclusions
The crystal structure of HC/A6:GD1a revealed six residues of HC/A6 which form seven hydrogen bonding interactions with the three terminal monosaccharides of GD1a. Overall, the conformation of HC/A6 upon binding GD1a does not change dramatically except for a flipping of the Phe 1278 sidechain towards Sia 5 of GD1a and an accompanying widening of loop 1269-1277. However, a new crystal form of HC/A6 [HC/A6 (crystal form II)] revealed a large 16.8° rotation between the HCN and HCC subdomains (hinge) that might aid membrane anchoring through both the SV2 and ganglioside binding sites. Finally, a detailed comparison of the structures presented here [HC/A6:GD1a and HC/A6 (crystal form II)] with a previously reported HC/A6 structure (crystal form I) revealed the extent of conformational flexibility within the HCC subdomain. Two areas of particular interest are the SV2 binding loop and a polar/charged loop close to the GBS. The SV2 loop, in the HC/A6 (crystal form II) structure, appears to have rotated away from the HCN subdomain, protruding outward from the surface of the protein. The polar/charged loop adopts a different conformation in each structure, along with a dynamic bridging interaction that alternates between Cys 1235-Cys 1280, Lys 1236-O-Cys 1280, and no bridge at all. The biological implication of these structural features is yet to be established and will require further experimental investigation.