Crystal Structures of Botulinum Neurotoxin Subtypes A4 and A5 Cell Binding Domains in Complex with Receptor Ganglioside

Botulinum neurotoxins (BoNT) cause the potentially fatal neuroparalytic disease botulism that arises due to proteolysis of a SNARE protein. Each BoNT is comprised of three domains: a cell binding domain (HC), a translocation domain (HN), and a catalytic (Zn2+ endopeptidase) domain (LC). The HC is responsible for neuronal specificity by targeting both a protein and ganglioside receptor at the neuromuscular junction. Although highly toxic, some BoNTs are commercially available as therapeutics for the treatment of a range of neuromuscular conditions. Here we present the crystal structures of two BoNT cell binding domains, HC/A4 and HC/A5, in a complex with the oligosaccharide of ganglioside, GD1a and GM1b, respectively. These structures, along with a detailed comparison with the previously reported apo-structures, reveal the conformational changes that occur upon ganglioside binding and the interactions involved.


Introduction
Botulinum neurotoxin serotype A (BoNT/A) is produced by the anaerobic, spore-forming bacterium Clostridium botulinum and, along with other serotypes, is responsible for botulism, a neuromuscular condition that causes flaccid paralysis and can lead to death by asphyxiation if left untreated [1]. The exquisite toxicity of BoNT/A makes it one of the deadliest agents known to humankind [2]; however, at minuscule doses, it can be used as a therapeutic to treat a range of diseases associated with hyper-muscular and hyper-glandular activity [3]. The toxin is post-translationally cleaved to form an active di-chain, composed of a 50 kDa light chain (LC) and a 100 kDa heavy chain linked by a disulphide bond. The heavy chain can be further divided into an N-terminal translocation domain (HN) and a C-terminal cell binding domain (HC) [4]. The mechanism of intoxication involves three general steps [5]: highly specific targeting to the neuromuscular junction by dual-receptor recognition of both a protein and a ganglioside receptor by the HC domain, resulting in endocytic internalisation into an endosome [6]; a pH-mediated conformational change of the HN domain that translocates the LC into the cytosol [7][8][9]; and Zn2+-dependent endopeptidase cleavage of a soluble N-ethylmaleimide-sensitive factor attachment protein receptor (SNARE) protein by the LC [10]. This cleavage prevents vesicular fusion with the cell membrane, halting the release of presynaptic acetylcholine and the progression of synaptic signalling at the neuromuscular junction [11].
There are many different types of BoNT and BoNT-like molecules that are categorised by sequence similarity, serological activity, and/or host source. BoNTs produced by Clostridia are categorised into serotypes /A to /G, and /X, whereas the BoNT-like molecules produced by non-Clostridia include BoNT/Wo (Weissella oryzae) [12], BoNT/En (Enterococcus faecium) [13], and PMP1 (Paraclostridium bifermentans) [14]. Some serotypes exist naturally as mosaics (e.g., BoNT/CD, BoNT/DC, and BoNT/FA), whereas other serotypes are divided into subtypes (e.g., BoNT/A1-/A8, /B1-/B8, /E1-/E12, /F1-/F9) due to subtle variations in amino acid sequence [15,16]. Although these subtypes arise from only minor changes in their amino acid sequence, the toxicity of subtypes has been shown to vary significantly [17][18][19][20]. All serotypes require recognition of both a protein receptor (synaptic vesicle protein 2 for BoNT/A) and a ganglioside receptor to initiate endocytosis, except for BoNT/C, which binds to two gangliosides. Gangliosides are glycosphingolipids that are often involved in cellular-signalling pathways and are composed of a membrane-anchored hydrophobic lipid tail and an extracellular oligosaccharide moiety [21]. Previous studies have reported the structures of the binding domain of BoNT/A1 (HC/A1) and BoNT/A3 (HC/A3) in complex with the receptor ganglioside GD1a [22,23], detailing the interactions that occur between the two. These structures reveal that the ganglioside binding site (GBS) is formed by a β-hairpin and loop in the C-terminal subdomain of HC (HCC).
We have previously reported the crystal structures of HC/A4 [24] and HC/A5 [25], and now present the crystal structures of HC/A4 in complex with GD1a and HC/A5 in complex with GM1b, highlighting the interactions and structural changes that occur upon ganglioside binding. The structural information revealed in this report may aid in the development of future BoNT therapeutics.

Structure of HC/A4 in Complex with GD1a Oligosaccharide
The structure of the HC/A4:GD1a complex was solved to 2.3 Å by molecular replacement using the unbound HC/A4 structure (PDB: 6F0P) [24] as a search model. Two molecules (designated A and B) were present in the asymmetric unit (ASU) (Table 1). The overall quality of the electron density map was good, with better density for molecule A (residues 994-999, 1029-1032, 1047-1053, 1172-1174, and 1232-1239 could not be modelled for molecule B). Consequently, all subsequent analyses involved molecule A. An initial inspection of the map revealed large positive electron density at the expected GBS, which indicated that GD1a had bound. Monosaccharides Sia 5 to Gal 2 could be modelled with no ambiguity into the electron density (Figure 1A) and Glc 1 partially, but there was insufficient electron density to model Sia 6. A total of nine hydrogen bonding interactions were present between HC/A4 and GD1a (Figure 1B, Table 2); there was clear electron density for the two terminal nitrogen atoms of Arg 1282, which interact with Sia 5 and Gln 1276.

Table note (beginning missing in source): … [23]) and HC/A1:GD1a (PDB: 5TPC [22]). Water-mediated interactions are indicated by "-H2O molecule (n1, n2)", where n1 is the distance between the amino acid residue and the water, and n2 is the distance between the water and the monosaccharide. ∆ indicates the equivalent water molecule in each structure. Data adapted from [23].

The crystal packing of HC/A4 changes significantly upon binding of GD1a, as evidenced by the change in both unit cell dimensions and space group. Although there is minimal overall conformational change between HC/A4:GD1a and HC/A4 alone (RMSD of 0.88 Å for all Cα atoms), there is a noticeable change in the relative position of the HCN and HCC subdomains when compared to the unbound structure (RMSD of 0.6 and 0.5 Å, respectively, for all Cα atoms after individual subdomain superimposition). Therefore, upon ganglioside binding, the two subdomains appear to rotate apart like an opening hinge (Figure 2A). There is also a noticeable conformational change in the loop spanning residues 933-946 within the HCN subdomain (Figure 2A) that may be attributed to different crystal packing in the unbound structure.
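The Cα RMSD values quoted above can be illustrated with a minimal sketch. It assumes the two structures have already been superposed and that residues are paired one-to-one; the function name `ca_rmsd` and the toy coordinates are hypothetical and are not taken from the deposited models or from the software used in this study.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """Return the RMSD (in Å) between two lists of (x, y, z) Cα positions.

    Assumes the structures are already superposed and that
    coords_a[i] and coords_b[i] belong to the same residue.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty, equal-length coordinate lists")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical toy coordinates for illustration only.
unbound = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
bound = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]

print(f"Calpha RMSD: {ca_rmsd(unbound, bound):.2f} A")
```

Running the same function on each subdomain separately, after superposing that subdomain alone, gives the kind of per-subdomain comparison described in the text: a lower RMSD for each subdomain than for the whole domain is consistent with a rigid-body shift between them.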

Inspection of the HC/A4:GD1a GBS residues revealed changes in the relative position of the side chains compared to HC/A4 alone; most notably Arg 1282 (which adopts two conformations in the unbound structure) and Tyr 1123. Upon GD1a binding, these residues shift to form a hydrogen bonding interaction with Sia 5 ( Figure 2B).


Structure of HC/A5 Co-Crystallised with GM1b Oligosaccharide
Several attempts to crystallise HC/A5 with GD1a did not yield crystals of the complex. Consequently, a smaller ganglioside, GM1b, which is identical to GD1a in the expected binding portion (Sia 5 to GalNAc 3) and lacks only Sia 6 (Figure 1B), was used for co-crystallisation with HC/A5. Crystals of HC/A5:GM1b diffracted to 2.4 Å in space group P21 (Table 1), and the structure was determined by molecular replacement with two molecules (designated A and B) in the ASU (Figure 3A). Molecule A generally showed clearer electron density throughout the structure compared to molecule B, especially for the H…SxWY motif that is essential for ganglioside binding [26], which could not be modelled in molecule B. There are, however, three small loop regions in molecule A (1167-1169, 1226-1235, and 1271-1276) that showed insufficient density for modelling. It is possible that the latter loop (residues 1271-1276) is flexible due to its proximity to the GBS.

Molecules A and B are conformationally very similar, with an RMSD of 0.68 Å for all Cα atoms. Residues 928-939, however, adopt alternative conformations: in molecule A they form a β-strand with residues 1047-1050 of the conserved jelly-roll fold, whereas in molecule B they form an unstructured loop (Figure 3C). Inspection of the surrounding symmetry-related molecules suggests that this difference may be due to crystallographic packing.

The two molecules of the ASU form a dimer through an extended β-sheet interaction (Figure 3A, Arrow). For the interaction to occur, the 882-889 loop, which extends beyond the β-sheet (Figure 4C, Arrow), has to move away to allow the interface to form between molecules A and B. Although computational analyses [27] suggest this may also be due to crystallographic packing, the GBS in this crystal form has become accessible to ligand binding. For molecule A, some additional weak electron density was observed at the GBS that was not part of the protein. With the aid of polder maps [28] for His 1253 and Tyr 1117 [22,23,29], it was possible to model in sugars Gal 4 and Sia 5 (Figure 3B). Gal 4 forms a total of four hydrogen bonds with residues Glu 1203, Phe 1252, His 1253, and Ser 1264, while Sia 5 forms three hydrogen bonds with Tyr 1117, Tyr 1267, and Gly 1279 (Table 2). Further refinement of the Gal 4 molecule with occupancies of 0.6 and 1 generated average B-factors of 60.74 Å² and 61.97 Å², respectively, indicating that GM1b is bound at low occupancy.

Overall, the HC/A5:GM1b structure is very similar to the HC/A5 structure (PDB: 6TWP), with Cα RMSD values of 0.76 Å for molecule A and 0.64 Å for molecule B. Because the residues of the GBS could not be modelled for molecule B, comparisons to the unbound HC/A5 structure are made with molecule A. Most of the residues within the putative GBS show little to no conformational change, with the exception of Phe 1278, which has its side chain flipped towards the GBS (Figure 4A,B). This flip in residue positioning is accompanied by a change in the loop structure spanning residues 1260-1280, where there is an increase of 4 Å in the Cα distance between residues Tyr 1267 and Thr 1277 upon ganglioside binding (Figure 4A,B inset). This increase in Cα distance is indicative of the loop widening to accommodate the ganglioside.

The β-sheet arrangement between the two molecules of the ASU shows a significant conformational change at the N-terminus of the HC/A5:GM1b structure when compared to HC/A5 alone (Figure 4C,D). Most prominently, the side chains of Arg 893 and Tyr 894 have rotated towards the main body of the protein structure upon GM1b binding, and there is a rotation in the protein backbone that results in a more compact structure. This closely resembles the full-length structure of BoNT/A1 (PDB: 3BTA) in the absence of ganglioside.
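The loop widening reported for this section is a difference between two Cα-Cα distances, one measured in the unbound structure and one in the bound structure. The sketch below shows that arithmetic only. The coordinates are hypothetical placeholders chosen so that the difference comes out at 4 Å, matching the value in the text; they are not the deposited coordinates of Tyr 1267 or Thr 1277.

```python
import math


def ca_distance(ca_1, ca_2):
    """Euclidean distance (Å) between two Cα positions given as (x, y, z)."""
    return math.dist(ca_1, ca_2)


def loop_widening(unbound_pair, bound_pair):
    """Change in Cα-Cα distance on ligand binding (positive means wider)."""
    return ca_distance(*bound_pair) - ca_distance(*unbound_pair)


# Hypothetical coordinates for the two loop residues in each state.
unbound_pair = ((0.0, 0.0, 0.0), (6.0, 0.0, 0.0))
bound_pair = ((0.0, 0.0, 0.0), (10.0, 0.0, 0.0))

print(f"loop widening: {loop_widening(unbound_pair, bound_pair):.1f} A")
```

Because only an internal distance is compared, this measure does not depend on how the bound and unbound structures are superposed.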

Structural Variability of HC/A Subtypes at the Ganglioside Binding Site
There appears to be some structural variation of the GBS among the HC/A subtypes, as illustrated by a comparison of the HC/A1, HC/A3, HC/A4, and HC/A5 structures with and without ganglioside bound (Figure 5). The most significant variation is seen within the loop that follows the β-hairpin of the GBS for HC/A3 and HC/A5 (Figure 6A-D, Arrow). Upon binding the ganglioside, the loop widens in HC/A3 and HC/A5, as measured by an increase in the distance between the Cα atoms of Thr 1273 (A3)/Thr 1277 (A5) and Tyr 1263 (A3)/Tyr 1267 (A5) within the loop, to accommodate the ganglioside. In contrast, the loop in the unbound HC/A1 and HC/A4 structures adopts a more open conformation, which suggests that it does not need to move to allow GD1a to bind. Furthermore, a comparison of the GBS opening groove, formed by the histidine and tryptophan residues of the H…SxWY motif, in the bound and unbound structures reveals that the structural changes of HC/A4 are more similar to those of HC/A1 than to those of HC/A3, with the tryptophan moving towards the GBS upon ganglioside binding (Figure 7A-D). HC/A5 is somewhere in between, with some conformational variation reminiscent of the HC/A3 structure, where Phe 1274 (A3)/Phe 1278 (A5) appears to flip towards the GBS upon binding. This residue is not conserved across the subtypes: it appears as Leu 1278 in HC/A1 and Leu 1285 in HC/A4. Not surprisingly, there is some variation in the ganglioside interactions between subtypes. HC/A1 has a total of ten hydrogen bonding interactions with GD1a, while HC/A3 and HC/A4 each have nine (Table 2). Furthermore, HC/A4 displays no water-mediated interactions with the ganglioside, while HC/A1 and HC/A3 have at least two each. The occupancy of Gal 4 and Sia 5 in the HC/A5:GM1b structure was too low to determine any water-mediated interactions that contributed to binding.
Figure caption (beginning missing in source): … [29]) and bound to GD1a (orange; PDB: 5TPC [22]); (B) HC/A3 unbound (green; PDB: 6F0O [24]) and bound to GD1a (dark grey; PDB: 6THY [23]); (C) HC/A4 unbound (burlywood; PDB: 6F0P [24]) and bound to GD1a (cyan; this study); (D) HC/A5 unbound (grey; PDB: 6TWP [25]) and bound to GM1b (cyan; this study). The arrow points to the loop that follows the GBS β-hairpin; dotted lines indicate unmodelled regions of the loop.
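Hydrogen bond tallies such as those compared across subtypes in this section are commonly derived from distances between polar atoms of the protein and the ligand. The sketch below is a simplified illustration of that idea and not the procedure used in this study: the 3.5 Å cutoff, the restriction to N and O atoms, the absence of any angle criterion, and all atom labels and coordinates are assumptions made for the example.

```python
import math

# ASSUMPTION: a 3.5 Å heavy-atom cutoff is a common heuristic; the
# criterion actually used for Table 2 is not stated in this section.
ASSUMED_CUTOFF = 3.5
POLAR_ELEMENTS = {"N", "O"}


def hbond_candidates(protein_atoms, ligand_atoms, cutoff=ASSUMED_CUTOFF):
    """List polar protein-ligand atom pairs lying within the cutoff.

    Each atom is a tuple (label, element, (x, y, z)). Returns a list of
    (protein_label, ligand_label, distance) tuples.
    """
    pairs = []
    for p_label, p_element, p_xyz in protein_atoms:
        if p_element not in POLAR_ELEMENTS:
            continue
        for l_label, l_element, l_xyz in ligand_atoms:
            if l_element not in POLAR_ELEMENTS:
                continue
            d = math.dist(p_xyz, l_xyz)
            if d <= cutoff:
                pairs.append((p_label, l_label, round(d, 2)))
    return pairs


# Hypothetical atoms and coordinates for illustration only.
protein = [
    ("Tyr OH", "O", (0.0, 0.0, 0.0)),
    ("Phe CZ", "C", (1.0, 1.0, 0.0)),
    ("His NE2", "N", (8.0, 0.0, 0.0)),
]
ligand = [
    ("Sia O", "O", (2.7, 0.0, 0.0)),
    ("Gal O", "O", (8.0, 3.0, 0.0)),
    ("Gal C1", "C", (8.0, 1.0, 0.0)),
]

for protein_label, ligand_label, d in hbond_candidates(protein, ligand):
    print(f"{protein_label} ... {ligand_label}: {d} A")
```

A distance-only filter overcounts compared with a full geometric analysis, so candidates found this way would normally be checked against the electron density and the donor-acceptor geometry before being reported.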

Conclusions
The crystal structures of HC/A4:GD1a and HC/A5:GM1b presented here reveal the interactions involved in ganglioside binding and the conformational changes that occur. For HC/A4, eight residues form a total of nine hydrogen bonding interactions with the three principal monosaccharides, GalNAc 3, Gal 4, and Sia 5. For HC/A5, however, only two monosaccharides could be modelled in the electron density map, revealing seven hydrogen bonding interactions. The low occupancy of GM1b, and multiple failed attempts at co-crystallising HC/A5 with GD1a, suggest a low affinity for the Sia-Gal-GalNAc moiety or a preference for a different ganglioside.
A total of four HC/A subtype structures (HC/A1, HC/A3, HC/A4, HC/A5) have now been reported with and without ganglioside. We previously reported that the reduction in hydrogen bonding interactions of HC/A3 with GD1a compared to HC/A1 may be a contributing factor in its reduced toxicity [23]. HC/A4 follows this trend: the structure displays a reduction in hydrogen bonding interactions with GD1a, and the subtype has a reported 1000-fold lower activity in mice [30]. Furthermore, both BoNT/A3 and BoNT/A4 are significantly less active in vivo when compared to BoNT/A1, and BoNT/A4 is also less efficient at entering cells [31], with the cell binding domain contributing to this variation.

Protein Expression and Purification
The sequences of BoNT/A4 residues 870-1296 (HC/A4) and BoNT/A5 residues 871-1296 (HC/A5) were cloned into the pJ401 vector with an N-terminal hexa-histidine tag, as described previously [24,25]. Both constructs were transformed into E. coli strain BL21 and grown at 37 °C to an OD600 of 0.5 before induction with 1 mM IPTG for 16 h at 16 °C. Cells were harvested by centrifugation. Cells expressing HC/A4 were lysed in 50 mM Tris pH 7.4, 0.2 M NaCl, 10 mM trehalose, and 20 mM imidazole, while cells expressing HC/A5 were lysed in 50 mM Tris pH 7.4, 0.5 M NaCl, and 20 mM imidazole. Both proteins were captured on a GE HisTrap column and further purified by gel filtration using a GE Superdex 200 column. For HC/A4, the running buffer was 50 mM Tris pH 7.4, 150 mM NaCl, and 10 mM trehalose, while for HC/A5 it was 50 mM Tris pH 7.4 and 150 mM NaCl. Both proteins were concentrated to 1 mg/mL using a 10 kDa MWCO centrifugal concentrator and flash frozen in liquid nitrogen for storage at −20 °C until required for crystallisation.

Protein Crystallisation
HC/A4 and HC/A5 proteins were concentrated to 5 mg/mL; the former was incubated with 5 mM GD1a oligosaccharide and the latter with 5 mM GM1b oligosaccharide for 1 h at room temperature. Crystallisation screens were set up using the sitting drop vapour diffusion method in 96-3 well intelli-plates (SWISSCI, High Wycombe, UK) with a number of high-throughput crystallisation conditions (Molecular Dimensions). Both 1:1 and 2:1 protein-to-reservoir ratios were screened in each case. HC/A4 crystals grew in 0.2 M NaAcO·3H2O, 20% w/v PEG 3350 (1:1 ratio, protein:reservoir). HC/A5 crystals grew in 150 mM Li2SO4, 50 mM MgCl2·6H2O, 0.1 M HEPES pH 7.8, 4.7% w/v PEG 8K, 4.7% PEG 10K and 4.7% PEG 8K (1:1 ratio, protein:reservoir). Crystals were mounted directly onto a cryo-loop and flash frozen for storage in liquid nitrogen.
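To put the incubation conditions on a common scale, the protein concentration in mg/mL can be converted to a molar concentration and compared with the 5 mM oligosaccharide. The sketch below shows the conversion. The molecular mass of about 50 kDa for the HC domain is an assumption made for illustration (the text gives only the 100 kDa heavy chain, of which HC is one of two domains), so the resulting excess is approximate.

```python
def mg_per_ml_to_millimolar(mg_per_ml, mass_kda):
    """Convert a mass concentration to millimolar.

    mg/mL is numerically equal to g/L, and dividing g/L by a mass in
    kDa (kg/mol) gives mmol/L.
    """
    if mass_kda <= 0:
        raise ValueError("molecular mass must be positive")
    return mg_per_ml / mass_kda


# ASSUMPTION: approximate mass of the HC domain; not stated in the text.
ASSUMED_HC_MASS_KDA = 50.0

protein_mM = mg_per_ml_to_millimolar(5.0, ASSUMED_HC_MASS_KDA)
ligand_mM = 5.0  # oligosaccharide concentration given in the text
molar_excess = ligand_mM / protein_mM

print(f"protein: {protein_mM:.2f} mM, ligand excess: {molar_excess:.0f}-fold")
```

Under this assumed mass the oligosaccharide is present at roughly a 50-fold molar excess over the protein; a different mass scales the result proportionally.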

X-ray Diffraction Data Collection and Structure Determination
Diffraction images were collected at a wavelength of 0.9785 Å with 0.1° oscillation and 0.01 s of exposure time per image on the i04 beamline at the Diamond Light Source (Harwell, Oxfordshire, UK). Crystals were kept under a jet stream of liquid nitrogen at 100 K during data collection. A total of 7200 images were collected for HC/A4:GD1a, and 3600 images for HC/A5:GM1b. Data processing was carried out in DIALS [32] and both structures were solved by molecular replacement in PHASER [33] using the previously reported structures of HC/A4 [24] and HC/A5 [25] as search models. Initial rounds of refinement were performed using REFMAC [26] as part of the CCP4 package [34], with the final round of refinement and validation performed in Phenix [35]. The structures were validated using MolProbity [36] and PDB validation. Figures were produced using CCP4mg [37] and BioRender.com (Biorender, Toronto, ON, Canada).
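The collection parameters above determine the total rotation range and the cumulative exposure for each dataset. The sketch below works through that arithmetic using the image counts, oscillation width, and exposure time given in the text. The `Sweep` class is a hypothetical helper, the same oscillation and exposure are assumed for both datasets, and detector readout overhead is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sweep:
    """One rotation dataset described by its per-image parameters."""

    name: str
    n_images: int
    oscillation_deg: float
    exposure_s: float

    @property
    def total_rotation_deg(self):
        # Total crystal rotation covered by the sweep.
        return self.n_images * self.oscillation_deg

    @property
    def total_exposure_s(self):
        # Cumulative exposure time, excluding any readout overhead.
        return self.n_images * self.exposure_s


sweeps = [
    Sweep("HC/A4:GD1a", n_images=7200, oscillation_deg=0.1, exposure_s=0.01),
    Sweep("HC/A5:GM1b", n_images=3600, oscillation_deg=0.1, exposure_s=0.01),
]

for sweep in sweeps:
    print(
        f"{sweep.name}: {sweep.total_rotation_deg:.0f} deg rotation, "
        f"{sweep.total_exposure_s:.0f} s exposure"
    )
```

With these values the HC/A4:GD1a dataset spans 720° of rotation and the HC/A5:GM1b dataset spans 360°.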