Variations in the Botulinum Neurotoxin Binding Domain and the Potential for Novel Therapeutics

Botulinum neurotoxins (BoNTs) are categorised into immunologically distinct serotypes (BoNT/A to /G). Each serotype can also be further divided into subtypes based on differences in amino acid sequence. BoNTs are ~150 kDa proteins composed of three major functional domains: an N-terminal zinc metalloprotease light chain (LC), a translocation domain (HN), and a binding domain (HC). The HC is responsible for targeting the BoNT to the neuronal cell membrane, and each serotype has evolved to bind via different mechanisms to different target receptors. Most structural characterisations to date have focussed on the first identified subtype within each serotype (e.g., BoNT/A1). Subtype differences within BoNT serotypes can affect intoxication, producing different botulism symptoms in vivo, yet less emphasis has been placed on investigating these variants. This review outlines the receptors for each BoNT serotype and describes the basis for the highly specific targeting of neuronal cell membranes. Understanding receptor binding is of vital importance, not only for the generation of novel therapeutics but also for understanding how best to protect from intoxication.


Botulinum Neurotoxins
Botulinum neurotoxins (BoNTs) are produced mainly by Clostridium botulinum, under anaerobic conditions [1], and are the causative agent of botulism, a rare disease that can lead to paralysis and eventually death. The C. botulinum taxon can be divided into four groups (I, II, III, and IV), based on phenotypic differences between the bacteria [2]. C. botulinum group I (proteolytic) and group II (non-proteolytic) are mostly responsible for human botulism, whereas C. botulinum group III is responsible for botulism in other animal species, and C. botulinum group IV does not appear to cause botulism [2,3]. Across these phenotypes, a range of serologically distinct BoNTs have been identified and classified within different serotypes. Until recently, all BoNTs were categorised into one of seven serotypes ranging from BoNT/A to BoNT/G. The recent identification of novel BoNTs and BoNT-like proteins, which are not neutralisable by existing anti-sera, has led to a classification that does not currently continue from the classical nomenclature (e.g., BoNT/X). Some C. botulinum strains have also been identified which express more than one serotype and/or chimeric neurotoxins (e.g., BoNT/CD and BoNT/DC).
Each BoNT is expressed as a single polypeptide chain of ~150 kDa (Figure 1a), after which it is cleaved post-translationally by a protease to yield an active di-chain molecule consisting of a ~50 kDa light chain (LC) and a ~100 kDa heavy chain (HC) linked by a disulphide bond. Some serotypes are cleaved into a di-chain by an endogenous host protease, while others may be cleaved in the target organism [4,5]. For example, BoNT/A purified from C. botulinum culture after 8 h is mostly present as a single-chain polypeptide, but when purified from a 96 h culture it is in the nicked di-chain form [6].
Polysialogangliosides consist of a hydrophilic complex polysaccharide with many sialic acid residues, bound to a hydrophobic ceramide tail. Different forms of these gangliosides can be found embedded in the cell membrane with the various sugar moieties displayed on the cell surface. The most common examples found on neuronal membranes include GT1b, GD1a, GD1b, and GM1. Two types of BoNT protein receptors have been identified to date: three isoforms of synaptic vesicle glycoprotein 2 (SV2A-C) and two isoforms of synaptotagmin (SytI-II). Both types are involved in the regulated secretion of neurotransmitter from synaptic vesicles [15,16]. SV2A-C contribute to the modulation of exocytosis, although their exact role is yet to be determined, while SytI and SytII are calcium-sensitive membrane proteins also involved in exocytosis [15,17,18,19]. Their involvement in synaptic vesicle exocytosis also requires them to be recycled back into the cell through endocytosis, making them excellent targets for BoNTs.
Once the BoNT has bound to its target receptors, it is internalised into a vesicle by endocytosis. The vesicle then matures into an endosome, and proton pumps reduce the internal pH, which may cause the BoNT to undergo a conformational change. The exact mechanism of translocation is still not well understood, but it is proposed that the HN forms a pore through which a partially unfolded LC passes into the neuronal cytosol [20][21][22]. The LC remains bound to the HN on the cytosolic side by a single disulphide bond, and its release requires the host protein thioredoxin (Trx) and its partner thioredoxin reductase (TrxR) (Figure 2). Disulphide cleavage is essential to intoxication, and inhibition of Trx is sufficient to block LC release [23,24]. The free LC is then able to cleave a soluble N-ethylmaleimide-sensitive factor attachment protein receptor (SNARE), which prevents vesicle-plasma membrane fusion, thus inhibiting exocytosis and release of acetylcholine, causing flaccid paralysis.

BoNT/A
Within the BoNT/A serotype there are currently eight subtypes (BoNT/A1 to /A8), which differ by between 3% and 16% at the amino acid level (Table 1). The most thoroughly characterised BoNT subtype is BoNT/A1; this is in part due to its use as a therapeutic for several conditions such as spasticity, dystonias, and glabellar lines [25,26]. The carboxyl-terminal half of the BoNT HC domain (HCC) contains the peptide motif H...SxWY...G, which constitutes the core of the ganglioside-binding site (GBS) [27]. In contrast to the dual ganglioside binding sites identified on the related TeNT HC, the GBS of BoNT/A1 can only bind one ganglioside at a time [28], but is capable of recognising more than one type, specifically GT1b, GD1a, and to a lesser extent GM1 [29,30]. The exact interactions between ganglioside and the GBS were first determined from the crystal structure of the HC domain in complex with GT1b [31] (PDB ID: 2VU9). This revealed extensive hydrogen-bonding with four of the seven individual monosaccharides within GT1b. Depletion of gangliosides in neuroblastoma cells completely prevents entry of BoNT/A1 [29]. However, gangliosides alone do not mediate cellular entry; for this, BoNT/A also requires the protein receptor SV2 [32,33]. Of the three SV2 isoforms found in humans, BoNT/A1 has the greatest affinity for SV2C [32]. BoNT/A1 binds specifically to the luminal domain 4 of SV2 (SV2-LD4) via direct backbone-backbone interactions between a β-strand of SV2-LD4 and a β-strand of the BoNT HC [34], and also through interactions with an N559-linked glycan [35]. The significance of the latter is highlighted by the inability of BoNT/A1 to bind to bacterially-expressed (i.e., non-glycosylated) SV2A or SV2B, and a reduced affinity for non-glycosylated SV2C [33,36]. The crystal structure of BoNT/A1-gSV2C-LD4 revealed a large range of interactions between the HC and the SV2 glycan, which extended away from the backbone-backbone interactions, almost doubling the contact surface area [37].
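The SxWY core of the ganglioside-binding motif described above lends itself to a simple sequence scan. The following is a minimal sketch, assuming that only the four-residue core (S, any residue, W, Y) is searched for and that the flanking H and G positions of the full motif are ignored; the function name and the toy fragment are hypothetical and are not taken from any BoNT sequence. A hit indicates only that the motif is present, not that ganglioside binding occurs.

```python
from __future__ import annotations

import re

# Core of the ganglioside-binding motif: Ser, any residue, Trp, Tyr.
# Assumption: the flanking H...G positions of the full motif are not checked.
SXWY_CORE = re.compile(r"S.WY")


def find_sxwy(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched tetrapeptide) for each SxWY hit."""
    return [(m.start() + 1, m.group()) for m in SXWY_CORE.finditer(sequence.upper())]


# Hypothetical toy fragment used purely for illustration (not a real BoNT sequence).
toy_fragment = "MKHLLNSAWYTTGAA"
print(find_sxwy(toy_fragment))  # [(7, 'SAWY')]
```

Applied to a real HC sequence (for example, one retrieved using the accession codes listed with Table 1), the same scan would report where candidate SxWY cores occur; whether a given hit lies in a functional GBS still has to be judged from structural data.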
The other subtypes of BoNT/A are predicted to bind the same receptors as BoNT/A1 due to the high sequence identity between their binding domains (Table 1). The crystal structure of the BoNT/A2 HC domain in complex with a non-glycosylated SV2C-LD4 showed that the binding mode is conserved, and despite some residue differences, it still binds SV2C [38,39]. F563 of SV2-LD4 forms a π-stacking interaction with R1156 of BoNT/A1, while a glutamic acid residue in BoNT/A2 (E1156) causes F563 to adopt a different conformation and the BoNT instead interacts directly with H564 of SV2C-LD4. Such mutations indicate flexibility with respect to the backbone-backbone interaction of SV2, which may also be tolerated due to extra interactions with the N-linked glycan [37]. BoNT/A2 has also been shown to have a higher affinity for gangliosides than BoNT/A1, although the interactions mediating this difference have not yet been identified [40]. The crystal structures of the BoNT/A3 and /A4 HC domains suggest a mode of interaction with ganglioside similar to that of BoNT/A1: the GBS of the former shows a potential loss of a hydrogen bond to one of the terminal sialic acids due to a difference in amino acid (phenylalanine rather than tyrosine), whereas the latter is conformationally conserved [41]. With regard to the SV2 binding site, both structures show slight differences in conformation compared to that of BoNT/A1 due to differences in the primary sequence. Whether this will affect interactions with the SV2 glycan is yet to be determined. Despite high sequence identity between BoNT/A subtypes, significant differences in their intoxication properties have been identified. For example, BoNT/A2 is more potent in neuronal cells than BoNT/A1, possibly due to faster cell entry [42][43][44], and BoNT/A4 has been reported to be three orders of magnitude less potent than BoNT/A1 [44]. It is difficult to attribute these differences solely to interactions between the HC and its receptors; they are more likely to result from combined contributions of the HC, HN, and LC. Uncovering the subtle structural changes resulting from sequence variation which may affect receptor affinity requires further work through structural studies of individual HC domains and their complexes with receptors.
Table 1. BoNT/A subtype primary sequence identities. Percentage identities are given for full-length and HC domain sequence alignments of each BoNT/A subtype. For the HC alignments, sequences aligned to BoNT/A1 residues 870-1296 were used. UniProt accession codes for BoNT/A1 to /A8 are A5HZZ9 [45][46][47], Q45894 [48], Q3LRX9 [49], Q3LRX8 [49], C7BEA8 [50], C9WWY7 [51], K4LN57 [52], and A0A0A7PDB7 [53], respectively.
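For readers who wish to reproduce percentage identities of the kind reported in Table 1, the calculation can be sketched as follows. This is an illustrative sketch only: it assumes the two sequences have already been aligned by an external tool (gaps written as "-") and that identity is counted over columns where neither sequence has a gap. The denominator convention used for Table 1 is not stated in the text, so this choice is an assumption, and the function name and example strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical residues over gap-free alignment columns.

    Assumption: both inputs come from the same pairwise alignment, so they
    have equal length and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences contain a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        raise ValueError("no gap-free columns to compare")
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment: 10 columns, 2 mismatches.
print(percent_identity("ACDEFGHIKL", "ACDQFGHIRL"))  # 80.0
```

Restricting both aligned sequences to the columns corresponding to BoNT/A1 residues 870-1296 before calling the function would give an HC-domain identity analogous to the second set of values described in the caption.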

BoNT/B
There are currently eight subtypes within the BoNT/B serotype (BoNT/B1 to /B8), and they differ by between 1.5% and 7% at the amino acid level (Table 2) [54]. Although the crystal structure of BoNT/B1 exists in an open conformation similar to that of BoNT/A [11], the BoNT/B serotype targets a different protein receptor on the neuronal cell membrane, namely SytI or SytII [55][56][57][58]. Crystal structures of the HC domain from BoNT/B in complex with murine SytII revealed the high specificity of the binding interface, where the SytII peptide forms a helix and binds to a hydrophobic groove via six hydrophobic residues [59,60]. Interestingly, BoNT/B displays a much lower affinity toward human SytII than murine SytII due to a single difference at residue 54 (Phe in rodents and Leu in humans) [59,61,62]. Considering that SytII is more abundant on human motor neurons than SytI, a significantly larger dose of BoNT/B needs to be administered in order to achieve a similar therapeutic effect to that of BoNT/A. To overcome this issue, the BoNT/B binding domain has been engineered (E1191M, S1199Y) to increase its binding affinity; this molecule showed an 11-fold higher functional efficacy in human cells compared to wild-type BoNT/B1 [63]. BoNT/B is only capable of entering cells once it has bound to both its synaptotagmin receptor and its ganglioside receptor, either GT1b or GD1a [28,55,64]. The crystal structure of the BoNT/B1 binding domain in complex with both SytII and GD1a shows strong interactions with the Sia5 moiety [65]. Although there is no direct contact between SytII and GD1a, there is some evidence that each can influence binding to the other, possibly due to the spatial arrangement of both binding sites [66]. In addition to the dual receptors, BoNT/B has been reported to interact directly with the cell membrane through an exposed hydrophobic loop ("lipid-binding loop") located between the ganglioside and Syt binding sites on the HC [67].

BoNT/C
BoNT/C (specifically known as BoNT/C1) is predominantly associated with botulism in animals rather than humans [2,73]. There are no subtypes of the BoNT/C serotype; only two distinct protein sequences have been identified to date (UniProtKB: P18640 [74], Q93HT3 [75]), which share 99.9% identity. Perhaps confusingly, there are two other botulinum toxins called "C2 toxin" and "C3" which are not "traditional" neurotoxins, but rather refer to different gene products: a binary AB toxin and an exoenzyme, respectively [76][77][78]. The mechanism of cell binding is of great interest because, unlike the majority of other BoNTs, no protein receptor for BoNT/C1 has yet been identified [79,80]. Interestingly, while the conserved SxWY ganglioside-binding motif is absent from the HC domain, BoNT/C1 is still able to bind gangliosides [81]. Indeed, an extended hydrophobic loop termed the "ganglioside binding loop" (GBL) was reported to be essential for neuronal binding, but the specific interactions have yet to be determined [80]. Crystal structures of the BoNT/C1 HC domain in complex with sialic acid revealed two potential binding sites that are independent of the GBS identified in other BoNTs [82,83].

BoNT/D
As with BoNT/C, there are no subtypes of BoNT/D, although multiple sequences exist that share a primary sequence identity of >96%. BoNT/D appears to recognise all three isoforms of SV2 [81,82]. Cells lacking SV2 are not intoxicated by BoNT/D, but intoxication can be restored by the expression of any of the three SV2 isoforms (SV2A, B, C) [84]. It was further demonstrated that SV2A/B knockout neurones displaying a chimeric form of SV2-LD4 (SV2A, B, or C) alone were unable to mediate BoNT/D entry despite rescuing intoxication for BoNT/A and /E. Mutation of the N537 N-linked glycosylation site also had no effect on BoNT/D entry, despite blocking entry of BoNT/E [84]. This suggests that the SV2 receptor-binding domain in BoNT/D may be distinct from that of other SV2-interacting BoNTs such as BoNT/A. Gangliosides are also required for BoNT/D cell entry [85]; however, like BoNT/C, BoNT/D does not contain an SxWY motif in the GBS, although the site is still able to recognise gangliosides [86]. It is also proposed to contain a second binding site termed Sia-1, since mutation of this site results in reduced ganglioside binding [87].

BoNT/E
There are currently twelve known BoNT/E subtypes (BoNT/E1 to /E12) whose amino acid identities vary by up to 12% (Table 3). The protein receptor for BoNT/E is SV2, although only isoforms SV2B and SV2C are capable of mediating entry [36,88], and in the presence of gangliosides [89]. The SxWY motif is conserved in the BoNT/E GBS, and direct binding of GT1b has been observed [90]. No crystal structures of BoNT/E in complex with receptor or ganglioside have yet been solved. Therefore, the precise molecular basis of these interactions has yet to be determined. The native crystal structure of BoNT/E has been solved, and it reveals a conformation that is significantly different from that of BoNT/A and BoNT/B [12]. In this structure, the HC domain wraps around the toxin, giving the protein a more compact shape overall. BoNT/E is capable of entering cells much more quickly than BoNT/A [91], and this domain organisation has been proposed to prime the toxin for translocation, resulting in a faster onset of paralysis [12]. However, investigations using various chimeras of BoNT/A1 and BoNT/E1 showed that the speed of translocation is not affected by the binding domain [92].

BoNT/F
[...] (Table 4). The exact protein receptor for BoNT/F1 has been reported to be glycosylated SV2 [81,100], but this remains to be established conclusively. For example, one study showed that BoNT/F activity decreased when HC/A was introduced as a competitor molecule [81], whereas a separate study demonstrated that BoNT/F1 entry was unaffected by a double SV2A/B knockout in cortical neurones (which have negligible expression of SV2C) [101]. For ganglioside binding, BoNT/F1 requires gangliosides containing an α2,3-linked sialic acid on the terminal galactose (i.e., GT1b or GD1a) [81,100]. The SxWY motif is conserved in BoNT/F, and the crystal structure of the HC domain from BoNT/F1 in complex with GD1a confirmed the existence of a GBS [102].

BoNT/G
Only two protein sequences of BoNT/G are currently known to exist, and they share 99.9% amino acid identity. The protein receptor for BoNT/G is either SytI or SytII, although interestingly the interface diverges from that of BoNT/B and it has a lower binding affinity [58,106,107]. Only 5 of the 14 residues involved in the BoNT/B-SytII interaction are conserved [57,106]. BoNT/G also displays a low affinity for the human SytII receptor due to a human/chimpanzee-specific mutation [61]. The BoNT/B HC domain was successfully engineered to improve human SytII binding, and a similar approach would be worth investigating here [63]. BoNT/G possesses the conserved SxWY motif in its GBS, and binds preferentially to GT1b [108]. In addition to the dual-receptor interactions, BoNT/G also contains a "lipid-binding loop" (residues 1252-1256) similar to that of BoNT/B, which can directly interact with the cell membrane to contribute further to binding affinity [67,106], and deletion of this loop dramatically decreased neurotoxicity [67].
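The receptor assignments described in the sections above for BoNT/A to /G can be gathered into a small lookup structure for quick comparison. The sketch below simply transcribes statements made in this review; the class and field names are hypothetical, `None` marks properties the text does not state, and an empty tuple means that no specific species is named in the text (not that binding is absent).

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReceptorProfile:
    protein_receptors: tuple[str, ...]   # as named in the text; () = none identified
    named_gangliosides: tuple[str, ...]  # () = binds gangliosides, species not named
    sxwy_motif: Optional[bool]           # None = not stated in the text


RECEPTORS = {
    "A": ReceptorProfile(("SV2A", "SV2B", "SV2C"), ("GT1b", "GD1a", "GM1"), True),
    "B": ReceptorProfile(("SytI", "SytII"), ("GT1b", "GD1a"), None),
    "C": ReceptorProfile((), (), False),  # no protein receptor identified yet
    "D": ReceptorProfile(("SV2A", "SV2B", "SV2C"), (), False),
    "E": ReceptorProfile(("SV2B", "SV2C"), ("GT1b",), True),
    "F": ReceptorProfile(("SV2",), ("GT1b", "GD1a"), True),  # SV2 reported, not conclusive
    "G": ReceptorProfile(("SytI", "SytII"), ("GT1b",), True),
}


def serotypes_using(receptor_family: str) -> list[str]:
    """Serotypes with at least one protein receptor in the given family (e.g. 'SV2')."""
    return sorted(
        serotype
        for serotype, profile in RECEPTORS.items()
        if any(r.startswith(receptor_family) for r in profile.protein_receptors)
    )


print(serotypes_using("Syt"))  # ['B', 'G']
print(serotypes_using("SV2"))  # ['A', 'D', 'E', 'F']
```

Grouping by receptor family in this way makes the split between synaptotagmin-binding and SV2-binding serotypes explicit, while leaving the subtype-level differences discussed in the text outside its scope.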

Mosaic/Chimeric BoNTs
BoNTs composed of domains from different serotypes also exist in nature. The most common of these chimeric toxins are discussed below.

BoNT/CD
BoNT/CD is a mosaic toxin composed of an LC domain and an HN domain that are most similar to those of BoNT/C and an HC domain that is most similar to that of BoNT/D. Interestingly, the binding domain of BoNT/CD binds synaptosomes more tightly than BoNT/D [79]. This may be due to residues K1118 and K1136 (which differ from the equivalent residues in BoNT/D, E1114 and G1132), since mutation of these lysines results in a dramatic loss in synaptosome binding affinity [109]. Protein residues which may interact with a ganglioside have also been identified through crystallisation with a sialic acid molecule [110].

BoNT/DC
The BoNT/DC chimera possesses an LC domain and an HN with 96% sequence identity to BoNT/D and an HC domain similar to that of BoNT/C (74% sequence identity) [111,112]. Botulism caused by BoNT/DC is usually found outside of humans in birds and other mammals, but the toxin is also capable of binding human neuronal cells [62,112]. Despite having a binding domain similar to BoNT/C, BoNT/DC binds to either SytI or SytII. This interaction is mediated by hydrophobic residues, and is distinct from that of BoNT/B [113]. The BoNT/DC protein is particularly interesting, as it appears that it may not require complex gangliosides to enter target neurones [114,115]. However, the crystal structure of BoNT/DC in complex with Sialyl-T suggests that BoNT/DC is capable of recognising a single sialic acid, and thus potentially a range of membrane-bound sugars. The structure also reveals the presence of an extended "lipid-binding loop" that is also observed in BoNT/B and BoNT/G [67,114].

BoNT/HA(FA)
BoNT/FA was identified in 2014 from a case of infant botulism [116,117]. At the time it was referred to as BoNT/H (and sometimes still as BoNT/HA) due to its non-neutralisable antigenicity [116], and phylogenetic analysis of the bont sequences placed the gene in a lineage distinct from other serotypes [117]. The sequence was finally released to the scientific community after a protracted period of data restriction due to supposed safety concerns [118][119][120]. It was determined that the molecule was a mosaic toxin composed of an LC similar to that of BoNT/F5, an HN domain similar to that of BoNT/F1, and an HC domain similar to that of BoNT/A1 [121]. SV2 is likely the protein receptor for BoNT/FA, and direct binding of the BoNT/FA HC domain has been confirmed for glycosylated SV2C-LD4 [122]. Crystal structures of this binding domain show some slight differences with respect to BoNT/A1 [122,123]: the BoNT/FA sequence contains mutations which result in decreased affinity towards the protein backbone of SV2, as determined by a pull-down assay against non-glycosylated SV2C, while the equivalent residues involved in glycan binding remain unchanged [122]. The effect of these mutations towards different isoforms of SV2 remains to be seen. The GBS appears to maintain the same fold as that observed for BoNT/A1, but no ganglioside-bound structure of BoNT/FA has yet been solved, so the exact interactions remain to be determined [123]. In recent assays using cultured rat embryonic spinal cord neurones and rat cortical neurones, BoNT/FA was found to be much more potent than BoNT/A1. However, counterintuitively, the toxin was much less potent when assayed using an ex vivo mouse phrenic nerve hemidiaphragm (mPNHD). These results, along with the methods used for each assay, point toward a toxin that may have a slow speed of onset despite a highly active LC [124]. Understanding the interactions of BoNT/FA with its receptors is crucial both to determining what causes intoxication differences and to developing novel therapeutics.

BoNT/X
A strain of C. botulinum that was already known to express BoNT/B was recently found to contain the gene for another BoNT molecule that shared low primary sequence identity with other serotypes (<30%); this was named BoNT/X [125]. It is unknown whether this molecule is capable of causing human botulism, but interestingly, its LC cleaved non-canonical substrates such as VAMP4, VAMP5, and Ykt6 [125]. This suggests that this toxin diverged significantly from other serotypes during its evolution. Despite this, recent structural characterisation of the LC has revealed a core fold common to all BoNTs [126]. Little is known about the BoNT/X HN and HC domains, and considering the novel characteristics of the LC, attempts are underway to determine the specific receptor(s) that it targets and how it functions in vivo. The BoNT/X HC does contain an SxWY sequence motif, indicating that it potentially shares similar ganglioside binding characteristics with other BoNTs. Due to its divergence and low sequence similarity to existing BoNTs, structural and functional characterisation could lead to new insights into receptor binding that could be exploited for future therapeutics.

BoNT-Like Proteins
Considering that BoNTs are the deadliest biological agents that exist, it was surprising to find BoNT-like proteins produced by non-Clostridium species. The first was found in 2015 and is referred to as "BoNT/Wo", named after the bacterium that produced it, Weissella oryzae SG25 [127,128]. BoNT/Wo cleaves VAMP at a unique location (Trp89-Trp90) [129], but it does not contain any typical BoNT motifs in the receptor-binding domain. This would be consistent with zero reported cases of botulism in humans. Indeed, it has been speculated that BoNT/Wo may instead target SNARE-mediated plant defence systems [128]. More recently, another BoNT-like gene cluster was discovered in the bacterium Enterococcus faecium, which is a ubiquitous commensal microorganism commonly found in the gut of mammals. The BoNT-like protein, referred to as BoNT/En or eBoNT/J, possesses many traditional BoNT motifs, including an HExxH zinc-binding motif in the LC and a ganglioside-binding SxWY motif in the HC domain [130,131]. Early studies indicate that rodents do not possess the receptor(s) for BoNT/En intoxication [130].

Conclusions
BoNTs are highly specific and potent exotoxins that are being exploited for therapeutic gain. Our knowledge of the molecular aspects of botulinum neurotoxin, such as the mechanism of cell targeting and internalisation, is incomplete and mostly limited to only one or two serotypes (i.e., BoNT/A1 and BoNT/B1). We have yet to fully understand the binding mechanism of others and also how subtle amino acid differences may result in differences of intoxication (i.e., between subtypes). From what we know so far, X-ray crystallography has suggested that the mechanism of binding is more complex than was initially thought. It is possible that BoNTs may accommodate heterogeneous glycosylation of their protein receptors and target a variety of gangliosides to ensure successful binding to their target cell type. It is not a trivial task to determine how BoNTs bind to their receptors on neuronal cell membranes, especially when trying to replicate the conditions in vivo. The recent discovery of new BoNTs and BoNT-like molecules in other bacterial species raises questions regarding the evolution of the bont gene cluster, its ability to be transferred between species, the potential implications for biosafety, and the need for an agreed-upon, consistent naming convention to avoid confusion and ambiguity [132,133]. Fast characterisation of these novel toxins and the generation of neutralising antibodies against them are required. Despite the potential dangers posed, this knowledge may lead to the generation of new and safer therapeutics. In particular, atomic data of the receptor-binding domains from individual subtypes could be used for structural and functional analyses, providing insights for the design of novel BoNTs [134]. In summary, this review highlights the need for further functional and structural characterisation of different BoNT subtypes to improve our understanding of what determines the toxicological differences and how they may be used in therapeutics.