The Dual α-Amidation System in Scorpion Venom Glands

Many peptides in scorpion venoms are amidated at their C-termini. This post-translational modification is paramount for the correct biological function of ion channel toxins and antimicrobial peptides, among others. The discovery of canonical amidation sequences in transcriptome-derived scorpion proproteins suggests that a conserved enzymatic α-amidation system must be responsible for this modification of scorpion peptides. A transcriptomic approach was employed to identify sequences putatively encoding enzymes of the α-amidation pathway. A dual enzymatic α-amidation system was found, consisting of the membrane-anchored, bifunctional, peptidylglycine α-amidating monooxygenase (PAM) and its paralogs, soluble monofunctional peptidylglycine α-hydroxylating monooxygenase (PHMm) and peptidyl-α-hydroxyglycine α-amidating lyase (PALm). Independent genes encode these three enzymes. Amino acid residues responsible for ion coordination and enzymatic activity are conserved in these sequences, suggesting that the enzymes are functional. Potential endoproteolytic recognition sites for proprotein convertases in the PAM sequence indicate that PAM-derived soluble isoforms may also be expressed. Sequences potentially encoding proprotein convertases (PC1 and PC2), carboxypeptidase E (CPE), and other enzymes of the α-amidation pathway, were also found, confirming the presence of this pathway in scorpions.


Introduction
The order Scorpiones constitutes one of the most ancient lineages within the phylum Arthropoda [1,2]. The key to the ecological success of these arachnids resides in the production of potent venoms used for feeding, defense, and deterring competitors [3,4]. Scorpion venoms are complex mixtures of components, including bioactive peptides with potential therapeutic applications [4], enzymes, metabolites, and, most importantly, an arsenal of toxins active on Na+, K+, Ca2+, and Cl− channels [5][6][7][8][9][10]. By altering the normal mechanics of these channels, scorpion toxins unleash systemic havoc in their victims, which can lead to severe envenomation symptoms, including death [11,12]. The venom is produced and secreted by two symmetrical glands located in the last segment of the metasoma, the telson [12]. In these glands, the peptidyl venom components undergo synthesis and maturation, a complex process involving a series of post-translational modifications (PTMs) that result in the biologically active molecules [3]. The most common PTMs found in scorpion venom peptides are the formation of disulfide bridges from pairs of cysteines, proteolytic cleavage, and C-terminal amidation (α-amidation). Amidated toxins and peptides without disulfide bonds (NDBPs) are well known in scorpion venoms (Table 1). Post-translational α-amidation is characteristic of bioactive peptides from many eukaryotic taxa [13]. C-terminal amidation makes peptides more resistant to degradation by carboxypeptidases, thus increasing their half-lives and decreasing their turnover rates [14]. Moreover, amidation is essential for the correct functioning of many mammalian neuropeptides and hormones [15]. Several scorpion toxins have also been shown to require C-terminal amidation for full biological activity; without it, potency is severely reduced [16,17].
C-terminal amidation results in a change with two possible functional implications: the amidated terminal amino acid could be directly involved in molecular recognition events, or the amidation could simply reduce the negative charge of the carboxyl moiety and the peptide as a whole [18].
In general, metazoan amidated peptides are translated as larger polypeptidyl precursors, which contain an amidation signal, a glycine typically followed by one or two basic residues (R-X-Gly-Basic or R-X-Gly-Basic-Basic) and the rest of the propeptide sequence. This signal is first targeted by endoproteolytic proprotein-processing enzymes of the proprotein convertase family (PCs), resulting in peptides terminated with basic residues, which are substrates for carboxypeptidases that remove those residues from the processing intermediate and expose the C-terminal glycine. This glycine is then subjected to sequential reactions that amidate the peptide [19,20] (Figure 1A). Two enzymatic activities catalyze these reactions. First, a peptidylglycine α-hydroxylating monooxygenase (PHM, EC 1.14.17.3) catalyzes the hydroxylation of the glycine residue, using ascorbate and molecular oxygen as co-substrates. Then, a peptidyl-α-hydroxyglycine α-amidating lyase (PAL, EC 4.3.2.5) cleaves the hydroxyglycine residue, yielding the amidated product and glyoxylate [21,22] (Figure 1B).
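The amidation signal described in this section (a glycine followed by one or two basic residues) can be located in a translated precursor with a simple pattern search. The following is a minimal sketch, not the procedure used in this work: the function names and the toy precursor are hypothetical, positions are 0-based, and the sketch ignores the signal peptide and any other processing of the precursor.

```python
import re

# Gly followed by one or two basic residues (Lys or Arg).
# The lookahead captures the basic residues without consuming them,
# so adjacent candidate signals are all reported.
AMIDATION_SIGNAL = re.compile(r"G(?=([KR]{1,2}))")


def find_amidation_signals(precursor):
    """Return (glycine_index, basic_residues) for each candidate signal."""
    seq = precursor.upper()
    return [(m.start(), m.group(1)) for m in AMIDATION_SIGNAL.finditer(seq)]


def predicted_amidated_peptide(precursor, gly_index):
    """Residues N-terminal to the signal glycine.

    After convertase and carboxypeptidase processing, the glycine donates
    the amide nitrogen, so the residue before it becomes the amidated
    C-terminus.
    """
    return precursor[:gly_index]


# Hypothetical toy precursor, not a real scorpion sequence.
toy_precursor = "MKTLACDEFWGKRSSEE"
signals = find_amidation_signals(toy_precursor)
print(signals)
print(predicted_amidated_peptide(toy_precursor, signals[0][0]))
```

A hit only flags a candidate site; whether a given peptide is actually amidated still has to be confirmed biochemically or by proteomics.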
Peptide amidation seems to be common to all metazoans, and PHM and PAL are assumed to have monophyletic origins [23]. However, the way these activities are expressed differs among taxa. For example, in the insects Apis mellifera, Drosophila melanogaster and others, PHM and PAL are encoded by independent genes [24][25][26]. In vertebrates, such as Bos taurus, Rattus norvegicus, Xenopus laevis, Homo sapiens and others, a single gene encodes both activities in a bifunctional enzyme comprising a single polypeptide, peptidylglycine α-amidating monooxygenase (PAM) [27][28][29][30]. The same two-domain PAM structure was reported for the gastropod Aplysia californica [31]. Curiously, another gastropod, Lymnaea stagnalis, produces a zymogen comprising four different PHM domains and a single PAL domain, which is endogenously converted to a mixture of monofunctional isoenzymes [32].
PAM isoforms have been reported in R. norvegicus, with up to seven isoforms generated by alternative splicing. These isoforms include configurations with and without internal proteolytic sites, resulting in both independent PHM and PAL, and the bifunctional PAM [27,33]. Among arthropods, independent PHM and PAL, but not the bifunctional PAM, are expressed in insects, as indicated above, whereas both independent and bifunctional enzymes are expressed in crustaceans [26]. No information is available on the α-amidating system of other subphyla, e.g., the chelicerates, and in particular, the arachnids.
Amidated peptides are common in venoms produced by various animals. For example, marine snails of the genus Conus produce a large array of peptidyl toxins (conotoxins), a significant fraction of which are amidated. cDNAs encoding bifunctional PAMs were cloned from the venom ducts of cone snails, and the heterologously expressed PAMs were demonstrated to be active [34]. Although no amidating system has been described in arachnids, the abundance of amidated peptides in their venoms, in particular scorpion venoms, suggests that amidating enzymes are active in their venom glands. The correlation between canonical α-amidation signals in transcripts from different transcriptomic analyses and amidation of the mature encoded peptides, confirmed by biochemical and proteomic analyses [35][36][37][38][39], indicates that the classical PHM plus PAL and/or PAM systems must be present in scorpion venom glands. In this work, the enzymatic amidation system of Old and New World scorpions is assessed by transcriptomic analysis.

The Dual Enzymatic System for α-Amidation in the Order Scorpiones
We investigated venom gland transcriptomes of 21 scorpion species and the genome of Centruroides sculpturatus and identified sequences encoding orthologs of the bifunctional PAM enzyme in 13 of them. Partial sequences for PAM were found in the remaining eight transcriptomes. A 14th complete PAM-coding sequence was recovered by RT-PCR from venom-gland total RNA from the scorpion Centruroides noxius (Table 2 and Supplementary Table S2). Complete coding sequences (CDS) from those transcripts translate into proteins of 861-887 amino acids (Supplementary Figure S1). The deduced topology of the scorpion PAM precursor is similar to that of the PAM-2 isoform described for Rattus norvegicus (Figure 2A,B). A signal peptide sequence (SP) for secretion is followed by a short propeptide (PP) region, a PHM domain, a linker sequence (Linker 1), a PAL domain, a second linker sequence (Linker 2), a membrane-spanning domain (MSD), and a cytosolic domain (CD) (Figure 2A). Compared with the rat PAM-1 isoform, the rat PAM-2 isoform lacks the linker region encoded by Exon A. This extra region contains an endoproteolytic site whose processing separates the PHM and PAL domains into independent monofunctional polypeptides. The Exon A-encoded region has been described only for vertebrates [23], and has no equivalent sequence in the scorpion PAM (Supplementary Figure S2). Notably, although the scorpion PAM lacks this region, two putative endoproteolytic sites are still present in its sequence (Figure 2A). The first site, defined by a lysine dyad (KK), is located between the PHM and PAL domains, and is proposed to delimit the PHM domain. The second site, located between the PAL sequence and the MSD, is also defined by a KK dyad and, if subjected to post-translational processing, would liberate a soluble PAL enzyme from the MSD and CD domains. Thus, the scorpion bifunctional PAM enzyme could be post-translationally processed to generate independent, soluble PHM and PAL enzymes.
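As an illustration of how the putative KK cleavage sites can be located relative to the domain boundaries, the sketch below lists every lysine dyad in a sequence and keeps those that fall inside a given region such as a linker. It is a hedged example only: the function names, the toy sequence, and the region coordinates are hypothetical and do not correspond to any scorpion PAM sequence.

```python
def find_dyads(seq, dyad="KK"):
    """0-based start positions of every occurrence of the dyad.

    Overlapping occurrences are reported too (e.g. 'KKK' yields two hits).
    """
    positions = []
    start = seq.find(dyad)
    while start != -1:
        positions.append(start)
        start = seq.find(dyad, start + 1)
    return positions


def sites_in_region(positions, region_start, region_end):
    """Keep only the sites inside the half-open interval [region_start, region_end)."""
    return [p for p in positions if region_start <= p < region_end]


# Hypothetical toy sequence with two KK dyads.
toy = "AAAAKKAAAAAAKKAAAA"
dyads = find_dyads(toy)
print(dyads)
# Hypothetical linker spanning positions 3 to 7.
print(sites_in_region(dyads, 3, 8))
```

A dyad found this way is only a candidate; whether a convertase actually cleaves it depends on sequence context and accessibility.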
Shorter transcripts encoding the monofunctional PHM and PAL enzymes (PHMm and PALm) were also identified in most of the analyzed scorpion transcriptomes and in the genome (Table 2 and Supplementary Table S2). The encoded proteins are 345-350 (PHMm) and 356-366 (PALm) amino acids long (Supplementary Figures S3 and S4). Topologies of the monofunctional enzymes are similar to those of the PHM and PAL-2 isoforms from D. melanogaster (Figure 2C). The proproteins include an SP and the catalytic domain. No MSD or CD is detected; therefore, the monofunctional enzymes are predicted to be soluble.
[Table 2 footnotes] PC2 and CPE sequences with more than 50% of the sequence determined; ( ): partial sequences with less than 50% of the estimated total sequence determined; a: PAM sequence amplified by PCR; b: PAM sequence verified by DNA sequencing; *: Old World scorpion. The tblastn and blastn algorithms were used to identify sequences in the local scorpion transcriptomic databases, with an e-value of 1 × 10^-6. Empty spaces indicate that no sequences were identified in those transcriptomes.
Key residues involved in catalysis and metal coordination are conserved in both scorpion amidation systems (Figure 2A and Supplementary Figures S5 and S6), suggesting that these enzymes are probably functional. The percentages of sequence identity between homologous domains of the bifunctional and independent enzymes for each species are given in Supplementary Table S3. As an example, for C. noxius, the PHM and PAL domains of PAM share 29.8% and 32.5% identity with PHMm and PALm, respectively.
Sequences encoding other components of the α-amidation pathway were also sought among available scorpion transcriptomic/genomic sequences. Transcripts encoding orthologs of proprotein convertases 1 and 2 (PC1 and PC2) and carboxypeptidase E (CPE), enzymes that operate upstream in the α-amidation pathway (Figure 1A), were also found, as well as their genes in the Centruroides sculpturatus genome (Table 2, Supplementary Table S2), reinforcing the notion of a conserved α-amidation pathway in scorpions.
These results indicate that in scorpions, a dual enzymatic system for α-amidation is responsible for the amidation of venom peptides. Transcripts for both the bifunctional PAM and the monofunctional PHMm and PALm are present in scorpion venom glands. Among arthropods, a similar dual system is present in crustaceans, but not in insects [26].
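The percentage identities between homologous domains reported in this section are computed from pairwise alignments. A minimal sketch of one common convention follows; it assumes that the two sequences are already aligned to the same length and that columns containing a gap are excluded from the denominator. The source does not state which convention was used, and the function name and toy alignment are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment: 5 identical residues out of 6 ungapped columns.
print(round(percent_identity("ACDE-FG", "ACDQKFG"), 1))
```

Other conventions (for example, dividing by the full alignment length or by the shorter sequence) give different values, so the convention should be reported alongside the numbers.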

The PAM-, PHMm- and PALm-Coding Genes
The search for genomic sequences in C. sculpturatus using blastn showed that separate genes encode the bifunctional and monofunctional enzymes, demonstrating that they are encoded by paralogous genes and are not the result of alternative splicing, a phenomenon reported in the generation of isoforms in R. norvegicus [27,33]. Figure 3A shows the structure of the genes for the PAM, PHMm and PALm enzymes in C. sculpturatus, including their sizes, exon numbers and distributions. The structures of the rat PAM and fruit fly PHM and PAL genes are also shown for comparison (Figure 3B,C).

Phylogenetic Reconstruction of Amidating Enzymes of Arachnids
Phylogenomic analyses have proposed two basal branches from which all scorpions have descended (parvorders Buthida and Iurida) [1,40,41]. Maximum likelihood analyses with the nucleotide sequences of the PHM and PAL domains from the PAM (designated as phm-PAM and pal-PAM in these analyses, respectively) and the PHMm and PALm enzymes show a correlation between the phylogeny of these enzymes and the phylogeny of the scorpion families from which they originate [40]. Figures 4 and 5 show the two main clades, in which the phm-PAM and pal-PAM sequences are separated from the PHMm and PALm sequences, respectively. Within those clades, there is a clear divergence between sequences from species belonging to the family Buthidae (parvorder Buthida) and sequences from species belonging to families of the parvorder Iurida (Vaejovidae, Caraboctonidae, Euscorpiidae, Chactidae, Superstitionidae, Diplocentridae, Urodacidae, Scorpionidae). Within the family Buthidae, sequences from the Old World scorpions Leiurus abdullahbayrami (Turkey) and Mesobuthus martensii (Eastern Asian countries) are placed in an independent, supported clade that precedes the clade of New World species. The sequences from Tityus trivittatus (from the south-central part of South America, Argentina, and Brazil) are in independent supported clades with respect to those of the genus Centruroides (distributed in Central America, the Caribbean, and North America). The variable number of sequences recovered from different transcriptomes limits comparative analyses of the catalytic domains, though a consistent topology for the phylogenetic trees is observed.
Sequences putatively encoding a dual amidation system, as in scorpions, were also found in other arachnids, including members of the orders Araneae (Liphistius malayanus, Frontinella communis, Parasteatoda tepidariorum, Leucauge venusta), Opiliones (Siro boyerae, Trogulus martensi), and Ricinulei (Ricinoides atewa), and in the xiphosuran Limulus polyphemus (recently placed within the class Arachnida [42]), among others (Supplementary Table S1). This indicates that the same dual α-amidation system is also employed by other arachnids.

Discussion
Venom gland transcriptomic analyses performed with representative scorpion families from both the Old and New Worlds have shown the enormous diversity of compounds that comprise these important biofluids [4]. Together with available biochemical information on scorpion venom components, sequences of many transcripts indicate that amidation is one of the most common PTMs of scorpion venom peptides. The discovery of canonical amidation signals in the translated sequences suggested that a conserved α-amidation system might be present in scorpion venom glands to convert propeptides into shorter, amidated, mature peptides. In this work, transcripts encoding the relevant components of this pathway are described, confirming that a dual amidation system, including a bifunctional PAM enzyme and individual non-membrane-bound PHMm and PALm, is employed. Genes for this dual system were found in the genome of C. sculpturatus, demonstrating that the bifunctional and the monofunctional enzymes are encoded by independent genes and are not the result of alternative splicing. Paralogs involved in various developmental processes and cellular functions within the orders Scorpiones and Araneae arose as a consequence of a genome duplication in the common ancestor of scorpions and spiders [43,44]. Given the importance of amidation in peptide signaling and the functionality of toxins and other amidated venom peptides, it is not surprising that both amidation enzyme systems were retained in this lineage of venomous arachnids, where they evolved to target specific substrates, or to be expressed in particular cell types or physiological conditions. Together with conserved functional residues for cation coordination and enzymatic activity, the scorpion PAM sequence contains all the structural elements for generation of a membrane-anchored protein.
However, the sequence of the bifunctional PAM contains putative endoprotease cleavage sites (dyads of basic amino acids), which are normally targeted by proprotein convertases, flanking the catalytic domains. This means that the PAM proprotein could in principle be processed to the complete membrane-bound two-domain enzyme or it could be post-translationally cleaved by convertases to render soluble monofunctional domains. The presence of transcripts encoding convertases in the scorpion venom glands, also described in this work, reinforces this possibility. Whether both the two-domain PAM and the PAM-derived monofunctional enzymes coexist in the venom gland remains to be established. We expect that the soluble PHMm and PALm, as well as the putative PAM-derived soluble isoforms, are secreted by the venom glands into the venom. This has been confirmed, at least for PHMm with liquid chromatography-mass spectrometry (LC-MS/MS) in scorpion venom proteomic analyses. Although it is not clear what additional functions they might have in scorpion venom, it is known that the bovine PAM enzyme is capable of catalyzing three alternative reactions: sulfoxidation, N-dealkylation of amines and O-dealkylation [45]. This raises the possibility of finding new natural substrates for this set of enzymes and taking advantage of their catalytic capacities for synthesis or chemical modification of molecules of biotechnological interest.
Other proteomic analyses have confirmed the presence of putative amidating enzymes in arachnid venoms. One of these sequences was reported as a PAM from the spider Cupiennius salei (annotated as PAM_CUPSA [MH766628]) [46]. However, a rigorous sequence analysis demonstrates that this sequence is not from a PAM ortholog, but a monofunctional PHMm. Similarly, for the scorpion Tityus obscurus, a sequence reported as a PAM (GenBank: JAT91064) [38], shares 87% sequence identity with the PHMm from T. trivittatus, as reported here, and is therefore also a PHMm. A third report found a PHMm sequence in transcriptomic and proteomic analyses of the scorpion C. hentzi (annotated as GFWZ01000197.1 TSA: Centruroides hentzi Chent_MonoO transcribed RNA sequence) [47]. Sequences encoding orthologs of PHMm were also identified in venoms of Centruroides limpidus, Centruroides hirsutipalpus and Superstitionia donensis (data not shown). Therefore, this constitutes the first report of the monofunctional PALm and the bifunctional PAM enzymes from any arachnid, and demonstrates that a conserved, functional dual α-amidation system is present in scorpion venom glands, as well as in other arachnids.

Identification and Annotation of Amidating Enzymes in Scorpions and Related Organisms
Sequences putatively encoding PAM, PHM and PAL homologs were identified in transcriptomes using tBLASTn, with the sequence of the R. norvegicus PAM (UniProt P14925) as query. Recovered nucleotide sequences were translated with the ExPASy server [56]. The presence and organization of characteristic domains were evaluated with NCBI-CDART [57] in accordance with [23]. Other sequence hallmarks were identified: the signal peptide (SP) with SignalP 4.1 and Phobius [58,59], the propeptide region (PP) with ArachnoServer v. 3.0 [60] and the transmembrane domain with the TMHMM server v. 2.0 [61]. Identification and delimitation of the catalytic domains and the residues involved in metal coordination and disulfide formation were performed manually by sequence alignment with the reference R. norvegicus PAM (UniProt P14925). Potential glycosylation sites were predicted with the NetNGlyc 1.0 Server (http://www.cbs.dtu.dk/services/NetNGlyc/). The annotation of each determined sequence can be found in Supplementary Table S1. The sequences were submitted to the European Nucleotide Archive (ENA) under project PRJEB32831.
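The hit-selection step of such a search can be sketched as a filter over BLAST tabular output. The example below assumes the standard 12-column tabular format (`-outfmt 6`) and applies the e-value of 1 × 10^-6 mentioned in the Table 2 notes as an upper threshold; whether the authors used tabular output is not stated, and the function name and contig identifiers are hypothetical.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def filter_hits(tabular_text, max_evalue=1e-6):
    """Return hits (as dicts) whose e-value is at or below max_evalue."""
    hits = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(FIELDS, row))
        if float(hit["evalue"]) <= max_evalue:
            hits.append(hit)
    return hits


# Hypothetical output for the rat PAM query (P14925) against two contigs.
example = (
    "P14925\tcontig_001\t45.2\t800\t400\t10\t30\t830\t100\t2500\t2e-45\t650\n"
    "P14925\tcontig_002\t30.0\t50\t35\t0\t1\t50\t1\t150\t0.5\t25.1\n"
)
for hit in filter_hits(example):
    print(hit["sseqid"], hit["evalue"])
```

Retained contigs would then be translated and checked for domain organization as described in the paragraph above.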

Amplification and Cloning of the PAM Sequence from Centruroides noxius
Total RNA was extracted from the telson of a single female C. noxius using an SV Total RNA Isolation System kit (Promega Corporation, Madison, WI, USA). cDNA was synthesized with a First Strand cDNA Synthesis Kit for RT-PCR (AMV) (Roche, Basel, Switzerland). Primers Cen-Fw3 (5′-GAT CTT GTA AAC GGC GTA TTT CCC TT-3′) and Cen-Rv4 (5′-CCG ATA TCC TCC CAA CCA TCC TTT C-3′) were designed from the consensus of the PAM sequences from two scorpions of the genus Centruroides (C. limpidus and C. orizaba). Amplification conditions were 3 min at 96 °C, followed by 30 cycles of 3 s at 96 °C, 1 min at 56 °C and 2 min at 68 °C, plus a final step of 5 min at 68 °C. A recombinant Pfu polymerase produced in-house was used. The PCR product was purified with the QIAQuick Gel Extraction Kit (QIAGEN GmbH, Hilden, Germany), ligated into an EcoRV-digested pBluescript II KS(+) vector, and electroporated into electrocompetent Escherichia coli DH5α cells. Positive clones were selected with the blue/white system by growing the cells in X-Gal/IPTG-supplemented LB/ampicillin medium. Plasmids were prepared by alkaline lysis and submitted to sequencing with the primers T7-Like

Multiple Alignments and Phylogeny Reconstruction of PAM, PHM, and PAL
Phylogenetic reconstruction of the PAM enzyme and its two catalytic domains phm-PAM and pal-PAM (with the suffix '-PAM' used to differentiate them from those of the monofunctional enzymes), and of the independent enzymes PHMm and PALm (with the suffix 'm', for 'monofunctional') was performed using the maximum likelihood (ML) method with nucleotide sequences. Additional sequences from phylogenetically related organisms (external groups) were obtained from NCBI or assembled from transcriptome raw reads deposited at SRA-NCBI. All sequences were aligned with MAFFT v7.407 [62]. The best substitution model (GTR+F+I+G4) and the ML analysis were evaluated with IQ-TREE v1.6.9 [63,64], using the ultrafast bootstrap method (UFBoot2) [65] with 10,000 replicates.
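For readers who wish to reproduce a comparable analysis, the alignment and tree-building steps can be sketched as two command-line calls. This is an assumption-laden outline, not the authors' exact invocation: the file names are hypothetical, the `--auto` MAFFT strategy is our choice because the options used are not stated, and `-bb` is the ultrafast bootstrap option of IQ-TREE 1.x. The model and replicate number are those given above.

```python
import subprocess


def mafft_command(fasta_in):
    """MAFFT call; '--auto' lets MAFFT pick a strategy (an assumption here)."""
    return ["mafft", "--auto", fasta_in]


def iqtree_command(alignment, model="GTR+F+I+G4", ufboot_replicates=10000):
    """IQ-TREE 1.x call: -s alignment, -m model, -bb ultrafast bootstrap replicates."""
    return ["iqtree", "-s", alignment, "-m", model, "-bb", str(ufboot_replicates)]


def run_pipeline(fasta_in, alignment_out):
    """Align with MAFFT (which writes to stdout), then build the ML tree.

    Requires the 'mafft' and 'iqtree' executables to be installed and on PATH.
    """
    with open(alignment_out, "w") as handle:
        subprocess.run(mafft_command(fasta_in), stdout=handle, check=True)
    subprocess.run(iqtree_command(alignment_out), check=True)


# Hypothetical file name; only the command is printed, nothing is executed.
print(" ".join(iqtree_command("phm_domains.aln.fasta")))
```

Calling `run_pipeline("phm_domains.fasta", "phm_domains.aln.fasta")` would execute both steps; model selection itself is not shown in this sketch.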

Genomic Organization of Scorpion PAM, PHM, and PAL
Genome sequences of C. sculpturatus (NCBI:txid218467) corresponding to the amidating enzymes were recovered from NCBI using BLASTn, with the nucleotide sequences for PAM, PHM, and PAL from C. limpidus as queries. Identification of introns and exons was performed with the Splign utility [66].
Supplementary Materials: The following are available online at http://www.mdpi.com/2072-6651/11/7/425/s1: Figure S1: Schematic alignment of PAM sequences with >90% of the estimated sequence determined, Figure S2: Schematic alignment of PAM1 and PAM2 isoforms from R. norvegicus and the completely sequenced scorpion PAM, Figure S3: Schematic alignment of the PHMm sequences found in 22 analyzed scorpion transcriptomes, Figure S4: Schematic alignment of the 20 PALm sequences found in 22 analyzed scorpion transcriptomes, Figure S5: Sequence alignment of PHM domains, Figure S6: Sequence alignment of PAL domains, Table S1: Nomenclature of transcripts in various scorpion species, Table S2: Sequence conservation between catalytic domains of the bifunctional and monofunctional enzymes by species (% of identity), Table S3: External groups used for phylogenetic reconstruction of the evolutionary history of the functional domains.