Discovery of the Gene Encoding a Novel Small Serum Protein (SSP) of Protobothrops flavoviridis and the Evolution of SSPs

Small serum proteins (SSPs) are low-molecular-weight proteins in snake serum with affinities for various venom proteins. Five SSPs, PfSSP-1 through PfSSP-5, have been reported in Protobothrops flavoviridis (“habu”, Pf) serum so far. Recently, we reported that the five genes encoding these PfSSPs are arranged in tandem on a single chromosome. However, the physiological functions and evolutionary origins of the five SSPs remain poorly understood. In a detailed analysis of the habu draft genome, we found a gene encoding a novel SSP, SSP-6. Structural analysis of the genes encoding SSPs and their genomic arrangement revealed the following: (1) SSP-6 forms a third SSP subgroup; (2) SSP-5 and SSP-6 were present in all snake genomes before the divergence of non-venomous and venomous snakes, while SSP-4 was acquired only by venomous snakes; (3) the composition of paralogous SSP genes in snake genomes seems to reflect snake habitat differences; and (4) the evolutionary emergence of SSP genes is probably related to the physiological functions of SSPs, with an initial snake repertoire of SSP-6 and SSP-5. SSP-4 and its derivative, SSP-3, as well as SSP-1 and SSP-2, appear to be venom-related and were acquired later.


Introduction
The bites of viperid snakes, including Protobothrops flavoviridis (Pf), cause a variety of symptoms, including bleeding, necrosis, edema, and neurotoxicity, and can be fatal in severe cases. Recent transcriptomic and proteomic studies have identified multiple components of viperid venoms [1][2][3], including phospholipases A2 [4][5][6][7], metalloproteases (snake venom metalloproteases, SVMPs) [8][9][10][11], and serine proteases [12,13]. Many of these venom proteins have isoforms. In contrast to neurotoxic to the nucleotide sequence of PfSSP-5. The amino acid sequence of its N-terminal domain differs from those of the five known PfSSPs, whereas its C-terminal domain is very similar. Ten cysteines are conserved among PfSSPs. Therefore, this nucleotide sequence was determined to encode a novel type of SSP, which was named PfSSP-6. To sequence PfSSP-6, genomic PCR was performed using the draft nucleotide sequence of the corresponding region in Scaffold 2858 of the Amami Island P. flavoviridis genome as a reference. A 1453 bp genome segment spanning the 5′ terminus of the putative first exon to the 3′ terminus of the putative third exon of PfSSP-6, and another 2375 bp segment spanning the 5′ terminus of the putative third exon to 85 bp downstream of the putative fourth exon of PfSSP-6, were then acquired. Finally, the 3642 bp sequence of PfSSP-6 was determined. By reference to the structure of PfSSP-5, definitive exon-intron boundaries of PfSSP-6 were identified. PfSSP-6 consists of four exons and three introns and encodes a 111 amino acid protein, including a 19 amino acid signal peptide. The deduced amino acid sequence of the mature protein encoded by PfSSP-6 shows 33%-61% identity with the other five PfSSPs, and the positions of its 10 cysteine residues are conserved (Figure 1).
Referring to the draft nucleotide sequence of Scaffold 2858 encompassing PfSSP-6 to PfSSP-4, which is the 5′-terminal gene of the array of five PfSSPs [39], further genomic PCR with the Amami-Oshima P. flavoviridis genome was performed, and the 12,406 bp sequence of the intergenic region between PfSSP-6 and PfSSP-4, named PfI-Reg64, was determined (Figure 2).
Toxins 2020, 12, 177

Sequence Configurations Classify the Six PfSSPs Into Three Subgroups
Introns of PfSSP-6 contained insertions of specific nucleotide sequences: fragments of L1, chicken repeat-1 (CR1), and Gypsy LINEs, fragments of a reverse transcriptase (RT) domain of L2 LINE, fragments of Mariner and hobo-Ac-Tam3 (hAT) DNA transposons, and repetitive sequences, as in the other five PfSSPs (Figure 3). First, five inserted fragments (L1 and CR1 LINEs in the first intron, and Mariner-iii, Gypsy-i, and Gypsy-ii in the third intron) are conserved in all PfSSPs. These insertions must therefore have occurred before the divergence of the six PfSSPs. Second, configurations of the nucleotide sequences inserted into the second or third intron classified the six PfSSPs into two subgroups, Long SSPs and Short SSPs [39]. Long PfSSPs are characterized by the fragment of the RT domain of L2 LINE in the third intron. However, the nucleotide sequence of this fragment in the third intron of PfSSP-6 differs from those inserted into the three genes of conventional Long SSPs, PfSSP-1, PfSSP-2, and PfSSP-5. The fragment of L2 LINE in the third intron of PfSSP-1, PfSSP-2, and PfSSP-5 is truncated in the 3′ terminal region (Figure 1). In contrast, the fragment of L2 LINE in the third intron of PfSSP-6 is truncated from the 5′ terminal region, as in typical LINEs [42].
L2 LINE is composed of two open reading frames, ORF1 and ORF2, in which ORF1 encodes an RNA-binding protein and ORF2 encodes a two-domain protein consisting of an endonuclease (EN) and an RT domain [42]. The RT domain of L2 LINE consists of 10 subdomains numbered from zero to IX and a carboxy-terminal conserved region (CTCR), which is thought to serve as the scaffold for reverse transcription of L2 LINE [43]. A 320 bp section of the L2 LINE fragment in PfSSP-1 encodes three subdomains, zero to II, of the RT domain. A 431 bp section of that in PfSSP-2 encodes four subdomains, zero to III, and 1011 bp of the L2 LINE fragment in PfSSP-5 encodes nine subdomains, zero to VIII [39]. However, 1240 bp of the L2 LINE fragment in PfSSP-6 encodes eight subdomains, III to X, and the CTCR of the RT domain. This indicates that this L2 LINE is truncated in the 5′ terminal region. It is highly likely that the nucleotide sequence from the 3′ downstream region of the third exon of PfSSP-6 to the 5′ terminus of the inserted L2 LINE fragment has been lost, accompanied by 5′ truncation of L2 LINE.
These characteristics indicate that PfSSP-6 should be classified as a novel Long SSP. Interestingly, body map analysis using semi-quantitative RT-PCR showed that PfSSP-6 is strongly expressed in the stomach and weakly in the liver (data not shown). This suggests that the product of PfSSP-6 may not function in blood. Thus, the three configurations of inserted nucleotide sequences classify the six PfSSPs into three subgroups: conventional Long SSPs, novel Long SSPs, and Short SSPs.
Configurations of SSP paralogs in each snake genome were used to classify the eight snakes into three groups. Non-venomous P. bivittatus formed the first group, in which two genes encoding SSP-6 and SSP-5 were present in the genome. T. sirtalis, V. berus, C. viridis, and O. hannah formed a second group in which three genes encoding SSP-6, SSP-5, and SSP-4 were present. D. acutus, P. mucrosquamatus, and P. flavoviridis formed a third group with two genes encoding SSP-1 and SSP-2, in addition to the initial three genes, SSP-6, SSP-5, and SSP-4. This result suggests that the configuration of SSP paralogs is related to the habitat characteristics of each snake. Snakes in the third group, D. acutus, P. mucrosquamatus, and P. flavoviridis, inhabit the Orient, where the warm and humid climate might provide richer and more diversified prey than in Europe and America. It is likely that their venom proteins have become varied, and that the serum proteins that neutralize those venoms then also diversified. O. hannah did not need to develop novel varieties of IIA-PLA2 isozymes; it had another type of venom PLA2, the neurotoxic IA-PLA2, a lethal component. Therefore, OhSSP-5β and OhSSP-5γ, corresponding to SSP-1 and/or SSP-2, may have had no need to diverge as counterparts of variable IIA-PLA2s.
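The grouping described above can be reproduced by collecting species that share the same set of SSP paralogs. The following is a minimal sketch, not the authors' procedure: the dictionary simply transcribes the paralog sets named in this paragraph (SSP-3 is omitted because the paragraph does not list it), and the function name `group_by_repertoire` is a hypothetical helper introduced for illustration.

```python
from collections import defaultdict

# Paralog sets per species, transcribed from the paragraph above.
# SSP-3 is left out because the grouping in the text does not mention it.
PARALOGS = {
    "P. bivittatus": {"SSP-6", "SSP-5"},
    "T. sirtalis": {"SSP-6", "SSP-5", "SSP-4"},
    "V. berus": {"SSP-6", "SSP-5", "SSP-4"},
    "C. viridis": {"SSP-6", "SSP-5", "SSP-4"},
    "O. hannah": {"SSP-6", "SSP-5", "SSP-4"},
    "D. acutus": {"SSP-6", "SSP-5", "SSP-4", "SSP-1", "SSP-2"},
    "P. mucrosquamatus": {"SSP-6", "SSP-5", "SSP-4", "SSP-1", "SSP-2"},
    "P. flavoviridis": {"SSP-6", "SSP-5", "SSP-4", "SSP-1", "SSP-2"},
}


def group_by_repertoire(paralogs):
    """Group species that carry exactly the same set of SSP paralogs."""
    groups = defaultdict(list)
    for species, genes in paralogs.items():
        groups[frozenset(genes)].append(species)
    return dict(groups)


groups = group_by_repertoire(PARALOGS)
# Print groups from the smallest repertoire to the largest.
for genes, species in sorted(groups.items(), key=lambda kv: len(kv[0])):
    print(sorted(genes), "->", ", ".join(species))
```

Using a `frozenset` as the key makes the grouping independent of the order in which paralogs are listed for each species.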

Diversified SSPs Acquired by Advanced Snakes Have More Complex Venom Compositions
Genes encoding SSP paralogs of each snake were analyzed mathematically. The KA/KS ratio, which is the ratio of the nonsynonymous (KA) to the synonymous (KS) substitution rate between ORFs (Tables 1-8), and KN, which is the rate of nucleotide substitution between introns (Tables 9-13), were calculated. However, these values were not calculated for genes whose full-length exon or intron sequences remained unknown: KA/KS for DaSSP-5δ(Ψ), PbSSP-5γ, and VbSSP-4, and KN for PbSSPs, TsSSPs, and VbSSPs. In the previous section, we proposed that the initial repertoire of SSPs in the genomes of venomous snakes comprised SSP-6, SSP-5, and SSP-4. Our mathematical analysis suggested that these three differ in their characteristics. In every snake, the KA/KS ratio estimated between the ORFs of SSP-6 and SSP-5 was the lowest, or at least considerably lower than the KA/KS ratios estimated between other paralogs. On the other hand, the KA/KS ratios estimated between SSP-6 and SSP-4 and between SSP-5 and SSP-4 were close to one. Our interpretation of these results is as follows. SSP-6 and SSP-5 are not involved in neutralizing venom proteins and instead have constitutive or essential roles, such as digestion or blood homeostasis. Therefore, the nucleotide sequences of SSP-6 and SSP-5 have been conserved. On the other hand, SSP-4, acquired in the genomes of venomous snakes, may have encoded the first SSP with a role specific to venom neutralization in the event of accidental bites. Therefore, SSP-4 may have had to be more plastic than SSP-5 and SSP-6.
Table 4. KA/KS ratios estimated between the ORFs of P. bivittatus SSPs.

          CvSSP-4   CvSSP-5   CvSSP-6
CvSSP-4             0.319     0.358
CvSSP-5                       0.372
CvSSP-6

The KA/KS ratios estimated between DaSSP-1 and DaSSP-2, DaSSP-3 and DaSSP-4, PmSSP-1 and PmSSP-2, PmSSP-3 and PmSSP-4, PfSSP-1 and PfSSP-2, and PfSSP-3 and PfSSP-4 were 1.61, 1.77, 1.49, 1.35, 1.80, and 1.42, respectively (Tables 2, 5, and 6), and the KN values were 0.0227, 0.005, 0.154, 0.0397, 0.0317, and 0.0283, respectively (Tables 10, 12, and 13). These results showed that the branching of these genes, especially late genes such as SSP-1, SSP-2, or SSP-3, occurred in an accelerated manner, and that the time that passed after their divergence was very short. In addition, the KA/KS ratios estimated between SSP-1 and SSP-5 or SSP-2 and SSP-5 of D. acutus, P. mucrosquamatus, and P. flavoviridis were around 0.7. This result also supports the idea that SSP-1 or SSP-2 and SSP-5 are evolutionarily related, as suggested above. That is, SSP-1 and SSP-2 were recently derived from SSP-5 and then diversified in an accelerated manner to accommodate venom proteins. SSP-3, the truncated SSP acquired as the successor to SSP-4, is also thought to bind more venom proteins than SSP-4, as do SSP-1 and SSP-2 relative to SSP-5. Therefore, the KA/KS ratios estimated between SSP-3 and SSP-4 also show considerably higher values. The many reports that venom proteins bind to SSP-1 [35], SSP-2 [33,37], or SSP-3 [36] also support the above idea. Since animal venoms work as defense mechanisms, as tools to catch prey, or simply to enhance digestion, they should be sensitive to the surrounding environment. Because venom proteins have become more diversified in environments where there are more diverse prey, serum proteins required to neutralize venom activities, such as SSP-1, SSP-2, and SSP-3, have also diversified in an accelerated manner. Even among conventional Long SSPs, SSP-1, SSP-2, and SSP-5 have evolved in an accelerated or neutral manner, depending on whether they deal with venom components. On the other hand, SSP-3 and SSP-4, which specifically arose as anti-venom proteins, have evolved in an accelerated manner.
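To make the comparison concrete, the six KA/KS ratios and KN values quoted above can be tabulated and labeled relative to one. This is only an illustrative sketch: the values are copied from the paragraph above, while the function `regime` and its tolerance band of 0.1 around one are arbitrary assumptions introduced here, not thresholds used in the study.

```python
# KA/KS ratios and KN values for the six paralog pairs, as quoted in the text.
KA_KS = {
    ("DaSSP-1", "DaSSP-2"): 1.61,
    ("DaSSP-3", "DaSSP-4"): 1.77,
    ("PmSSP-1", "PmSSP-2"): 1.49,
    ("PmSSP-3", "PmSSP-4"): 1.35,
    ("PfSSP-1", "PfSSP-2"): 1.80,
    ("PfSSP-3", "PfSSP-4"): 1.42,
}
K_N = {
    ("DaSSP-1", "DaSSP-2"): 0.0227,
    ("DaSSP-3", "DaSSP-4"): 0.005,
    ("PmSSP-1", "PmSSP-2"): 0.154,
    ("PmSSP-3", "PmSSP-4"): 0.0397,
    ("PfSSP-1", "PfSSP-2"): 0.0317,
    ("PfSSP-3", "PfSSP-4"): 0.0283,
}


def regime(ratio, tolerance=0.1):
    """Label a KA/KS ratio relative to 1 (the tolerance is an illustrative choice)."""
    if ratio > 1 + tolerance:
        return "above 1"
    if ratio < 1 - tolerance:
        return "below 1"
    return "near 1"


for pair, ratio in KA_KS.items():
    print(f"{pair[0]} vs {pair[1]}: KA/KS = {ratio:.2f} ({regime(ratio)}), KN = {K_N[pair]}")
```

All six pairs fall in the "above 1" class, whereas a ratio of about 0.7, as reported for SSP-1 or SSP-2 against SSP-5, would be labeled "below 1".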
Chijiwa et al. proposed that most nucleotide substitutions at non-synonymous sites occur only immediately after gene duplication. Then random mutations accumulate over time, and selective pressure that leaves "neutral" mutations at synonymous sites erases the traces of accelerated evolution [39]. However, in the genomes of the viperids P. flavoviridis and P. mucrosquamatus, inversion of the genome segment encompassing SSP-1 to SSP-2 occurred, and subsequent accumulation of random mutations was suppressed [39]. These findings are also applicable to D. acutus DaSSP-1 and DaSSP-2. Since the SSP-3 allele is located in the 3′ region downstream of the inverted genome segment containing SSP-1 and SSP-2, the inversion may also have suppressed accumulation of random mutations in PfSSP-3.

Materials
P. flavoviridis specimens were provided by the Institute of Medical Sciences of the University of Tokyo. The tail of an O. hannah specimen was provided by the Japan Snake Center. That of a P. mucrosquamatus was provided by the Medical Institute of Bioregulation, at the Research Center of Genetic Information, Kyushu University. High-molecular-weight genomic DNA was prepared from livers or tails of the snakes according to the method of Blin and Stafford [44]. Total RNA was prepared from various snake organs, according to the ISOGEN protocol (Nippon Gene, Toyama, Japan). Restriction endonucleases and KOD plus DNA polymerase were purchased from Nippon Gene and TOYOBO (Osaka, Japan), respectively. Other reagents and antibiotics were from Nacalai Tesque (Kyoto, Japan) and TAKARA BIO (Shiga, Japan). Specific oligonucleotide primers were synthesized by GENNET (Fukuoka, Japan).

Cloning and Sequencing of the Genome Segment Containing PfSSP-6
A dedicated database, HabAm1 [40], was constructed to carry out blastn and tblastx analysis with the nucleotide sequences of PfSSPs (PfSSP-1 to PfSSP-5) as queries. Exon-intron boundaries were then determined based on the five PfSSPs. Referring to the nucleotide sequence of Scaffold 2858, the sense primer SSP6-5UTR-1, 5′-ggC gTC CCT CCT TCT CCT Tg-3′, which anneals specifically to the first exon of PfSSP-6, and the antisense primer SSP6ex3-2, 5′-CTC gCA TTC CAT ACA ATT ggC Tg-3′, which anneals specifically to the third exon of PfSSP-6, were used to amplify the 1453 bp genome fragment (Table 14). The sense primer SSP6ex3-1, 5′-TgT ggC CAA CCA AAT gCg Tgg-3′, which anneals specifically to the third exon of PfSSP-6, and the antisense primer SSP6-3flank-1, 5′-CAg CTA TgC ATg CCT TAT ATC AC-3′, which anneals specifically to 85 bp 3′ downstream of the fourth exon of PfSSP-6, were then used to amplify the 2363 bp genome fragment (Table 14). Amplified genome fragments were ligated into the pCR™-Blunt II-TOPO® vector (Life Technologies, Carlsbad, CA, USA) and transformed into DH5α competent cells (TAKARA BIO, Shiga, Japan). Nucleotide sequences were determined using an ABI 3130xl capillary sequencer. The 1453 bp PCR fragment overlapped with the 2363 bp PCR fragment by 89 bp. The physical structure of the 3727 bp segment encompassing the first exon of PfSSP-6 to 85 bp 3′ downstream of the fourth exon of PfSSP-6 was determined. This 3727 bp DNA fragment contained four exons encoding PfSSP-6.
Moreover, to acquire the nucleotide sequence of the intergenic region between PfSSP-6 and PfSSP-4, named PfI-Reg64, genomic PCR was carried out on the Amami-Oshima P. flavoviridis genome. The sense primer Ireg64-1, 5′-CTC CAT gCA AAg gAg gAT TTC C-3′, which anneals to the 3′ terminus of the third intron of PfSSP-6, and the antisense primer Ireg64-6, 5′-TAg gCC TTg ACA CAT gAT ggC-3′, which anneals to the middle portion of PfI-Reg64, were used to amplify the 7717 bp genome fragment, named PfIREG64-I (Table 14). The PfIREG64-I fragment was also cloned and sequenced. The 7717 bp PfIREG64-I overlapped with the 3727 bp PfSSP-6 by 474 bp. The sense primer Ireg64-5, 5′-CAT TgT TgA gCA ACC CTT ggC-3′, which anneals 2501 bp 5′ upstream of Ireg64-6, and the antisense primer Ireg64-8, 5′-ggA CTA TTA AgC AgT ggA ATg gC-3′, which anneals 2340 bp 5′ upstream of the first exon of PfSSP-4 (3′ terminus of PfI-Reg64), were then used to amplify the 5283 bp genome fragment, named PfIREG64-II (Table 14). The PfIREG64-II fragment was also cloned and sequenced. The 5283 bp PfIREG64-II overlapped with the 7717 bp PfIREG64-I by 2523 bp. The sense primer Ireg64-9, 5′-ggC CCT CTT CCA Agg ACA AgC-3′, which anneals 455 bp 5′ upstream of Ireg64-8, and the antisense primer Ireg64-10, 5′-ACC TCg TTC CTC CAg CCA CT-3′, which anneals to the 5′ terminus of the first intron of PfSSP-4, were then used to amplify the 2971 bp genome fragment, named PfIREG64-III (Table 14). The PfIREG64-III fragment was also cloned and sequenced. The 2971 bp PfIREG64-III overlapped with the 5267 bp PfIREG64-II by 455 bp. Finally, the physical structure of the 16,248 bp segment encompassing the third intron of PfSSP-6 to the first intron of PfSSP-4 was completely established. The nucleotide sequences of PfSSP-6 and the genome segment from PfSSP-6 to PfSSP-4 are available from the GenBank/EMBL/DDBJ databases under Accession No. LC518073.
Table 14. Primers used to acquire the nucleotide sequences from the genome domain encompassing PfSSP-6 to PfSSP-4. The symbols (f) or (r) after the position numbers indicate the directions of the primers. Forward or reverse denotes whether the direction of elongation was the same as or opposite to that of transcription. Nucleotide positions refer to nucleotide sequences reported in this study (LC518073).
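The length of a segment assembled from overlapping PCR fragments is the sum of the fragment lengths minus the overlaps between neighbors. The short sketch below checks this for the two PfSSP-6 fragments described above (1453 bp and 2363 bp, overlapping by 89 bp); the function name `assembled_length` is a hypothetical helper, not part of any tool used in the study.

```python
def assembled_length(fragment_lengths, overlaps):
    """Length of a contig built from consecutive fragments.

    fragment_lengths: lengths of the fragments in 5'-to-3' order.
    overlaps: overlap (bp) between each fragment and the next one.
    """
    if len(overlaps) != len(fragment_lengths) - 1:
        raise ValueError("need exactly one overlap per adjacent fragment pair")
    return sum(fragment_lengths) - sum(overlaps)


# PfSSP-6: 1453 bp and 2363 bp fragments overlapping by 89 bp.
pfssp6_segment = assembled_length([1453, 2363], [89])
print(pfssp6_segment)  # 3727
```

The result matches the 3727 bp segment reported for the region from the first exon of PfSSP-6 to 85 bp downstream of its fourth exon.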

RepeatMasker Analysis of the Nucleotide Sequence of PfSSP-6
A dedicated database was constructed with repetitive sequences of the genomes of various organisms collected from Repbase of the Genetic Information Research Institute [45]. RepeatMasker was then run on the nucleotide sequence of PfSSP-6 against this database, using BLAST+, RMBlast (NCBI), and Tandem Repeats Finder (Boston University) [46].

Determining the Nucleotide Sequences of SSP Paralogs of P. mucrosquamatus and O. hannah
To acquire complete nucleotide sequences of PmSSP-3, PmSSP-4, OhSSP-1, OhSSP-2, OhSSP-5, and OhSSP-6, genomic PCR was performed on the O. hannah and P. mucrosquamatus genomes to amplify two overlapping nucleotide segments separately. These included the 5′ segment of the gene encompassing the first exon to the second exon, and the 3′ segment of the gene encompassing the second exon to the fourth exon of each gene.
For PmSSP-3 (Pm Scaffold 462), the sense primer PmSSP34-5UTR, 5′-CAA ggg TTg gTC TTg gTT TTT g-3′, which anneals to the 5′ terminus of the first exon of PmSSP-3 and PmSSP-4, and the antisense primer PmSSP3ex2-R, 5′-ggT AgA gAA AAg CCC CCA AAg-3′, which anneals to the second exon of PmSSP-3, were used to amplify the 1169 bp 5′ segment of PmSSP-3 (Table 15). The sense primer PmSSP3-F, 5′-TgC TTT ggg ggC TTT TCT C-3′, which anneals to the middle portion of the second exon of PmSSP-3, and the antisense primer PmSSP34-R, 5′-CTT gAC TgA GAC TgA AgT TCC-3′, which anneals to the 311 bp 3′ region downstream of the fourth exon of PmSSP-3 and PmSSP-4, were then used to amplify the 2722 bp 3′ segment of PmSSP-3 (Table 15). The 5′ segment of PmSSP-3 overlapped with the 3′ segment of PmSSP-3 by 31 bp. The physical structure of the 3860 bp segment encompassing the first exon of PmSSP-3 to the 311 bp 3′ region downstream of the fourth exon of PmSSP-3 was completed. With regard to PmSSP-4, the sense primer PmSSP34-5UTR, described above, and the antisense primer PmSSP4ex2-R, 5′-CgT TTC Agg TAA Agg AAT ACT C-3′, which anneals to the second exon of PmSSP-4 based on the nucleotide sequence of Pm Scaffold 21,362, were used to amplify the 1139 bp 5′ portion of PmSSP-4 (Table 15). Using Pm Scaffold 21,362, the sense primer PmSSP4-F, 5′-gAg TAT TCC TTT ACC TgA AAC g-3′, which anneals to the middle portion of the second exon of PmSSP-4, and the antisense primer PmSSP34-R, described above, were then used to amplify the 2999 bp 3′ segment of PmSSP-4. The 5′ segment of PmSSP-4 overlapped with the 3′ segment of PmSSP-4 by 22 bp (Table 15). The physical structure of the 4118 bp segment encompassing the first exon of PmSSP-4 to the 311 bp 3′ region downstream of the fourth exon of PmSSP-4 was sequenced.
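The 22 bp overlap between the two PmSSP-4 segments can be understood from the primers themselves: the antisense primer PmSSP4ex2-R and the sense primer PmSSP4-F anneal to the same 22-nucleotide site on opposite strands. The sketch below checks this by reverse-complementing the primer sequences quoted above; the helper names `normalize` and `reverse_complement` are hypothetical and were introduced only for this illustration.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def normalize(primer):
    """Strip triplet spacing and the lowercase-g convention used in the text."""
    return primer.replace(" ", "").upper()


def reverse_complement(primer):
    """Return the reverse complement of a primer, read 5' to 3'."""
    return normalize(primer).translate(_COMPLEMENT)[::-1]


PMSSP4_F = "gAg TAT TCC TTT ACC TgA AAC g"      # sense primer PmSSP4-F
PMSSP4EX2_R = "CgT TTC Agg TAA Agg AAT ACT C"   # antisense primer PmSSP4ex2-R

same_site = reverse_complement(PMSSP4EX2_R) == normalize(PMSSP4_F)
print(same_site, len(normalize(PMSSP4_F)))  # True 22
```

Because the two primers cover exactly the same 22 nucleotides, the 5′ and 3′ amplicons share only the primer site itself, which agrees with the stated overlap.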

Mathematical Analysis
Alignment of the amino acid sequences of snake SSPs was performed using ClustalX software. Nucleotide sequences of ORFs encoding the mature SSPs were rearranged and gaps in the aligned amino acid sequences were removed using PAL2NAL. The rates of synonymous (KS) and nonsynonymous (KA) substitutions per site between the ORFs of the genes were calculated using the Nei-Gojobori method, as implemented in PAML [55]. After removing LINEs, DNA transposons, and indels (insertions/deletions) from the introns, alignment of introns was performed using ClustalX. Values of KN, which estimate the rates of substituted nucleotides between the introns of SSPs, were calculated from the aligned sequence data.
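As a rough illustration of the quantities named above, the final step of the Nei-Gojobori method converts the proportions of synonymous and nonsynonymous differences into KS and KA with the Jukes-Cantor correction, and an intron distance can be estimated as the proportion of differing aligned positions. The sketch below is not the PAML implementation: it assumes the site and difference counts are already available, all numbers in the usage lines are hypothetical, and treating KN as an uncorrected proportion of differences is an assumption, since the exact formula is not stated here.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for a proportion of differences p."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion of differences must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def ka_ks(syn_diffs, syn_sites, nonsyn_diffs, nonsyn_sites):
    """Return (KA, KS, KA/KS) from counts of sites and observed differences."""
    ks = jukes_cantor(syn_diffs / syn_sites)
    ka = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    if ks == 0:
        raise ValueError("KS is zero; the KA/KS ratio is undefined")
    return ka, ks, ka / ks


def p_distance(seq_a, seq_b):
    """Proportion of differing positions between two aligned sequences.

    Columns containing a gap ('-') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = diffs = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        diffs += a != b
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return diffs / compared


# Hypothetical counts: 10 differences over 100 synonymous sites,
# 40 differences over 250 nonsynonymous sites.
ka, ks, ratio = ka_ks(10, 100, 40, 250)
print(round(ka, 4), round(ks, 4), round(ratio, 2))

# Hypothetical aligned intron fragments.
print(p_distance("ACGT-A", "ACCTTA"))
```

A ratio above one from `ka_ks` corresponds to an excess of nonsynonymous change, the pattern described in the text as accelerated evolution.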