Horizontal Transfer and Evolutionary Profiles of Two Tc1/DD34E Transposons (ZB and SB) in Vertebrates

Both ZeBrafish (ZB), a recently identified DNA transposon in the zebrafish genome, and SB, a reconstructed transposon originally discovered in several fish species, are known to exhibit high transposition activity in vertebrate cells. Although ZB and SB transposons share a similar structural organization, the evolutionary profiles of their homologs in various species remain unknown. In the present study, we compared their taxonomic ranges, structural arrangements, sequence identities, evolution dynamics, and horizontal transfer occurrences in vertebrates. In total, 629 ZB and 366 SB homologs were obtained and classified into four distinct clades, named ZB, ZB-like, SB, and SB-like. They displayed narrow taxonomic distributions in eukaryotes and were mostly found in vertebrates; Actinopterygii in particular tended to be the major reservoir hosts of these transposons. Similar structural features and high sequence identities were observed for the transposons and transposases homologous to the SB and ZB elements. The genomic sequences that flank the ZB and SB transposons revealed highly conserved integration profiles with strong preferential integration into AT repeats. Both SB and ZB transposons experienced horizontal transfer (HT) events, which were most common in Actinopterygii. Our current study increases our understanding of the evolutionary properties and histories of SB and ZB transposon families in animals.


Introduction
Class II transposable elements are DNA segments (jumping genes) that can mobilize and integrate into the genome by mechanisms involving a DNA intermediate [1]. Transposons can form a substantial fraction of vertebrate genomes (4-60%) [2], and can have considerable impact on genome function through their ability to move and reorganize the DNA. Most DNA transposons can be classified into families of autonomous (encoding a functional transposase) and nonautonomous (lacking a functional transposase) elements, characterized by their ability to respond to (be mobilized by) the same transposase. Transposons belonging to the same family typically share several identical nucleotides in their termini [3]. Similarly, superfamilies can be identified by amino acid sequence analysis of the transposase genes, in both eukaryote and prokaryote transposons [4].
The transposition mechanism of a widespread [5] and extensively characterized "cut and paste" transposon superfamily, Tc1/mariner, is based on the excision (cut) and reinsertion (paste) of fragments into a new location in the genome. As an outcome of transposon integration, the element generates a target site duplication [6]. Tc1/mariner superfamily transposons are about 1.6 kb in length and are characterized by terminal inverted repeats (TIRs) 17-300 bp in length that flank a coding sequence for a transposase

Transposon Searching
The taxonomic distribution of ZB and SB transposons was determined via TBLASTN (v. 2.12.0) [21] searches at the National Center for Biotechnology Information (NCBI) against the accessible assembled eukaryote genomes (including Contig and Scaffold assemblies). The queries were the full-length transposase protein sequences of the 340 aa SB100X [22] and the 341 aa ZB [10], with default algorithm parameters and an E-value of 1 × 10⁻⁴. The obtained transposases were then used as queries to identify other homologous elements in succession. Finally, all mined transposases were submitted for phylogenetic analysis, and only sequences belonging to the ZB and SB clades were used for further analysis. The best hits of ZB and SB elements (E-value of 1 × 10⁻⁴) were extracted with 2 kb flanking sequences in each genome, and their boundaries in each genome were defined by alignment using the ClustalW program in the BioEdit tool (v. 7.2.0) [23] and were checked manually. The representative sequence (<10 copies, hard to derive consensus sequence) or consensus sequence (>10 copies) in each genome was submitted to BLASTN against each host genome to estimate copy numbers. BLAST hits with more than 40% coverage of, and 90% identity to, the queries (with a default E-value) were used to calculate copy numbers for each element to avoid overlapping hits between subfamilies.
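The copy-number filter described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Hit` record, the function name `count_copies`, and the choice to treat both thresholds as strict ("more than") are assumptions made for this example, and coverage is taken to mean the aligned fraction of the query length.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One BLASTN hit (hypothetical record; field names are illustrative)."""
    query_start: int         # 1-based, inclusive position on the query
    query_end: int           # 1-based, inclusive position on the query
    percent_identity: float  # identity of the aligned region, in percent


def count_copies(hits, query_length, min_coverage=0.40, min_identity=90.0):
    """Count hits exceeding both the coverage and the identity threshold.

    Assumption: "more than 40% coverage and 90% identity" is read as two
    strict inequalities, with coverage = aligned query span / query length.
    """
    kept = 0
    for hit in hits:
        aligned = abs(hit.query_end - hit.query_start) + 1
        coverage = aligned / query_length
        if coverage > min_coverage and hit.percent_identity > min_identity:
            kept += 1
    return kept
```

For a 1600 bp query, a full-length hit at 95% identity would be counted, whereas a 600 bp hit (37.5% coverage) or a full-length hit at 85% identity would not.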

Phylogenetic Analysis
The coding sequences (CDS) of the identified SB and ZB elements were aligned with the CDS of 29 known DNA transposases representing seven families (DD34E/Tc1, DD36E/IC, DD37E/TRT, DD41D/VS, DD39D/GT, DD34D/mariner, and DD37D/maT) of the Tc1/mariner transposon superfamily by MAFFT v. 7.310 [24]. Then, the alignment was submitted to the IQ-tree program (v. 1.6.12) [25] to determine their evolutionary relationships by using the maximum likelihood method, with 1000 ultrafast bootstrap replicates. DD37D, which forms a clade with DD41D and DD39D that is distinct from DD34E/Tc1 [12], was used as the outgroup. The best-suited amino acid substitution model was selected by ModelFinder embedded in the IQ-tree program (v. 1.6.12) [26].

HT Analysis
The pairwise distances of the host genes and of the transposons were compared to detect horizontal transfer events of ZB and SB transposons. Two globally conserved ribosomal protein genes (RPL3 and RPL4), which have been successfully applied to evaluate the HT events of hAT and Tigger transposons [31,32], were selected as the host genes. The pairwise distances between transposase-coding sequences (ZB and SB) and between host gene-coding sequences (RPL3 and RPL4) were calculated to detect possible HT events of transposons. Transposons with a sequence identity of less than 70% between pairwise species were excluded from the HT analysis. To reduce false-positive estimation of HT events, an HT event was recognized only when the genetic distance of the transposons between two species was 1.2 times smaller than the distances of both host genes (RPL3 and RPL4).
All accessible gene annotations (CDS) for host genes (RPL3 and RPL4) of species involved in the putative HT events of ZB and SB were retrieved from the NCBI database. For those species whose host genes were not annotated, the CDS of these genes were searched against the NCBI genome database via TBLASTN and manually annotated using GenScan (http://hollywood.mit.edu/GENSCAN.html, accessed on 16 March 2022). The multiple sequence alignments of the host-gene CDS and transposase CDS were built by the MAFFT program (v. 7.310) [25] and subsequently submitted to MEGA software (v. 7.0.26) to calculate the genetic distances of the host genes and transposons (pairwise deletion and maximum composite likelihood) [33]. The genetic distances of host genes and transposons in each species are listed in Supplementary Tables S5 and S6, and the alignment files were deposited as Supplementary Files (Supplementary Files S2-S5). The results were summarized using GraphPad Prism v. 8.0.1.244.
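The decision rule for calling an HT event can be illustrated with a short sketch. The function name `is_putative_ht` and its parameters are hypothetical, and the code assumes that "1.2 times smaller" means the transposon distance multiplied by 1.2 must still be below each host-gene distance; the distances themselves are taken as precomputed (for example, exported from MEGA).

```python
def is_putative_ht(te_distance, rpl3_distance, rpl4_distance,
                   te_identity, min_identity=70.0, ratio=1.2):
    """Apply the two filters described in the text to one species pair.

    Assumptions (not stated explicitly in the source):
    - te_identity is the pairwise transposon sequence identity in percent;
    - the distance criterion is te_distance * ratio < host distance,
      and it must hold for BOTH host genes.
    """
    # Pairs with transposon identity below 70% are excluded from the analysis.
    if te_identity < min_identity:
        return False
    threshold = te_distance * ratio
    return threshold < rpl3_distance and threshold < rpl4_distance
```

Requiring the criterion to hold for both RPL3 and RPL4 makes the call more conservative than testing either gene alone, which matches the stated aim of reducing false positives.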

Phylogeny and Sequence Analysis of ZB and SB Transposons
Overall, 629 sequences homologous to ZB and 366 sequences homologous to SB were obtained and submitted for phylogenetic analysis. The phylogenetic tree showed that all identified SB and ZB homologous elements belonged to the clade of DD34E/Tc1, and they formed four distinct branches with strong bootstrap support (100). The branches harboring the 341 aa ZB and 340 aa SB100X reference sequences were named ZB and SB, respectively, and their close sibling branches were named ZB-like and SB-like, respectively (Figure 1 and Supplementary Figure S1, Tables S1 and S2 and Supplementary File S1). Overall, ZB elements from 313 species, ZB-like elements from 316 species, SB elements from 108 species, and SB-like elements from 258 species were designated as ZB, ZB-like, SB, and SB-like transposons, respectively (Supplementary Table S1). Pairwise sequence comparison of SB and ZB transposons revealed that the CDS (DNA sequences) of ZB, ZB-like, SB, and SB-like transposases are highly conserved. Their internal sequence identities were higher than 80%: over 90% for ZB and SB-like, and over 80% for ZB-like and SB (Figure 2A). In contrast, the sequence identities between ZB and ZB-like, and between SB and SB-like, were 61%. The sequence identities of CDS between the SB and ZB groups ranged from 52% to 54% (Figure 2A). Furthermore, the DDE domains (protein sequences) tended to be more conserved than the DBD domains (protein sequences). The sequence identities of DDE domains between ZB and ZB-like and between SB and SB-like were 68%. In comparison, the sequence identities of DDE between the SB and ZB groups ranged from 58% to 61% (Figure 2B,C). Similarly, the sequence identities of DBD domains between ZB and ZB-like, and between SB and SB-like, were 54% and 55%, respectively, whereas the sequence identities of DBD between the SB and ZB groups ranged from 34% to 38% (Figure 2B).
In addition, the internal sequence identities of TIRs for ZB, ZB-like, SB, and SB-like were 78%, 68%, 65%, and 80%, respectively. The sequence identities of TIRs between ZB and ZB-like, and between SB and SB-like, were 39% and 33%, respectively. In contrast, the sequence identities of TIRs between the SB and ZB groups ranged from 20% to 30% (Figure 2D).
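To make the identity figures above concrete, the following sketch shows one common way to compute percent identity between two sequences taken from the same multiple alignment. The text does not state which tool or gap treatment was used, so the function `percent_identity` and its handling of gaps (columns with a gap in either sequence are skipped) are assumptions for illustration only.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap.

    Both inputs must be rows of the same alignment (equal length).
    Comparison is case-insensitive. Returns 0.0 if no column is comparable.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip gapped columns (assumed gap treatment)
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared
```

Averaging this value over all sequence pairs within a clade, or between two clades, would give intra-clade and inter-clade identities of the kind reported here.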

Taxonomic Distribution and Phylogenetic Analysis of ZB and SB Transposons
The ZB and SB homologous transposons display narrow taxonomic distributions in eukaryotes. They were only detected in animals, mostly in vertebrates, although a few lineages were also found in invertebrates. The four branches (ZB, ZB-like, SB, and SB-like) showed different taxonomic distributions (Figure 3A and Table 1). ZB was observed in 299 Actinopterygii species (32 orders) and 9 Anura species of vertebrates, in 4 Arthropoda species of invertebrates, and in only 1 Mollusca species (Euprymna scolopes). ZB-like is more widely distributed in vertebrates than the other elements: it was discovered in 271 Actinopterygii species (42 orders), 17 Anura species, 5 Agnatha species (2 orders), 19 Squamata species, 1 Sarcopterygii species, 2 Aves species, and 1 Chondrichthyes species. SB was mostly detected in Actinopterygii of vertebrates (107 species of 33 orders) and was seen in only one Mollusca species (Anentome helena). SB-like, in turn, invaded 255 Actinopterygii species (25 orders) and 1 Aves species of vertebrates, as well as 1 Echinodermata species (Lytechinus variegatus) and 1 Cnidaria species (Dendronephthya gigantea) of invertebrates (Figure 3A and Table 1). Furthermore, Actinopterygii is the major host of SB and ZB transposons, with 107 species for SB, 255 species for SB-like, 299 species for ZB, and 271 species for ZB-like detected in this lineage (Figure 3B). However, substantially different distribution patterns across the orders of Actinopterygii were observed for the four transposon branches. SB-like and ZB were widely distributed in the order Cichliformes, with 202 species and 193 species detected, accounting for 78% and 62% of the total detected species, respectively, whereas ZB-like and SB are more evenly distributed across the orders of Actinopterygii (Figure 3B). In addition, we found that some branches of SB, SB-like, ZB, and ZB-like co-exist in some species.
A total of 2 species (Siniperca knerii and Siniperca scherzeri) are co-invaded by all four branches, 3 to 24 species are co-invaded by three branches, and 16 to 159 species are co-invaded by two branches ( Figure 3C and Supplementary Table S4).

Structural Organization of ZB and SB
Generally, a similar structural organization was observed for ZB and SB transposons. The total lengths of intact SB and ZB transposons range from 1.3 kb to 3.0 kb, but most of them (96% of the total detected intact transposons, 336/350) are between 1.5 kb and 1.7 kb. They contain a single ORF (open reading frame) that encodes a transposase of about 340 aa, ranging from 302 aa to 411 aa, and are flanked by TIRs varying in length from 27 bp to 415 bp (Table 1, Figure 4A, and Supplementary Table S1). Overall, the structural features of SB and ZB transposons are similar to those observed for other Tc1/mariner members [10,14,35]. The CDS of ZB, ZB-like, SB, and SB-like transposases are highly conserved and display over 80% sequence identity (Figure 2A). In the intact SB and ZB transposases, several well-defined domains were identified, including the catalytic domain, GRPR-like motif, linker motif [K(V/T)PLLS], nuclear localization sequence (NLS), and DNA-binding domain (DBD), which contains six helices in the N terminus; these are indicated in Figure 4A,B. The interdomain linker (Figure 4B and Supplementary Figure S2) was identified as a conserved sequence stretch (KKPLLS) in the SB100X transposase [36]. The first and last two residues of the linkers varied across ZB, ZB-like, SB, and SB-like transposons, while the middle three residues (PLL) are highly conserved (Supplementary Figure S2).
Compared with other Tc1/mariner families, such as maT/DD37D, GT/DD39D, and IC/DD36E [12,13], most TIRs of SB and ZB elements (94% of the total detected intact transposons, 330/350) are relatively long, ranging from 180 bp to 300 bp. However, very long and very short TIRs are also observed in some species (Table 1 and Supplementary Table S1). The 3′ TIR of SB partially overlaps with the ORF region, which is not observed for the other three groups of transposons (Figure 4C). The end motifs (20 bp) of the TIRs are highly conserved across ZB, ZB-like, SB, and SB-like, and start with a GC-rich motif followed by an AT-rich region (Figure 5A). The genomic flanking sequence analysis revealed that the integration profiles of ZB, ZB-like, SB, and SB-like in genomes are highly conserved and display strongly preferential integration into AT repeats (Figure 5B).
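A terminal inverted repeat means that the 5′ terminus of an element matches the reverse complement of its 3′ terminus. The sketch below shows how such a pair could be scored; the function names and the ungapped, position-by-position comparison are simplifying assumptions for illustration (real TIRs are usually imperfect and are typically assessed from alignments), and the example sequence is a toy string, not a ZB or SB terminus.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def tir_identity(element, tir_length):
    """Percent identity between the 5' end and the reverse-complemented 3' end.

    Simplification: the two termini are compared without gaps over exactly
    `tir_length` positions.
    """
    if tir_length <= 0 or 2 * tir_length > len(element):
        raise ValueError("tir_length must be positive and fit twice in the element")
    left = element[:tir_length].upper()
    right = reverse_complement(element[-tir_length:]).upper()
    matches = sum(1 for x, y in zip(left, right) if x == y)
    return 100.0 * matches / tir_length


# Toy element: 8 bp terminus, 4 bp spacer, reverse complement of the terminus.
toy = "GCGCATTA" + "CCCC" + "TAATGCGC"
perfect = tir_identity(toy, 8)
```

Here `perfect` evaluates to 100.0 because the 3′ end of the toy element is the exact reverse complement of its 5′ end.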

Evolution Dynamics of ZB and SB Transposons
The genomic copy numbers of ZB, ZB-like, SB, and SB-like transposons vary significantly across species, ranging from one to several thousand (Supplementary Table S1). Overall, 142 (45.37%, 142/313), 205 (64.87%, 205/316), 54 (50%, 54/108), and 63 (24.42%, 63/258) genomes harbor complete copies (transposons flanked by detectable TSDs and TIRs) of ZB, ZB-like, SB, and SB-like transposons, respectively. Most genomes with high numbers (≥100) of full copies of ZB, ZB-like, SB, and SB-like belong to Actinopterygii. However, high numbers (≥100) of full copies of ZB-like were also detected in Agnatha, Sarcopterygii, Anura, Squamata, and Chondrichthyes (Table 2). Furthermore, intact copies (transposons flanked by detectable TSDs and TIRs and encoding transposases of ≥300 aa) were detected in many species of multiple animal lineages for all four groups of transposons, but with significant variations across groups and lineages, which supports the view that these transposons display recent and current activities in some lineages of animals, but with differential evolution dynamics (Table 2). In general, 105 (33.55%, 105/313), 183 (57.91%, 183/316), 26 (24.07%, 26/108), and 36 (13.95%, 36/258) genomes contain an intact copy of ZB, ZB-like, SB, and SB-like transposons, respectively. However, most of them contain fewer than 10 intact copies, and only 4, 3, 57, and 51 species contain 10 to 99 intact copies of SB, SB-like, ZB, and ZB-like transposons in their genomes, respectively. Very high numbers of intact copies (≥100) were observed in very few species (one or four) for SB and SB-like, were not detected for ZB, but were observed in many species (26) for ZB-like (Table 2). Overall, ZB-like has been significantly amplified in some animal genomes (more than 100 copies), and the intact copy number of ZB-like is much higher than that of ZB in most species, indicating that ZB-like may be more active than ZB in most lineages of animals.
More than 100 intact ZB-like copies were detected in 26 species (1 Sarcopterygii species, 19 Actinopterygii species, 3 Agnatha species, and 3 Anura species). The largest number of intact ZB-like copies (5188) was detected in Microcaecilia unicolor (Sarcopterygii). In contrast, the highest intact copy number of ZB, found in Salarias fasciatus of Actinopterygii, was only 81. Furthermore, except for Parhyale hawaiensis in Arthropoda, all species with more than ten intact copies of ZB belong to Actinopterygii, and 80% of these belong to the order Cichliformes (Table 2 and Supplementary Table S1).
While SB and SB-like have both undergone significant expansion in some Actinopterygii species (more than 100 copies), the numbers of intact SB-like copies in some genomes are higher than those of SB, suggesting that SB-like transposons tend to be more active than SB. The species containing intact copies of SB and SB-like are far fewer than those containing ZB and ZB-like, and only several species in Actinopterygii harbor 10-99 or more than 100 intact copies of SB and SB-like in their genomes (Table 2 and Supplementary Table S1). In addition, SB transposons were detected in 13 species of salmonids, the group in which the original SB transposase was reconstructed based on inactive copies from multiple species [9], but SB elements in most species tend to be truncated in salmonid genomes, which agrees with previous studies [9]. However, more than 100 intact copies of SB were detected in the salmonid Coregonus clupeaformis, indicating that SB may still be active in some salmonid species (Supplementary Table S1).
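The copy categories used in this section (complete copies flanked by detectable TSDs and TIRs, intact copies that additionally encode a transposase of at least 300 aa, and the intact-copy bins of Table 2) can be expressed as a small classifier. The function names and the labels "truncated" and "1-9" are hypothetical conveniences for this sketch, and the input flags are assumed to come from an upstream annotation step.

```python
from collections import Counter


def classify_copy(has_tsd, has_tirs, transposase_length_aa, min_intact_aa=300):
    """Classify one transposon copy using the definitions given in the text.

    'complete': flanked by detectable TSDs and TIRs.
    'intact'  : complete AND encoding a transposase of >= min_intact_aa residues.
    'truncated' is a label introduced here for everything else.
    """
    if not (has_tsd and has_tirs):
        return "truncated"
    if transposase_length_aa >= min_intact_aa:
        return "intact"
    return "complete"


def intact_copy_bin(n_intact):
    """Assign a genome to an intact-copy bin (0, 1-9, 10-99, >=100)."""
    if n_intact >= 100:
        return ">=100"
    if n_intact >= 10:
        return "10-99"
    if n_intact >= 1:
        return "1-9"
    return "0"


def summarize_bins(intact_counts_by_species):
    """Count species per intact-copy bin from a {species: n_intact} mapping."""
    return Counter(intact_copy_bin(n) for n in intact_counts_by_species.values())
```

Applied to the two intact copy numbers quoted in the text, 81 for ZB in Salarias fasciatus falls in the "10-99" bin and 5188 for ZB-like in Microcaecilia unicolor falls in the ">=100" bin.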

Most ZB and SB Transposons Obtained by Horizontal Transfer
The HT events of ZB and SB transposons were recognized based on the standards described in the methods and are summarized in Supplementary Figure S3. The number of species involved in HT events is illustrated in Figure 6, indicating that most ZB and SB transposons in animals were obtained by HT. Overall, 252 (80.5% of the total detected species) ZB-, 184 (58.2%) ZB-like-, 71 (65.7%) SB-, and 241 (93.4%) SB-like-invaded species were involved in HT events (Figure 6A). Moreover, most HT events were confirmed in Actinopterygii, in which the proportions of species involved in HT events were 64.4% for SB, 84.3% for ZB, and 66.1% for ZB-like, with the highest proportion, 94.5%, for SB-like. Notably, most were detected in Cichliformes and Perciformes (Figure 6C,D). However, some HT events were observed in Squamata, where 26.3% (5/19) of ZB-like-invaded species were involved in HT events (Figure 6D). In addition, 16 species of Actinopterygii tend to be more common recipients of HTs of these transposons and have been invaded by at least three families. In particular, co-HT events of the four families (ZB, ZB-like, SB, and SB-like) were detected in three species (Siniperca knerii, Siniperca scherzeri, and Mastacembelus armatus), which represent the species most commonly involved in HTs (Figure 6B).

Recent Origins of ZB and SB Transposons
The phylogenetic relationships of IS630-Tc1-Mariner (ITm) transposons were recently reviewed and at least four superfamilies were suggested, including DDxD/pogo, DD34E/Gambol, Tc1/mariner, and DD82E/Sailor. DD82E/Sailor is a recently characterized superfamily, with a DD82E catalytic domain distinct from those of the other three groups (DDxD/pogo, DD34E/Gambol, and Tc1/mariner) [5,37-39]. Both the DD34E/Gambol and DD82E/Sailor superfamilies seem to show low diversity and narrow distribution in nature, while extremely high diversity and wide distribution were observed for the DDxD/pogo and Tc1/mariner superfamilies. Six distinct families (Passer, Tigger, pogoR, Lemi, Mover, and Fot/Fot-like) were detected for DDxD/pogo transposons [38], while at least nine distinct families (DD34E/Tc1, DD35E/TR, DD36E/IC, DD37E/TRT, DD38E/IT, DD34D/mariner, DD37D/maT, DD39D/GT, and DD41D/VS) have been defined for Tc1/mariner transposons [14]. Furthermore, a previous study from Gao et al. [40] also demonstrated that DD34E/Tc1 transposons display a high diversity at the family level because at least five distinct clusters or sub-families (Passport-like, SB-like, Frog Prince-like, Minos-like, and Bari-like) were identified. DD34E/Tc1 transposons exhibit an unexpected diversity and, as a common ancestor, may have evolved into many families. It was recently indicated that at least three families (DD35E/TR, DD36E/IC, and DD38E/IT) displaying the closest phylogenetic relationship and highest sequence identity to DD34E/Tc1 transposons may have evolved from this family. Here, we systematically defined the evolution profiles of ZB, a naturally active transposon from zebrafish [10], and SB, a rebuilt active transposon [41], both of which belong to the DD34E/Tc1 transposons. Overall, four distinct clades named ZB, ZB-like, SB, and SB-like were identified; they exhibited the closest phylogenetic relationship with the DD34E/Tc1 family and the typical structural organization of this family.
Generally, ZB, ZB-like, SB, and SB-like displayed a similar evolution profile and share a high sequence identity. ZB, ZB-like, SB, and SB-like displayed a narrow taxonomic distribution and are mainly detected in vertebrates (particularly in Actinopterygii), which is similar to the distribution of DD35E/TR, DD36E/IC, and DD38E/IT, and different from that of DD37D/maT, DD39D/GT, and DD41D/VS. DD37D/maT and DD41D/VS are mainly distributed in invertebrates, while DD39D/GT is mainly distributed in plants.
Additionally, our data analysis revealed that ZB, ZB-like, SB, and SB-like displayed high intra-family and inter-family sequence identities, and intact copies were detected in many species of multiple animal lineages for all four groups of transposons. The intact copy number and sequence identity of a transposon in a given genome are key factors for judging its activity in that genome. A high number of intact copies means that the transposon has undergone substantial amplification and may still be active, that is, able to jump in the genome. More accurate predictions of activity can be obtained by combining further analyses, including structural organization and K divergence. Our data analysis indicates that ZB, ZB-like, SB, and SB-like are recently evolved families and show recent and current activity. Furthermore, it indicates that more active members may exist in diverse animal species besides ZB, which was proven to be a highly active element in zebrafish [10]. However, their transposition activities need further experimental validation.

Horizontal Transfer of ZB and SB Transposons
Horizontal transfer (HT) has long been recognized as an important driver of species diversity and has evolutionary significance for the genomes of prokaryote domains (bacteria and archaea) [31,39,42]. It was once believed that HT in eukaryotes is rare; yet accumulating evidence supports that HT events of mobile elements, including DNA transposons and retrotransposons, are common in eukaryotes and may contribute to shaping genomic and evolutionary patterns in eukaryotes [43-46]. Although the mechanisms of HT are still largely unknown, the close physical relationship between a parasite and its host could facilitate horizontal transfer [47-49].
HT events of retrotransposons between kingdoms of eukaryotes (from arthropods to flowering plants) or between phyla have been observed [13,44-46]. DNA transposons are widespread across eukaryote kingdoms. HT events of hAT DNA transposons were observed in multiple lineages of animals, and they may play a role in shaping the evolution of animal genomes [31,48,50]. Among DNA transposons, the Tc1/mariner superfamily appears to be the type of TE most commonly involved in HT [51]. Most well-defined families of Tc1/mariner, including DD35E/TR [11], DD36E/IC [12], DD37E/TRT [52], DD38E/IT [14], DD37D/maT [13], DD39D/GT [13], and DD41D/VS [53], are involved in HT events [50]. In this study, our data analysis indicated that ZB and SB transposons in animals were largely obtained by HT events, mainly occurring in Actinopterygii, as was also observed for DD35E/TR [11], DD36E/IC [12], DD37E/TRT [13], DD38E/IT [14], and DD41D/VS [53]. This indicates that Actinopterygii tend to be "hot" hosts for HTs of Tc1/mariner transposons. On the other hand, the high diversity and common HT events of Tc1/mariner transposons in Actinopterygii also suggest that this superfamily may play roles in shaping the evolution of genomes and contribute to the speciation of this lineage. A similar biological role was observed for the Tigger transposons, a family of pogo transposons. Tigger transposons are commonly involved in HT events across different lineages of animals, including mammals, which may have contributed to their wide taxonomic distribution and indicates that Tigger may play a role in shaping mammalian genome evolution [32].

Structure Organization of ZB and SB Transposons
In the present study, SB and ZB are elements of the DD34E/Tc1 group, which present the typical structural organization of Tc1/mariner elements [5,8]. Functional domain analysis indicated that both ZB and SB transposases have distinct domains required for transposition, including a DNA-binding domain, catalytic domain (DDE), nuclear localization signal (NLS), GRPR-like motif, and a linker motif. The linker [K(V/T)PLLS] is suggested to be structurally equivalent to the regulatory WVPHEL motif of mariner transposases [36], and was shown to play a critical role in orchestrating cleavage events within the transpososome [35,54,55]. According to the present study, most TIRs of SB and ZB elements are relatively long, ranging from 180 bp to 300 bp. However, very long or very short TIRs are also observed in some species. The 3′ TIR of SB partially overlaps with the ORF region, which is not observed for the other three groups of transposons. The end motifs (20 bp) of the TIRs are highly conserved across ZB, ZB-like, SB, and SB-like and start with a GC-rich motif followed by an AT-rich region. Analyses of genomic flanking sequences revealed that the integration profiles of ZB, ZB-like, SB, and SB-like in genomes are evolutionarily conserved and show a distinct preference for integration into AT repeats. ZB has a long TIR sequence (201 bp) and, similar to SB, prefers to integrate into regions containing long repeated TA dinucleotide sequences [56,57]. Similarly, the present study shows that ZB and SB share a similar general structural organization, as two closely related families of Tc1/DD34E transposons. ZB was found to share a similar structural organization and target site sequence preference with SB, but to have a slightly different integration profile at the mammalian genome-wide scale [10].
In the Tc1/mariner superfamily, the inverted repeats vary in length and contain transposase-binding sites in different numbers and patterns; families are thus defined based on the distinct "DDE/D" signatures of the transposase. Furthermore, the DDE domains (protein sequences) tend to be more conserved than the DBD domains, and the sequence identities of DDE domains are 68% both between ZB and ZB-like and between SB and SB-like. Comparatively, ZB and SB are derived from a common ancestor of DD34E/Tc1. ZB shares the same clade with DD34E/Passport, while SB shares the same clade with DD34E/Quetzal [56].

Conclusions
In the present study, we established that four distinct clades of transposons (ZB, ZB-like, SB, and SB-like) exhibited the closest phylogenetic relationship with the DD34E/Tc1 family and the typical structural organization of this family. In addition, SB and ZB displayed a narrow distribution in eukaryotes and were detected only in animals, mostly in vertebrates. Similarly, the evidence for HT events of ZB and SB across vertebrates indicated that these events occurred mainly in Actinopterygii. The current study provides a further understanding of the evolutionary history of ZB, SB, and Tc1/mariner elements, and updates the classification of DD34E/Tc1.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.