Article

Long-Standing Activity with Characteristic Genomic Insertion Signatures in Reptilian Bov-B LINEs and Associated Sauria SINEs

by Yoshiki Nakatsuka and Kazuhiko Ohshima *
Graduate School of Bioscience, Nagahama Institute of Bio-Science and Technology, Nagahama 526-0829, Japan
* Author to whom correspondence should be addressed.
Biology 2026, 15(12), 927; https://doi.org/10.3390/biology15120927 (registering DOI)
Submission received: 26 May 2026 / Revised: 8 June 2026 / Accepted: 10 June 2026 / Published: 13 June 2026
(This article belongs to the Special Issue De Novo Detection of Transposons)

Simple Summary

Long interspersed nuclear elements (LINEs) and short interspersed nuclear elements (SINEs) are retrotransposons that constitute a large proportion of the genomes of multiple animal and plant species. These elements are typically inherited from ancestors and passed down to descendants as part of the genome. However, Bov-B LINE was likely transferred horizontally to a ruminant ancestor from a snake. We demonstrated the long-standing activity of the reptilian Bov-B LINE and revealed a characteristic genomic insertion signature that closely resembles the plant RTE-clade LINE signature in the Bov-B LINE and associated SINE. Although a complex evolutionary trajectory of LINEs across species is plausible, we suggest an ancient origin (over 411 MYA) for the retrotranspositional mechanism underlying this signature. The discovery of novel insertion signatures in distinct clades of LINEs and associated SINEs could contribute to a better understanding of the evolutionary impact of these elements.

Abstract

Although long interspersed nuclear elements (LINEs) and short interspersed nuclear elements (SINEs) are typically passed down to descendants as part of the genome, the Bov-B LINE was likely horizontally transferred from a snake to the ancestor of ruminants. Plant RTE-clade LINEs and their associated SINEs possess a genomic insertion signature different from that of mammalian L1 LINEs. However, the reason for the increased frequency of horizontal transfer in RTE-clade LINEs such as Bov-B relative to that in L1-clade LINEs has not yet been clarified. In this study, we identified family members of the reptilian Bov-B LINE and associated Sauria SINE across various squamate species to determine the amplification timing of the LINE. The findings revealed that the LINE may be over 180 million years old. Moreover, profiling of target site duplications showed that a characteristic genomic insertion signature of the LINE and SINE closely resembled the signature of the plant RTE-clade LINEs. We conducted phylogenetic analyses of RTE-clade LINEs with characteristic genomic insertion signatures and estimated their divergence times. The findings suggest an ancient origin (over 411 MYA) of the retrotranspositional mechanism underlying this signature; however, a complex evolutionary trajectory of LINEs across species warrants further investigation.

1. Introduction

Long interspersed nuclear elements (LINEs) are typically inherited from evolutionary ancestors and passed down to descendants as part of the species genome [1,2]. However, recent research has revealed an increasing number of well-supported cases of horizontal transfer (HT) of LINEs [3,4,5,6,7,8,9]. One notable example is the bovine Bov-B LINE family, which was likely introduced from a snake into the ancestor of ruminants via HT [10,11,12]. Phylogenetic analyses of reverse transcriptases (RTs) have revealed that Bov-B LINEs belong to the RTE clade [1,13,14,15]. RTE-clade LINEs have been identified in the genomes of a wide variety of eukaryotes, although not all, including mammals, non-mammalian vertebrates, lancelets, insects, nematodes, planarians, cnidarians, and flowering plants [15].
Surprisingly, RTE-clade LINEs exhibited frequent HT. Walsh et al. [3] demonstrated that the HT of Bov-B LINEs was more widespread than previously believed, with two plausible arthropod vectors, specifically reptile ticks, playing a role. Our research group [9] also discovered a unique HT pattern of the Bov-B LINE in vertebrates, suggesting its transfer from predators (snakes) to their prey (frogs). Bov-B HT between predators and prey is prevalent in Madagascar [9]. Additionally, Suh et al. [4] observed that the genomes of nematodes and seven tropical bird lineages exclusively shared a novel RTE-clade LINE called AviRTE that resulted from HT. Furthermore, Gao et al. [16] conducted phylogenetic and evolutionary analyses of RTEs from both animals and plants and reported that an angiosperm RTE-clade LINE likely underwent HT from ancient aphids or ancestral arthropods to angiosperms. However, the reasons why RTE-clade LINEs undergo HT more frequently than L1-clade LINEs remain poorly understood [6,7].
LINEs are inserted into the genome through a mechanism known as target DNA-primed reverse transcription [17,18,19,20]. The LINE insertion site is primarily determined by the DNA cleavage specificity of the endonuclease (EN) domain of the LINE-encoded open reading frame 2 (ORF2) protein [17,21,22,23]. LINEs belonging to more than 20 clades (e.g., L1 and RTE) encode apurinic/apyrimidinic EN-like ENs. Most of them are inserted at multiple loci within the host genome and may exhibit weak target site preference [24], while only two clades, Tx1 and R1, contain site-specific LINEs [25]. For example, human L1 preferentially inserts at sites with the sequence 5′-TT|AAAA-3′, where the vertical bar (|) indicates the site of insertion, and the EN of L1 cleaves the TpA bond on the complementary strand [20,21,26,27] (Figure 1). Target DNA-primed reverse transcription often results in the duplication of a short stretch of nucleotides, typically no more than 20 bp, due to integration at staggered chromosomal breaks. Consequently, each newly inserted LINE is typically flanked by short direct repeats, known as target site duplication (TSD) [27]. Analysis of TSDs has primarily focused on mammalian L1 LINEs [28,29].
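The way staggered breaks give rise to a TSD can be illustrated with a minimal Python sketch. It is a toy model only: the function name `insert_with_tsd`, the example target, and the lowercase element are hypothetical, and strand orientation, 3′ tails, and 5′ truncation are all ignored.

```python
def insert_with_tsd(target: str, element: str, nick_pos: int, stagger: int) -> str:
    """Insert `element` into `target` at a staggered break.

    The `stagger` bases lying between the two nicks are copied, so the
    inserted element ends up flanked by identical direct repeats (the TSD).
    """
    if stagger < 0 or not 0 <= nick_pos <= len(target) - stagger:
        raise ValueError("nick position and stagger must fit inside the target")
    tsd = target[nick_pos:nick_pos + stagger]
    upstream = target[:nick_pos]
    downstream = target[nick_pos + stagger:]
    return upstream + tsd + element + tsd + downstream


# Toy example: a 4-base stagger starting right after "TT".
result = insert_with_tsd("GGCCTTAAAAGGCC", "cagt", nick_pos=6, stagger=4)
print(result)  # GGCCTTAAAAcagtAAAAGGCC
```

In this toy output the element `cagt` is flanked on both sides by `AAAA`, which is the kind of short direct repeat that TSD searches look for.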
Short interspersed nuclear elements (SINEs) are non-autonomous retroposons [30,31,32,33]. Their 5′-end sequences are derived from tRNA [34], 5S rRNA [35], or 7SL RNA [36], while their 3′-end sequences originate from corresponding LINEs [37], except for the L1 LINEs [38,39,40,41]. The analysis of TSDs in SINEs [42] has provided valuable insights into the enzymatic source of SINE retrotransposition [29,43,44,45,46,47,48]. Sauria SINEs represent a SINE family whose members are widely distributed among the genomes of lizards and snakes, and they share a part of their 3′-end sequence with Bov-B LINEs [49]. Therefore, these SINEs likely use the enzymatic machinery of Bov-B LINEs for retrotransposition [49].
In a previous study, we identified a novel insertion signature shared by plant RTE-clade LINEs and related SINEs [29]. These angiosperm SINEs exhibited a novel and remarkable TSD pattern, where a thymine stretch appeared approximately ten nucleotides (one helical pitch) upstream of the first nucleotide of the TSDs (Figure 1). Such a split signature of TSDs had not been previously reported in plants. We observed this pattern in both SINEs and LINEs in the genome of leguminous plants, where the SINEs share the 3′-end sequence with the RTE-clade LINEs. Their TSDs began with thymine, the nucleotide immediately 3′ of this thymine was typically adenine, and the 3′-ends of the SINEs and LINEs terminated in Tn and (GTT)n, respectively [29]. We also demonstrated that a horse SINE family exhibits a similar pattern, with a thymine stretch plus TA; however, the TSDs started with adenine while the SINE ended with (CAA)n. The TSDs of the corresponding RTE-clade LINE also tended to start with adenine, and the LINE ended with (CAA)n [29]. Gilbert et al. also found a similar TSD pattern in an elephant LINE family from the RTE clade [44].
Based on these observations, we proposed a mechanism underlying these genomic signatures. We hypothesized that the RTE-encoded ORF2 protein would preferentially bind to DNA regions containing thymine stretches, allowing it to cleave a phosphodiester bond downstream of the stretch. Furthermore, microsatellite-like repeats in the RNA template may influence the identification of EN cleavage sites and/or efficiency of reverse transcription initiation [29]. Sauria SINEs in the green anole exhibited the same pattern in terms of the thymine stretch plus TA. Their TSD started with thymine while SINE ended with (ACCTTT)n. However, the TSD of the corresponding Bov-B LINE tended to start with adenine while the LINE ended with (CGA)n [29].
In this study, we investigated the genomic insertion patterns of Bov-B LINEs and associated Sauria SINEs across various squamate species, the only reptilian group harboring Bov-B, to elucidate the insertion mechanism of the reptilian Bov-B. We successfully clarified the insertion signatures of the LINE and SINE in the same species and observed long-term activity of the LINE with the characteristic insertion signature.

2. Materials and Methods

2.1. Nucleotide Sequences

Whole-genome sequences were obtained from Ensembl (including 9 squamates and 6 representative ruminants from the 15 species assembled at chromosomal level) [50] and Khedkar et al. (Indotyphlops braminus) [51] (Table S1). Consensus sequences of LINEs and SINEs used as queries for initial basic local alignment search tool (BLAST) searches were retrieved from Repbase [52] and Piskurek et al. [49], respectively (Table S2). The initial survey results for the Bov-B and Sauria SINE in the genomes of reptiles and birds with these sequences as the query are presented in Table S3. The query sequences used for subsequent BLAST searches were extracted from the genomes of respective species (Section 2.2; Table S4).

2.2. Extraction of Copy Sequences and Search for TSD

Using the consensus sequences as queries (Table S2), the first blastn search (BLAST+ 2.9.0 [53]) was performed with default settings against the whole-genome sequences of 16 species for LINEs and 9 species for SINEs. Based on the search results, the copy sequence (family member) with the highest identity and a sequence length closest to that of the query was selected for each species. Using this sequence as a new query (Table S4), the second blastn search was performed against each genome. The resulting copy sequences, together with 200 bases of their 5′ and 3′ flanking sequences, were extracted from the genomic sequence using blastdbcmd (BLAST+ 2.9.0 [53]).
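The selection and extraction steps can be sketched in Python under stated assumptions. The sketch assumes blastn tabular output (`-outfmt 6` with its default twelve columns). The ranking rule used here (length closest to the query first, then highest identity) is only one possible reading of the selection criterion, and the helper names `parse_outfmt6`, `pick_representative`, and `flanked_range` are hypothetical.

```python
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    sseqid: str
    pident: float
    length: int
    sstart: int
    send: int


def parse_outfmt6(text: str) -> list:
    """Parse default 12-column tabular BLAST output into Hit records."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        f = line.split("\t")
        hits.append(Hit(f[1], float(f[2]), int(f[3]), int(f[8]), int(f[9])))
    return hits


def pick_representative(hits: list, query_len: int) -> Hit:
    """Assumed rule: closest alignment length to the query, then highest identity."""
    return min(hits, key=lambda h: (abs(h.length - query_len), -h.pident))


def flanked_range(hit: Hit, flank: int = 200, subject_len: Optional[int] = None):
    """Return (start, stop, strand) covering the hit plus `flank` bases per side.

    Coordinates are 1-based and inclusive, as in BLAST output.
    """
    start = max(1, min(hit.sstart, hit.send) - flank)
    stop = max(hit.sstart, hit.send) + flank
    if subject_len is not None:
        stop = min(stop, subject_len)
    strand = "plus" if hit.sstart <= hit.send else "minus"
    return start, stop, strand
```

The returned coordinates and strand could then be passed to a sequence extraction tool; clipping at position 1 and at the subject length keeps the range valid for copies near the ends of a scaffold.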
Within the extracted sequences, we scanned for TSDs with a Python 3 script (Figure S1) using the following criteria: (1) TSD length is between 10 and 49 bases inclusive; (2) the 5′ and 3′ TSD sequences are perfectly matched; (3) the 5′ and 3′ TSD sequences are separated by at least 99 bases [29]. Due to the long nucleotide sequences of LINEs, direct repeats unrelated to TSDs may exist within the detected sequences. To address this, we examined the presence of direct repeats in the query sequences of the first BLAST searches using the Python script mentioned above (Table S2). We aimed to prevent false positives by setting an interval greater than that of the apparent direct repeat (Table S5). For SINEs, we searched for TSDs at three intervals (Table S6).
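The three criteria above can be sketched as follows. This is not the script in Figure S1, only a minimal illustration. It assumes that "separated by" means the number of bases between the end of the 5′ repeat and the start of the 3′ repeat, and that a perfect repeat longer than the upper length limit is discarded rather than trimmed. The function name `find_tsd_candidates` is hypothetical.

```python
from collections import defaultdict

BASES = set("ACGT")


def find_tsd_candidates(seq, min_len=10, max_len=49, min_gap=99):
    """Return (start5, start3, length, repeat) for perfect direct repeats."""
    seq = seq.upper()
    seeds = defaultdict(list)
    for i in range(len(seq) - min_len + 1):
        kmer = seq[i:i + min_len]
        if set(kmer) <= BASES:  # skip seeds containing N or other symbols
            seeds[kmer].append(i)

    found = []
    for starts in seeds.values():
        for a, i in enumerate(starts):
            for j in starts[a + 1:]:
                # Skip pairs that also match one base to the left; the seed
                # one base upstream reports the same repeat.
                if i > 0 and seq[i - 1] == seq[j - 1] and seq[i - 1] in BASES:
                    continue
                length = min_len
                while (j + length < len(seq) and i + length < j
                       and seq[i + length] == seq[j + length]
                       and seq[i + length] in BASES):
                    length += 1
                gap = j - (i + length)  # bases between the two repeat copies
                if length <= max_len and gap >= min_gap:
                    found.append((i, j, length, seq[i:i + length]))
    return sorted(found)
```

Raising `min_gap` above the spacing of any direct repeat internal to the element corresponds to the interval adjustment described above for avoiding false positives in long LINE sequences.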

2.3. Motif Discovery

Using a Perl script, we extracted 30-nucleotide sequences from the respective copies of the SINE and LINE families, comprising 15 nucleotides upstream of the start of a 5′ direct repeat and 15 nucleotides downstream from the start of the repeat, from the genomic sequences obtained with blastdbcmd (Section 2.2). The Multiple EM for Motif Elicitation (MEME) discovery algorithm [54] was applied to the TSD datasets. The resulting motifs are represented as position-dependent character probability matrices that indicate the likelihood of each character appearing at each position in the pattern. We performed the MEME analysis (MEME suite 4.11.2 [55]) with the following parameters via the command-line version: sequence type, DNA; minimum motif width, 15; maximum motif width, 30; minimum sites per motif, N (number of analyzed TSDs) × 0.25; maximum sites per motif, N; type of model, zoops; and maximum number of motifs, 3. For SINEs from Podarcis muralis and Pogona vitticeps, we analyzed 2000 randomly selected TSDs. All available TSDs were used for the other species and LINEs. The most significant (lowest E-value) motif was selected for further analyses (Table S7). We identified a Bov-B motif using TSD datasets with maximum intervals in Anolis carolinensis (≥2999 bases) and Bos taurus (≥3699 bases) but not with smaller intervals (Table S5). For the other species, we used the TSD datasets with maximum intervals (Table S8).
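The window extraction and the idea of a position-dependent probability matrix can be sketched in Python. The sketch computes plain per-position base frequencies over pre-aligned windows, which is much simpler than the expectation-maximization procedure MEME performs and is not a substitute for it. The function names are hypothetical, and the TSD start is assumed to be given as a 0-based index.

```python
from collections import Counter


def tsd_window(seq, tsd_start, upstream=15, downstream=15):
    """Return the window around the first TSD nucleotide, or None if it does not fit.

    With the defaults, index 15 of the returned 30-nt window is the first
    nucleotide of the 5' direct repeat.
    """
    if tsd_start - upstream < 0 or tsd_start + downstream > len(seq):
        return None
    return seq[tsd_start - upstream:tsd_start + downstream].upper()


def position_frequencies(windows):
    """Per-position frequencies of A, C, G, T across equal-length windows."""
    if not windows:
        raise ValueError("at least one window is required")
    width = len(windows[0])
    if any(len(w) != width for w in windows):
        raise ValueError("all windows must have the same length")
    matrix = []
    for pos in range(width):
        counts = Counter(w[pos] for w in windows)
        matrix.append({base: counts.get(base, 0) / len(windows) for base in "ACGT"})
    return matrix
```

A column in which one base approaches a frequency of 1.0 corresponds to a strongly conserved position in a motif logo.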

2.4. Age Estimation of Bov-B

We estimated the mean evolutionary divergence over all Bov-B sequence pairs in a species using sequences with lengths > 3000 nucleotides. Multiple sequence alignment was performed with MAFFT version 7.511 [56], and then the number of base substitutions per site was calculated by averaging over all sequence pairs with molecular evolutionary genetics analysis version 11 (MEGA11) [57]. The codon positions included were 1st + 2nd + 3rd + Noncoding. All sites containing missing data or alignment gaps were removed from each sequence pair (pairwise deletions). We also estimated the Bov-B age for each species from the sequence divergence of all copies by dividing half of the mean divergence from a representative copy (Section 2.2; Table S4) by a nucleotide substitution rate of 0.0013/site/million years [58,59,60].
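The divergence and age calculations can be sketched in Python under explicit assumptions. The sketch uses the uncorrected proportion of differing sites (p-distance) with pairwise deletion, whereas the substitution model applied in MEGA11 is not stated here and may differ. It also assumes age = (divergence / 2) / rate. The function names and the example values are hypothetical, not results from this study.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites, ignoring sites with a gap or ambiguity in either sequence."""
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def mean_pairwise_divergence(aligned):
    """Average p-distance over all pairs of aligned sequences."""
    dists = [p_distance(a, b) for a, b in combinations(aligned, 2)]
    if not dists:
        raise ValueError("at least two sequences are required")
    return sum(dists) / len(dists)


def estimate_age_my(divergence, rate_per_site_per_my):
    """Age in million years, assuming age = (divergence / 2) / rate."""
    if rate_per_site_per_my <= 0:
        raise ValueError("rate must be positive")
    return (divergence / 2) / rate_per_site_per_my


# Hypothetical values: 10% divergence at 0.001 substitutions/site/million years.
age = estimate_age_my(0.10, 0.001)  # about 50 million years
```

Because the rate enters as a simple divisor, any uncertainty in the assumed substitution rate scales the estimated age proportionally.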

2.5. Phylogenetic Analysis and Divergence Time Estimation

We conducted a phylogenetic analysis of the RTE-clade LINEs using the amino acid sequences of the ORF2 protein that includes the EN and RT domains. The alignment was performed using multiple sequence comparison by log-expectation implemented with MEGA11 and the following parameters: gap opening penalty, −2.90; gap extension penalty, 0.00; hydrophobicity multiplier, 1.20; clustering method, unweighted pair group method with arithmetic mean; and minimum diagonal length, 24. Next, we constructed phylogenetic trees in MEGA11 using the maximum likelihood method and a Jones–Taylor–Thornton matrix-based model, using the amino acid sequences of the ORF2 proteins from LINEs. These sequences were selected from those for which almost all the full-length data were available (sequence references are provided in Table S9). The bootstrap consensus tree was inferred from 500 replicates [61]. Overall, our analysis considered 1508 positions in the final dataset. Divergence time estimation was performed using the RelTime method implemented with MEGA11, using the monocot/eudicot divergence time of 140 MYA [62,63] and/or the Toxicofera (Anguimorpha and Iguania)/Serpentes divergence time of 184.6 MYA [64] as calibration constraints. A discrete gamma distribution was used to model the evolutionary rate differences between the sites. Phylogenetic analysis of squamate Bov-B LINEs was conducted using nucleotide sequences of almost full-length and partial (diverged in I. braminus) Bov-B LINEs. Phylogenetic trees were constructed in MEGA11 using the maximum likelihood method with 500 replicates.

3. Results

3.1. Long-Standing Activity of Reptilian Bov-B LINEs

To elucidate the genomic insertion signatures of the reptilian Bov-B LINE and associated SINE, we identified LINE and SINE family members in the genomes of squamates and Bov-B LINE family members in ruminants (Table 1 and Table S3). BLAST searches against the whole-genome sequences of 16 species (all nine squamates and six representative ruminants in Ensembl and Indotyphlops braminus) identified Bov-B copies for the respective species with significantly different numbers. Elapid snakes harbored relatively small numbers of Bov-B copies among squamates, while ruminants harbored a significantly greater number of Bov-B than squamates.
We also identified family members of the Sauria SINE via BLAST searches in many of the squamate species examined (Table 1 and Table S3). The copy number of Sauria SINE tended to be lower than that of Bov-B LINE in the same species. Among the surveyed species, Podarcis muralis and Anolis carolinensis harbored greater copy numbers of SINE than LINE. Such a successful SINE amplification in A. carolinensis was reported previously [65].
Figure 2 and Figure S2 present the frequency distributions of sequence divergence (extent of the difference between the query and copies) and sequence lengths for all LINE copies detected in each species. As commonly observed in other LINEs [27], we observed numerous truncated copies of the Bov-B LINE, which is approximately 3.5 kb in full length. Many of these copies were less than 1 kb in size (nucleotide sequence of Bov-B from P. muralis, with a length of >3 kb, is presented in Figure S3).
The sequence divergence of Bov-B LINEs varied across species. For instance, S. merianae exhibited a relatively large divergence, while I. braminus exhibited a very small divergence. Table 2 presents the sequence divergence and estimated ages of squamate Bov-B. Although absolute dating with a single substitution rate is generally difficult, we used a single substitution rate to estimate the ages since the nucleotide substitution rate among squamates presented relatively low variations [59]. The mean amplification time for each lineage ranged from 134 to 9 million years. Although most copies fell within the range of 110–40 MYA, some copies were over 150 MYA (S. merianae) or almost zero (I. braminus). These results suggest that the Bov-B LINE retained its retrotranspositional activity during squamate evolution. The Bov-B in I. braminus consisted of old copies with large divergence (mean of 21.4%) and young copies with small divergence (mean of 1.4%) (Table 2 and Figure S4), with the estimated ages of the two groups at 134 million and 9 million years, respectively (Table 2).
We analyzed phylogenetic relationships among squamate Bov-B copies, including those in the two I. braminus groups (Figure 3). The results indicated that squamate Bov-B LINEs possibly diverged into at least three distinct lineages and demonstrated that the old and young I. braminus copies belonged to two different lineages: one consisting of Bov-B copies from a lizard and Booidea and the other consisting of Bov-B copies from a lizard and Caenophidia snakes. Considering the squamate phylogeny (Figure 2), these results suggest that the origin of squamate Bov-B LINEs dates to before the divergence of Squamata lineages and that the Bov-B LINE was active during squamate evolution. Thus, the origin point could exceed 180 MYA.

3.2. Insertion Signature near the DNA Cleavage Site in Bov-B LINEs

Next, we searched for TSDs within the sequences from respective copies of the LINE and SINE families (Table 1, Tables S5 and S6). Young Bov-B copies from I. braminus exhibited conspicuous TSDs (Figure S5; a search result for Bov-B in P. muralis is presented in Figure S6). To investigate the DNA cleavage preference of Bov-B EN, we focused on 30 nucleotides around the first nucleotide (a DNA cleavage site) within the 5′ direct repeat (which constitutes the 5′-end of a TSD). A distinct insertion signature was observed around Bov-B TSDs in I. braminus and other species (Table 3). The signature was observed in species with a high proportion of copies exhibiting low divergence (less than 15%), often exceeding 81% of the copies (Table 3).
Figure 3. Phylogenetic relationships among squamate Bov-B LINEs. Bov-B LINEs from Indotyphlops braminus (Scolecophidia) with large and small divergence are represented by the filled triangle and circle, respectively. Snake Bov-B, other than that of I. braminus (in red), is highlighted in blue. Bov-B LINEs identified in this study are marked by asterisks (Table S4). Other LINEs were obtained from Repbase (Table S1). The phylogenetic tree was constructed using the maximum likelihood method with the nucleotide sequences of the Bov-B LINEs. Bootstrap analysis was performed with 500 replicates. Legume RTE-clade LINEs were used as the outgroup. SMe: Salvator merianae; PMo: Python molurus; VA: Vipera ammodytes; ACo: Agkistrodon contortrix; NNa: Naja naja; MT: Medicago truncatula; GM: Glycine max. Other abbreviations are presented in Table 4.
As presented in Figure 4A, Bov-B TSDs consistently started with adenine at approximately one helical pitch (ten nucleotides) downstream of the three thymine residues. Moreover, the nucleotide immediately 5′ of this adenine is typically thymine (Figure 4 and Figure S7). We detected TSDs in approximately 80% of the I. braminus copies (Table S5), with this motif observed in 108 out of 123 TSDs analyzed. A pattern occurring at such a high frequency is unlikely to have arisen by chance. We will henceforth refer to this insertion signature as “Tn-TA.” Tn-TA was observed in I. braminus and four lizard species (Table 3), and a clear Tn-TA pattern in Bov-B LINE was also observed in Bos taurus (Figure 4A).
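A simple check for the Tn-TA signature in a single 30-nucleotide window can be sketched in Python. The sketch assumes that index 15 (0-based) is the first nucleotide of the TSD, that "approximately ten nucleotides upstream" allows the thymine run to start within two positions of index 5, and that the TA dinucleotide sits at indices 14 and 15 when the TSD starts with adenine, or at indices 15 and 16 when it starts with thymine. The function name and the tolerance are hypothetical choices, not the criteria used for motif discovery in this study.

```python
def has_tn_ta(window, tsd_first="A", center=15, pitch=10, t_run=3, tolerance=2):
    """Return True if the window shows a thymine run about one pitch upstream plus TA."""
    if tsd_first not in ("A", "T"):
        raise ValueError("tsd_first must be 'A' or 'T'")
    w = window.upper()
    # TA ends at the TSD start (A-first) or begins at the TSD start (T-first).
    ta_start = center - 1 if tsd_first == "A" else center
    if w[ta_start:ta_start + 2] != "TA":
        return False
    lo = max(0, center - pitch - tolerance)
    hi = center - pitch + tolerance
    return any(w[s:s + t_run] == "T" * t_run for s in range(lo, hi + 1))


def signature_fraction(windows, **kwargs):
    """Fraction of windows that carry the signature."""
    if not windows:
        raise ValueError("at least one window is required")
    return sum(has_tn_ta(w, **kwargs) for w in windows) / len(windows)
```

For scale, the 108 of 123 TSDs reported above correspond to a fraction of about 0.88.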

3.3. Variation in the Tn-TA Pattern Between Bov-B LINEs and Sauria SINEs and Correlation with 3′-End Microsatellite-Like Sequences

We analyzed the nucleotide frequency around the first nucleotide of the 5′ direct repeats from Sauria SINEs identified in the genomes of three lizards and five snake species (Table 1 and Table S1). We observed that these SINEs exhibited the same Tn-TA pattern in lizards and the three snakes (Figure 4B and Figure S7; Tables S6 and S8). The first nucleotide of the TSDs from these SINEs was thymine, whereas that in Bov-B LINEs was adenine.
In our investigation, we observed distinct Tn-TA trends between the Bov-B LINEs and their associated Sauria SINEs in lizard species that included P. muralis, P. vitticeps, and V. komodoensis (Figure 4 and Figure S7; Table S8). A comparison of the Tn-TA patterns of LINEs and SINEs within the same species indicated that both contain three consecutive thymines, approximately 10 nucleotides upstream of the first nucleotide of the 5′ direct repeats. However, a crucial difference was noted: the first nucleotide was consistently adenine in the LINEs from all species but thymine in the SINEs (Figure 4 and Figure S7).
To investigate the cause of this difference, we examined the relationship between the first nucleotide of TSDs and microsatellite-like sequences at the 3′-ends of both LINEs and SINEs (Table 4 and Table S8). The reptilian Bov-B LINEs, where the first nucleotide is adenine, possessed a microsatellite-like sequence with consecutive adenine residues (CAA)n, except for A. carolinensis. In contrast, the Sauria SINEs, where the first nucleotide is thymine and the 3′-end sequence is shared with Bov-B LINEs, terminated in a sequence of consecutive thymine residues (ACCTTT)n.

3.4. Evolutionary Time Scale of the Tn-TA Containing RTE-Clade LINEs

Figure 5A indicates the amplification timing of the squamate Bov-B LINEs estimated from the sequence divergence of each copy (Table 2). We identified a Tn-TA pattern in the respective Squamata lineages that diverged over 180 MYA. This suggests that the insertion signature or genomic integration machinery of LINEs is stably inherited along with these lineages, even if there are unusual HT events across squamate lineages. This is the first observation of stable and long-term inheritance of the integration machinery of RTE-clade LINEs.
Figure 5B presents a molecular phylogenetic tree that includes various RTE-clade LINEs, including the I. braminus Bov-B LINE that exhibits a Tn-TA pattern. It diverged into four major clades with statistical support, with the divergence almost parallel to the taxonomic groups, although their branching orders were not conclusive (Figure S10). The Tn-TA pattern has been observed in distantly related species, including plants and reptiles. Divergence time estimation with a calibration constraint of monocot/eudicot divergence suggested that the RTE lineages in angiosperms and vertebrates diverged over 411 MYA (411–1884 MYA; Figure 5B and Figure S11).

4. Discussion

4.1. Genomic Integration Machinery of Bov-B LINEs and Sauria SINEs

In the present study, we observed that the first nucleotide of the TSDs from the Bov-B LINE and Sauria SINE differed, with adenine in the Bov-B LINE and thymine in the Sauria SINE (Figure 4). Moreover, the Bov-B LINEs ended in (CAA)n while the Sauria SINEs ended in (ACCTTT)n (Table 4). A general consensus has not been established for the functional significance of the microsatellite-like sequences at the 3′ ends of LINEs and SINEs [66,67,68,69]. These repeat sequences could be required to ensure base pairing between the LINE RNA and exposed 3′ DNA for the initiation of reverse transcription [70]. However, Bov-B family members terminated with 3′ repeats are rarely flanked by target sequences that resemble the 3′ repeats (Figure S5). The different 3′ repeats between Bov-B and associated SINEs in the same species could be due to the different mechanisms for transcription termination of RNA Pol II and Pol III between LINEs and SINEs [71].
The different DNA cleavage sites between LINEs and SINEs are explained by the hypothesis that consecutive residues (AA or UUU) within microsatellite-like repeats in LINE or SINE RNA influence whether Bov-B EN nicks one of the DNA strands at thymine or adenine and/or affect the efficiency of priming reverse transcription (Figure 6). These findings are consistent with previous results for RTE-clade LINEs and related SINEs [29].
TA dinucleotides may be primarily recognized by the three-dimensional structure of the EN domain in Bov-B ORF2 protein. Recent studies have shed light on the structural features of ORF2 proteins (including domains for both EN and RT) encoded in the Bombyx mori R2 LINE (R2Bm) and human L1 LINE [19,20,23,72]. The R2Bm-encoded protein recognizes a DNA sequence approximately 20 nucleotides upstream of the cleavage site, followed by first-strand cleavage [19,23]. A similar pattern was observed in the Bov-B LINE, where a thymine stretch was present upstream of the cleavage site. The Tn-TA pattern was observed in RTE-clade LINEs other than Bov-B, as well as in associated SINEs from several mammalian species (unpublished results). It is possible that most RTE-clade LINEs possess this signature. The thymine stretch is often three in animals, whereas it tends to be four or more in legumes [29]. Therefore, whether this difference in number indicates evolutionary changes in RTE ENs or simply reflects the different genomic landscapes of hosts, such as GC content, among taxa, should be determined. The integration machinery and role of microsatellite-like repeats can be evaluated using experimental approaches such as in vivo retrotransposition assays using RTE-clade LINEs as targets [69].

4.2. Ability of RTE-Clade LINEs to Propagate in a New Host Genome Following HT

Our results suggest that Tn-TA-containing RTE-clade LINEs in angiosperms and vertebrates diverged over 411 MYA (411–1884 MYA; Figure 5B and Figure S11). If this divergence predates 600 MYA, the approximate origin of Metazoa, this deep branch raises the possibility that Tn-TA-containing RTE-clade LINEs originated in a common ancestor of plants and animals. The machinery underlying the Tn-TA pattern, such as the characteristic EN, may be fundamental to the retrotransposition of RTE-clade LINEs and therefore might have been maintained over such a long evolutionary timescale. The insertion signature of RTE-clade LINEs from other species, such as fish and birds, remains to be elucidated. If the distribution of the Tn-TA-containing RTE-clade LINEs among taxa is patchy, it raises the possibility that this signature might be occasionally lost. Alternatively, complex evolutionary trajectories across species might account for their widespread distribution. As noted previously, an ancestor of plant RTE-clade LINEs likely underwent HT from ancient aphids or ancestral arthropods to angiosperms [16]. The long distance between angiosperm and vertebrate RTEs (Figure 5B) may reflect the divergence time between arthropods and vertebrates. Moreover, a probable HT event between reptiles (RTE-1_AC_1) and sea urchins (RTE1X_SP), as well as between birds and nematodes (AviRTE [4]), was observed in the phylogeny (Figure 5B). More extensive taxon sampling might reveal intermediates between reptiles and sea urchins. The missing link between plant and reptilian LINEs may include insect, fish, and avian LINEs and thus should be further investigated to provide insights into the trajectory of cross-species HT [4,5,16,73].
The Tn-TA pattern was observed in ruminants and snakes. The first nucleotide of the TSDs was also adenine in B. taurus (Figure 4A). However, the microsatellite-like sequences of the Bov-B LINEs differed between B. taurus and I. braminus, with (CTGAA)n and (CAA)n repeats, respectively (Table 4). This difference suggests vestiges of adaptation to a new genome following HT. Specifically, an ancient snake Bov-B LINE may have acquired a new microsatellite at the 3′-end in the genome of an ancestor of ruminants after HT, thus allowing the LINE to propagate in the host genome.
Although several reports have been published on possible HT vectors in RTE-clade LINEs [74], the mechanism underlying the increased ability of RTE-clade LINEs to propagate in a new host genome following HT relative to that observed for L1-clade LINEs remains unclear. Environmental stressors, such as chemical agents, physical agents (radiation), and experiential factors (e.g., environmental light/dark cycles), can affect the retrotransposition of the human L1 LINE [75], whereas such external factors for RTE-clade LINEs have not yet been reported. RTE-clade LINEs generally contain a single ORF and lack a counterpart of the first ORF of the L1-clade LINEs. Because the ORF1 protein from human L1 interacts with host proteins, which in turn affect L1’s activity [76,77,78], the lack of the ORF1 protein may release RTEs from regulation by host factors and facilitate their HT. Furthermore, for LINEs with indiscriminate integration sites, separating the DNA cleavage site from the recognition motif by a short distance, as in Tn-TA, may offer some advantages, such as giving the LINE EN flexibility in nicking DNA. Thus, whether this characteristic is also present in LINEs other than RTEs should be further investigated. Future TSD surveys for distinct clades of LINEs and SINEs may lead to the discovery of other insertion signatures, which could enhance our understanding of the integration strategies of LINEs and SINEs and their evolutionary impact on host genomes.

5. Conclusions

We proposed a model for the genomic integration machinery of Bov-B LINEs and Sauria SINEs that can be evaluated experimentally. We also observed long-term activity of the Bov-B LINE with a characteristic insertion signature. Although a complex evolutionary trajectory across species is plausible, the long evolutionary distances among the species harboring Tn-TA-containing LINEs suggest an ancient origin for the mechanism underlying this signature. Although LINEs occupy a large proportion of eukaryotic genomes, the integration mechanisms of LINEs with indiscriminate integration sites, other than L1s and RTEs, remain poorly understood. The accurate detection of TSDs and the discovery of novel insertion signatures for distinct clades of LINEs and SINEs could contribute to a better understanding of the evolutionary and functional impacts of these elements.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/biology15120927/s1, Table S1: Genome sequences and LINE/SINE sequences used in this study; Table S2: Query sequences used for the initial BLAST search and their internal direct repeats; Table S3: Genome survey for the Bov-B and Sauria SINE with the initial query sequences; Table S4: Query sequences used for the second BLAST search; Table S5: Copy numbers of the Bov-B and the number of direct repeats detected at different intervals; Table S6: The number of direct repeats detected beside Sauria SINEs at different intervals; Table S7: Discovered motif sites and statistical significance of motifs; Table S8: Comparison of Tn-TA trends and the first nucleotides of TSDs between LINEs and SINEs; Table S9: LINE sequences used in phylogenetic analysis. Figure S1: Python script for TSD search; Figure S2: Frequency distributions of sequence divergence and sequence length of LINE copies across snake and ruminant species; Figure S3: Sequence comparisons among Bov-B from Podarcis muralis with a length > 3 kb; Figure S4: Frequency distributions of sequence divergence and sequence length in the two Indotyphlops braminus Bov-B groups; Figure S5: Sequence comparisons among young I. braminus Bov-B copies and their TSDs; Figure S6: TSD search results with an interval of ≥2999 bases for Bov-B from P. muralis; Figure S7: Comparisons of the LINE/SINE motifs among different TSD intervals; Figure S8: MEME results for Bov-B from P. muralis; Figure S9: MEME results for Sauria SINE from P. muralis; Figure S10: ML tree of RTE-clade LINEs and bootstrap consensus tree; Figure S11: Estimation of the divergence time with different calibration constraints.

Author Contributions

Y.N. and K.O. analyzed the sequences. Y.N. and K.O. wrote the manuscript. Y.N. and K.O. created figures and edited tables. K.O. conceived the project. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within this article and Supplementary Materials.

Acknowledgments

We are grateful to Chiaki Kambayashi at Niigata University for providing the genome assembly of Indotyphlops braminus. We are indebted to Keita Morimoto and Haru Okawa at the Nagahama Institute of Bio-Science and Technology for supporting the data processing. Finally, we would like to thank the anonymous reviewers for providing useful comments on an earlier version of this article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Malik, H.S.; Burke, W.D.; Eickbush, T.H. The age and evolution of non-LTR retrotransposable elements. Mol. Biol. Evol. 1999, 16, 793–805.
  2. Schaack, S.; Gilbert, C.; Feschotte, C. Promiscuous DNA: Horizontal transfer of transposable elements and why it matters for eukaryotic evolution. Trends Ecol. Evol. 2010, 25, 537–546.
  3. Walsh, A.M.; Kortschak, R.D.; Gardner, M.G.; Bertozzi, T.; Adelson, D.L. Widespread horizontal transfer of retrotransposons. Proc. Natl. Acad. Sci. USA 2013, 110, 1012–1016.
  4. Suh, A.; Witt, C.C.; Menger, J.; Sadanandan, K.R.; Podsiadlowski, L.; Gerth, M.; Weigert, A.; McGuire, J.A.; Mudge, J.; Edwards, S.V.; et al. Ancient horizontal transfers of retrotransposons between birds and ancestors of human pathogenic nematodes. Nat. Commun. 2016, 7, 11396.
  5. Peccoud, J.; Loiseau, V.; Cordaux, R.; Gilbert, C. Massive horizontal transfer of transposable elements in insects. Proc. Natl. Acad. Sci. USA 2017, 114, 4721–4726.
  6. Ivancevic, A.M.; Kortschak, R.D.; Bertozzi, T.; Adelson, D.L. Horizontal transfer of BovB and L1 retrotransposons in eukaryotes. Genome Biol. 2018, 19, 85.
  7. Zhang, H.H.; Peccoud, J.; Xu, M.R.; Zhang, X.G.; Gilbert, C. Horizontal transfer and evolution of transposable elements in vertebrates. Nat. Commun. 2020, 11, 1362.
  8. Melo, E.S.; Wallau, G.L. Mosquito genomes are frequently invaded by transposable elements through horizontal transfer. PLoS Genet. 2020, 16, e1008946.
  9. Kambayashi, C.; Kakehashi, R.; Sato, Y.; Mizuno, H.; Tanabe, H.; Rakotoarison, A.; Künzel, S.; Furuno, N.; Ohshima, K.; Kumazawa, Y.; et al. Geography-dependent horizontal gene transfer from vertebrate predators to their prey. Mol. Biol. Evol. 2022, 39, msac052.
  10. Kordis, D.; Gubensek, F. Unusual horizontal transfer of a long interspersed nuclear element between distant vertebrate classes. Proc. Natl. Acad. Sci. USA 1998, 95, 10704–10709.
  11. Gallus, S.; Kumar, V.; Bertelsen, M.F.; Janke, A.; Nilsson, M.A. A genome survey sequencing of the Java mouse deer (Tragulus javanicus) adds new aspects to the evolution of lineage specific retrotransposons in Ruminantia (Cetartiodactyla). Gene 2015, 571, 271–278.
  12. Puinongpo, W.; Singchat, W.; Petpradub, S.; Kraichak, E.; Nunome, M.; Laopichienpong, N.; Thongchum, R.; Intarasorn, T.; Sillapaprayoon, S.; Indananda, C.; et al. Existence of Bov-B LINE retrotransposons in snake lineages reveals recent multiple horizontal gene transfers with copy number variation. Genes 2020, 11, 1241.
  13. Malik, H.S.; Eickbush, T.H. The RTE class of non-LTR retrotransposons is widely distributed in animals and is the origin of many SINEs. Mol. Biol. Evol. 1998, 15, 1123–1134.
  14. Zupunski, V.; Gubensek, F.; Kordis, D. Evolutionary dynamics and evolutionary history in the RTE clade of non-LTR retrotransposons. Mol. Biol. Evol. 2001, 18, 1849–1863.
  15. Kapitonov, V.V.; Tempel, S.; Jurka, J. Simple and fast classification of non-LTR retrotransposons based on phylogeny of their RT domain protein sequences. Gene 2009, 448, 207–213.
  16. Gao, D.; Chu, Y.; Xia, H.; Xu, C.; Heyduk, K.; Abernathy, B.; Ozias-Akins, P.; Leebens-Mack, J.H.; Jackson, S.A. Horizontal transfer of non-LTR retrotransposons from arthropods to flowering plants. Mol. Biol. Evol. 2018, 35, 354–364.
  17. Luan, D.D.; Korman, M.H.; Jakubczak, J.L.; Eickbush, T.H. Reverse transcription of R2Bm RNA is primed by a nick at the chromosomal target site: A mechanism for non-LTR retrotransposition. Cell 1993, 72, 595–605.
  18. Cost, G.J.; Feng, Q.; Jacquier, A.; Boeke, J.D. Human L1 element target-primed reverse transcription in vitro. EMBO J. 2002, 21, 5899–5910.
  19. Wilkinson, M.E.; Frangieh, C.J.; Macrae, R.K.; Zhang, F. Structure of the R2 non-LTR retrotransposon initiating target-primed reverse transcription. Science 2023, 380, 301–308.
  20. Thawani, A.; Ariza, A.J.F.; Nogales, E.; Collins, K. Template and target-site recognition by human LINE-1 in retrotransposition. Nature 2024, 626, 186–193.
  21. Feng, Q.; Moran, J.V.; Kazazian, H.H., Jr.; Boeke, J.D. Human L1 retrotransposon encodes a conserved endonuclease required for retrotransposition. Cell 1996, 87, 905–916.
  22. Morrish, T.A.; Gilbert, N.; Myers, J.S.; Vincent, B.J.; Stamato, T.D.; Taccioli, G.E.; Batzer, M.A.; Moran, J.V. DNA repair mediated by endonuclease-independent LINE-1 retrotransposition. Nat. Genet. 2002, 31, 159–165.
  23. Deng, P.; Tan, S.Q.; Yang, Q.Y.; Fu, L.; Wu, Y.; Zhu, H.Z.; Sun, L.; Bao, Z.; Lin, Y.; Zhang, Q.C.; et al. Structural RNA components supervise the sequential DNA cleavage in R2 retrotransposon. Cell 2023, 186, 2865–2879.
  24. Zingler, N.; Weichenrieder, O.; Schumann, G.G. APE-type non-LTR retrotransposons: Determinants involved in target site recognition. Cytogenet. Genome Res. 2005, 110, 250–268.
  25. Fujiwara, H. Site-specific non-LTR retrotransposons. Microbiol. Spectr. 2015, 3, MDNA3-0001–2014.
  26. Cost, G.J.; Boeke, J.D. Targeting of human retrotransposon integration is directed by the specificity of the L1 endonuclease for regions of unusual DNA structure. Biochemistry 1998, 37, 18081–18093.
  27. Szak, S.T.; Pickeral, O.K.; Makalowski, W.; Boguski, M.S.; Landsman, D.; Boeke, J.D. Molecular archeology of L1 insertions in the human genome. Genome Biol. 2002, 3, research0052.
  28. Ichiyanagi, K.; Okada, N. Mobility pathways for vertebrate L1, L2, CR1, and RTE clade retrotransposons. Mol. Biol. Evol. 2008, 25, 1148–1157.
  29. Nishiyama, E.; Ohshima, K. Cross-kingdom commonality of a novel insertion signature of RTE-related short retroposons. Genome Biol. Evol. 2018, 10, 1471–1483.
  30. Matetovici, I.; Sajgo, S.; Ianc, B.; Ochis, C.; Bulzu, P.; Popescu, O.; Damert, A. Mobile element evolution playing jigsaw-SINEs in gastropod and bivalve mollusks. Genome Biol. Evol. 2016, 8, 253–270.
  31. Nishihara, H.; Plazzi, F.; Passamonti, M.; Okada, N. MetaSINEs: Broad distribution of a novel SINE superfamily in animals. Genome Biol. Evol. 2016, 8, 528–539.
  32. Kojima, K.K. LINEs contribute to the origins of middle bodies of SINEs besides 3′ tails. Genome Biol. Evol. 2018, 10, 370–379.
  33. Seibt, K.M.; Schmidt, T.; Heitkam, T. The conserved 3′ Angio-domain defines a superfamily of short interspersed nuclear elements (SINEs) in higher plants. Plant J. 2020, 101, 681–699.
  34. Okada, N. SINEs: Short interspersed repeated elements of the eukaryotic genome. Trends Ecol. Evol. 1991, 6, 358–361.
  35. Kapitonov, V.V.; Jurka, J. A novel class of SINE elements derived from 5S rRNA. Mol. Biol. Evol. 2003, 20, 694–702.
  36. Batzer, M.A.; Deininger, P.L. Alu repeats and human genomic diversity. Nat. Rev. Genet. 2002, 3, 370–379.
  37. Ohshima, K.; Hamada, M.; Terai, Y.; Okada, N. The 3′ ends of tRNA-derived short interspersed repetitive elements are derived from the 3′ ends of long interspersed repetitive elements. Mol. Cell Biol. 1996, 16, 3756–3764.
  38. Gogolevsky, K.P.; Vassetzky, N.S.; Kramerov, D.A. Bov-B-mobilized SINEs in vertebrate genomes. Gene 2008, 407, 75–85.
  39. Ohshima, K. Parallel relaxation of stringent RNA recognition in plant and mammalian L1 retrotransposons. Mol. Biol. Evol. 2012, 29, 3255–3259.
  40. Suh, A.; Bachg, S.; Donnellan, S.; Joseph, L.; Brosius, J.; Kriegs, J.O.; Schmitz, J. De-novo emergence of SINE retroposons during the early evolution of passerine birds. Mob. DNA 2017, 8, 21.
  41. Han, G.; Zhang, N.; Jiang, H.; Meng, X.; Qian, K.; Zheng, Y.; Xu, J.; Wang, J. Diversity of short interspersed nuclear elements (SINEs) in lepidopteran insects and evidence of horizontal SINE transfer between baculovirus and lepidopteran hosts. BMC Genom. 2021, 22, 226.
  42. Jurka, J. Sequence patterns indicate an enzymatic involvement in integration of mammalian retroposons. Proc. Natl. Acad. Sci. USA 1997, 94, 1872–1877.
  43. Lenoir, A.; Lavie, L.; Prieto, J.L.; Goubely, C.; Coté, J.C.; Pélissier, T.; Deragon, J.M. The evolutionary origin and genomic organization of SINEs in Arabidopsis thaliana. Mol. Biol. Evol. 2001, 18, 2315–2322.
  44. Gilbert, C.; Pace, J.K., II; Waters, P.D. Target site analysis of RTE1_LA and its AfroSINE partner in the elephant genome. Gene 2008, 425, 1–8.
  45. Kojima, K.K. Different integration site structures between L1 protein-mediated retrotransposition in cis and retrotransposition in trans. Mob. DNA 2010, 1, 17.
  46. Wenke, T.; Döbel, T.; Sörensen, T.R.; Junghans, H.; Weisshaar, B.; Schmidt, T. Targeted identification of short interspersed nuclear element families shows their widespread existence and extreme heterogeneity in plant genomes. Plant Cell 2011, 23, 3117–3128.
  47. Noll, A.; Raabe, C.A.; Churakov, G.; Brosius, J.; Schmitz, J. Ancient traces of tailless retropseudogenes in therian genomes. Genome Biol. Evol. 2015, 7, 889–900.
  48. Kögler, A.; Schmidt, T.; Wenke, T. Evolutionary modes of emergence of short interspersed nuclear element (SINE) families in grasses. Plant J. 2017, 92, 676–695.
  49. Piskurek, O.; Austin, C.C.; Okada, N. Sauria SINEs: Novel short interspersed retroposable elements that are widespread in reptile genomes. J. Mol. Evol. 2006, 62, 630–644.
  50. Martin, F.J.; Amode, M.R.; Aneja, A.; Austine-Orimoloye, O.; Azov, A.G.; Barnes, I.; Becker, A.; Bennett, R.; Berry, A.; Bhai, J.; et al. Ensembl 2023. Nucleic Acids Res. 2023, 51, D933–D941.
  51. Khedkar, G.; Kambayashi, C.; Tabata, H.; Takemura, I.; Minei, R.; Ogura, A.; Kurabayashi, A. The draft genome sequence of the Brahminy blindsnake Indotyphlops braminus. Sci. Data 2022, 9, 410.
  52. Bao, W.; Kojima, K.K.; Kohany, O. Repbase Update, a database of repetitive elements in eukaryotic genomes. Mob. DNA 2015, 6, 11.
  53. Camacho, C.; Coulouris, G.; Avagyan, V.; Ma, N.; Papadopoulos, J.; Bealer, K.; Madden, T.L. BLAST+: Architecture and applications. BMC Bioinform. 2009, 10, 421.
  54. Bailey, T.L.; Elkan, C. Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proc. Int. Conf. Intell. Syst. Mol. Biol. 1994, 2, 28–36.
  55. Bailey, T.L.; Johnson, J.; Grant, C.E.; Noble, W.S. The MEME Suite. Nucleic Acids Res. 2015, 43, W39–W49.
  56. Katoh, K.; Rozewicki, J.; Yamada, K.D. MAFFT online service: Multiple sequence alignment, interactive sequence choice and visualization. Brief. Bioinform. 2019, 20, 1160–1166.
  57. Tamura, K.; Stecher, G.; Kumar, S. MEGA11: Molecular Evolutionary Genetics Analysis Version 11. Mol. Biol. Evol. 2021, 38, 3022–3027.
  58. Pasquesi, G.I.M.; Adams, R.H.; Card, D.C.; Schield, D.R.; Corbin, A.B.; Perry, B.W.; Reyes-Velasco, J.; Ruggiero, R.P.; Vandewege, M.W.; Shortt, J.A.; et al. Squamate reptiles challenge paradigms of genomic repeat element evolution set by birds and mammals. Nat. Commun. 2018, 9, 2774.
  59. Perry, B.W.; Card, D.C.; McGlothlin, J.W.; Pasquesi, G.I.M.; Adams, R.H.; Schield, D.R.; Hales, N.R.; Corbin, A.B.; Demuth, J.P.; Hoffmann, F.G.; et al. Molecular adaptations for sensing and securing prey and insight into amniote genome diversity from the garter snake genome. Genome Biol. Evol. 2018, 10, 2110–2129.
  60. Sookdeo, A.; Hepp, C.M.; Boissinot, S. Contrasted patterns of evolution of the LINE-1 retrotransposon in perissodactyls: The history of a LINE-1 extinction. Mob. DNA 2018, 9, 12.
  61. Felsenstein, J. Confidence limits on phylogenies: An approach using the bootstrap. Evolution 1985, 39, 783–791.
  62. Zhang, N.; Zeng, L.; Shan, H.; Ma, H. Highly conserved low-copy nuclear genes as effective markers for phylogenetic analyses in angiosperms. New Phytol. 2012, 195, 923–937.
  63. Hertweck, K.L.; Kinney, M.S.; Stuart, S.A.; Maurin, O.; Mathews, S.; Chase, M.W.; Gandolfo, M.A.; Pires, J.C. Phylogenetics, divergence times and diversification from three genomic partitions in monocots. Bot. J. Linn. Soc. 2015, 178, 375–393.
  64. Zheng, Y.; Wiens, J.J. Combining phylogenomic and supermatrix approaches, and a time-calibrated phylogeny for squamate reptiles (lizards and snakes) based on 52 genes and 4162 species. Mol. Phylogenet Evol. 2016, 94, 537–547.
  65. Piskurek, O.; Nishihara, H.; Okada, N. The evolution of two partner LINE/SINE families and a full-length chromodomain-containing Ty3/Gypsy LTR element in the first reptilian genome of Anolis carolinensis. Gene 2009, 441, 111–118.
  66. Chambeyron, S.; Bucheton, A.; Busseau, I. Tandem UAA repeats at the 3′-end of the transcript are essential for the precise initiation of reverse transcription of the I factor in Drosophila melanogaster. J. Biol. Chem. 2002, 277, 17877–17882.
  67. Kajikawa, M.; Okada, N. LINEs mobilize SINEs in the eel through a shared 3′ sequence. Cell 2002, 111, 433–444.
  68. Suh, A. The specific requirements for CR1 retrotransposition explain the scarcity of retrogenes in birds. J. Mol. Evol. 2015, 81, 18–20.
  69. Doucet, A.J.; Wilusz, J.E.; Miyoshi, T.; Liu, Y.; Moran, J.V. A 3′ poly(A) tract is required for LINE-1 retrotransposition. Mol. Cell 2015, 60, 728–741.
  70. Volff, J.N.; Körting, C.; Sweeney, K.; Schartl, M. The non-LTR retrotransposon Rex3 from the fish Xiphophorus is widespread among teleosts. Mol. Biol. Evol. 1999, 16, 1427–1438.
  71. Girbig, M.; Misiaszek, A.D.; Müller, C.W. Structural insights into nuclear transcription by eukaryotic DNA-dependent RNA polymerases. Nat. Rev. Mol. Cell Biol. 2022, 23, 603–622.
  72. Baldwin, E.T.; van Eeuwen, T.; Hoyos, D.; Zalevsky, A.; Tchesnokov, E.P.; Sánchez, R.; Miller, B.D.; Di Stefano, L.H.; Ruiz, F.X.; Hancock, M.; et al. Structures, functions and adaptations of the human LINE-1 ORF2 protein. Nature 2024, 626, 194–206.
  73. Tay, W.T.; Behere, G.T.; Batterham, P.; Heckel, D.G. Generation of microsatellite repeat families by RTE retrotransposons in lepidopteran genomes. BMC Evol. Biol. 2010, 10, 144.
  74. Piskurek, O.; Okada, N. Poxviruses as possible vectors for horizontal transfer of retroposons from reptiles to mammals. Proc. Natl. Acad. Sci. USA 2007, 104, 12046–12051.
  75. Del Re, B.; Giorgi, G. Long INterspersed element-1 mobility as a sensor of environmental stresses. Environ. Mol. Mutagen. 2020, 61, 465–493.
  76. Pizarro, J.G.; Cristofari, G. Post-transcriptional control of LINE-1 retrotransposition by cellular host factors in somatic cells. Front. Cell Dev. Biol. 2016, 4, 14.
  77. Furano, A.V.; Jones, C.E.; Periwal, V.; Callahan, K.E.; Walser, J.C.; Cook, P.R. Cryptic genetic variation enhances primate L1 retrotransposon survival by enlarging the functional coiled coil sequence space of ORF1p. PLoS Genet. 2020, 16, e1008991.
  78. Luqman-Fatah, A.; Miyoshi, T. Human LINE-1 retrotransposons: Impacts on the genome and regulation by host factors. Genes Genet. Syst. 2023, 98, 121–154.
Figure 1. Insertion signatures of the RTE-associated SINE and L1-clade LINE. The first DNA strand cleavage sites that correspond to a phosphodiester bond between the first nucleotide of a 5′ direct repeat and the adjacent 5′ flanking nucleotide on the complementary strand are indicated by a vertical line. (Upper) A thymine stretch appears approximately ten nucleotides upstream of the first nucleotide (thymine) of the TSDs in the RTE-associated SINE from soybean [29]. (Lower) A typical consensus 5′-TT|AAAA-3′ appears beside the first nucleotide (adenine) of the TSDs in the L1-clade LINE from Bos taurus (L1-BT; this study). MEME result for L1-BT with an interval of ≥7999 bases (361 TSDs). The vertical axis represents the information content of the site.
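The information content shown on the vertical axis of such motif logos can be illustrated with a minimal Python sketch. It assumes a uniform background over the four nucleotides and applies no small-sample correction, so the values may differ from those reported by MEME. The function name `information_content` and the example sites are hypothetical.

```python
import math
from collections import Counter


def information_content(sites):
    """Per-column information content (bits) of equal-length aligned DNA sites.

    Assumes a uniform background, so each column can carry at most
    log2(4) = 2 bits.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    bits = []
    for col in range(length):
        counts = Counter(site[col] for site in sites)
        total = sum(counts.values())
        # Shannon entropy of the observed nucleotide frequencies in this column.
        entropy = -sum((n / total) * math.log2(n / total)
                       for n in counts.values())
        bits.append(2.0 - entropy)
    return bits


# Hypothetical aligned sites around a cleavage position.
example_sites = ["TTAAAA", "TTAAAA", "TCAAAT", "TGAAAC"]
print(information_content(example_sites))
```

A fully conserved column reaches 2 bits, whereas a column with all four nucleotides at equal frequency carries 0 bits.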
Figure 2. Varying sequence divergence of Bov-B LINEs across squamate species. The panels on the left indicate the frequency distribution of sequence divergence, and the panels on the right indicate the frequency distribution of sequence length. The vertical axis represents the number of LINE copies. All hits resulting from the second BLAST for each species were used to generate graphs. The phylogenetic relationships between higher-level squamate clades and their ages were obtained from Zheng and Wiens [64].
Figure 4. Insertion signatures of the Bov-B LINEs and Sauria SINEs. (A) Comparison of Bov-B LINE motifs across reptilian species and cows. The first DNA strand cleavage sites that correspond to a phosphodiester bond between the first nucleotide of a 5′ direct repeat and the adjacent 5′ flanking nucleotide on the complementary strand are indicated by a vertical line. The first nucleotides were predominantly adenine for all species analyzed. Three consecutive thymines are located at approximately ten nucleotides upstream. A stretch of cytosine at approximately 12 nucleotides downstream corresponds to the beginning of the reptilian Bov-B LINE full-length sequence (Figure S5). (B) Comparison of the SINE motifs across reptilian species. Sauria SINEs with a Bov-B-shared 3′-end from P. muralis (upper), Pogona vitticeps (middle) and Varanus komodoensis (lower). The first DNA strand cleavage sites are indicated by a vertical line. The first nucleotide in the TSDs was predominantly thymine for all species analyzed. The nucleotide at approximately 12 nucleotides downstream corresponds to the beginning of a Sauria SINE copy sequence. The vertical axis represents the information content of the site. Motifs with low scores for the Bov-B LINEs from A. carolinensis and Moschus moschiferus and for the Sauria SINEs from I. braminus, Naja naja and Pseudonaja textilis are presented in Figure S7. The MEME results for the Bov-B and Sauria SINE from P. muralis are presented in Figures S8 and S9, respectively. Detailed information on the discovered motif sites and E-values is provided in Table S7.
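The two features described in this caption, the identity of the first TSD nucleotide and a thymine run roughly ten nucleotides upstream of it, can be checked for a single insertion site with a minimal Python sketch. The function name `tn_ta_features`, the window half-width `slack`, and the example sequences are illustrative assumptions rather than parameters taken from this study, and the sequence is assumed to be given on the strand shown in the motif.

```python
def tn_ta_features(seq, tsd_start, upstream_center=10, slack=3, run=3):
    """Summarize insertion-site features around the first TSD nucleotide.

    seq             -- genomic sequence containing the target site
    tsd_start       -- 0-based index of the first nucleotide of the TSD
    upstream_center -- expected distance (nt) of the thymine run upstream
    slack           -- tolerance (nt) on either side of that distance
    run             -- minimum number of consecutive thymines required
    """
    lo = max(0, tsd_start - upstream_center - slack)
    hi = max(0, tsd_start - upstream_center + slack + 1)
    window = seq[lo:hi]
    return {
        "first_nt": seq[tsd_start],
        "t_run_upstream": "T" * run in window,
    }


# Hypothetical site: TTT lies 10-12 nt upstream of an adenine at index 15.
site = "GCCTTTGCAGCCGGA" + "AGCTGAC"
print(tn_ta_features(site, tsd_start=15))
```

Applied to every detected TSD, such a summary gives the fraction of sites with adenine or thymine as the first nucleotide and the fraction with an upstream thymine run, which is the kind of tally that a motif search then tests statistically.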
Figure 5. Evolutionary time scale of the Tn-TA-containing RTE-clade LINEs. (A) Amplification timing of squamate Bov-B LINEs that exhibit the Tn-TA pattern. Bov-B ages for respective lineages were estimated from the mean value of the sequence divergence of all copies and its standard deviation (red dot and thin rectangle) or average distance of overall sequence pairs of selected long copies (blue dot), using a nucleotide substitution rate of 0.0013/site/million years [58,59,60] (Table 2). Phylogenetic relationships among squamates and their divergence times were taken from [64]. (B) Phylogenetic relationships among the RTE-clade LINEs that exhibit the Tn-TA pattern. Species referenced in this study: Bov-B_IBr (I. braminus), Bov-B (B. taurus), and RTE_BOV_B_AC_1 (A. carolinensis). Species referenced in [29]: RTE-1_EC (horse), RTE1_LA (elephant), RTE1_MT (Medicago), and RTE-1_GM (soybean). Taxa in which the respective LINEs were identified are denoted as colored shapes: mammals (circles in red), reptiles (purple), birds (orange), fish (blue), plants (green), echinoderms (black square), and nematodes (black triangle). The phylogenetic tree was constructed using the maximum likelihood method with the amino acid sequences of the ORF2 protein of LINEs. The bootstrap consensus tree was inferred from 500 replicates (Figure S10). Mammalian L1 LINEs were used as an outgroup. Divergence time estimation was performed using the monocot/eudicot divergence time of 140 MYA [62,63] and Toxicofera/Serpentes divergence time of 184.6 MYA [64] as calibration constraints (red squares). The motif on the right of the plants was taken from the soybean SINE [29].
Figure 6. Model of the initial stage of genomic integration of Bov-B LINEs and Sauria SINEs. Consecutive residues within microsatellite-like repeats in the LINE RNA (AA; right) or SINE RNA (UUU; left) influence whether Bov-B EN nicks one of the DNA strands at thymine or adenine. Moreover, they almost simultaneously (or alternatively) affect the efficiency of priming reverse transcription. DNA cleavage sites are indicated by a red arrowhead. EN: endonuclease, RT: reverse transcriptase.
Table 1. Copy Numbers of the LINEs and SINEs and Number of Analyzed TSDs.

| Retroposons | Species | Common Name | Genome Size (Mb) (1) | Number of Hits (2) | TSD Interval (Bases) (3) | TSDs (4) |
|---|---|---|---|---|---|---|
| Bov-B LINE | Salvator merianae | Argentine black and white tegu | 2026 | 3393 | ≥2999 | 1 |
| | Podarcis muralis | Common wall lizard | 1511 | 52,041 | ≥2999 | 353 |
| | Pogona vitticeps | Central bearded dragon | 1816 | 45,093 | ≥3099 | 62 |
| | Anolis carolinensis | Green anole | 1799 | 15,740 | ≥2999 | 41 |
| | Varanus komodoensis | Komodo dragon | 1508 | 31,077 | ≥2999 | 509 |
| | Indotyphlops braminus | Brahminy blindsnake | 1856 | 968 | ≥2999 | 123 |
| | Naja naja | Indian cobra | 1769 | 2370 | ≥2999 | 3 |
| | Pseudonaja textilis | Eastern brown snake | 1590 | 1919 | ≥2999 | 2 |
| | Notechis scutatus | Mainland tiger snake | 1666 | 1981 | ≥2999 | 2 |
| | Laticauda laticaudata | Blue-lipped sea krait | 1559 | 668 | ≥2999 | 3 |
| | Bos taurus | Cow | 2716 | 365,688 | ≥3699 | 153 |
| | Bison bison bison | American bison | 2954 | 346,426 | ≥3499 | 82 |
| | Capra hircus | Goat | 2923 | 292,528 | ≥3499 | 487 |
| | Ovis aries | Sheep | 2870 | 334,914 | ≥3499 | 408 |
| | Moschus moschiferus | Siberian musk deer | 3070 | 357,380 | ≥3499 | 105 |
| | Cervus hanglu yarkandensis | Yarkand deer | 2594 | 219,829 | ≥3499 | 38 |
| Sauria SINE | Salvator merianae | Argentine black and white tegu | 2026 | 0 | n/a | n/a |
| | Podarcis muralis | Common wall lizard | 1511 | 77,228 | ≥99 | 40,838 |
| | Pogona vitticeps | Central bearded dragon | 1816 | 10,602 | ≥99 | 5807 |
| | Anolis carolinensis | Green anole | 1799 | (5) 78,442 | ≥99 | (5) 33,597 |
| | Varanus komodoensis | Komodo dragon | 1508 | 845 | ≥99 | 378 |
| | Indotyphlops braminus | Brahminy blindsnake | 1856 | 497 | ≥99 | 129 |
| | Naja naja | Indian cobra | 1769 | 1041 | ≥99 | 233 |
| | Pseudonaja textilis | Eastern brown snake | 1590 | 986 | ≥99 | 227 |
| | Notechis scutatus | Mainland tiger snake | 1666 | 408 | ≥99 | 86 |
| | Laticauda laticaudata | Blue-lipped sea krait | 1559 | 448 | ≥99 | 90 |
(1) Number of nucleotides in the genome assembly. (2) Number of LINE and SINE copies was determined based on a second BLAST search. The initial survey results with a specific sequence as the query are presented in Table S3. (3) Interval of the 5′ and 3′ direct repeats used for each TSD search. (4) Number of TSDs with the interval indicated on the left. The number of direct repeats detected at different intervals for each species is presented in Table S5 (for LINE) and Table S6 (SINE). (5) Nishiyama E, Ohshima K. [29]. n/a: Not available.
Table 2. Sequence Divergence and Estimated Ages of Squamate Bov-B.
| Species | Mean Pairwise Divergence (%) (1) | Mean Divergence from a Copy (% ± SD) (2) | Age (MYA) (3) |
|---|---|---|---|
| Salvator merianae | 25.0 | 21.3 ± 3.6 | 133 (111–156) |
| Podarcis muralis | 3.0 | 9.7 ± 3.7 | 61 (37–84) |
| Pogona vitticeps | 6.0 | 12.6 ± 2.9 | 78 (60–97) |
| Anolis carolinensis | 7.0 | 9.5 ± 3.1 | 60 (40–79) |
| Varanus komodoensis | 8.0 | 9.8 ± 3.2 | 61 (42–81) |
| Indotyphlops braminus (4) | 1.0 | 1.4 ± 3.3 | 9 (0–30) |
| Indotyphlops braminus (5) | n/a | 21.4 ± 1.6 | 134 (124–144) |
| Naja naja | 11.0 | 14.3 ± 3.3 | 89 (69–110) |
| Pseudonaja textilis | 11.0 | 14.8 ± 3.0 | 92 (74–111) |
| Notechis scutatus | 11.0 | 13.2 ± 2.9 | 82 (64–101) |
| Laticauda laticaudata | 15.0 | 15.2 ± 2.4 | 95 (80–110) |
(1) Mean pairwise divergence between copies of >3000 nucleotides. (2) Mean divergence of squamate Bov-B copies from a representative Bov-B copy (Table S4). The representative squamate copies appear in Figure 3. SD: standard deviation. (3) Bov-B age for a species was estimated from the mean sequence divergence of all copies using a nucleotide substitution rate of 0.0013 per site per million years [58,59,60]. (4) Column (1): mean of the 159 copies with length > 3 kb; column (2): mean of all 968 copies. (5) Mean of the 23 copies with the largest divergence (Subfamily II; Figure S4).
Table 3. Insertion Signature of Bov-B LINEs and Overall Proportion of Copies Exhibiting Low Divergence.
| Species | Length > 3 kb (1) | Divergence < 0.15 (2) | Tn-TA Trend (3) |
|---|---|---|---|
| Salvator merianae | 0.003 (11) | 0.054 (183) | |
| Podarcis muralis | 0.008 (435) | 0.960 (49,938) | +(265/353) |
| Pogona vitticeps | 0.004 (164) | 0.815 (36,760) | +(37/62) |
| Anolis carolinensis | 0.004 (65) | 0.954 (15,020) | +(37/41) (4) |
| Varanus komodoensis | 0.035 (1076) | 0.934 (29,019) | +(313/509) |
| Indotyphlops braminus | 0.164 (159) | 0.976 (945) | +(108/123) |
| Naja naja | 0.004 (9) | 0.574 (1360) | |
| Pseudonaja textilis | 0.003 (5) | 0.492 (944) | |
| Notechis scutatus | 0.004 (8) | 0.767 (1519) | |
| Laticauda laticaudata | 0.007 (5) | 0.439 (293) | |
| Bos taurus | 0.022 (8199) | 0.850 (310,773) | +(42/153) |
| Bison bison bison | 0.008 (2632) | 0.837 (289,932) | |
| Capra hircus | 0.032 (9244) | 0.702 (205,434) | |
| Ovis aries | 0.029 (9554) | 0.593 (198,506) | |
| Moschus moschiferus | 0.012 (4265) | 0.806 (287,877) | +/−(27/105) (5) |
| Cervus hanglu yarkandensis | 0.023 (5005) | 0.599 (131,607) | |
(1) Ratio of the number of copies with length > 3000 nucleotides to the total number of copies. The number of copies with length > 3 kb is indicated in parentheses. (2) Ratio of the number of copies with sequence divergence less than 0.15 to the total number of copies. The number of copies with sequence divergence < 0.15 is indicated in parentheses. (3) Species exhibiting a Tn-TA pattern in the TSDs are indicated by +. The number of sites contributing to the construction of the motif and number of analyzed TSD sites are indicated in parentheses. (4) Motif was statistically significant; however, the E-value was not significantly low (Figure S7A, Tables S5 and S7). (5) A Tn-TA-like pattern was observed; however, the E-value was not significant (Figure S7A, Table S7).
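The ratios in Table 3 can be reproduced from the counts given in parentheses and the Number of Hits column of Table 1. The following is a minimal sketch for three of the species; the dictionary layout and the function name `proportions` are illustrative choices, not part of the original analysis.

```python
# Counts copied from Table 1 (number of hits) and Table 3 (values in parentheses).
# The dictionary layout is an illustrative assumption.
counts = {
    "Podarcis muralis": {"hits": 52041, "long": 435, "low_div": 49938},
    "Indotyphlops braminus": {"hits": 968, "long": 159, "low_div": 945},
    "Bos taurus": {"hits": 365688, "long": 8199, "low_div": 310773},
}


def proportions(entry):
    """Return (fraction of copies > 3 kb, fraction with divergence < 0.15)."""
    hits = entry["hits"]
    return round(entry["long"] / hits, 3), round(entry["low_div"] / hits, 3)


for species, entry in counts.items():
    long_frac, low_div_frac = proportions(entry)
    print(f"{species}: >3 kb = {long_frac:.3f}, divergence < 0.15 = {low_div_frac:.3f}")
```

For these three species the computed values agree with the tabulated ratios to three decimal places.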
Table 4. Correlation of 3′-end Microsatellite-Like Sequences and First Nucleotide of TSDs.
| | Name | References | Species | 3′ Repeat | TSD (1) |
|---|---|---|---|---|---|
| LINE | Bov-B_PMu | This study | Podarcis muralis | (CAA)2–6 | A |
| LINE | Bov-B_PVi | This study | Pogona vitticeps | (CAA)3/(CA)4–8 | A |
| LINE | RTE_BOV_B_AC_1 | Repbase | Anolis carolinensis | (GCA)2–4 | A [29] |
| LINE | Bov-B_VKo | This study | Varanus komodoensis | (CAA)2–4 | A |
| LINE | Bov-B_IBr | This study | Indotyphlops braminus | (CAA)2–5 | A |
| LINE | Bov-B (BovB) | Repbase | Bos taurus | (CTGAA)3–5/(CTGAT)3–6 | A [29] |
| LINE | Bov-B_MMo | This study | Moschus moschiferus | (CTGAA)3–4 | A |
| SINE | Sauria_POM | Repbase; Piskurek et al. [49] | Podarcis muralis | (ACCTTT)1–2 | T |
| SINE | Sauria_PVi | This study | Pogona vitticeps | (ACCTTT)1–3 | T |
| SINE | Sauria_ACA | Repbase; Piskurek et al. [49] | Anolis carolinensis | (ACCTTT)2–4 | T [29] |
| SINE | Sauria_VKo | This study | Varanus komodoensis | (ACCTTT)1–3 | T |
| SINE | Sauria_IBr | This study | Indotyphlops braminus | (ACCTTT)1–2 | T |
| SINE | Sauria_NNa | This study | Naja naja | (ACCTTT)1–2 | T |
| SINE | Sauria_PTe | This study | Pseudonaja textilis | (ACCTTT)1–3 | T |
(1) Microsatellite-like sequences at 3′-ends of the SINEs and LINEs consist of a stretch of T or A along with other nucleotides. The first nucleotides of TSDs and repeat nucleotides within the microsatellite-like sequences are consistent across many species.
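The correspondence summarized in Table 4 can be checked with a short script. This is only a sketch under stated assumptions: it treats the "repeat nucleotide" as whichever of A or T forms the longer run in the tandemly repeated unit, a rule chosen here for illustration and not taken from the analysis itself, and it uses only the first-listed repeat unit for each element. The function name `dominant_at_residue` is hypothetical.

```python
import re


def dominant_at_residue(unit):
    """Return 'A' or 'T', whichever forms the longer run in the tandem repeat.

    Illustrative rule (an assumption, not the article's method).
    Returns None when the two longest runs are of equal length.
    """
    tandem = unit * 3  # repeat the unit so runs spanning unit boundaries count
    longest = {}
    for base in "AT":
        runs = re.findall(f"{base}+", tandem)
        longest[base] = max((len(run) for run in runs), default=0)
    if longest["A"] == longest["T"]:
        return None
    return "A" if longest["A"] > longest["T"] else "T"


# (element, first-listed 3' repeat unit, first nucleotide of TSD) from Table 4
rows = [
    ("Bov-B_PMu", "CAA", "A"),
    ("RTE_BOV_B_AC_1", "GCA", "A"),
    ("Bov-B_MMo", "CTGAA", "A"),
    ("Sauria_POM", "ACCTTT", "T"),
    ("Sauria_NNa", "ACCTTT", "T"),
]

for name, unit, tsd_first in rows:
    match = dominant_at_residue(unit) == tsd_first
    print(f"{name}: unit={unit}, TSD starts with {tsd_first}, consistent={match}")
```

Under this simple rule, each listed unit yields the same residue as the first nucleotide of the corresponding TSDs.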