Cytogenomics Unveil Possible Transposable Elements Driving Rearrangements in Chromosomes 2 and 4 of Solea senegalensis

Cytogenomics, the integration of cytogenetic and genomic data, has been used here to reconstruct the evolution of chromosomes 2 and 4 of Solea senegalensis. S. senegalensis is a flatfish with a karyotype comprising 2n = 42 chromosomes: 6 metacentric + 4 submetacentric + 8 subtelocentric + 24 telocentric. The Fluorescence in situ Hybridization with Bacterial Artificial Chromosomes (FISH-BAC) technique was applied to locate BACs in these chromosomes (11 and 10 BACs in chromosomes 2 and 4, respectively) and to generate integrated maps. Synteny analysis, taking eight reference fish species (Cynoglossus semilaevis, Scophthalmus maximus, Sparus aurata, Gasterosteus aculeatus, Xiphophorus maculatus, Oryzias latipes, Danio rerio, and Lepisosteus oculatus) for comparison, showed that the BACs of these two chromosomes of S. senegalensis were mainly distributed in two principal chromosomes in the reference species. Transposable element (TE) analysis showed significant differences between the two chromosomes in terms of number of loci per Mb, coverage, and the class of TE (I or II) present. Analysis of TE divergence in chromosomes 2 and 4 compared to their syntenic regions in four reference fish species (C. semilaevis, S. maximus, O. latipes, and D. rerio) revealed differences in their age of activity compared with those species but less notable differences between the two chromosomes. Differences were also observed in peaks of divergence and coverage of TE families for all reference species, even in those close to S. senegalensis, like S. maximus and C. semilaevis. Considered together, these results indicate that chromosomes 2 and 4 have evolved by Robertsonian fusions, pericentric inversions, and other chromosomal rearrangements mediated by TEs.


Introduction
Cytogenomics is a methodology that integrates cytogenetic and genomic data. This approach emerged to advance studies of the relationship between chromosomes and diseases in humans, but it has been extended to other species due to its potential value for studies of evolution [1]. The wide variability of karyotype observed among species in cytogenetic studies has prompted researchers to investigate the molecular mechanisms that underlie chromosome structure and function. Next Generation Sequencing (NGS) techniques have allowed us to characterize the genome of many organisms and to perform sequence-based comparisons between them. The integration of sequencing and mapping data across the genome is helping us to visualize past rearrangement events and to assess synteny among species [2]. The availability of more data from a greater number of species should help to clarify intra- and interspecific variation and its relationship to environmental conditions.
In Pleuronectiformes, the diploid chromosome number ranges from 2n = 28, observed in the Paralichthyidae Citharichthys spilopterus, to 2n = 48, found in most of the Pleuronectidae species [21]. This variability has been explained by the occurrence of Robertsonian fusions and chromosome inversions during the course of the evolution of Pleuronectiformes [21,26].
The Senegalese sole (Solea senegalensis (Kaup, 1858)) is a flatfish with an oval and asymmetric body, belonging to the order Pleuronectiformes. The species is widely distributed in the Atlantic, from the Gulf of Biscay to the northwest coast of Africa, and in Mediterranean waters, from the Strait of Gibraltar to Tunisia; it has good potential for marine aquaculture given its high demand and profitable price. However, several issues hamper its production: (1) high larval mortality, (2) sub-optimal larval weaning strategies, and (3) disease control. In recent years, considerable efforts have been devoted to understanding the genetics and genomics of this species. The karyotype of this species comprises 2n = 42 chromosomes: 6 metacentric (M) + 4 submetacentric (SM) + 8 subtelocentric (ST) + 24 telocentric (T), and its Fundamental Number (FN) is 60 [27]. Its chromosomes are very small, with sizes ranging from 0.5 to 1 µm [28]. The largest metacentric chromosome, chromosome 1, has been proposed as a proto-sex chromosome that originated through a Robertsonian fusion [3]. The rest of the chromosomes have BAC-based markers that enable them to be distinguished [9]. Their characterization can provide important clues, considering that S. senegalensis is a non-model species with limited genomic resources available, and can also contribute to understanding the karyotype evolution of Pleuronectiformes. Hence, the objective of the present paper is to study the evolution of chromosomes 2 and 4 of S. senegalensis (classified as metacentric and submetacentric, respectively) by comparing them with other available species and analyzing their repetitive elements.

Synteny Analysis for Metacentric Chromosome 2
The synteny analysis for the eight species used as reference is presented in Figures S1-S8, and summarized in Figure 3. The taxonomic relationships of these species are shown in Figure S9. A high degree of conservation was observed in different regions for the genes in each BAC of S. senegalensis, and 7 out of 11 BACs were distributed in two chromosomes, except in D. rerio and L. oculatus. BACs 60P19, 46C5, 36I3, and 4D15 (arm 1 of S. senegalensis) and BAC 21O23 (arm 2) were located in the same chromosome in C. semilaevis, S. maximus, G. aculeatus, X. maculatus, and O. latipes, and in the same genomic region for C. semilaevis and S. maximus. BACs 52G10 and 38N10, which were positioned in different arms on chromosome 2 of S. senegalensis, were located in a second chromosome in all species, except in S. maximus where they were located in two different chromosomes. For all species, gene regions for BACs 60P19 and 46C5 showed a high degree of conservation, except for D. rerio in which six out of nine genes of 60P19 were located in the chromosome far from the BAC 60P19, and only one gene of BAC 46C5 was located near BAC 60P19 (see Figure S7). The other BACs mapped to chromosome 2, namely 6P22, 9E8 (both situated in arm 1), 3F15, and 19L16 (in arm 2), were located in four different chromosomes, except in D. rerio and L. oculatus, in which the genes of BAC 6P22 were distributed in two different chromosomes. A comparative mapping graph plot (Figure 5a) was used to distribute the nodes (BACs and syntenic fish species chromosomes) in the plane. Nodes sharing more connections (syntenic positions with other fish chromosomes) are closer to each other. Figure 5a shows two BACs (9E8 and 19L16) with single connections to chromosomes from other fishes, three BACs (3F15, 6P22, and 38N10) with a few connections and six BACs (52G10, 60P19, 46C5, 36I3, 4D15, and 21O23) with multiple connections, showing the conserved syntenic regions in other fish chromosomes.
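The idea behind the mapping graph, in which BACs are linked to the reference chromosomes where their genes are found, can be illustrated with a minimal Python sketch. The edge list below uses placeholder labels (BAC_A, sp1_chr3, and so on) that are purely hypothetical and are not data from this study; the function names are likewise illustrative, and the layout software used for Figure 5 is not reproduced here.

```python
from collections import defaultdict

# Hypothetical (BAC, reference chromosome) syntenic links; the labels are
# placeholders for illustration only, not results from this study.
edges = [
    ("BAC_A", "sp1_chr3"),
    ("BAC_A", "sp2_chr7"),
    ("BAC_A", "sp3_chr1"),
    ("BAC_B", "sp1_chr3"),
    ("BAC_B", "sp2_chr7"),
    ("BAC_C", "sp4_chr12"),
]


def partners_by_bac(links):
    """Map each BAC to the set of reference chromosomes it is linked to."""
    partners = defaultdict(set)
    for bac, chrom in links:
        partners[bac].add(chrom)
    return partners


def connection_counts(links):
    """Number of distinct reference chromosomes connected to each BAC."""
    return {bac: len(chroms) for bac, chroms in partners_by_bac(links).items()}


def shared_partners(links, bac1, bac2):
    """Reference chromosomes connected to both BACs (empty set if none)."""
    partners = partners_by_bac(links)
    return partners[bac1] & partners[bac2]


counts = connection_counts(edges)
common = shared_partners(edges, "BAC_A", "BAC_B")
```

In a layout of this kind, BACs that share many reference chromosomes (here BAC_A and BAC_B) would be drawn close together, whereas a BAC with a single, unshared connection (BAC_C) would appear as an isolated node.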

Synteny Analysis for Submetacentric Chromosome 4
The results derived from the synteny analysis of chromosome 4 are presented in Figures S10-S17, and summarized in Figure 4. In total, 8 out of 10 BACs (3C15, 46B2, 30J4, 12D24, 8A23, 36J2, 36H3, and 36H2) mapped to the "q" arm and were located in one chromosome for all reference species. BAC 12N15, placed in the "p" arm, was located in a different chromosome in all reference species except for L. oculatus in which the genes of this BAC were distributed on two chromosomes. The BACs 3C15, 46B2, 30J4, 12D24, 8A23, 36J2, 36H3, and 36H2 appeared in a different order along the chromosome in each of the species, but the region with BACs 46B2, 30J4, and 12D24 was highly conserved and the BACs followed the same order in all the species except for D. rerio and L. oculatus. It is noteworthy that BAC 30J4 (19 genes) showed a translocation of ten genes (nlrc3, wdr90, rhot2, H1.0B, rhbdl1, wdr24, anks3, c8orf33, H3.3, and gcgr) in all the species (see Figures S10-S17). These BACs were located on one chromosome in all species apart from L. oculatus in which they were positioned on two chromosomes. Thus, only nine genes from BAC 30J4 (pcdh8, ednrb, cog3, mid1, arhgap6, tlr7, tlr8, tyb12, and egfl6) remained in the conserved region for BACs 46B2, 30J4, and 12D24. The genes of BAC 46P22, which maps to the "q" arm of S. senegalensis, were positioned in a different chromosome in all the reference species. The mapping graph plot for chromosome 4 (Figure 5b) showed one BAC clone (46P22) with a single connection to the chromosomes of the other species; two BACs (12N15 and 30J4) had only a few connections with other fish chromosomes; and the remaining S. senegalensis BAC clones presented multiple connections, creating a tight cluster of nodes.

Distribution of Repeated Sequences
Chromosome 2 displayed the highest coverage values (% of repetitive elements per BAC) in the TE analysis by BAC and chromosome. Among chromosome 2 BACs, 4D15, 36I3, and 52G10 showed coverage values of 9.22, 8.54, and 8.5%, respectively. BAC 52G10 presented a high percentage of simple repeats, and the presence of satellites was observed for BAC 4D15 (Figure 6a and Table S1). The number of loci per Mb (NL/Mb) was proportional to coverage, but 52G10 showed higher values than the other BAC clones (1189.5 NL/Mb), mostly caused by the presence of multiple simple-repeat loci (Figure 6c and Table S2). In chromosome 4, the coverage of repetitive elements was lower than in chromosome 2. However, BACs located in the subtelomeric region of the "q" arm of this chromosome (46P22, 36J2, 36H2, and 36H3) showed coverage and NL/Mb values up to five times higher than the other BACs analyzed. These values correspond to both TEs and simple repeats (Figure 6b,d and Tables S3 and S4).
After grouping all BAC sequences for each chromosome, the analysis of repetitive elements was carried out. Chromosome 4 had more retroelements (NL/Mb), with higher coverage, than chromosome 2 (Table 2), mainly due to the presence of almost twice the number of LTR elements (retroviral and Gypsy/DIRS1). However, DNA transposons showed higher coverage and NL/Mb in chromosome 2, mostly due to hobo-Activator and Tc1-IS630-Pogo elements; RTE/Bov-B LINE elements also showed high values of NL/Mb and coverage in chromosome 2 (Table 2 and Figure S18).

Analysis of the Transposable Elements Divergence
The most frequent Kimura divergence values for TEs between S. senegalensis chromosome 2 and syntenic regions in C. semilaevis ranged from 16 to 21% across all repeat classes, suggesting a relatively recent transposition burst across all major TE types. DNA/CMC-EnSpm, DNA/Kolobok and DNA/hAT-Ac elements showed the highest coverage (Table 3, Figure 7a and Figure S19a).
The divergence peak of TEs in chromosome 4 between S. senegalensis and C. semilaevis was less pronounced, with the highest coverage at divergences between 22 and 27% (Figure 7b and Figure S19b). The coverage over the whole divergence range was lower than that of chromosome 2, indicating fewer TEs in common between the two species. A retroelement (LTR/Ngaro) presented low values of divergence (<10%). DNA/Kolobok and DNA/hAT-Charlie were the most abundant elements at the most frequent divergence values.
When the TEs from S. senegalensis chromosome 2 were used to analyze divergence in repetitive elements in syntenic regions of S. maximus, the most frequent divergence observed ranged between 9 and 13%, the lowest divergence values obtained in our study. Over the total range, DNA/CMC-EnSpm, DNA/Maverick, DNA/hAT-Ac and LTR/Gypsy elements had high coverage. In chromosome 4 analyses, the peaks of divergence were ∼12-14%, with DNA/CMC-EnSpm, DNA/Ginger-1 and LTR/ERV-1 elements presenting higher coverage than other elements. A LINE (long interspersed nuclear element) family (Rex-Babar) from chromosome 4 presented a high degree of divergence (40%) between S. senegalensis and S. maximus, indicating a putative strong selection of this retroelement (Table 3, Figure 7d and Figure S19d).
The analysis of TEs from S. senegalensis chromosome 2 and syntenic regions in O. latipes revealed the widest spread of divergence values found in this study, with ill-defined peaks of divergence. Figure 7e and Figure S19e show very high coverage peaks, ranging between 23 and 38% (Table 3). The greatest divergence values correspond to two DNA transposon families: DNA/PIF-Harbinger and DNA/TcMar-Tc1. In contrast, the DNA/Ginger-1 family showed a null divergence value, indicating high evolutionary conservation between these species and DNA regions. In the TE divergence analysis of S. senegalensis chromosome 4 with its syntenic regions of O. latipes, the most frequent divergence values were around 20 and 21%, with DNA/CMC-EnSpm and RC/Helitron showing the greatest coverage. A second peak with higher, but more discontinuous, divergence and lower coverage values was found at 34, 37, and 39%, with the most representative families being DNA/TcMar-1, LINE/Rex-Babar and LTR/ERV. Again, a single family, DNA/Ginger-1, exhibited high coverage and a low divergence value (7%). LINE and SINE (short interspersed nuclear element) elements throughout this chromosome revealed the greatest degree of divergence (Figure 7f and Figure S19f).
Finally, the divergence analysis of TEs of D. rerio syntenic regions with S. senegalensis chromosome 2 presented a main peak at 24-27%, with minor peaks around 11-15% (Figure 7g and Figure S19g). The most abundant families for the most frequent divergence values were DNA/Kolobok-T2 and DNA/hAT-Ac. For the minor peaks, the most representative families were SINE/tRNA-V and RC/Helitron. A single divergence value of 12% with high coverage was observed for the DNA/Kolobok and DNA/hAT-Ac families. For chromosome 4, the most frequent divergence values with the syntenic region of D. rerio were around 24-27%. The most abundant families across all divergence peak values were RC/Helitron and a SINE/tRNA-V subfamily (Figure 7h and Figure S19h).

Discussion
In this work, the evolution of the metacentric chromosome 2 and submetacentric chromosome 4 of S. senegalensis has been studied. Synteny and repetitive elements were analyzed in 11 and 10 BACs of chromosomes 2 and 4, respectively. Considering the synteny results for both chromosomes as a whole, BACs map to two main chromosomes in the reference species, indicating that metacentric chromosome 2 and submetacentric chromosome 4 could have been formed by Robertsonian fusions, pericentric inversions and other chromosomal rearrangements. In a previous study, comparable results in the distribution of BACs were observed for chromosome 1 of S. senegalensis [2,3,31].
One consequence of these fusions would be the reduction of the number of chromosomes in S. senegalensis (2n = 42) from the plesiomorphic condition of 2n = 48 in teleosts. Fish families have followed distinct evolutionary paths in relation to the number of chromosomes. The families Haemulidae, Lutjanidae, and Sciaenidae of the order Perciformes, for example, present remarkable karyotype conservation, and the ancestral condition or karyotype stasis is maintained (2n = 48, FN = 48) [32]. In contrast, other orders such as Tetraodontiformes and Gasterosteiformes exhibit a reduction in chromosome numbers due to fusion of pairs of ancestral chromosomes [33]. For the family Batrachoididae (order Batrachoidiformes), the ancestral condition has been reported to be 2n = 46 instead of 2n = 48, and for the species Porichthys plectrodon, the presence of a pair of large metacentric chromosomes in the karyotype suggests a Robertsonian translocation between two acrocentric chromosomes in the evolution of the karyotype toward the current 2n = 44 [34].
The order Pleuronectiformes shows variation in the number of chromosomes [21]. In addition, species with the same number of chromosomes differ in the FN, including, for example, C. semilaevis and Trinectes inscriptus (2n = 42). While all chromosomes in C. semilaevis are acrocentric, the karyotype of T. inscriptus is formed by three large metacentric, one submetacentric and several subtelocentric chromosome pairs. The metacentric chromosomes probably originated from chromosome fusions, while the submetacentric and other subtelocentric pairs originated from pericentric inversions, probably from six ancient acrocentric chromosome pairs [14,35]. The karyotype of S. maximus (2n = 44) comprises 3 pairs of M/SM and 19 pairs of ST/T chromosomes and differs in one chromosome pair from S. senegalensis [36]. These kinds of differences are also observed within the Soleidae family. For example, Dagetichthys lusitanica has 2n = 42 (FN = 50), with two metacentric, two submetacentric and 17 telocentric chromosome pairs, and Dicologlossa cuneata has 2n = 50 (FN = 54), with one large metacentric chromosome and several smaller ones, and 23 telocentric pairs. Zoo-FISH studies with these two species indicated that chromosome 1 of S. senegalensis originated from the fusion of two acrocentric chromosomes found in the karyotype of both D. cuneata and D. lusitanica. This chromosome pair has been proposed as a proto-sex chromosome in S. senegalensis [3]. The results presented in this paper suggest that the evolution of chromosomes 2 and 4 depends on the genomic context of TEs, which are responsible for the interchange and rearrangement of blocks of DNA.
The transposable elements analysis of S. senegalensis chromosomes 2 and 4 measured the abundance of different TE classes. Chromosome 2 showed a greater abundance of class II elements (DNA transposons), in terms of NL/Mb and coverage, than chromosome 4. In most fish genomes, Class II DNA transposons are the most abundant component [5,37], although many TE superfamilies are present in this group of organisms, presenting evidence of greater diversity than in other vertebrates [8]. Among TE families, Tc/mariner, hAT, L1, L2, and Gypsy are the most widespread and predominant TE superfamilies in Actinopterygian genomes [38,39]. However, some organisms present a predominance of specific TE superfamilies [5]; thus they could have played a pivotal role in their evolution. The TE abundance observed in chromosome 2 (Table 2) might have facilitated the multiple chromosomal rearrangements, such as pericentric inversions, leading to the formation of this chromosome (Figure 8) [14,31].
The analysis of two S. senegalensis chromosomes has shown the differences between them in the presence of classes and families of TEs. Retroviral LTR elements, hobo-Activator and Tc1-Pogo DNA transposons and LINE elements such as RTE/Bov-B and L1/CIN4 LINE showed 2-to-9 fold differences, in terms of NL/Mb and coverage. These findings indicate that these TEs could play a main role in their differentiation and evolution. Moreover, the TE analysis per mapped BACs, on various chromosomes, has allowed us to analyze the distribution on specific chromosome arms.
The results show a general pattern of high TE abundance (measured as NL/Mb and coverage) next to telomeric and centromeric regions, as described previously for S. senegalensis chromosome 1 [9]. However, one BAC (46P22) located at an interstitial position on chromosome 4 presented a high abundance of TEs when compared with other BACs analyzed in this and in a previous S. senegalensis study [9]. It is important to note that none of the studied BACs was located in the centromere of the chromosomes; as centromeres are known to be rich in repeat sequences, further studies including them could complete the data shown in this paper.
The comparative mapping net plot revealed an isolated node cluster for this BAC, reflecting a different evolution process in that BAC region. The syntenic regions of this BAC in other fishes are always found in telomeric locations, so this chromosome could have kept its abundance of TEs during evolution. High TE abundance in these chromosomes could also be associated with selection events such as in the evolution of sex chromosomes, as found in S. senegalensis chromosome 1, where TEs could account for the evolution of the putative sex-determining chromosome of this species, although sex chromosomes evolve differently from autosomes [2]. The number of Simple Sequence Repeat (SSR) loci was in the range found in previous S. senegalensis studies (∼400-1000 NL/Mb) [9], except for telomeric BACs in chromosome 4, which displayed higher values (1000-1700 NL/Mb). The SSR coverage in these two chromosomes (1.5-2.2%) was slightly higher than in other chromosomes, as found in a previous analysis of S. senegalensis [9]. These values are slightly lower than those in the green puffer fish T. nigroviridis, where SSRs account for 3.21% of the genome [40], but higher than that found in the genome of the fugu puffer fish T. rubripes (1.29%) [41]. The relationship between meiotic recombination and TEs has been discussed recently [42], and its role in the coverage and distribution of TEs in particular chromosome regions of S. senegalensis could provide insights into their genome dynamics and evolution.
In order to estimate the divergence and "age" history of TEs for the syntenic regions of S. senegalensis chromosomes 2 and 4 and those of four other fish species, Kimura distances were calculated for all TE copies. Divergence is correlated with the age of the activity [8]: low K-values (similar TEs) are indicative of more recent activity (left side of the graphics), while high K-values (divergent TEs) reflect more ancient transposition events (right side of the graphics). All syntenic regions analyzed in the other fish species, for both chromosomes 2 and 4, have been strongly shaped by DNA transposons (Class II), except for syntenic regions of chromosome 2 in S. maximus, which present the most amplifications of LTR elements (Class I), with a major and recent burst of activity (K-value ∼10) (Figure S19). In contrast, ancient amplifications (K-value around 37) of LINEs were observed in the syntenic region in O. latipes for S. senegalensis chromosome 4. Syntenic regions on chromosomes 2 and 4 showed a major burst of activity for more recent copies in S. maximus, and a more ancient amplification event in the other analyzed flatfish, C. semilaevis, in relation to S. senegalensis. For chromosome 2, D. rerio revealed two major bursts of activity, with ancient copies predominant and fewer recent copies (Figure S19). In teleosts as a whole, significant interspecific differences in TE divergence have been observed [43], generally with one or two bursts of transposition [5,8]. Teleost genomes generally contain fewer ancient copies (K-values >25) than the genomes of other organisms such as mammals, suggesting differences in the process of elimination [8]. All these data show a differentiation in the divergence values of TEs for both chromosomes in comparison with other teleosts, revealing this method as efficient and useful for future analyses of the evolution of transposable elements in the genome of soles.
This paper provides data about karyotype evolution in Pleuronectiformes, for which cytogenomics information is scarce. We conclude that chromosomes 2 and 4 of S. senegalensis evolved by Robertsonian fusion accompanied by some additional rearrangements that could be mediated by TEs, given the high number found in the studied sequences. A genome comparison showed more similarities between S. senegalensis and C. semilaevis than with S. maximus. However, synteny results suggested otherwise, indicating that the evolution of the S. senegalensis karyotype has been notably different from those other species.

Repetitive Elements Analysis
After BAC clone mapping, a statistical analysis of repetitive elements was carried out using a homology-based approach with the Repbase database (release 23.07) and RepeatMasker software v.4.0.9 (hereafter RM) [49]. The repetitive elements analyzed were: DNA transposons, retroelements, low complexity, simple repeats, and satellite sequences. Coverage, including that of low-complexity elements and satellite DNA, was measured as the quantity of repeat sequence (bp) relative to the length of the BAC sequences analyzed (%), and the abundance of TEs was calculated, in relation to the BAC sequence length, as the total number of identified loci per Mb (NL/Mb).
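The two measures described above can be illustrated with a minimal Python sketch. It assumes that repeat annotations are available as 1-based inclusive (start, end) intervals on a sequence of known length, and that overlapping annotations should be counted only once when computing coverage; both are assumptions made for the example, and the function names and the interval values are hypothetical rather than taken from this study.

```python
def covered_bp(intervals):
    """Count bases covered by at least one interval (1-based, inclusive).

    Overlapping intervals are merged so that no base is counted twice.
    """
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Close the previous merged block, if any, and open a new one.
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlap: extend the current merged block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


def repeat_stats(intervals, sequence_length_bp):
    """Return (coverage in %, number of loci per Mb) for one sequence."""
    if sequence_length_bp <= 0:
        raise ValueError("sequence length must be positive")
    coverage_pct = 100.0 * covered_bp(intervals) / sequence_length_bp
    loci_per_mb = len(intervals) * 1e6 / sequence_length_bp
    return coverage_pct, loci_per_mb


# Hypothetical example: three annotated loci on a 200 kb sequence.
example_loci = [(1, 1000), (500, 1500), (10001, 12000)]
coverage, nl_per_mb = repeat_stats(example_loci, 200_000)
```

In this toy case the first two loci overlap, so 3500 bp are covered in total; the same functions could be applied per BAC or to the pooled BAC sequences of a chromosome.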

Transposable Elements Divergence
After pooling S. senegalensis BAC clone sequences per chromosome (chromosomes 2 and 4), repetitive elements were first identified using RM with the D. rerio RepBase repeat library. Low-complexity repeats were ignored (-nolow) and a sensitive (-s) search was performed. From the RM results, S. senegalensis repeat libraries, one per chromosome, were then constructed using custom scripts and bedtools software v2.25.0 [50]. The Sequence Dereplicator and Database Curator Python software was used to remove redundant sequences (https://github.com/Eslam-Samir-Ragab/Sequence-database-curator). Subsequently, syntenic regions from C. semilaevis, S. maximus, O. latipes and D. rerio were mined from the Ensembl database, and the S. senegalensis repeat element libraries were used to identify repetitive elements and their divergences with RM.
A Kimura distance-based copy divergence analysis was then carried out for each chromosome (2 and 4), using the corresponding S. senegalensis TE library, against four fish species ranging from the more closely related C. semilaevis and S. maximus to the more distant O. latipes and D. rerio.
Perl scripts were used to calculate divergence measures from the RM alignment files and to create a Repeat Landscape graph from the divergence summary data (https://github.com/rmhubley/RepeatMasker). Results were analyzed per family and were also grouped into the four types of TEs (DNA transposons, LTR, LINE, and SINE retrotransposons) [8].
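As an illustration of the divergence measure, the following minimal Python sketch computes a Kimura distance between two aligned sequences, assuming the standard Kimura two-parameter model, K = -1/2 ln[(1 - 2p - q) sqrt(1 - 2q)], where p and q are the proportions of transitions and transversions. The function name is hypothetical, and the RM utility scripts may apply further adjustments that are not reproduced here.

```python
import math

# Purine-purine and pyrimidine-pyrimidine substitutions are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def kimura_2p(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences.

    Columns containing gaps or ambiguous bases are skipped. A ValueError
    is raised if the sequences differ in length, share no comparable
    column, or are too divergent for the logarithm to be defined.
    """
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# One transition (A -> G) in ten aligned positions.
k_value = kimura_2p("ACGTACGTAC", "GCGTACGTAC")
```

Expressed as a percentage, such K-values correspond to the divergence axis of a repeat landscape: copies with low values resemble their reference element closely, whereas copies with high values have accumulated more substitutions.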