Comprehensive Transcriptome Analysis Reveals Insights into Phylogeny and Positively Selected Genes of Sillago Species.

Sillago species lives in the demersal environments and face multiple stressors, such as localized oxygen depletion, sulfide accumulation, and high turbidity. In this study, we performed transcriptome analyses of seven Sillago species to provide insights into the phylogeny and positively selected genes of this species. After de novo assembly, 82,024, 58,102, 63,807, 85,990, 102,185, 69,748, and 102,903 unigenes were generated from S. japonica, S. aeolus, S. sp.1, S. sihama, S. sp.2, S. parvisquamis, and S. sinica, respectively. Furthermore, 140 shared orthologous exon markers were identified and then applied to reconstruct the phylogenetic relationships of the seven Sillago species. The reconstructed phylogenetic structure was significantly congruent with the prevailing morphological and molecular biological view of Sillago species relationships. In addition, a total of 44 genes were identified to be positively selected, and these genes were potential participants in the stress response, material (carbohydrate, amino acid and lipid) and energy metabolism, growth and differentiation, embryogenesis, visual sense, and other biological processes. We suspected that these genes possibly allowed Sillago species to increase their ecological adaptation to multiple environmental stressors.


Introduction
The fish family Sillago are commonly known as sand whitings or sand borers, which are widely distributed in inshore waters of the Indian Ocean and the western Pacific Ocean and thrive in estuaries and shoals [1]. As minitype bottom-dwelling fishes of shallow sea regions, Sillago species are gregarious and have the ecological habit of drilling sand [2]. In addition, the Sillago species have become an important economical edible fish due to their palatable meat and long chain fatty acids that prevent thrombosis [2]. The inshore fishing of Sillago species has also developed rapidly in the past decades. It is worth noting that the overfishing will eventually lead to the ecological and economic damage of the Sillago resource. Additionally, more complex demersal environments caused by climate change and human activities, such as localized oxygen depletion, sulfide accumulation, and high turbidity, also In order to reconstruct the complete phylogenetic relationship of Sillago species and detect their genomic characters associated with adaptive evolution, five valid Sillago species (including S. japonica, S. aeolus, S. sihama, S. parvisquamis, and S. sinica) and two unpublished new species (S. sp.1 and S. sp.2; unpublished data)-which were confirmed by using morphological, anatomical, and DNA-barcoding evidence-were studied. Next, seven RNA sequencing (RNA-seq) libraries were constructed together. First, clean reads produced by RNA-seq were applied to assemble seven relatively integrated transcriptomes of Sillago species. Subsequently, transcriptome data of seven Sillago species and other existing Perciformes species were detected for orthologous genes, and then the shared orthologous genes were used to reconstruct the phylogenetic tree and predict the differentiation time.
Considering that these genes that experience positive selection may be beneficial to improving the ecological adaptation of Sillago species, we investigated the positively selected genes (PSGs) of Sillago species based on orthologous genes. Meanwhile, a gene function enrichment analysis was performed on these PSGs. These results can constitute important data with which to gain insights into the process of species differentiation and adaptive evolution in other Sillago species.

Ethics Approval and Consent to Participate
The Sillago species are not endangered or protected species in China or other countries. In addition, frost anesthesia was used to minimize suffering in all samples.

Sample Collection, RNA Extraction, and Illumina Sequencing
Seven Sillago species were obtained from the coast of China and Japan. Then, one adult female per species was immediately euthanized, and the muscle of each individual was rapidly sampled, snap-frozen in liquid nitrogen, and stored at −80 • C prior to the RNA extraction. The sampling location and the standard length (SL) of each sequenced individual are shown in Figure 1. The total RNA of the seven Sillago species was extracted using a standard TRIzol Reagent kit, following the manufacturer's protocol. The quantitative evaluation of total RNA was conducted by using the Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). We then purified mRNA by depleting rRNA from total RNA, and the remaining mRNA was cleaned three times. Then, we fragmented the mRNA into fragments of appropriate size. The fragmented mRNA was used to construct a cDNA library. Afterwards, we added A-tails and adapters to the double stranded cDNA. Then, the cDNA libraries were diluted to 10 pM and then quantified by using the Agilent 2100 Bioanalyzer. Finally, each tagged cDNA library was sequenced on the Illumina HiSeq 2000 across one lane with paired-end 150 bp reads.

Transcriptome De Novo Assembly
The raw RNA-seq data of Sillago japonica, Sillago aeolus, Sillago sihama, Sillago parvisquamis, Sillago sinica, Sillago sp.1 and Sillago sp.2 reported in this paper have been deposited in the Sequence Read Archive (SRA) database of National Center for Biotechnology Information (NCBI) under BioProject number PRJNA596307, with accession number of SRR10743284, SRR10743285, SRR10743281, SRR10743280, SRR10743279, SRR10743282 and SRR10743283, respectively. The FastQC 0.11.2 software (version, Babraham Institute, Cambridgeshire, UK) was applied to assess the sequencing quality of all raw reads in FASTQ format. Then, clean reads were obtained by removing reads with sequencing adapters, unknown nucleotides (N ratio > 10%), and low quality (quality scores < 20) using Trimmomatic 0.36 [18]. Additionally, the S. sinica genome dataset served as a reference sequence and was used for subsequent de novo assembly [5]. All remaining high-quality clean reads were sorted according to S. sinica genome index sequences and were aligned to the reference assembly generated from pairedend reads by using the bwa-mem algorithm in the BWA (Version 0.7.12, Microsoft Corporation, Redmond, WA, USA; [19]) software with default parameters, with the resulting output as 'sam' files. All sam files of each species were merged into a 'bam' file using the SAMtools (Version 1.3.1; Microsoft Corporation, Redmond, WA, USA; [20]) software. Finally, the Trinity (Version 2.4.0, Microsoft Corporation, Redmond, WA, USA; [21]) software was used for the transcriptome de novo assembly of each species, with the parameters as follows: --genome_guided_max_intron 10000. In order to perform further quantitative assessment of the assembly completeness, we applied the BUSCO software (version 4.0, Microsoft Corporation, Redmond, WA, USA) package with default settings, and downloaded the Ensembl Actinopterygii assembly as a reference.

Orthology Determination and Phylogenetic Tree Reconstruction
In order to reconstruct a more accurate phylogenetic tree, we performed an extensive orthologous gene comparison among the seven Sillago species and thirteen other Perciformes

Transcriptome De Novo Assembly
The raw RNA-seq data of Sillago japonica, Sillago aeolus, Sillago sihama, Sillago parvisquamis, Sillago sinica, Sillago sp.1 and Sillago sp.2 reported in this paper have been deposited in the Sequence Read Archive (SRA) database of National Center for Biotechnology Information (NCBI) under BioProject number PRJNA596307, with accession number of SRR10743284, SRR10743285, SRR10743281, SRR10743280, SRR10743279, SRR10743282 and SRR10743283, respectively. The FastQC 0.11.2 software (version, Babraham Institute, Cambridgeshire, UK) was applied to assess the sequencing quality of all raw reads in FASTQ format. Then, clean reads were obtained by removing reads with sequencing adapters, unknown nucleotides (N ratio > 10%), and low quality (quality scores < 20) using Trimmomatic 0.36 [18]. Additionally, the S. sinica genome dataset served as a reference sequence and was used for subsequent de novo assembly [5]. All remaining high-quality clean reads were sorted according to S. sinica genome index sequences and were aligned to the reference assembly generated from paired-end reads by using the bwa-mem algorithm in the BWA (Version 0.7.12, Microsoft Corporation, Redmond, WA, USA; [19]) software with default parameters, with the resulting output as 'sam' files. All sam files of each species were merged into a 'bam' file using the SAMtools (Version 1.3.1; Microsoft Corporation, Redmond, WA, USA; [20]) software. Finally, the Trinity (Version 2.4.0, Microsoft Corporation, Redmond, WA, USA; [21]) software was used for the transcriptome de novo assembly of each species, with the parameters as follows: -genome_guided_max_intron 10000. In order to perform further quantitative assessment of the assembly completeness, we applied the BUSCO software (version 4.0, Microsoft Corporation, Redmond, WA, USA) package with default settings, and downloaded the Ensembl Actinopterygii assembly as a reference.

Orthology Determination and Phylogenetic Tree Reconstruction
In order to reconstruct a more accurate phylogenetic tree, we performed an extensive orthologous gene comparison among the seven Sillago species and thirteen other Perciformes species (hereafter referred to as the "research species") with transcriptome or genome datasets, including Epinephelus fuscoguttatus (GCNQ00000000.1), Epinephelus coioides (GCA_900536245.1), Perca fluviatilis (GCA_003412525.1), Chionodraco hamatus (GCA_009756495.1), Gymnodraco acuticeps (GGFR00000000.1), Dissostichus eleginoides (GHKE00000000.1), Trematomus bernacchii (GBXS00000000.1), Argyrosomus regius (GFVG00000000.1), Nibea albiflora (GCA_900327885.1), Miichthys miiuy (GCA_001593715.1), Collichthys lucidus (GCA_004119915.1), Larimichthys polyactis (GETG00000000.1), and Larimichthys crocea (GCA_000972845.2). Meanwhile, the ecological characteristics of the 20 research species were obtained from the Fishbase website [22] and the research results of Xiao et al. (2016) [4], as shown in Table 1. Firstly, we obtained the assembly sequences from the National Center for Biotechnology Information (NCBI) database under the accession number. Additionally, a dataset of 1721 single-copy conserved nuclear coding sequences (> 200bp) was obtained by comparing and optimizing eight well-annotated model fish genomes: Lepisosteus oculatus, Anguilla anguilla, Danio rerio, Gadus morhua, Oryzias latipes, Oreochromis niloticus, Gasterosteus aculeatus, and Tetraodon nigroviridis [23]. Subsequently, we extracted the single-copy conserved nuclear coding sequences of each research species using the HMMER software (Version 3.1, Howard Hughes Medical Institute, Chevy Chase, MD, USA; [24]). Specifically, 1721 parameterized hidden Markov model (HMM) profiles were obtained based on 1721 sequence alignments using the 'hmmbuild' function of the HMMER software. Each HMM model was applied to search against each research species dataset using the nHMMER program within HMMER, with the resulting hits output as a table. We used Python scripts (available on Dryad; [23]) to retain the hits that were at least 70% as long as the shortest model fragment and had a bit score of at least 100, and de-redundancy was performed on these hits with 100% similarity by using the CD-HIT software [25]. The remaining high-quality single-copy conserved nuclear coding sequences of each research species were aligned and spliced into single nucleotide sequences using the MAFFT software [26]. Conserved sequences were extracted from each concatenated nucleotide sequence using Gblocks with parameter -t=c [27]. Finally, we performed 1000 nonparametric bootstrap replicates for the optimal GTRGAMMA substitution model of all concatenated nucleotide sequences in the MEGA (Version 6.0, National Institutes of Health, Bethesda, MD, USA; [28]) software, and then a complete neighbor-joining (NJ) tree of the abovementioned twenty species was reconstructed according to the branch lengths and bootstrap support values. Additionally, we also reconstructed the NJ trees based on the amino acid sequences and variation sites of concatenated nucleotide sequences. The iTol (Version 4.0, Beijing Institute of Genomics, Beijing, China; [29]) software was used to visualize the phylogenetic relationship. Finally, the estimated molecular clock data of L. polyactis and L. crocea (min and max differentiation times are 11.2 and 42.0 MYA) were obtained from the TimeTree database [30], and then the divergence times of the seven Sillago species were estimated using the r8s software [31].
Note: "-" indicates that no statistics were found.

Prediction of PSGs of Sillago Species
In the present study, the codeml program in the PAML (Version 4.9, University College London, London, UK; [32]) software was applied to identify the PSGs of Sillago species. Firstly, we constructed tree files for 20 research species (including the seven Sillago species and 13 outgroups) using each single-copy orthologous gene, respectively. Subsequently, the branch-site model (model = 2, Nsites = 2) in the codeml program was used to identify the PSGs of Sillago species, and a comparison was also conducted between the null and alternative models. The null model assumed that Sillago species were under purifying selection and that therefore those sites on the foreground branch evolved neutrally (non-synonymous (dN)/synonymous (dS) = 1, modelA1, fix_omega = 1, and omega = 1.5), and the alternative model assumed that those sites on the foreground branch were under positive selection (dN/dS > 1, modelA2, fix_omega = 0, and omega = 1.5). Then, a likelihood ratio test (LRT) was applied to calculate the log-likelihood values (2 ln) between the null model and alternative model of each single-copy orthologous gene. After a Chi-square statistical analysis, a gene was considered as a PSG of Sillago species if the FDR-adjusted p < 0.01. Finally, we used the Blast2GO [33] software to predict the functions of those PSGs.

Illumina Sequencing and the De Novo Assembly of the Seven Sillago Species' Transcriptomes
Illumina sequencing was carried out on muscle from the seven Sillago species. After cleaning and quality testing, we obtained 138.4 Gb of clean reads from the seven Sillago species, and the details of the data are listed in Table 2. All clean reads of the present study were uploaded to the SRA databases of NCBI under BioProject number PRJNA596307, with accession numbers of SRR10743279 to SRR10743285. The Trinity software was applied to the de novo assembly of the clean data, and 82,024, 58,102, 102,185, 69,748, 102,903, 63,807, and 85,990 unigenes were generated from S. aeolus, S. japonica, S. parvisquamis, S. sihama, S. sinica, S. sp.1, and S. sp.2, respectively. The assembly information of the seven Sillago species is shown in Table 3. Additionally, the BUSCO analysis results showed that 91.5%, 83.1%, 94.2%, 91.7%, 93.1%, 88.7%, and 89.1% of protein-coding genes were found in the unigenes of S. aeolus, S. japonica, S. parvisquamis, S. sihama, S. sinica, S. sp.1, and S. sp.2, respectively.

Orthologous Gene Identification and the Phylogenetic Structure of Sillago Species
Single-copy conserved nuclear coding sequences of these eight model species were applied to search for orthologous genes in 20 research species. Using the HMMER software, we identified a set of 140 orthologous exon markers longer than 200bp. Then, the concatenated alignment of 140 orthologous genes produced a data matrix with 82,833 bp for 20 research species. The NJ analyses of 20 concatenated nucleotide sequences, concatenated amino acid sequences, and concatenated variation sites were implemented in the MEGA package, and the phylogenetic structures of Sillago species are shown in Figures 2-4. According to all phylogenetic analyses, fishes from the same family eventually cluster into one branch, except for C. hamatus and G. acuticeps. Seven Sillago species, including two suspected new species, were also clearly distinguished from other Perciformes. Additionally, the divergence time was estimated, and the results showed that the internal divergence time varied within the seven Sillago species due to the different kind of concatenated sequence used for analysis. Specifically, the internal divergence times of the seven Sillago species based on concatenated nucleotide sequences was between 161. 29

PSGs Representative of Sillago Species
We calculated the log-likelihood values (2 ln) between the null model and alternative model for 140 orthologous genes. After a Chi-square statistical analysis, a total of 44 genes with FDR-adjusted p < 0.05 were identified as PSGs and these genes were suspected to contribute to the specific adaptive evolution of Sillago fishes in the burrowing lifestyle. (Table 4). Combining nr (non-redundant protein sequence) database annotation information, we speculate that these adaptive genes are potentially related to the stress response, material (amino acid and lipid) and energy metabolism, growth and differentiation, embryogenesis, visual sense, and other things. Further gene function enrichment analysis indicated that these adaptive genes were involved in cellular process (GO: 0050794), metabolic process (GO: 0006006), cell parts (GO: 0044464), cells (GO: 0005623), catalytic activity (GO: 0003824), binding (GO: 0005488), and other things ( Figure 5). Additionally, we identified the networks of molecular interactions in the cells and variants specific to particular organisms by comparing the adaptive genes to the KEGG database (Table 5). Results showed that these adaptive genes were significantly enriched for nicotinate and nicotinamide metabolism (map00760), carbon fixation pathways in prokaryotes (map00720), purine metabolism (map00230), C5-Branched dibasic acid metabolism (map00660), starch and sucrose metabolism (map00500), arginine biosynthesis (map00220), riboflavin metabolism (map00740), pantothenate and CoA biosynthesis (map00770), pyrimidine metabolism (map00240), the biosynthesis of antibiotics (map01130), the citrate cycle (map00020), propanoate metabolism (map00640), thiamine metabolism (map00730), and arginine and proline metabolism (map00330).

Discussion
As they are typical marine demersal fish, species differentiation of Sillago species is often disordered, which ultimately affects the accurate interpretation by taxonomists of their evolutionary processes. Meanwhile, more complex habitat environments caused by climate change and human activities can also have an impact on the survival of Sillago species. However, it is undeniable that unraveling their complete phylogenetic relationships and adaptive mechanisms can effectively solve the above problems. In order to gather this fundamental knowledge, we first sequenced and assembled the transcriptome of seven Sillago species. Then, we identified the orthologous genes of seven Sillago species and 13 other Perciformes species based on transcriptome data. Finally, the phylogenetic relationships were reconstructed and the PSGs of Sillago species were inferred based on orthologous genes. In brief, we believe that this research can provide new perspectives for protecting the Sillago fishery resource.

Transcriptome Data Processing
In the present study, we obtained 138.4 Gb of clean transcriptomic data from seven Sillago species. To our knowledge, this study may be the first systematic research of seven Sillago species' transcriptomes by using high-throughput sequencing technology, except for S. japonica [34]. It is undeniable that the transcriptome assembly results of S. japonica, S. sp.1, and S. sp.2 are imperfect because their N50 lengths are less than 1000 bp. Meanwhile, the BUSCO results also found that the unigenes of S. japonica, S. sp.1, and S. sp.2 lacked integrity. This is the case because some of the reads contaminated by the sequencing process were removed. Additionally, the selection of the reference genome (the S. sinica genome was selected as the reference sequence in this study) and assembly strategy (software and parameters) can also influence the final assembly efficiency. However, we still believe that these data expand the currently available genomic resources for Sillago species.

More Accurately Determining the Phylogenetic Relationships of Seven Sillago Species
Knowledge of the phylogeny of Sillago species is insufficient due to their similar phenotypic and physiological characteristics. Additionally, the paucity of genomic resources has restricted the phylogenetic resolution of Sillago species relationships. Currently established high-throughput sequencing technologies enable systematists to acquire huge amounts of orthologous genes for phylogenetic relationship reconstruction for Sillago species [23]. It is well known that the differentiation of orthologous genes usually leads to the speciation of species [35]. Therefore, there is every reason to believe that we can more accurately determine the phylogenetic relationships of Sillago species based on hundreds to thousands of orthologous genes. Although more complete orthologous genes could be obtained based on whole-genome sequencing technology, their application in phylogenetic analysis is limited by the high sequencing cost and methodological challenges of big databases [13]. With this background, orthologous exon markers captured by transcriptome sequencing have been attempted to be used in phylogenetic studies [36]. In fact, Betancur-R et al. (2013) considered that exons can be translated to amino acids, and further, to reduce errors from base compositional biases in phylogenetic studies [37]. Therefore, we focused on orthologous exon markers to reconstruct the phylogenetic relationships between seven Sillago species in this study. Unfortunately, only 140 orthologous exon markers were obtained using the HMMER method, and we suspected it may be related to the imperfect assembly results for S. japonica, S. sp.1, and S. sp.2. Although the number of markers is small, their efficiency has been verified by Hughes et al. [23], thus we believed that these markers are suitable for subsequent phylogenetic relationship reconstruction. Unsurprisingly, the reconstructed phylogenetic structure based on 140 orthologous exon markers (nucleotide sequences, amino acid sequences, and variation sites) is significantly congruent with the prevailing morphological and molecular biological view of Perciformes species relationships, except for C. hamatus and G. acuticeps. In other words, fishes from the same family eventually cluster into one branch. The reconstructed phylogenetic relationships of seven Sillago species are consistent with the results based on the mitochondrial genome [38]. Our study also provided evidence to prove that S. sp.1 and S. sp.2 may belong to the cryptic species of Sillago. However, more detailed evidence, including ecological and morphological data, needs to be provided in the further definition of S. sp.1 and S. sp.2. Additionally, there is no denying the fact that we still need to supplement other Sillago transcriptome data to reconstruct a more comprehensive understanding of phylogenetic relationships. It is worth noting that C. hamatus and G. acuticeps belong to the family Channichthyidae and Bathydraconidae, respectively, but the two species were clustered into one branch in this study. We suspected that the accuracy of the data may have contributed to the divergence. Additionally, some markers that were used to construct phylogenetic relationships between C. hamatus and G. acuticeps may have genetic convergence [39], which eventually cause the two species to cluster into one branch. This also confirmed that the selection of genetic markers may influence the reconstruction results of phylogenetic relationships [40]. Further studies will need larger datasets to illustrate the divergence.
The evolutionary sequence of Sillago species was identified based on differentiation times. The evolutional sequence of the seven Sillago species was consistent with that in previous studies [38]. Previous studies suspected that incomplete or missing swim bladders may be an adaptive mechanism for Sillaginidae species to demersal life [2]. In addition, Xiao (2018) also deduced that the swim bladders of Sillaginidae ancestors were extremely simple, and then became more complex as they evolved [38]. It is worth noting that the swim bladder of S. aeolus is imperfect relative to that in the other six Sillago species. Therefore, the evolutionary sequence of seven Sillago species based on transcriptome data also seems to support the hypothesis about the evolution of swim bladders. However, when evaluating the differentiation time of Sillago species based on different sequence formats, the results are quite different. In fact, Schwarzhans considered that Sillaginidae species gradually evolved into different species at Miocene (23 MYA to 5.33 MYA) [41]. Additionally, Takahashi has found an otolith fossil of Sillago from Niigata Prefecture (Japan) that may have existed during the Pliocene (5.3 MYA to 2.58 MYA) [42]. In the present study, the differentiation times based on amino acid sequences and variation sites are consistent with those in previous studies, although there existed a significant bias when using nucleotide sequences. We suspect that the base compositional biases caused by high-throughput sequencing affects the subsequent differentiation time analysis. However, it is undeniable that the tolerance of amino acid sequences to degenerate bases can effectively reduce this deviation [37]. Future studies still need to verify whether amino acid sequences and variation sites are more suitable for estimating the evolutionary order of species. Surprisingly, we used branch length to predict the differentiation times with other Perciformes species that might be quite different from those in the time tree database. There are two probable reasons: (1) the divergence in differentiation time results may be influenced by the estimation strategies and the number of genes used; (2) the orthologous exon markers obtained from transcriptome data are mostly functional genes, and the convergent evolution of functional genes has an inevitable effect on the evaluation of the species differentiation time.
In brief, transcriptome data can provide hundreds to thousands of single-copy orthologous exon markers, and then be used to reconstruct a more complete view of Sillago species phylogenetic relationships. However, when using orthologous exon markers obtained from transcriptome data to reconstruct phylogenetic relationships, two recommendations are worth considering: (1) the amino acid sequences of orthologous exon markers can reduce errors caused by base compositional biases, so it is necessary for phylogenetic relationship reconstruction; (2) the accuracy of phylogenetic relationships may be positively correlated with the number of orthologous exon markers used, which is possibly because a large number of markers can eliminate the bias from a small number of convergent evolutionary genes.

Positively selected genes Might Contribute to the Ecological Adaptation of Sillago Species
Transcriptome-wide analysis of the rates of non-synonymous to synonymous orthologous nucleotide substitutions represents an effective approach to quantitatively measure the selection force [17,43]. To reveal the molecular mechanism underlying the ecological adaptation of Sillago species, we estimated the dN/dS to identify the PSGs of Sillago species. A total of 44 orthologous genes were identified to be positively selected and might be involved in many biological processes, including the stress response (LTV1 [44], SMO [45], PSA [46,47] We suspected that the complexity (i.e., localized oxygen depletion, sulfide accumulation, and high turbidity) of the habitat environment may make Sillago species subject to multiple environmental stressors. Multiple environmental stressors might contribute to DNA damage and immunosuppression in Sillago species [74,75]. Meanwhile, the burrowing behavior of Sillago species may cause mechanical injuries to skin, and thus pathogens may enter the organism from the wound. Therefore, we suspected that those PSGs related to the stress response reflect the plasticity of Sillago species' adaptation to multiple environmental stressors. A previous study has considered that the foraging probability and food-intake of Sillago species larvae may be limited by low light [76]. The inadequate intake can further affect all kinds of behaviors that necessitate high energy consumption, such as predation, reproduction, and others [77]. Therefore, it is a questionable whether the positive selection of material and metabolism-related genes related to material and energy metabolism could maintain the energy compensation of Sillago species. Interestingly, two vision genes were found to be positively selected in Sillago species. The positive selection of vision genes can enhance the visual acuity of Sillago species, which is useful for some behaviors (predation, reproduction, and others) in low light [78]. Meanwhile, the retina of Sillago species may have possessed a well-developed vascularization due to the positive selection of vision genes, possibly to overcome the hypoxic conditions [78]. A previous study has also considered that light can affect the foraging, growth, and reproductive behavior and the circadian rhythms of fishes [79]. It has been discussed above that multiple environmental stressors may affect the stress response, predation behavior, material and energy metabolism, and visual sensitivity of Sillago species, which may eventually limit the survival and development of Sillago species. Therefore, we suspected that Sillago species might positively select a battery of genes associated with growth, differentiation, and embryogenesis to maintain their effective population numbers under multiple environmental stressors.
All in all, these PSGs might contribute to the ecological adaptation of Sillago species to the multiple environmental stressors, and these PSGs are also crucial to the evolution of Sillago species. However, our current evidence only shows specific protein sequence mutation in Sillago species. Whether these PSGs lead to favorable mutations in the phenotype of fish is unknown. Meanwhile, future experiments are also needed to explore which ecological traits of fish might have evolved along with these PSGs.

Conclusions
This study is the first systematic report of the transcriptome resource of Sillago species, and these data enrich the genomic information for molecular studies of these species. Based on eight well-annotated model fish genomes, we obtained 140 orthologous exon markers shared in Sillago species and then reconstructed a more complete phylogenetic relationship. Through the rates of non-synonymous to synonymous orthologous nucleotide substitutions, we found positive selection traces in 44 genes, and these genes are potentially related to the stress response, material (carbohydrate, amino acid, and lipid) and energy metabolism, growth and differentiation, embryogenesis, visual sense, and other things. This suggests that multiple environmental stressors may have led to specific selection force towards key genes in Sillago species. However, further experiments are needed to determine the exact function of these PSGs in Sillago species. Taken together, our results reconstruct a more complete view of the phylogenetic relationships of Sillago species based on transcriptome resources. We also suspect that the capacity of Sillago species to thrive in multiple environmental stressors may be due to the positive selection of these adaptive genes. The present study only represents a first step in understanding the habitat adaptive mechanism of the fish family Sillaginidae at the molecular level; further studies are still needed to validate the results and hypotheses.