Identification of Genes Related to Cold Tolerance and Novel Genetic Markers for Molecular Breeding in Taiwan Tilapia (Oreochromis spp.) via Transcriptome Analysis

Simple Summary In this study, we investigated the brain, gill, liver, and muscle transcriptomic responses of Taiwan tilapia towards cold stress. Some key genes and molecular markers involved in cold biological pathways were screened through differential expression. Among them, energy-related metabolic pathways and nucleotide genotypes were highly correlated with cold tolerance traits. This suggested that single nucleotide polymorphism (SNP) genetic variation can be used as a molecular marker to assist the selection and verification of cold-tolerant populations. Our study results will accelerate the understanding of different farmed tilapia tolerance mechanisms to environmental temperature changes and provide insights for the molecular breeding of cold-tolerant Taiwan tilapia species. Abstract Taiwan tilapia is one of the primary species used in aquaculture practices in Taiwan. However, as a tropical fish, it is sensitive to cold temperatures that can lead to high mortality rates during winter months. Genetic and broodstock management strategies using marker-assisted selection and breeding are the best tools currently available to improve seed varieties for tilapia species. The purpose of this study was to develop molecular markers for cold stress-related genes using digital gene expression analysis of next-generation transcriptome sequencing in Taiwan tilapia (Oreochromis spp.). We constructed and sequenced cDNA libraries from the brain, gill, liver, and muscle tissues of cold-tolerance (CT) and cold-sensitivity (CS) strains. Approximately 35,214,833,100 nucleotides of raw sequencing reads were generated, and these were assembled into 128,147 unigenes possessing a total length of 185,382,926 bp and an average length of 1446 bp. A total of 25,844 unigenes were annotated using five protein databases and Venny analysis, and 38,377 simple sequence repeats (SSRs) and 65,527 single nucleotide polymorphisms (SNPs) were identified. Furthermore, from the 38-cold tolerance-related genes that were identified using differential gene expression analysis in the four tissues, 13 microsatellites and 37 single nucleotide polymorphism markers were identified. The results of the genotype analysis revealed that the selected markers could be used for population genetics. In addition to the diversity assessment, one of the SNP markers was determined to be significantly related to cold-tolerance traits and could be used as a molecular marker to assist in the selection and verification of cold-tolerant populations. The specific genetic markers explored in this study can be used for the identification of genetic polymorphisms and cold tolerance traits in Taiwan tilapia, and they can also be used to further explore the physiological and biochemical molecular regulation pathways of fish that are involved in their tolerance to environmental temperature stress.


Introduction
Due to the extreme climate and environmental changes caused by global warming that include unusual snowfall and frequent winter cold currents, all biological communities and ecosystems have been impacted [1]. The sustainable development of agricultural production stability and safety is a major challenge facing all countries today [2,3]. Taiwan is located in a tropical/subtropical region. The majority of aquaculture production is dominated by subtropical fish species. High temperatures in summer and typhoons, heavy rainfall, low temperature, and cold currents in winter along with other extreme abnormal weather have caused serious damage to aquaculture fisheries and economic losses [4].
Tilapiine fishes of the cichlid family are native to tropical and subtropical regions of Africa and eventually reached and settled in the Middle East through the Great Rift Valley [5]. Given the tropical origin of these fish, 20-30 • C is the optimal growth temperature for most tilapia. When the temperature is lower than 20 • C, feeding and reproduction are typically inhibited. Currently, in addition to many genetic studies examining low temperature tolerance in different tilapia species, a number of countries have also developed cultivated tilapia strains exhibiting different degrees of cold-tolerance through selection and hybridization [6,7]. In Taiwan, after years of research, genetic improvement, and certification, tilapia fish have gradually become a unique and high-quality strain. Twenty years ago, the industry termed these fish "Taiwan tilapia" [8].
Thermal tolerance is a quantitative trait that is biologically significant in several cultured fish species. This body phenotypic variation is attributed to acclimation, physiological stage, genetic effects, and environmental factors [9,10]. To survive, fish can also respond to environmental temperature fluctuations [11]. When fish are stimulated by the environment, they activate the two major physiological regulation nerves and endocrine systems in the body to maintain cellular homeostasis by regulating metabolism, osmotic pressure, and the endocrine system [12][13][14]. If the body functions of these fish are damaged, the resistance of these fish will decrease, and this can result in disease or even death.
At the molecular level, low temperatures reduce the enzymatic reaction rates and the diffusion and transport of biomolecules and can slow protein denaturation and disaggregation [15]. Additionally, low temperatures can also inhibit transcription and translation, destroy cellular cytoskeletal elements, alter membrane permeability, and affect energy production in cells [16]. Furthermore, previous genetic studies have used microsatellite markers to facilitate genetic linkage map quantitative trait loci (QTL) mapping [17] and cold-tolerance marker development [7,18,19]. Several studies have attempted to further explore the genetic model and the genetic basis of cold-tolerance in fish [20][21][22][23][24]. However, the mechanisms and pathways underlying the observed variations in cold-tolerance among tilapia fish species remain unknown.
In this study, genetic variation within species was characterized by selecting and breeding families possessing different low-temperature tolerances and sensitivities. Furthermore, transcriptome analysis was used to compare the transcriptome responses in brain, gill, liver, and muscle tissues between cold-tolerant and sensitive Taiwan tilapia strains under low-temperature challenges. In addition to characterizing the differential gene expression patterns between cold-tolerant and cold-sensitive fish, it is also possible to develop and verify the low-temperature test of the functional genes involved in the regulation pathway. Molecular genetic DNA markers for microsatellites and single nucleotide polymorphisms assist in broodstock breeding. In the future, this can accelerate the molecular selection of low-temperature-tolerant strains of Taiwan tilapia to mitigate the losses of the aquaculture fish industry due to cold damage and increase the economic value of the aquaculture industry.

Animals and Experimental Conditions
Three tested Taiwan tilapia strains were collected from the wild and from private markets by the National Taiwan Ocean University (NTOU), and breeding programs were performed at the core breeding greenhouse and Aquatic Organism Research and Conservation Center in Gongliao. The Rui-Fang (RF) strain is a wild species of Taiwan tilapia that was originally collected from the Rui-Fang field in northern Taiwan. The YiHua (YH) strain is a breeding species of Taiwan tilapia that was originally collected from the private commercial seed breeding farm in southern Taiwan. The NTOU (NT) strain is a pure strain of Nile tilapia, and this strain has been maintained for a long time period in the NTOU [25,26].
Spawns were obtained by mating in pairs to obtain 22 families (Supplementary Figure S1). The offspring of each family were kept in separate fish tanks until they were 3 to 4 months old and weighed approximately 30-40 g or more. Subdermally injected dyes were used to identify families. To determine the cold-tolerance value of each family, each one contributed fry for cooling tests. The operation method was referred to and modified from the description published by Nitzan et al. [9].

Cooling Test
The selection criteria for cold-tolerance traits as assessed using the cooling test involved the use of F1 (n = 15) and F2 (n = 15) test fish from the offspring of full-sib and half-sib families that were raised in a 90-L orange bucket that was equipped with an upper filter and an air pump (TURBO 600 pumping motor, SanYu Aquarium, Kaohsiung, Taiwan). The upper portion was covered with a plastic fence to prevent the fish from jumping out of the tank. The fish were allowed to adapt to a water temperature of 25 • C for one week, and the photoperiod was set to 12 h light/12 h darkness. The upper portion of the filter was cleaned, the water was changed, and the fish were fed daily. The offspring fish of each family were then moved into a 600-L water tank that was equipped with a temperature-controlled system (Firstek Scientific, New Taipei City, Taiwan) that was used to slowly lower the water temperature at a rate of 1 • C per day. The water temperature was recorded and observed after each cooling. The swimming behavior and feeding behavior of Taiwan tilapia were observed, and a dissolved oxygen saturation of greater than 90% was maintained. Finally, the minimum tolerable temperature ( • C), full length (cm), body length (cm) and mass (g) of each flipping fish was recorded, and tail fin tissue samples were collected to extract genomic DNA for subsequent analysis.
The offspring of each Taiwan tilapia family were tested for low-temperature tolerance using a cooling and cold attack test. The first three families (RF strain) considered most tolerant to low temperature, and the last three families (NT strain) considered least tolerant were selected and defined as the cold-tolerance (CT) and cold-sensitivity (CS) groups.

Total RNA Extraction
The cold-tolerance (CT, n = 6) and cold-sensitivity (CS, n = 6) fish groups were subjected to cold stimulation at 10 • C, and the brain, gill, liver, and muscle tissues were collected and placed into a 1.5 mL centrifuge tube for isolation (Roche Applied Science, Mannheim, Germany). The EasyPure Total RNA Spin Kit (Bioman, Taipei, Taiwan) was used for total RNA extraction. Three stainless steel beads (3 mm) and one (5 mm) steel ball (LabTurbo ® , Taigen Bioscience, Taipei City, Taiwan) were placed into a 1.5 mL centrifuge tube containing Trizol, and they were homogenized in the SpeedMin PLUS (Analytik Jena AG, Biometra GmbH, Göttingen, Germany) and maintained at room temperature for 5 min. The mixture was poured into a 2 mL filter column and centrifuged (4 • C, 10,000× g) for 2 min. The supernatant was then transferred to a new centrifuge tube, and 400 µL of 70% ethanol was added. The tube contents were shaken, mixed immediately, and poured into the RB column (Bioman, Taipei, Taiwan). This was centrifuged (4 • C, 10,000× g) for 2 min, and the lower layer of liquid was discarded and the column was placed into a 2 mL collection tube. Next, 400 µL of W1 buffer was added to the RB column that was subsequently centrifuged (4 • C, 10,000× g) for 1 min. The lower layer of liquid was discarded and returned to the 2 mL collection tube. Then, 600 µL of wash buffer was added to the RB column that was subsequently centrifuged (4 • C, 10,000× g) for 1 min. The lower layer of liquid was poured into a 2 mL collection tube and then centrifuged (4 • C, 10,000× g) for 3 min. The RB column was placed into a new microcentrifuge tube, and 50 µL of RNase-free water was added to absorb RNA. The mixture was allowed to stand for 5 min and then centrifuged (4 • C, 10,000× g) for 2 min to obtain purified RNA. Subsequently, a NanoDrop (MaestroGen, Hsinchu City, Taiwan) was used for analysis. The ratio of OD 260 to OD 280 was measured using a spectrophotometer to assess the concentration of purified RNA. The ratio of 1.9-2.0 indicates high purity RNA. The RNA extracts were stored at −80 • C for later use.

Transcriptome High-Throughput Next-Generation Sequencing
A 10 µg total RNA sample was sent to a sequencing service company for RNA sample quality testing to confirm the integrity of the RNA sample. The use of contaminated or degraded samples was avoided. Only high-quality total RNA was selected, and these RNA samples were divided into two groups that include CT and CS. Tissues (brain, gill, liver, and muscle) were each added to one specific tube, and total RNA extracts from CT (n = 6) and CS (n = 6) brain, gill, liver, and muscle tissue samples were used for transcriptome gene library construction according to high-throughput next generation sequencing (NGS) platform. The transcriptome assembly of eight libraries was submitted to the NCBI short read archive database (accession number: SUB 9446960). High-quality read sequence data for unigene assembly were used, and they were screened for differentially expressed genes and utilized for Gene Ontology (GO) annotation and pathway enrichment analysis. Additionally, Kyoto Encyclopedia of Genes and Genomes (KEGG), Clusters of Orthologous Groups (COG) annotation and prediction analyses of encoded proteins were performed using gene function annotation. MicroSAtellite (MISA) was used to search for simple sequence repeat (SSR) or microsatellite DNA markers on the reference sequence of the transcript. Primer3 (v2.3.7) [27] was used to analyze the genes for the detected SSR markers and for the design of the primer pair for the locus. HISAT (v0.1.6) software [28] was used to compare clean reads for genes, and GATK (v3.4-0) [29] was used to detect single nucleotide polymorphisms (SNPs) and to filter low-quality SNPs.

Transcript Database Gene Differential Expression
From the Nile tilapia transcript gene library, the maximum and minimum log2 ratio (CT/CS) 2 of the expression of the cold-tolerance group (CT) and the cold-sensitivity group (CS) was selected. Digitized gene expression (DGE) analysis and selection of genes exhibiting differential expression were conducted. Gene differential performance analysis was used to identify genes possessing different expression levels in different samples, and their expression levels according to FPKM were expressed.
Fragments per kilobase of exon per million fragments mapped (FPKM) was calculated according to the following formula: FPKM = total exon reads mapped reads(millions) × exon length(KB) Total fragments represent the number of fragments that are uniquely aligned to the gene, mapped reads represent the total number of fragments that are uniquely aligned to all genes, and exon length is the number of bases within the gene [30].

Reverse Transcription Polymerase Chain Reaction
A high-capacity cDNA reverse transcription kit (Applied Biosystems-Life Technologies, Carlsbad, CA, USA) was used for reverse transcription. The total reaction volume was 20 µL, and the components were 1 µg of RNA, 2 µL of 10× RT buffer, 0.8 µL of 25× dNTP mix (100 mM), 2 µL of 10× RT Random primer, 1 µL of MultScribe TM reverse transcriptase (50 U/µL), and 4.2 µL of Nuclease-free water. Reaction was performed using Veriti ® thermal cycler (Applied Biosystem-Life Technologies), and the reaction conditions were Animals 2021, 11, 3538 5 of 27 25 • C for 10 min, 37 • C for 120 min, and 85 • C for 5 min. After the reaction, the sample was diluted 100-fold and stored at −20 • C for subsequent use.

Real-Time Quantitative Polymerase Chain Reaction
To verify the credibility of the RNAseq results, six functional genes (CL10781_10, CL1487_25, CL5212_1, CL5902_1, Unigene196, and Unigene7071) were used for qPCR expression analysis and screened based on: (1) expression ability in four tissues, including brain, gill, liver, and muscle; (2) differences in expression between the cold-tolerance group (CT) and the sensitive group (CS); (3) the presence of SSR or SNP molecular markers. The tested samples of cold-tolerance (CT, n = 6) and cold-sensitivity (CS, n = 6) previously used for RNAseq were used for qPCR analysis. The total reaction volume was 20 µL, and the components were 10 µL of 2× Power SYBR ® Green PCR Master Mix, 5 µL of cDNA (diluted 100-fold), 1.2 µL of forward primer (0.5 µM), 1.2 µL of reverse primer (0.5 µM) and 2.6 µL of RNase-free water. A Roche LightCycler ® 480 Real-Time PCR System (Roche Applied Science, Mannheim, Germany) was used for the reaction. The reaction conditions were 50 • C for 2 min in the first stage, 95 • C for 10 min in the second stage, and 40 cycles at 95 • C for 15 s and 60 • C for 30 s in the third stage. The obtained Ct value was used as the mean and standard deviation (SD), and the data from the control group (18S rRNA and β-actin) were deducted as a relative quantification expression graph. The primer pairs of candidate target and internal reference genes employed are listed and shown in Supplementary Table S1.

Genomic DNA Extraction
Approximately 100 mg of the tail fin of the test fish was cut and placed into a 1.5 mL microcentrifuge tube containing 800 µL of 70% ethanol solution. Information regarding the source of the sample that included the sample number, sample name, and sampling date was attached to each centrifuge tube. After computer labeling, the samples were stored at −20 • C and frozen for genomic DNA extraction. The MasterPure TM DNA Purification Kit (Epicenter, Madison, WI, USA) was used to extract genomic DNA from Taiwan tilapia. The fin sample was placed into a new 1.5 mL microcentrifuge tube, and 800 µL of tissue and cell lysis solution (Epicenter) and 2 µL of proteinase K (Epicenter) were both added. The samples were mixed well and placed into a 55 • C oven for 6-12 h. After the tissue was completely dissolved, 400 µL was pipetted into a new centrifuge tube, and 250 µL of MPC protein precipitation reagent (Epicenter) was added and mixed thoroughly. The mixture was centrifuged at 10,000× g and 4 • C for 10 min. Thereafter, 500 µL of supernatant was pipetted into a new 1.5 mL microcentrifuge tube, and 500 µL of isopropanol (Sigma-Aldrich, Saint Louis, MO, USA) was added and mixed several times to obtain dehydrated DNA pellets. The pellets were centrifuged to the bottom of the tube using a microcentrifuge (Spin mini), and the supernatant was poured out. Next, 100 µL of 70% alcohol was added to the 1.5 mL microcentrifuge tube for two washes. After centrifugation to remove residual alcohol, the tube was placed in an oven at 55 • C for 10 min to completely dry the alcohol. Subsequently, 200 µL of ddH 2 O was added, and the tube was placed into a dry bath at 37 • C for 20 min to dissolve the DNA.
A Maestro Nano Spectrophotometer (Maestrogen, Las Vegas, NV, USA) was used to measure the OD 260 and OD 280 absorbance values and to calculate the DNA concentration. Then, 0.8% agarose gel electrophoresis was used for electrophoresis separation. GelRed Nucleic Acid Gel Stain (Biotium, Inc., Fremont, CA, USA) was used after staining for 30 min, and the Slite140 Compact Gel Documentation System (Avegene Life Science, New Taipei City, Taiwan) colloid camera system was used to assess the quality of genomic DNA. After measuring the computer label of the sample source information that includes tissue number, sample name, extraction date, and other information, the samples were stored in a frozen state at −20 • C for subsequent testing. The microsatellite markers that were developed using two low-temperature tolerancerelated microsatellite markers and 13 transcript functional databases targeted 269 lowtemperature-tested Taiwan tilapia (RF, n = 120; YH, n = 102; NT, n = 47) for genotyping analysis. Based on the multiple fluorescent labeling method, the first PCR amplification used the forward primer containing the adaptor to bind gDNA fragments. The total volume of the reaction solution was 10 µL, and the composition was 5 µL of 2× Taq DNA polymerase Mastermix (Bioman, Taipei, Taiwan), 0.3 µL of forward primer, 0.3 µL of reverse primer, 2 µL of template DNA, and 2.4 µL of ddH 2 O. PCR was performed using a 96-well polymerase chain reactor (Veriti ® thermal cycler; Applied Biosystems-Life Technologies). The reaction conditions were 94 • C for 3 min, 94 • C for 45 s, annealing temperature (T a ) • C for 30 s, 72 • C for 30 s, and 72 • C for 7 min. The four steps were cycled 30 times. When the second PCR amplification was performed, the forward primer was changed to fluorescent primers that included JOE (green), FAM (blue), ROX (red), NED (yellow), and four different fluorescent primers. After labeling, the total volume of the reaction solution was 10 µL, and the components were 5 µL of 2× Taq DNA polymerase Mastermix (Bioman), 0.3 µL of fluorescent primer, 0.3 µL of reverse primer, 2 µL of the first PCR product diluted 10-fold as template DNA and 2.4 µL of ddH 2 O. A 96-well polymerase chain reactor (Veriti ® thermal cycler; Applied Biosystems-Life Technologies) was used for PCR, and the reaction conditions were the same as those used for the first PCR amplification.
After the PCR products were separated using electrophoresis with a 2% agarose colloid, they were stained with GelRed Nucleic Acid Gel Stain (Bioman) for 30 min, photographed, and assessed using the Slite 140 Compact Gel Documentation System (Average Life Science, New Taipei City, Taiwan) colloid camera system. The probability of allele identification errors was reduced, and each sample was subjected to a second PCR analysis. If the same result was obtained twice, the microsatellite genotype of this sample was determined. After mixing four different fluorescently labeled PCR products, 5 µL of these products were further mixed with 0.2 µL of GeneScan TM -500 LIZ TM size standard (Applied Biosystems, Foster City, CA, USA) and 10.8 µL HiDi TM Deionized formamide (Applied Biosystems, Foster City, CA, USA) and centrifuged. After heating at 94 • C for 3 min, the samples were quickly placed on ice for 5 min and then placed on an ABI PRISM ® 3730xl DNA Analyzer (Applied Biosystems, Foster City, CA, USA) automatic DNA analyzer for short tandem repeat (STR) capillary electrophoresis separation. The obtained microsatellite marker data were represented by the analogy of A, B, C, and so forth according to the size of the allele fragments, using Geneious software for the interpretation and analysis of multiple fluorescent PCR polymorphic marker genotypes.

Single Nucleotide Polymorphism Markers Genotyping
The principle of typing is to perform a single-base extension reaction at the SNP site and then use mass spectrometry to analyze the molecular weight of the product. As a single base extends four bases (A, T, C, and G), the molecular weights are all different (ddATP: 475.18 g/mol, ddCTP: 451.16 g/mol, ddGTP: 491.18 g/mol, ddTTP: 466.17 g/mol), and SNP genotypes can be analyzed based on this feature. SNP stereotype ratio, HM call, and LM call respectively represent the ratio of homozygous high-molecular-weight base type to low molecular weight base type. For example, when assessing at SNP of the mutual conversion of cytosine (C) and thymine (T), the molecular weight of the T base is greater than is the molecular weight of the C base. Based on this, the T base will be set as high mass (HM), and the C base will be set as low mass (LM).
Thirty-seven single-nucleotide polymorphism gene-based markers were selected from the transcript database for use in the targeting of 190 Taiwan tilapia populations (RF = 72, YH = 96, NT = 22) that were tested under low temperature using the Sequenom MassARRAY platform and iPLEX gold chemistry (Sequenom, San Diego, CA, USA) for genotyping analysis. Design-specific PCR primers and extension primers created using the Assay Designer software package (v.4.0) (Premier Biosoft International, Palo Alto, CA, Animals 2021, 11, 3538 7 of 27 USA) were used to configure a total volume of 5 µL of the reaction solution for multiplex PCR reactions. The composition was 1 µL of genomic DNA (10 ng/µL), 0.2 units of Taq polymerase, 2.5 pmol of PCR primer, and 25 mM of dNTP (Sequenom, San Diego, CA, USA). PCR reaction conditions were 94 • C for 4 min, 94 • C for 20 s, 56 • C for 30 s, 72 • C for 1 min, and 72 • C for 3 min. Steps 2-4 were cycled 45 times, and 0.3 U of shrimp alkaline phosphatase (SAP) was added to inactivate the excess dNTPs. The iPLEX enzyme, terminator mix, and extension primer (Sequenom, San Diego, CA, USA) were used for the single-base extension reaction. The PCR reaction conditions were 94 • C for 30 s, 94 • C for 5 s, 56 • C for 5 s, 80 • C for 5 s, and 72 • C for 3 min. Steps 3-4 were cycled five times, and then the program transitioned back to Step 2 for 40 cycles. A cation exchange resin was added to remove the residual salts. Then, 7 nL of the product was placed on the 384-well SpectroCHIP (Sequenom, San Diego, CA, USA), and a MassARRAY Analyzer 4 was used for analysis. Then, TYPER 4.0 analysis software (http://sequenom.com/Genetic-Analysis/ Applications/iPLEX-Genotyping/iPLEX-Literature, accessed on 15 October 2021) was used to read the mass spectrometry results for data analysis and SNP genotype result interpretation. The genotype call score parameter analysis result determines if it can be interpreted as a reliable result. The genotype call score calculation method was as follows: Genotype call score = P MA × P YLD × P SKW The genotype call score was calculated using three parameters that include the signal intensity parameter of the PYLD-peak, the molecular weight accuracy and signal sharpness of PMA-pea, and the signal intensity ratio of the two genotypes of PSKW-SNP. The software determines if each peak in the mass spectrum is a credible result and performs SNP interpretation based on these results. When the genotype call score is lower than the cut-off value that the software can trust, no genotype interpretation will be provided. If the genotype call score is above the cut-off value, the software can perform genotype interpretation. According to the genotype call score from low to high, the software will be aggressive, moderate, or conservative in the final analysis result as a reference.

Statistical Analysis
The obtained genotype data were imported into PopGene32 software [31] to statistically analyze the number of alleles (Na) and allele frequency (Ne) of each microsatellite locus, and the various evaluating coefficients of genetic diversity and miscellaneous data were observed. Observed heterozygosity (H o ), expected heterozygosity (H e ), polymorphism information content (PIC), and fixation index (F IS and F ST ) were calculated as follows.
Observed heterozygosity (H o ): where N het is the number of heterozygous individuals, and N hom is the number of homozygous individuals. Expected heterozygosity (H e ): where n is the number of alleles at each locus, and Pi is the frequency of the ith allele [32]. Polymorphism information content (PIC): where k is the number of alleles, and pi and pj are the gene frequencies of the ith and jth alleles [33]. Fixed index (F IS ): where H o is the observed heterozygosity, and H e is the expected heterozygosity [34].
The phenotypic allele effect of SSR/SNP which was associated with cold-tolerance trait was estimated through comparison between the average phenotypic values over accessions with the specific allele.
The IBM SPSS Statistics version 22.0 program (SPSS Inc., Chicago, IL, USA) was used for statistical analysis of the association between the phenotypes and the markers. According to ANOVA and Scheffe's 95% confidence level (post-hoc test), genotype and body phenotype differences were significant.

Phenotypic Differences in Cold-Tolerance as Assessed by the Loss of Balance Behavior in Response to Cooling Stress
The results from the reduction test indicated that RF fish only reduced their food intake when the water temperature reached 18.3 • C, and only a small number of these fish would eat food at a water temperature of 12.3 • C. The fish began to roll over at 10.5 • C. At 8.6 • C, the rollover rate reached 55.5%, and at 7.7 • C, the rollover rate became 100%. YH fish only consumed less food when the water temperature reached 19.0 • C, and only a small number would eat when the temperature reached 13.3 • C. At 11.2 • C, the fish began to roll over. At 9.3 • C, the rollover rate reached 51.9 %, and at 8.0 • C, the rollover rate became 100%. The NT fish only consumed less food when the water temperature reached 18.7 • C, and only a small number of fish would eat at 12.2 • C. The fish began to roll at 11.5 • C. At 10.3 • C, the rollover rate reached 50.1%, and at 9.3 • C, the rollover rate became 100%.
Among the three low-temperature test groups, the RF Taiwan tilapia group exhibited the best cold-tolerance with an average temperature tolerance of 8.74 ± 0.55 • C, and this was followed by the YH Taiwan tilapia group with an average temperature tolerance of 9.36 ± 0.72 • C. The NT Taiwan tilapia population possessed the worst low-temperature tolerance with an average tolerance temperature of 10.16 ± 0.45 • C ( Figure 1). In this study, the cold-tolerant fish (those that can tolerate 8.74 ± 0.55 °C) lower temperature tolerance than do the cold-sensitive group of fish (that can

RNAseq Retrieval, Pre-Processing, Assembly, and Annotation of the Unigenes
In this study, the cold-tolerant fish (those that can tolerate 8.74 ± 0.55 • C) exhibit lower temperature tolerance than do the cold-sensitive group of fish (that can tolerate 10.16 ± 0.45 • C). The collection of tissue samples used for transcriptome sequencing was based on the lowest tolerable temperature (10 • C) of the cold-sensitive group. Concurrently, cold-tolerant group fish samples were collected. The differences in measured gene expression were indeed caused by the differences in tolerance.
Using the Illumina Hiseq 2000 next-generation platform, the transcriptome gene library was constructed for the total RNA of the brain, gill, liver, and muscle tissues of the two groups of cold-tolerance (CT) and cold-sensitivity (CS) fish.
After quality trimming and filtering of low-quality reads, 234,765,554 high-quality paired-end (PE) reads were generated (each with a length of 150 bp), and a total of 35,214,833,100 nucleotides of data (Table 1) were generated for assembly from scratch. The length distribution statistics of the unigene transcripts after assembly are presented in Supplementary Figure S2. A total of 128,147 unigenes were obtained, and these possessed a total length of 185,382,926 nt, an average length of 1446 nt, an N50 of 3157 nt, and a GC content of 47.46% (Supplementary Table S2).
Unigene was used to compare the distribution of genes based on blastx NR annotation, and the species comparisons utilized Oreochromis niloticus, Haplochromis burtoni, Neolamprologus brichardi, Maylandia zebra, and other species ( Figure 2). The analysis results revealed that 48,521 single genes exhibited higher homology and the highest similarity to Oreochromis niloticus, where they accounted for 64.99% of the total. By comparing unigenes to the nucleic acid database Nt (p-value < 0.00001) through the use of blastn, the protein possessing the highest sequence similarity to that of the unigenes can be determined, and information regarding the protein function of the unigene can be obtained. A total of 71,890 predicted CDS were compared to listings in the protein database, and 1389 predicted CDS were identified for a total of 73,279.
In terms of function annotation, unigenes included protein function annotation and COG function annotation. Unigenes were annotated to the NR, NT, Swiss-Prot, COG, GO, and KEGG databases (Supplementary Figures S3-S5), and we counted the number of unigenes annotated to each database and also the total number of annotations ( Table 2). The cross-comparison analysis of the Venn diagrams of the four major protein databases yielded 25,844 co-annotated unigenes ( Figure 3).
to Oreochromis niloticus, where they accounted for 64.99% of the total. By comparin unigenes to the nucleic acid database Nt (p-value < 0.00001) through the use of blastn the protein possessing the highest sequence similarity to that of the unigenes can be de termined, and information regarding the protein function of the unigene can be ob tained. A total of 71,890 predicted CDS were compared to listings in the protein data base, and 1389 predicted CDS were identified for a total of 73,279. In terms of function annotation, unigenes included protein function annotation an COG function annotation. Unigenes were annotated to the NR, NT, Swiss-Prot, COG GO, and KEGG databases (Supplementary Figures S3-S5), and we counted the numbe of unigenes annotated to each database and also the total number of annotations (Tabl 2). The cross-comparison analysis of the Venn diagrams of the four major protein data bases yielded 25,844 co-annotated unigenes ( Figure 3).  7 The number of unigenes which were annotated in at least one functional database.   Based on the COG database, a large proportion of sequences within the transcriptome of Taiwan tilapia fish participate in the functional classification of "general function prediction only". Additionally, "replication, recombination, repair", "transcription", "signal transduction mechanisms", "cell cycle control, cell division, chromosome partitioning" and "translation, ribosomal structure and biogenesis" pathways were identified Based on the COG database, a large proportion of sequences within the transcriptome of Taiwan tilapia fish participate in the functional classification of "general function prediction only". Additionally, "replication, recombination, repair", "transcription", "signal transduction mechanisms", "cell cycle control, cell division, chromosome partitioning" and "translation, ribosomal structure and biogenesis" pathways were identified (Supplementary Figure S3).
The functional classification of these transcripts according to the GO database indicates that "cellular process", "single-organism process" and "metabolic process" are the primary involved functions. Biological processes such as "cell", "cell part", "membrane" and "membrane part" are the primary cellular components involved, and "binding" and "catalytic activity" are the primary molecular functions involved (Supplementary Figure S4).

Detection of Microsatellites and Single Nucleotide Polymorphism Markers
Using the assembled unigene as the reference sequence, the MicroSAtellite (MISA) software was used to search for a total of 38,377 microsatellite markers containing one (single), two (double), three, four, five, and six base repeats. The analysis identified 8745 single-base repeats, 17,610 double-base repeats, 10,045 three-base repeats, 1182 fourbase repeats, 650 five-base repeats, and 145 six-base repeats ( Figure 4A).  . The X-axis is the specific repeated nucleotide types, which were defined as mone for one-, di for two-, tri for three-, quad for four-, penta for five-, and hexa for six-nucleotides and the Y-axis indicates the SSR motif numbers. (B) The histogram presents the distribution of the total number of SNPs within different classes (transition and transversion). The X-axis is the SNP variation type, and the Y-axis indicates the SNP numbers.

Differential Gene Expression between Cold-Tolerant and Sensitive Fish
To compare the cold-tolerance (CT) and cold-sensitivity (CS) groups of Taiwan tilapia in the context of the brain, gill, liver, and muscle (CT-B vs. CS-B, CT-G vs. CS-G, CT-L vs. CS-L, and CT-M vs. CS-M), differential expression of transcripts was assessed. Parameters that included a false discovery rate (FDR) < 0.05, log2 fold change > 1, or log2 fold change < − 1 were used to screen for differentially expressed genes (DEGs) from the DE-seq analysis. All unigenes represent the results for up-regulation and down-regulation in regard to differential expression distribution based on quantitative analysis ( Figure 5). Totals of 4191, 4235, 3164, and 2439 differentially expressed genes were detected in the brain, gill, liver, and muscle tissues, respectively. The numbers of regulatory genes within the four tissues of the cold-tolerance (CT) group were 2081, 2261, 1824, and 1200, respectively. There were 2110, 1974, 1340, and 1239 downregulated genes, respectively ( Figure 5A,B). The results revealed that the low-temperature treatment process resulted in significant differences in gene expression between cold-tolerance (CT) and cold-sensitivity (CS) fish.
From a total of 10,380 differentially expressed genes, the Venny online software . The X-axis is the specific repeated nucleotide types, which were defined as mone for one-, di for two-, tri for three-, quad for four-, penta for five-, and hexa for six-nucleotides and the Y-axis indicates the SSR motif numbers. (B) The histogram presents the distribution of the total number of SNPs within different classes (transition and transversion). The X-axis is the SNP variation type, and the Y-axis indicates the SNP numbers.

Differential Gene Expression between Cold-Tolerant and Sensitive Fish
To compare the cold-tolerance (CT) and cold-sensitivity (CS) groups of Taiwan tilapia in the context of the brain, gill, liver, and muscle (CT-B vs. CS-B, CT-G vs. CS-G, CT-L vs. CS-L, and CT-M vs. CS-M), differential expression of transcripts was assessed. Parameters that included a false discovery rate (FDR) < 0.05, log 2 fold change > 1, or log 2 fold change < − 1 were used to screen for differentially expressed genes (DEGs) from the DE-seq analysis. All unigenes represent the results for up-regulation and down-regulation in regard to differential expression distribution based on quantitative analysis ( Figure 5). Totals of 4191, 4235, 3164, and 2439 differentially expressed genes were detected in the brain, gill, liver, and muscle tissues, respectively. The numbers of regulatory genes within the four tissues of the coldtolerance (CT) group were 2081, 2261, 1824, and 1200, respectively. There were 2110, 1974, 1340, and 1239 downregulated genes, respectively ( Figure 5A,B). The results revealed that the low-temperature treatment process resulted in significant differences in gene expression between cold-tolerance (CT) and cold-sensitivity (CS) fish. The numbers of differentially expressed genes (DEGs) between the two comparison groups in brain, gill, liver, and muscle tissues were determined. All DEGs were determined based on the results from the statistical analysis (FDR < 0.05). The X-axis represents the comparison of samples. The Y-axis represents the DEG numbers. The red color represents up regulated DEGs, and the blue color represents down-regulated DEGs. (C) Venn diagram of the expressed unigenes in brain, gill, liver, and muscle tissues. A total of 10,380 unigenes were expressed, and of these, 156 genes were commonly expressed in all the four tissues. Unigenes exhibiting |log 2 fold change| 1 were considered to be differentially expressed in each tissue.  Figure 5C).

Differential Expression of Functional Genes Containing SSRs and SNPs
Among the 156 genes exhibiting differential performance in the four tissues described above and combined with the microsatellite and single nucleotide polymorphic marker database previously explored from transcripts, from the 38-cold tolerance-related functional genes that were identified using differential gene expression analysis in the four tissues, 13 genes with 13 microsatellites (Table 3) and 25 genes with 37 single nucleotide polymorphism markers (Table 4) were identified. Of these, six unigenes contained two to five SNP markers. Each unigene containing SSR and SNP markers corresponded to the observed expression level ( Figure 6).   MicroSAtellite (MISA) and GATK (v3.4-0) software programs were used to label the microsatellite (SSR) and single nucleotide polymorphism (SNP) variant sequences and to determine other biological information regarding the 156 differentially expressed genes selected above. The analysis revealed that there were 13 and 25 genes possessing SSR and SNP sequences, respectively. The online software ClustVis was used to perform pattern clustering analysis of the FPKM value of the marker gene and to create a heat map to indicate the gene difference performance ( Figure 6A,B).
Additionally, based on the results of the gene annotation database and the NCBI BLAST tool used to compare DNA sequence data, Sequence Viewer was used to annotate genes to the RefSeq database, where 13 candidate SSRs ( Figure 6A) and 37 candidate SNPs were marked ( Figure 6B

Validation of the Transcriptome Sequencing Results Using Real-Time qPCR
To verify the results obtained by RNAseq, six genes were randomly selected for qPCR analysis. The results revealed that the expression and transcription levels of RNAseq and qPCR genes in the brain, gill, liver, and muscle tissues of the cold-tolerance (CT) and cold-sensitivity (CS) strains were similar for each gene, and the two data sets were highly correlated (R 2 = 0.9794, p < 0.001, Figure 7). This result indicates that the RNAseq results were reliable. MicroSAtellite (MISA) and GATK (v3.4-0) software programs were used to label the microsatellite (SSR) and single nucleotide polymorphism (SNP) variant sequences and to determine other biological information regarding the 156 differentially expressed genes selected above. The analysis revealed that there were 13 and 25 genes possessing SSR and SNP sequences, respectively. The online software ClustVis was used to perform pattern clustering analysis of the FPKM value of the marker gene and to create a heat map to indicate the gene difference performance ( Figure 6A,B).
Additionally, based on the results of the gene annotation database and the NCBI BLAST tool used to compare DNA sequence data, Sequence Viewer was used to annotate genes to the RefSeq database, where 13 candidate SSRs ( Figure 6A) and 37 candidate SNPs were marked ( Figure 6B). Gene function annotations located in gene exons, introns, 3 -and 5 -untranslated regions (UTR), and other positions are listed in Tables 3 and 4, respectively.

Validation of the Transcriptome Sequencing Results Using Real-Time qPCR
To verify the results obtained by RNAseq, six genes were randomly selected for qPCR analysis. The results revealed that the expression and transcription levels of RNAseq and qPCR genes in the brain, gill, liver, and muscle tissues of the cold-tolerance (CT) and cold-sensitivity (CS) strains were similar for each gene, and the two data sets were highly correlated (R 2 = 0.9794, p < 0.001, Figure 7). This result indicates that the RNAseq results were reliable. In this experiment, the results of target gene sequence amplification reveale the number of molecules can be doubled in each replication cycle and that this p exhibits excellent amplification efficiency, thus indicating that the primer design experiment is appropriate and of high quality and that there are no problems wi ondary structures. The optimal qPCR reagent concentration and reaction con yielded ideal results.

Identification of Candidate SSR Markers Involved in Cold-Tolerance
Using 13 microsatellite markers to analyze the diversity of the three Taiwan populations (RF, YH, and NT), it was determined that the Unigene196 marker ca duce the largest number of alleles (16) and genotype combinations (44), while CL1 and CL279_7 yielded only two allele numbers and two genotype combinations (Ta Previous studies have developed two cold-tolerance markers that are identif UNH916 and UNH999 [19]. According to the test results of this experiment, it w termined that UNH916 and UNH999 can produce 12 and 14 alleles in the Taiwan test population, respectively. Among them, the RF population possessed the two ers with the most alleles (9 and 12, respectively), and the NT population possess least alleles (2 and 3). The E, H, J, and L alleles marked by UNH916 were only ob in the RF population, and allele I was only observed in the YH population. The I In this experiment, the results of target gene sequence amplification revealed that the number of molecules can be doubled in each replication cycle and that this process exhibits excellent amplification efficiency, thus indicating that the primer design of this experiment is appropriate and of high quality and that there are no problems with secondary structures. The optimal qPCR reagent concentration and reaction conditions yielded ideal results.

Identification of Candidate SSR Markers Involved in Cold-Tolerance
Using 13 microsatellite markers to analyze the diversity of the three Taiwan tilapia populations (RF, YH, and NT), it was determined that the Unigene196 marker can produce the largest number of alleles (16) and genotype combinations (44), while CL1876_16 and CL279_7 yielded only two allele numbers and two genotype combinations (Table 5). Previous studies have developed two cold-tolerance markers that are identified as UNH916 and UNH999 [19]. According to the test results of this experiment, it was determined that UNH916 and UNH999 can produce 12 and 14 alleles in the Taiwan tilapia test population, respectively. Among them, the RF population possessed the two markers with the most alleles (9 and 12, respectively), and the NT population possessed the least alleles (2 and 3). The E, H, J, and L alleles marked by UNH916 were only observed in the RF population, and allele I was only observed in the YH population. The I and N alleles marked by UNH999 were only observed in the RF and YH populations, respectively (Supplementary Table S3).   (Table 5).

Genotype of the Gene-Based SNP Marker That Was Significantly Correlated with Cold-Tolerance
The results of the SNP assay analysis of 37 candidate single nucleotide polymorphism markers related to low temperature tolerance traits revealed that the success rate of SNP marker gene typing was as high as 99.30%, and the reliability (conservative calls) averaged 87.98% ( Table 6). The average proportions of homozygous and heterozygous genotypes were 83.8% and 13.5%, respectively. The AutoCluster model was used to classify homozygote and heterozygote according to the population characteristics of the samples acquired from different Taiwan tilapia populations. The peak signal was used as the coordinate axis to draw a two-dimensional graph, and the genotype was then judged based on the clustering results (Figure 8). Among the 37 single nucleotide polymorphism markers, 26, 20, and four markers in the RF, YH, and NT Taiwan tilapia populations, respectively, were polymorphic, and a total of 27 markers were polymorphic among the three test populations (Table 6).  IBM SPSS Statistics v22.0.0 and the Scheffe test method (SPSS Inc., Chicago, IL USA) were both used to analyze the correlation between the genotypes of 37 single nu cleotide markers and the minimum tolerable temperature of the three Taiwan tilapi populations (RF, YH, and NT). The results revealed the SNPs in the transcript functiona database. The gene marker CL5212_1 was significantly correlated with th low-temperature tolerance of the YH Taiwan tilapia population (p < 0.01) ( Table 6). Table 6. The number of genotype, the frequency of homozygous and heterozygous, and the overall data of cold-tolerance related tests were analyzed by using RNAseq single nucleotide polymorphism (SNP) marker loci in RF, YH and NT Taiwan tilapia populations. IBM SPSS Statistics v22.0.0 and the Scheffe test method (SPSS Inc., Chicago, IL, USA) were both used to analyze the correlation between the genotypes of 37 single nucleotide markers and the minimum tolerable temperature of the three Taiwan tilapia populations (RF, YH, and NT). The results revealed the SNPs in the transcript functional database. The gene marker CL5212_1 was significantly correlated with the low-temperature tolerance of the YH Taiwan tilapia population (p < 0.01) ( Table 6).

Verification of the SNP Marker Assisted Selection for Cold-Tolerant Strains in Taiwan Tilapia
CL5212_1 marker from CT and CS species were used to select parental individuals possessing AA and dd genotypes, respectively, and to perform CS × CS, CS × CT, CT × CS, CT × CT positive and negative hybridization pairings to produce SC1 (n = 87), SC2 (n = 148), SC3 (n = 172), and SC4 (n = 110) (Figure 9). The progeny fish of the four families were subjected to low-temperature tolerance test, and the minimum temperature tolerance of each fish was recorded. Genomic DNA was extracted from the collected fin samples to confirm the genotype. Significant differences in cold-tolerance were observed in the four tested hybrid families (p < 0.01) that originated from the SC1 family with homozygous A/A cold-tolerant genotypes, and the lowest temperature tolerance was 7.3 ± 0.23 • C. For SC2 and SC3 family fish possessing the heterozygous A/d genotype, the minimum tolerable temperatures were 8.35 ± 0.49 • C and 8.47 ± 0.31 • C, respectively. For the SC4 family fish possessing the homozygous d/d cold-sensitive genotype, the minimum tolerable temperature was 10.47 ± 0.70 • C. This study demonstrates that the CL5212_1 marker is a selection marker that can be used for a marker-assisted breeding platform. mozygous A/A cold-tolerant genotypes, and the lowest temperature tolerance was 7.3 ± 0.23 °C. For SC2 and SC3 family fish possessing the heterozygous A/d genotype, the minimum tolerable temperatures were 8.35 ± 0.49 °C and 8.47 ± 0.31 °C, respectively. For the SC4 family fish possessing the homozygous d/d cold-sensitive genotype, the minimum tolerable temperature was 10.47 ± 0.70 °C. This study demonstrates that the CL5212_1 marker is a selection marker that can be used for a marker-assisted breeding platform.

Discussion
Tilapia is an important farmed fish species in tropical and subtropical regions. It is often exposed to prolonged or repeated cold stress at low temperatures during winter, and this stress can result in a reduction or a complete halt of food intake. Physiological responses such as stunting and death can seriously affect yield and quality [7,18,35,36]. Sensitivity to low temperatures is the primary limiting factor that hinders the introduction and distribution of tilapia in temperate and high-elevation areas.
Since the late 1990s, several Nile tilapia strains exhibiting low temperature adaptability have been domesticated and cultivated [37,38]. Understanding the genetic basis for intraspecies phenotypic variation is a long-standing goal in biological breeding [39,40]. In addition to providing abundant genetic resources, body phenotypic variation that has undergone long-term evolution or multi-generational improvement also plays a key role in the contribution of gene expression pattern variation [41]. Research comparing the cold-tolerance characteristics among the different Nile tilapia species located in different geographic locations such as Egypt, Ghana, and the Ivory Coast revealed that Nile tilapia have survived through natural selection for many years, and the physiologies and lethal temperature ranges reflect different cold-tolerant phenotypic characteristics acquired from their ancestral populations [42,43]. Although it is well known that there are intraspecies phenotypic differences in temperature tolerance [44][45][46][47], there is still a lack of information regarding this genetic variation. The physiological basis of this study was the long-term collection of populations possessing different cold-tolerant phenotypes. After multiple generations of low-temperature pressing and parental family selection, cold-tolerant and cold-sensitive strains were established.
Since 2007, the Taiwan tilapia research team has been cooperating with the Fisheries Research Institute of the Executive Yuan Agriculture Committee to perform genetic improvement studies for the selection of growth traits for Nile tilapia broodstock [25,26]. In addition to the pure Nile strain (code name NT), samples of a variety of Taiwan tilapia species were continuously collected from northern Taiwan (Rui-Fang wild field, code RF) and southern Taiwan (private commercial seed breeding farm, code YH). These fish are maintained in the core breeding greenhouse of the Department of Aquaculture at National Taiwan Ocean University and the Aquatic Organism Research and Conservation Center in Gongliao. In addition to establishing the family code of each full-sib family through the use of the Avid Identification System, each strain also tracks the performance of each family in F1 and F2 for different generations of cold tolerance trait selection. These characteristics were measured from individual fish that survived in the F1 and F2 generations. According to the previous results involving cold tolerance data, the cold tolerance of the two generations from the same sibling family exhibits hereditary performance.
In the present study, transcriptomic gene libraries of the cold-tolerant (CT) and coldsensitive (CS) groups were constructed using high-throughput next generation sequencing (NGS). A total of 48 total RNA samples (2 groups × 6 fish/group × 4 tissues/fish) were extracted from four tissues that included the brain, gill, liver and muscle. Biological analysis met the requirements of the test sample repetition. Additionally, research reports by Assefa et al. [56] and Zhao et al. [57] noted that pooling of RNA samples can not only reduce the cost of sequencing (optimize the cost) but also maintain high-throughput sequencing and high quality (maintaining the power). In this study, based on the consideration of sequencing cost, we mixed high-quality total RNA samples for the same tissue in the same group of six samples and then sequenced them, and this produced high-quality sequencing results.
This study proposes for the first time that the Illumina HiSeq 2000 platform can be used to observe the transcriptome of the brain, gill, liver, and muscle tissues of the two strains of Taiwan tilapia under low temperature stimulation in the context of genetic research. Among the identified unigenes, 100,108 (78.12%) were successfully annotated in the public NR, GO, COG, KOG, and KEGG databases through the use of a BLAST search. GO and COG analyses determined the distribution of functional genes in tilapia from Taiwan. The KEGG database search successfully revealed the functions of cellular process genes and the gene products of metabolic processes.
Low temperature and cold adaptation culture model fish species such as Japanese flounder (Paralichthys olivaceus) [52] and common carp (Cyprinus carpio haematopterus) [22] have been previously studied, and the transcriptome KEGG analyses revealed that the differentially expressed genes primarily participate in cell pathways (intercellular communication, cell movement, membrane transport, and catabolism), signal transmission (signal molecular interaction, genetic information processing, folding, classification, degradation, transcription and translation), metabolism (amino acid metabolism, lipid metabolism, energy metabolism), and the biological system (endocrine, circulation, digestion, nerve, sensory, excretion and immune system). Additionally, transcript sequencing analysis examining milkfish (Chanos chanos) used transcript sequencing and determined that this low-temperature sensitive fish species can produce enough energy to resist cold under low temperature stress, and the glycogen and fatty acid degradation-and synthesis-related functional genes appear to be up-and down-regulated, respectively [53].
The KEGG analysis in this study revealed that the metabolic pathways exhibit common enrichment in four tissues, including the brain, gill, liver, and muscle. Among these, tight junctions are the most enriched pathways in the brain and gills, and these structures support the continuous intercellular barrier between epithelial cells that separates interstitial spaces and regulates the selective passage of solutes through epithelial cells [58]. In liver tissues, protein degradation (ubiquitin-mediated proteolysis) that is primarily guided by ubiquitin, is the most enriched pathway. It is extensively involved in the regulation of the cell cycle, immune and inflammatory responses, signal control transmission pathways, and development and differentiation [59]. The most enriched pathway in muscle tissue is the MAPK signaling pathway that uses protein networks to transmit extracellular signals to the cell to regulate cellular processes. It involves the activation of cell membrane signaling molecules and protein kinases [60]. It is noteworthy that this reflects the interactions between the cell and the surrounding environment, adjacent cells, and ECM, and this pathway participates in intracellular processes, such as protein processing and RNA transport and degradation. These findings are consistent with the results of previous transcriptomic analyses [9,22,48,52].
The current study used the qPCR test and the 18S rRNA and β-actin as the internal references genes, which were quite stable. Based on repeated test results, the average Ct values for the 18S rRNA and β-actin mRNAs were determined to be 16.96 ± 0.24 and 15.06 ± 0.47, respectively. In several previous research studies, with Oreochromis niloticus [26,48,51], Oreochromis aureus [9] and Paralichthys olivaceus [52], the 18S rRNA and β-actin genes were also used and proved as good internal reference expression genes.
Molecular markers can be used to evaluate the genetic diversity and genetic purity of a population. A significant reduction in genetic diversity causes the population to decline in regard to close relatives and affects the performance of phenotypic traits [61]. Regarding the genetic diversity of tilapia populations and the development of microsatellite markers, Zhu et al. [19] screened UNH916 from the genomic DNA of cold-tolerant and cold-sensitive fish. With the UNH999 microsatellite marker locus, the allelic fragment amplified through the use of primer pairs can be used for family codominance analysis and identification and also to assess polymorphisms. Similar results were obtained for the two sets of SSR markers in the three test populations used in this study. The total number of genotype combinations (n = 52) generated was more than 2-fold the total number of alleles (n = 25). However, it is speculated that this study may be affected by the experimental species and the number of analyses, and these factors may result in an inability to identify the same cold-tolerant allele from the two groups of microsatellites. Additionally, eleven sets of novel SSR markers that were developed from transcripts were added to analyze the correlation between the diversity index and cold tolerance traits. By analyzing the diversity of multiple sets of alleles in addition to using genetic parameters such as H o (0.347-0.535), H e (0.293-0.595), and PIC (0.920-0.591) to demonstrate that Taiwan tilapia breeding populations still retain a high genetic polymorphism rate within the genomic DNA, researchers can also develop functional SSR markers for cold-tolerant fish strains. This can assist the seedling industry in the future to achieve scientifically improved germplasm for cold-tolerance molecular breeding purposes.
Due to the large number of SNPs and their widespread distribution within the genome, they provide an advantage in terms of quantitative trait locus (QTL) mapping and are crucial molecular markers for genetic research [62,63]. In the development of fish SNP markers, many research teams have used simplified genome sequencing methods to develop markers that can be used for genotyping and selection of different geographic locations and strains [64]. The SNP markers developed through the use of RNAseq and RAD-seq [65,66] are not only used for genetic diversity analysis in GIFT, O. niloticus, O. mossambicus, red tilapia and other ethnic groups but have also been used to develop and establish high-density genetic linkage maps [67] and genome-wide SNP array [68][69][70]. In this study, we used a high-throughput transcript next-generation sequencing platform to screen candidate SNP markers for low-temperature tolerance traits from the database. The results of the Sequenom MassARRAY's analysis of different cold-tolerant groups of Taiwan tilapia revealed that in addition to polymorphic identification applications, this fish also possesses the ability to withstand low temperatures significantly.
Among 37 SNP markers, we observed differences in low-temperature tolerance among the different genotypes of CL5212_1 (clathrin gene). Clathrin translated from the clathrin gene contains heavy and light chains. The mesh-like coat formed by polymerization can capture and transport molecules during receptor-mediated endocytosis and organelle biosynthesis to the cell membrane, thus playing a key role in cellular physiology [71]. There are high proportions of A/A and A/d genotypes in cold-tolerant (CT) strain, while the genotypes of non-cold-tolerant (CS) strain is predominantly d/d, as the SNP locus is located in the UTR of the clathrin gene. Although there was no sequence change involving the amino acids of the protein, we unexpectedly observed after further prediction of miRNAs and target gene sequence that the A base of the 3 end UTR of the clathrin gene in the cold-sensitive (CS) strain was missing (genotype is d/d), as were the miRNA regulatory sites nearby. It is speculated that miRNAs might negatively regulate the expression of the gene and decrease protein translation, thus leading to the heredity of the phenotype of the cold-tolerant trait in fish. In the future, we will continue to explore the molecular mechanisms underlying this hypothesis. Figure 7 presents the consistency of the two quantitative RNAseq and qPCR results. Among them, under cold acclimation, the RNAseq and qPCR data of the CL5212_1 gene in the brain, liver, muscle, and gill tissues were all up-regulated, although we did identify genetic variation (qualitative) in the SNP marker from the CL5212_1 gene. Cold tolerance is related; however, it remains unknown if its gene expression (quantitative) affects the cold tolerance of fish, and determining this will require further follow-up tests.
Additionally, Koštál et al. [72] determined that arginine can form supramolecular aggregates, thus suggesting that in addition to partially binding to unfolded proteins, it can also inhibit its aggregation under freeze dehydration to stimulate fruit fly larvae with high freeze tolerance. Fan et al. [73] studied arginine kinase (a heat shock protein) based on the genomics and proteomics of L. vannamei under cold stress. Proteins and histones were identified as positive regulators. Abdel-Ghany et al. [74] observed that cold shock to Nile tilapia can enhance cold tolerance by reducing the changes in systemic saturated fatty acids and increasing the lipid metabolism of n-6 and n-3 unsaturated fatty acids. Additionally, Li et al. [75] determined that in tilapia diets supplemented with arginine, the regulation of lipid metabolism-related genes can reduce the mechanism of liver fat deposition and fatty acid composition induced by a high-fat diet [76]. From the transcript sequencing results of this study, it was observed that the arginine kinase gene of the cold-tolerant strain possessed high expression levels.
In summary, this study conclusively demonstrated that after long-term collection from different source groups and the establishment of multi-generation seed stocks through family pairing genetic management flags and trait recording data tracking systems, the extremes of genetic variation can be screened from important economic traits of farmed fish in ethnic group individuals [26,77]. The combination of high-throughput transcriptome sequencing and assembly, database comparison and functional annotation, gene differential expression, and molecular marker genotype analysis was quite effective in exploring the theory of key regulatory pathways in stress physiology, the development of molecular marker-assisted breeding platforms and their potential applications [78,79]. In the future, in addition to further research examining the effects of aquatic diet additives on the cold resistance traits of aquatic products, artificial intelligence, and nutrigenomics research strategies such as the AI-assisted adversity abnormal behavior identification system [80] will be used. The recently developed whole genomic SNP array technology is also crucial for the follow-up study of more Taiwan tilapia resistance traits and genome-wide association studies, to improve the research and development of precise nutrition breeding regulation mechanisms.

Conclusions
In conclusion, the results of this study lay the foundation for identifying genetic markers associated with cold tolerance and for developing molecular markers based on simple sequence repeats and single nucleotide polymorphisms. These can be used in tilapia breeder-assisted selection breeding plans to increase the temperature adaptation range and productivity of tilapia. Tilapia breeding develops tilapia strains possessing improved cold tolerance through marker-assisted selection. This study proposes that multiple sets of novel polymorphic SSR and SNP markers derived from RNAseq can be applied to the establishment, monitoring, and management of the genetic resource diversity of various commercial populations of Taiwan tilapia in the market and can also be used for providing the basis for family selection and breeding genetics and to further explore the gene regulation mechanisms and precision breeding of cold-tolerant strains of additional aquaculture species.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10 .3390/ani11123538/s1, Figure S1: Size statistics of experimental fish family. The TL on the vertical axis refers to the overall length (cm); SL refers to the standard body length (cm); BW refers to the weight (g) and other three measurements are fish size average ± standard deviation. Figure S2: The size distribution of unigenes from four tissues of two tilapia groups. All tissues-unigene. Y-axis represents the length of transcripts. X-axis represents the number of transcripts. Figure S3: Distribution map of COG function annotation in Taiwan tilapia transcript. X-axis represents the number of unigenes. Y-axis represents the COG functional category. Figure S4: Functional distribution of GO annotation. X-axis represents the number of unigenes. Y-axis represents the Gene Ontology functional category. Figure S5: Functional distribution of KEGG annotation. X-axis represents the number of unigenes. Y-axis represents the KEGG functional category. Table S1: Nucleotide sequences, melting temperature, amplicon size and Ct value of the cold tolerance-related gene primers used for real-time quantitative reverse transcription-PCR assays. Table S2: Quality metrics of transcripts in Taiwan tilapia (Oreochromis spp.). Table S3: Allele frequency of polymorphic SSR markers in different populations of Taiwan tilapia.