Transcriptome Analysis of Epithelioma Papulosum Cyprini Cells Infected by Reovirus Isolated from Allogynogenetic Silver Crucian Carp

The present study aimed to identify differentially expressed genes (DEGs) and major signal transduction pathways that were related to the immune response of epithelioma papulosum cyprinid (EPC) cells to reoviruses isolated from allogynogenetic silver crucian carp. The study also lays a theoretical foundation for the pathogenesis and immunity of the reovirus, which is helpful to the breeding of cyprinids fish. Reovirus infected and uninfected EPC cells were analyzed by using a new-generation high-throughput sequencing technology. DEGs were identified, annotated, and classified, and the signal pathways involved in the response to reovirus infection were identified by using bioinformatics tool. The data were assembled into 92,101 contigs with an average length of 835.24 bp and an N50 value of 1432 nt. Differential expression analysis of all the genes identified 3316 DEGs at a false discovery rate (FDR) of <0.01 and a fold-change of ≥3, of which 1691 were upregulated genes, 1625 were downregulated, and about 305 were immune-related genes. Gene Ontology (GO) enrichment analysis resulted in the annotation of 3941 GO terms, including 2719 biological processes (37,810 unigenes), 376 cell components (7943 unigenes), and 846 molecular functions (11,750 unigenes). KEGG metabolic pathway analysis matched the DEGs from pre-and post-infection EPC cells to 193 pathways, of which 35 were immune-related, including the Toll-like receptor, cytokine-cytokine receptor interaction, and the JAK-STAT signaling pathways.


Introduction
Aquareoviruses are characterized to have a double capsid, icosahedral symmetry, and no capsule. The average diameter of aquareoviruses is 60-70 nm and its genome consists of 11 segments of double-stranded RNA [1,2]. Aquatic reoviruses have infected various economically important economic aquaculture fishes such as grass carp, black carp, barley fish, Atlantic salmon, and rare crucian carp, thereby resulting in higher mortality rates. Besides aquatic animals, reoviruses have been isolated from ostriches, bats, and ducks. However, reports on the isolation of reoviruses that infect allogynogenetic crucian carp are limited.
The transcriptome is the sum of all the transcripts of a given organism in a given state, including not only mRNAs but also non-coding RNAs [3,4]. The transcriptome is an essential link between genetic information and biological function (proteome). New genes and transcripts are often identified by using the latest high-throughput sequencing technology and correlations are calculated via mRNA expression analysis [5][6][7]. Transcriptome analysis is mainly used in studies of differential gene expression, allele-specific expression, alternative splicing, fusion genes and other related studies [8][9][10]. Transcriptome sequencing (RNA-seq) pertains to the direct sequencing of a cDNA sequence using a large-scale sequencing technology, thereby resulting in tens of millions of reads that could determine its transcription level by direct comparison of the number of reads in the genomic region [11]. In recent years, RNA sequencing has become a widely used method for fish-virus interaction studies. Due to our limited understanding of the molecular mechanism underlying reovirus infection in Cyprinidae, most research studies on the resistance to reoviruses have mainly focused on immune-related genes such as IL-10, HSP70, and HSP90, as well as other areas such as antimicrobial peptides (AMPs), proteomics analysis, and physical and chemical control strategies.
To understand the molecular immune mechanism of reoviruses in cyprinid fish, the present study analyzed the transcriptome of epithelioma papulosum cyprinid (EPC) cells that were infected by reovirus using the Illumina/Solexa high-throughput sequencing technique. We investigated the mRNA expression levels, immune-related genes, and pathways, and identified several simple repeat sequences (SSRs) and single nucleotide polymorphisms (SNPs). The findings of the present study may be utilized in the elucidation of the molecular mechanism underlying the potential interaction of reoviruses isolated from allogynogenetic crucian carp with EPCs.

Isolation, Inoculation, and Sampling of Reoviruses from Allogynogenetic Silver Carassius auratus
The allogynogenetic reoviruses used in the present study were isolated from diseased gibel carp by our laboratory in a fishing ground in Jilin Province in 2015. The EPC cells and crucian carp reoviruses were stored at the Jilin Agricultural University aquatic laboratory. The EPC cells were cultured in 75-cm 2 flasks (BD Biosciences, Franklin Lakes, NJ, USA) with M199 cell culture medium containing 10% fetal bovine serum (Life Technologies, Carlsbad, CA, USA) and 1% double antibody (Sigma-Aldrich, St. Louis, MO, USA) at 27 • C. Upon reaching 80% confluency, the reovirus was added to the cell culture bottle for propagation. Upon reaching a confluency of >80%, the EPCs were harvested and stored at −80 • C as the virus mother liquor. The titer of the virus was measured on a 96-well plate covered with EPC cells using the Karber method. Approximately 1.4 × 10 7 cells were seeded onto 75-cm 2 culture flasks (BD Biosciences) and 90% confluency was achieved after incubating for 24 h in a 27 • C incubator. Reoviruses with a multiplicity of infection (MOI) of 10 were added to the cell cultures. The same volume of M199 medium was added to a control flask ( Figure 1A), and the supernatant was discarded when the proportion of cytopathic lesions (CPE) reached about 80% (about 12 h) ( Figure 1B), and the cells were collected, frozen in liquid nitrogen, and kept in a −80 • C ultra-low freezer. Each group consisted of two biological replicates. genetic information and biological function (proteome). New genes and transcripts are often identified by using the latest high-throughput sequencing technology and correlations are calculated via mRNA expression analysis [5][6][7]. Transcriptome analysis is mainly used in studies of differential gene expression, allele-specific expression, alternative splicing, fusion genes and other related studies [8][9][10]. Transcriptome sequencing (RNA-seq) pertains to the direct sequencing of a cDNA sequence using a large-scale sequencing technology, thereby resulting in tens of millions of reads that could determine its transcription level by direct comparison of the number of reads in the genomic region [11]. In recent years, RNA sequencing has become a widely used method for fish-virus interaction studies. Due to our limited understanding of the molecular mechanism underlying reovirus infection in Cyprinidae, most research studies on the resistance to reoviruses have mainly focused on immunerelated genes such as IL-10, HSP70, and HSP90, as well as other areas such as antimicrobial peptides (AMPs), proteomics analysis, and physical and chemical control strategies.
To understand the molecular immune mechanism of reoviruses in cyprinid fish, the present study analyzed the transcriptome of epithelioma papulosum cyprinid (EPC) cells that were infected by reovirus using the Illumina/Solexa high-throughput sequencing technique. We investigated the mRNA expression levels, immune-related genes, and pathways, and identified several simple repeat sequences (SSRs) and single nucleotide polymorphisms (SNPs). The findings of the present study may be utilized in the elucidation of the molecular mechanism underlying the potential interaction of reoviruses isolated from allogynogenetic crucian carp with EPCs.

Isolation, Inoculation, and Sampling of Reoviruses from Allogynogenetic Silver Carassius auratus
The allogynogenetic reoviruses used in the present study were isolated from diseased gibel carp by our laboratory in a fishing ground in Jilin Province in 2015. The EPC cells and crucian carp reoviruses were stored at the Jilin Agricultural University aquatic laboratory. The EPC cells were cultured in 75-cm 2 flasks (BD Biosciences, Franklin Lakes, NJ, USA) with M199 cell culture medium containing 10% fetal bovine serum (Life Technologies, Carlsbad, CA, USA) and 1% double antibody (Sigma-Aldrich, St. Louis, MO, USA) at 27 °C. Upon reaching 80% confluency, the reovirus was added to the cell culture bottle for propagation. Upon reaching a confluency of >80%, the EPCs were harvested and stored at −80 °C as the virus mother liquor. The titer of the virus was measured on a 96-well plate covered with EPC cells using the Karber method. Approximately 1.4 × 10 7 cells were seeded onto 75-cm 2 culture flasks (BD Biosciences) and 90% confluency was achieved after incubating for 24 h in a 27 °C incubator. Reoviruses with a multiplicity of infection (MOI) of 10 were added to the cell cultures. The same volume of M199 medium was added to a control flask ( Figure 1A), and the supernatant was discarded when the proportion of cytopathic lesions (CPE) reached about 80% (about 12 h) ( Figure 1B), and the cells were collected, frozen in liquid nitrogen, and kept in a −80 °C ultra-low freezer. Each group consisted of two biological replicates.

Library Construction and Sequencing
Total RNA was extracted from the samples, and the purity, concentration, and integrity of the RNA samples were determined by using a Nanodrop, Qubit 2.0, and Agilent 2100 methods to ensure that the samples to be used for transcriptional sequencing were of good quality. mRNA from the extracted total RNA was enriched and purified by using poly(T)-rich low-adsorption magnetic beads. A fragmentation buffer was added to randomly break the mRNA. Using mRNA as template, the first cDNA strand was synthesized using random hexamers. Then, the second cDNA strand was synthesized by adding a buffer, dNTPs, RNase H, and DNA polymerase I. The cDNA was purified by using AMPURE XP beads. The purified double-stranded cDNA was further subjected to terminal repair with A-tails and ligated into sequencing adapters. The AMPure XP beads were then used to select the fragment size. Finally, the cDNA library was obtained by PCR amplification. The established library was sequenced on an Illumina HiSeqTM 4000 system (San Diego, CA, USA).

Transcriptome Analysis
The raw data obtained by sequencing was initially processed. The data with a ratio of reads containing connectors and N (undetermined bases) >10%, along with the low-quality reads were discarded, and the corresponding clean reads database was obtained. The clean data of each sample were aligned and compared to the assembled unigene library. The reads that matched Unigenes were designated as mapped reads, which were then used in the subsequent analysis. The expression level of each gene in the sample was calculated using the reads per kb per monthly reads (RPKM) method [5] using only the number of reads and the total number of reads that could be matched to the reference sequence.

Screening of Differentially Expressed Genes (DEGs)
The relative expression level of unigenes was determined by using the number of reads per kilobase of the map to the number of reads per 1 million base of the exon (FPKM). DEGs between the experimental and control groups were calculated using the DESeq [25] technique [26]. For differential expression analysis, the well-known Benjamin-Hochberg method was used to correct the p-value from the original test. Using the corrected p-value, the false discovery rate (FDR) was selected as the key index for screening DEGs. In this study, the FDR was <0.01, and the fold-change was ≥3.

Functional Annotation and Enrichment Analysis of DEGs
Gene Ontology (GO) is a functional annotation of the DEGs. Using the Goatools software (https: //github.com/tanghaibao/GOatool) for feature enrichment, significant enrichment was observed when the FDR was <0.05. COG classification of DEGs from various groups of samples was also performed. KEGG pathway analysis using the KEGG Orthology-Based Annotation System (KOBAS; http://kobas.cbi.pku.edu.cn/home.do) was also conducted, and the signaling pathways were verified using the Fisher's test and an FDR of ≤0.05.

Real-time Polymerase Chain Reaction Analysis
To validate the data generated from Illumina HiSeqTM 4000 sequencing (San Diego, CA, USA), 10 DEGs were randomly selected for real-time quantitative PCR (qPCR) analysis. Primers were designed using a primer software ( Table 1). The total RNA used for RT-PCR was the same as that for the transcriptome. A Bole (CFX96) PCR instrument was used for PCR detection with β-actin as internal reference. The reaction system (total volume: 25 µL) consisted of the following: 12.5 µL SYBR Premix Ex Taq II, 1 µL of the upstream and downstream primers, 2.5 µL of the cDNA template, and supplemented with dH 2 O to a final reaction volume of 25 µL. To confirm that only one PCR product was amplified, the PCR product was subjected to dissociation curve analysis at the end of each PCR reaction. The data were analyzed by using the CFX (Computational Fluid Dynamics X) software (version3.1, Hercules, CA, USA) package using Ct values (2 −∆∆Ct values) to analyze the expression levels of different genes.

RNA Sequencing and Transcriptome Sequence Assembly
In the present study, high-throughput sequencing of EPC cells (T03 and T04) infected with reovirus and uninfected EPC cells (T01 and T02) was performed via high-throughput transcriptional analysis and short sequence assembly analysis. The purity, concentration, and integrity of the RNA of the four samples were assessed by using Nanodrop, Qubit 2.0, and the Aglient 2100 system. The results showed that the concentration and total amount of the four samples were in accordance with the sequencing requirements. Four samples were sequenced using a HiSeqTM 4000 (San Diego, CA, USA) high-throughput sequencing platform. After removing the linker and primer sequences, filtering low-quality data, and performing sequencing quality control, read number and base number of each sample were obtained ( Table 2). The analysis found that the percentages of Q30 bases were not >89.44% and GC Content were >47.19%, which indicated that the sequencing results were good and could generate good original data for the subsequent data assembly. Using Trinity (http://trinityrnaseq.sourceforge.net/) to assemble and splice the clean reads, the data was assembled into 92,101 contigs with an average length of 835.24 bp and a N50 value of 1432 nt. The length distribution of the reads is shown in Figure 2, with Unigene length 300-500 being the largest number, accounting for 30.23%, followed by Unigene length 200-300. The clean reads were mapped to Unigene, and the results are shown in Table 3, mapped ratio were >73.37%, and EPC cells (T03 and T04) infected with reovirus and uninfected EPC cells (T01 and T02) were similar, indicating better reproducibility. The assembly showed that the pattern of length distribution and average length of contigs were similar to the results of previous transcriptome studies using Illumina sequencing [28,29]. The large amount of sequence information obtained not only filled the gene expression profile of reovirus-immunized cyprinids, but also facilitated the mining of genetic information resources [30]. filtering low-quality data, and performing sequencing quality control, read number and base number of each sample were obtained ( Table 2). The analysis found that the percentages of Q30 bases were not >89.44% and GC Content were >47.19%, which indicated that the sequencing results were good and could generate good original data for the subsequent data assembly. Note: Samples: Sample name on the sample information sheet; Read number: Total number of pairedend reads in the clean data; Base number: Total number of bases in the clean data; GC content: GC content in the clean data, which is the percentage of G and C bases in the total bases in the clean data; % ≥Q30: Percentage of bases with a mass value ≥ 30 in the clean data.
Using Trinity (http://trinityrnaseq.sourceforge.net/) to assemble and splice the clean reads, the data was assembled into 92,101 contigs with an average length of 835.24 bp and a N50 value of 1432 nt. The length distribution of the reads is shown in Figure 2, with Unigene length 300-500 being the largest number, accounting for 30.23%, followed by Unigene length 200-300. The clean reads were mapped to Unigene, and the results are shown in Table 3, mapped ratio were >73.37%, and EPC cells (T03 and T04) infected with reovirus and uninfected EPC cells (T01 and T02) were similar, indicating better reproducibility. The assembly showed that the pattern of length distribution and average length of contigs were similar to the results of previous transcriptome studies using Illumina sequencing [28,29]. The large amount of sequence information obtained not only filled the gene expression profile of reovirus-immunized cyprinids, but also facilitated the mining of genetic information resources [30].

Functional Annotation and Classification of Transcriptional Group Data
The 92,101 assembled unigenes as describe above were compared to those in the NR, SWISS-PROT, eggNOG, COG, Pfam, KOG, GO, and KEGG databases respectively. By selecting the BLAST parameter E-value of < 1 × 10 −5 and the HMMER parameter E-value of < 1 × 10 −10 , 42,522 unigene annotations were obtained. The results of the comparative analysis are shown in Table 4  Note: Annotated databases: the function of the database; Annotated_Number: Number of comments annotated to the unigene database; 300 ≤ Length < 1000: Number of unigenes annotated to the database that is ≤300 or less and <1000 bases in length; Length ≥ 1000: Number of unigenes annotated to the database that are >1000 bases in length.

Differential Gene Expression of EPC Cells Pre-infected and Post-infected by Reovirus from Allogynogenetic Crucian Carp
To identify DEGs that may play an important role in the defense response to reovirus infection in cyprinids, the data of the test groups (TO3 and T04) infected with reoviruses from allogynogenetic silver crucian carp were compared to that of the uninfected control groups (T01 and T02). A total of 3316 DEGs were identified by using the standard FDR of <0.01 and fold-change of ≥3, of which 1691 were upregulated and 1625 were downregulated ( Figure 3). The distribution of DEGs is illustrated using an MA plot ( Figure 4) [31]. To visualize the gene expression profiles, we generated a heatmap using the Hcluster algorithm, where the color changes from red to green with decreasing expression levels [32].   , the logarithm of the difference in gene expression between the two samples, which was used to measure the difference in the level of expression. The green and red dots represent genes with significant differences in expression levels, green represents downregulated gene expression, red represents upregulated gene expression, and black spots represent genes with no significant differences in expression.

DEGs GO Functional Enrichment Analysis
The GO database is a structured standard biology annotation system constructed by the GO organization, which aims to establish a standard vocabulary system for genes and its products and is applicable to all species. To further reveal the function of DEGs, 3316 DEGs from the test group (T03 and T04) that were infected with allogynogenetic reovirus and the uninfected control group (T01   , the logarithm of the difference in gene expression between the two samples, which was used to measure the difference in the level of expression. The green and red dots represent genes with significant differences in expression levels, green represents downregulated gene expression, red represents upregulated gene expression, and black spots represent genes with no significant differences in expression.

DEGs GO Functional Enrichment Analysis
The GO database is a structured standard biology annotation system constructed by the GO organization, which aims to establish a standard vocabulary system for genes and its products and is applicable to all species. To further reveal the function of DEGs, 3316 DEGs from the test group (T03 and T04) that were infected with allogynogenetic reovirus and the uninfected control group (T01 , the logarithm of the difference in gene expression between the two samples, which was used to measure the difference in the level of expression. The green and red dots represent genes with significant differences in expression levels, green represents downregulated gene expression, red represents upregulated gene expression, and black spots represent genes with no significant differences in expression.

DEGs GO Functional Enrichment Analysis
The GO database is a structured standard biology annotation system constructed by the GO organization, which aims to establish a standard vocabulary system for genes and its products and is applicable to all species. To further reveal the function of DEGs, 3316 DEGs from the test group (T03 and T04) that were infected with allogynogenetic reovirus and the uninfected control group (T01 and T02) were subjected to GO Note ( Figure 5). The results showed that 3941 GO terms were annotated, including 2719 biological processes (37,810 unigenes). Among the 376 cell components (7943 unigenes) and 846 molecular functions (11,750 unigenes), it is indicated that most of the differentially expressed genes are mainly involved in the biological process of cells, but there are few genes involved in cell components and molecular functions.The DEGs were mainly involved in cellular processes (GO: 0009987), single biological processes (GO: 0044699) (GO: 0044464), organelle (GO: 0043226), and catalytic activity (GO: 0003824). These results indicated that various physiological and biochemical changes occurred in the EPC cells during reovirus infection, and these played an important role in virus defense response, it is possible to recognize the functional distribution of differentially expressed genes from the macroscopic angle of infection of the infected reovirus.  The secondary function with significant difference indicates that the differentially expressed genes are different from the enrichment trend of all genes, and can be focused on whether this function is related to the difference.

COG Annotation of DEGs
COG (Clusters of Orthologous Groups) is a database of orthologous classification of gene products. Each COG protein is presumed to be derived from an ancestral protein. The COG database was constructed based on bacterial, algal, eukaryotic encoded proteins with complete genomes and phylogenetic relationships. A total of 3166 differentially expressed genes from the comparative analysis of the test group (TO3 and T04) that was infected with allogynogenetic reovirus to the uninfected control groups (T01 and T02) were directly orthologous and classified with gene products in the COG database. The resulting classification is shown in Figure 6  The secondary function with significant difference indicates that the differentially expressed genes are different from the enrichment trend of all genes, and can be focused on whether this function is related to the difference.

COG Annotation of DEGs
COG (Clusters of Orthologous Groups) is a database of orthologous classification of gene products. Each COG protein is presumed to be derived from an ancestral protein. The COG database was constructed based on bacterial, algal, eukaryotic encoded proteins with complete genomes and phylogenetic relationships. A total of 3166 differentially expressed genes from the comparative analysis of the test group (TO3 and T04) that was infected with allogynogenetic reovirus to the uninfected control groups (T01 and T02) were directly orthologous and classified with gene products in the COG database. The resulting classification is shown in Figure 6

Pathway Enrichment Analysis of DEGs
The execution of biological functions relies on the coordination among genes. Pathway enrichment analysis of DEGs could provide better insights into the biological function of a gene and its interaction with other genes, so as to further study their complex behavior. The differential genes in the test group (T03 and T04) infected with reovirus and the uninfected control group (T01 and T02) were found to be involved in 50 subclasses of metabolic pathways in six broad categories (cellular processes, environmental information processing, genetic information processing, human diseases, metabolism, and organic systems) after these were analyzed by using the metabolic pathway database KEGG (Figure 7). Among these, several signaling pathways with a relatively high number of genes were identified, which included actin cytoskeleton regulation (ko04810), adhesion (ko04510), MAPK signaling pathway (ko04010), and protein processing in the endoplasmic reticulum (ko04141). These metabolic pathways are involved in the signal transduction of environmental information processing. There were 35 pathways associated with immune and disease resistance, including cytokine-cytokine-receptor interaction (12 genes, ko04060), apoptotic signaling pathway (21 genes, ko04068), phagocytosis (22 genes, ko04145), endocytosis (28 genes, ko04144), and Toll receptor signaling pathway (13 genes, ko04620). These pathways are basically involved in all aspects of fish immunization. This demonstrates that the experimental KEGG annotation effect is good and shows the reovirus and immune-induced regulatory pathways. Genes and pathways associated with the immune system, signal transduction and disease processes were similar to those previously reported in zebrafish [33], grouper [34], and bream [35]. Studies on immune-related genes and pathways can improve our understanding of the molecular immune mechanisms of reovirus-stimulated EPCs.

Pathway Enrichment Analysis of DEGs
The execution of biological functions relies on the coordination among genes. Pathway enrichment analysis of DEGs could provide better insights into the biological function of a gene and its interaction with other genes, so as to further study their complex behavior. The differential genes in the test group (T03 and T04) infected with reovirus and the uninfected control group (T01 and T02) were found to be involved in 50 subclasses of metabolic pathways in six broad categories (cellular processes, environmental information processing, genetic information processing, human diseases, metabolism, and organic systems) after these were analyzed by using the metabolic pathway database KEGG (Figure 7). Among these, several signaling pathways with a relatively high number of genes were identified, which included actin cytoskeleton regulation (ko04810), adhesion (ko04510), MAPK signaling pathway (ko04010), and protein processing in the endoplasmic reticulum (ko04141). These metabolic pathways are involved in the signal transduction of environmental information processing. There were 35 pathways associated with immune and disease resistance, including cytokine-cytokine-receptor interaction (12 genes, ko04060), apoptotic signaling pathway (21 genes, ko04068), phagocytosis (22 genes, ko04145), endocytosis (28 genes, ko04144), and Toll receptor signaling pathway (13 genes, ko04620). These pathways are basically involved in all aspects of fish immunization. This demonstrates that the experimental KEGG annotation effect is good and shows the reovirus and immune-induced regulatory pathways. Genes and pathways associated with the immune system, signal transduction and disease processes were similar to those previously reported in zebrafish [33], grouper [34], and bream [35]. Studies on immune-related genes and pathways can improve our understanding of the molecular immune mechanisms of reovirus-stimulated EPCs.

SSR Tags
Among various molecular markers, SSRs have many recognized functions and are widely used in paternity testing, genetic diversity, some aspects of genetic linkage maps, and marker-assisted breeding [36]. SSRs are tandem repeats of single nucleotides to hexanucleotides found in all prokaryotic and eukaryotic genomes. As a molecular marker, SSRs have a variety of putative functions and high polymorphisms that play an important role in certain aspects of genetic and breeding research. In addition, gene transcription and mRNA splicing or export to the cytoplasm may be affected by intron SSRs. Therefore, it is essential to understand the spectrum of SSRs. In the present study, we determined that all 21,985 unigenes in the transcriptome group contained SSRs (Table 5).
SNPs are the most abundant type of DNA sequence polymorphisms and have been increasingly used in quantitative trait loci (QTLs) for molecular marker linkage map construction, mapping, and association studies [37,38]. According to the number of alleles (Allele) on the SNP locus, i.e., the number of different bases supported by the sequencing reads, SNP sites can be divided into homozygous SNP sites (only one allele) and heterozygous SNP sites (two or more alleles). The percentage of heterozygous SNPs in different species varies. The number of SNP sites in each sample in this study is shown in Table 6. Both the Homo SNP and HeteSNP crucian carp were less in the test group (T03 and T04) infected with reovirus and the uninfected control group (T01 and T02). We can further study genetic markers related to immune-related genes and provide reference for the pathogenesis of reovirus-infected cyprinid fish and the search for genetic markers of immune genes.

SSR Tags
Among various molecular markers, SSRs have many recognized functions and are widely used in paternity testing, genetic diversity, some aspects of genetic linkage maps, and marker-assisted breeding [36]. SSRs are tandem repeats of single nucleotides to hexanucleotides found in all prokaryotic and eukaryotic genomes. As a molecular marker, SSRs have a variety of putative functions and high polymorphisms that play an important role in certain aspects of genetic and breeding research. In addition, gene transcription and mRNA splicing or export to the cytoplasm may be affected by intron SSRs. Therefore, it is essential to understand the spectrum of SSRs. In the present study, we determined that all 21,985 unigenes in the transcriptome group contained SSRs (Table 5).
SNPs are the most abundant type of DNA sequence polymorphisms and have been increasingly used in quantitative trait loci (QTLs) for molecular marker linkage map construction, mapping, and association studies [37,38]. According to the number of alleles (Allele) on the SNP locus, i.e., the number of different bases supported by the sequencing reads, SNP sites can be divided into homozygous SNP sites (only one allele) and heterozygous SNP sites (two or more alleles). The percentage of heterozygous SNPs in different species varies. The number of SNP sites in each sample in this study is shown in Table 6. Both the Homo SNP and HeteSNP crucian carp were less in the test group (T03 and T04) infected with reovirus and the uninfected control group (T01 and T02). We can further study genetic markers related to immune-related genes and provide reference for the pathogenesis of reovirus-infected cyprinid fish and the search for genetic markers of immune genes.

Real-Time PCR Verification of DEGs
The selected genes were associated with cytokine receptors, Toll-like receptor signaling conduction (TLR), chemokine signaling pathways, and MAPK signaling pathways. Compared to the control group, the expression level of these genes in the EPC cells infected with reoviruses was consistent with that generated from transcriptome analysis (Figure 8).

Real-Time PCR Verification of DEGs
The selected genes were associated with cytokine receptors, Toll-like receptor signaling conduction (TLR), chemokine signaling pathways, and MAPK signaling pathways. Compared to the control group, the expression level of these genes in the EPC cells infected with reoviruses was consistent with that generated from transcriptome analysis (Figure 8).

Discussion
The molecular response mechanisms of animals following viral infection include changes in transcriptional and translational levels, which in turn regulate different metabolic pathways and signaling pathways. Therefore, transcriptome research has become an important direction of functional genomics research. The second-generation transcriptome sequencing (RNA-Seq) technology is a high-throughput sequencing approach involving the cDNA from fragmentally processed mRNA, sequence splicing assembly, and statistical correlation of number of sequences to obtain different transcripts (mRNA). This technique has been widely applied to the determination and analysis of host-virus interaction transcriptome. So far, studies on the mechanisms underlying the molecular immune response of cyprinids to reoviruses on cyprinids are limited. To fully understand the molecular mechanism underlying reovirus response, we first analyzed the transcriptome data of EPC cells before and after reovirus infection as well as obtained gene data. We then analyzed various gene expression profiles to identify genes that are involved in the immune response and signal transduction ( Table 7).
The immune system of fish provides protection from the pathogens, thereby allowing its survival and evolution. The Toll-like receptor family that mediates immune congenital immunity is the first line of defense of fish and plays a role in connecting innate immunity and acquired immune responses. Thirteen representative immune-related genes were significantly differentially expressed in the Toll-like receptor signaling pathway (Table 7, Figure 9). The MyD88 gene was significantly upregulated in the metabolic pathways of EPC cells after infection with reovirus, suggesting that it might play a key role in the generation of anti-reoviruses in carp and thus conferring pathogen resistance, which is similar to the findings of previous research studies [39][40][41][42]. In addition, the expression of the PIK3C and TAK1 genes were significantly downregulated after reovirus stimulation, which may be a negative regulatory mechanism of excessive inflammatory response. The results of this study differed from those of other viral infections, suggesting that certain genes in the Toll-like receptor signaling pathway play a key role in the response to reoviruses. Up-regulated expression of AP-1 and STAT after reovirus infection can promote the expression of proinflammatory cytokines and inflammatory cytokines, such as TNFα, IL-12/IL-1β. The release of these cytokines extracellularly can cause cells, macrophages chemotactic aggregation, increased capillary permeability, lymphocyte infiltration, and other inflammatory reactions.
The interaction between cytokines and membrane receptors is an important link in various biological activities. Cancer, autoimmune diseases, metabolic disorders, and other diseases are closely related to cytokine imbalance, and studying the interactions between cytokines and receptors becomes the cornerstone for the development of treatment of these diseases. Cytokines are a class of high activity, multi-functional protein peptide molecules or glycoproteins that are produced by activated immune cells (e.g., lymphocytes and mononuclear macrophages) and related cells (e.g., fibroblasts and endothelial cells) and thereby regulate cell function. These interact with specific receptors with high affinity, and are an indispensable component of the immune system. Approximately 12 representative immune-related genes were significantly differentially expressed in cytokine-cytokine receptor interaction signaling pathways after being infected with reovirus (Table 7, Figure 10). CXCL14 is a novel chemokine that decreases the inflammatory response. The CXCL14 gene was significantly downregulated in the metabolic pathways of EPC cells after being infected with reoviruses, thereby indicating that the inflammatory reaction may play a role in preventing the development of reoviruses. CXCR4 plays an important role in immunity, organogenesis and repair, anti-cancer, embryonic development, and hematopoiesis. CXCR4 can prevent the occurrence of excessive inflammatory reaction by negatively regulating the production of immune factors. In the present study, the CXCR4 gene was significantly downregulated in the metabolic pathways of EPC cells after infection with reovirus, indicating that it plays an important role in resistance against reovirus infection. The study of cytokine gene expression regulation may identify antagonists or activators that actively adjust the expression of specific genes. In addition, these studies may facilitate the establishment of an efficient eukaryotic expression system that generated a high yield of recombinant cytokines for the treatment of specific diseases or vaccine adjuvants. The JAK-STAT signaling pathway mediates cytokine signal transduction. It is involved in multiple life processes such as disease infection, immunity, inflammation, and development. In this study, ten representative immune-related genes were significantly differentially expressed in the JAK-STAT signaling pathway (Table 6, Figure 11). The three main members of the JAK-STAT signaling pathway are cytokine receptors, JAK, and STAT. STAT is a unique protein family that binds to DNA. So far, seven members of the STAT family have been identified, which include STAT1, STAT2, STAT3, STAT4, STAT5a, STAT5b, and STAT6. STAT1 is mainly involved in the response to IFNs, STAT1 deficiency can cause increased susceptibility to viral infection and disorders of intracellular pathogen infection regulation. In the present study, cytokines, growth factors, and other factors activated JAK, and then activated STAT1 after binding to their associated receptors. The STAT1 and CISH (cytokine inducible SH2-containing protein) were significantly upregulated in EPC cells that were infected with reoviruses, indicating that their susceptibility to virus infection decreased, the body was stimulated to activate the immune system, and the disorders of intracellular pathogen infection regulation were reduced. The role of IL-6 and receptor genes in the JAK-STAT signaling pathway may be further investigated. The PIK3R gene was upregulated in EPC cells after reovirus infection, possibly promoting apoptosis. This study laid the foundation for further investigating the role of the fish JAK-STAT gene in fish immune regulation and disease control. By performing series of analyses and research studies on the JAK-STAT signaling pathway, schemes for disease control and targets for the development of new drugs may be established.
NLRs were found recently, they are cytoplasmic pattern recognition receptors (PRRS). NLRs play an important role in defending against pathogen infection by recognizing pathogen-associated molecular patterns (PAMPs) and initiating the innate immune response, leading to an adaptive immune response through signal transduction. Approximately 3 representative immune-related genes were significantly differentially expressed in NOD-like receptor signaling pathway after being infected with reovirus ( Figure 12). Among them, the significant down-regulation of IkB and P38 inhibits NOD1 (Nucleotide-binding oligomerization domain-containing protein 1) directly recruiting RIPK2 through CARD-CARD (Caspase activation and recruitment domains) interaction during viral recognition, further activating NF-kB and MAPK (mitogen-activated protein kinase) signaling and stimulating the body to produce immunity. In this study, there was no difference in RIPK2, which may be due to the early stage of Infection increased and then decreased in the early stage of Infection increased and then decreased, Or the amount of expression increased but the difference was not significant compared with the control group, which can be further studied in the follow-up. It has been demonstrated that the HSP90 and SGT1 interact with the LRR (leucine-rich repeat) domain of NLRP3 (leucine-rich repeat and pyrin domain domains-containing protein 3) to maintain a non-activated but stable state, activating NLRP3 inflammation through changes in HSP90 The body causes secretion of IL-1β or IL-18 as well as programmed cell death. In addition HSP90 involved in a variety of physiological processes in vivo, as a molecular chaperone to improve the heat resistance of organisms and protect organisms and other functions. The infection of reovirus HSP90 in EPC cells significantly increased, indicating that not only by regulating its receptor protein activity and maintaining the stability of the receptor protein structure and indirectly involved in the regulation of multiple intracellular signaling pathways, but also to deal with resistance pathogen invasion and regulation of the body's immune system also plays an important role, suggesting that it might play a key role in the generation of anti-reoviruses in carp and thus conferring pathogen resistance, which is similar to the findings of previous research studies [43,44].
In addition to the Toll-like and JAK-STAT signaling pathways that interact with cytokines and cytokine receptors, various representative immune system-related genes are significantly expressed in phagocytosis and chemokine-and apoptosis-related pathways. Some genes play an important role in other pathways such as Ras; for example, it is involved in the regulation of the actin cytoskeleton, the MAPK signal transduction pathway, and the chemokine pathway; Furthermore, MyD88 is involved in apoptosis and the Toll pathway. The identification of these common genes in different pathways may improve our understanding of the molecular immune mechanisms of reovirus-infected cyprinids.

Conclusions
In the present study, comparative transcriptome analysis of EPC cells infected with reovirus and uninfected EPC cells was performed to explore the molecular mechanisms of immunity and the development of molecular markers. EPC cells infected with reoviruses showed significant changes in the expression of a variety of immune-related pathways such as Toll-like receptor signaling, cytokine-cytokine receptor interactions, chemokine signaling pathways, and endocytosis. In addition, the identification of 21,985 SSR markers and 62,992 SNPs in the transcriptome of EPC cells facilitated the development of markers, the analysis of inheritance patterns, and establishment of QTLs. In summary, this study provides insights into the immunoprophylaxis of reovirus-infected cyprinids and provides a theoretical basis for the pathogenesis and immunological pathways of Cypriniform reoviruses, as well as providing useful information for fish farming.