Dynamic Transcriptome Analysis Reveals Uncharacterized Complex Regulatory Pathway Underlying Dose IBA-Induced Embryogenic Redifferentiation in Cotton

The somatic embryogenesis (SE) process of plants is regulated by exogenous hormones. During the SE, different genes sensitively respond to hormone signals through complex regulatory networks to exhibit plant totipotency. When cultured in indole-3-butyric acid (IBA) concentration gradient medium supplemented with 0 mg dm−3, 0.025 mg dm−3, and 0.05 mg dm−3 IBA, the callus differentiation rate first increased then decreased in cotton. To characterize the molecular basis of IBA-induced regulating SE, transcriptome analysis was conducted on embryogenic redifferentiation. Upon the examination of the IBA’s embryogenic inductive effect, it was revealed that pathways related to plant hormone signal transduction and alcohol degradation were significantly enriched in the embryogenic responsive stage (5 days). The photosynthesis, alcohol metabolism and cell cycle pathways were specifically regulated in the pre-embryonic initial period (20 days). Upon the effect of the IBA dose, in the embryogenic responsive stage (5 days), the metabolism of xenobiotics by the cytochrome P450 pathway and secondary metabolism pathways of steroid, flavonoid, and anthocyanin biosynthesis were significantly enriched. The phenylpropanoid, brassinosteroid, and anthocyanin biosynthesis pathways were specifically associated in the pre-embryonic initial period (20 days). At different developmental stages of embryogenic induction, photosynthesis, flavonoid biosynthesis, phenylpropanoid biosynthesis, mitogen-activated protein kinase (MAPK) signaling, xenobiotics metabolism by cytochrome P450, and brassinosteroid biosynthesis pathways were enriched at low a IBA concentration. Meanwhile, at high IBA concentration, the carbon metabolism, alcohol degradation, circadian rhythm and biosynthesis of amino acids pathways were significantly enriched. The results reveal that complex regulating pathways participate in the process of IBA-induced redifferentiation in cotton somatic embryogenesis. In addition, collections of potential essential signaling and regulatory genes responsible for dose IBA-induced efficient embryogenic redifferentiation were identified. Quantitative real-time PCR (qRT-PCR) was performed on the candidate genes with different expression patterns, and the results are basically consistent with the RNA-seq data. The results suggest that the complicated and concerted IBA-induced mechanisms involving multiple cellular pathways are responsible for dose-dependent plant growth regulator-induced SE. This report represents a systematic study and provides new insight into molecular signaling and regulatory basis underlying the process of dose IBA-induced embryogenic redifferentiation during SE.


Introduction
Somatic embryogenesis (SE) is a process by which embryogenic differentiation is generated directly from somatic cells, similarly to the development of zygotic embryos, and it involves the developmental reprogramming of somatic cells toward the embryogenesis pathway. Somatic embryogenesis can be carried out in vitro under artificially controlled conditions to generate the most complete cell totipotency. Somatic embryogenesis plays an important role in cell fusion, genetic engineering, and somaclonal variation. SE was discovered in the 1950s [1]. Since then, scientists have conducted extensive research on the mechanism of somatic embryogenesis. It has been concluded that under artificially controlled in vitro conditions, plant somatic cells can regain their ability to regenerate after cell dedifferentiation and redifferentiation and can develop into seedlings. At present, more than 100 species are known to be capable of plant regeneration through somatic embryogenesis, mainly including alfalfa, soybean, cotton, spruce, pine and cypress. However, few of the molecular events involved in the transition of a somatic cell to an embryogenic-competent cell are known thus far [2,3].
Cotton is globally one of the most important commercial crops. Biotechnology approaches, including genetic engineering and tissue culture, have been widely applied to cotton breeding. All transgenic cotton production demands a productive plant regeneration procedure from the somatic cells in cotton. Cotton transgenic technology relies heavily on somatic embryogenesis [4]. Genotypes have a decisive influence on somatic embryogenesis in cotton. Xie et al. [5] believed that the genotype of the material is the first factor to affect the somatic embryogenesis and plant regeneration of cotton, followed by the type of phytohormone and its concentration. Since the 1970s, more than 100 cotton varieties have been used for tissue culture research. Approximately one half of these materials have embryogenic ability, while the remaining half are unable or extremely unlikely to induce somatic embryos [6,7]. Most cultivars have been recalcitrant, and few reports of high-frequency regeneration in cotton via SE have come forth, due to a genotype-dependent response [8][9][10].
Exogenous hormones are another important factor that induce somatic embryogenesis in cotton in addition to genotype. Currently, 2, 4-dichlorophenoxyacetic acid (2, 4-D), indole-3-butyric acid (IBA) and kinetin (KT) are commonly used. Some scholars have pointed out that among all auxins and auxin analogues, 2, 4-D has a relatively good effect on promoting the induction of explant callus [8,[11][12][13]. Furthermore, the addition of 2, 4-D in a suspension culture contributes to the formation of somatic embryos [14]. It has been reported that the combination of IBA and KT hormones contributes to embryogenic callus (EC) differentiation [12,[15][16][17]. Moreover, the combination of IBA and KT has good effects on some stubborn varieties in which induced differentiation has been difficult. Zhu et al. [15] found that simultaneous treatment with 2, 4-D, indole acetic acid (IAA) and KT yielded better effects, but the calli induced by the 2, 4-D medium must be transformed to 2, 4-D removed or replaced medium to ultimately undergo somatic embryogenesis. If 2, 4-D was removed and the hypocotyl segment was inoculated with ZSW-2 (Murashige and Skoog (MS) + 0.1 mg dm −3 KT + 0.1-0.5 mg dm −3 IBA + 3% (w/v) glucose + 6.5 g dm −3 agar, pH 6.2) [15] medium containing only KT and IBA, the ideal state of callus was achieved, and, therefore, the appropriate ratio of cytokinin and auxin yields a better induction rate. Chen et al. [18] found that an MSB (MS inorganic salt + B5 vitamin) medium supplemented with IBA could directly induce embryogenic calli and that appropriate concentrations of KT could promote the effect of IBA. IBA is used to induce root organogenesis [19], but studies have found that either IBA alone or in combination with KT can directly induce a certain amount of embryogenic callus from the hypocotyl segment. More extensive experiments are yet to be done to determine the mechanism through which IBA acts. Furthermore, the auxin and cytokinin signaling pathways play an important regulatory role in cotton somatic cell dedifferentiation and embryogenic cell redifferentiation [20][21][22][23]. Cotton plantlets can be regenerated via SE using various combinations of plant growth regulators, such as 2, 4dichlorophenoxyacetic acid (2, 4-D), indole-3-butyricacid (IBA), and naphthalene acetic acid (NAA) in combination with Kinetin (KT) [24]. Global transcriptome analyses have suggested that the auxin and cytokinin signaling pathways are critical in the dedifferentiation of somatic cells and the redifferentiation of SEs in cotton. Additionally, stress-responsive genes and pathways have also been found to be involved in SE [21]. The ethylene response factor (ERF) plays an important role in hormone signal transduction and interconnecting different hormone pathways [25]. Despite these reports, we still know little about the molecular mechanisms underlying plant growth regulator-induced SE in cotton.
Scientists found that auxin plays an important role in this process [21] while using small RNA sequencing (sRNA-seq) and degradome sequencing to find multiple small RNAs and their targets in somatic embryos. The occurrence of differential expression [26] indicates that cotton somatic embryogenesis is regulated by complex gene expression. With the development of next-generation sequencing technology and the continuous improvement of cotton genome information [27,28], more than 5000 differential expression genes have been discovered in cotton somatic embryogenesis using digital gene expression (DGE) technology. Our primary strategy was to use transcriptome sequencing techniques with the YZ-1 genotype material of highly embryogenic to identify genes expressed during early induction of SE under induction culture treatment in the IBA concentration gradient medium.

Callus Differentiation Rate of Cotton Induced by Different IBA Concentration and KT Combinations
Hypocotyl segments 5-7 mm in size were cultured on an MS medium supplemented with 0.1 mg dm −3 2, 4-D and 0.1 mg dm −3 KT (this hormone combination is named C3 in the following description). After 6 weeks of culture, all explants were transferred to different IBA concentration gradient mediums to induce embryogenic callus. The rate of embryogenic callus induction was calculated after 30 days. The results show that the callus tissue differentiation rate was the lowest on the medium supplemented with 0.1 mg dm −3 KT alone. After adding different IBA based on the 0.1 mg dm −3 KT, the callus tissue differentiation rate increased significantly, especially in the 0.1 mg dm −3 KT + 0.025 mg dm −3 IBA medium ( Table 1). The use of KT and IBA in the differentiation process of cotton calli is beneficial.

Density Distribution of Reads on Chromosomes
The total reads were mapped to all of the chromosomes in the genome, as shown in Figure A1. The specific mapping method was used to calculate the number of reads aligned to the base position in the window, calculate its depth distribution on the chromosome, and take the log2 value. Under optimal conditions, the longer the length of the entire chromosome, the more reads localized within the chromosome. From the diagram of the relationship between the number of reads localized in the chromosome and the length of chromosome, the uniformity of sequencing can be seen more intuitively.

Screening of Differentially Expressed Genes
The Reads Per Kilo bases per Million reads (RPKM) method was used to calculate the gene expression level. The sequencing depth and gene length were also normalized, making the gene expression levels estimated for different lengths of genes at different sequencing depths comparable. Volcano plots were drawn based on the results of the tests and were screened according to the differential significance criteria (greater than two-fold changes in the expression of differential genes and FDR ≤ 0.05), and the statistically significant differences in gene expression were measured ( Figure 1). The volcano plot indicates that there are many genes up-and down-regulated in callus tissue compared to the C0 treatment group (Figure 1a,b). seen more intuitively.

Screening of Differentially Expressed Genes
The Reads Per Kilo bases per Million reads (RPKM) method was used to calculate the gene expression level. The sequencing depth and gene length were also normalized, making the gene expression levels estimated for different lengths of genes at different sequencing depths comparable.
Volcano plots were drawn based on the results of the tests and were screened according to the differential significance criteria (greater than two-fold changes in the expression of differential genes and FDR ≤ 0.05), and the statistically significant differences in gene expression were measured ( Figure 1). The volcano plot indicates that there are many genes up-and down-regulated in callus tissue compared to the C0 treatment group (Figure 1a

Overall Transcriptome Sequencing Analysis
In the process of cotton somatic embryogenesis, transcriptome sequencing analysis was used by high-throughput sequencing technology. Comparing with the database, 1542-11,235 genes were matched to the Non-Redundant Protein Sequence Database in GenBank (Nr), 1038-7564 genes were matched to the Gene Ontology (GO) database, and 333-2828 genes were matched to the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (Table 2).

Screening of Differentially Expressed Genes from Venn Diagram between Different Comparison Groups
Transcriptome analysis found 1211 common genes expressed during 2, 4-D-induced dedifferentiaion (24D-C3-0D) compared with series of IBA-induced redifferentiaion ( Figure 2). In the embryogenic responsive stage of redifferentiation, the higher IBA concentration (24D-C3-0D VS IBA-C2-5D) induced much more specific expression genes (4936) compared with 1876 genes in the lower IBA (24D-C3-0D VS IBA-C1-5D) ( Figure 2), but the callus differentiation rate was low (52.63% VS 72.22%) ( Table 1). The results suggest that excessive expression genes are activated in the higher concentration, but this is not beneficial to embryogenic responsiveness for redifferentiation (5 days). An appropriate and concerted amount of gene expression facilitates differentiation. As in the pre-embryonic initial period (20 days), the callus could adapt to different IBA concentrations-therefore, the specific expressed genes have little difference post responsive stage.

Type and Number of Alternative Splicing
Variable splicing allows a gene to produce multiple mRNA transcripts, and different mRNAs may be translated into different proteins. Therefore, multiple proteins can be produced by the variable splicing of a gene, which greatly increases protein diversity. ASprofile software (http://ccb.jhu.edu/software/ ASprofile/, v1.0.4, The Center for Computational Biology at Johns Hopkins University, Baltimore, MD, USA) was used to classify and count the variable splicing events of StringTie (v1.0.4)-predicted transcripts, as shown in Figure 3. There were many variable splicing events in all samples. The results show that the overall patterns of variable splicing events were similar among all samples, with more than 66% of the variable splicing events being concentrated in the TTS (Alternative 3 last exon), TSS (Alternative 5 first exon), and AE (Alternative exon ends (5 3 , or both) categories. The results indicate that there is no difference between the samples treated with different IBA concentrations. Variable splicing events proceed steadily and are not affected by different IBA concentrations.

Type and Number of Alternative Splicing
Variable splicing allows a gene to produce multiple mRNA transcripts, and different mRNAs may be translated into different proteins. Therefore, multiple proteins can be produced by the variable splicing of a gene, which greatly increases protein diversity. ASprofile software (http://ccb.jhu.edu/software/ASprofile/, v1.0.4, The Center for Computational Biology at Johns Hopkins University, Baltimore, MD, USA) was used to classify and count the variable splicing events of StringTie (v1.0.4)-predicted transcripts, as shown in Figure 3. There were many variable splicing events in all samples. The results show that the overall patterns of variable splicing events were similar among all samples, with more than 66% of the variable splicing events being concentrated in the TTS (Alternative 3′ last exon), TSS (Alternative 5′ first exon), and AE (Alternative exon ends (5′ 3′, or both) categories. The results indicate that there is no difference between the samples treated with different IBA concentrations. Variable splicing events proceed steadily and are not affected by different IBA concentrations.

Type and Number of Alternative Splicing
Variable splicing allows a gene to produce multiple mRNA transcripts, and different mRNAs may be translated into different proteins. Therefore, multiple proteins can be produced by the variable splicing of a gene, which greatly increases protein diversity. ASprofile software (http://ccb.jhu.edu/software/ASprofile/, v1.0.4, The Center for Computational Biology at Johns Hopkins University, Baltimore, MD, USA) was used to classify and count the variable splicing events of StringTie (v1.0.4)-predicted transcripts, as shown in Figure 3. There were many variable splicing events in all samples. The results show that the overall patterns of variable splicing events were similar among all samples, with more than 66% of the variable splicing events being concentrated in the TTS (Alternative 3′ last exon), TSS (Alternative 5′ first exon), and AE (Alternative exon ends (5′ 3′, or both) categories. The results indicate that there is no difference between the samples treated with different IBA concentrations. Variable splicing events proceed steadily and are not affected by different IBA concentrations.

Functional Annotation of Differentially Expressed Genes
GO level secondary gene annotation of differentially expressed genes (DEGs) during somatic embryogenesis in cotton was induced by applying different IBA serial concentration gradients. These results indicate that the relative differential gene expression patterns induced by different IBA concentrations are similar ( Figure A2), while the scale of differentially induced genes is greatly different ( Table 2).
When comparing different IBA concentrations in the C0-5D VS IBA-C1-5D, C0-5D VS IBA-C2-5D groups ( Figure A2a,b) at 5 d when plant hormone IBA was added, the number of differential genes increased with increasing concentrations of IBA. When comparing different IBA concentrations in the C0-20D VS IBA-C1-20D and C0-20D VS IBA-C2-20D groups ( Figure A2c,d) at 20 days when plant hormone IBA was added, the number of differential genes decreased with increasing concentrations of IBA. In the same concentration when comparing 5 days with 20 days, the number of differential genes decreased with the increase in culture time.

Enriched Metabolic Pathway in Comparison Groups with and without IBA
IBA had a great influence on the differentiation rate, and the callus differentiation rate was very low (7.81%) when IBA was not added. The differentiation rate upon the addition of IBA significantly increased (72.22%) nearly nine-fold.
To analyze the differential genes related to the IBA-induced cotton callus differentiation rate, we chose a combination of samples treated with IBA and without IBA to find the metabolic pathway that affected the differentiation rate. The high-differentiation combination selected was 0.1 mg dm −3 KT+ 0.025 mg dm −3 IBA, and the poorly differentiated combination was 0.1 mg dm −3 KT + 0 mg dm −3 IBA. At 5 d, the metabolic pathways for differentially expressed genes were significantly enriched in plant hormone signal transduction, zeatin biosynthesis, starch and sucrose metabolism, nitrogen metabolism, methane metabolism, metabolism of xenobiotics by cytochrome P450, fatty acid elongation, alcohol degradation, and ABC transporters ( Figure 4a). The plant hormone signal transduction pathway was highly enriched at 5 d, indicating that these series endogenous critical corresponding genes were induced and stimulated greatly by external IBA. At 20 days, the pathways with differentially expressed genes were significantly enriched in photosynthetic-antennary proteins, photosynthesis, mineral absorption, cell cycle, carbon fixation in photosynthetic organisms, alcohol degradation, and ABC transporters ( Figure 4b). The pathways enriched in photosynthesis, photosynthetic-antennary proteins, and carbon fixation in photosynthetic organisms indicated that genes related to photosynthesis were also essential in the cotton SE, especially at 20 days. The alcohol degradation pathway was significantly enriched in both periods, and these enriched differential genes were upregulated, which suggests that the alcohol degradation pathway may play a positive role in IBA-induced cotton somatic embryogenesis. The differentially expressed genes (photosystem II proteins PSBQ2, PSBS, PSBW, and ferredoxin SEND33) that were significantly enriched in photosynthetic pathways were downregulated at 20 days. Differential genes enriched in cell cycle pathways were upregulated, indicating that the photosynthetic pathway plays a negative regulatory role and the cell cycle pathway plays a positive regulatory role in somatic embryogenesis.

Enriched Differential Metabolic Pathway in Different IBA Doses
When adding different concentrations of IBA, the frequency of callus differentiation induced in cotton was also different. Compared with the ideal concentration of 0.025 mg dm −3 , a high concentration of 0.05 mg dm −3 inhibited embryonic differentiation. To analyze the difference in dose effects of IBA, we compared the concentration of 0 mg dm −3 , 0.025 mg dm −3 , and 0.05 mg dm −3 of IBA and found that the metabolism of xenobiotics by the cytochrome P450 pathway was enriched both in C0-5D VS IBA-C1-5D and IBA-C1-5D VS IBA-C2-5D (Figure 4a, Figure 5a). This indicated that the metabolism of xenobiotics by the cytochrome P450 pathway was dose-dependently expressed. In IBA-C1-5D VS IBA-C2-5D (Figure 5a), the specific enriched pathways were involved in secondary metabolisms such as steroid biosynthesis, flavonoid biosynthesis, and anthocyanin biosynthesis. Corresponding to the phenotypic results, the IBA concentration increases, while the differentiation rate of callus decreases. Therefore, we conclude that the secondary metabolites such as steroids, flavonoids, and anthocyanin may not be conducive to differentiation. At 20 days, photosynthesis-antenna pathway proteins were enriched in both C0-20D VS IBA-C1-20D and IBA-C1-20D VS IBA-C2-20D (Figure 4b, Figure 5b). In IBA-C1-20D VS IBA-C2-20D, the significantly enriched relevant pathways were phenylpropanoid biosynthesis, brassinosteroid biosynthesis and anthocyanin biosynthesis. We speculated that secondary metabolism affects the somatic embryogenesis of cotton. In particular, the results showed that in the IBA-induced cotton calli pre-embryonic initial period (20 days), the phenylpropanoid biosynthesis pathway, which has broad physiological activities and is related to plant growth regulation, was most highly enriched in comparison group of IBA-C1-20D VS IBA-C2-20D.   IBA-C1-20D VS IBA-C2-20D (Figure 4b, 5b). In IBA-C1-20D VS IBA-C2-20D, the significantly enriched relevant pathways were phenylpropanoid biosynthesis, brassinosteroid biosynthesis and anthocyanin biosynthesis. We speculated that secondary metabolism affects the somatic embryogenesis of cotton. In particular, the results showed that in the IBA-induced cotton calli pre-embryonic initial period (20 days), the phenylpropanoid biosynthesis pathway, which has broad physiological activities and is related to plant growth regulation, was most highly enriched in comparison group of IBA-C1-20D VS IBA-C2-20D.

Enriched Differential Metabolic Pathway with Different Culture Times
With the ideal IBA concentration of 0.025 mg dm −3 , we compared samples at 0 day, 5 days, and 20 days. In the embryogenic responsive stage (5 days), the photosynthesis, flavonoid biosynthesis, phenylalanine metabolism pathways were significantly enriched. And in the pre-embryonic initial period (20 days), the differentially enriched pathways were those of phenylpropanoid biosynthesis, MAPK signaling, glutathione metabolism, fatty acid elongation and brassinosteroid biosynthesis. (Figure 6a,b). Therefore, it is suggested that the pathways above and the corresponding signaling and regulatory genes essentially affect cotton somatic embryogenesis in low IBA concentrations.
When the IBA concentration was 0.05 mg dm −3 and we compared samples at 0 day, 5 days and

Enriched Differential Metabolic Pathway with Different Culture Times
With the ideal IBA concentration of 0.025 mg dm −3 , we compared samples at 0 day, 5 days, and 20 days. In the embryogenic responsive stage (5 days), the photosynthesis, flavonoid biosynthesis, phenylalanine metabolism pathways were significantly enriched. And in the pre-embryonic initial period (20 days), the differentially enriched pathways were those of phenylpropanoid biosynthesis, MAPK signaling, glutathione metabolism, fatty acid elongation and brassinosteroid biosynthesis. (Figure 6a,b). Therefore, it is suggested that the pathways above and the corresponding signaling and regulatory genes essentially affect cotton somatic embryogenesis in low IBA concentrations.  When the IBA concentration was 0.05 mg dm −3 and we compared samples at 0 day, 5 days and 20 days, it was found that the differentially enriched pathways were those of carbon metabolism, alcohol degradation, fatty acid degradation, circadian rhythm, carbon fixation in photosynthetic organisms, and biosynthesis of amino acids. The results show that in IBA-induced cotton embryogenic redifferentiation (0.05 mg dm −3 ), the carbon metabolism pathway is highly enriched in the comparison groups of 24D-C3-0D VS IBA-C2-5D and IBA-C2-5D VS IBA-C2-20D simultaneously (Figure 6c,d).
We speculated that dominant carbon metabolism sequentially affects somatic embryogenesis of cotton at high IBA concentrations.

Expression Profiles of Major Genes Associated with SE Regulation
Transcriptome analysis showed that the complicated and concerted IBA-induced multiple cellular pathways, as well as the corresponding signaling and regulatory genes, are responsible for plant growth regulator-induced SE. We comprehensively explored and mined the significantly representative regulatory pathway and essential genes for cotton SE enriched by KEGG analysis (Table 3), and candidate SE-related crucial transcription factor genes that were not enriched by KEGG (Table 4).

Candidate Gene Expression Pattern Validation
The differentially expressed genes analyzed in the expression profiling were verified using real-time fluorescence quantitative PCR. Six genes were randomly selected for validation. These six genes include endochitinases (CHIT1B), calcineurin B-like protein (CBL4), peroxidase (PER73), cytochrome P450 (CYP82A3), auxin response factor (ARF4), and auxin-inducible protein (AUX22B). Endochitinase (CHIT1B) plays an important role in somatic embryogenesis. For example, carrot embryogenic callus contains glycosylated acidic endochitinase, which can promote somatic embryogenesis. The endochitinase in sugar beet can also promote the early progression of this process. It is suggested that endochitinase is related to some somatic embryogenesis signal molecules. Calcineurin B-like protein (CBL4) was selected from the calcium signaling pathway. CBLs represent a family of plant calcium-binding proteins that function in calcium signaling by interacting with their interacting protein kinases. These roles are essential for plant growth and development. Cytochrome P450 (CYP82A3) is enriched in the brassinosteroid biosynthesis pathway. Peroxidase (PER73) was selected from phenylpropanoid biosynthesis pathway. Peroxidase is considered to take part in diverse plant processes, such as auxin metabolism, cell wall elongation and stiffening, and it is also related to different metabolic pathways to regulate somatic embryogenesis [29,30]. ARF4 and AUX22B genes were selected from the plant hormone signal transduction pathway, which are the two key transcription factors involved in regulating the expression of auxin-responsive genes. Auxin response factor (ARFs) is considered as essential SE genes that are specifically activated during embryogenic differentiation [31][32][33][34]. ARFs bind auxin response promoter elements, mediate transcriptional responses to auxin and regulate auxin-mediated transcriptional activation/repression, together with Aux/IAA [35,36]. Three repetitions of the biological experiments were performed. The results are shown in Figure 7. The results of the fluorescent quantitative PCR analysis are basically consistent with the gene expression patterns obtained by RNA-sequencing, indicating that the transcriptome data are reliable and that these genes are regulated during hormone-induced somatic embryogenesis in cotton. Their role can be determined by further studying their function.

Discussion
Somatic embryogenesis (SE) is a very complex biological process that is affected by genotypes, hormones, stress, light, mineral elements, nitrogen sources, and carbon sources in the medium. However, somatic cells under the influence of various internal and external factors initiate the expression of certain specific genes and then redifferentiate into embryonic cells. Comparative transcriptome analysis is useful to dissect the molecular mechanism of SE. The differentiation ability of plants is affected by the interaction of two factors in vivo and in vitro. Therefore, this experiment detected endogenous gene changes through exogenous IBA treatment and tried to find endogenous factors responding to exogenous IBA changes, to provide a reliable theoretical and experimental basis for cotton somatic embryogenesis.

IBA Embryogenic Inductive Effect and Dose Effect During Differentiation of Cotton SE
In this experiment, we found that different concentrations of IBA had different effects on embryonic differentiation, as demonstrated by the differentiation rate statistics in cotton calli. The results (Table 1) show that the combination of 0.1 mg dm −3 KT + 0.025 mg dm −3 IBA induced the highest differentiation rate of cotton. Our callus differentiation rate statistics for exogenously applied auxin and cytokinin were consistent with previous studies, i.e., the induction of EC and SEs, and the concentration of 2, 4-D in the medium was replaced by a combination of IBA and KT [24,37]. In this experiment, the role of auxin and cytokinin was not achieved alone, but by interacting with other pathways and components involved in the differentiation and development of SEs, such as their upstream TF regulators and hormone signaling pathways [38]. Wang et al. [39] found that, in cotton tissue cultures, the effect of IBA is stronger than that of the other auxins or auxin analogues. Comparing 0.025 mg dm −3 and 0.05 mg dm −3 , the high concentration of IBA inhibited callus differentiation. Under 0.1 mg dm −3 KT medium, the differentiation rate was lowest, further indicating that cytokinin alone inhibited cell differentiation and that appropriate ratio between auxin and cytokinin is necessary in callus culture. Moreover, these results indicate that the appropriate concentration of IBA is conducive to callus differentiation.
The investigation in this study shows that a series of different IBA concentrations induced similar relative differential gene expression patterns ( Figure A2), while the concentrations had different induction rates in cotton somatic embryogenesis (Table 1) and the amount of differentially induced genes were greatly different (Table 2). To resolve different IBA concentration induction rates, we analyzed the different IBA comparison groups under 0 mg dm −3 , 0.025 mg dm −3 , and 0.05 mg dm −3 concentrations. We found that the metabolism of xenobiotics by the cytochrome P450 and photosynthetic pathway shows a dose-dependent pattern. Meanwhile, the secondary metabolism and phenylpropanoid biosynthesis pathways were significantly enriched. These results indicate that these pathways could essentially affect the somatic embryogenesis of cotton. Previous studies have shown that secondary metabolites in the embryogenic callus are lower than in the nonembryonic callus. This indicates that secondary metabolism is relatively vigorous in a nonembryonic callus, which may affect the main metabolic rate and intensity and, ultimately, the cell differentiation, which is consistent with our findings. Especially, highly enriched phenylpropanoid biosynthesis in our study may play an important role in the SE. The data were consistent with and extended the broad physiological activities and related to plant growth regulation of phenylpropanoid metabolism.

Plant Hormone Signal Transduction and Alcohol Degradation Pathway in Embryogenic Induction Responsive Stage
The exogenous hormone is essential during cotton somatic embryogenesis [40]. The results in this study show that IBA has a great influence on the differentiation rate, and that the plant hormone signal transduction pathway is significantly enriched in the embryogenic induction responsive stage. It is suggested that the plant growth regulator IBA response generally involves initiating signals through receptor binding, transducing signals through complex downstream molecular events, and inducing changes in cell morphology and metabolism to achieve signal output. The signal of exogenous IBA is recognized by a receptor, thus triggering intracellular transduction and amplification of the signal, and will ultimately lead to cellular and organ responses. Hormone production, signaling and responses can affect plant growth and development, which are of high importance for cotton somatic embryogenesis.
In 1991, Perata P and Alpi A [41] detected the formation of 14C carbon dioxide by adding 14Clabeled ethanol to carrot cell suspensions. They thus deduced that carrot cells had the ability to degrade ethanol. Later, the scientists found that the tobacco callus is able to digest and metabolize ethanol. At the time of cultivation, the tobacco callus converts ethanol into acetic acid, which is almost harmless to plants [42]. Further experiments have also found that plant hormones have a significant regulatory effect on ethanol digestion and metabolism of plants. In this study, when the high differentiation rate IBA combination (0.025 mg dm −3 IBA) was compared with the low differentiation rate IBA combination (0 mg dm −3 IBA), we found that the alcohol degradation pathway was significantly enriched in both 5 days and 20 days cultures (Figure 4), and the genes enriched in this pathway were upregulated. We speculated that the alcohol degradation pathway plays a positive role in the regulation of IBA-induced cotton callus differentiation, which is consistent with previous studies.

Photosynthesis, Cell Cycle and SERK Pathway During Somatic Embryogenesis in Cotton
When the high differentiation rate of IBA combination (0.025 mg dm −3 IBA) was compared with the low differentiation rate of IBA combination (0 mg dm −3 IBA), the differentially expressed genes (the photosystem II proteins PSBQ2, PSBS, PSBW and ferredoxin SEND33) that were significantly enriched in the photosynthesis pathway were downregulated at 20 days of culture. The differentially expressed genes enriched in the cell cycle pathway (cell division regulatory protein sna41, CDC6B, DNA replication licenser MCM3, cell division cycle protein CDC20-1) were upregulated, indicating that photosynthesis may play a negative regulatory role in somatic embryogenesis, and that the cell cycle plays a positive regulatory role.
Some of the identified genes have been experimentally demonstrated to play an important role in SE [29,35,43,44]. Somatic embryogenesis receptor kinase (SERK) is involved in the signal transduction pathway during somatic embryogenesis and is a marker gene for the embryotic state. Schmidt et al. [45] screened a key gene regulating the transformation of somatic cells into embryonic cells in the hypocotyl regeneration of carrots, and this gene is named SERK1 (somatic embryogenesis receptor protein kinase 1). In this study, GhSERK1 (Gh_A01G0158) was found in four comparison groups, including 24D-C3-0D VS C0-5D, 24D-C3-0D VS IBA-C1-5D, 24D-C3-0D VS IBA-C1-20D, and 24D-C3-0D VS IBA-C2-20D. It was up-regulated in all of them, indicating that the up-regulated expression of SERK plays an important role in cotton somatic embryogenesis induction.

RNA Extraction, cDNA Library Preparation, and RNA-Seq
The total RNA of each sample was extracted by using the EASYspin Plus plant RNA rapid extraction kit (Aidlab Biotechnologies Co. Ltd., Beijing, China), following the manufacturer's protocol. The total RNA of each sample was quantified and verified with an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA), NanoDrop2000 (Thermo Fisher Scientific, Inc, Waltham, MA, USA) and 1% agarose gel electrophoresis. Next-generation sequencing libraries were prepared using NEBNext ® Ultra™ RNA Library Prep Kit for Illumina ® (NEB, USA), following the manufacturer's instructions. After library purification (Beckman Agencourt AMPure XP beads), the PCR products were cleaned up using AxyPrep Mag PCR Clean-up (Axygen, Union City, CA), validated using an Agilent 2100 Bioanalyzer, and quantified by Qubit 2.0 Fluorometer (Invitrogen, Carlsbad, CA, USA). Then, the libraries with different indices were multiplexed and loaded on an Illumina HiSeq instrument according to the manufacturer's instructions (Illumina, San Diego, CA, USA). Sequencing was carried out using a 2 × 150 bp paired-end configuration; image analysis and base calling were conducted using HiSeq Control Software (v2.0.5, HCS, Illumina, San Diego, CA, USA), Off-Line Base caller and GAPipeline-1.6 on the HiSeq instrument.

Mapping and Differential Expression Analysis
The filtered sequencing clean data were compared with the upland cotton reference genome (https://www.cottongen.org/species/Gossypium_hirsutum/nbi-AD1_genome_v1.1). Hisat2 (v2.0.1) was used to index the reference genome sequences. First, transcripts in fasta format were converted from known gff annotation files and indexed properly. Then, with the file as a reference gene file, HTSeq (v0.6.1) estimated the gene and isoform expression levels from the pair-end clean data. RPKM (Reads Per Kilo bases per Million reads) was applied to calculate the level of the gene expression indifferent groups. The read counts were normalized with DESeq2 (v1.6.3, open source, http://www.bioconductor.org/) [47]. Differentially expressed genes (DEGs) were defined as those with ≥2-fold change and FDR ≤ 0.05. The normalized sequencing data of three replicates with repeatability were used for analysis.

GO Functional Annotation and KEGG Enrichment Analysis
Gene Ontology (GO) terms describe cellular components, biological processes, and molecular functions of gene sets from differentially expressed genes. GO-TermFinder (v0.86, http://search.cpan. org/dist/GO-TermFinder/) was used to identify GO terms that annotate a list of enriched genes with a significant P-value of less than 0.05. KEGG (Kyoto Encyclopedia of Genes and Genomes) is a collection of databases dealing with genomes, biological pathways, diseases, drugs, and chemical substances (http://en.wikipedia.org/wiki/KEGG). We used scripts in house to determine which KEGG pathways the significantly differentially expressed genes were in.

Alternative Splicing Analysis
ASprofile v1.0.4 is a suite of programs for extracting, quantifying and comparing alternative splicing (AS) events from RNA-seq data. It took a GTF transcript file created by Cufflinks [48] as its input.

Validation of RNA-Seq Data by Real-Time RT-PCR
To verify the differentially expressed genes detected by the Illumina RNA-Seq data, real-time RT-PCR (qRT-PCR) was performed on a set of 3 replicates for each sample and the expression levels calculated were based on three technical replicates. Six genes were chosen for digital gene expression analysis. The gene-specific primers (Table A1) were designed using Primer Premier 5.0 software (http://www.premierbiosoft.com/primerdesign/, Premier Biosoft International, Palo Alto, CA, USA) and synthesized by the Shanghai Shenggong Company. The length of the products ranged was 100-300 bp. The cDNA was synthesized using EasyScript One-Step gDNA Removal and cDNA Synthesis SuperMix (TransGen Biotech, Beijing, China) in a 20 mm 3. reaction mixture according to the manufacturer's instructions. For reverse transcription, an RNA-amount of 5 µg was used. A total of 20 mm 3 reaction system as follows: Total RNA/mRNA 7 mm 3 , Anchored Oligo (dT) 18 Primer (0.5 µg mm −3 ) 1 mm 3 , 2 × TS Reaction mix 10 mm 3 , Tran Script RT/RI Enzyme Mix 1 mm 3 , gDNA Remover 1 mm 3 . The first step was to remove gDNA: added 5 × gDNA Buffer, Total RNA, Rnase-Free ddH 2 O, 65 • C incubation 5 min. Then, the solution with Anchored Oligo(dT) 18 Primer, 2 × TS Reaction mix, and Tran Script RT/RI Enzyme Mix added, was incubated at 42 • C for 15 min. Following this step, Trans Script RT/RI and gDNA Remover enzyme were heat-inactivated at 85 • C for 5 s and subsequently chilled on ice. The obtained cDNA was either subsequently processed or stored at −20 • C until use. qRT-PCR was performed in 15 mm 3 reactions on the Real-Time PCR Thermal Cycler (Analytik Jena AG, qTOWER 3 G, Germany), using 1.5 mm 3 of first-strand cDNA as the template, 7.5 mm 3 of 2 × UltraSYBR Mixture (with ROX) (CWBIO, Beijing, China), 1.2 mm 3 each of 10 µM forward and reverse gene-specific primer and 3.6 mm 3 of ddH2O. GhUB7 was used as the reference gene. The qRT-PCR conditions were as follows; pre-incubation at 95 • C for 10 min, followed by amplification by 38 cycles at 95 • C for 15 s, 60 • C for 30 s, and 72 • C for 30 s. A melting curve analysis was conducted to evaluate the primer specificity for each primer set to verify the presence of a single melting peak after amplification. The melting curve analysis conditions were as follows: 95 • C 15 s, 60 • C 1 min, 95 • C 15 s, 60 • C 15 s.