Comparative Transcriptomics of Non-Embryogenic and Embryogenic Callus in Semi-Recalcitrant and Non-Recalcitrant Upland Cotton Lines

Somatic embryogenesis-mediated plant regeneration is essential for the genetic manipulation of agronomically important traits in upland cotton. Genotype specific recalcitrance to regeneration is a primary challenge in deploying genome editing and incorporating useful transgenes into elite cotton germplasm. In this study, transcriptomes of a semi-recalcitrant cotton (Gossypium hirsutum L.) genotype ‘Coker312’ were analyzed at two critical stages of somatic embryogenesis that include non-embryogenic callus (NEC) and embryogenic callus (EC) cells, and the results were compared to a non-recalcitrant genotype ‘Jin668’. We discovered 305 differentially expressed genes in Coker312, whereas, in Jin668, about 6-fold more genes (2155) were differentially expressed. A total of 154 differentially expressed genes were common between the two genotypes. Gene enrichment analysis of the upregulated genes identified functional categories, such as lipid transport, embryo development, regulation of transcription, sugar transport, and vitamin biosynthesis, among others. In Coker312 EC cells, five major transcription factors were highly upregulated: LEAFY COTYLEDON 1 (LEC1), WUS-related homeobox 5 (WOX5), ABSCISIC ACID INSENSITIVE3 (ABI3), FUSCA3 (FUS3), and WRKY2. In Jin668, LEC1, BABY BOOM (BBM), FUS3, and AGAMOUS-LIKE15 (AGL15) were highly expressed in EC cells. We also found that gene expression of these embryogenesis genes was typically higher in Jin668 when compared to Coker312. We conclude that significant differences in the expression of the above genes between Coker312 and Jin668 may be a critical factor affecting the regenerative ability of these genotypes.


Introduction
Plant somatic embryogenesis (SE) is a unique developmental process that ultimately leads to the regeneration of a whole plant from single or multiple somatic cells [1]. This process involves sophisticated cellular reprogramming events that are controlled by gene expression programs and signaling pathways that direct callus cells to dedifferentiate, reprogram, and begin differentiation into polarized structures that eventually become a viable embryo [2][3][4][5]. SE is initiated by various factors, such as culture medium conditions, including concentrations, plant growth regulators (PGRs), and various stresses, such as plant wounding, temperature, and osmotic pressures [3,6,7]. Under SE initiation, somatic cells from various explant sources (e.g., hypocotyls, young leaves, and immature embryos)

RNAseq of Coker312 at NEC and EC Developmental Stages
Differential gene expression analysis revealed a total of 196 genes upregulated and 109 genes downregulated, respectively, in Coker312 EC when compared to Coker312 NEC, with at least 2-fold abundance difference and an adjusted p-value of 0.001, Supplemental Table S1.
Genes without known orthologs or predicted functional domains in model systems, such as Arabidopsis, were among the most differentially expressed (~seven-fold), as shown in Table 1. Important genes with functional annotations that were largely upregulated in EC cells included lipid transfer proteins, homeobox protein 31, homeobox-3 genes, early nodulin-like proteins, copper transporters, seed gene1, and APETALA2 (AP2) transcription factors, as shown in Table 1 and Supplemental Table S1. Genes with the largest downregulated expression profile (almost four-fold) during the transition to EC cells included mevalonate diphosphate decarboxylase 1, conserved peptide upstream open reading frame 9, serine protease inhibitors, expansin-like proteins, and nodulin transporter family proteins, as shown in Table 1 and Supplemental Table S1. Functional enrichment of the upregulated genes in Coker312 EC cells identified the biological processes involved in the biosynthesis of vitamins, such as thiamine; the transport of sucrose, copper, and lipids; and genes involved in embryo development, as shown in Figure 1 and Supplemental Table S2. In the molecular function category, the largest number of enriched genes was categorized as transcription factors (14), followed by lipid binding (8), growth factors (4), transporters (4), and hydrolase activity (3), as shown in Supplemental Table S2.   Gene enrichment analysis of the downregulated genes in EC callus in Coker312 revealed only a handful of enriched functional categories (Supplemental Table S2). Downregulated enriched functional categories were only represented by a few genes (<4) and included the response to biotic stimulus, response to defense and wounding, hydrolase activity, and a few others (Supplemental Table S3). Gene enrichment analysis of the downregulated genes in EC callus in Coker312 revealed only a handful of enriched functional categories (Supplemental Table S2). Downregulated enriched functional categories were only represented by a few genes (<4) and included the response to biotic stimulus, response to defense and wounding, hydrolase activity, and a few others (Supplemental Table S3).

Comparison of Coker312 to Jin668 at NEC and EC Stages
We compared the gene expression profiles of the semi-recalcitrant genotype, Coker312, to the previous transcriptomic study of a non-recalcitrant genotype, Jin668 [26]. Comparative analysis of the differentially expressed genes with at least a two-fold change profile and an error corrected p-value of 0.001 identified 151 unique genes in Coker312 and 2001 unique genes in Jin668 (Figure 2), with a total of 305 and 2155 differentially expressed genes in Coker312 and Jin668, respectively ( Figure 2). Grouping of the 2001 genes in Jin668 revealed the Cytochrome p450 family as the most abundant group (42 members), followed by lipid transfer (35 member), helix-loop-helix DNA-binding (32 members), aquaporins (29 members), MYB transcription factors (26 members), and AP2 (Supplemental Table S4). Aside from MYBs, we also discovered additional transcription factors, such as WRKY1 (16 members), bZIPs (12 members), PLATZ (5 members), and GATAs (3 members), as shown in Supplemental Table S4. We also identified genes related to histone maintenance, MADS-box genes, and methyltransferase (Supplemental Table S4).
Plants 2021, 10, x FOR PEER REVIEW We compared the gene expression profiles of the semi-recalcitrant g Coker312, to the previous transcriptomic study of a non-recalcitrant genotype, Jin Comparative analysis of the differentially expressed genes with at least a two-fold profile and an error corrected p-value of 0.001 identified 151 unique genes in C and 2001 unique genes in Jin668 (Figure 2), with a total of 305 and 2155 diffe expressed genes in Coker312 and Jin668, respectively ( Figure 2). Grouping of genes in Jin668 revealed the Cytochrome p450 family as the most abundant g members), followed by lipid transfer (35 member), helix-loop-helix DNA-bin members), aquaporins (29 members), MYB transcription factors (26 members), (Supplemental Table S4). Aside from MYBs, we also discovered additional tran factors, such as WRKY1 (16 members), bZIPs (12 members), PLATZ (5 memb GATAs (3 members), as shown in Supplemental Table S4. We also identified gene to histone maintenance, MADS-box genes, and methyltransferase (Supplemen S4). The 154 genes that are differentially expressed in both genotypes are of p interest. Interestingly, a duplicate gene pair (homoeologous gene copies on both t D subgenomes) with the highest expression values among each genotype was co of a bifunctional inhibitor of lipid transfer, followed by a tandemly duplicated chromosome 13 of the d-subgenome with an unknown annotation, as shown in and Supplemental Table S5. Other genes with high expressions in EC cells genotypes are lipid transfer proteins, genes with homeobox domains, and genes with histone proteins, as shown in Supplemental Table S5. A clustering analys genes based on their expression profiles showed aggregation by condition genotype, as shown in Figure 3. The 154 genes that are differentially expressed in both genotypes are of particular interest. Interestingly, a duplicate gene pair (homoeologous gene copies on both the A and D subgenomes) with the highest expression values among each genotype was comprised of a bifunctional inhibitor of lipid transfer, followed by a tandemly duplicated gene on chromosome 13 of the d-subgenome with an unknown annotation, as shown in Figure 3 and Supplemental Table S5. Other genes with high expressions in EC cells in both genotypes are lipid transfer proteins, genes with homeobox domains, and genes involved with histone proteins, as shown in Supplemental Table S5. A clustering analysis of the genes based on their expression profiles showed aggregation by condition and not genotype, as shown in Figure 3.
Functional gene enrichment of the 154 overlapping differentially expressed genes identified genes in enriched categories, such as lipid transport, embryo development, regulation of transcription, sugar transport, vitamin biosynthesis, growth factor activity, DNA binding, cell population proliferation, and others, as shown in Supplemental Table S6 and Figure 4. We also observed that, among the 154 overlapping genes, gene expression profiles were typically higher in Jin668 EC cells versus Coker312 EC cells, as shown in Figure 3. For example, the clusters of genes in Jin668 EC had a higher expression more often than those in Coker EC, as shown in Figure 3. Functional gene enrichment of the 154 overlapping differentially expressed genes identified genes in enriched categories, such as lipid transport, embryo development, regulation of transcription, sugar transport, vitamin biosynthesis, growth factor activity, DNA binding, cell population proliferation, and others, as shown in Supplemental Table  S6 and Figure 4. We also observed that, among the 154 overlapping genes, gene expression profiles were typically higher in Jin668 EC cells versus Coker312 EC cells, as shown Figure 3. For example, the clusters of genes in Jin668 EC had a higher expression m often than those in Coker EC, as shown in Figure 3. We also examined the expression profiles of genes known to have a role in soma embryogenesis, such as BBM, WUS, LEC1, WUSCHEL-RELATED HOMEOBOX 5 (WOX FUSCA3 (FUS3), and several other genes [26]. The genes with the sharpest fold chan were Gohir.D13G136000.1(LEC1−1), , a Gohir.A07G230400.1(FUS3_1) in Coker312 EC and Jin668EC, although the expression w higher in Jin668 EC in comparison of Coker312 EC, as shown in Figure 5. Howev Gohir.A08G227000.1(BBM−1) showed a high expression in Jin668 EC, while a much low expression was observed in Coker312 EC. Two other interesting gen Gohir.A10G233000.1(WOX5−1) and Gohir.D10G245300.1(WOX5), were upregulated Coker312 EC, Coker312 NEC, and Jin668 NEC, while they were almost off in Jin668 E as shown in  We also examined the expression profiles of genes known to have a role in somatic embryogenesis, such as BBM, WUS, LEC1, WUSCHEL-RELATED HOMEOBOX 5 (WOX5), FUSCA3 (FUS3), and several other genes [26]. The genes with the sharpest fold changes were Gohir.D13G136000.1(LEC1−1), Gohir.A13G132600.1(LEC1−2), Gohir.D07G237600.1(FUS3−2), Gohir.D08G035600.1(LEC1−3), and Gohir.A07G230400.1(FUS3_1) in Coker312 EC and Jin668 EC, although the expression was higher in Jin668 EC in comparison of Coker312 EC, as shown in Figure 5. However, Gohir.A08G227000.1(BBM−1) showed a high expression in Jin668 EC, while a much lower expression was observed in Coker312 EC. Two other interesting genes, Gohir.A10G233000.1(WOX5−1) and Gohir.D10G245300.1(WOX5), were upregulated in Coker312 EC, Coker312 NEC, and Jin668 NEC, while they were almost off in Jin668 EC, as shown in Figure 5. Several important transcription factors previously reported to have a role in somatic embryogenesis, such as Gohir.D03G115300.1(GRD/RKD), Gohir.D10G089500.1(WUS−3), Gohir.A12G059800.1(WUS−2), and Gohir.D12G060100.1(WUS−4), showed very little to no expression in either the genotype or developmental stages, as shown in Figure 5. In the Coker312, the highest upregulated gene was LEC1, followed by WOX5, ABSCISIC ACID IN-SENSITIVE (ABI3), FUS3, and WRKY2. In Jin668, LEC1, BBM, FUS3, and AGAMOUS-LIKE15 (AGL15) were the highly expressed genes, and their expression was several times higher in comparison to Coker312. Surprisingly, all the copies of WUS were either off or very lowly expressed in either the genotype or developmental stage.

RT-qPCR and Validation of RNAseq
RT-qPCR analysis of four critical embryogenesis genes were performed to valida the RNA-seq data, as shown in Supplemental Table S7. The relative expressions GhBBM, (Gohir.D08G247400.1), GhLEC1 (Gohir.D13G136000.1), GhWOX (Gohir.D10G245300.1), and GhWUS (Gohir.D10G089500.1) were measured in the NE and EC stages calli of Coker312 and Jin668, and results are presented in Figure  Consistent with the RNAseq data, LEC1 was the most highly upregulated in Jin668 E cells, followed by Coker312 EC, Jin668 NEC, and Coker312 NEC, respectively. WOX5 gen expression was the highest in Coker312 EC cells, followed by Jin668 NEC, Coker312 NEC and Jin668 EC, respectively. Interestingly, BBM, an important embryogenesis gene wa downregulated in the EC and NEC cells of Coker312, while it was highly upregulated Jin668 EC cells. In Coker312 NEC and Jin668 NEC cells, BBM expression was very low RT-qPCR data also validated WUS expression in Coker 312 and Jin668. In comparison the other embryogenesis genes, WUS expression was low in Jin668 EC cells, while it wa mostly off in Jin668 NEC, Coker 312 NEC, and Coker312 EC cells. The results of RT-qPC showed similar expression patterns of embryogenesis genes as the RNAseq data an

RT-qPCR and Validation of RNAseq
RT-qPCR analysis of four critical embryogenesis genes were performed to validate the RNA-seq data, as shown in Supplemental Table S7. The relative expressions of GhBBM, (Gohir.D08G247400.1), GhLEC1 (Gohir.D13G136000.1), GhWOX5 (Gohir.D10G245300.1), and GhWUS (Gohir.D10G089500.1) were measured in the NEC and EC stages calli of Coker312 and Jin668, and results are presented in Figure 6. Consistent with the RNAseq data, LEC1 was the most highly upregulated in Jin668 EC cells, followed by Coker312 EC, Jin668 NEC, and Coker312 NEC, respectively. WOX5 gene expression was the highest in Coker312 EC cells, followed by Jin668 NEC, Coker312 NEC, and Jin668 EC, respectively. Interestingly, BBM, an important embryogenesis gene was downregulated in the EC and NEC cells of Coker312, while it was highly upregulated in Jin668 EC cells. In Coker312 NEC and Jin668 NEC cells, BBM expression was very low. RT-qPCR data also validated WUS expression in Coker 312 and Jin668. In comparison to the other embryogenesis genes, WUS expression was low in Jin668 EC cells, while it was mostly off in Jin668 NEC, Coker 312 NEC, and Coker312 EC cells. The results of RT-qPCR showed similar expression patterns of embryogenesis genes as the RNAseq data and confirmed the RNAseq results.
Plants 2021, 10, x FOR PEER REVIEW Figure 6. Expression of embryogenesis related gens in the non-embryogenic cells (N embryogenic cells (EC) of Coker312 and Jin668 by RT-qPCR analysis. The relative expr embryogenesis-related gens GhLEC1, GhWOX5, GhBBM, and GhWUS were measure biological replicates and three technical replicates were used for statistical analysis. E indicate ±SE (n = 3). The ΔΔCt method was used for qPCR analysis. Asterisks indicate st significant differences compared with the Coker312 NEC: Student's t test; * p < 0.05; ** p < < 0.001.

Discussion
Upland cotton is one of the most important economic crops worldwide and p the largest source of renewable textile fiber. However, cotton is highly restricted t improvement via transformation and whole-plant regeneration through embryogenesis mainly because of the somaclonal variation in tissue culture, long regeneration via tissue culture, decline in vigor, and low potency of embryogen Moreover, regenerative capacity is highly genotype-dependent, and investigations on regenerable genotypes in cotton have not yielded many si advancements [23,32]. In several monocot species, such as rice and maize, embryogenesis has been examined at the transcriptional levels in both recalcit semi-recalcitrant species. These studies have identified several key transcription such as BBM, WUS2, LEC1, and LEC2, that have initiated somatic embryo formati ectopically expressed in recalcitrant genotypes, although the frequencies and embryo formation still remain low and slow, respectively [12,17,33,34].
A recent study revealed an upland cotton genotype, Jin668 with elite regeneration properties, such as a high frequency of embryo formation (~96%) a time to cellular differentiation (45-60 days), that was developed through su regeneration acclimation (SRA) [25]. The authors hypothesized that the rege Figure 6. Expression of embryogenesis related gens in the non-embryogenic cells (NEC) and embryogenic cells (EC) of Coker312 and Jin668 by RT-qPCR analysis. The relative expressions of embryogenesis-related gens GhLEC1, GhWOX5, GhBBM, and GhWUS were measured. Three biological replicates and three technical replicates were used for statistical analysis. Error bars indicate ±SE (n = 3). The ∆∆Ct method was used for qPCR analysis. Asterisks indicate statistically significant differences compared with the Coker312 NEC: Student's t test; * p < 0.05; ** p < 0.01; *** p < 0.001.

Discussion
Upland cotton is one of the most important economic crops worldwide and produces the largest source of renewable textile fiber. However, cotton is highly restricted to genetic improvement via transformation and whole-plant regeneration through somatic embryogenesis mainly because of the somaclonal variation in tissue culture, long in vitro regeneration via tissue culture, decline in vigor, and low potency of embryogenesis [24]. Moreover, regenerative capacity is highly genotype-dependent, and previous investigations on regenerable genotypes in cotton have not yielded many significant advancements [23,32]. In several monocot species, such as rice and maize, somatic embryogenesis has been examined at the transcriptional levels in both recalcitrant and semi-recalcitrant species. These studies have identified several key transcription factors, such as BBM, WUS2, LEC1, and LEC2, that have initiated somatic embryo formation when ectopically expressed in recalcitrant genotypes, although the frequencies and time to embryo formation still remain low and slow, respectively [12,17,33,34].
A recent study revealed an upland cotton genotype, Jin668 with elite somatic regeneration properties, such as a high frequency of embryo formation (~96%) and rapid time to cellular differentiation (45-60 days), that was developed through successive regeneration acclimation (SRA) [25]. The authors hypothesized that the regenerative potential (totipotency) is a trait that is encoded in the genome, but is epigenetically suppressed in most genotypes, leading to genotype-specific recalcitrance [25]. As a follow-up, we compared the global gene expression at two key developmental stages (NEC and EC) in the Jin668 genotype to identify the genes necessary for cellular reprogramming and the transition to EC cells [26]. We identified a sharp upregulation of several transcription factors that likely have a major role in regulating the shift from NEC to EC with subgenome bias in this allotetraploid species [26].
In this study, we collected transcriptome data from the semi-recalcitrant genotype Coker312. Coker312 is considered semi-recalcitrant because of its long time to embryo formation (90-120 days) and low frequency of embryo formation (<15%). The most upregulated genes in Coker312 EC cells are annotated as lipid transfer proteins (LTPs), homeobox-3 genes, AP2 transcription factors, early nodulin-like proteins, and copper transporters. Previous studies have demonstrated that the LTPs are involved in the formation of a protective layer of cutin in the cell wall, surrounding the young embryo, and are implicated in the initiation of somatic embryogenesis [35]. LTPs are also abundantly expressed in the epidermis of developing tissues and play an important role in fiber elongation [36]. Earlier studies have also identified AP2, a super-family transcription factor, that may have a role in callus formation [37], and also contributes to biotic and abiotic stress resistance in cotton [38]. In addition, some APETALA 2/ethylene-responsive element binding factors (AP2/ERFs) are implicated in growth and developmental processes mediated by growth hormones such as gibberellins (GAs), cytokinins (CTK), and brassinosteroids (BRs) [39]. AP2/ERFs may also have a role in the hormone sensing and signaling pathways important to cellular reprogramming during the transition from NEC to EC in upland cotton, as demonstrated by the results presented here.
Our previous data identified several thousand differentially expressed genes during the transition from NEC to EC in the non-recalcitrant genotype Jin668 [26]. Comparative transcriptome analysis between Jin668 and the semi-recalcitrant genotype, Coker312, could provide a much smaller 'candidate set' of key genes by examining the overlap between the two genotypes. In the EC calli of both Coker312 and Jin668, the d-subgenome-encoded homolog of LEC1 had the highest expression, indicating subgenome expression bias and a primary role in somatic embryogenesis in upland cotton. LEC1 was described as a master regulator that shapes embryo development in Arabidopsis [40]. In other studies, LEC1 has been described as a central regulator, controlling different parts of embryo morphogenesis and photosynthesis as well as seed development [41]. WUS is a morphogenic regulator that has been shown to induce or stimulate cellular differentiation in a range of species, such as Arabidopsis [16], Zea mays [34], and Medicago [42]. In cotton, the WUS gene had little to no expression in either genotype and developmental stage, suggesting that other genes have a more primary role in stimulating the transition from NEC to EC cells. In contrast, WOX5 may have a more primary role than WUS in cotton SE. WOX5 is a transcription factor that has been shown to be a regulator of a pool of pluripotent stem cells in the apical meristem [43] and has endowed gain-of-function mutants with somatic embryo formation in Arabidopsis [16]. WOX5 is expressed during different stages of embryogenesis and post-germination growth stages [44,45]. In both Coker312 and Jin668 NEC cells, WOX5 displays a moderate expression level and is upregulated even further in Coker312 EC cells. However, WOX5 is downregulated in Jin668 EC cells, which suggests that it may be an upstream regulator of cellular reprogramming and that early expression of this gene is necessary for the transition to embryo formation. Earlier studies have reported that BBM plays an important role in transcription of LEC1, LEC2, ABI3, and FUS3 [34]. In Jin668, BBM is highly expressed, while in Coker312, it shows less expression. Critical difference in the expression of morphogenic regulator BBM may be one of the reasons for the different regenerative ability of both genotypes. The genes ABI3 and FUS3 are also expressed in EC callus of both upland cotton genotypes ( Figure 5) [26]. These genes are transcription factors of LAFL genes [33]. The ectopic expression of ABI3 did not result in successful embryo development, but it has a reported role in embryo programming, being activated by BBM [33]. The ectopic expression of FUS3 results in cotyledon-likes leaves, while LEC1 and LEC2 overexpression results in the spontaneous development of somatic embryos turning into plantlets [46,47]. In Coker312, the most highly upregulated genes in EC cells are a tandem array of genes on chromosome D13 with expression profiles that are mostly off in NEC. We also observed several genes that were highly upregulated in EC cells that have only recently been implicated with somatic embryogenesis, such as early nodulin-like protein 3 [48] and copper transporters. The upregulation of these genes has been described as a stress response in regeneration systems [49].
In Jin668, we observed a large number of unique genes (2001, Figure 2) that are also important to discuss. The most abundant group was the Cytochrome p450 family, which are important proteins that participate in the metabolism of most plant growth regulators (PGRs) [50], which may be under epigenetic control [51]. Lipid transfer was the second most abundant category, and these genes have been shown to strongly correlate with early morphogenic processes [52]. Transcription factors, including MYBs, helix-loop-helix, AP2, WRKY1, bZIPs, PLATZ, and GATAs, were also in abundance in Jin668. The expression of each of these transcription factors (except PLATZ) has been implicated in the induction of somatic embryogenesis in previous studies [53][54][55][56][57]. However, the list is quite extensive, and comparisons with Coker312 can identify a shorter list of conserved candidate genes critical to SE in cotton.
Previous work has shown that GRF2 is strongly expressed in the developing tissues of the shoot apical meristem in the upper stems and root tips. Further, GRF2 is required for the coordination of cell division and differentiation during leaf development in Arabidopsis [58]. LEA are a large group of hydrophilic proteins that play a major role in drought stress tolerance in upland cotton and are required for normal growth and development. These proteins are mostly expressed during abiotic stresses, such as in cold, drought, and highsalinity conditions [59,60], but may function in callus cells in response to the tissue-culture microenvironment.
In plant transformation and regeneration systems, both frequency and time to embryo formation are critical factors. In agrobacterium-mediated transformation, it is important to note that agrobacterium stress and selection pressure results in a reduction in embryo formation efficiency and an increase in the duration to form an embryo when compared to simply regenerating a whole plant without transformation. In Jin668 and Coker312, this stress typically adds~8 weeks. However, embryo formation frequency in Jin668 still remains high (>80%), while, in Coker312, embryo formation drops to less than 15%, and the probability of transformed plants thorough SE is very low. In part, this may be due to differences in the expression levels of the key genes between the two genotypes. When analyzing the global trends in gene expression, a general upregulation of nearly the same transcription factors was found, but with a much higher expression profile in Jin668.

Plant Material and Callus Growth Conditions
The semi-recalcitrant genotype used in this study, Coker312, was obtained from the USDA Crop Germplasm Collection, College Station, TX. Seeds were surface sterilized and cultured in germination bottles on germination media [26] and kept in the dark for 7 d. The hypocotyls were excised from the 7-day-old, aseptic, etiolated seedlings, cut into 5-7 mm pieces, and cultured as explants on callus induction media containing Murashige and Skoog . The conditions used for these callus cultures included: 28 ± 1 • C, 16 h (day)/8 h (night) photoperiod, 1000 ppm CO 2 , with light provided by cool-white fluorescent lamps at an irradiation of 60 µmol m −2 s −1 and 50% relative humidity. Plates were rotated on alternate days for equal light distribution. After testing for 10 days in the growth chamber, whole calli were harvested at two different stages, i.e., NEC (45-day-old calli) and EC (90-day-old calli with ball-shaped embryo structures) into RNAlater stabilization and storage solution. Whole calli were kept at room temperature for 24 h and moved to −80 • C for longer storage.

RNAseq
A total of four biological replications for each stage were used for mRNA sequencing. Total RNA was extracted from Coker312 callus material following the guanidine thiocyanate method described by the authors of [61]. RNA integrity and concentration were assayed on an BioAnalyzer2100 (Agilent) and considered high quality with RNA Integrity Number (RIN) values ≥ 7 and total masses (≥2.0 µg total RNA) for all biological replicates. Sequencing libraries were prepared following the standard protocols of the Illumina TruSeq Stranded RNA kit. Transcriptome sequences were collected on an Illumina NovaSeq to a depth of at least 40 million read pairs per replicate sample. Raw sequences were preprocessed to remove adapter and low-quality bases with Trimmomatic software v.0.38 [61]. Cleaned reads were mapped to the Gossypium hirsutum (TM1 v.2.0) reference assembly [62] using the Bowtie2 short-read aligner [63]. Transcript abundance was quantified with RSEM [64], and differentially expressed transcripts were determined with edgeR [65]. Because cotton is an allotetraploid species with highly identical subgenomes, genes are expected to be in multiple copies and may be expressed with bias at the subgenome level [62]. It is important to note that genes described in Figure 5 were assigned a gene name, and the annotated gene and its expression value were derived from primary transcripts of the V2.0 assembly described by Chen et al., 2020 [62].

Reverse Transcription-Quantitative PCR (RT-qPCR)
Reverse transcription-quantitative PCR (RT-qPCR) was carried out to validate the RNAseq data [66]. Four differentially expressed embryogenesis genes (GhLEC1, GhWOX5, GhBBM, and GhWUS) and three biological reps of each NEC and EC stage callus of Coker312 and JIn668 were used for the RT-qPCR. In total, 1 µg of the total RNAs was used to synthesize the first strand of cDNA using the M-MuLV reverse transcriptase (New England Biolabs, USA) and primed by d(T)25-VN as per the manufacturer's instructions. RT-qPCR of gene transcripts was carried out on an iCycler iQ system (Bio-Rad, Hercules, CA, USA) in 20 µL of PCR reaction solution using the Luna Universal qPCR Master Mix, New England Biolabs, USA. Thermal cycling conditions comprised of initial denaturation at 95 • C for 60 s, followed by 40 cycles of 95 • C for 20 s, 62 • C for 20 s, and 72 • C for 20 s. Finally, a unique melting curve was performed from 55.0 • C to 95.0 • C in 0.5 • C increments to amplify a unique PCR product. Two reference genes, GhPP2A1 [67] and GhUBQ7 [25], were used to normalize the expression data. The Ct values of three technical samples for each of the three biological replicates were used to calculate the relative expression of genes using the 2 −∆∆Ct equation [68]. All primer pairs, except for the reference genes, were designed from the conserved coding sequence and listed in Supplemental Table S7.

Conclusions
In this study, comparative transcriptome profiling of two upland cotton genotypes that differ in regenerative capacity and developmental timing revealed a short list of candidate genes whose expression and expression abundance are critical for somatic embryogenesis in upland cotton. The genes LEC1, BBM, FUS3, AGL15, ABI3, and WOX5 were commonly expressed in Coker312 and Jin668 EC cells. These results provide a foundation for candidate gene testing (in various combinations) for their role in initiating somatic embryogenesis in recalcitrant cotton genotypes.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3 390/plants10091775/s1, Table S1: Differentially expressed genes in Coker312 in NEC compared to EC cells with at least 2-fold change and error corrected p-values less than or equal to 0.001, Table S2: Functional enrichment categories of genes upregulated in EC callus cells in Coker312,   Data Availability Statement: All sequence data reported in this manuscript can be found in the NCBI sequence read archive (SRA) under BioProject study PRJNA747913 (Samples: SRR15186777-ARR15186784).

Conflicts of Interest:
The authors declare no conflict of interest.