Comparative Transcriptome Analysis between Ornamental Apple Species Provides Insights into Mechanism of Double Flowering

: Double-ﬂower ornamental crabapples display eye-catching morphologies in comparison to single ﬂower, but the genetic basis of double-ﬂower development is not yet well known in apples. In order to comprehensively understand the differential expression of genes (DEGs) between single and double ﬂower, the transcriptome of double ﬂower crabapples Malus Kelsey, Malus micromalus, Malus Royalty, and a single ﬂower cultivar Malus Dolgo were compared by RNA-sequencing. The results showed that there were 1854 genes in overlapped DEGs among all sample comparisons in apple single and double ﬂower varieties. A large number of development and hormone related DEGs were also recognized on the basis of GO and KEGG annotations, and most of the genes were found to be down-regulated in double ﬂowers. Particularly, an AGL24-MADS-box gene (MD08G1196900) and an auxin responsive gene (MD13G1137000) were putatively key candidate genes in the development of double ﬂower by weighted gene co-expression network analysis (WGCNA). The study provides insights into the complex molecular mechanism underlying the development of the double ﬂower in apple.


Introduction
Double flower ornamental crabapples are one of the most eye-catching entity of plant landscape [1][2][3] but, little is known about the molecular mechanisms of double flowering in apple. The well-established ABCE model on flower development clearly explains how a few key regulatory genes, principally the ABC(E) genes, could control floral organ identity in double flowers [4,5]. In addition, researches on model plants Arabidopsis and Antirrhinum showed that the development of double flower is associated with plant hormones and environmental factors, such as temperature [6][7][8][9]. In previous studies, many specifically expressed transcription factor (TF) genes were identified during the double flowering process, besides, hormones also play a vital role in double flowering [10][11][12], suggesting a complex gene regulatory network underlying flower development regulating different floral forms [13]. Although, several double flowering related genes including MADS box and hormone Agronomy 2019, 9,112 3 of 17 The genome was then mapped to the reference genome database of apple available at Genome Database for Rosaceae (http://www.rosaceae.org) [28]. An index of the reference genome was built using Bowtie2 [29], and paired-end clean reads were aligned to the reference genome using HISAT (Hierarchical Indexing for Spliced Alignment of Transcripts) software v2.0.4 [30]. We used StringTie [31] to reconstruct transcripts. After that, we used Cuffmerge [32] and Cuffcompare [32] respectively to reconstruct the information of all the samples together and compare the reconstructed transcripts to reference annotations available through the genome browser [33]. By using CPC software [34], protein coding potential of the new transcripts was predicted, then coding novel transcripts were merged with reference transcripts to get a complete reference for further analysis.

Differentially Expressed Gene Detection
We used RNA-seq by expectation-maximization (RSEM) [35] to calculate the expression level of genes and transcripts. After that, cor and hclust functions of R were used to calculate the Pearson correlation and hierarchical clustering between all the samples [36]. The expressional abundances in each library were normalized to transcript per million, where, the most differentially expressed genes (DEGs) were selected. The significance of the differential transcript expression was determined by using the False Discovery Rate (FDR) control method [37] to justify the p-values. The only genes with an absolute log2 fold change [38] of ≥+1 (up-regulated) and ≤−1 (down-regulated), and a FDR significance score of ≤0.01, were included in subsequent analysis. For heat maps, the log10FPKM values were calculated for all DEGs, normalized between −1.5 to +1.5, and used as an input to the pheatmap function of R.

Gene Enrichment Analysis
To describe the characteristics and reaction features of DEGs that were retrieved from samples, we conducted gene ontology (GO) analysis and also performed GO functional enrichment using phyper, a function of R. FDR was calculated for each p-value, in general, the terms with FDR ≤ 0.01 were identified as significantly enriched. For pathway analysis, we mapped all the DEGs in terms of the Kyoto Encyclopedia of Genes and Genomes (KEGG) and retrieved the significantly enriched pathway with p-value ≤ 0.05 [39]. With KEGG annotation results, we classified the DEGs accordingly and performed pathway enrichment using the phyper function of R. To find the transcription factors (TFs) of DEGs, we used getorf [40], then aligned ORF to TF domain (from PlntfDB) by using hmmsearch [41], and coded according to the characteristics of the TF family.

Co-Expression Gene Models
All the genes expressed in three replicates of each sample were first used to calculate the mean FPKM values of respective samples. After that, genes having normalized reads, lower than 0.1 FPKM were considered "too lowly expressed genes" and were removed from analysis. Leftover genes after filtration, were then subjected to calculate the DEGs between DFK, DFM, DFR, and SFD. Furthermore, DEGs with log2FoldChange ≥1 and ≤−1 were used for sample comparisons of all double flowers to single flower (SFD). The weighted gene co-expression network analysis (WGCNA) R package was used to analyze and perform pairwise correlations between a limited number (~3000 genes) of most differentially expressed genes. The steps involved in analysis and module construction are described in detail by [42]. Clusters of highly interconnected genes i.e., modules, were constructed and to further determine the biological meaning of the identified co-expression module, the up/down-regulation of the gene within the modules were analyzed [43]. The co-expression network was further visualized using the free software Cytoscape [44].

Validation of RNA-seq Data
Different genes from DEGs were selected to validate the RNA-Seq data. The corresponding sequences were retrieved from the apple genome database, available at Genome Database for Rosaceae (GDR) [45] (http://www.rosaceae.org). Primers for the selected genes were designed using Beacon Designer-8 software. Moreover, the total RNA from the samples was isolated by using TaKaRa MiniBEST universal RNA Extraction Kit (Cat#9767 TaKaRa, Shiga, Japan). First-strand cDNA was synthesized from 1 µg of total RNA, using PrimeScript RT reagent kit with gDNA Eraser (perfect real time) (Cat#RR047A TaKaRa, Shiga, Japan), and was diluted 10 folds with ddH 2 O before further use as a template for quantitative real-time PCR (qRT-PCR) assays. qRT-PCR was performed in 96-well plates on BIO-RAD CFX Connect Real-Time System using SYBR Premix DimerEraser (Cat#RR091A Clontech, Madison, WI, USA). Actin was used as an internal control during qRT-PCR, along with NTC (no template control), to get more accurate results [46]. The thermal cycling conditions were set at 95 • C for 3 min, followed by 30 cycles of 94 • C for 20 s, 60 • C for 30 s and 72 • C for 30 s. The relative expression level of the selected genes was calculated with the 2 −∆∆CT method [47].

RNA Sequencing of Different Samples
In order to have a summary of the transcriptome, RNA-seq libraries from single and double flowers were generated using RNA samples from four different species of ornamental crabapples ( Figure 1a) in three replicates. We acquired in total reads ranging from 53.56 to 57.16 Mb with a ratio of total clean reads ranging from 77.73% to 83.92% ( Figure 1b, Table 1), while the average ratio of gene set rate was 78.10%. The high quality and rate of genome coverage of the present data exposed that it could be used for further bioinformatics study. Afterwards, the blast tool was used to identify the transcripts based on homology to the reference genes. About 42,009 genes were detected, including 39,628 known genes, and 2381 novel genes predicted in total. The data for each sample is listed in Table S1 of the Supplementary Materials. Venn analysis for replicates of each sample provided the common genes, which were used for downstream analysis (Figure 1c). Hierarchical cluster analysis between all the samples showed a high and indistinguishable genetic similarity between DFR and SFR from the same tree ( Figure 1d).  to the total number of raw reads; low quality reads: Reads with a quality value lower than 15 reads accounted for the total base-base ratio; clean reads: Ratio of number of filtered reads to the total raw reads. (c) Venn diagram analysis between each sample replica groups for common expressed genes. (d) Hierarchical cluster analysis between samples. The closer a sample to each other, the more similar the expression profile of the sample was.

Transcriptome Dynamics of Double Flowering
By comparing the transcript levels of each gene between single flower SFD compared to all the other samples (DFK, DFM, DFR, and SFR), we generated heat maps for DEGs of all the samples shown in Figure 2. The heat maps for SFR and DFR, showed the highest similarity of expression pattern to each other and also to other double flowers. In double flowers, along with SFR, most of the DEGs were observed as downregulated as compared to SFD, which indicated that in double flowers most of the expressed genes were downregulated. Afterwards, more detailed pairwise comparisons between all the samples were made to understand the pattern of expressed genes in double flowers. Analyzing on basis of p-value ≤ 0.05 and log2fold change (log2FC) ≥ 1 or ≤ −1, a total of 71933 DEGs in all comparative transcriptomes were retrieved. In DFK-VS-SFD 5041 DEGs were down-regulated while 3279 were up-regulated, in DFM-VS-SFD 3247 DEGs were down-regulated while 2983 were up-regulated, in DFR-VS-SFD 4652 DEGs were down-regulated while 3455 were up-regulated, in SFR-VS-SFD 4950 genes were down-regulated while 3142 were up-regulated, in DFK-VS-SFR 3786 genes were down while 3878 up-regulated, in DFM-VS-SFR 3209 genes were down-regulated while 4720 genes were up-regulated, in DFR-VS-SFR 2 genes were down-regulated while 69 were upregulated, in DFK-VS DFM 5008 were down-regulated while 3635 were up regulated, in DFK-VS- to the total number of raw reads; low quality reads: Reads with a quality value lower than 15 reads accounted for the total base-base ratio; clean reads: Ratio of number of filtered reads to the total raw reads. (c) Venn diagram analysis between each sample replica groups for common expressed genes. (d) Hierarchical cluster analysis between samples. The closer a sample to each other, the more similar the expression profile of the sample was.

Transcriptome Dynamics of Double Flowering
By comparing the transcript levels of each gene between single flower SFD compared to all the other samples (DFK, DFM, DFR, and SFR), we generated heat maps for DEGs of all the samples shown in Figure 2. The heat maps for SFR and DFR, showed the highest similarity of expression pattern to each other and also to other double flowers. In double flowers, along with SFR, most of the DEGs were observed as downregulated as compared to SFD, which indicated that in double flowers most of the expressed genes were downregulated. Afterwards, more detailed pairwise comparisons between all the samples were made to understand the pattern of expressed genes in double flowers. Analyzing on basis of p-value ≤ 0.05 and log2fold change (log2FC) ≥1 or ≤−1, a total of 71933 DEGs in all comparative transcriptomes were retrieved. In DFK-VS-SFD 5041 DEGs were down-regulated while 3279 were up-regulated, in DFM-VS-SFD 3247 DEGs were down-regulated while 2983 were up-regulated, in DFR-VS-SFD 4652 DEGs were down-regulated while 3455 were up-regulated, in SFR-VS-SFD 4950 genes were down-regulated while 3142 were up-regulated, in DFK-VS-SFR 3786 genes were down while 3878 up-regulated, in DFM-VS-SFR 3209 genes were down-regulated while 4720 genes were up-regulated, in DFR-VS-SFR 2 genes were down-regulated while 69 were up-regulated, in DFK-VS DFM 5008 were down-regulated while 3635 were up regulated, in DFK-VS-DFR 4751 genes were down-regulated while 4291 were up-regulated and in DFR-VS-DFM 4408 genes were down-regulated while 3427 were up-regulated (Table S2) DFR 4751 genes were down-regulated while 4291 were up-regulated and in DFR-VS-DFM 4408 genes were down-regulated while 3427 were up-regulated (Table S2)    DFR 4751 genes were down-regulated while 4291 were up-regulated and in DFR-VS-DFM 4408 genes were down-regulated while 3427 were up-regulated (Table S2)

Gene Ontology and Kyoto Encyclopedia of Genes and Genome (KEGG) analysis
According to the results of differential gene detection, we performed gene ontology (GO) classification and functional enrichment for DEGs. GO has three functional categories: Cellular component, molecular function, and biological process for which we performed functional enrichment analysis. The detailed GO analysis of DEGs between single flower and double flower was performed, and significantly enriched GO terms were obtained by using false discovery rate (FDR) ≤ 0.01 ( Figure 4). All the cumulative genes of double flowers were compared to SFD and were grouped into 49 GO terms; including 21 biological process terms, 15 cellular components and 13 molecular function related terms. For biological process, metabolic process and cellular processes were dominant terms; for the cellular component, the dominant terms were cell, cell part and organelle; and for molecular function, a large percentage of genes were related to catalytic activity, binding, and transporter activity (Table S3).

Gene Ontology and Kyoto Encyclopedia of Genes and Genome (KEGG) analysis
According to the results of differential gene detection, we performed gene ontology (GO) classification and functional enrichment for DEGs. GO has three functional categories: Cellular component, molecular function, and biological process for which we performed functional enrichment analysis. The detailed GO analysis of DEGs between single flower and double flower was performed, and significantly enriched GO terms were obtained by using false discovery rate (FDR) ≤ 0.01 ( Figure 4). All the cumulative genes of double flowers were compared to SFD and were grouped into 49 GO terms; including 21 biological process terms, 15 cellular components and 13 molecular function related terms. For biological process, metabolic process and cellular processes were dominant terms; for the cellular component, the dominant terms were cell, cell part and organelle; and for molecular function, a large percentage of genes were related to catalytic activity, binding, and transporter activity (Table S3). Numerous flower-development and flowering genes play a vital role in entire flower development cycle [48]. In total, 11 DEGs in double flowers showed similarity to known flower development and flowering related genes within the NCBI and Uniport databases (Table S4). According to this data, three flowering promoting factor genes (MD15G1109300, MD16G1170700, and flowering related genes within the NCBI and Uniport databases (Table S4). According to this data, three flowering promoting factor genes (MD15G1109300, MD16G1170700, and MD08G1131700) showed higher expression levels in double flowers than in single flower. Furthermore, two early flowering protein genes (MD01G1152700 and MD11G1074300), two flowering time control genes (MD17G1123300 and MD16G1150700), one flowering locus T gene (MD12G1262000), one flowering locus C gene (MD10G1041100), one protein terminal flower-like gene (MD14G1021100), and one Protein MOTHER of FT and TF (MFT) were also identified.
Based on the GO terms (GO: 0009908 and GO: 0009909), 25 DEGs were identified as flower development and regulation of flower development-related genes (Table S5). Most of the genes were up-regulated in double flowers but interestingly one flowering gene, a predicted glycerol-3-phosphate 2-O-acyltransferase (MD04G1148700), was very highly expressed in double flowers as compared to single flower Dolgo.
The KEGG pathways signified by all the DEGs, were predicted to further set the involvement of 135 KEGG pathways in flower organogenesis (Table S6). In all the sample comparisons the highest genes representation was observed in metabolic pathways with average of 1556 genes followed by biosynthesis of secondary metabolites with an average of 870 genes, plant-pathogen interaction with average of 345 genes, and endocytosis with 284 genes. In KEGG pathways analysis the trend of regulation of pathways enrichment showed that almost all the differentially enriched pathways were down-regulated in double flowers ( Figure 5). The KEGG pathways signified by all the DEGs, were predicted to further set the involvement of 135 KEGG pathways in flower organogenesis (Table S6). In all the sample comparisons the highest genes representation was observed in metabolic pathways with average of 1556 genes followed by biosynthesis of secondary metabolites with an average of 870 genes, plant-pathogen interaction with average of 345 genes, and endocytosis with 284 genes. In KEGG pathways analysis the trend of regulation of pathways enrichment showed that almost all the differentially enriched pathways were down-regulated in double flowers ( Figure 5).

Recognizing TF-Encoding Genes Associated with Floral Organ Development
TF genes seems to be involved in the regulation of various physiological systems in response to floral organogenesis and flower development [49]. All DEGs were predicted with transcription factor. In total 2923 TF genes were identified as DEGs and classified in 60 different TF families. In all the sample groups, the top largest differentially expressed TF families were: MYB (402 members), followed by MYB-related (334 members), AP2-EREBP (218 members), bHLH (191 members), NAC (177 members), WRKY (118 members) and MADS (116 members) as shown in Figure 6. Furthermore, several hormone-related TFs were also identified as DEGs, including 20 GRASs and 31 ARFs.

Recognizing TF-Encoding Genes Associated with Floral Organ Development
TF genes seems to be involved in the regulation of various physiological systems in response to floral organogenesis and flower development [49]. All DEGs were predicted with transcription factor. In total 2923 TF genes were identified as DEGs and classified in 60 different TF families. In all the sample groups, the top largest differentially expressed TF families were: MYB (402 members), followed by MYB-related (334 members), AP2-EREBP (218 members), bHLH (191 members), NAC (177 members), WRKY (118 members) and MADS (116 members) as shown in Figure 6. Furthermore, several hormone-related TFs were also identified as DEGs, including 20 GRASs and 31 ARFs.

Recognizing DEGs Related to Hormones
A huge number of DEGs involved in hormones metabolism and interactions were identified between double flowers and single flower comparison. The KEGG pathway analysis assigned the DEGs to key elements involved in numerous hormone signaling pathways. Interestingly, an outsized variety of hormone-related DEGs belonged to auxin, gibberellic acid (GA), brassinosteroid (BRs), and cytokinin (CK) signaling pathways. An overview of the different hormonal signaling networks in apple flowers is shown in Figure 7. Most hormone related DEGs were preponderantly expressed in SFD as compared to double flower. MADS box genes play a very pivotal role in double flowering [18,50]. In this study, we identified 116 MADS-box genes, in which 46 DEGs were significantly expressed on basis of their log2FC values (Table S7). Among these, 16 genes like; AGL15, AGL18, AGL17, SOC1, AGL24, and SVP belongs to MIKCc group of MADS family, while several other MADS genes like; AGL80, AGL61, AGL62, FBP24, AGL12, AGL14, SEP1/MADS8, MADS15, MADS22, PHERES2, ZMM17, CMB1, and JOINTLESS were also identified (Supplementary Figure S1). Firstly, DEGs associated with auxin were analyzed. For auxin biosynthesis and metabolism, 8 YUCCA genes were categorized as DEGs. Seven putative auxin influx carrier-encoding AUX1/LAX genes and 17 putative GH3 genes were categorized as DEGs, while for auxin transport, 25 efflux carrier-encoding genes were identified as DEGs. For downstream response, 27 putative auxinrepressed genes AUX/IAA and 27 auxin response factor ARF-encoding genes were identified as DEGs. After that, DEGs associated with GA were identified and analyzed. For GA biosynthesis, 52 putative DEGs encoding GA20OX, GA2OX, and GA3OX were identified, while 7 putative DEGs were annotated as GA receptors GID1. For GA downstream responses, nine putative DELLA, two F-box protein GID2, nine phytochrome-interacting factors3, and eight phytochrome-interacting factors4responsive genes were identified as differentially expressed. For ABA biosynthesis, 42 genes were identified, which were differentially expressed, encoding CYB707A, ABA2, ABA1, AOG, and NCED, while for receptors and transporters 2 PYL-encoding genes were identified. For downstream responses, 12 DEGs were annotated as PP2C, SnRK2, and ABA responsive element binding factor (ABF) genes. Furthermore, 79 putative genes were identified as DEGs and mapped to CK pathway (Figure 7a). Based on reads per kilobase per million mapped reads (RPKM) values, the expression level of the hormone-related genes were analyzed and shown in Figure 7b. The expression levels of most hormone-related genes in the double flower varieties were less than those in single flowers (Table S8). Firstly, DEGs associated with auxin were analyzed. For auxin biosynthesis and metabolism, 8 YUCCA genes were categorized as DEGs. Seven putative auxin influx carrier-encoding AUX1/LAX genes and 17 putative GH3 genes were categorized as DEGs, while for auxin transport, 25 efflux carrier-encoding genes were identified as DEGs. For downstream response, 27 putative auxin-repressed genes AUX/IAA and 27 auxin response factor ARF-encoding genes were identified as DEGs. After that, DEGs associated with GA were identified and analyzed. For GA biosynthesis, 52 putative DEGs encoding GA20OX, GA2OX, and GA3OX were identified, while 7 putative DEGs were annotated as GA receptors GID1. For GA downstream responses, nine putative DELLA, two F-box protein GID2, nine phytochrome-interacting factors3, and eight phytochrome-interacting factors4-responsive genes were identified as differentially expressed. For ABA biosynthesis, 42 genes were identified, which were differentially expressed, encoding CYB707A, ABA2, ABA1, AOG, and NCED, while for receptors and transporters 2 PYL-encoding genes were identified. For downstream responses, 12 DEGs were annotated as PP2C, SnRK2, and ABA responsive element binding factor (ABF) genes. Furthermore, 79 putative genes were identified as DEGs and mapped to CK pathway (Figure 7a). Based on reads per kilobase per million mapped reads (RPKM) values, the expression level of the hormone-related genes were analyzed and shown in Figure 7b. The expression levels of most hormone-related genes in the double flower varieties were less than those in single flowers (Table S8).

Weighted Gene co-Expression Analysis (WGCNA)
A WGCNA Analysis was performed to identify the genes related to double flowering. The similarity of genes that were expressed across conditions (single flower and double flower), governed the double flowering response. After filtering from lowly expressed genes, the expression values from remaining 2969 DEGs were analyzed and 9 WGCNA modules were identified. The number of genes per module ranged from 5 (grey module) to 1483 (turquoise module) (Figure 8a,b).

Weighted Gene co-Expression Analysis (WGCNA)
A WGCNA Analysis was performed to identify the genes related to double flowering. The similarity of genes that were expressed across conditions (single flower and double flower), governed the double flowering response. After filtering from lowly expressed genes, the expression values from remaining 2969 DEGs were analyzed and 9 WGCNA modules were identified. The number of genes per module ranged from 5 (grey module) to 1483 (turquoise module) (Figure 8a,b).
The analysis of the module-trait relationship revealed that the turquoise module was highly correlated with the double and single flower genes (r = −1 & 1; p = 2 ×10 −10 ). The correlation coefficients with the same expression level but opposite expression pattern (i.e., negative in the case of the double flower) indicated that most of the genes were downregulated in double flowers with respect to single flower. In this module, we found 11 structural genes involved in the double flowering mechanism (Table S9). The expression pattern of these genes were highly connected to each other, even when we used the stringent edge weight cut-off of 0.1 (Figure 8c, Table S10). Specifically, MD08G1196900 which is a MADS-box gene (AGL24), had the highest number of edges connecting to other genes, followed by MD13G1137000, which is an auxin responsive protein IAA31 gene.  The analysis of the module-trait relationship revealed that the turquoise module was highly correlated with the double and single flower genes (r = −1 & 1; p = 2 × 10 −10 ). The correlation coefficients with the same expression level but opposite expression pattern (i.e., negative in the case of the double flower) indicated that most of the genes were downregulated in double flowers with respect to single flower. In this module, we found 11 structural genes involved in the double flowering mechanism (Table S9). The expression pattern of these genes were highly connected to each other, even when we used the stringent edge weight cut-off of 0.1 (Figure 8c, Table S10). Specifically, MD08G1196900 which is a MADS-box gene (AGL24), had the highest number of edges connecting to other genes, followed by MD13G1137000, which is an auxin responsive protein IAA31 gene.

Validation of the Randomly Selected Genes Expression
To verify the RNA-seq results of gene differential expression, genes MD16G1148300, MD08G1196900, MD15G1014200, MD17G1259400, MD10G1007100, MD07G1174000, MD14G1054500, MD12G1198600, MD06G1022600, MD09G1074100, MD13G1137000, and MD03G1116000 (details are listed in Table S11), were selected from the data to perform qRT-PCR assay with independent samples collected from single flower and all double flower varieties. The expression results are attached in Table  S12 of the Supplementary Materials. All the genes showed expression levels in sync with RNA-seq data (Figure 9). Correlation analysis showed that qRT-PCR results were highly correlated to RNA-seq results ( Figure S2). In short, the expression levels of these selected genes were basically consistent with RNA-seq results and thus, validated the RNA-seq analysis results.

Validation of the Randomly Selected Genes Expression
To verify the RNA-seq results of gene differential expression, genes MD16G1148300,  MD08G1196900,  MD15G1014200,  MD17G1259400,  MD10G1007100,  MD07G1174000,  MD14G1054500, MD12G1198600, MD06G1022600, MD09G1074100, MD13G1137000, and  MD03G1116000 (details are listed in Table S11), were selected from the data to perform qRT-PCR assay with independent samples collected from single flower and all double flower varieties. The expression results are attached in Table S12 of the Supplementary Materials. All the genes showed expression levels in sync with RNA-seq data (Figure 9). Correlation analysis showed that qRT-PCR results were highly correlated to RNA-seq results ( Figure S2). In short, the expression levels of these selected genes were basically consistent with RNA-seq results and thus, validated the RNA-seq analysis results. Figure 9. Real-time quantitative PCR validation profiles of 12 randomly selected genes from flowers of different varieties of apple along with single and double flowers from same tree; the data was normalized by using actin as an internal reference. Data are represented as mean ± SD for three biological replicates.

Discussion
The evolution of double flower in apple is intriguing, but little is known about the molecular mechanism of double flower morphology. In our study, we carried out a comparative transcriptome analysis of flowers of three double flowered cultivars to a single flower cultivar, and acquired significant DEGs. Due to the fact that several transcription factor families, such as the MADS-box, ARF, bHLH, and NAC, were associated with floral development and organogenesis [51][52][53][54], we performed a detailed analysis about differential transcription factors. Particularly, several evidence of involvement of AGAMOUS-like C-class genes repression by transposable element (TE) insertion could enhance the petal number in ranunculid, Arabidopsis and many other species [55][56][57]. In apple, RNAi of apple AG-like led to showy polypetalous flowers [18]. In this study, our data extended previous findings that five AGL24-like MIKC-type MADS-box genes were down-regulated in double flowers, which may be involved in double flowering. Interestingly, among these, a MADS-box (MD08G1196900) was predicted as most significant in terms of interactions, correlations and WGCNA analysis, but its function remains to be investigated. Additionally, overexpression of PISTILLATA gene MADS8 contributes to the conversion of sepals into petal-like structures [58][59][60], and over expression of E-class SEPALLATA were involved in the floral organs transition from stamens to petals in Arabidopsis and some other plants [61,62]. Here, we identified three up-regulated genes (a MADS8 and two SEP3-like) in double flowers, which provide a new clue to further investigate floral transitions in double flower apple in future.
The regulation network of double flower in apple is also complex. Various endogenous environmental signals are related to an array of cellular and biochemical processes throughout floral organs formation [8]. Among these signals, the phytohormones, such as auxins, gibberellins (GAs), ethylene, abscisic acid (ABA) are endogenously occurring compounds involved in floral organ development and transitions [63]. Our enrichment analysis of KEGG pathway showed that the DEGs were considerably enriched to hormone signaling pathways associated with signal transduction and phytohormone metabolism processes throughout the double flowering [63][64][65]. Also, several environmental factors like low temperature, induces DNA methylation and regulates the hormones that alters the petal number in certain flowering plants [66][67][68]. In the present study, the AGAMOUS genes were down-regulated in double flowers and so was the ARF3, while the cytokinin was up-regulated in double flowers, which was in agreement with reports [69,70]. In addition, 52 putative GA20OX, GA2OX, and GA3OX genes and 34 putative DELLA genes were identified as DEGs suggesting the involvement of diverse regulatory mechanisms in GA biosynthesis during apple flower development [71,72]. However, the more complex problem is that there are both single flower and double flowers in the same tree, such as royalty cultivar (Figure 1). This causes us to think more deeply and discover the methylome of royalty through bisulfite sequencing in future.

Conclusions
In our study the RNA-Seq data analysis between a single flower and three double flower species was carried out, which led to the identification of double flower-specific expression patterns of flowering related genes focusing on morphogenesis of flower. Furthermore WGCNA analysis and subjection of genes to Cytoscape analysis led to the detection of two key genes MD08G1196900 (AGL24-which is a MADS-box gene) and MD13G1137000 (IAA31-which is an auxin responsive protein gene) indicating their strong involvement in the double flowering mechanism. Besides, these DEGs could be used for further experiments to validate the double flower mechanism in apple plants.
Supplementary Materials: The following are available online at http://www.mdpi.com/2073-4395/9/3/112/s1, Table S1: statistics of predicted genes, Table S2: statistics of DEGs between samples, Table S3: GO classification of  all sample comparisons, Table S4: flowering related genes differentially expressed between all sample comparisons, Table S5: the gene list of GO:0009908 & GO:0009909 differentially expressed between all sample comparissons, Table S6: the KEGG pathways represented by all the DEGs, Table S7: DEGs related to MADS-box TFs in all sample comparisons, Table S8: hormones regulation in double flower varieties-VS-single flower control variety, Table S9: