Comparative Transcriptome Analysis Unveils the Molecular Mechanism Underlying Sepal Colour Changes under Acidic pH Substratum in Hydrangea macrophylla

The hydrangea (Hydrangea macrophylla (Thunb). Ser.), an ornamental plant, has good marketing potential and is known for its capacity to change the colour of its inflorescence depending on the pH of the cultivation media. The molecular mechanisms causing these changes are still uncertain. In the present study, transcriptome and targeted metabolic profiling were used to identify molecular changes in the RNAome of hydrangea plants cultured at two different pH levels. De novo assembly yielded 186,477 unigenes. Transcriptomic datasets provided a comprehensive and systemic overview of the dynamic networks of the gene expression underlying flower colour formation in hydrangeas. Weighted analyses of gene co-expression network identified candidate genes and hub genes from the modules linked closely to the hyper accumulation of Al3+ during different stages of flower development. F3′5′H, ANS, FLS, CHS, UA3GT, CHI, DFR, and F3H were enhanced significantly in the modules. In addition, MYB, bHLH, PAL6, PAL9, and WD40 were identified as hub genes. Thus, a hypothesis elucidating the colour change in the flowers of Al3+-treated plants was established. This study identified many potential key regulators of flower pigmentation, providing novel insights into the molecular networks in hydrangea flowers.


Introduction
Flower colour, one of nature's most magnificent displays, plays an important role in attracting animal pollinators, and is therefore crucial for plant ecology and evolution [1]. Among the various plant pigments, carotenoids and flavonoids are the most common and diverse types [2]. However, apart from attracting pollinators, anthocyanins and other flavonoid compounds that share a common biosynthetic pathway may also act as defensive agents or compounds that protect against various biotic and abiotic stresses [3]. Thus, vealed that mutations in the coding regions of ScCHI1/2 and ScbHLH17 prevented the formation of anthocyanin in yellow and white Senecio cruentus cultivars. Differences in the branched metabolic flux of pelargonidin (Pg), cyanidin (Cy), and delphinidin (Dp)-type pathways are determined by the competition for naringenin between ScF3 5 H, ScDFR1/2, and ScF3 H1 [28]. After studying the transcriptome, Chen, et al. [29] observed that Al exposure upregulated 730 genes in the leaves and 4287 genes in the roots of hydrangea, while it downregulated 719 genes in the leaves and 236 genes in the roots. From the data of metabolomic and transcriptomic analyses of S. miltiorrhiza flowers, a total of 100 unigenes that coded for 10 enzymes were recognized as the candidate genes linked with anthocyanin production. Decreased ANS gene expression lowered the anthocyanin content but led to an increased buildup of flavonoids in S. miltiorrhiza flowers [30]. Other studies have reported a connection between colour expression and DNA methylation in other species. A recent study using molecular markers of SSR and MSAP suggests that DNA methylation may be part of the molecular mechanism causing the change in the colour of hydrangea sepals in response to acidic pH [11]. Hyper methylation of the MdMYB10 promoter initiates striped colouration due to an increased anthocyanin concentration in Malus domestica fruit. Alternatively, varying amounts of promoter methylation of the anthocyanindin synthase gene resulted in varied red or white flower colouration in the ornamental plant Nelumbo nucifera [31,32].
Since transcriptome and targeted metabolomic technologies have proven to be powerful tools for elucidating the mechanisms of colouration in various ornamental plants, we used these technologies to investigate differentially expressed genes (DEGs) during different developmental stages in the infertile flowers of Al 3+ -treated hydrangea and thus to elucidate the molecular pathways driving colour alteration. Global gene expression profiles were examined, with an emphasis on genes involved in anthocyanin biosynthesis and flavonoid metabolism, and the regulatory networks were established. This is the first comprehensive transcriptome and metabolome study of hydrangea flower colour variation at acidic pH. These discoveries will aid in the breeding of multi-coloured hydrangea and several other hydrangea species, as well as in the functional characterisation of genes and proteins of interest.

Change of Sepal Colour under Different Soil pH
The sepal of plants grown in acidic soil (enriched with aluminum sulfate) changed from pale yellow at stage (I) or early flowering (EB, S1) to blue-violet at stage (S3) or full flowering (FB), as shown in Figure 1A. In the plants grown in untreated soil (control group, C), sepals were initially pale yellow at stage (I) or early flowering (EB). At stage (II, S2) or mid-flowering (MB), the margins of the sepals turned light pink, and at full flowering, pink ( Figure 1B). In addition, several common types of anthocyanidins comprising cyanidin, delphinidin, malvidin, pelargonidin, and petunidin, involved in colour development, were quantified. High levels of delphinidin, petunidin, and malvidin were found in TS3, with levels of 4.8 µg/g fresh weights (FW), 1.9 µg/g FW, and 1.5 µg/g FW, respectively. In contrast, high levels of cyanidin and pelargonidin were detected in the pink flowers of CS3 ( Figure 1C). This suggests that treatment causes differential accumulation of anthocyanidin content.

Transcriptome Sequencing, Annotation, and Analysis of DEGs
Using RNA-Seq, a sum of 72 million reads with a total nucleotide count of 328,854,166 bp (38.24 GB) was obtained. A total of 34.79 Gb of clean reads was acquired after cleaning and quality verification, with each library producing at least 5.78 Gb of clean reads. Q30 percentages were 91.76%, 92.89%, 90.94%, 91.87%, 93.91%, and 90.98%, respectively. These findings demonstrated that the quality of RNA-Seq was suitable for further investigation (Table S2). Successively, the de novo assembly produced 342,068 contigs and 186,477 unigenes, with an N50 of 903 nt and 794 nt. There were 109,316 unigenes between 200 and 500 nt (58.61%), 48,027 unigenes between 500 and 1000 nt (25.75%), and 7,459 unigenes longer than 2000 nt (4%) ( Table 1).

Functional Annotation
The assembled unigenes were examined using eight public databases (i.e., NR, egg NOG, KOG, KEGG COG, GO, Swiss-Prot, and Pfam), with an E-value cut-off that was more than 1 × 10 −5 , to functionally annotate the transcriptome. Using this method, the annotation of 76 Figure 2A). According to the statistical examination of the E-value features distributed in the Nr annotation, 88% of the assigned sequences exhibited strong homology (E-value < 10 −100 ) and 12% had exceptionally strong homology (E-value 100 −150 ) to the identified plant sequences ( Figure 2B). Figure 2C shows the distribution of the top 24 species for the best match from each sequence. Blast2GO software was used to categorize 128,376 unigenes into 43 functional categories based on Nr annotation, with 18 GO terms relating to biological processes, 12 to cellular components, and 13 to molecular functions ( Figure S1). KOG analysis was employed for analysing orthologous categorisation and the evolutionary rates of genes. The results showed that 78,993 unigenes (55.41% of all annotated unigenes) aligned with 25 KOG classifications (E-value cutoff 1 × 10 −3 ). Among the different categories, a considerable number of unigenes were involved in clusters for the biosynthesis, transport, and catabolism of secondary metabolites (48.25%), followed by transcription (16.45%), protein turnover, posttranslational modification, chaperones (10.22%), and chromatin structure and dynamic regulation (8.10%). Only small proportions (less than 1%) of unigenes were assigned to extracellular structure, uncertain function, and cell modification. There were also higher proportions of genes associated with translation, ribosomal structure and biogenesis (7.05%), signal transduction mechanism (5.92%), DNA methylation (5.94%), and the mitochondrial DNA metabolic process (4.22%) ( Figure 3). Transcripts with normalized read counts less than 0.5 FPKM were excluded from the study. CS1, CS2, and CS3 were discovered to express 28,365, 28,242, and 28,088 unigenes, respectively. Likewise, 27,810, 27,726, and 27,711 unigenes were found in treated samples from the different sepal maturation phases. Figure 4A shows the number of expressed transcripts dispersed in the 0.5-1 FPKM, 1-10 FPKM, and 10 FPKM ranges. The gene expression level correlation coefficient between the three biological replicate samples was greater than 0.73. The results of principal component analysis (PCA) indicated that the classification of the 18 samples could be easily classified into six groups: CS1, CS2, CS3, TS1, TS2, and TS3 ( Figure 4B). The control and treated samples of the same developmental stage showed a distant clustering relationship, indicating that there was a clear distinction between the whole transcriptome profile of control and treatment at each developmental stage ( Figure 4B).

DEG Identification and Functional Enrichment Analysis
The variations in gene expression were examined with the comparison of the three different sepal maturation stages, using thresholds of more than log2 (fold change) ≥2 and adjusted p-value less than 0.05 [33]. This resulted in a total of 896 DEGs (i.e., TS1 vs. CS1, TS2 vs. CS2, and TS3 vs. CS3, Figure 5A). TS2 vs. CS2 (814 DEGs) had the most DEGs among the three comparisons, with 380 and 434 unigenes up-and down-regulated, respectively. In contrast, TS3 vs. CS3 (41) had the fewest DEGs, with 18 and 23 unigenes upand down-regulated, respectively ( Figure 5B). Among all identified DEGs, the assignment of the 621 DEGs was made to one or more GO terms, and these DEGs revealed information about the molecular events that occur during the development of sepal, particularly colour formation. All DEGs were mapped to the GO database using the TopGO software (v2.12.0) to find highly enriched terms when related to the genomic background, using a corrected p-value of 0.01 (Fisher's exact test) as the cutoff value. Among the three Gene Ontology categories, there were a total of ten enrichment GO keywords ( Table 2). Under the biological process group, the most notable enrichment GO terms were pigment biosynthetic process, metabolic process, L-phenylalanine catabolic process, anthocyanin-containing, flavonoid biosynthetic process, response to abiotic stimulus, and pattern specification process. In the molecular function, catalytic activity, oxidoreductase activity, transporter activity, binding, peroxidase activity, electron carrier activity, and transcription factor activity were the most enriched whereas, in the cellular component (CC), intracellular and organelle cells were the most enriched.  Understanding the relationship between the biological processes and the genes can be aided by pathway analysis. In TS1 vs. CS1, TS2 vs. CS2, and TS3 vs. CS3, the number of DEGs enriched among KEGG pathways was 26, 127, and 34, respectively, which were assigned to 15, 57, and 12 metabolic pathways, respectively. The most enriched 20 metabolic pathways were explored ( Figure 6). In the comparison between TS1 and CS1, the biosynthesis of flavones, flavonols, isoflavonoids, and glucosinolates were enriched in TS1. In the comparison between TS2 and CS2, the biosynthesis of flavonoids, anthocyanins, stilbenoids, diarylheptanoids, and gingerol, as well as the biosynthesis of secondary metabolites, were increased in TS2. The biosynthesis of the metabolites flavone, flavonol, and phenylalanine was always significantly different at different developmental stages. The biosynthesis of flavonoids is predominant in the developmental stage S1 (CS1 and TS1), whereas the biosynthesis of anthocyanins was predominant in the developmental stage S2 (CS2 and TS2). Flavonoid biosynthesis was significantly enriched in TS1 compared with CS1, whereas anthocyanin was more enriched in TS2 compared with CS2. Al 3+ treatment could induce anthocyanin biosynthesis rather than flavonoid biosynthesis to produce the blue colouration in hydrangea flowers. The carotenoid and isoflavone biosynthesis pathways, the other two metabolic processes involved in the floral colour formation, were also detected in the KEGG enrichment pathway. From these metabolic pathways, information about the pigment metabolism at three different colouring stages of Hydrangea macrophylla can be gleaned. The blue hue of Al 3+ -treated flowers may be closely associated with these metabolic pathways.

Identification of TFs and Establishment of Gene Co-Expression Network Analysis (WGCNA)
Using BLASTX (E-values cutoff 1 × 10 −5 ), the assembled unigenes were aligned with the plant transcription factor database (PlantTFDB) and a total of 88 TFs from eight TF families were found. The MYB TF family had the most members (25), followed by the WD40 (20 TFs), HD-ZIP (14 TFs), bHLH (11 TFs), C2H2 (10 TFs), AP2-ERF (5 TFs), and NAC (3 TFs) families (Table S3). WGCNA was used to construct co-expression gene network modules to further investigate potential unigenes associated with pigmentation transition during the successive developmental stages of the two experimental conditions ( Figure 7A, note: in the dendrogram, each module is represented by a branch, while each gene is shown as a leaf). The co-expression network constructed using the 621 DEGs that remained after eliminating the low-expression unigenes from the total 896 DEGs was integrated into 10 modules. The largest of which was the light blue module with 345 unigenes, and the smallest contained only 26 unigenes (dark green). Figure 7B shows the unigene distribution in each module (indicated with different colours) and the connections between module and trait. The 9 out of 11 DEGs associated with anthocyanin biosynthesis and 2 of 12 DEGs related to flavonoid metabolism are included in the brown module. This indicates that the brown module unigenes play a key role in anthocyanin and flavonoid metabolism. We were particularly interested in the modules that were enriched in the control or treatment groups, especially blue and pink in S2, which aid in distinguishing the flower colour phenotype caused by an environmental pollutant. The modules of interest were therefore selected based on |r| > 0.5 and p ≤ 0.05 criteria and then annotated using KEGG and GO analysis. The light green module was closely associated with TS2. Many colour formation pathways were enriched in the light green module (p ≤ 0.01). The three major metabolic pathways were phenylpropanoid biosynthesis (ko00940, 30 DEGs), anthocyanin biosynthesis (ko00942, 19 DEGs), and flavonoid biosynthesis (ko00941, 10 DEGs) (Table S4). Pearson correlation coefficients of structural genes and transcription factors were determined with SPSS version 17.0 based on their FPKM values. The FPKM values of the unigenes of the transcription factor HymMYB2 and the two WDR40 unigenes were positively (p ≤ 0.01) and negatively (p ≤ 0.01) correlated with the values of CYP73A, F3 5 H, C3H, C2H2, DFR, and ANS FPKM, respectively. In addition, the FPKM level of a WDR68 unigene was correlated negatively with the levels of DFR and F3H (p ≤ 0.01) ( Table S5). The transcription factor HymMYB2, together with other transcription factors such as WER-like and WDR40, might play an important role in the formation of the blue colour of infertile hydrangea flowers.  Factors of interest (x-axis) were correlated with each module (y-axis). The first value in each square is the correlation and the second value in parentheses is the p-value of the association. The more positively correlated the module and factor, the redder the square; the more negatively correlated, the bluer. The coding of the factors was as follows: CS1, CS2, CS3, TS1, TS2, TS3, C, and T. * Significant at p < 0.05; ** Significant at p < 0.01; *** Significant at p < 0.001, (-) means no content. Table 3 shows the expression patterns of 15 potential genes based on closed modules. In summary, in the treated plants, all five PALs were down-regulated during sepal maturation, whereas in the untreated plants, they initially remained constant or decreased and later increased. Moreover, their relative expression levels were significantly higher in treated individuals S1-S3 than in the untreated groups. PAL9 and PAL6 were found to be putative hub genes for the dark green module. 4CL12 and 4CL14 were shown to be enriched in the dark green module, and 4CL12 was recognized as a possible hub gene for this module. The significantly higher expression of the 4CL12 gene in S1-S3 of the treated plants, when compared to the untreated groups, indicated that 4CL18 had a crucial role in the signalling pathway. The orange module had three enriched CHSs, with CHS2 and CHS4 recognized as possible hub genes that showed similar changes in expression during different phases in the treated and untreated groups. The relative expression levels of CHS1 and CHS3 in CS3 were 2.3 and 1.5-fold higher than in TS3, respectively. We also searched for two CHRs, which were enriched in these critical modules and discovered that the change in the expression patterns of CHR1 and CHR3 were compatible with the enriched CHSs. In addition, these modules enriched F3 H4, F3 H3, F3 H2, F3 5 H, FLS1, FLS2, PIP2, TIP1, UA3GT, PAP2, DFR1, DFR2, CYP75A, and CYP75B1. The expression levels of F3 5 H were 2.1 and 3.2 times higher in treated plants than in untreated plants in S1 and S2, respectively. DFR1 was up-regulated and peaked in S3 during flower development in Al 3+ -treated plants, whereas it was virtually absent in untreated plants. There was an up-regulation in the DFR2 gene in treated plants and it peaked at S3, whereas in untreated plants its expression was low and remained stable. The expression levels of DFR1 and DFR2 were significantly higher in all stages of the treated plants than in the untreated plants. The expression levels of CYP75A and CYP75B1 were also higher in the treated plants (Table 3).

qRT-PCR of the Transcriptomic Data
For the validation of the transcriptome sequencing data, the sequences of 15 nuclear unigenes related to the formation of the colour blue, which showed varying expression levels in the two experimental groups, were subjected to quantitative real-time PCR using the designed primers. The results show that the expression levels of transcriptome and qRT-PCR analyses are significantly correlated with each other with a correlation coefficient of R 2 = 0.92, indicating that the genes studied are involved in the signalling and/or metabolic pathways associated with colour formation (Figure 8).

Discussions
Flower colour has long fascinated scientists and breeders, and it has been demonstrated that flower colour develops as a consequence of interactions between genes and external environmental conditions [34]. Consequently, the development of the colour blue in the H. macrophylla cultivars can be attained by changing the growing conditions. For instance, altering the pH of the soil or the addition of exogenous Al 3+ can cause colour variation in some barren flowers of H. macrophylla. However, the knowledge about the underlying process of colour change and tolerance to acidic pH is still limited. Breeding plants that can tolerate acidic soils is a pressing issue in agricultural and plant physiology research. In this competition, Chen et al. performed a genome-wide transcriptome analysis of Al response genes in hydrangea roots and leaves using an RNA-Sequencing approach. Numerous transporters were involved in the transportation of the Al-citrate complex from hydrangea roots, including those from the MATE and ABC families. The aluminum transporter Nramp, a plasma membrane transporter for Al uptake, was upregulated in roots and leaves under Al stress, suggesting that it may play an important role in Al tolerance by lowering toxic Al levels. However, the signalling pathways and potential genes involved in the colour change remain to be elucidated [29]. In the current study, next-generation sequencing technology was utilized to compare the transcriptomes of hydrangea sepals grown under Al 3+ -treated and control conditions at three sepal maturation stages to discover the genes and signaling pathways responsible for the colour change in response to Al hyper accumulation.

Comparison of Genes Involved in Flavonoid Biosynthesis in Hydrangeas Grown under Different pH Conditions
Flavonoids are vital pigments found in many plant sepals [35]. Anthocyanins are the end products of the biosynthetic pathways of flavonoids. They produce a wide variety of colours, from pale yellow to blue-violet [36]. The accumulation of the floral anthocyanins malvidin and petunidin causes the colour difference between Al 3+ -treated (blue-violet) and untreated (pink) hydrangea sepals (Figure 1). The conversion from pink to blue needs a change in the pathway of anthocyanin biosynthesis, which most likely occurs several processes before the production of petunidin and malvidin. Thus, the abundance of potential genes involved in flavonoid biosynthesis was evaluated to find vital genes involved in blue colour metabolism. Several isoforms of flavonoid synthesis, including 4CLs, PALs, CHIs, CYP73A, DFRs, ANSs, F3H, F3 5 H, and UA3GT, showed distinct expression patterns in the Al 3+ -treated plants with blue-purple flowers compared with untreated plants with pink flowers, suggesting that the alteration in the expression of these genes, induced exogenously, may occur much earlier than the phenotypic changes. CHS, CYP73A, and CHI are upstream genes of the flavonoid biosynthetic pathway, whereas ANS, F3H, F3 5 H, C3 5 H, DFR, and CYP75B1 are downstream genes. They encode important enzymes in the biosynthetic pathway of flavonoids and thus contribute to the formation of flower colour [37]. The effects of CHI and CHS on flavonoid accumulation are considerable. The initial reaction step in the flavonoid biosynthetic pathway is catalysed by CHS contributing to the formation of the intermediate product chalcone, which is required for all flavonoid classes [38]. In overexpressed CHI tobacco plants, flavonoids were increased but anthocyanins were not detectable [39]. The overexpression of the peony-derived CHI gene in tobacco also increased flavonoid accumulation [40]. Almost all unigenes expressing CHI and CHS were up-regulated at S1 during the development of infertile flowers in both experimental groups, and flavonoid concentration was maximal at this time. When the early expressed genes in the flavonoid biosynthesis pathway were over-expressed, it resulted in flavonoid accumulation and provided precursors for anthocyanin biosynthesis [41]. The biosynthesis pathway of flavonoids is dependent on F3H, F3 H, and F3 5 H. They catalyse the hydroxylation of flavonoids required for anthocyanin production, such as dihydrokaempferol, dihydromyricetin, and dihydroquercetin [42,43]. F3H catalyses the conversion of naringenin to dihydroflavones. In the presence of NADPH and DFR, the three forms of dihydroflavones are reduced. It has been observed that flower colour can vary greatly depending on the type and extent of DFR expression and that DFR is most strongly expressed in sepals with a high concentration of anthocyanins [44]. We have discovered two different DFR copies. DFR-2 is predominantly expressed in pink flowers (it is almost undetectable in blue samples) and has an N residue in the substrate specificity region at the third position that has been shown to cause substrate affinity for the pelargonidin-like precursor, dihydrokaempferol, in a variety of angiosperms [45]. DFR-1, on the other hand, is mainly produced in blue flowering plants and mainly has a D residue at this position, which confers lower or no substrate affinity for dihydrokaempferol in other species [46]. Moreover, NADPH cytochrome P450 expression was significantly higher in the treated plants than in the control group during the first and second phases of sepal maturation. NADPH cytochrome P450 reductase was shown to catalyse electron transfer from NADPH to F3 5 H in petunia, resulting in the blue-purple colour of the plant [47]. The hydroxylation of dihydrokaempferol is catalysed by F3 5 H and leads to the formation of a delphinidin precursor [48]. The loss of F3 5 H function in Antirrhinum spp. [49] or reduced F3 5 H expression in Phlox drummondii [50] results in a transition from blue to red colour. C3 5 H is an enzyme important for the production of delphinidin and promotes blue flower formation [51]. ANS, a vital enzyme that catalyses the last step of the flavonoid biosynthetic pathway, can also catalyse the conversion of proanthocyanidins into coloured anthocyanins. Through the induction of anthocyanin synthesis in sepals by Antirrhinum majus dihydroflavonol 4-reductase (AmDFR) and anthocyanidin synthase (MiANS) genes (AmDFR), transformed through a sequential Agrobacterium-mediated transformation, the flower colour of forsythia (Forsythia x intermedia cv 'Spring Glory') was altered [52]. In the current study, the downstream genes F3 H, F3 5 H, C3 5 H, CYP75B1, DFR1, and ANS of the flavonoid biosynthesis pathway were all increased at the first and second developmental stages of Al 3+ -treated plants. We also analysed the expression levels and patterns of all glycosyltransferases of the anthocyanin biosynthetic pathway and observed four UA3 GT homologous unigenes. However, their FPKMs were extremely low, and there were no significant differences in the expression levels at the three sepal maturation stages of the treated plants, but there was a significant difference in their expression at the second stage between the treated and control groups, indicating that UA3 GTs may be one of the key enzymes for the accumulation of delphinidin-3 -glycosides, the main anthocyanins in blue sepals. Kogawa et al. discovered that UA3 5 GT is critical for the accumulation of polyacylated anthocyanins, called ternatins, in Clitoria ternatea. UA3 5 GT could be glycosylated at the 3 , 5 positions of delphinidin [53]. The blue-violet flower colour of treated individuals suggests that the basic regulation of gene transcription from upstream ANS (anthocyanidin synthase), F3 5 H (flavonoid 3 ,5 -hydroxylase) and DFR (flavanone 4-reductase) may play a vital role in the buildup of flavonoid intermediates and the transition of flower colour.

Identification of Hub Genes Related to Flower Formation by WGCNA
Understanding the changes in blue flower phenotype caused by external influences of the wild type (with pink colour) could shed light on the mechanisms of flower colouration in hydrangea. Any functional changes in critical enzymes of the flavonoid biosynthetic pathway, including changes in the frequency of gene transcription and branching changes of flavone products, could lead to a repeated transition from blue to red/pink [54]. The most important finding of this work was that, by using WGNCA, we were able to identify Al 3+ treatment-specific gene modules ( Figure 7B). This showed that 2 DFRs, 2 4CLs, 3 CHRs, 9 PALs, 4 CHSs, F3 H4, 2 UFGTs and MYB were strongly related with modules relevant to the TS2 or treatment group. They all showed significant variation in transcript levels between treated and untreated individuals, demonstrating that they play a crucial role in floral variation. It is important to note that the above genes were not the ones with the highest expression, suggesting that the genes with the highest expression are not essential for flower colour differentiation [55]. Thus, the usage of WGCNA analyses in this work delivered a good method for identifying key genes associated with different developmental conditions. Wang et al., 2020 used WGCNA analysis to identify nodule genes involved in heavy metal transport, which were found to be particularly abundant in nodules [56]. In camellia, a similar WGCNA approach was used to reveal unigenes related to flower colour, and it was shown that CHS, F3H, ANS, and FLS have a crucial role in controlling the synthesis of flavonols and anthocyanidins [57]. Tan et al. [58] further used WGCNA to extract the Cd-coupled co-expression gene modules from the 22,080 transcripts in 17 RNA-Seq datasets and recognized 271 transcripts as universal Cd-regulated DEGs, which are key components of the Cd-coupled co-expression module. The four hub genes were found upstream of the flavonoid biosynthetic pathway, suggesting that blue flower colouration was mainly stimulated upstream in the treated plant. The reduced expression of 4CL8, PAL9, and PAL6 in both treated and untreated plants is consistent with the results of Wang et al. [59]. The decreased expression of 4CLs and PALs altered the level of cinnamic acid in the ripe fruit peel, according to these researchers [59]. We hypothesized that the reduced expression of 4CL8, PAL6, and PAL9 would affect cinnamic acid concentration in sepals from both treated and untreated plants. The increased expression of CHIs and CHSs in TS2 may play a vital part in the biosynthesis of other flavones, such as isoflavones, which contribute to the colour of many hydrangea flowers.

Identification of Transcription Factors Related to Flower Colour Transition
MYB, WDR, and bHLH transcription factors control the flavonoid biosynthetic pathway in several higher plant species [60,61]. Transcription factors can regulate structural genes either alone or in cooperation, and they can be positively or negatively regulated. Generally, transcription factors affect flower colour in different ways. The MYB-bHLH-WD40 (MBW) ternary transcription complex triggers numerous late flavonoid biosynthesis genes (LBGs), which include three regulatory protein classes, including bHLHs, R2R3-MYBs, and TRANSPARENT TESTA GLABROUS1 (TTG1; also known as WD40) [62,63]. MYB transcription factors, bHLHs, and WD40 regulate the expression of ANS and other downstream genes in Arabidopsis and affect anthocyanin biosynthesis [59]. LrMYB15, a transcription factor regulating DFR, CHSa, ANS, and CHSb, was observed to be involved in anthocyanin biosynthesis in lilies [64]. MYBs were the largest TF family in our analysis with 23 genes. Analysing DEGs and transcription factors based on their co-expression pattern revealed that among these MYBs, the change of flower colour was consistent with that of the expression profiles of 17 genes, with only 5 genes showing an opposite expression trend. The gene HymMYB2 was expressed at all developmental stages of Al 3+ -treated plant sepals. TS2 showed the highest expression level. Despite previous findings of a connection between the expression of HymMYB2 and the total amount of anthocyanins in sterile flowers of numerous hydrangea varieties, no such association was found in this study [51], and the expression pattern in our investigation was not the same between treated and untreated plants. The expression of HymMYB2 was significantly elevated in the treated groups, but it was nearly undetectable in the control plants. This shows that HymMYB2 regulates anthocyanin intermediates in hydrangea with some specificity. The level of HymMYB2 expression was highly associated with the expression of important genes DFR, F3H, ANS, C3 5 H, and WDR40, according to co-expression analysis. The transcription factor HymMYB2 may act on C3 5 H, ANS, and DFR in the biosynthetic pathway of anthocyanin in hydrangea, which is similar to the function of the PeMYB11, PeMYB12, and PeMYB2 genes (structural genes) in Phalaenopsis [65]. Based on the data presented above, different flavonoid biosynthetic pathways were determined in treated and untreated hydrangeas (Figure 9). In summary, flavonoid production in Al 3+ -treated plants is advanced by PAL and 4CL compared with untreated specimens, whereupon a branch of isoflavone biosynthesis, regulated by CHS and CHR, completes the anthocyanin synthesis pathway. In addition, the elevated expression of F3 5 H, F3 H and F3H/FLS leads to an increase in other flavonoid molecules, such as kaempferol and myricetin, which further decreases anthocyanin production. Lastly, a high DFR expression combined with the availability of UFGT may stimulate the synthesis of anthocyanin, leading to blue colour formation.

Plant Material
Hydrangea macrophylla ssp. serrata were provided by the Koetterheinrich breeding company (Lengerich, Germany) at week 48 and placed in a greenhouse at IPK Gatersleben under the condition of an 18-h photoperiod with 200 µmol m −2 s −1 light intensity, a temperature of 21/19 • C (day/night), and 60% relative humidity. Plants were divided into two groups, either as the control without additional treatment or with external Al 3+ application. All plants were supplied with 1 g Universal Weiβ fertilizer (with the analysis of 15 + 0 + 19) per liter of irrigation water in weekly intervals. For the treated group, 1 g of aluminum sulfate (Al 2 (SO 4 ) 3− ) was added to this solution. Aluminum-treated groups and control groups were arranged in a block with complete randomisation and three replicates per group. For each of the developmental periods, three experimental samples were harvested from cuttings obtained from the same plant at the early blooming stage: pale yellow, middle blooming stage: light blue (Al-treated group) to light pink (control group), and full blooming stage: dark blue (Al-treated group) to dark pink (control group). The pH was recorded once during all cycles in the substratum of all the plants with the dilution method 1:2 reported by David et al. [66]. Healthy, barren flowers with appropriate development and no visible diseases or pests were procured, rinsed thrice with deionized water to prevent contamination during sampling, then immediately immersed in liquid nitrogen and placed in a refrigerator at −80 • C [67].

RNA Extraction, cDNA Library Creation, and Sequencing
The RNeasy Plant Mini Kit (Qiagen, CA, USA) was used for total RNA extraction following the manufacturer's instructions. The TURBO DNA-free TM kit was used to purify RNA (Invitrogen, CA, USA). The Agilent 2100 Bioanalyzer (Agilent Technologies, CA, USA) was employed for quality assessment and to measure the concentration of the extracted RNA. For all samples, the RNA integrity number (RIN) was more than eight. A total of 18 RNA-Seq libraries (including three biological replicates at three stages for treated (TS1, TS2, and TS3) and control samples (CS1, CS2, and CS3) were constructed from about 2 µg of hydrangea sepals as per the manufacturer's protocol (Lexogen GmbH, Vienna, Austria). The libraries were pooled in an equimolar way and the Agilent 4200 Tape Station System (Agilent Technologies, Inc., Santa Clara, CA, USA) was used for electrophoretic analysis. Using the Illumina HiSeq2500 equipment, libraries were quantified and sequenced (paired-end sequencing, rapid run, 2 × 101 cycles, onboard clustering) (Illumina, San Diego, CA, USA).

Expression Annotation
Bowtie (v4.4.7) alignment package was used to trace reads to the unigenes. Based on the comparison results, RSEM (RNA-Seq by Expectation maximisation) was used to estimate the expression levels [70]. The Fragments Per Kilobase of transcript per Million mapped reads (FPKM) method was applied to represent the differences in the unigene expression among the samples [70]. Differential expression analysis between different experimental conditions was performed using the DESeq package (v1.18.0) [71]. The differentially expressed genes (DEGs) were identified using log2FC ≥ 2 and p-value ≤ 0.05 [24]. The iTAK software was used for the prediction of the plant transcription factors [72]. For identifying the transcription factors (TFs), all the annotated unigenes were compared against the plant transcription factor database (Plant TF DB v. 4.0), and the best hits in Arabidopsis thaliana were considered as TFs. Using the gene co-expression networks generated from the DEGs and TFs discovered, the transcriptional architecture of anthocyanin, carotenoid, and flavonoid biosynthesis were established by the WGCNA (weighted gene co-expression network analysis) program [73] and then visualized using Cytoscape v. 3.5.1 (San Diego, CA, USA) [26].

GO and KEGG Pathway Enrichment Analysis for Differentially Expressed Unigenes
GO and KEGG pathway enrichment analyses were performed for the differentially expressed unigenes. The topGo package (v2.12.0) was applied for enrichment and refinement of the collected GO annotation, using the "elim" approach and the Kolmogorov-Smirnov test. In-house scripts were used for KEGG pathway enrichment in accordance with Fisher's exact test. Bonferroni correction was used to obtain enriched p-values. The corrected p-value of 0.05 was used as a criterion to determine whether gene sets were significantly enriched.

Quantitative Reverse Transcription-Polymerase Chain Reaction-Based Validation
Five unigenes related to the biosynthesis of anthocyanin and 10 unigenes associated with the biosynthesis of carotenoids were chosen for quantitative reverse transcriptionpolymerase chain reaction (qRT-PCR) analysis. For qRT-PCR, the TransStart Top Green qPCR SuperMix (TransGen Biotech, Beijing, China) and a Bio-Rad CFX96 RT-PCR system (Bio-Rad, Hercules, CA, USA) were used with the following reaction conditions: denaturation at 94 • C for 60 s and 45 cycles of amplification (94 • C for 5 s, 60 • C for 15 s, and 72 • C for 10 s). The 2 −∆∆Ct method was used for calculating the relative expression levels of target genes against the internal control [74]. To normalize the relative expression levels of target genes, the H. macrophylla actin gene was employed as a control [75]. Supplementary  Table S1 lists the gene-specific primers. For each experiment, three biological and three technical replicates were used.

Estimation of Relative Pigment Content
The extraction of anthocyanin was carried out using fresh sepal tissue acquired from three sepal maturation phases in Al 3+ -treated and untreated samples. Together, 0.5 g of tissue from each sample was powdered using 1 mL of 98% methanol comprising 1.6% formic acid at 4 • C. Following ultrasonic extraction for 30 min, the samples were centrifuged at 12,000× g for 10 min, the supernatant was transferred to new tubes, and the residues were re-extracted. Later, the supernatants were pooled and filtered through 0.45 mm nylon filters (Millipore). Cyanidin 3-O-glucoside, delphinidin 3-O-glucoside, peonidin 3-O-glucoside, pelargonidin 3-O-glucoside, petunidin 3-O-glucoside, and malvidin 3-O-glucoside were among the standard compounds (ZZBIO Co., Ltd., Shanghai, China). An amount of 10 µL of the extract was analysed by HPLC according to the method of Zheng, et al. [76], (Rigol L-3000, Beijing, China). Three biological replicates yielded mean results and standard errors (SE). For flavonoid extraction, 200 mg of sepal tissue was pulverized with liquid nitrogen, extracted in 10 mL of methanol solution, incubated in dark conditions for 24 h at 4 • C, and then suspended by sonication for 1 h. After centrifugation at 10,000 rpm for 10 min, the supernatant was filtered through a 0.22 mm membrane filter. After removing 2 mL of the supernatant, 2 mL of a 1.5% AlCl 3 solution and 3 mL of 1 M sodium acetate (pH 5.0) were supplemented, and ten minutes later, a UV-Vis spectrophotometer was used to measure the absorbance at 415 nm [77]. The trend in flavonoids' relative content was observed across the three periods on the basis of absorbance values. For carotenoid analysis, liquid nitrogen was used to crush 200 mg of sepal tissue, which was then extracted in 10 mL petroleum ether under dark conditions at 4 • C for 24 h and suspended by sonication for 1 h. After centrifugation at 10,000 rpm for 10 min, the supernatant was collected and filtered through a 0.22 mm membrane filter. A UV-Vis spectrophotometer was used to measure the absorbance at 440 nm [78]. The absorbance readings were used to observe the trend in the relative number of carotenoids across the three periods.

Statistical Analyses
All measurements were executed in triplicate, and outcomes are expressed as mean and standard deviation (SD). The Statistical Package for the Social Sciences (SPSS) v. 19.0 software was used to perform statistical analyses wherein the mean values of each developmental stage were compared using a one-way analysis of variance (ANOVA) according to the Duncan multiple choice test at p < 0.05.

Summary
Transcriptome analysis was used to investigate whether and how the pathways of anthocyanin and flavonoid in Al 3+ -treated (blue-violet) and untreated hydrangea plants with pink flowers contribute to the formation of colour. The quantitative analysis of the essential anthocyanins in the flowers of Al 3+ -treated hydrangea were delphinidin, petunidin, and malvidin derivatives, which were absent in the untreated plants. Transcriptome analysis of sepals from two different growth conditions and three different stages of sepal maturation revealed 186,477 unigenes. Several genes that alter or inhibit flavonoid biosynthetic pathways, competing with the production of other flavonoids or altering the synthesis of anthocyanins, may partially be responsible for the blue colour phenotype in hydrangea flowers. DFR and UFGT are among key genes involved in the blue colouration. Al 3+ -treated plants produce more delphinidin derivatives and have a higher ratio of F3 5 H/F3 H transcription than the untreated pink plants. In addition, we also identified several TF families such as WD40, bHLH17, and MYB11, which are likely important regulators in anthocyanin biosynthesis, chlorophyll metabolism, and carotenoid biosynthesis. This study contributes to a better understanding of molecular mechanisms of colour formation in hydrangea that has scientific value and helps breeders design and adapt the desired flower colours.

Institutional Review Board Statement:
This study does not involve any experiments with human participants or animals.

Informed Consent Statement: Not applicable.
Data Availability Statement: All data supporting the findings of this study are available from the corresponding author upon reasonable request.