Genome-Wide Analysis of MYB Transcription Factors and Screening of MYBs Involved in the Red Color Formation in Rhododendron delavayi

Flower color is one of the crucial traits of ornamental plants. Rhododendron delavayi Franch. is a famous ornamental plant species distributed in the mountain areas of Southwest China. This plant has red inflorescence and young branchlets. However, the molecular basis of the color formation of R. delavayi is unclear. In this study, 184 MYB genes were identified based on the released genome of R. delavayi. These genes included 78 1R-MYB, 101 R2R3-MYB, 4 3R-MYB, and 1 4R-MYB. The MYBs were divided into 35 subgroups using phylogenetic analysis of the MYBs of Arabidopsis thaliana. The members of the same subgroup in R. delavayi had similar conserved domains and motifs, gene structures, and promoter cis-acting elements, which indicate their relatively conserved function. In addition, transcriptome based on unique molecular identifier strategy and color difference of the spotted petals, unspotted petals, spotted throat, unspotted throat, and branchlet cortex were detected. Results showed significant differences in the expression levels of R2R3-MYB genes. Weighted co-expression network analysis between transcriptome and chromatic aberration values of five types of red samples showed that the MYBs were the most important TFs involved in the color formation, of which seven were R2R3-MYB, and three were 1R-MYB. Two R2R3-MYB (DUH019226.1 and DUH019400.1) had the highest connectivity in the whole regulation network, and they were identified as hub genes for red color formation. These two MYB hub genes provide references for the study of transcriptional regulation of the red color formation of R. delavayi.


Introduction
Anthocyanins are flavonoids and the main pigments in petals and fruits. These flavonoids can give plants different colors to attract insects and help spread pollen and seeds to promote plant reproduction [1]. Anthocyanins also have anti-oxidation and anti-inflammatory activities [2].
Anthocyanin biosynthesis is controlled by numerous genes, such as transcription factor (TF) and structural genes. Anthocyanidin 3-O-glucosyltransferase (UDPGT, in short UGT) is a structural gene and plays a role in the anthocyanin biosynthesis of plants, such as Capsicum annuum [3], Morella rubra [4], and Vaccinium spp. [5]. The expression of structural genes is often regulated by TFs. MYB is one of the largest families among plant TFs. This TF can bind to cis elements in the promoters of target genes to mainly regulate plant anthocyanin biosynthesis, development, and stress resistance [6]. The N-terminus of MYB has a highly conserved DNA-binding domain (DBD). The domain contains three irregular repeats, each of which forms three A-helices. The other two helices construct a Helix-turn-Helix (HTH) structure with three evenly spaced tryptophan (Trp) residues. Hydrophobic nuclei are formed in the three-dimensional HTH structure [7]. Based on the number of repeats of domains, MYBs can be classified into 1R-MYB, R2R3-MYB, R3-MYB, and 4R-MYB [8]. Among the four types of MYB, R2R3-MYB plays the most central role in anthocyanin synthesis [9].

Analysis of the MYB Conserved Motif and Gene Structure of R. delavayi
To study the sequence characteristics of the conserved MYB DBD of R. delavayi, we performed multiple sequence alignments on their amino acid sequences. The MYB repeats on R2 and R3 of R. delavayi contained characteristic amino acids ( Figure 2). Three highly conserved Trp residues were found at positions 5, 25, and 45 in the R2 repeats of R2R3-MYB, whereas two Trp residues were found at positions 28 and 47 in R3 ( Figure 2). The phylogenetic tree was constructed by the ML method using the 184 MYB protein sequences of R. delavayi. The MYB family was clustered into different branches (Figure 3). Ten conserved motifs of MYB proteins were identified by the MEME program. Among all MYBs, R2R3-MYB, 3R-MYB, and 4R-MYB genes had high similarity in the amino acid sequences ( Figure 3A). The 1R-MYB protein contained motifs 3 and 4. Other types of MYB proteins basically contained motifs 1, 2, 3, and 5 ( Figure 3A). The MYB proteins of R. delavayi in the same group had similar motif types and quantities, which indicates that these proteins may have similar functions. these proteins may have similar functions.
To further measure the structural characteristics of R. delavayi MYB genes, we further analyzed their introns/exons. The most homologous genes in a group shared the same gene structure layout and the number of exons and introns ( Figure 3B). All 3R-MYB contained seven exons, whereas 4R-MYB contained 10 exons. R2R3-MYB contained onefour exons, 68% of which contained conserved gene structure with three exons and two introns. In addition, 1R-MYB contained 1-13 exons, 51% of which had more than four exons. Some introns of the genes were large and accounted for a big part of the genes. Compared with CDS sequences, most MYB genes are with or without two untranslated regions (UTRs). The UTR sequences are less similar, and their diversity is expressed in terms of gene length, which indicates that UTR sequences are less conserved than the coding regions.  To further measure the structural characteristics of R. delavayi MYB genes, we further analyzed their introns/exons. The most homologous genes in a group shared the same gene structure layout and the number of exons and introns ( Figure 3B). All 3R-MYB contained seven exons, whereas 4R-MYB contained 10 exons. R2R3-MYB contained one-four exons, 68% of which contained conserved gene structure with three exons and two introns. In addition, 1R-MYB contained 1-13 exons, 51% of which had more than four exons. Some introns of the genes were large and accounted for a big part of the genes. Compared with CDS sequences, most MYB genes are with or without two untranslated regions (UTRs). The UTR sequences are less similar, and their diversity is expressed in terms of gene length, which indicates that UTR sequences are less conserved than the coding regions.

Phylogenetic Analysis of the MYB Family of R. delavayi
To study the evolutionary relationship of MYB genes in R. delavayi, we downloaded 132 protein sequences of A. thaliana MYBs from TAIR, together with 184 protein sequences of R. delavayi MYBs, and constructed a ML phylogenetic tree. The MYB members of R. delavayi were divided into 35 subgroups and named C1 to C35 based on sequence similarity and topological structure, according to the classification of the same family in A. thaliana. The clades (S1-S25 subgroups) were labeled based on the evolutionary relationship in A. thaliana in the tree (Figure 4). The phylogenetic tree showed that most of the R. delavayi MYBs were clustered with R2R3-MYB of A. thaliana. Subgroup C31 contained a maximum of 19 members, whereas subgroup C20 had none. Four subgroups, namely, C21, C31, C34, and C35, were not grouped with the MYB of A. thaliana. The results suggest that they may have undergone gene gain or loss events during evolution.

Phylogenetic Analysis of the MYB Family of R. delavayi
To study the evolutionary relationship of MYB genes in R. delavayi, we downloaded 132 protein sequences of A. thaliana MYBs from TAIR, together with 184 protein sequences of R. delavayi MYBs, and constructed a ML phylogenetic tree. The MYB members of R. delavayi were divided into 35 subgroups and named C1 to C35 based on sequence similarity and topological structure, according to the classification of the same family in A. thaliana. The clades (S1-S25 subgroups) were labeled based on the evolutionary relationship in A. thaliana in the tree (Figure 4). The phylogenetic tree showed that most of the R. delavayi MYBs were clustered with R2R3-MYB of A. thaliana. Subgroup C31 contained a maximum of 19 members, whereas subgroup C20 had none. Four subgroups,

Analysis of R. delavayi MYB Genes Promoters Cis-Acting Elements
Cis-acting elements in the promoter region of R. delavayi MYB genes can be divided into three categories: plant hormone, biological/abiotic stress, and plant growth and development response elements ( Figure 5 and Table S2). Plant hormone response elements, such as Me-JA response elements (e.g., CGTCA and TGACG motifs), abscisic acid response elements (ABRE), gibberellin response elements (e.g., P-box), and salicylic acid response elements (e.g., TCA element), were widely distributed in MYB promoters. The Me-JA elements accounted for the highest proportion (40%). Cis elements involved in abiotic stress responses included anaerobic inducible elements (ARE), low-temperature response elements (LTR), and drought-inducible elements (e.g., MBS). Among the growth and development response elements, the light (e.g., G-Box) and auxin response elements (e.g., TGA element) accounted for 77% and 11%, respectively. The defense and stress response elements (e.g., TC-rich repeats) accounted for 9%, but the trauma response elements (e.g., WUN motif) accounted for 1%. Flavonoid biosynthesis response elements (e.g., MBSI) accounted for 2%. All cis elements appeared most frequently in the promoters of 2R-MYB, and different subgroups of MYB promoters had distinct cis element types. One TCA element and one WUN motif were found in 1R-MYB and 2R-MYB promoters, respectively, which indicates the functional diversity of MYB in R. delavayi. Moreover, a total of 101,112 TF-binding sites and 223 potentially interacting TFs described previously in A. thaliana were revealed using the PlantPan 3.0 (Table S3). MYB binding sites were identified at all MYB promoters examined. There were also several binding sites of TFs other than the MYB family on the examined promoters, such as the binding sites of bZIP, bHLH, and WRKY.

Differentially Expressed Genes (DEGs), Gene Ontology (GO), and Kyoto Encyclopedia of Genes and Genomes (KEGG) Analysis
The DEGs and shared genes between samples were analyzed. Comparative analysis showed 6736 DEGs between MY-1 and MY-2 (3090 up-regulated genes and 3646 downregulated genes) ( Figure 7A

Transcriptomic Analysis
To determine the MYB genes that play a role in the color formation of R. delavayi, we performed RNA-seq analysis based on the UMI strategy using spotted petals, unspotted petals, spotted throat, unspotted throat of flower, and branchlet cortex ( Figure 6A). The produced raw data of the transcriptome have been uploaded to the Sequence Reading Archive database (http://www.ncbi.nlm.nih.gov/sra/, accessed on 3 December 2022) with entry number PRJNA907866. After the removal of low-quality reads and adapters, an average of 23,879,787 clean reads were obtained per sample. The percentage of bases with a phred value greater than 20 in the total bases (QC20), the percentage of bases with a phred value greater than 30 in the total bases (QC30), and GC percentage were greater than 97%, 92%, and 47%, respectively (Table S4). In these clean reads, 41,484,380 (90.77%) were mapped to the R. delavayi reference genome (Table S5). Fragments per kilobase of exon models per million mapped fragments (FPKM) were used to determine the expression level of unigenes. Based on the FPKM values, principal component analyses (PCA) and Pearson correlation coefficients showed high correlation coefficients between three biological replicates ( Figure 6B,C, respectively).

Differentially Expressed Genes (DEGs), Gene Ontology (GO), and Kyoto Encyclopedia of Genes and Genomes (KEGG) Analysis
The DEGs and shared genes between samples were analyzed. Comparative analysis showed 6736 DEGs between MY-1 and MY-2 (3090 up-regulated genes and 3646 downregulated genes) ( Figure

Expression Analyses of R2R3-MYB Genes in R. delavayi
The MYB TFs play an important role in anthocyanin biosynthesis, especially R2R3-MYB. To evaluate the expression pattern of R2R3-MYB genes in anthocyanin biosynthesis in different tissues of R. delavayi, we evaluated the expression levels of R2R3-MYB genes based on the transcriptome data of spotted petals, unspotted petals, spotted throat, unspotted throat, and branchlet cortex. A total of 91 R2R3-MYB genes were identified, and 85 of them showed differential expression in five types of red samples (Figure 8). Among these genes, 10 were significantly expressed in four tissues, including spotted petals, non-spotted petals, spotted larynx, and non-spotted larynx, whereas 34 were significantly expressed only in the branchlet cortex. The expression of certain genes was limited to certain tissues, such as MYB genes DUH010372.1 and DUH007487.1 which were only significantly expressed in unspotted petals and spotted throat, respectively. Heat maps of R2R3-MYB expression in R. delavayi showed that most MYB genes were expressed differently in the five types of samples, and several genes were expressed only in the branchlet cortex. metabolic and biosynthetic pathways ( Figures S1 and 7C), and the DEGs of all four combinations were enriched in the pathways encoding phenylpropanoid biosynthesis (ko00940), flavonoid biosynthesis (ko00941), and anthocyanin biosynthesis (ko00942). Thus, DEGs may be important for anthocyanin biosynthesis in R. delavayi.

Expression Analyses of R2R3-MYB Genes in R. delavayi
The MYB TFs play an important role in anthocyanin biosynthesis, especially R2R3-MYB. To evaluate the expression pattern of R2R3-MYB genes in anthocyanin biosynthesis in different tissues of R. delavayi, we evaluated the expression levels of R2R3-MYB genes based on the transcriptome data of spotted petals, unspotted petals, spotted throat, unspotted throat, and branchlet cortex. A total of 91 R2R3-MYB genes were identified, and 85 of them showed differential expression in five types of red samples (Figure 8). Among these genes, 10 were significantly expressed in four tissues, including spotted petals, non- expressed only in the branchlet cortex. The expression of certain genes was limited to certain tissues, such as MYB genes DUH010372.1 and DUH007487.1 which were only significantly expressed in unspotted petals and spotted throat, respectively. Heat maps of R2R3-MYB expression in R. delavayi showed that most MYB genes were expressed differently in the five types of samples, and several genes were expressed only in the branchlet cortex.

Analysis of Co-Expression Network
Tissue colors were determined using the CIE 1976 (L*a*b*) system [29], that is, the larger the value of L*, the brighter the color, and the smaller the value, the darker the color; the larger the value of a*, the more reddish, and the smaller the value, the more greenish the tissue; the larger the value of b*, the more yellowish, and the smaller the value, the more bluish the tissue [30,31]. To screen hub genes regulating the red-color formation of R. delavayi, the correlation of color indicators of L*, a*, and b* values (Table 1) and the gene expression levels based on the RNA-Seq data of the five types of red samples were analyzed using WGCNA. A total of 19 co-expression modules were detected in the screened unigenes ( Figure 9). Modules ranged in size from 45 unigenes (gray module) to 5225 unigenes (turquoise module). For the a* value, the a* indicator represents the transition from red to green [30,31], which was positively correlated with the MEturquoise module, with the highest correlation and strongest significance (r = 0.9, p = 7 × 10 −6 ). A total of 23 highly connective genes (degree ≥ 4962), including 10 MYBs, six bHLHs, one WD40 gene, two MYB-relate genes, and four structural genes (UDPGT) were screened in the module ( Figure 10). Among the 10 MYBs, seven were R2R3-MYB genes (DUH019226.1, DUH019400.1, DUH013179.1, DUH007661.1, DUH003273.1, DUH022833.1, and DUH022115.1), and three were 1R-MYB genes (DUH007734.1, DUH012270.1, and DUH021327.1). These results indicate that the MYBs, bHLHs, WD40, and UDPGT identified are probably involved in the red color formation of R. delavayi. In addition, the co-expression network showed that two R2R3-MYBs (DUH019226.1 and DUH019400.1) had the highest connectivity in the whole regulation network and were identified as hub genes for red color formation. The identified bHLH, WD40, and UDPGT genes may also be involved in red color formation in R. delavayi.

Quantitative Reverse Transcription qRT-PCR Analysis of Gene Expression
Based on the co-expression network analysis, nine genes were randomly selected to further verify the accuracy of transcriptome results. Fold changes in the different tissues relative to the branchlet cortex were analyzed using qRT-PCR. The results showed that all genes detected by qRT-PCR had expression patterns similar to those of transcriptome data, and the relative expression was significantly different in petal and vegetative tissues ( Figure 11). UDPGT (DUH009790.1) and MYB (DUH013179.1) were differentially expressed in all five types of red samples. Meanwhile, MYB (DUH019226.1, DUH012270.1, and DUH003273.1), bHLH (DUH011332.1), and WD40 (DUH022768.1) had high relative expression levels in the branchlet cortex. However, MYB-related genes (DUH031703.1) and MYB (DUH019400.1) had high expression in flower tissues, including spotted petals, unspotted petals, spotted throat, and unspotted throat. These qRT-PCR results further confirmed the gene repression reliability of the transcriptome.

Discussion
MYB TFs are one of the largest gene families in plants and play a crucial role in anthocyanin biosynthesis. This family has been systematically studied in A. thaliana, rice, tomato, petunia, and kiwi fruit [8,[32][33][34][35][36]. A total of 184 members of the MYB family were identified in R. delavayi. This family has more members in R. delavayi than in rice (155) and A. thaliana (168), whereas the number was lower than those in peach (256), apple (229), and other plant species [37,38]. Overall, the number is greater than 100 for different species, which indicates that this family is greatly amplified in higher plants [39]. In R. delavayi, the R2R3-MYB subfamily is the most abundant in the MYB family, consistent with previous studies that reported 110 members of R2R3-MYB in rice [40] and 126 members in A. thaliana [9]. In this study, the majority of R2R3-MYB genes (68%) in R. delavayi contained a conserved gene structure with three exons and two introns, which are also found in other plant species [41]. Most homologous genes in the same group of the family have a common exon/intron structure and exon number, which indicates a common evolutionary history.
Besides, some introns of the genes were large and accounted for a big part of the genes, the length of introns sometimes are also related to some process of adaptation in evolution [42].
Phylogenetic analysis revealed clustered homologous genes in the same clades and subclades, which often exhibit similar functions [43]. The phylogenetic trees constructed from 132 A. thaliana MYBs and 184 R. delavayi MYBs further illustrated the evolutionary relationship of R. delavayi MYBs. Based on the classification of MYB in A. thaliana [9], the MYB members of R. delavayi were divided into 35 subgroups (C1-C35) (Figure 4), among which four subgroups (C21, C31, C34, and C35) of R. delavayi did not cluster with the MYBs of A. thaliana. This result indicated that the six subgroups may have acquired or lost genes during evolution compared with A. thaliana. In addition, most of the clades had different numbers of A. thaliana MYBs and R. delavayi MYBs, and genes from the same clades may share a common evolutionary process. Based on the reported classification and function of MYBs in A. thaliana, phylogenetic trees would be helpful to predict the function of R. delavayi MYB. R2R3-MYBs of A. thaliana in the subclades of S3, S4, S5, S6, and S7 regulated phenylpropanoid biosynthesis [8]. This result indicated that R. delavayi MYBs clustered in the same subclades may also be involved in regulating phenylpropanoid biosynthesis.
Transcriptome analysis had been widely used to study gene regulation in horticultural traits [43][44][45]. In this study, KEGG analysis showed that the petals and flower throat had enriched genes in the pathway of phenylpropanoid and flavonoid biosynthesis. In addition, R2R3-MYB TFs in the reported plant species, such as A. thaliana [8], strawberry [46], and kiwi fruit [47], are involved in the activation of structural genes related to phenylpropanoid biosynthesis pathway and regulation of biosynthesis and accumulation of anthocyanins. Similarly, in this study, 86 R2R3-MYB TFs of R. delavayi were differentially expressed in the differently colored tissues. Ten R2R3-MYB genes were significantly expressed in the tissues from spotted petals, unspotted petals, spotted throat, and unspotted throat. Several R2R3-MYB genes were expressed only in specific tissues, which suggests that these R2R3-MYB genes play a key role in various aspects of color formation or the development of R. delavayi [34]. In addition, an analysis of cis-acting elements in R. delavayi showed that almost all MYB genes contained cis-acting elements, which respond to hormone signal transduction, abiotic stress, and plant growth and development. This result is consistent with previous studies on Petunia [34] and Actinidia, in which MYB promoters also contained these cis-acting elements [47].
TFs play an important role in color formation by regulating the temporal and spatial expression of structural genes [48]. MYB can form MBW complexes with bHLH and WD40 to activate or inhibit the transcription of target genes to regulate anthocyanin synthesis [48,49]. MdMYB10 affects the anthocyanin biosynthesis of Malus spectabilis by regulating the expression of ANS [50]. PsMYB10.2 promotes anthocyanin accumulation in Prunus salicina fruits by activating PsUFGT and PsGST [51]. GbBM of Gossypium barbadense encoding MYB113 directly targets the promoter of four flavonoid biosynthesis genes and positively regulates the development of petal spots [52]. The interaction between AcMYB10 and AcbHLH5 in A. chinensis promotes the expression of AaF3H and AaF3G, respectively, and this process is conducive to anthocyanin accumulation in the flesh [53]. These findings suggest that MYBs are also widely involved in the regulation of anthocyanin biosynthesis in non-model plant species. In addition, WGCNA showed that DcMYB113 of Daucus carota activates the expressions of DcbHLH3 and anthocyanin biosynthesis-related structural genes [54]. Eleven genes (e.g., PaMYB10 and seven structural genes) were identified in the Prunus armeniaca co-expression network. PaMYB10, MdMYB10, and PsMYB10 belong to the anthocyanin-associated R2R3-MYB branch, showing high homology [55]. Similar to previous reports, GbBM and DcMYB113 belong to the S6 clade of the R2R3 MYB. In this study, WGCNA showed that 10 MYBs, six bHLHs, one WD40, two MYB-related genes, and four UDPGTs genes constructed the co-expression network behind the coloration in R. delavayi. Among these genes, two R2R3-MYB TF genes (DUH019400.1 and DUH019226.1) were identified as hub genes associated with anthocyanin biosynthesis in R. delavayi. qRT-PCR analysis revealed that the MYB TF DUH019226.1 was significantly expressed in the branchlet cortex, and DUH019400.1 was significantly expressed in the tissues of the remaining four petals. DUH019226.1 in the phylogenetic tree clustered into the S4 subgroup related to anthocyanin synthesis [5]. DUH019400.1 was clustered into the S19 subgroup, which controls another development or function [48]. The results indicate the possible gene acquisition or loss events during the evolution of R. delavayi MYBs.

Plant Materials and Color Determination
Plant samples of R. delavayi were collected from Baili Rhododendron Reserve, Guizhou Province, China. The samples were obtained from five types of red samples, namely, spotted petals, unspotted petals, spotted throat, unspotted throat, and branchlet cortex. The samples were randomly collected with three biological replicates. After sampling, the samples were immediately frozen in liquid nitrogen and stored at −80 • C. Color deference was represented using the CIE 1976 (L*a*b*) system, the color values of all samples were determined by a portable color spectrometer (CHN Spe, Hangzhou, China) before sampling, and the values of L*, a*, and b* were determined with three replicates.

MYB Sequence Alignment and Phylogenetic Analysis
Muscle was used for multiple sequence alignment of MYBs. MEGAX v7.0.14 was used to explore the evolutionary relationship between the MYB sequences of A. thaliana as a reference sequence [62]. A phylogenetic tree (ML) was constructed using Muscle to compare the full lengths of MYB amino acid sequences (184 R. delavayi MYBs and 132 A. thaliana MYBs). Parameters were set to WAG + G, paired deletion, and bootstrap analysis with 1000 bootstrap replicates. R. delavayi MYBs were classified based on the phylogenetic relationship with the corresponding classification of MYBs in A. thaliana.

Analysis of Promoter Cis-Acting Elements of MYB
The upstream 2000 bp sequence of MYB CDSs of R. delavayi was extracted from the RPGD as promoters [28]. Putative binding sites for TF candidates were determined using the PlantPAN3.0 server (http://plantpan.itps.ncku.edu.tw/, accessed on 9 February 2023) and A. thaliana as references [63].Cis elements of the promoters were predicted using PlantCARE [64] to identify the three main types of cis-acting elements on the promoter region: phytohormone response, including Me-JA response element (CGTCA and TGACG motifs), ABRE, gibberellin response element (P-box), and salicylic acid response element (TCA element); biotic/abiotic stress response, including ARE, LTR, and drought induction element (MBS); plant growth and development, including light response element (G-Box), growth hormone response element (TGA element), defense and stress response element (TC-rich repeats), trauma response element (WUN-motif), and flavonoid biosynthesis (MBSI). Statistical analysis and cis-element visualization were performed using Excel and TBtools, respectively [59].

Transcriptome Analysis of Five Types of Red Samples
Total RNA was extracted from samples using the RNAprep Pure Polysaccharide Polyphenol Plant Kit (TianGen, Beijing, China). The qualified RNA samples were sent to Novogene Bioinformatics Technology Co., Ltd. (Beijing, China). The libraries were sequenced on Illumina Novaseq 6000 Platform and generated 150 nt pair-end reads. Raw reads were first processed through in-house perl scripts. Clean reads were obtained by removing reads containing adapter and/or ploy-N, and low-quality reads from raw data. At the same time, Q20, Q30, and GC content of the clean data were calculated. UMI sequences on each read were identified with UMI-tools (1.0.0) [65]. Genome of R. delavayi (http://bioinfor.kib.ac.cn/RPGD/index.html, accessed on 1 October 2022) [28,66] was used as reference to assemble the transcriptome using HISAT2 v2.0.4 [67]. The number of reads mapped to each gene was calculated using HTSeq v0.6.1 [68]. The FPKM of each gene was calculated based on the gene length and counts mapped to the corresponding gene.

Analysis of DEGs, GO, and KEGG
DESeq R package (1.10.1) was used to analyze the differential expression between two conditions/groups (two biological replicates per condition). Corrected p-value of 0.005 and log 2 (fold change) of 1 were set as the threshold for significance. GO analysis of DEGs was implemented by the GOseq R package, in which gene length bias was corrected. GO terms with corrected p-value less than 0.05 were considered significantly enriched by DEGs. In KEGG enrichment analysis, KOBAS 3.0 (http://kobas.cbi.pku.edu.cn/index.php, accessed on 20 May 2022 ) database was used to test the statistical enrichment of DEGs.

Expression Analyses of R2R3-MYB Genes in R. delavayi
The transcript abundances of R. delavayi R2R3-MYB genes were calculated with FPKM values. Heatmaps of R2R3-MYB gene expression data were generated using TBtools v1.0987663 [59]. Gene expression levels were expressed by log 2 FPKM values.

Analysis of Weight Co-Expression Network
WGCNA was performed using the R v1.63 software package. Genes with coefficient of variation less than 0.5 were removed, and cluster trees were constructed based on the correlation and gene expression levels. In the WGCNA network, the soft threshold was set to 5 ( Figure S2), the module size was set to at least 30, and the merge cut height value was set to 0.25 ( Figure S3). The correlation between the modules and the chromatic aberration values of the five types of red samples was analyzed for all genes in each module. The significant trait correlation module was identified based on the high correlation values and significance (p-value). Finally, Cytoscape v3.9.1 [69] was used to display the weight co-expression network of the selected modules, calculate the degree values of the whole network, and screen the hub genes with high connectivity.

Conclusions
In this study, 184 MYB genes were identified in the R. delavayi genome and classified into 35 subgroups based on phylogenetic analyses. Bioinformatics analysis was performed including, conserved domains and motifs, gene structures, promoter cis-acting elements, and transcriptomic profiling. In addition, the expression levels of the R2R3-MYB gene were significantly different in five types of red samples. WGCNA between transcriptome and chromatic aberration values showed that 10 MYBs, six bHLHs, one WD40, two MYB-related genes, and four structural genes UDPGTs constructed the co-expression network behind the coloration in R. delavayi. The MYBs were the most important transcription factors involved in the red color formation, among them, two R2R3-MYB (DUH019226.1 and DUH019400.1) were identified as hub genes. Furthermore, the qRT-PCR validated the results of the transcriptome. This work has enhanced a comprehensive understating of the MYB gene family in R. delavayi and provided references for the study of transcriptional regulation of the red color formation of R. delavayi.