Genome-Wide Characterization and Analysis of bHLH Transcription Factors Related to Anthocyanin Biosynthesis in Cinnamomum camphora (‘Gantong 1’)

Cinnamomum camphora is one of the most commonly used tree species in landscaping. Improving its ornamental traits, particularly bark and leaf colors, is one of the key breeding goals. The basic helix–loop–helix (bHLH) transcription factors (TFs) are crucial in controlling anthocyanin biosynthesis in many plants. However, their role in C. camphora remains largely unknown. In this study, we identified 150 bHLH TFs (CcbHLHs) in the natural mutant C. camphora ‘Gantong 1’, which has unusual bark and leaf colors. Phylogenetic analysis revealed that the 150 CcbHLHs were divided into 26 subfamilies whose members shared similar gene structures and conserved motifs. Based on protein homology analysis, we identified four candidate CcbHLHs that were highly conserved relative to the TT8 protein in A. thaliana. These TFs are potentially involved in anthocyanin biosynthesis in C. camphora. RNA-seq analysis revealed specific expression patterns of CcbHLHs in different tissue types. Furthermore, we verified the expression patterns of seven CcbHLHs (CcbHLH001, CcbHLH015, CcbHLH017, CcbHLH022, CcbHLH101, CcbHLH118, and CcbHLH134) in various tissue types at different growth stages using qRT-PCR. This study opens a new avenue for subsequent research on anthocyanin biosynthesis regulated by CcbHLH TFs in C. camphora.


Introduction
Cinnamomum camphora is an evergreen tree species with a large canopy and dense shade all year round. It is one of the most commonly used tree species in landscaping, with high economic, ornamental, and ecological value [1]. C. camphora is a native landscape tree in southern China, and its ornamental traits, such as bark and leaf colors, are important selection targets. Many studies have shown that the color of the bark and leaves of angiosperms depends on the proportion and distribution of four classes of pigments: chlorophylls, flavonoids, carotenoids, and betalains [2]. Our previous studies on the transcriptome and metabolome of C. camphora demonstrated that red bark results from an enrichment of anthocyanins [3]. These pigments are widely distributed in the leaves, flowers, fruits, and vegetative tissues of vascular plants. Anthocyanins also help plants cope with biotic and abiotic stresses. For example, increased anthocyanin accumulation in grape leaves improves their resistance to cold temperatures [4]. Similarly, the anthocyanin-rich blood orange 'Tarocco' is less susceptible than other orange varieties to the necrotizing fungus (Penicillium digitatum) that causes green mold [5].

Phylogenetic Analysis of CcbHLHs
The number of bHLH protein subfamilies recently reported in different plant species ranges between 15 and 32 [16,30,31]. The reported bHLH TF family members of A. thaliana (162) were downloaded from the TAIR database [15]. After multiple alignment of the protein sequences of C. camphora and A. thaliana using MAFFT version 7, an unrooted phylogenetic tree was constructed from the CcbHLHs and the 162 AtbHLHs (Figures 1 and S1A). We performed cluster analysis of CcbHLHs based on the classification method developed for A. thaliana [15,31,32] and made appropriate adjustments. The phylogenetic tree revealed that bHLHs from the two species could be separated into 28 clades. The members that did not belong to any subfamily of A. thaliana (CcbHLH020, CcbHLH086, CcbHLH097, CcbHLH129, CcbHLH136, and CcbHLH149) were classified as "Orphans". The 28 clades were grouped into 27 subfamilies and 1 "Orphan" subfamily. CcbHLHs were distributed in 26 of these subfamilies, whereas 2 subfamilies, VI and XV, contained only A. thaliana members. The number of members varied greatly among subfamilies. Ib was the largest subfamily, with 21 members, followed by subfamilies XII and X, with 16 and 12 members, respectively. The smallest subfamilies, Va, XIII, IVb, and XIV, each had two CcbHLH members. The classification of the C. camphora bHLH TF family based on the phylogenetic tree illustrates the evolutionary relationships of the CcbHLHs.

Analysis of Structure and Conserved Motifs of CcbHLHs
Analysis of gene structures and conserved motifs can help elucidate CcbHLH functions (Figure S1B,C). A total of twenty conserved motifs were predicted with the online MEME software, and members of the same subfamily tended to have the same or comparable motifs. Motif 1 was the most frequent, being present in more than 96% of CcbHLH TFs; only CcbHLH066, CcbHLH073, CcbHLH096, CcbHLH119, and CcbHLH148 did not contain it. Motif 2 was the second-most abundant, being present in more than 86% of CcbHLH TFs. Some motifs appeared only in specific subfamilies: for instance, motif 10 was found only in subfamily XII, motif 17 only in subfamily Ib, and motif 14 only in subfamily VIIIb. Gene structure analysis indicated that the CcbHLH TF family members vary widely in exon number and gene structure. The number of exons in CcbHLHs ranged from 1 to 17, with 17 CcbHLHs having only 1 exon, whereas 4 CcbHLHs had more than 10 exons (Figure S1C). Within the same subfamily, the CcbHLH TFs were highly conserved and had a broadly similar exon/intron structure.

Chromosomal Localization and Collinearity Analysis of CcbHLHs
Unknown genes can be rapidly characterized, located, and cloned by comparison with genes and gene structures from well-characterized species [33]. Because the positions and order of genes on homologous chromosomes are similar between and within species, the large number of collinear regions revealed by gene collinearity analysis can be regarded as direct evidence of whole-genome duplication [34]. Large-scale chromosomal duplications, tandem duplications, and transposition events are the key means of gene family expansion [35]. Intraspecies genome localization analysis showed that seven CcbHLHs were clustered into two tandem duplication regions (CcbHLH067/CcbHLH068/CcbHLH069 and CcbHLH125/CcbHLH126/CcbHLH127/CcbHLH128) on C. camphora chromosomes 4 and 10, respectively. Among the 150 CcbHLHs, there were 68 pairs of segmentally duplicated genes (Figure 2). To understand whether CcbHLHs were subjected to natural selection during evolution, Ka/Ks analysis was performed on the tandemly and segmentally duplicated genes. A gene pair was considered to have undergone purifying selection if Ka/Ks < 1 [36]. For all duplicated CcbHLHs, Ka/Ks values were below 1, indicating that harmful mutations in these genes were eliminated through purifying selection during evolution (Table S2).
We also performed collinearity analysis between C. camphora and two representative model plants, A. thaliana and Populus trichocarpa. In total, 76 and 203 orthologous gene pairs were identified between C. camphora and A. thaliana and between C. camphora and P. trichocarpa, respectively (Figure 3), indicating that the C. camphora bHLH TF family is evolutionarily closer to that of P. trichocarpa than to that of A. thaliana. Sixty CcbHLHs had no collinear gene pairs with the bHLH TFs of A. thaliana or P. trichocarpa, suggesting that these TFs are of a different origin from those of A. thaliana and P. trichocarpa. Further experiments will be needed to characterize the functions of these CcbHLHs. Subsequently, we conducted collinearity analysis between C. camphora and the related species Cinnamomum kanehirae and found 256 pairs of homologous genes, whereas 16 CcbHLH TFs appeared to be of a different origin from the C. kanehirae genes. It is possible that these 16 TFs have specific functions in C. camphora (Figure S2). In addition, the Ka/Ks values of the orthologous bHLH gene pairs between C. camphora and A. thaliana and between C. camphora and P. trichocarpa were all less than 1, indicating a strong effect of purifying selection on the CcbHLH gene family (Tables S3 and S4). The Ka/Ks values of eight orthologous gene pairs between C. camphora and C. kanehirae were greater than 1, suggesting positive selection (Table S5).

Predicted Protein-Protein Interaction Network of CcbHLHs
The bHLH TF family members typically function by forming homo- or heterodimers with other proteins, which is indispensable for their binding to the promoters of target genes. We used STRING to predict protein interaction networks based on the A. thaliana orthologs of the CcbHLHs. The results showed that most CcbHLHs were predicted to interact with more than 1 CcbHLH protein, and 19 CcbHLHs with more than 10 CcbHLHs (Figure S3 and Table S6). The A. thaliana ortholog of CcbHLH001, CcbHLH022, CcbHLH118, and CcbHLH134 is TT8, which cooperates with TT1, PAP1, and TTG1 to regulate the biosynthesis of proanthocyanidins and anthocyanidins by modulating the expression of the DFR gene (Figure 4 and Table S7) [18]. These results suggest that CcbHLH001, CcbHLH022, CcbHLH118, and CcbHLH134 may participate in anthocyanin biosynthesis.

Analysis of Expression Levels of Candidate TFs for Anthocyanin Biosynthesis
To further investigate the roles of CcbHLH TFs in anthocyanin biosynthesis in C. camphora, we characterized the expression levels of seven CcbHLHs in various tissue types at different stages by qRT-PCR. Three CcbHLHs were identified previously (CcbHLH015, CcbHLH017, and CcbHLH101) and four were revealed in this study (CcbHLH001, CcbHLH022, CcbHLH118, and CcbHLH134) (Figure 6). We compared the expression patterns of these TFs in the mutant 'Gantong 1' variety with those in the 'Gantong 1' half-sib progeny, which was used as a genetic background control. In the 'Gantong 1' leaves, the transcript levels of three CcbHLHs (CcbHLH001, CcbHLH015, and CcbHLH017) peaked in April and May, while the expression of CcbHLH022 and CcbHLH134 peaked in May and August. The expression of CcbHLH101 reached its peak in July and August, whereas CcbHLH118 showed a single peak in April (Figure 6A). Furthermore, we observed significant differences in the expression levels of six CcbHLHs in the 'Gantong 1' leaves compared with those in control plants. Subsequently, we analyzed the expression patterns of the CcbHLHs in bark and found that the expression levels of CcbHLH001 and CcbHLH101 peaked in April and July, whereas the expression of CcbHLH022 and CcbHLH134 peaked in January and August. CcbHLH118 peaked in January and April, while CcbHLH015 peaked only in April (Figure 6B). Interestingly, for some genes the significant differences consisted exclusively of higher expression in 'Gantong 1' than in the control plants. Taken together, these results suggest potential roles of the characterized CcbHLHs in leaf and bark development.

Discussion
C. camphora, one of the most important landscaping species in the world, is widely used in street-side greening and to create shade. Colorful leaves and bark, together with other ornamental traits, are important breeding goals for C. camphora. bHLH TFs are among the key TFs regulating anthocyanin biosynthesis. The function of some bHLH TF family members in anthocyanin biosynthesis has been previously demonstrated in species such as Vitis davidii, Ficus carica L., and Juglans regia L. [18,21,23]. We previously reported that bHLH TFs are involved in anthocyanin synthesis in C. camphora ('Gantong 1') [3]. However, the roles of bHLHs in regulating the colors of C. camphora tissues remained unclear. Elucidating the genomic characteristics of C. camphora bHLH TFs could help explain the development of this ornamental trait.
In this study, 150 CcbHLH TF family members were identified in C. camphora, a number similar to that in Panax ginseng (169) and tomato (159) [14,37]. The number of TFs revealed in C. camphora was larger than that in Liriodendron chinense (91) [19], Prunus avium L. (66) [38], and Juglans regia L. (102) [23], but smaller than that in Helianthus annuus L. (183) [39] and Pyrus bretschneideri (197) [17]. These differences might result from differences in gene/genome duplication events during evolution. A total of 143 bHLH transcription factors were identified in C. kanehirae, which belongs to the same genus as C. camphora; this number is close to that in C. camphora, consistent with the close evolutionary relationship of the two species. C. camphora is rich in essential oils, which contain terpenoids. The leaf essential oil of C. camphora is considered a contributor to the beneficial properties of this plant [40]. Terpene synthase (TPS) is a critical enzyme in terpene synthesis. MYC, a bHLH family transcription factor, regulates the expression of terpenoid biosynthetic genes in various plants. Hong et al. demonstrated that MYC2 in A. thaliana directly binds to the promoters of TPS21 and TPS11 and activates their expression, thereby promoting biosynthesis of (E)-β-caryophyllene and other terpenoids [41]. Therefore, C. camphora may contain more CcbHLH family members than some other species in order to regulate TPS gene expression.
The phylogenetic tree showed that the 150 CcbHLH TFs were divided into 26 subfamilies, which is similar to the 26 reported for Andrographis paniculata [42], 24 for Panax ginseng [37], 25 for Osmanthus fragrans [43], and 25 for A. thaliana [32]. However, compared to Arabidopsis, the minor subfamily VI was not found in C. camphora, which may be attributed to the loss of genes during evolution. We then adopted Pires's classification method to analyze the CcbHLHs that did not match Heim's classification. These members fell into subfamilies XIII, XIV, and Orphan. The numbers of CcbHLH TFs in subfamilies Ib, X, and XII were the largest, broadly in agreement with the pattern of each corresponding subfamily in A. thaliana. Subfamilies Va (CcbHLH074, CcbHLH140), XIII (CcbHLH119, CcbHLH135), IVb (CcbHLH038, CcbHLH117), and XIV (CcbHLH148, CcbHLH143) had two members each, suggesting a relatively slow evolution rate. So far, the biological functions of CcbHLHs remain unclear. However, based on this study and previous studies on other plant species, we were able to narrow down the number of candidate CcbHLH TFs potentially involved in anthocyanin biosynthesis. For instance, CcbHLH001, CcbHLH022, CcbHLH118, and CcbHLH134 belonged to subfamily IIIf, which also contains A. thaliana AtbHLH042, a TF known to be involved in anthocyanin synthesis [25]. In addition, FcBHLH42 in Ficus carica L. [21], FabHLH29 in strawberry [31], FhGL3L and FhTT8L in Freesia hybrida [44], CmbHLH2 in Chrysanthemum morifolium R. [45], and VdbHLH037 in Vitis davidii [18] belonged to subfamily IIIf. These genes have also been proven to be involved in anthocyanin and proanthocyanidin biosynthesis.
Further analysis of CcbHLH gene structures and conserved motifs supported the phylogenetic relationships within the CcbHLH TF family. Most CcbHLH members in each subfamily shared similar conserved motifs and gene structures, implying similar biological functions. We found that motifs 1 and 2 occurred most frequently in the CcbHLH TF family, suggesting that these motifs are major components of the bHLH domain with highly conserved DNA-binding capacity [46]. In this study, the number of exons in CcbHLH TFs ranged from 1 to 17, which was consistent with the numbers obtained in Panax ginseng [37] and Osmanthus fragrans [43]; 17 CcbHLHs had only 1 exon, whereas 4 CcbHLHs had more than 10 exons. This result suggests the possible ongoing evolution of C. camphora bHLH genes.
In addition, the chromosomal localization analysis suggested that segmental duplication drove the expansion of the CcbHLH gene family. Similar events were also found in Liriodendron chinense [19], Ficus carica L. [21], and Pyrus bretschneideri [17]. Furthermore, the genomic collinearity analysis of C. camphora and other species, as well as the Ks analysis of homologous genes in the collinearity blocks, suggested that the C. camphora genome had undergone three genome-wide duplication events during its evolution [29].
Interaction analysis of CcbHLH proteins can help to predict the potential functions of the CcbHLH TF family genes. Maize R1, B1, Lc, and Sn were the first bHLH TFs shown to regulate anthocyanin synthesis [47–49]. The Arabidopsis homologs of the maize R transcription factor, namely TT8, GL3, and EGL3, were then proven to function similarly [25]. In this study, we found that CcbHLH001, CcbHLH022, CcbHLH118, and CcbHLH134 were orthologous to AtbHLH42/TT8 and, therefore, they likely regulate anthocyanin synthesis in C. camphora.
The specificity of the expression pattern in different tissue types usually indicates the tissue-specific functions of the gene. Thus, we investigated the expression specificities of several CcbHLHs through transcriptomic analysis using seven different tissue types. For example, CcbHLH005 was highly expressed in the flowers, whereas CcbHLH042 was highly expressed in the leaves and bark. bHLH TFs from the subfamily IIIf were previously reported to regulate anthocyanin and flavonoid biosynthesis [25]. One of the subfamily members, CcbHLH001, was significantly more highly expressed in the stem and phloem compared to its levels in other tissues. However, the expression levels of CcbHLH022, CcbHLH118, and CcbHLH134 did not show any tissue-specificity. Another three CcbHLHs previously identified to be associated with anthocyanin synthesis showed tissue-specific expression patterns as well, including CcbHLH015 in the xylem and phloem, CcbHLH017 in the xylem, and CcbHLH101 in the stem and phloem. We verified the expression patterns of seven CcbHLHs (CcbHLH001, CcbHLH022, CcbHLH118, CcbHLH134, CcbHLH015, CcbHLH017, and CcbHLH101) in coloration mutant 'Gantong 1' at various growth stages and in different tissues by using qRT-PCR. The significantly higher expression of CcbHLH TFs in the bark of 'Gantong 1' implied their role in the anthocyanin synthesis. Taken together, the transcriptome and qRT-PCR data indicated that these CcbHLHs may play a crucial part in the regulation of anthocyanin biosynthesis. However, the functions of these CcbHLHs (CcbHLH001, CcbHLH015, CcbHLH017, CcbHLH022, CcbHLH101, CcbHLH118, and CcbHLH134) need to be further verified by transgenesis, the yeast one-hybrid method, and other approaches.

Plant Materials
Seedlings of 'Gantong 1', a new variety of C. camphora, were selected by the single plant breeding method. The young leaves of this variety are orange-red or orange, and gradually turn to yellow-green or green after maturity. The young bark is light pink with white spots and bright red after half-lignification, with obvious seasonal changes. In this study, we used three C. camphora 'Gantong 1' plants and their half-sib progeny, which were planted in the same plot at the Jiangxi Academy of Sciences (28°69′ N, 116°00′ E) in the same growth environment. All plants had a similar growth trend. 'Gantong 1' was the experimental group, and the half-sib progeny was the control group (Figure 7). According to the timing of the most dramatic changes in leaf and bark colors, material sampling was performed in January, April, May, July, August, and December 2021. Two-gram samples of the leaves and bark of three biological replicates of the 'Gantong 1' and control groups were collected, placed in a 50 mL centrifuge tube, and stored in a −80 °C ultra-low temperature refrigerator after quick freezing in liquid nitrogen.

Identification and Physicochemical Properties Analysis of CcbHLHs
The genome of C. camphora was previously sequenced by our research group (GWHBGBX00000000) [29]. The Hidden Markov Model (HMM) profile of the bHLH domain (PF00010) was downloaded from Pfam (http://pfam.xfam.org/ (accessed on 1 October 2022)). We used the online tools NCBI Batch CD-Search (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi (accessed on 1 October 2022)) and SMART (http://smart.embl.de/ (accessed on 1 October 2022)) to verify the domains of the bHLH TF family protein sequences and to exclude candidates without the conserved bHLH domain. Subsequently, we downloaded the genome data and annotation information of Cinnamomum kanehirae from the NCBI database (https://www.ncbi.nlm.nih.gov/data-hub/taxonomy/337451/ (accessed on 1 October 2022)) and made a preliminary identification using the same method. The annotation information of C. camphora was extracted from the GFF file and visualized with the TBtools package. The physicochemical properties of CcbHLH TFs were analyzed using the ExPASy ProtParam online software (https://web.expasy.org/protparam/ (accessed on 3 October 2022)) [50].

Phylogenetic Analysis of CcbHLHs
We used the online tool MAFFT version 7 (https://mafft.cbrc.jp/alignment/software/algorithms/algorithms.html (accessed on 5 October 2022)) to perform multiple sequence alignment of the AtbHLH and CcbHLH protein sequences with default parameters. The alignment results were imported into MEGA7 software, and a phylogenetic tree was constructed using the neighbor-joining method with 1000 bootstrap replicates. The optional parameters were p-distance and pairwise deletion [51]. The evolutionary tree was drawn using the iTOL website (https://itol.embl.de/itol.cgi (accessed on 5 October 2022)).
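The two distance options named above can be illustrated with a minimal Python sketch. It is not the MEGA7 implementation; the function names and the toy aligned sequences are hypothetical and serve only to show how the p-distance is computed when gapped sites are removed pair by pair rather than across the whole alignment.

```python
from itertools import combinations

# Characters treated as missing data (assumption: gap and unknown symbols only).
MISSING = "-?"


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Pairwise deletion: a column is skipped only if one of THESE two
    sequences has a gap or missing residue at that position.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in MISSING or b in MISSING:
            continue  # site is ignored for this pair only
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differing / compared


def distance_matrix(alignment: dict[str, str]) -> dict[tuple[str, str], float]:
    """p-distance for every unordered pair of sequences in the alignment."""
    return {
        (name_a, name_b): p_distance(alignment[name_a], alignment[name_b])
        for name_a, name_b in combinations(alignment, 2)
    }


# Hypothetical toy alignment, not real bHLH sequences.
toy_alignment = {"seqA": "MKV-LS", "seqB": "MRVAL-", "seqC": "MKVALS"}
matrix = distance_matrix(toy_alignment)
for pair, dist in matrix.items():
    print(pair, round(dist, 3))
```

A matrix of this kind is the input to neighbor joining; with pairwise deletion, each pair may be compared over a different number of sites, which retains more data than deleting every gapped column from the alignment.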

Analysis of the Structure and Conserved Motifs in CcbHLHs
We used the online software GSDS 2.0 (http://gsds.cbi.pku.edu.cn/ (accessed on 8 October 2022)) and MEME (https://meme-suite.org/meme/ (accessed on 8 October 2022)) to analyze the gene structures and conserved motifs of CcbHLHs, respectively, on the basis of the cDNA sequences of the 150 CcbHLH TFs and their corresponding genomic DNA sequences. The number of conserved motifs was set at 20. The results were visualized with TBtools [52].

Chromosomal Localization and Collinearity Analysis of CcbHLHs
The chromosomal localization, length, and density information were extracted from the GFF file of C. camphora. The genome-wide data of A. thaliana and Populus trichocarpa were downloaded from the EnsemblPlants website (http://plants.ensembl.org/index.html (accessed on 10 October 2022)), and the collinearity analysis was performed using MCScanX software [53]. The Ka and Ks values were analyzed with TBtools.
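The way Ka/Ks values are interpreted in this study (Ka/Ks < 1 as purifying selection, Ka/Ks > 1 as positive selection) can be expressed as a short Python sketch. It does not reproduce the TBtools calculation of Ka and Ks; the function name and the example values are hypothetical, and treating a ratio of exactly 1 as neutral evolution is the conventional reading rather than a criterion stated in the text.

```python
def classify_selection(ka: float, ks: float) -> str:
    """Interpret a Ka/Ks ratio for one duplicated or orthologous gene pair.

    ka: nonsynonymous substitution rate; ks: synonymous substitution rate.
    """
    if ka < 0 or ks < 0:
        raise ValueError("Ka and Ks must be non-negative")
    if ks == 0:
        return "undefined"  # ratio cannot be computed without synonymous changes
    ratio = ka / ks
    if ratio < 1:
        return "purifying"
    if ratio > 1:
        return "positive"
    return "neutral"  # assumption: conventional interpretation of Ka/Ks = 1


# Hypothetical Ka and Ks values for illustration only (not from Tables S2-S5).
example_pairs = {
    "pair_1": (0.12, 0.60),
    "pair_2": (0.30, 0.20),
    "pair_3": (0.05, 0.00),
}
for name, (ka, ks) in example_pairs.items():
    print(name, classify_selection(ka, ks))
```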

Protein-Protein Interaction Network Prediction
We used the STRING website (https://cn.string-db.org/ (accessed on 15 October 2022)) to predict protein interaction networks, querying the 150 CcbHLH protein sequences against A. thaliana proteins as the reference [54].

Expression Levels of CcbHLHs in Different Tissues
In our previous study, we performed RNA-Seq on seven different tissues (stem, fruit, root, xylem, leaf, flower, and phloem) of C. camphora ('Gantong 1') [29]. These data were used to explore the expression patterns of CcbHLH TFs. The expression levels of CcbHLHs were estimated as fragments per kilobase of transcript per million mapped reads (FPKM). TBtools software was used to visualize the transcriptome FPKM data and draw gene expression heat maps based on log2(FPKM).
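As an illustration of the expression measure used here, the following Python sketch computes FPKM from a fragment count and log-transforms it for a heat map. The function names and the numbers are hypothetical. The pseudocount added before the log transform is an assumption introduced so that genes with zero expression remain defined; the text itself specifies only log2(FPKM).

```python
import math


def fpkm(fragments: int, gene_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and library size must be positive")
    # 1e9 = 1e3 (bp per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def log2_expression(value: float, pseudocount: float = 1.0) -> float:
    """log2 transform for heat-map display.

    The pseudocount is an assumption of this sketch, not a stated parameter.
    """
    return math.log2(value + pseudocount)


# Hypothetical example: 500 fragments on a 2000 bp transcript
# in a library of 25 million mapped fragments.
example_fpkm = fpkm(500, 2000, 25_000_000)
print(example_fpkm, round(log2_expression(example_fpkm), 3))
```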

Analysis of Expression Levels of Candidate TFs Regulating Anthocyanin Biosynthesis
The candidate CcbHLH TFs for anthocyanin biosynthesis were analyzed by qRT-PCR using leaf and bark cDNA of C. camphora 'Gantong 1' and the control group as templates. Sample RNA was extracted using an RNA extraction kit (Huayueyang Biotechnology Co., Ltd., Beijing, China) and reverse-transcribed into cDNA, which was used as a template for qPCR analysis (Yeasen BioTechnologies Co., Ltd., Shanghai, China). The primers were designed using Primer Premier 5.0 software with Tm values ranging from 58 °C to 61 °C and amplified fragments ranging from 100 to 200 bp (Table S8). 'Gantong 1' and the control each had three biological replicates, each of which was independently repeated three times. The expression level of actin mRNA (KM086738.1) was selected as a reference, and the 2^−ΔΔCT method was used for calculating fold changes in gene expression levels [3]. The expression levels of CcbHLHs in the control sample taken in January were set as "1".
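The 2^−ΔΔCT calculation described above can be sketched in Python as follows. The reference gene corresponds to actin and the calibrator to the January control sample, as stated in the text; the function names and CT values are hypothetical, and averaging technical repeats before the calculation is an assumption of this sketch.

```python
from statistics import mean


def delta_ct(ct_target: list[float], ct_reference: list[float]) -> float:
    """Mean CT of the target gene minus mean CT of the reference gene.

    Assumption: technical repeats are averaged before subtraction.
    """
    return mean(ct_target) - mean(ct_reference)


def fold_change(
    sample_target: list[float],
    sample_reference: list[float],
    calibrator_target: list[float],
    calibrator_reference: list[float],
) -> float:
    """Relative expression by the 2^-ddCT method.

    The calibrator is the sample whose expression level is defined as 1.
    """
    ddct = delta_ct(sample_target, sample_reference) - delta_ct(
        calibrator_target, calibrator_reference
    )
    return 2.0 ** (-ddct)


# Hypothetical CT values (three technical repeats each), for illustration only.
calibrator_target_ct = [26.0, 26.1, 25.9]     # target gene, calibrator sample
calibrator_reference_ct = [18.0, 18.0, 18.0]  # reference gene, calibrator sample
sample_target_ct = [24.0, 24.1, 23.9]         # target gene, test sample
sample_reference_ct = [18.0, 18.0, 18.0]      # reference gene, test sample

result = fold_change(
    sample_target_ct, sample_reference_ct, calibrator_target_ct, calibrator_reference_ct
)
print(round(result, 2))
```

In this example the target gene crosses the threshold two cycles earlier in the test sample than in the calibrator after normalization, which corresponds to roughly a fourfold higher expression level.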

Conclusions
This study is the first comprehensive and systematic analysis of bHLH TFs in the C. camphora genome. We identified 150 CcbHLHs that were distributed on 12 chromosomes and could be divided into 26 subfamilies. We analyzed their gene structures and conserved motifs. The collinearity analysis showed that there were 68 pairs of segmentally duplicated genes among the 150 CcbHLHs. Phylogenetic analysis of C. camphora and A. thaliana revealed four candidate CcbHLHs (CcbHLH001, CcbHLH022, CcbHLH118, and CcbHLH134) potentially involved in anthocyanin biosynthesis in C. camphora. Transcriptional analysis revealed the expression patterns of CcbHLHs in the coloration mutant C. camphora 'Gantong 1'. This study opens a new avenue for subsequent research on the functions of CcbHLHs in C. camphora.