Genome-Wide Analysis of NAC Gene Family in Betula pendula

: NACs (NAM, ATAF1 / 2, and CUC2) are plant-speciﬁc transcription factors that play diverse roles in various plant developmental processes. In this study, we identiﬁed the NAC gene family in birch ( Betula pendula ) and further analyzed the function of BpNACs . Phylogenetic analysis reveals that the 114 BpNACs can be divided into seven subfamilies. We investigated the expression levels of these BpNACs in di ﬀ erent tissues of birch including roots, xylem, leaves, and ﬂowers, and the results showed that the BpNACs seem to be expressed higher in xylem and roots than leaves and ﬂowers. In addition to tissue-speciﬁc expression analysis, we investigated the expression of BpNACs under low-temperature stress. A total of 21 BpNACs were di ﬀ erentially expressed under low-temperature stress, of which 17 were up-regulated, and four were down-regulated. Using the gene expression data, we reconstructed the gene co-expression network for the 21 low-temperature-responsive BpNACs . In conclusion, our results provide insight into the evolution of NAC genes in the B. pendula genome, and provide a basis for understanding the molecular mechanism for BpNAC -mediated cold responses in birch.


Introduction
The genome sequence is the whole genetic information of an individual [1]. Functional analysis such as gene family identification, gene cloning, and reconstruction of gene regulatory networks largely depend on the genome references. In recent years, more plant genomes have been sequenced, and they have had a major impact on plant research and improvement in a short time [2]. However, most of the sequenced plant genomes are herbaceous [3][4][5], while whole genome sequences of woody trees [6,7] are still limited. Fortunately, the genome sequence of Betula pendula [8] has become available in the last few years, which can be exploited in numerous ways [9].
In molecular biology, a transcription factor (TF) [10] is a protein that controls the rate of transcription of genetic information from DNA to messenger RNA, by binding to a specific DNA sequence. In plant genomes, about 10% of the products of coding genes are TFs [11]. It is reported [12,13] that TFs are involved in all the aspects of the plant life cycle including growth, development, and stress tolerance. Research studies on TFs have been an important part of functional genomics analysis. Transcription factors in plants can be assigned to different families according to the conserved DNA binding domains. The well-studied TFs in plants include MIKC, C2H2, WRKY, bZIP, MYB, SBP, HB, AP2/EREBP, and NAC [11].
The gene family always arose from multiple ways including tandem duplication, duplicative transposition, and whole genome duplication (WGD), which was followed by mutation and divergence. The NAC (NAM, ATAF, and CUC) transcription factors are plant-specific TFs involved in multiple biological functions [14]. The name of NAC was originally derived from the abbreviation of three proteins including no apical meristem (NAM), ATAF1-2, and CUC2 (cup-shaped cotyledon), which contain a similar DNA-binding domain [15,16]. It is reported that smaller numbers of NAC TFs are present in ancient land plants. The larger number of NAC TFs were considered to be the results of WGD (Whole Genome Duplication) and other small-scale duplication events [17]. Accumulating evidence indicated that a considerable portion of NAC transcription factors plays crucial roles in the processes of xylogenesis, fiber development, and wood formation in vascular plants. In Arabidopsis thaliana, NAC secondary wall thickening promoting factor 1 (NST1) and NAC secondary wall thickening promoting factor 3 (NST3) are key regulators of secondary wall formation in the woody tissue [18]. Secondary wall-associated NAC domain protein 1 (SND1) could upregulate the expression of multiple transcription factors, including one that is highly expressed in fibers during secondary wall synthesis [19]. Xylem NAC domain 1 (XND1) transcription factor negatively regulates lignocellulose synthesis and programmed cell death in the xylem [20]. A vascular related NAC-domain protein 7 (VND7) plays a pivotal role in regulating the differentiation of root protoxylem vessels [21]. In rice and maize, the secondary wall-associated NACs (namely OsSWNs and ZmSWNs) have been shown to be key transcriptional switches regulating secondary wall biosynthesis [22]. In Eriobotrya japonica, EjNAC1 was associated with fruit lignification by activating genes involved in lignin biosynthesis [23]. In Gossypium hirsutum, GhXND1 encodes an NAC transcription factor, and the cell wall-related genes were down-regulated in the GhXND1 transgenic plants. Wang et al. [24] suggested that GhXND1 may be involved in the negative regulation of plant xylem development. In Populus, WND2B and WND6B could effectively complement secondary wall defects in the snd1/nst1 double mutant. They have been shown to be functional orthologs of A. thaliana SND1, and they could induce ectopic deposition of secondary walls when ectopically over-expressed in A. thaliana [25]. In addition, Hu et al. [14] identified a subset of more than 30 NAC genes that showed the highest level of transcript abundance in differentiating xylem and cambium tissues of Populus.
Numerous NAC domain proteins have been implicated in plant biotic stress and abiotic stress, such as fungi, bacteria, insects, drought, salinity, and cold and mechanical wounding [14,26]. The AtNAC2 gene from A. thaliana is a transcription factor downstream of ethylene and auxin signaling pathways, which plays an important role in salt stress response [27]. Similarly, the expression of the RD26 gene in A. thaliana was also induced by drought and high salinity stress [28]. Long vegetative phase 1 (LOV1) could regulate the response of plants to freezing temperature by controlling a subset of cold response genes via a CBF/DREB1-independent pathway [29]. In rice, transgenic plants overexpressing ONAC045 showed enhanced tolerance to drought and salt treatments [30]. Banana fruit NAC transcription factor MaNAC1 is a direct target of MaICE1 and is involved in cold stress by interacting with MaCBF1 [31]. In Pyrus betulifolia, PbeNAC1 could confer cold and drought tolerance by interacting with PbeDREBs and activating the expression of stress-responsive genes [32].
Understanding the function of the NAC gene family will be instrumental in the study of molecular mechanisms of development and stress tolerance of plants. However, the research on the NAC gene mainly focuses on the model plants A. thaliana and Oryza sativa [33]. No systemic characterization of the NAC genes have been conducted in B. pendula. In this study, we used multiple bioinformatics methods to identify the B. pendula NAC gene family, and performed the comprehensive analysis including gene structure, conserved motifs, chromosomal location, phylogeny, tissue-specific expression profiles, and identification of low-temperature-responsive BpNACs. All the results laid the foundation for deeper research of the BpNAC genes.

Identification and Basic Information Statistics of BpNAC Gene Family
The Betula pendula genome [8] was downloaded and used for identifying BpNACs. We used NCBI (National Center for Biotechnology Information) Conserved Domain Database [34], InterProScan [35], PfamScan [36], and HMMER [37] to identify proteins that contain the NAC domain in the B. pendula genome. We also downloaded all the annotated NACs proteins from the plant transcription factor database (PlantTFDB) [11] and used them as queries to identify candidates of BpNACs via BLASTP with a thread of e-value less than 1e-5. In addition, all the BpNAC proteins were further manually analyzed using the NCBI Conserved Domain Database [38] to confirm the presence of the NAC domain.
The basic statistics information of BpNACs was performed using the bio-python module [39]. Molecular weight and theoretical isoelectric point of the BpNAC protein sequences were analyzed by the ExPaSy Compute pI/Mw tool [40]. The genomic structure information (GFF) of the B. pendula genome was downloaded from the CoGe comparative genomics platform [41] and visualized by using TBtools [42]. The amino acid motif analysis of BpNAC proteins was performed using MEME 5.0.4 [43]. The motif discovery mode was selected as a classic mode and the site distribution was ZOOPS (Zero or One Occurrence Per Sequence). Other options have remained as defaults.
MCScanX [44] was used to identify the intra-genome collinear blocks of B. pendula. The results were displayed using Circos [45].

Phylogenetic Analysis based on the BpNAC Domain
To investigate the phylogenetic information of the BpNACs of B. pendula, an unrooted tree was constructed. Multiple sequence alignment was performed using MUSCLE [46] with the default parameters. Then, an unrooted phylogenetic tree was constructed using RAxML [47] with the maximum likelihood (ML) method. The parameters used for RAxML were "PROTGAMMALGX" and the bootstrap value was 1000.

Functional Analysis of the BpNAC Gene Family
The function of the BpNAC gene family was annotated by eggNOG-mapper [48]. GO, COG, and KEGG annotation were retrieved from the eggNOG [49] results. We also used BLAST (E-value < 1e −5 ) to find the A. thaliana homology of BpNACs. The homologous annotation of A. thaliana in TAIR [50] was used for functional annotation of BpNACs. For comparative analysis, the expression data of A. thaliana were downloaded from the TAIR abiotic stress database (https://www.arabidopsis.org/). The promoter sequences of BpNACs were analyzed using PlantCARE [51], which is a database of plant cis-acting regulatory elements and a useful promoter sequence analysis tool.

Differential Expression Profile of BpNAC Gene Family
Three biological replicates of roots, young leaves, and female inflorescences were collected from two-year-old birch planted in the experimental field of Northeast Forestry University (Harbin, China). Two-month-old birch were transplanted to the field from the greenhouse in the spring. Birch planted in the experimental field can bloom two years later with a good supply of water and fertilizer. Each biological replicate contains three individuals. We collected five leaves from each individual and mixed them together as one biological replicate. For xylem, we used the upper stem (about 10th nodes) of two-year old birch trees, and the tissue-specific sampling was performed, according to the Juan Alonso-Serra et al. [52] method. Sequential tangential cryosections were collected with two biological replicates per fraction and processed independently. Once a fraction was removed, a cross-section was observed under the microscope to localize the anatomical position. All the plants were grown under natural conditions with enough water and nutrition. Total RNA of the samples were extracted using the traditional CTAB (Cetyltrimethylammonium Bromide) method [53]. The quality and concentration of the isolated RNA were evaluated using agarose gel electrophoresis and UV-spectrophotometry. The cDNA libraries were constructed at Biomarker Technologies Corporation (Beijing, China), which was subjected to paired-end read sequencing using the Illumina HiSeq platform. The raw sequencing data were deposited into the NCBI SRA (Sequence Read Archive) database with an accession number of PRJNA535361.
To identify BpNACs in response to low-temperature stress, we designed a time series experiment including six time points, according to Yang's method [54]. Two-month-old B. pendula plants grown in the greenhouse at a constant temperature of 25 • C with a photoperiod of 16 h of light and 8 h of dark were exposed to low temperatures (6 • C) for 0.5 h, 1 h, 1.5 h, 2 h, 2.5 h, and 3 h, respectively. After low-temperature exposure, all young leaves were harvested at the same time to avoid falsely differentially expressed genes caused by diurnal changes. Plants without low temperature treatment were used as the control for this study. We generated two biological replicates for time points of 1 h, 2 h, 2.5 h, 3 h, and CK (control check), and three biological replicates for time points of 0.5 h and 1.5 h, respectively. Total RNA of the samples were extracted using the traditional CTAB method [53]. The constructed cDNA libraries were sequenced using the Illumina HiSeq platform at Biomarker Technologies Corporation (Beijing, China). The raw sequencing data were deposited into the NCBI SRA database with an accession number of PRJNA532995.
After removing reads of low quality, each set of sequences was separately aligned to the B. pendula reference genome [8] using bowtie2 [55]. The RNA-Seq data were subsequently analyzed using the RSEM (RNA-Seq by Expectation-Maximization) pipeline [56] and the data were processed using a paired-end sequencing mode. The RSEM handle ambiguously mapping reads used the method of Li et al. [57], which treats mapping uncertainty in a statistically rigorous manner. This approach allows for accurate gene-level abundance estimates between both isoforms and genes using paired-end reads [56]. Compared to controls (plants without low-temperature treatment), differentially expressed genes (DEGs) at each time point were identified using edgeR [58] software with a threshold of FDR < 0.05 and log 2 FC > 1. The gene expression levels were normalized using the trimmed mean of the M-values (TMM) method. The fold changes of differentially expressed genes (DEGs) were the average TMM of each time point divided by the average TMM of the control group. The number of minimum repeat samples was 2, and the minimum value of count per million was 1 to filter out samples with a low expression.

Quantitative RT-PCR
The two-month-old plants of birch were treated with low-temperature stress for 3 h and 6 h. Three biological replicates were prepared for each time point. Total RNA of young leaves was then isolated for the qRT-PCR experiment. The resulting RNA was used as a template for first-strand cDNA synthesis with random hexamers, according to the manufacturer's instructions. Quantitative real-time RT-PCR (qPCR) was performed with an Applied Biosystems 7500 Real-time PCR System to examine the expression patterns of the selected genes. Primer pairs for real-time quantitative PCR (Table S1) were designed using A plasmid Editor v1.11. Each reaction contained 10 µL of THUNDERBIRD SYBR qPCR Mix (QPS-201, TOYOBO, Japan), 2 µL of cDNA, 1 µL of forward primer, 1 µL of reverse primer, and 6 µL of double-distilled water (ddH 2 O). The PCR reactions were performed under the following conditions: 95 • C for 10 min, which was followed by 45 cycles of 95 • C for 30 s and 60 • C for 10 s. The 2 −∆∆CT method of analysis was used with the 18S RNA gene as the normalizing gene. Each sample was analyzed in triplicate.

Co-Expression Network Construction and GO Enrichment Analysis of BpNAC Gene Family
We used WGCNA (Weighted Correlation Network Analysis) to identify genes co-expressed with the low-temperature responsive BpNACs [59]. The co-expression network was visualized using Cytoscape [60]. The data used as input for the co-expression network construction are the gene expression data derived from the time series experiment of low-temperature stress. Only members with an expression greater than 10 in all samples are retained. The optimal β (soft thresholding power) value was determined to be 13 after 20 iterations. The Pearson algorithm is then used to calculate the correlation coefficient and the result is stored as a signed co-expression matrix. The "deepSplit" value was set to 2 during clustering, the "minModuleSize" value was set to 30, and the "mergeCutHeight" value was set to 0.15. To ensure the reliability and readability of the results, we filtered out the edges with weights below 0.1 and used Cytoscape to display genes directly related to TFs. The GO annotation was performed by EggNOG-mapper with the diamond model and visualized using WEGO 2.0 [61].

Genome-Wide Identification of the BpNAC Gene Family
We first identified candidates of BpNACs by searching no apical meristem (NAM) domain (PF02365) in the Betula pendula genome using InterProScan [35], PfamScan [36], and HMMER [37]. A total of 108 putative BpNACs (Table 1) were identified in the B. pendula genome at this step. Further Conserved Domains Database (CDD) [34] analysis revealed that all the genes have a NAM or NAC conserved domain structure, which demonstrates the reliability of the results. The 108 genes were defined as BpNAC001 to BpNAC108, and used for further analysis. In addition, we also identified six genes ( Table S2) that were similar to the known NACs. Lastly, a total of 114 BpNACs were identified in the B. pendula genome. However, all the six genes may be truncated BpNACs and contain no conserved NAC domain. Thus, in this study, we did not use them for further analysis. The coding region lengths of the BpNACs were from 276 bp to 3804 bp. The deduced proteins range from 91 to 1268 amino acids in length with theoretical isoelectric points from 4.28 to 9.65. The predicted molecular weights of the BpNAC proteins range from 9.91 kDa to 139.55 kDa. The average predicted the isoelectric point of the encoded proteins, which was 6.30. The average molecular weight was 39.50 kDa. The detailed information is summarized in Table 1.

Sequence-Structure Features of BpNAC Gene Family
In a natural evolution, the diversity of genes is an important factor in promoting morphological diversity, and the morphological diversity allows plants to use natural resources efficiently or to adapt to the adverse environmental conditions [62]. To investigate the genomic structural features, we plotted the gene structural map of the BpNAC genes ( Figure S1). BpNAC038 contains 15 introns, which have the most introns in the BpNAC gene family, followed by BpNAC064, which contains nine introns. A large number of introns may be due to the complex evolutionary process of B. pendula. The number of exons in the BpNAC gene family is generally less than eight.
The conserved amino acid motifs of the BpNAC proteins were identified using MEME [43]. A total of eight amino acid conserved motifs of the BpNAC family were identified. As shown in Figure S2, about 50% of the BpNACs contain motif 1, 2, 3, 4, 5, 6, and 7. In the remaining sequence, motif 4 and motif 7 are replaced by motif 8. Motif 8 is the longest motif in BpNAC protein family including 29 amino acids. By comparison with the NR database [63], we found that this motif is the most similar with OMO69178 from Corchorus capsularis, which proved to be the NAM protein. Motif 1 has been found in Glycine soja's XP_028199933 and XP_028199935 protein, and they all belong to the NAC 59-like gene. Motif 1 was found in CUC2 of Spinacia oleracea, Mucuna pruriens, and Chenopodium quinoa. Kamiuchi et al. [64] reported that CUC2 promote carpel margin meristem formation during A. thaliana gynoecium development. Motif 2 was found in NAC48, and has similar structures in various species, such as Dichanthelium oligosanthes, Oryza sativa, Apostasia shenzhenica, Panicum hallii, Ananas comosus, and Setaria italica. NAC48 is involved in multicellular organismal development and regulation of transcription [65]. Motif 3 exhibits a similarity to NAC71 and NAC86, and has been found in Capsella rubella, Arachis duranensis, Elaeis guineensis, Amborella trichopoda, and Solanum lycopersicum. Motif 4 exhibits similarity to NAC40 and NAC86 and is widely distributed in a variety of plants, such as Raphanus sativus, Brassica rapa, Arachis duranensis, Medicago truncatula, and Cicer arietinum. Motif 6 exhibits similarity to the CUP-SHAPED COTYLEDON 3 in Medicago truncatula. Hibara et al. [66] reported that CUP-SHAPED COTYLEDON3 could regulate postembryonic shoot meristem and organ boundary formation. The other shorter motif sequences of motif 5 and motif 7 also show varying degrees of similarity in some NAC-related genes. Based on the genomic structure results, we have a preliminary understanding of the characteristics of the BpNACs. The similar gene structures and conserved amino acid motifs of BpNACs may provide additional support to the phylogenetic analysis.

Chromosomal Location and Gene Duplication of the BpNAC Gene Family
We investigated the formation of the BpNAC family, according to the chromosomal location and intra-genome syntenic information. The analysis of chromosomal location revealed that the 97 BpNAC genes are distributed among 14 chromosomes (Chr). As shown in Figure S3, 13, 12, 11, and 10 BpNACs are located on Chr2, Chr12, Chr1, and Chr7, respectively. Seven BpNACs are located on Chr5 and Chr10, six on Chr3, Chr9, and Chr14, five on Chr4 and Chr13, four on Chr6 and Chr8, and the last one on Chr11.
According to the chromosomal location results, we found that some BpNAC genes were organized into duplicated blocks. A total of 35 BpNACs are distributed as clusters in the B. pendula chromosomes and may originate from duplicative transposition ( Figure S3). For Chr1, Chr5, and Chr12, each chromosome contains eight, six, and seven tandem BpNACs, respectively. For Chr9 and Chr13, each chromosome contains four tandem BpNACs. For Chr2, Chr3, and Chr8, each chromosome contains two tandem BpNACs.
WGD events were the products of nondisjunction during meiosis [67], and can cause excessive gene duplication in a short period of time. It is reported that the B. pendula genome had experienced the paleohexaploid event [8]. To determine the possible relationship between the formation of the BpNAC family and the paleohexaploid event, we constructed 3538 chromosome segmental duplicated blocks for the B. pendula genome using MCScanX [44]. Then, the BpNACs were mapped to the duplicated blocks and the distributions were illustrated in Figure S4. According to the MCScanX results, 66 BpNACs were found to be in the syntenic blocks, and a total of 280 links were produced. Among them, chromosome 2 is most closely related to other chromosomes, and links with 40 BpNACs from 11 chromosomes.

Phylogenetic Analysis of the BpNAC Gene Family
In A. thaliana, rice, and soybean, the NAC family was divided into two, five, and 15 groups, respectively [68,69]. More recently, comprehensive phylogenetic analyses of numerous NAC proteins from a variety of plant species indicated that NAC proteins can be divided into eight subfamilies [70]. To investigate the phylogenetic relationship among the BpNAC proteins in the B. pendula genome, an unrooted phylogenetic tree was constructed ( Figure S5) using BpNAC protein sequences by RAxML [47] with 1000 bootstrap values. According to the structure similarity and conserved motif information of branches on the tree, the 108 BpNAC proteins could be classified into seven clades: Clade I, Clade II, Clade III, Clade IV, Clade V, Clade VI, and Clade VII. Clade III is the largest branch including 30 BpNACs, which is followed by Clade I and includes 21 BpNAC members. Clade V and Clade VI include nine members, while Clade II and Clade IV contain 17 and 15 BpNAC members, respectively. BpNAC002, BpNAC005, BpNAC015, BpNAC057, BpNAC072, BpNAC090, and BpNAC103 are not identical in structure similarity to the above categories, and were divided into Clade VII. In addition, we constructed a phylogenetic tree using proteins encoded by BpNACs and AtNACs (Figure 1), which facilitated us to better understand the functions of BpNACs via comparative analysis.
NAC proteins were considered to have undergone extensive expansion after the divergence of vascular plants [17]. We identified homologous genes of each BpNAC in A. thaliana, and then compared the results with the COG phylogenetic trees constructed by Jin et al. [69]. The results indicated that the B. pendula genome gained a total of 36 NAC genes by gene expansion. A total of 2, 1, 16, 3, 13, and 1 expanded BpNACs were distributed on clade I to VI of the COG phylogenetic trees, respectively. On the contrary, the B. pendula genome lost 26 NAC genes by gene contraction. A total of 11, 3, 10, and 2 BpNACs were lost on clade I, II, III, and V of the COG phylogenetic trees, respectively.

Functional Analysis of the BpNAC Gene Family
Functional annotation of the BpNAC gene family was performed using eggNOG-mapper [48]. Of the 108 BpNACs, 80 were annotated with the GO database [71]. Of these, 28 BpNACs were annotated with GO terms of responses to stimuli, 14 were annotated to positive regulation of biological processes, and 18 were annotated to the impact on plant development processes. According to the COG database annotation [72], BpNAC058 and BpNAC086 may be involved in cell wall/membrane/envelope biogenesis. The annotation information is listed in Table S3.
Gene expression in eukaryotes is a highly coordinated process involving regulation at different levels. Promoter sequences of genes play a critical role in initiating the transcription of a gene. Identification and functional prediction of the promoter regulatory elements of the BpNAC gene family were performed using PlantCARE [51]. According to the PlantCARE result, a variety of motifs were distributed in the promoters of BpNACs, which indicates that BpNACs may be implicated in various biological processes of birch. All the promoter sequences of the BpNACs contain at least one light response motif (Table S4). In addition, many promoters of BpNAC contain motifs involved in methyl jasmonate, salicylic acid, and auxin responsiveness. The results indicated that BpNACs may be implicated in various aspects of plant development.

Tissue-Specific Expression Profiles of BpNACs
To further explore the function of BpNACs in birch development, we investigated the expression profiles of BpNACs in different tissues including roots, leaves, xylem, and flowers. We generated 11 samples from four tissues, with three biological replicates of flowers, leaves, and roots, and two biological replicates of xylem, which resulted in 435,991,087 reads. After removing reads of low quality and adaptor sequences, a total of 405,059,111 clean paired-end reads were obtained. The detailed information is summarized in Table S5.
We found that the BpNACs are widely expressed in B. pendula roots and xylem ( Figure 2). BpNAC053 is expressed at the highest level in all tissues when compared to other BpNACs. EggNOGmapper predicts it to be the BTF3, which is considered to be the nascent polypeptide-associated complex NAC. BpNAC017, BpNAC019, BpNAC052, BpNAC054, BpNAC071, BpNAC083, and BpNAC096 are specifically expressed in roots when compared with other tissues. According to the

Functional Analysis of the BpNAC Gene Family
Functional annotation of the BpNAC gene family was performed using eggNOG-mapper [48]. Of the 108 BpNACs, 80 were annotated with the GO database [71]. Of these, 28 BpNACs were annotated with GO terms of responses to stimuli, 14 were annotated to positive regulation of biological processes, and 18 were annotated to the impact on plant development processes. According to the COG database annotation [72], BpNAC058 and BpNAC086 may be involved in cell wall/membrane/envelope biogenesis. The annotation information is listed in Table S3.
Gene expression in eukaryotes is a highly coordinated process involving regulation at different levels. Promoter sequences of genes play a critical role in initiating the transcription of a gene. Identification and functional prediction of the promoter regulatory elements of the BpNAC gene family were performed using PlantCARE [51]. According to the PlantCARE result, a variety of motifs were distributed in the promoters of BpNACs, which indicates that BpNACs may be implicated in various biological processes of birch. All the promoter sequences of the BpNACs contain at least one light response motif (Table S4). In addition, many promoters of BpNAC contain motifs involved in methyl jasmonate, salicylic acid, and auxin responsiveness. The results indicated that BpNACs may be implicated in various aspects of plant development.

Tissue-Specific Expression Profiles of BpNACs
To further explore the function of BpNACs in birch development, we investigated the expression profiles of BpNACs in different tissues including roots, leaves, xylem, and flowers. We generated 11 samples from four tissues, with three biological replicates of flowers, leaves, and roots, and two biological replicates of xylem, which resulted in 435,991,087 reads. After removing reads of low quality and adaptor sequences, a total of 405,059,111 clean paired-end reads were obtained. The detailed information is summarized in Table S5.
We found that the BpNACs are widely expressed in B. pendula roots and xylem ( Figure 2). BpNAC053 is expressed at the highest level in all tissues when compared to other BpNACs. EggNOG-mapper predicts it to be the BTF3, which is considered to be the nascent polypeptide-associated complex NAC. BpNAC017, BpNAC019, BpNAC052, BpNAC054, BpNAC071, BpNAC083, and BpNAC096 are specifically expressed in roots when compared with other tissues. According to the eggNOG-mapper annotation, we found that these genes may play a role in promoting cell growth and cellulose synthesis. Combined with the annotations of the GO database, we speculated that these genes may be related to root development. In addition to root, the expression levels of BpNACs in xylem are higher than other tissues. The average TMM value of xylem is 24.63, which is much higher than 16.72 and 20.50 of flowers and leaves. This is slightly lower than 26.67 of roots. A total of 28 genes are expressed at the highest levels in xylem when compared to other tissues. Eight of these genes are specifically expressed in xylem (Table S6). Several BpNACs, i.e., BpNAC002, BpNAC003, BpNAC004, and BpNAC072, which are highly expressed in xylem, have been shown to be involved in the organic substance metabolic process. In addition, BpNAC002 and BpNAC072 also show the potential to synthesize the desired components of the cell wall. eggNOG-mapper annotation, we found that these genes may play a role in promoting cell growth and cellulose synthesis. Combined with the annotations of the GO database, we speculated that these genes may be related to root development. In addition to root, the expression levels of BpNACs in xylem are higher than other tissues. The average TMM value of xylem is 24.63, which is much higher than 16.72 and 20.50 of flowers and leaves. This is slightly lower than 26.67 of roots. A total of 28 genes are expressed at the highest levels in xylem when compared to other tissues. Eight of these genes are specifically expressed in xylem (Table S6). Several BpNACs, i.e., BpNAC002, BpNAC003, BpNAC004, and BpNAC072, which are highly expressed in xylem, have been shown to be involved in the organic substance metabolic process. In addition, BpNAC002 and BpNAC072 also show the potential to synthesize the desired components of the cell wall.

Differential Expression Profiles of the BpNAC Gene Family under Cold Stress
Since birch is a cold-resistant species, we used RNA-Seq to analyze the expression profiles of BpNACs when exposed to a low temperature for 0.5 h, 1 h, 1.5 h, 2 h, 2.5 h, and 3 h (Table S7). We generated two biological replicates at 0 h, 1 h, 2 h, 2.5 h, 3 h, and three biological replicates at 0.5 h and 1.5 h. A total of 16 sets of data were obtained, including 212,356,980 reads. After removing reads of low quality, adaptor sequences, we got a total of 192,271,376 clean paired-end reads. The detailed information is summarized in Table S5.
A total of 21 BpNACs were differentially expressed in at least a one-time point under lowtemperature stress (Table S7). Of the 21 low-temperature-responsive genes, 17 were up-regulated, and four were down-regulated by low-temperature stress. A total of 12 BpNACs were classified into Clade III, and half of them were significantly up-regulated by low-temperature treatment (FDR < 0.05), including BpNAC016, BpNAC024, BpNAC025, BpNAC026, BpNAC052, and BpNAC107. In addition, the other genes in this clade were slightly up-regulated (up-regulated, but FDR > 0.05). The results indicated that BpNACs in Clade III may be sensitive to low temperatures and may play a role during low-temperature treatment. In addition to Clade III, eight BpNACs in Clade V, including BpNAC001, BpNAC004, BpNAC032, BpNAC035, BpNAC037, BpNAC043, BpNAC077, and BpNAC085, were up-regulated by low-temperature treatment. On the contrary, no low-temperature response genes were found on Clade I.

Differential Expression Profiles of the BpNAC Gene Family under Cold Stress
Since birch is a cold-resistant species, we used RNA-Seq to analyze the expression profiles of BpNACs when exposed to a low temperature for 0.5 h, 1 h, 1.5 h, 2 h, 2.5 h, and 3 h (Table S7). We generated two biological replicates at 0 h, 1 h, 2 h, 2.5 h, 3 h, and three biological replicates at 0.5 h and 1.5 h. A total of 16 sets of data were obtained, including 212,356,980 reads. After removing reads of low quality, adaptor sequences, we got a total of 192,271,376 clean paired-end reads. The detailed information is summarized in Table S5.
A total of 21 BpNACs were differentially expressed in at least a one-time point under low-temperature stress (Table S7). Of the 21 low-temperature-responsive genes, 17 were up-regulated, and four were down-regulated by low-temperature stress. A total of 12 BpNACs were classified into Clade III, and half of them were significantly up-regulated by low-temperature treatment (FDR < 0.05), including BpNAC016, BpNAC024, BpNAC025, BpNAC026, BpNAC052, and BpNAC107. In addition, the other genes in this clade were slightly up-regulated (up-regulated, but FDR > 0.05). The results indicated that BpNACs in Clade III may be sensitive to low temperatures and may play a role during low-temperature treatment. In addition to Clade III, eight BpNACs in Clade V, including BpNAC001, BpNAC004, BpNAC032, BpNAC035, BpNAC037, BpNAC043, BpNAC077, and BpNAC085, were up-regulated by low-temperature treatment. On the contrary, no low-temperature response genes were found on Clade I. We then analyzed the cis-acting elements in the promoter sequences of the 21 low-temperatureresponsive BpNACs using PlantCARE [51]. The promoters of eight BpNACs, including BpNAC016, BpNAC024, BpNAC043, BpNAC052, BpNAC063, BpNAC075, BpNAC077 and BpNAC108, contain low-temperature response-related motifs. In addition, the promoter of BpNAC041 contains cis-acting elements involved in dehydration, a low temperature, and salt stress. All these results indicated that BpNACs may play important roles in cold resistance in birch.

Validation of DEGs Identified by RNA-seq Using qRT-PCR
We used qRT-PCR (Quantitative Real-time PCR) to validate the expression profiles of BpNACs after low temperature treatment. A total of 12 cold-responsive BpNACs were selected for qRT-PCR analysis. As shown in Figure 3, the qRT-PCR results indicated that all the 12 BpNACs were up-regulated by low-temperature stress, which was consistent with the results derived from the RNA-seq data. BpNACs including BpNAC002, BpNAC004, BpNAC019, BpNAC025, BpNAC026, BpNAC032, BpNAC052, BpNAC054, BpNAC063, and BpNAC108 were up-regulated when exposed to a low temperature for 3 h and 6 h. BpNAC003 and BpNAC092 were up-regulated when exposed to a low temperature for 6 h. In general, this qRT-PCR result supports the reliability of the RNA-seq analysis. We then analyzed the cis-acting elements in the promoter sequences of the 21 low-temperatureresponsive BpNACs using PlantCARE [51]. The promoters of eight BpNACs, including BpNAC016, BpNAC024, BpNAC043, BpNAC052, BpNAC063, BpNAC075, BpNAC077 and BpNAC108, contain lowtemperature response-related motifs. In addition, the promoter of BpNAC041 contains cis-acting elements involved in dehydration, a low temperature, and salt stress. All these results indicated that BpNACs may play important roles in cold resistance in birch.

Validation of DEGs Identified by RNA-seq Using qRT-PCR
We used qRT-PCR (Quantitative Real-time PCR) to validate the expression profiles of BpNACs after low temperature treatment. A total of 12 cold-responsive BpNACs were selected for qRT-PCR analysis. As shown in Figure 3, the qRT-PCR results indicated that all the 12 BpNACs were upregulated by low-temperature stress, which was consistent with the results derived from the RNAseq data. BpNACs including BpNAC002, BpNAC004, BpNAC019, BpNAC025, BpNAC026, BpNAC032, BpNAC052, BpNAC054, BpNAC063, and BpNAC108 were up-regulated when exposed to a low temperature for 3 hours and 6 hours. BpNAC003 and BpNAC092 were up-regulated when exposed to a low temperature for 6 hours. In general, this qRT-PCR result supports the reliability of the RNAseq analysis.

Co-Expression of Low-Temperature Stress Genes in the BpNAC Gene Family
The gene co-expression network analysis is a system biology method for describing the correlation patterns among genes across RNA-Seq samples. We used WGCNA [73] to construct a TFcentered co-expression network for the 21 low-temperature response BpNACs. With a weight threshold of 0.1, we filtered out some branches with lower weights and only keep edges with strong

Co-Expression of Low-Temperature Stress Genes in the BpNAC Gene Family
The gene co-expression network analysis is a system biology method for describing the correlation patterns among genes across RNA-Seq samples. We used WGCNA [73] to construct a TF-centered co-expression network for the 21 low-temperature response BpNACs. With a weight threshold of 0.1, we filtered out some branches with lower weights and only keep edges with strong connections. Lastly, 15 co-expression networks were constructed (Figure 4), of which seven networks contain less than 100 genes, and two contain more than 500 genes.
connections. Lastly, 15 co-expression networks were constructed (Figure 4), of which seven networks contain less than 100 genes, and two contain more than 500 genes. The largest co-expression network of BpNAC077 contains 822 genes. By dissecting the cellular component of the co-expressed genes, we found that nearly 200 genes may be involved in membrane structure formation ( Figure 5). The fluidity and stability of cell membranes are the basis for the survival of cells and even for the whole plants. It can effectively improve the cold resistance of plants by promoting the stability of membranes. The co-expressed network of BpNAC041 contains 630 genes. GO annotation results indicated that one-third of the BpNAC041 co-expressed genes are related to organelles, and they play a wide role in the growth and development of ribosomes, cell plates, spindles, the microtubule cytoskeleton, and other parts.  The largest co-expression network of BpNAC077 contains 822 genes. By dissecting the cellular component of the co-expressed genes, we found that nearly 200 genes may be involved in membrane structure formation ( Figure 5). The fluidity and stability of cell membranes are the basis for the survival of cells and even for the whole plants. It can effectively improve the cold resistance of plants by promoting the stability of membranes. The co-expressed network of BpNAC041 contains 630 genes. GO annotation results indicated that one-third of the BpNAC041 co-expressed genes are related to organelles, and they play a wide role in the growth and development of ribosomes, cell plates, spindles, the microtubule cytoskeleton, and other parts. connections. Lastly, 15 co-expression networks were constructed (Figure 4), of which seven networks contain less than 100 genes, and two contain more than 500 genes. The largest co-expression network of BpNAC077 contains 822 genes. By dissecting the cellular component of the co-expressed genes, we found that nearly 200 genes may be involved in membrane structure formation ( Figure 5). The fluidity and stability of cell membranes are the basis for the survival of cells and even for the whole plants. It can effectively improve the cold resistance of plants by promoting the stability of membranes. The co-expressed network of BpNAC041 contains 630 genes. GO annotation results indicated that one-third of the BpNAC041 co-expressed genes are related to organelles, and they play a wide role in the growth and development of ribosomes, cell plates, spindles, the microtubule cytoskeleton, and other parts.

Discussion
The study of transcription factors is an important part of functional genomics research. Transcription factors control the expression of downstream genes and are key regulators of various biological processes [74]. The NAC family are plant-specific transcription factors. Previous studies [15,26,75] have shown that NACs are involved in plant growth and development, which pattern organs, biotic stress, and abiotic stress. However, current research on NAC transcription factors is mainly focused on model plants. In this study, we identified and investigated the function of NACs in a woody species, Betula pendula. Our study revealed that the B. pendula genome contains more NAC genes than some of the sequenced dicotyledonous angiosperms such as Arabidopsis thaliana [33], Theobroma cacao [76], and Carica papaya [77]. By examining the distribution of BpNACs on B. pendula chromosomes, we found that some of the BpNACs are clustered on the B. pendula genome. In addition, we found that some of the BpNACs are located on the intra-genome synthetic blocks. The results revealed that tandem duplication and the whole genome duplication have an important impact on the expansion of the BpNAC gene family.
Genetic analyses have revealed that NACs can affect the plant life cycle by regulating cell division [78]. In the A. thaliana NTM1 mutant, a series of CDK (cyclin-dependent kinases) repressor genes are induced to express the inhibition of cell division, which results in delayed growth [79]. The expression of AtNAP is closely related to the senescence of A. thaliana leaves [80]. In our study, BpNAC003, BpNAC004, BpNAC013, BpNAC032, BpNAC085, and BpNAC091 are homologous to NTM1/AtNAP of A. thaliana and are classified as Clade VI members. The results indicated that the Clade VI of BpNACs may be related to the growth of birch.
In this study, we identified BpNACs that were highly expressed in the xylem of birch, such as BpNAC002, BpNAC003, BpNAC004, BpNAC013, BpNAC027, BpNAC042, BpNAC072, and BpNAC095. These xylem-specific BpNACs may be involved in lignin biosynthesis. Compared to A. thaliana, we found that BpNAC013 and BpNAC042 are homologous to AT4G17980 and AT2G33480, respectively. Pitaksaringkarn et al. [81] reported that AT4G17980 (ANAC071) is involved in the development of inflorescence stems by regulating the expression of XTH20 and XTH19 in A. thaliana. Kim et al. reported that AT2G33480 (ANAC041) directly regulates the expression of CSLA9 [82], which belongs to one of the cellulose synthase-like genes [83]. It is interesting that BpNAC002 and BpNAC072 are homologous to AT2G46770, and BpNAC027 and BpNAC095 are homologous to AT4G28500, respectively. AT2G46770 (ANAC043) is known as NST1, which regulates the secondary wall thickenings in anther walls and siliques, and an NST1 promoter fusion was detected in various tissues in which lignified secondary walls develop [84]. AT4G28500 (ANAC073), known as SND2, regulates genes involved in the development of secondary cell walls in A. thaliana fibers and increases the fiber cell area in Eucalyptus [85]. BpNAC003 and BpNAC004 are homologous to AT4G35580 (NTL9), which encodes an NAC protein that contains a calmodulin-binding domain at the C-terminus of a protein. It is reported that AT4G35580 functions as a calmodulin-regulated transcriptional repressor [86]. Wang et al. [87] found that AT4G35580 may be a cell-wall related transcription factor based on a co-expression network analysis. BpNAC005 is highly expressed in the xylem, which is homologous to AT2G18060 (VND1). VND1 could regulate the development of genes required for secondary cell wall biosynthesis with other members of this family. BpNAC011 is highly expressed in all the four tissues, especially in the roots and xylem. Its homologous gene in A. thaliana is AT5G13180 (ANAC083), which interacts with VND7. This acts as a master regulator of xylem vessel differentiation [88].
Plants are challenged by various environmental stresses. Low-temperature is one of the major factors that adversely impact plant development [89]. Plants have evolved multiple mechanisms to avoid cold stress [90]. Through a cold stress experiment, we identified 19 cold stress responsive BpNACs. By combining A. thaliana expression data [50], we found that BpNAC026 and BpNAC077 are homologous to AT4G27410 (ANAC072) and AT1G34190 (ANAC017), respectively, and they were up-regulated by cold stress in both A. thaliana and B. pendula. It is reported that AT4G27410 acts as a transcriptional activator in the ABA-mediated dehydration response [91]. AT1G34190 has been shown to regulate mitochondrial retrograde responses and coordinate organelle function and a stress response [92].
Although the NAC genes are widely involved in growth and development as well as biotic and abiotic stress [75,93,94], the research on the complex regulatory network of NAC genes is still in its infancy. Given that only a few BpNAC genes have been well characterized in function, our results could help select candidate genes for further characterization.

Conclusions
This study aims to provide information on Betula pendula NACs regarding their potential physiological roles and the molecular mechanism associated. Using bioinformatics and phylogenetic analyses, we identified 114 BpNACs that were divided into seven subfamilies. The BpNACs gene structure, chromosome location, and BpNACs conserved protein motifs were identified. Using available B. pendula expression data, tissue expression profiles of the BpNACs were investigated and the results showed that the expression levels of BpNACs were higher in xylem and roots when compared to other tissues. In addition, RNA-seq analysis was used to determine the expression profiles of BpNACs when exposed to a low temperature. A total of 21 BpNACs were differentially expressed under a low temperature and associated gene co-expression networks were constructed. These analyses will help decipher the genetic information of the NAC genes in birch, and provide theoretical support for the screening and functional analysis of subsequent functional genes.
Supplementary Materials: The following are available online at http://www.mdpi.com/1999-4907/10/9/741/s1. Figure S1: Exon-intron structures of BpNACs. The genomic structure information of the BpNACs was obtained from CoGe (https://genomevolution.org/coge/). Figure S2: Conserved amino acid motifs of the BpNAC proteins. Figure S3: Chromosomal distribution of BpNAC genes in B. pendula. Figure S4: Chromosomal distribution and gene duplications of the BpNAC genes. Figure S5: The phylogenetic tree of BpNACs made by RAxML. Its confidence was evaluated with 1000 bootstraps. Table S1: Primer pairs for real-time quantitative PCR. Table S2: Basic information of the putative truncated BpNACs without the NAC domain. Table S3: Functional annotation of the BpNAC gene family. Table S4: Promoter identification of the BpNAC gene family. Table S5: Summary of B. pendula RNA-Seq data. Table S6: Tissue-specific expression profiles of the BpNAC gene family. Table S7