Genome-Wide Survey and Functional Verification of the NAC Transcription Factor Family in Wild Emmer Wheat

The NAC transcription factor (TF) family is one of the largest TF families in plants and has been widely reported in rice, maize and common wheat. However, the significance of the NAC TF family in wild emmer wheat (Triticum turgidum ssp. dicoccoides) is not yet well understood. In this study, a genome-wide investigation of NAC genes was conducted in the wild emmer genome and 249 NAC family members (TdNACs) were identified. All of these genes contained NAM/NAC-conserved domains and most of them were predicted to be localized to the nucleus. Phylogenetic analysis showed that these 249 TdNACs can be classified into seven clades, which are likely to be involved in the regulation of grain protein content, starch synthesis and response to biotic and abiotic stresses. Expression pattern analysis revealed that TdNACs were highly expressed in different tissues such as grains, roots, leaves and shoots. We found that TdNAC8470 was phylogenetically close to NAC genes that regulate either grain protein or starch accumulation. Overexpression of TdNAC8470 in rice increased grain starch concentration but decreased grain Fe, Zn and Mn contents compared with wild-type plants. Protein interaction analysis indicated that TdNAC8470 might interact with granule-bound starch synthase 1 (TdGBSS1) to regulate grain starch accumulation. Our work provides a comprehensive understanding of the NAC TF family in wild emmer wheat and paves the way for future functional analysis and genetic improvement of grain starch content in wheat.


Introduction
Transcription factors (TFs) can activate or inhibit the expression of their target genes by binding to the promoter regions of those genes [1]. Approximately 6-8% of plant genome sequences encode TFs [2], which are implicated in plant growth, development and response to biotic and abiotic stresses [3]. In common wheat (Triticum aestivum L.), 5776 TFs belonging to 56 TF families have been identified. Among them, bHLH is the largest TF family, while STAT is the smallest [4].
NAC TFs, named after NAM (no apical meristem), ATAF1/2 (Arabidopsis transcription activator factor 1/2) [5] and CUC2 (cup-shaped cotyledon) [6], constitute one of the largest plant-specific TF families [7]. The NAC TF family contains eight different subfamilies (NACa, NACb, NACc, NACd, NACe, NACf, NACg and NACh) that play different roles in plant growth and development processes [8]. A NAC protein usually has a highly divergent C-terminal transcriptional regulatory region and a conserved N-terminal DNA-binding domain (~150 amino acids). The C-terminal transcriptional regulatory region functions as a transcriptional activator or repressor of target genes [9]. The conserved N-terminal DNA-binding domain can be further classified into five subdomains,

Identification and Analysis of TdNAC Genes in Wild Emmer
Using the HMMER search tool with E-value ≤ 0.0001, we found that 263 wild emmer genes might belong to the NAC TF family. However, based on NCBI-CDD analysis, fourteen of these genes did not contain a NAC/NAM protein domain, which led to the identification of the remaining 249 genes as TdNAC TF family members. Among them, 233 genes contained a NAM-conserved domain while the other 16 genes contained a NAC-conserved domain (Figure S1). The 249 TdNAC genes were mapped to the 14 chromosomes of wild emmer wheat, with the most on chromosome 2B (33 TdNACs) and the fewest on chromosome 1B (6 TdNACs) (Figure S2). The protein lengths of the 249 TdNACs ranged from 49 AA (TRIDC4BG055130.1) to 730 AA (TRIDC5AG041100.4). Among the 249 TdNAC proteins, most (223/249) were full length, while the minority were fragments with either an N-terminal or a C-terminal region, but all had a complete NAM/NAC domain (Table S1). The theoretical pI ranged from 4.23 (TRIDC5AG073570.3) to 11.68 (TRIDC2BG090110.1) and the Mw ranged from 5470.32 (TRIDC4BG055130.1) to 80026 (TRIDC5AG041100.4). The subcellular location prediction results showed that 244 TdNAC proteins were located in the nucleus and only 6 TdNAC proteins were located in the chloroplast, of which TRIDC2BG055170.1 was predicted in both the nucleus and the chloroplast (Table S1).
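The two-step screening described above (an HMMER E-value cutoff followed by domain confirmation) can be sketched in a few lines of Python. This is only an illustrative sketch, not the pipeline used in the study: the `Candidate` record, the function `filter_nac_candidates`, the gene names and the toy E-values are all hypothetical, and the domain labels are assumed to come from a CDD-style annotation.

```python
from dataclasses import dataclass, field

E_VALUE_CUTOFF = 1e-4          # cutoff stated in the text (E-value <= 0.0001)
NAC_DOMAINS = {"NAM", "NAC"}   # domain labels accepted as family evidence


@dataclass
class Candidate:
    gene_id: str                                   # hypothetical identifier
    e_value: float                                 # best HMMER hit E-value
    cdd_domains: set = field(default_factory=set)  # domains from a CDD-style search


def filter_nac_candidates(candidates):
    """Keep candidates that pass the E-value cutoff and carry a NAM/NAC domain."""
    kept = []
    for c in candidates:
        if c.e_value > E_VALUE_CUTOFF:
            continue  # step 1: rejected by the HMMER threshold
        if not (c.cdd_domains & NAC_DOMAINS):
            continue  # step 2: no NAM/NAC domain confirmed
        kept.append(c)
    return kept


# Toy input (hypothetical genes, not data from the study).
toy = [
    Candidate("geneA", 1e-30, {"NAM"}),
    Candidate("geneB", 5e-6, {"NAC"}),
    Candidate("geneC", 2e-5, {"zf-C2H2"}),  # passes HMMER, lacks NAM/NAC domain
    Candidate("geneD", 1e-2, {"NAM"}),      # fails the E-value cutoff
]
selected = filter_nac_candidates(toy)
print([c.gene_id for c in selected])
```

In the study, the analogous second step reduced the 263 HMMER candidates to 249 TdNACs by discarding the fourteen genes without a NAC/NAM domain.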

Expression Patterns of TdNAC Genes in Different Tissues
Based on the expression data of TdNACs retrieved from the public RNA-seq database (http://202.194.139.32/expression/emmer.html) (accessed on 15 September 2022), we constructed a heat map to show the expression patterns of the 249 TdNAC genes in different tissues, including leaves, shoots, roots, flowers, grains, spikes, lemma and glume at different developmental stages. One hundred and nineteen of the 249 TdNACs were considered expressed genes (TPM ≥ 1) and 71 were highly expressed (TPM ≥ 5) in the root at 20 days, among which TRIDC1BG045200.1 (133.61), TRIDC4BG062830.2 (142.23) and TRIDC1AG035350.1 (195.64) had the three highest TPM values. Fifty-one TdNACs were highly expressed in leaves at 54, 77 or 134 days, among which TRIDC7AG042610.1 (TPM = 101.42) had the highest expression and was expressed in leaves at both 54 and 77 days. TRIDC1AG024190.3 and TRIDC5AG073570.3 had the highest expression in leaves at 77 and 134 days, respectively. A total of 46 TdNACs were highly expressed in the developing spike, and 30 were highly expressed during spike development (1-5.5 cm); 58 and 54 TdNAC genes were highly expressed in the lemma and glume at 112 days, respectively; 74 TdNAC genes were highly expressed in flowers at 105-112 days, and 65 genes were highly expressed in grains at 123 and 134 days (Figure S5, Table S2).
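The TPM thresholds used above (TPM ≥ 1 for expressed genes, TPM ≥ 5 for highly expressed genes) can be applied to a tissue-level expression table as in the following sketch. The helper names `count_expressed` and `top_genes` are hypothetical, and the two `hypothetical_gene_*` entries are invented for illustration; only the three root TPM values for the named TdNACs are taken from the text.

```python
EXPRESSED_TPM = 1.0         # "expressed" threshold from the text
HIGHLY_EXPRESSED_TPM = 5.0  # "highly expressed" threshold from the text


def count_expressed(tpm_by_gene):
    """Return (number expressed, number highly expressed); the latter is a subset."""
    expressed = [g for g, t in tpm_by_gene.items() if t >= EXPRESSED_TPM]
    highly = [g for g, t in tpm_by_gene.items() if t >= HIGHLY_EXPRESSED_TPM]
    return len(expressed), len(highly)


def top_genes(tpm_by_gene, n=3):
    """Return the n gene IDs with the highest TPM, highest first."""
    return sorted(tpm_by_gene, key=tpm_by_gene.get, reverse=True)[:n]


# Root (20 days) TPM values: three from the text plus two invented entries.
root_tpm = {
    "TRIDC1BG045200.1": 133.61,
    "TRIDC4BG062830.2": 142.23,
    "TRIDC1AG035350.1": 195.64,
    "hypothetical_gene_1": 3.2,   # expressed but not highly expressed
    "hypothetical_gene_2": 0.4,   # below the expression threshold
}

n_expressed, n_highly = count_expressed(root_tpm)
top3 = top_genes(root_tpm)
print(n_expressed, n_highly, top3)
```

Counting the highly expressed genes as a subset of the expressed genes mirrors how the text reports 71 highly expressed genes among the 119 expressed genes in the root.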

Functional Analysis of TdNAC Genes
Of the 16 differentially expressed TdNAC genes in grains (Table S5), TRIDC7AG078470 was specifically expressed in wild emmer D97 compared with CN16. TRIDC7AG078470 was phylogenetically close to the rice genes ONAC020 and ONAC026 and the wheat genes TaNAC019A/B/D. Previous reports showed that ONAC020, ONAC026 and TaNAC019 can regulate either grain protein or starch concentration [24,27]. Therefore, we chose TRIDC7AG078470 (named TdNAC8470) for further functional characterization. A TdNAC8470-GFP fusion vector was constructed and transiently expressed in Nicotiana benthamiana leaves. The result indicated that TdNAC8470 was localized to the nucleus (Figure 3). To further verify the function of TdNAC8470, we constructed the pCAMBIA2300-GFP-TdNAC8470 vector and transformed it into a rice cultivar (Oryza sativa L. ssp. japonica), generating six TdNAC8470 overexpression lines (OE-TdNAC8470: OE-1, OE-2, OE-3, OE-4, OE-5 and OE-6), which were confirmed by PCR, sequencing analysis and hygromycin-resistance selection (Figure S7). Two overexpression lines (OE-1 and OE-2) were selected for subsequent analysis. Phenotypic investigation found no significant difference in plant height, number of tillers, 1000-grain weight or grain protein content between the overexpression lines (OE-TdNAC8470) and wild-type (WT) plants. Surprisingly, the OE-TdNAC8470 transgenic plants had a significantly higher starch concentration than WT plants (Figure 4). The grain Cu content did not differ significantly between OE-TdNAC8470 and WT plants, while the grain Zn, Mn and Fe contents of OE-TdNAC8470 were significantly lower than those of WT plants (Figure 5).

Protein Interaction Network Analysis of TdNAC8470 Protein
To further explore the function of TdNAC8470, we constructed a protein interaction network for TdNAC8470 (Traes_7AL_38B48B7B2.2) with T. aestivum as the reference using STRING version 11.5. The result showed that ten wheat proteins probably interacted with the TdNAC8470 protein (Table S5).

Discussion
The NAC gene family is one of the largest TF families and has been reported to play important roles in responses to biotic and abiotic stresses, grain development, and grain protein and starch accumulation in rice, maize, Arabidopsis thaliana and common wheat [16,33,47,58]. Wild emmer wheat is the donor of the A and B genomes of common wheat and has abundant gene resources for high grain protein, Fe and Zn content and for abiotic and biotic stress tolerance [16,30]. However, there are few reports on the functions of NAC genes from wild emmer, and only NAM-B1 has been characterized. Overexpression of the functional NAM-B1 could accelerate senescence and increase nutrient remobilization from leaves to developing grains, thereby improving grain protein, Zn and Fe content in wheat, whereas modern wheat varieties carry a nonfunctional NAM-B1 allele. This suggests that some NAC genes may have functions in wild emmer wheat that are absent in common wheat because of sequence variation during wheat evolution. Therefore, it is necessary to identify and utilize the excellent NAC gene resources in wild emmer for wheat improvement. In the current study, we performed a genome-wide investigation of the NAC TF family in the wild emmer genome and identified 249 NAC genes that had conserved NAM or NAC domains. Our findings suggest that these NAC genes may provide new candidates for improving the biotic and abiotic stress resistance and the nutritional quality of common wheat.
Previous studies have reported that the temporal and spatial expression patterns of genes are usually closely related to their functions [67]. In this study, we analyzed the expression patterns of 249 TdNACs in root, leaf, spike, lemma, glume, flower and grain at different stages. We found that 51 and 65 genes were highly expressed in leaf and grain, respectively. Recent studies showed that NAC genes such as TaNAC019, ZmNAC128 and ZmNAC130, which are specifically and highly expressed in wheat or maize grains at the filling stage, are involved in the regulation of grain protein and starch synthesis [15,26,27]. Therefore, we believe that the 65 TdNACs highly expressed in grains may have redundant functions at the grain-filling stage.
Transcriptome analysis found that TRIDC3AG009300, TRIDC3BG013080, TRIDC3BG013090, TRIDC7AG018690, TRIDC7AG024270, TRIDC7AG078470, TRIDC7AG078490, TRIDC7AG078510, TRIDC7BG008180 and TRIDC7BG014950 were significantly upregulated in wild emmer D97 compared with common wheat CN16. In particular, TRIDC7AG078470 (TdNAC8470) was expressed only in D97. Overexpression of TdNAC8470 in rice caused no difference in plant height, number of tillers, 1000-grain weight or grain protein content between OE-TdNAC8470 and WT plants. The grain starch content of OE-TdNAC8470 was significantly higher than that of WT, and the grain Fe, Zn and Mn contents were decreased in OE-TdNAC8470 compared with WT. In rice, the ONAC020/ONAC026 double mutant had significantly decreased starch and storage protein contents [24]. In maize, the knockdown of ZmNAC128 and ZmNAC130 by RNA interference (RNAi) caused a shrunken kernel phenotype with a significant reduction in starch and protein [26]. In wheat, TaNAC100 positively regulated grain starch content and negatively regulated grain protein content [29]. In contrast, TaNAC019 negatively regulated grain starch synthesis and positively regulated grain protein content [27]. In our study, TdNAC8470 had no significant effect on grain protein content, but it had a positive effect on grain starch synthesis and negatively regulated grain Fe, Zn and Mn accumulation.
TaNAC100 can bind the promoters of two key genes, TaGBSS1 and TaSUS, to activate their expression, which leads to increased grain starch synthesis [29]. TaNAC019-A1 repressed the expression of TaAGPS1-A1 and TaAGPS1-B1 by directly binding to the 'ACGCAG' motif in their promoters and thereby decreased starch synthesis in wheat endosperm [15]. ZmNAC128 and ZmNAC130 activated the expression of Bt2, which controls a rate-limiting step in starch synthesis of maize endosperm, by binding to the 'ACGCAA' site, and thereby increased grain starch accumulation [26]. In this study, the TdNAC8470 protein was predicted to interact with granule-bound starch synthase 1 (TdGBSS1, Traes_7AS_25D8C69E9.1 and Traes_4AL_4B9D56131.3). Granule-bound starch synthase 1 directly participates in grain starch accumulation in different plants [68][69][70]. Thus, we speculate that TdNAC8470 could activate the expression of TdGBSS1 and increase grain starch synthesis in wild emmer. In addition, the TdNAC8470 protein interacted with seven proteins involved in responses to superoxide, ozone, salt stress and water deprivation, implying that TdNAC8470 might respond to multiple abiotic stresses.

Identification of NAC Genes in Wild Emmer
The wild emmer wheat genome sequences (Triticum_dicoccoides.WEWSeq_v  [71] to identify the conserved protein domain and reject some candidate genes that are outside the NAC or NAM domain.

Plant Materials
A rice cultivar (Oryza sativa L. ssp. japonica) was used in this study. The transgenic plants were grown in the closed experimental field for transgenic plants of Sichuan Agricultural University (Chengdu, Sichuan Province, China). All samples were stored at −80 °C for RNA-Seq and RNA extraction. RNA-Seq was performed by the BioMarker company and the standardized analysis was carried out using the BMKCloud online tool (http://www.biocloud.net/) (accessed on 15 September 2022).

RNA Extraction
Total RNA from grain samples was isolated using TRIzol™ reagent (Thermo Fisher Scientific, Tokyo, Japan). First-strand cDNA synthesis was performed using the TaKaRa PrimeScript™ RT Reagent Kit (Takara, Dalian, China) according to the manufacturer's instructions.

Rice Transformation
The cDNA of TdNAC8470 from wild emmer wheat D97 was cloned into the overexpression vector pCAMBIA2300-EGFP (pCAMBIA2300-EGFP-TdNAC8470). The construct had KpnI and SpeI sites on the 3′ side of the CaMV 35S promoter (Table S6). An Agrobacterium tumefaciens strain (AGL1) carrying this construct was used to transform rice (Oryza sativa L. ssp. japonica) using the method of Hiei et al. [78]. The T1 seeds obtained from the transformants were germinated on MS medium containing 50 mg/L hygromycin to select resistant plants. In addition, the hygromycin-resistant lines were further confirmed by PCR using gene-specific primers. Leaf segments of two-week-old T2 plants were soaked in 50 mg/L hygromycin solution to further confirm the transgene. Transgene-positive plants are hygromycin resistant, whereas negative plants develop black spots when soaked in hygromycin solution. Homozygous T3 transgenic lines were selected for subsequent experimental analysis [31].

Subcellular Localization
The CDS of TdNAC8470 without the stop codon (TGA) was cloned into the vector pCAMBIA2300-EGFP using the In-Fusion system. The final construct (35S::TdNAC8470-EGFP) and the control vector (35S::EGFP) were each introduced into Agrobacterium tumefaciens strain GV3101, which was then infiltrated into the leaves of Nicotiana benthamiana. After 24 h of darkness, the Nicotiana benthamiana plants were transferred into a plant growth chamber at 20 °C with a 16 h photoperiod. The leaves were collected and the fluorescence signals were detected using a laser-scanning confocal microscope.

Measurement of Grain Protein, Starch and Microelement Concentration
The mature rice seeds were harvested for measurement of grain protein and starch concentrations. Total nitrogen content was determined by the Kjeldahl method (Kjeltec™ 8400) and converted to grain protein content using a coefficient of 6.25. The total grain starch content was measured using an EnzyChrom™ Starch Assay Kit (BioAssay Systems, Hayward, CA, USA). The mature seeds were sampled and dried at 37 °C for 3 days. The samples were wet-ashed with HNO3 (60%) as described previously. After dilution, the Zn (213.856 nm), Fe (238.204 nm) and Mn (293.930 nm) concentrations were determined by inductively coupled plasma atomic emission spectrometry (SPS1200VR; Seiko, Tokyo, Japan).

Protein Interaction Network Analysis
The protein interaction network of the TdNAC8470 protein was analyzed using the online software STRING version 11.5 (https://cn.string-db.org/) (accessed on 15 September 2022). The amino acid sequence of TdNAC8470 was mapped to Chinese Spring (T. aestivum) protein sequences using the single-protein-by-sequence search of STRING [79].

Statistical Analysis
Analysis of variance was performed using IBM SPSS Statistics version 22; the means were compared by Duncan's new multiple range test at a significance level of 0.05.

Conclusions
NAC TFs play major roles in plant growth, development and response to biotic and abiotic stresses. In this study, a genome-wide analysis of the NAC TF family in wild emmer was performed. A total of 249 TdNAC genes were identified and all had NAM/NAC-conserved domains. We performed phylogenetic, gene structure, chromosomal localization, expression and conserved motif analyses of the 249 NAC genes. TdNACs of clades E, F and G are likely to be involved in the regulation of grain protein and starch synthesis, and TdNACs of clade A are likely to respond to biotic and abiotic stresses. The overexpression of TdNAC8470 in rice increased grain starch content and decreased grain Zn, Fe and Mn concentrations. TdNAC8470 may activate the expression of TdGBSS1 to increase grain starch synthesis.

Conflicts of Interest:
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.