Genome-Wide Analyses of the NAC Transcription Factor Gene Family in Pepper (Capsicum annuum L.): Chromosome Location, Phylogeny, Structure, Expression Patterns, Cis-Elements in the Promoter, and Interaction Network

The NAM, ATAF1/2, and CUC2 (NAC) transcription factors form a large plant-specific gene family, which is involved in the regulation of tissue development in response to biotic and abiotic stress. To date, there have been no comprehensive studies investigating chromosomal location, gene structure, gene phylogeny, conserved motifs, or gene expression of NAC in pepper (Capsicum annuum L.). The recent release of the complete genome sequence of pepper allowed us to perform a genome-wide investigation of Capsicum annuum L. NAC (CaNAC) proteins. In the present study, a comprehensive analysis of the CaNAC gene family in pepper was performed, and a total of 104 CaNAC genes were identified. Genome mapping analysis revealed that CaNAC genes were enriched on four chromosomes (chromosomes 1, 2, 3, and 6). In addition, phylogenetic analysis of the NAC domains from pepper, potato, Arabidopsis, and rice showed that CaNAC genes could be clustered into three groups (I, II, and III). Group III, which contained 24 CaNAC genes, was exclusive to the Solanaceae plant family. Gene structure and protein motif analyses showed that these genes were relatively conserved within each subgroup. The number of introns in CaNAC genes varied from 0 to 8, with 83 (78.9%) of CaNAC genes containing two or less introns. Promoter analysis confirmed that CaNAC genes are involved in pepper growth, development, and biotic or abiotic stress responses. Further, the expression of 22 selected CaNAC genes in response to seven different biotic and abiotic stresses [salt, heat shock, drought, Phytophthora capsici, abscisic acid, salicylic acid (SA), and methyl jasmonate (MeJA)] was evaluated by quantitative RT-PCR to determine their stress-related expression patterns. Several putative stress-responsive CaNAC genes, including CaNAC72 and CaNAC27, which are orthologs of the known stress-responsive Arabidopsis gene ANAC055 and potato gene StNAC30, respectively, were highly regulated by treatment with different types of stress. Our results also showed that CaNAC36 plays an important role in the interaction network, interacting with 48 genes. Most of these genes are in the mitogen-activated protein kinase (MAPK) family. Taken together, our results provide a platform for further studies to identify the biological functions of CaNAC genes.


Introduction
Transcriptional regulation of gene expression controls many important cellular processes in plants, such as environmental stress responses, signal transduction, and cellular morphogenesis. Transcription factors (TFs) regulate gene expression by binding to specific cis-acting promoter elements, resulting in the activation or repression of the transcriptional rates of target genes, thereby facilitating the evolution and adaption of complex developmental systems in plant responses to environmental stress. Many plant-specific transcription factors, such as NAC, WRKY, AP2/EREBP, and MYB proteins have been identified and shown to play an important role in defense responses to stress [1,2]. NAC proteins are encoded by one of the largest plant-specific transcription factor gene families and contain both highly conserved N-terminal DNA binding domains and variable C-terminal domains. The conserved N-terminal DNA binding domain consists of five subdomains, A through E. On the basis of the structure of the domains and subdomains, NACs can be classified into two main groups (I and II), which can be further divided into 14 and 4 subgroups according to their tree topology in Arabidopsis and rice [3].
Many studies of NAC identification and functional analysis have shown that NAC proteins play significant roles in signaling and regulation of gene expression in response to biotic and abiotic stresses. Studies have shown that in Arabidopsis, the NAC transcription factor ANAC096 cooperates with the bZIP-type TF abscisic acid responsive element (ABRE) binding factor and the ABRE binding protein (ABF/AREB) to help plants survive under dehydration and osmotic stress conditions [16]. The overexpression of ANAC055 significantly improved drought resistance in transgenic plants [17]. In addition, the transcription of SNAC3 is induced by drought, high temperature, salinity stress, and ABA treatment in rice. Overexpression of SNAC3 resulted in enhanced tolerance to high temperatures, drought, and oxidative stress, whereas suppression of SNAC3 by RNAi resulted in increased sensitivity to these stresses [18]. In apple, a total of 180 NAC genes were identified, and 17 of 29 selected MdNAC genes were differentially expressed in response to at least one abiotic stress (low temperature, drought, high salinity, or exogenous ABA) [8]. In cassava, the expression profiles of MeNAC genes under various conditions (osmotic, salt, cold, ABA, and H 2 O 2 ) suggest that different MeNAC genes may participate in different signaling or stress responses, and most of the cassava NAC genes can be significantly induced by multiple stressors, such as ABA and H 2 O 2 treatment [12]. In maize, three genes, ZmNAC18, ZmNAC51, and ZmNAC145 were upregulated in the drought-tolerant genotype and downregulated in the susceptible genotype. These three genes are closely linked to SNAC1 and OsNAC6 in rice, which have been reported as drought responsive, and act as transcriptional activators/regulators of genes that encode proteins involved in the regulation of stress responses and the production of osmolytes [13]. In potato, 48 StNAC genes were expressed under different abiotic stress treatments (salt, mannitol, and heat), and 44 StNAC genes were expressed under biotic stress conditions with P. infestans inoculum (Pi isolate US8:Pi02-007), acibenzolar-s-methyl (BTH) and DL-β-amino-n-butyric acid (BABA) [10]. Meanwhile, the NAC transcription factor SISRN1 positively regulates the defense response against Botrytis cinerea and Pseudomonas syringae pv. tomato DC3000, but negatively regulates the oxidative and drought stress response in tomato [19].
Compared to potato and tomato, the few NAC genes identified were shown to display tissue-specific, inducible expression patterns in Capsicum. The first NAC gene, i.e., CaNAC1, was cloned by reverse RNA gel blot analysis. It was shown that the expression of the CaNAC1 gene is rapidly and specifically induced during incompatible interactions between pepper and bacterial or viral pathogens. Additionally, CaNAC1 was strongly induced by exogenously applied SA and ethephon (ET), whereas MeJA only had a transient effect [20]. The full-length cDNA of CaNAC72, a gene encoded by NAC transcription factors, was isolated from the normalized cDNA library. It was shown that this gene shares 70% amino acid sequence identity with the AtNAC072 gene of Arabidopsis. The expression of CaNAC72 was induced by hormones (SA, MeJA, ET, and ABA), dehydration, high salinity, cold, heat, and Ralstonia solanacearum inoculation [21]. The transcription expression of CaNAC2 was strongly induced by abiotic stress treatments, such as cold, salt, and ABA, but was inhibited by osmotic stress and SA treatment. Viral silencing of CaNAC2 in pepper seedlings resulted in increased susceptibility to cold stress and delayed salt-induced leaf chlorophyll degradation [22]. Although 73 and 45 putative NAC sequences were identified from a wild Mexican pepper accession "Chiltepin" and a Chinese inbred derivative "Zunla-1", respectively [23], detailed analyses including gene classification, chromosome distribution, gene duplication, gene phylogeny, gene structure, conserved motif composition, and gene expression under various abiotic and biotic stress conditions was lacking.
Pepper (Capsicum annuum L.) is the second most widely cultivated Solanaceous vegetable worldwide, after tomato. However, when in the reproductive stage, pepper is sensitive to biotic and abiotic stresses such as pathogens, drought, cold, and high temperatures. The NAC transcription factors are one of the largest families of transcriptional regulators in plants, and members of the NAC gene family are thought to play important roles in the regulation of transcriptional reprogramming associated with plant abiotic and biotic stress responses [24]. Thus, it is necessary to identify NAC transcription factors and determine the molecular mechanisms involved in the multiple regulatory biological processes in pepper. Drafts of the C. annuum L. genome sequence were reported recently [25,26]. In the present study, we searched two pepper genome sequences to identify the NAC genes (CaNAC). Detailed analyses were then conducted, including gene classification, chromosome distribution, gene structure, gene duplication, gene phylogeny, conserved motif composition, and cis-acting elements in the promoter. Further, we analyzed the expression of the identified 22 CaNAC genes under abiotic or biotic stress conditions (salt, heat shock, drought, Phytophthora capsici, ABA, SA, and MeJA), and predicted the protein-protein interaction network of CaNACs. These results will provide a platform for the future identification of the biological functions of NAC genes in pepper.

Identification of NAC Family Members in the Capsicum Annuum Genome
A comprehensive analysis was performed to identify NAC genes in the pepper genome sequences downloaded from PGP (Available online: http://peppergenome.snu.ac.kr) and PGD (Available online: http://peppersequence.genomics.cn). A total of 104 non-redundant putative NAC genes were identified using two pepper genome databases and named CaNAC1 to CaNAC104. The NAC gene family in the pepper is comprised of >100 genes, like those reported for Arabidopsis, rice, potato, apple, and soybean. The annotation IDs of each CaNAC in two pepper genome databases (PGP and PGD) are given in Table 1. Ten genes (CaNAC8, CaNAC9, CaNAC42, CaNAC44, CaNAC47, CaNAC48, CaNAC58, CaNAC64, CaNAC68, CaNAC89) have only one annotation ID because of the specificity of their gene sequences. It is interesting to note that these ten genes only exist in the PGD database. The nucleotide and protein sequences of CaNAC gene family were also (File S1).
Typically, the NAC proteins have an N-terminal NAC domain and a C-terminal activation domain. The NAC domain can be further divided into five subdomains, A to E [3]. All of the putative 104 NAC genes were analyzed to confirm the presence of the NAC domain, and 97 CaNAC genes, containing a complete N-terminal NAC domain, were identified; another seven genes did not have complete domains. Thereinto, CaNAC48 has only a D subdomain, CaNAC4 and CaNAC47 have D and E subdomains, CaNAC64 and CaNAC82 have C and D subdomains, and CaNAC31 and CaNAC68 have C, D, and E subdomains. Similar phenomena were observed in the potato, where 13 StNACs lack conserved A and/or B subdomains, and four StNACs do not contain conserved C and/or D subdomains [10].
The 104 identified CaNAC proteins varied greatly in length, from 100 amino acids (aa, AA) (CaNAC2) to 719 aa (CaNAC96); the average NAC protein length was 331.9 aa. The relative WT (molecular weight) varied from 11.80 kDa (CaNAC2) to 79.51 kDa (CaNAC96), and the PI (isoelectric point) ranged from 4.08 (CaNAC16) to 10.32 (CaNAC82). The detailed information about CaNAC proteins, including gene loci accession number in PGP or PGD, chromosome location, gene classification, introns, PI, AA, WT, and Arabidopsis thaliana orthologous genes or orthologs, are listed in Table 1.

Chromosomal Distribution and Duplication of CaNAC Genes
CaNACs were widely distributed within the Capsicum annuum reference genome, with all 104 CaNAC genes mapped to 12 chromosomes (Figure 1). Chromosomes 1, 2, 3, 6, and 7 contained relatively more CaNAC genes, with 11, 11, 11, 16, and 11 genes, respectively, compared to chromosomes 8 and 9 which each possessed only four of the genes. Interestingly, CaNAC genes were enriched on chromosomes 2 and 6. The sequenced size of chromosome 2 is 156.37 M, and it contains 11 (10.6%) of the 104 CaNAC genes; the sequenced size of chromosome 6 is 209.35 M, and it contains 16 (15.4%) CaNAC genes. Further, chromosomes 2 and 6 account for only 4.87 and 6.53% of the assembled pepper genome (3.13 G). Similar to pepper, chromosome 2 in potato also contains the largest number of StNAC genes, comprising 14 members (~13%), whereas chromosome 9 in potato contains only three members (~3%) [10]. These data suggest that in Solanaceae, NAC genes are enriched on chromosome 2, but not chromosome 9; however, the distribution of NAC genes in other Solanaceae family plants, such as tomato and eggplant, has yet to be determined.
Further, we determined the tandem and segmental duplications of CaNAC genes along the 12 pepper chromosomes. As shown in Figure 1, 16 CaNAC gene clusters (genes labeled in red) containing 32 tandemly duplicated genes were identified on 10 chromosomes, except for chromosomes 8 and 9. Interestingly, we identified eight tandemly duplicated genes within four gene clusters on chromosome 12. The sequences of the clustered members are highly alike, suggesting the use of gene duplication in the expansion of the CaNAC family. However, the number of segmental duplicated genes was lower compared to the number of tandem duplicated genes, containing only two pairs and four genes (CaNAC1 to CaNAC87 and CaNAC8 to CaNAC9) ( Figure 1).

Classification of CaNACs
The most prominent structural feature of NAC proteins is the conserved NAC domain, which is divided into five subdomains at its N-terminal region. The phylogenetic relationship of the 97 CaNAC proteins and 31 known representative NAC proteins belonging to different subgroups from Arabidopsis, rice, and potato was examined by multiple sequence alignment of their NAC domain sequences. Because of short sequences or the lack of complete subdomains, the other seven NAC genes cannot be used for phylogenetic tree construction. Based on the ANAC, ONAC, and TNAC classification [10] and NAC domain alignments of CaNACs, 97 CaNACs were classified into three main groups (Groups I, II, and III) which contained 63, 10, and 24 pepper NAC members, respectively ( Figure 2). Further, the CaNAC domains in each main group could be divided into several subgroups based on similarities in NAC domain structures.
Fourteen subgroups in Group I, and four subgroups in Group II have been identified in Arabidopsis and rice [3]. In this study, 14 subgroups were also identified in the Group I NAC gene family in pepper. The number of CaNAC genes in each subgroup varied greatly, as shown in Figure 2. Subgroup I (OsNAC3) and I (NAP) possessed only one gene, respectively. Subgroup I (OsNAC7) and I (NAM) contained relatively more CaNAC genes, with 12 and 8 genes, respectively. Further, we identified four subgroups in the pepper Group II NAC gene family, which is similar to Arabidopsis and rice, where the Group II NAC gene family contains the four subgroups (ANAC001, ONAC003, ONAC001, and ANAC063). Interestingly, we found that Group III contains 24 CaNAC genes, but no Arabidopsis and rice NAC. The 24 CaNAC genes are included into the TNAC subfamily, which is thought to be exclusive to the Solanaceae family [10,27]. Similar phenomena were observed in tobacco and potato. Here, we confirmed the existence of a Solanaceae-specific NAC subfamily. Further, we determined the characteristic features of each group and subgroup using the sequence alignments of the quantified sequences of NAC domains. As shown in Figure 3, subdomains A, C, and D were tightly conserved, whereas subdomains B and E were relatively divergent, except in subgroups 4, 5, and 6 of Group I. The sequences of subdomains B and E in subgroups 4, 5, and 6 were almost identical, akin to observations in Arabidopsis [3]. In this regard, pepper is similar to Arabidopsis and rice. Compared with Group I, the sequences of NAC in Group II from Arabidopsis or rice have low similarity to CaNAC in pepper, except for subgroup ANAC075 (Figure 3). A to E are shown by arrows above the sequences. Amino acids in the consensus sequences that are common to all groups are shown in black (=100%). Amino acids in the consensus sequences that are common to 20-25 subgroups (≥75%) are shown in blue, and those common to 13-19 (≥50%) are shown in green.

Gene Structure and Motif Composition Analysis of CaNAC Proteins
To better understand the similarity and diversity of CaNAC proteins, the intron/exon distribution of CaNAC was examined, and the conserved motifs of NAC family proteins in pepper, potato, Arabidopsis, and rice were also investigated using MEME version 4.12.0 online software (Available online: http://meme-suite.org/tools/meme). Gene structure analysis indicated that the number of introns of CaNAC ranged from 0 to 8 (Table 1). In particular, 26 and 39 CaNAC genes contained zero and two introns, respectively. Eighty-three (78.9%) CaNAC genes contained two or less introns, which is similar to cassava and banana, indicating low structural diversity of CaNAC genes. In addition, the CaNAC73 gene possessed eight introns. As shown in Figure 4 and File S2, a total of 20 distinct motifs were identified, and a schematic overview of the identified motifs is provided. Interestingly, most of the conserved motifs located in the N-terminal of NAC proteins are highly conserved for DNA-binding, and a similar motif composition is shared by the same subgroups, indicating functional similarities among members of the same subgroups.
CaNAC1, CaNAC54, and CaNAC87 contained relatively more motifs, (11,11, and 10 motifs, respectively), compared to CaNAC42, CaNAC43, and CaNAC47, which contained only 2 motifs each. Among the 20 motifs in Group I, motif 3, motif 4, motif 1, and motif 5 comprised the subdomains A, B, D, and E, respectively, and motifs 2 and 6 together comprised subdomain C. The number and kind of motifs were almost identical in Group I. However, motif 13 was unique in subgroup I (OsNAC7). Compared to Group I, the number and kind of motifs were diverse in Group II and III. Motif 3 existed in all CaNAC genes, where motifs 2 and 4 were only represented in subgroup II (ANAC084). Further, motifs 1 and 6 were not found in Group II. Two unique motifs (motifs 7 and 8) were identified in subgroup II (ONAC003), and motifs 8, 7, and 12 comprised the subdomain B, C, and D, respectively. Motif 12 was identified in most CaNAC genes in Group II and III. In Group III, motifs 3, 5, 9, 10, and 11 were found in most CaNAC genes. Motif 9 comprised subdomain B, motifs 6 and 10 together comprised subdomain C, and motifs 11 and 12 together comprised subdomain D. Interestingly, we found that five motifs (15, 16, 17, 18, and 19) were always represented in five genes and that subdomain C in these five CaNAC genes was comprised of motif 16.
Seven NAC genes, CaNAC4, CaNAC31, CaNAC47, CaNAC48, CaNAC64, CaNAC68, and CaNAC82 cannot be divided into any subgroups because they cannot be used for phylogenetic tree construction as a consequence of their shortened sequences or lack of complete subdomains. However, as displayed schematically in Figure 4, CaNAC64, CaNAC68, and CaNAC82 can be divided into Group I because they contain motif 2. Overall, the number and kind of motifs in Group I, Group II, and Group III are quite different, indicating that the functions of these three main groups have been diversified.

Expression Profiling of CaNAC Genes under Biotic and Abiotic Stresses
We analyzed the expression patterns of 22 CaNACs belonging to the different subgroups under normal growth conditions and various biotic and abiotic stress conditions using real-time quantitative RT-PCR.

Expression of Patterns of CaNAC Genes under Salt, Heat, and Drought Treatment
Under salt treatment, only the expression of one gene, CaNAC20, was not detectable in the leaves of the treatment groups. However, the expression of 10 genes was significantly upregulated in response to stress treatment, increasing on average by more than 10-fold after treatment ( Figure 5, genes labeled in red or dark green). Of note, CaNAC72 was induced by more than 600-fold after 12 h of salt treatment. As shown in Figure 5, the expression of CaNAC50 and CaNAC86 were downregulated, while the expression of CaNAC50 significantly decreased by 0.2-fold after 24 h of treatment.
Under heat shock treatment, as shown in Figure 5, CaNAC50 and CaNAC56 were not expressed in the leaves of the treated plants. The expression of four genes (CaNAC13, CaNAC20, CaNAC29, and CaNAC53) was significantly upregulated following stress treatment. The expression of CaNAC27, CaNAC35, CaNAC37, CaNAC61, CaNAC72, and CaNAC102 was also upregulated. Meanwhile, the expression value of seven genes (CaNAC13, CaNAC20, CaNAC27, CaNAC29, CaNAC35, CaNAC53, and CaNA102) peaked after 24 h of treatment. In addition, the expression of CaNAC41 and CaNAC86 was significantly decreased by 0.02-and 0.06-fold after 12 and 24 h of heat shock treatment, respectively. As shown in Figure 5, the expression of CaNAC20 was not detectable in the roots after drought treatment. Six genes (CaNAC23, CaNAC27, CaNAC61, CaNAC72, CaNAC79, and CaNAC92) were expressed with relatively higher intensities and showed significant upregulation; CaNAC72 was induced by more than 77-fold after 3 h, and CaNAC79 was induced by more than 110-fold after 12 h of drought treatment. Conversely, the expression of six genes (CaNAC29, CaNAC33, CaNAC35, CaNAC50, CaNAC56, and CaNAC86) was downregulated, significantly decreasing by 0.65-, 0.04-, 0.17-, 0.13-, 0.2-, and 0.06-fold after 3 h of drought treatment, respectively. The leaves or roots of the seedlings (six true leaves) were used to test the changes of CaNAC gene expression levels using real-time PCR at different timepoints (0, 3, 6, 12 and 24 h) with salt, heat shock, abscisic acid (ABA), salicylic acid (SA), methyl jasmonate (MeJA), drought, and Phytophthora capsici treatment, respectively. Actin1 was used as an internal control. qRT-PCR data are shown relative to 0 h. The relative expression levels were calculated using the 2(−∆Ct) method. The heat map was created using R language.

Expression Patterns of CaNAC Genes under Hormone Treatment
Under ABA treatment, the expression of five genes (CaNAC27, CaNAC37, CaNAC41, CaNAC61, and CaNAC72) was upregulated, and seven genes (CaNAC13, CaNAC23, CaNAC33, CaNAC35, CaNAC38, CaNAC55, and CaNAC102) were downregulated. The expression value of two genes (CaNAC27 and CaNAC72) and three genes (CaNAC37, CaNAC41, and CaNAC61) peaked after 6 h and 12 h of treatment, respectively. In addition, the expression of five other genes (CaNAC20, CaNAC53, CaNAC60, CaNAC75, and CaNAC86) was downregulated, except after 12 h of treatment. As shown in Figure 5, CaNAC50 and CaNAC56 were not expressed in the leaves in the treatment groups.

Analysis of Stress-Related Cis-Elements in the CaNAC Promoters
We scanned the cis-elements involved in the activation of defense-related genes in the promoter regions of CaNAC to better understand potential regulatory mechanisms of CaNAC genes in response to abiotic or biotic stress responses. The promoter regions (−1000 bp upstream of the translation start site) of all the CaNAC genes to which gene expression analysis was applied were examined. The predicted cis-elements in the promoter regions of twenty-two CaNAC genes are shown in Figure 6.
The ABRE element was found in nine selected promoter regions of CaNAC27, CaNAC38, CaNAC53, CaNAC55, CaNAC56, CaNAC61, CaNAC72, CaNAC75, and CaNAC86. In addition, three ABRE elements were identified in CaNAC38, and four were found in CaNAC55. The CGTCA element was found in 10 of the selected promoter regions of CaNAC13, CaNAC20, CaNAC23, CaNAC27, CaNAC38, CaNAC41, CaNAC55, CaNAC56, CaNAC79, and CaNAC86, and at least two CGTAC elements were found in the CaNAC27, CaNAC38, CaNAC55, and CaNAC86 genes. The TCA element was found in 10 selected promoter regions of CaNAC33, CaNAC37, CaNAC38, CaNAC41, CaNAC55, CaNAC56, CaNAC61, CaNAC72, CaNAC79, and CaNAC86. In addition, the GARE element was found in 10 selected regions of CaNAC20, CaNAC29, CaNAC37, CaNAC38, CaNAC55, CaNAC56, CaNAC60, CaNAC61, CaNAC72, and CaNAC102. It is well known that ABRE, CGTCA, TCA, and GARE cis-acting elements are involved in hormone responsiveness. We consistently found these four elements in CaNAC38, CaNAC55, and CaNAC56, thus we suspect that these three NAC genes may play a key role in regulating responsiveness to hormones. The elements are identified as follows: green dovetail for the CGTCA element, red triangle for the ABRE element, yellow square for the W-box element, purple five-pointed star for the TCA element, black diamond for the TC-rich repeats element, dark green pentagon for the MBS element, orange oval for the WUN element, blue pentagon for the GARE element, pink circular crown for the HSEs element, the grey rectangle for the ERE element, the rose red hexagon for the LTR element.
As shown in Figure 6, HSE, TC-rich, MBS, and LTR elements were found in sixteen, fourteen, nine, and five selected promoter regions of CaNAC genes, respectively. The promoter region of CaNAC20 contains three HSE elements, and that of CaNAC41 contains four TC-rich elements. These results suggest that CaNAC20 and CaNAC41 may play a key role in regulating responsiveness to biotic stress. In addition, other stress-related cis-elements were detected, including a W-box element in eight genes (CaNAC20, CaNAC27, CaNAC29, CaNAC41, CaNAC50, CaNAC55, CaNAC60, and CaNAC75), a WUN element in three genes (CaNAC37, CaNAC55, and CaNAC61), and an ERE element in seven genes (CaNAC13, CaNAC20, CaNAC50, CaNAC56, CaNAC60, CaNAC79, and CaNAC86).

The Interaction Network of CaNAC Genes
An interaction network of CaNAC genes was constructed by making a comparison to orthologs of Arabidopsis, to better understand the relationships among the CaNAC gene family members. As shown in Table 1, 80.8% (84/104) of CaNAC genes were orthologous to 41.9% (49/117) of Arabidopsis NAC genes, indicating diversity in the sequences of NAC genes in Arabidopsis and pepper and that the sequences of CaNAC genes were tightly conserved. Positive, negative, and unknown correlations were marked by red, blue, and gray lines in Figure 7, which represent Pearson correlation coefficients of >0, <0, and no correlation, respectively. As shown in Figure 7, a complex functional relationship was observed in the interaction network of CaNAC genes; 35 pairs of interacting genes showed positive correlation, 30 showed negative correlation, and 6 were unknown. This suggests that CaNAC36 plays a key role in the interaction network, as it fully interacted with 48 genes. The green circles represent the CaNAC genes, and the pink circles represent the pepper genes interacting with CaNAC genes. The red lines correspond to PCC > 0 (Pearson Correlation Coefficient), the blue lines mean PCC < 0, and the gray lines indicate that the PCC is unknown.
CaNAC36 interacted with the Ras-related protein RAB1A gene (Capana07g000433), OsRAN2, a gene orthologous to Capana07g000433 in rice, which enhances cold tolerance by promoting the export of intranuclear tubulin and maintaining cell division under cold stress [28]. CaNAC36 also interacted with the SNARE protein gene (Capana12g000194), GsVAMP72, a gene orthologous to Capana12g000194 in wild soybeans, which is involved in regulating plant salt tolerance and ABA sensitivity [29]. CaNAC14 directly interacted with three CaNAC genes (CaNAC81, CaNAC30, and CaNAC63), and CaNAC40 interacted with mitogen-activated protein kinase genes (Capana02g002737, Capana06g002562, and Capana08g000964), which play important roles in plant growth, development, and in response to a variety of biotic and abiotic stresses [30]. No other direct interactions among CaNAC genes were observed.

NAC Gene Expansion and Evolution in Pepper
It has been demonstrated that NAC transcription factors play in important role in regulating stress tolerance against various abiotic and biotic challenges [9]. As a result of the availability of whole genome sequences in recent years, more and more NAC gene families have been identified from many different species of plants, such as Arabidopsis, rice, apple, and potato. However, very little is known about the NAC family in pepper. In a recently published report, a total of 45 putative NAC sequences were identified from the Pepper Genome Database (PGD, http://peppersequence.genomics.cn) [23].
In the present study, we further revealed an expanded NAC gene family in pepper, with a total of 104 members in the pepper genome. However, the NAC gene family in pepper is by far the smallest compared to those estimated for other plant species, which include 117 members in Arabidopsis [3], 151 in rice [6], and 110 in potato [10]. In particular, the genome size of pepper (3.13 Gb) is much larger than that of Arabidopsis (genome size 125 Mb), rice (480 Mb), and potato (840 Mb). By studying subgroups of NAC genes in Arabidopsis and rice, we found that the number of common subgroup CaNAC genes in pepper was much lower than expected as compared to Arabidopsis and rice, especially in subgroups 2, 8, 12, and 14. The number of the identified CaNAC genes in subgroups 2, 8, 12, and 14 were 3, 5, 6, and 8 in pepper, respectively. However, they are 6, 10, 13, and 13 in Arabidopsis, and 7, 8, 6, and 11 in rice, respectively. In our studies, we also identified two new subgroups, subgroup15 and subgroup16, which could not be further divided into subgroups in Group I. A similar phenomenon has been observed in other plants such as potato and is possibly related to the specificity of each species [10]. Further, we also found that 23 NAC genes belong to the Solanaceae specific subfamily (Figure 2). The number of NAC genes is lower in pepper as compared to other Solanaceae specific subfamilies such as tobacco (50 NAC genes) and potato (36 NAC genes). Our results provide further evidence that there is one NAC subfamily which is exclusive to the Solanaceae family.
It is well known that gene duplication events are important in the rapid expansion and evolution of gene families [25]. Tandem gene duplication of NAC transcription factors has been observed in many plant species, such as Arabidopsis, rice, Populus, potato, etc. In potato, one of the most important Solanaceae family plants, 24.5% of StNAC (27/110) were found to be tandemly duplicated [10]. In our investigation, 30.8% (32/104) of CaNAC genes were found to evolve from tandem gene duplication ( Figure 1). Therefore, tandem gene duplication plays an important role in NAC gene family expansion in pepper. However, in comparing the number of NAC genes with the size of the genome, the reason for the low number of NAC genes might be due to whole-genome duplication (WGD) events in pepper [25,26].

CaNAC Proteins Play Important Roles in Various Biological Processes
Most of the NAC proteins that have been studied to date are involved in the response to abiotic and biotic stress and in the regulation of developmental processes. However, very few NAC transcription factors involved in the biotic stress response have been identified in pepper. Therefore, the goal of this study was to obtain more insight into the expression patterns and putative functions of CaNAC genes in response to biotic or abiotic stresses. The expression levels of 22 CaNAC genes under seven stress treatments (salt, heat shock, drought, Phytophthora capsici, ABA, SA, and MeJA) were calculated ( Figure 5). Almost all 22 CaNAC genes were induced by these treatments, except CaNAC20 under salt and drought treatment, CaNAC50 and CaNAC56 under heat shock and hormone treatment, and CaNAC56 under Phytophthora capsici infection. These results indicate that the accumulation of CaNAC effectively reduces the damage from biotic or abiotic stress.
In the present study, the expression of the CaNAC72 gene was upregulated under stress treatment, except for Phytophthora capsici infection. Under salt and drought treatments ( Figure 5), CaNAC72 gene showed relatively higher intensities, increasing more than 600-and 70-fold, respectively. We also found that CaNAC92 was significantly upregulated by salt (60-fold increase) and drought treatment (18-fold increase). As shown in Table 1, two Arabidopsis NAC genes, AT3G15500 (ANAC055) and AT1G69490 (ANAC029), were orthologous genes to CaNAC72 and CaNAC92, respectively. According to the phylogenetic tree analysis, CaNAC72 and CaNAC92 were attributed to subgroups 1(5) and 1(4), which shared high identity with subgroups AtNAC3 and AtNAP of Arabidopsis, respectively. In a previous study,  reported that the expression of AtNAP and ANAC055 increased in a leaf during aging, and that these two genes were senescence-associated NAC transcription factors, as candidate downstream components of ETHYLENE-INSENSITIVE2 (EIN2) [31]. We suspect that NAC genes from subgroups AtNAC3 and AtNAP may play important roles not only in response to abiotic stress, but also in plant development. However, this theory requires further investigation to be confirmed.
In tomato, the expression of SISRN1 was significantly induced by infection with B. cinerea or P. syringae pv. tomato DC3000, leading to a 6-to 8-fold higher expression than in the mock-inoculated plants [19]. SISRN1 has high identity with CaNAC23, and CaNAC23 also exhibited an increased expression pattern under P. capsici infection stress in this study. These results suggest that CaNAC23 can be induced by pathogens. Further, we also found that the expression of CaNAC23 was significantly induced by salt, drought, and heat shock treatment. Thus, CaNAC23 genes may play important roles in response to abiotic or biotic stress. Similarly, the expression of StNAC030 was significantly induced by salt, heat, and ABA [10] in potato, and CaNAC27, the gene orthologous to StNAC030, was also significantly induced by salt, heat, and ABA. CaNAC27 was also induced by drought and MeJA in the present study, suggesting that CaNAC27 genes may play important roles in response to abiotic or biotic stress. Thus, we suspect that orthologous NAC genes in the Solanaceae family may have similar gene functions; however, their functional characterization would be required to ascertain if they play similar roles in plant processes.
It was reported that two maize NAC transcription factors, namely, ZmNAC41 and ZmNAC100, are induced in leaves infected with Colletotrichum graminicola [32]. However, sequence information about these ZmNAC transcription factors is unavailable, therefore we could not identify orthologous genes in the CaNAC family. These results imply that these NAC members might be regulators of responses to various biotic stresses. To date, there are only three CaNAC genes with expression models that have been characterized after biotic or abiotic inoculation treatment, while the biological and cellular functions of most CaNAC genes remain largely unknown [20][21][22]. The current investigation identified several CaNAC genes that might be involved in stress defense and provides clues for the selection of candidate genes for further studies.

Sequence Database Searches
Arabidopsis NAC protein sequences were downloaded from TAIR (Available online: ftp://ftp.arabidopsis.org) [33]. Rice NAC protein sequences were obtained from rice full-length cDNA data (KOME) [34]. The pepper annotated genome sequences were obtained from the Pepper Genome Database (PGP, available online: http://peppergenome.snu.ac.kr and PGD, available online: http://peppersequence.genomics.cn) [25,26]. In a previous study, we obtained 44.6 Gb databases from RNA-Seq using resistant and sensitive gene pools against P. capsici in pepper and identified 47 NAC transcription factors after gene annotations [35]. In the current study, 47 pepper NAC proteins were used as query sequences for searching candidate sequences (e value was ≤e −10 ) against the two pepper genome databases using the BLAST tool (https://solgenomics.net/tools/blast/. The candidate sequences were confirmed for the presence of NAC domains using the Hmmsearch program (HMMER 3.1, available online: http://hmmer.janelia.org/). The putative NAC sequences, confirmed by Hmmsearch in the two pepper genome databases, were in turn used to search the pepper predicted proteins until no new sequences were found. The NAC sequences in two different pepper genome databases were then blasted using DNAMAN software (8.0, Lynnon Biosoft, San Ramon, CA, USA).

Multiple Sequence Alignment, Gene Chromosomal Location, and Phylogenetics Analysis
Approximately 160 amino acid sequences spanning the NAC core domain of all CaNAC proteins, 20 selected Arabidopsis NAC proteins and rice NAC proteins belonging to different group/subgroups [3], and 9 tobacco NAC proteins belonging to the Solanaceae specific NAC subfamily, were used to create multiple protein sequence alignments using Clustal X 2.1 (University College Dublin, Belfield, Dublin, Ireland) with default settings [36]. The gene chromosomal locations were obtained by the pepper gene annotation giff3 file, downloaded from the Pepper Genome Databases (Available online: http://peppersequence.genomics.cn). The NAC domain boundary was defined following a previously described method [3]. The neighbor-joining method was used to construct the phylogenetic tree, based on the amino acid sequence of NAC domains using MEGA 6.06 [37]. The parameters used in tree construction were the Jones-Thornton-Taylor (JTT) model plus gamma-distributed rates, as determined by ProTest 3.0 and 1000 bootstraps [38]. The tandemly and segmentally duplicated genes were defined as an array of two or more homologous genes within a 100-kb range distance, or low-copy repeats of DNA segments (blocks of sequence ≥1 kb in length and showing ≥90% sequence identity), respectively [39].

Motif Composition Analysis of CaNAC Proteins
The MEME 4.12.0 online program (Available online: http://meme-suite.org/) was used for the identification of motifs in the C. annuum NAC protein sequences. The optimized parameters of MEME were employed as follows: number of repetitions, any; maximum number of motifs, 20; optimum width of each motif, between 6 and 50 residues [40].

Treatments of Pepper Plants with Various Biotic and Abiotic Stresses
Accession PI201234, which is highly resistant to P. capsici, was used throughout this study [41]. The seedlings were grown in a growth chamber for 26 • C (16 h/day) and 20 • C (8 h/night) till reaching the age of six true leaves and were then treated with heat shock (42 • C), salt (300 Mm NaCl), drought (400 Mm mannitol), P. capsici (10 5 spores mL −1 ), salicylic acid (SA, 100 µM), methyl jasmonate (MeJA, 100 µM), and abscisic acid (ABA, 100 µM) for 24 h. The samples were prepared and collected using the method reported in our previous study [41]. The roots treated with drought or P. capsici spore suspension inoculation, along with the leaves treated with salt, heat shock, SA, MeJA, or ABA were also collected separately at 0, 3, 6, 12, and 24 h for RNA isolation [42].

Real-Time Quantitative RT-PCR
Total RNA extraction, cDNA synthesis and real-time quantitative PCR (qRT-PCR) analysis were carried out using the methods reported in our previous study [43]. A total of 22 CaNACs belonging to the different subgroups were randomly selected for gene expression analysis under abiotic or biotic stresses, and the data of three biological replicates were analyzed. Actin1 (NCBI accession number GQ339766) was used as the internal reference gene.

Search for Cis-Acting Elements in the Promoters of CaNAC Genes
Upstream regions (−1000 bp) of the CaNAC genes which were selected for gene expression analysis, were derived from Sol Genomics Network (Available online: https://solgenomics.net/) and were searched for regulatory elements, including CGTCA (cis-acting regulatory element involved in the MeJA-responsiveness), ABRE (cis-acting element involved in the abscisic acid responsiveness), W-box (binding site for the WRKY transcription factor in the defense response), TCA (cis-acting element involved in salicylic acid responsiveness), TC-rich repeats (cis-acting element involved in the defense and stress responsiveness), MBS (MYB binding site involved in drought induction), WUN (wound-responsive element), GARE (gibberellins-responsive element), HSE (cis-acting element involved in heat stress responsiveness), ERE (ethylene-responsive element), and LTR (low temperature-responsive element), in the promoters using the PlantCARE (Available online: http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) and PLACE databases [44,45].

Prediction of CaNACs Protein-Protein Interaction Network
An interaction network of CaNAC proteins was constructed to understand the relationships among CaNAC gene family members and other genes. As described in detail previously [46], the interolog from Arabidopsis was used for predicting protein-protein interaction networks of CaNAC genes.
An interaction network of Arabidopsis NAC proteins was constructed, the Arabidopsis NAC proteins were mapped to CaNAC proteins on the basis of their homologous relationships, and an interaction network of CaNAC proteins was drawn by Cytoscape.

Conclusions
A genome-wide analysis of the CaNAC gene family in pepper was performed to reveal gene location, gene structure, gene phylogeny, conserved motifs, stress-related cis-element, gene expression response to seven different abiotic or biotic stresses, and interaction network of CaNACs proteins. The vast majority of the 22 CaNAC genes showed a response to different stress types, displaying distinct expression patterns. In addition, one NAC subfamily that is exclusive to the Solanaceae family was identified.