Genome-Wide Identification and Characterization of NODULE-INCEPTION-Like Protein (NLP) Family Genes in Brassica napus

NODULE-INCEPTION-like proteins (NLPs) are conserved, plant-specific transcription factors that play crucial roles in responses to nitrogen deficiency. However, the evolutionary relationships and characteristics of NLP family genes in Brassica napus are unclear. In this study, we identified 31 NLP genes in B. napus, including 16 genes located in the A subgenome and 15 in the C subgenome. Subcellular localization predictions indicated that most BnaNLP proteins are localized to the nucleus. Phylogenetic analysis suggested that the NLP gene family could be divided into three groups and that at least three ancient copies of NLP genes existed in the ancestor of both monocots and dicots prior to their divergence. The ancestor of group III NLP genes may have experienced duplication more than once in the Brassicaceae species. Three-dimensional structural analysis suggested that 14 amino acids in BnaNLP7-1 protein are involved in DNA binding, whereas no binding sites were identified in the two RWP-RK and PB1 domains conserved in BnaNLP proteins. Expression profile analysis indicated that BnaNLP genes are expressed in most organs but tend to be highly expressed in a single organ. For example, BnaNLP6 subfamily members are primarily expressed in roots, while the four BnaNLP7 subfamily members are highly expressed in leaves. BnaNLP genes also showed different expression patterns in response to nitrogen-deficient conditions. Under nitrogen deficiency, all members of the BnaNLP1/4/5/9 subfamilies were upregulated, all BnaNLP2/6 subfamily members were downregulated, and BnaNLP7/8 subfamily members showed various expression patterns in different organs. These results provide a comprehensive evolutionary history of NLP genes in B. napus, and insight into the biological functions of BnaNLP genes in response to nitrogen deficiency.


Introduction
Nitrogen is a primary nutrient that is critical for the survival of all living organisms. The absorption and utilization of nitrogen directly affects plant growth and development, as well as crop yields. During their long evolutionary history, plants have established complex and delicate regulatory mechanisms to adapt to changes in nitrogen conditions in the external environment.
The main forms of nitrogen in soil include nitrate nitrogen, ammonium nitrogen, amino acids, proteins, and other nitrogenous substances. Nitrate nitrogen, the major form of nitrogen in soil, plays important physiological and nutritional roles in plant growth and development [1]. Nitrate nitrogen uptake and translocation in plant roots are mainly accomplished by nitrate nitrogen transporters (NRT) [2]. The first transporter identified to sense nitrate was the NRT1.1 in Arabidopsis thaliana, and the CIPK23 (a protein kinase) and CIPK8 can phosphorylate NRT1.1 to regulate the high-or low-affinity nitrate response [3,4]. Meanwhile, several proteins are involved in nitrogen assimilation in plants exposed to changing nitrogen concentrations, such as ammonium transporters (AMTs) and nitrate reductases (NRs) [5][6][7]. Genes and transcription factors (TFs) involved in nitrate-signaling pathways play key roles in nitrate absorption and assimilation [8]. NIN (NODULE INCEPTION) genes were first discovered as being defective in bacterial recognition, infection thread formation, and nodule primordia initiation in the lotus (Lotus japonicus), and the formation of rhizobia in leguminous species was later confirmed to be dependent on the presence of NIN genes [9,10]. NIN genes encode nuclear-targeted DNA binding proteins with bZIP domains and the most obvious feature of the NIN proteins is a strongly conserved 60-amino acid (aa)-long sequence, known as the RWP-RK (also known as the RWPxRK motif) sequence. Moreover, a few genes that possessed high degree homology with NIN were identified in legumes, named NLP (NIN-like proteins) [10][11][12]. Subsequent studies found that NLP genes widely exist among non-nitrogen fixating plants, such as rice (Oryza sativa) and A. thaliana [10], but the NIN proteins seem to be unique in legumes.
The NLP family is a plant-specific TF family, their proteins presented a homology to NIN not only in the RWP-RK domain but also in N-terminal regions [10,13]. NLP proteins contain two major conserved domains, the RWP-RK and PB1 (Phox and Bem1) domains. The RWP-RK domain functions in the DNA binding domain whose activity is unrelated to nitrate signaling, whereas the N-terminal regions of NLPs function as a transcriptional activation domain that mediates this signaling, and the PB1 domain is involved in protein-protein interactions [14,15]. A highly conserved GAF domain (a ubiquitous signaling motif and a new class of cyclic GMP receptor, which ensures the normal functioning of photosensitizing light reversals and optical signal transduction [3]) was also identified in the nitrogen termini of some NLP proteins, and may be related to signal transduction or dimerization [3,16,17]. A. thaliana contains nine NLP genes [8], whereas lotus contains four NLPs with a single NIN [13,18,19]. A number of NLPs have also been identified in maize (Zea mays) (nine), sorghum (Sorghum bicolor) (five), and rice (six) [10,14,20]. Functional studies have shown that AtNLPs play key roles in orchestrating primary nitrogen responses by binding to the nitrogen responsive cis-elements (NREs) in the promoters of target genes [9,21,22]. AtNLP7 functions in nitrate-and nitrogen starvation responses by binding to key nitrogen pathway genes, including ANR1, NRT1.1, NRT2, and LBD37/38, thus moderating nitrogen assimilation and metabolism by transcriptionally activating or suppressing downstream genes [23,24]. AtNLP8 regulates nitrate-promoted seed germination and directly binds to the promoter of an abscisic acid (ABA) catabolic enzyme gene to reduce ABA levels in a nitrate-dependent manner [19]. AtNLP6 and AtNLP7 proteins interact with the key transcriptional regulator TEOSINTE BRANCHED1/CYCLOIDEA/PROLIFERATING CELL FACTOR1-20 under nitrogen starvation and continuous nitrate treatment; these interacting regulators play a central role in controlling the expression of nitrate-responsive genes [25]. Nitrate triggers nitrate-CPK (Ca 2+ -sensor protein kinases)-NLP signaling and nitrate-coupled CPK signaling to phosphorylate NLPs, which play a crucial role in nutrient-growth networks [26].
Brassica napus is one of the most important oilseed crops in non-nitrogen fixating plants. This species was formed by the hybridization of two progenitor species, Brassica rapa and Brassica oleracea [27,28]. NLP genes play important roles in nitrogen uptake and utilization, which are critical for plant growth and development [10,14,20,29]. However, few studies have focused on identifying NLP genes in B. napus. In this study, we identified 31 BnaNLP genes using nine AtNLP proteins from A. thaliana as query sequences. We investigated the evolutionary relationships among NLP members from nine plant species and evaluated the predicted three-dimensional structures of a representative NLP protein (BnaNLP7-1) in B. napus. We also examined the spatio-temporal expression patterns of BnaNLP genes, as well as their expression under nitrogen deficient conditions, to reveal the relationship between BnaNLPs and nitrogen deficiency responses. Our study provides insight into the evolution of NLP proteins in plants and lays the foundation for elucidating the roles of NLPs in regulating nitrogen responses in B. napus.

Identification of NLP Genes in B. napus
Using the nine AtNLP protein sequences in A. thaliana as queries, 31, 17, and 8 NLP genes were identified from B. napus, B. rapa, and B. oleracea, respectively. All Brassica NLP proteins share two conserved domains: The RWP-RK and PB1 domains. The nomenclature used for Brassica NLP genes was based on the corresponding AtNLP orthologs (Table 1 and Table S2). The number of amino acid residues in the BnaNLP proteins ranged from 681 (BnaNLP3-1) to 978 (BnaNLP7-3), with an average of 845. The relative molecular weights (MWs) ranged from 76.37 (BnaNLP3-1) to 108.05 kDa (BnaNLP7-3). The isoelectric points (pIs) were predicted to range from 4.98 (BnaNLP1-2) to 7.31 (BnaNLP3-1), and only two members had pI values > 7 (BnaNLP2-4 and BnaNLP3-1). No transmembrane helix or signal peptide was found in any BnaNLP protein (Table 1). Most BnaNLP proteins were predicted to localize in the nucleus, while BnaNLP2-4 was predicted to localize on the chloroplast. In addition, BnaNLP2-2 was predicted to be localized to the chloroplast and nucleus (Table 1), which is consistent with the general nuclear localization of TFs.

Phylogenetic Analysis of BnaNLP Proteins
To clarify the evolutionary relationships between NLP proteins from A. thaliana and B. napus, we aligned the sequences of 40 NLP proteins using AtANR1 as an outgroup. Based on previous studies [8,10], NLP proteins were divided into three groups ( Figure 1A), including the NLP1/2/3/4/5 clade (group I), the NLP6/7 clade (group II), and the NLP8/9 clade (group III). Group I contained 17 BnaNLPs, while groups II and III contained six and eight BnaNLPs, respectively. Both the BnaNLP4 and BnaNLP8 subfamilies contained six members, with more members compared to the other BnaNLP subfamilies.
To further investigate the evolutionary relationships of NLP proteins in plants, we constructed a larger neighbor-joining (NJ) tree based on the 99 NLP protein sequences from A. thaliana, B. napus, B. rapa, B. oleracea, Amborella trichopoda, rice, grape (Vitis vinifera), maize, and sorghum ( Figure 2). The 99 plant NLP proteins were also classified into three groups. Group I was the largest clade, containing 49 NLP members, including 36 NLPs in Brassicaceae. Group II was the smallest clade, containing 23 NLP members, including 12 NLPs in Brassicaceae. Group III contained 26 NLP members, including 16 NLPs in Brassicaceae. BnaNLP members first clustered with NLPs from two progenitors of B. napus, followed by AtNLPs, forming a small Brassicaceae clade, and subsequently further clustered with NLP members in V. vinifera to form a dicot clade. Finally, a large clade could be observed by joining NLP proteins in dicot and monocot plants ( Figure 2). In group I, the dicot clade was formed by two subclades, the NLP1/2/3 and NLP4/5 subclades. In the NLP1/2/3 subclade, NLP1/2 members in Brassicaceae first formed a small subclade clustered with VvNLP1 and then clustered with NLP3 members specifically in Brassicaceae. Plant NLP members were divided into three groups, with at least one NLP member present in A. trichopoda, suggesting that at least three ancient NLP copies were present in the ancestor before the monocot and dicot divergence.

Chromosome Location and Ka/Ks Ratio Calculation
Chromosome location analysis revealed that 31 BnaNLP genes are located on 16 chromosomes in B. napus, with 16 genes in the A subgenome and 15 in the C subgenome ( Figure 3, Table S2). Four BnaNLP genes were identified on chromosomes A06 and A07, and three were found on chromosomes A03 and C07. Two BnaNLP genes each were found on chromosomes A01, C01, C03, C04 and C06, and only one BnaNLP gene was found on chromosomes A04, A05, A08, A09, C05 and C08. No BnaNLP genes were found on chromosomes A01, A010, C01 and C09. In addition, all BnaNLP genes share syntenic relationships with NLP members in A. thaliana, B. rapa, and B. oleracea (Figure 4), and no tandemly duplicated BnaNLP family genes were identified.
To explore the selective pressure on BnaNLP genes, we calculated the nonsynonymous/synonymous mutation ratio (Ka/Ks); Ka/Ks > 1 indicates positive selection, Ka/Ks = 1 indicates neutral selection, and Ka/Ks < 1 indicates purifying selection [30]. The Ka/Ks ratio for all BnaNLP genes was <1, ranging from 0.0842 (BnaNLP4-1) to 0.5926 (BnaNLP8-4), indicating that these genes are functionally conserved (Table 2). A comparison of the Ka/Ks ratios of BnaNLP genes between the A and C subgenomes showed that the average Ka/Ks ratio was higher in the C subgenome (0.1901) than in the A subgenome (0.1728), suggesting that BnaNLP genes in the C subgenome experienced higher selection pressure during the evolutionary history of B. napus. A comparison of Ka/Ks ratios among the three groups indicated that the average Ka/Ks ratio in group

Chromosome Location and Ka/Ks Ratio Calculation
Chromosome location analysis revealed that 31 BnaNLP genes are located on 16 chromosomes in B. napus, with 16 genes in the A subgenome and 15 in the C subgenome ( Figure 3, Table S2). Four BnaNLP genes were identified on chromosomes A06 and A07, and three were found on chromosomes A03 and C07. Two BnaNLP genes each were found on chromosomes A01, C01, C03, C04 and C06, and only one BnaNLP gene was found on chromosomes A04, A05, A08, A09, C05 and C08. No BnaNLP genes were found on chromosomes A01, A010, C01 and C09. In addition, all BnaNLP genes share syntenic relationships with NLP members in A. thaliana, B. rapa, and B. oleracea (Figure 4), and no tandemly duplicated BnaNLP family genes were identified.
To explore the selective pressure on BnaNLP genes, we calculated the non-synonymous/synonymous mutation ratio (Ka/Ks); Ka/Ks > 1 indicates positive selection, Ka/Ks = 1 indicates neutral selection, and Ka/Ks < 1 indicates purifying selection [30]. The Ka/Ks ratio for all BnaNLP genes was <1, ranging from 0.0842 (BnaNLP4-1) to 0.5926 (BnaNLP8-4), indicating that these genes are functionally conserved (Table 2). A comparison of the Ka/Ks ratios of BnaNLP genes between the A and C subgenomes showed that the average Ka/Ks ratio was higher in the C subgenome (0.1901) than in the A subgenome (0.1728), suggesting that BnaNLP genes in the C subgenome experienced higher selection pressure during the evolutionary history of B. napus. A comparison of Ka/Ks ratios among the three groups indicated that the average Ka/Ks ratio in group III was higher than that in groups I and II and that BnaNLP genes in group II had the lowest average Ka/Ks ratio among the three groups (Table 2). Unlike the BnaNLP8 subfamily, most homologous members in the remaining subfamilies possessed similar Ka/Ks ratios, suggesting that the homologous genes in the BnaNLP8 subfamily might have been subjected to different evolutionary pressures after the whole genome triplication (WGT) event in B. napus.

Gene Structure and Conserved Motif Analyses
To compare the exon/intron structures of AtNLPs and BnaNLP genes, we aligned the coding sequences with their corresponding genomic sequences. Similar gene structures were not only observed in homologous genes in B. napus, but they were also found within NLP homologs in B. napus and A. thaliana ( Figure 1B). Most BnaNLP members in groups I and II contain four exons and three introns, while BnaNLP2-1/4-6/4-2 genes contain five exons and four introns. Most members in group III contain at least six exons and five introns, except for BnaNLP8-3 (five exons and four introns) and BnaNLP8-6 and BnaNLP9 (seven exons and six introns each). The similar gene structures observed in the same BnaNLP subfamilies are consistent with their phylogenetic relationships.
Sequence alignment showed that motif 1 is the conserved GAF domain. InterProScan annotation showed that motif 2 is the RWP-RK domain, which plays a key role in N-mediated control of development. Motif 6 is the PB1 domain, comprising approximately 80 amino acid residues. BnaNLP proteins in the same group share similar motifs, suggesting that similar BnaNLP members share conserved biological functions.

Gene Structure and Conserved Motif Analyses
To compare the exon/intron structures of AtNLPs and BnaNLP genes, we aligned the coding sequences with their corresponding genomic sequences. Similar gene structures were not only observed in homologous genes in B. napus, but they were also found within NLP homologs in B. napus and A. thaliana ( Figure 1B). Most BnaNLP members in groups I and II contain four exons and three introns, while BnaNLP2-1/4-6/4-2 genes contain five exons and four introns. Most members in group III contain at least six exons and five introns, except for BnaNLP8-3 (five exons and four introns) and BnaNLP8-6 and BnaNLP9 (seven exons and six introns each). The similar gene structures observed in the same BnaNLP subfamilies are consistent with their phylogenetic relationships.
Sequence alignment showed that motif 1 is the conserved GAF domain. InterProScan annotation showed that motif 2 is the RWP-RK domain, which plays a key role in N-mediated control of development. Motif 6 is the PB1 domain, comprising approximately 80 amino acid residues. BnaNLP proteins in the same group share similar motifs, suggesting that similar BnaNLP members share conserved biological functions.

Prediction of TF Binding Sites in the BnaNLP Promoters
TF binding sites (TFBSs) are short, specific DNA sequences that bind with TFs to regulate the transcription of genes. TFBS identification is a key step in understanding transcriptional regulation mechanisms and establishing transcriptional regulatory networks. To further understand the transcriptional regulation and potential functions of BnaNLPs genes, we screened for TFBSs in the 2000-bp upstream promoter sequences of these genes using PlantPan2.0 [31]. Sixty-three types of TFBSs were detected in the promoters of BnaNLP genes, involving 43 TF families (Table S3). TFBSs for 29 TF families were observed in all of the BnaNLP promoters, such as binding sites for AP2, GATA, NAC, and MADS-box TFs. In addition, BnaNLP9-2 might be associated with physiological processes, such as plant resistance and aging, and thus its promoter contained more binding sites for the WRKY TFs compared with other family members (Table S3).

Three-Dimensional Structure Prediction
We predicted the three-dimensional structure of BnaNLP7-1 protein, a representative BnaNLP proteins, using I-TASSER [32]. The best template used for the 3-D structure prediction was the RNA-dependent RNA polymerases of transcribing cypoviruses [33]. The sequence identity between BnaNLP7-1 and the 3JA4 template was 0.18 across the whole template. The coverage of the threading alignment (equal to the number of aligned residues divided by the length of the query protein) was 0.96. Alignment with a normalized Z-score > 1 indicates good alignment and vice versa. The normalized Z-score of the threading alignments was 2.83, suggesting that good alignment was obtained in our prediction. The modeled structure for BnaNLP7-1 has 31 α-helices and five β-strands ( Figure 5A). The RWP-RK domain contains two α-helices (α18 and α19), while the PB1 domain contains five α-helices (α27, α28, α29, α30, and α31). The two conserved domains also possess many loops and lack β-strands.
NLP proteins are plant-specific TFs that play key roles in regulating Nitrogen responses. To gain deeper insight into their biological function, COFACTOR and COACH [33] were used to predict the binding activities of the BnaNLP proteins with the promoters of downstream genes. The 2r7rA ligand (PBD: 2r7rA) [34] was used to bind BnaNLP7-1 protein as the nucleic acid substrate. The docking results showed that Gln320, Ala322, Leu323, Gln343, Thr344, Trp345, Gly369, Asp381, Ala383, Ala499, Ser500, Gly501, Phe767, and Pro768 are involved in ligand binding ( Figure 5C). Interestingly, the binding of BnaNLP7-1 and a nitrogen deficiency-responsive cis-element form a ring that associates with polymerase II transcription ( Figure 5D). No binding ability was detected for two conserved domains (RWP-RK and PB1) due to the lack of a binding site in their sequences, suggesting that those two conserved domains not function in ligand binding. obtained in our prediction. The modeled structure for BnaNLP7-1 has 31 α-helices and five β-strands ( Figure 5A). The RWP-RK domain contains two α-helices (α18 and α19), while the PB1 domain contains five α-helices (α27, α28, α29, α30, and α31). The two conserved domains also possess many loops and lack β-strands. The conserved PWP-PK domain and PB1 motifs are shown in red on the 3D structure. αhelices, β-strands, and random coils are indicated in green, yellow, and navy blue, respectively. All images were generated using Chimera 1.2. α-helices, β-strands, and random coils are indicated in green, yellow, and navy blue, respectively. All images were generated using Chimera 1.2.

Organ-Specific Expression Patterns of BnaNLP Genes
To investigate the organ-specific expression profiles of the BnaNLP genes, we subjected 17 types of organ to RNA-seq. Calculation of the expression levels of the BnaNLP genes revealed diverse expression patterns in different subfamilies ( Figure 6). Four BnaNLP1 subfamily members were highly expressed in silique pericarps (SP) at 30 days after flowering (DAF), especially BnaNLP1-1 and BnaNLP1-3. Four BnaNLP2 subfamily members and BnaNLP3 were expressed at extremely low levels in all organs examined. BnaNLP4-1, BnaNLP4-4, BnaNLP8-1, BnaNLP8-4, BnaNLP9-1 and BnaNLP9-2 were expressed at high levels in seeds at 40 and 3 DAF. Two BnaNLP5 genes were strongly expressed only in seeds at 46 DAF. Unlike the abovementioned subfamilies, two BnaNLP6 subfamily members were mainly expressed in roots, and four BnaNLP7 subfamily members were highly expressed in leaves. These results indicate that most BnaNLP genes were expressed at different levels in different organs and were preferentially expressed in specific organs. in all organs examined . BnaNLP4-1, BnaNLP4-4, BnaNLP8-1, BnaNLP8-4, BnaNLP9-1 and BnaNLP9-2 were expressed at high levels in seeds at 40 and 3 DAF. Two BnaNLP5 genes were strongly expressed only in seeds at 46 DAF. Unlike the abovementioned subfamilies, two BnaNLP6 subfamily members were mainly expressed in roots, and four BnaNLP7 subfamily members were highly expressed in leaves. These results indicate that most BnaNLP genes were expressed at different levels in different organs and were preferentially expressed in specific organs.

Expression Patterns of BnaNLP Genes under Nitrogen Deficiency
To investigate the expression patterns of BnaNLP genes under nitrogen deficiency, seedling leaves (SLs) and (seedling roots) SRs were harvested at various time points (0 to 72 h) after culturing in Hoagland nutrient solution with or without Nitrogen. In SLs, BnaNLP8-3, BnaNLP8-4, BnaNLP8-5, BnaNLP9-1, BnaNLP1, BnaNLP4, and BnaNLP5 subfamily members were upregulated under nitrogen deficiency treatment, while other members were downregulated (Figure 7). Most BnaNLPs were significantly upregulated or downregulated after 48-72 h of nitrogen deficient treatment, except for BnaNLP1-1 and BnaNLP8-4. In addition, a few genes were upregulated immediately after treatment and subsequently downregulated after 6 to 18 h, such as BnaNLP6-1 and BnaNLP8-1.
Interestingly, the responses of most BnaNLP genes in SRs to nitrogen deficiency treatment were highly similar to those in SLs (Figures 7 and 8). BnaNLP1-3 and BnaNLP1-4 were the most sensitive genes to nitrogen deficiency treatment in SLs, whereas BnaNLP4-3 was most sensitive to this treatment in SRs (Figures 7 and 8). Among the four BnaNLP7 subfamily members, three (BnaNLP7-1, BnaNLP7-3, and BnaNLP7-4) were downregulated after treatment in SLs, while BnaNLP7-2 was upregulated ( Figure 7). BnaNLP7-1/3 were downregulated and BnaNLP7-2/4 were upregulated after treatment in SRs. In general, members of the same subfamily often shared similar expression patterns, except for members of the BnaNLP7/8 subfamilies, consistent with their phylogenetic relationships.

Characterization of the NLP Gene Family in B. napus
Previous studies have revealed nine NLP family genes in Arabidopsis [8] and maize [17], six in rice [14], and five in sorghum [10]. In this study, we identified 31 BnaNLP family genes in B. napus. The BnaNLP gene family is larger than the NLP gene family in other plant species, because B. napus experienced an extra WGT and merging events [35]. Interestingly, NLP family genes have showed long protein residues and slightly lower pI values in several species [10,14,20,29]. Similar results were also detected in BnaNLP genes (Table 1), which illustrated that this TFs prefer to be active in acidic conditions. In maize, most ZmNLP genes possessed four to six exons, with the exception of ZmNLP1 which contained seven exons [20]. Similarly, for NLP genes in Populus trichocarpa, most NLP also contained four to six exons, except for PtrNLP1 (two exons) and PtrNLP4 (seven exons) [29]. In this study, different gene structures were found among BnaNLP family members, and group I and II members had fewer exons (four to five) than group III members (five to six, BnaNLP8-6 and BnaNLP9 possessed seven exons), implying structural diversification among the BnaNLP subfamilies. NLP proteins contain two conserved domains, RWP-RK and PB1. All BnaNLP proteins identified in this study contain two domains, except for BnaNLP4-4, which contains only the RWP-RK domain. Sequence analysis of BnaNLP4-4 indicated that more than 50 amino acids (including the PB1 domain) are absent in its C-terminus, which was caused by sequence fragmentation in the evolution or sequencing error in the B. napus genome. The RWP-RK domain contains a helix-turn-helix motif and an amphipathic leucine zipper, which might be involved in DNA binding [10,11]; further studies showed that its activity is unrelated to nitrate signaling, whereas the N-terminal regions of NLPs function as a transcriptional activation domain to mediate this signaling [22]. However, our docking results suggested that this domain may not be involved in the binding of NLP protein with nucleic acids. The PB1 domain contains two α-helices and a mixed five-stranded β-sheet, which might play a key role in its protein-binding ability [15]. We found that the PB1 domain acts like a switch, which may involve controlling the initiation or stopping the binding process. Hence, the biological function of the PB1 domain in the protein binding process deserved further elucidation.
Previous studies suggested that the structurally characterized GAF domains bind with low-molecular-weight ligands, including cGMP, 2-oxoglutarate, nitric oxide and nitrate, or serve as homodimerization modules associated with gene regulation in organisms ranging from bacteria to higher plants [3,36,37]. Due to the limitation of sequence identity between BnaNLP7-1 and the template, the binding site for GAF domain was not identified in our docking results. But we found the conserved sensing and signaling GAF domain in all BnaNLP proteins. Although legumes have specifically evolved as the NIN gene with functional responsibility for symbiotic nitrogen fixation, the presence of these three domains (GAF, RWP-RK, PB1) allows the NLPs to function in various aspects of nitrogen responses in non-nitrogen fixating plants, including nitrogen sensing, transcriptional modulation, and signal transduction [20]. In this study, all BnaNLP proteins possessed the three domains (besides BnaNLP4-4), reflecting the pivotal roles of BnaNLP proteins in rapid response and adaptation to nitrogen deficiency in B. napus.

Phylogenetic Relationship and Duplication Analysis of BnaNLPs
At least three ancient copies of NLP genes are thought to have existed in the ancestor of monocots and dicots [10], and two NLP genes have experienced one round of duplication in Arabidopsis, whereas the third gene has undergone several rounds of duplication since the divergence of eudicots and monocots. Our phylogenetic analysis of NLP genes in nine plant species provided evidence for the hypothesis that the ancestor of monocots and dicots contained at least three NLP gene family members (Figure 2), since there is at least one NLP gene in each of three NLP groups in A. trichopoda. Our results suggest that the ancestor of NLPs in most plant species has experienced one round of duplication, while the ancestor of NLPs in group III may have been duplicated more than once in the Brassicaceae lineages, perhaps due to the WGT event that have occurred in Brassicaceae species.
In general, B. rapa and B. oleracea contain three syntenic copies of each A. thaliana gene [28,29] and B. napus contains six, since it was generated from B. rapa and B. oleracea [35]. In this study, 0-2 copies of each NLP gene were identified in B. rapa and B. oleracea and 1-6 copies were identified in B. napus, perhaps due to genome shrinkage or gene loss after WGT. We detected 31 syntenic copies of BnaNLP genes and 16 syntenic copies of BraNLP genes in ancient WGT blocks, suggesting that the NLP gene family in the A subgenome of Brassica crops is highly conserved. However, 14 syntenic gene copies were found in ancient WGT blocks in B. oleracea, only eight of which contained both the RWP-RK and PBl domains, perhaps due to assembly errors of B. oleracea genome.

Expression Profiling and Response of Genes to Nitrogen Deficient Conditions
The expression patterns of NLP genes in different organs have been investigated in several plants. In rice and Arabidopsis, NLPs are expressed in almost all organs [14]. AtNLP8 and AtNLP9 are preferentially expressed in senescent leaves and seeds and are expressed at moderate or very low levels in other organs [14]. In rice, OsNLP1 and OsNLP3 are highly expressed in source organs, representing the most abundant OsNLP transcripts [14]. NLPs in P. trichocarpa are highly expressed in leaves, roots, male catkins, xylem, seeds, and female catkins [29]. In this study, we found that most BnaNLP genes were expressed in leaves, while only the four BnaNLP7 subfamily members were highly expressed in leaves ( Figure 6). Similar to previous reports, the accumulations of BnaNLP genes in leaves might also be used for storing nitrogen to coordinate leaf expansion and photosynthetic capacity, and nitrogen supply can improve their storage to promote leaf growth and biomass [38]. BnaNLP1 genes were highly expressed in silique pericarps, the main source organ during the podding stage in B. napus, suggesting that BnaNLP1 subfamily members may be involved in nitrogen storage or supply mainly in silique pericarps. In addition, six NLPs (BnaNLP4-1, BnaNLP4-4, BnaNLP8-1, BnaNLP8-4, BnaNLP9-1, and BnaNLP9-2) were strongly expressed in seeds at 40 and 46 DAF, suggesting that these genes may function in nitrogen assimilation during seed maturation. BnaNLP6 subfamily members were primarily expressed in roots, which is similar to the expression patterns of NLP genes in maize [20]. Different from nitrogen-fixing plants, the sources of nitrogen in Brassica species are mainly based on the absorption of nitrate nitrogen in soil though roots [2,39]; the strong expression of BnaNLP6 subfamily in root may reflect the important roles of BnaNLP6 in nitrogen absorption in roots of B. napus. We found that members of the same subfamily share similar expression profiles, including BnaNLP1, BnaNLP2, BnaNLP5, BnaNLP6, BnaNLP7 and BnaNLP9 subfamilies, suggesting basic functional conservation within each subfamily [14]. However, members of the BnaNLP4 or BnaNLP8 subfamily showed less similar expression profiles and some of them are not expressed at any organs, illustrating that nonfunctionalization or subfunctionalization occurred in the two BnaNLP subfamilies.
Nitrogen deficiency treatment can trigger rapid, extensive transcriptional changes of genes involved in a wide range of cellular and physiological processes [40,41]. In the current study, most BnaNLP genes were up-or down-regulated under nitrogen deficiency treatment in both SLs and SRs, due to the specific functions in nitrogen remobilization during nitrogen deficiency [42]. These upregulated BnaNLP genes may play key roles in the distribution and remobilization of nitrogen in response to the short-time demand for nitrogen during nitrogen deficiency. Interestingly, the induction of BnaNLP genes was slower than that observed for the homologs in maize, which were up/downregulated within 2 h of nitrate treatment [20], pointing to different nitrogen utilization mechanisms between B. napus and maize.
In Arabidopsis, AtNLP7 regulates nitrate and nitrogen starvation responses [9] and was identified in a genetic screen for regulators of the primary nitrate response [2]. AtNLP7 binds to the promoters of 851 genes in response to nitrate signals, binding preferentially to the transcriptional start sites of target genes. Moreover, the accumulation of AtNLP7 in the nucleus occurs independently of transcriptional regulation, and inhibiting nuclear export mimics the nitrate signal [24]. In this study, we identified four BnaNLP7 subfamily members. Of these, three (BnaNLP7-1/7-3/P7-4) were downregulated and only the BnaNLP7-2 expression was upregulated in SLs after nitrogen deficiency treatment ( Figure 7). However, BnaNLP7-1/7-3 were downregulated and BnaNLP7-2/7-4 were upregulated in SRs after nitrogen deficiency treatment, suggesting that BnaNLP7 subfamily members have experienced subfunctionalization during the polyploidy evolution of B. napus. The three different expression patterns in BnaNLP7 subfamily (upregulated in both SLs and SRs (BnaNLP7-2), downregulated in both SLs and SRs (BnaNLP7-1/7-3) and the exhibition of opposite regulated patterns in SLs vs. SRs (BnaNLP7-4)) point to the presence of at least three types of molecular mechanisms that function in response to nitrogen deficiency. Based on functional studies of AtNLP7, it could be speculated that there might be at least three groups of genes which are regulated by different BnaNLP7 subfamily members. Under the condition of nitrogen deficiency, upstream nitrogen signaling pathway genes activate different BnaNLP7 subfamily members in different organs, then the post-translationally activated BnaNLP7 regulate the expression of nitrate-inducible assimilation and downstream regulatory genes through interaction with NREs to improve the nitrogen absorption in roots and nitrogen assimilation in whole B. napus plants.
In summary, we found that BnaNLP genes have various expression patterns and most of them are extremely sensitive to nitrogen deficiency. This provides insights into the biological functions of BnaNLP genes in response to nitrogen deficiency.

Plant Materials and Treatments
Seeds of Brassica napus cultivar ZS11 were germinated in plant growth chambers (under a 16 h photoperiod at 25/18 • C day/night, 60% humidity, 250 µmoles/m 2 /s; PGC Flex, Conviron, MB, Canada) and transplanted into a field in Chongqing, China. To detect the expression patterns of BnaNLP family members in different organs, root (Ro), hypocotyl (Hy), and cotyledon (Co) organs were collected at 48 h after seed germination. Ro, stem (St), mature leaf (Le), bud (Bu), and flower (Fl) organs were harvested at the initial blooming stage, while seeds (Se) were harvested at 10, 20, 40, and 46 DAF, and silique pericarps (SP) were harvested at 10, 21, 30, 40, and 46 DAF in the field. Two biological replicates per sample were collected for RNA sequencing (RNA-seq) analysis.
To investigate the expression patterns of BnaNLP genes under nitrogen deficiency, B. napus seedlings were cultivated in pots in full-strength Hoagland solution without sand as a substrate in a plant growth chamber with a thermo-photoperiod of 25 • C for 16 h/18 • C for 8 h (light/dark) [43]. One-month-old seedlings were transferred into new full-strength Hoagland solution or modified N-deficient Hoagland nutrient solution for treatment. All nutrition solutions were renewed every 3 days. Three biological replicates of SLs and SRs were harvested after 0, 6, 18, 24, 48, and 72 h of treatment. All samples were immediately frozen in liquid nitrogen and stored at −80 • C.
To identify NLP genes in these species, nine NLP protein sequences from Arabidopsis were used as queries in a reciprocal Basic Local Alignment Search Tool Protein (BLASTP) analysis [44,45] using the threshold and minimum alignment coverage parameters described previously [46]. All NLP protein sequences were further analyzed using PfamScan (http://www.ebi.ac.uk/Tools/pfa/pfamscan/) to confirm the presence of two NLP-related domains, RWP-RK (PF02042) and PBl (PF00564), with an e-value cut-off of 1 × 10 −5 .

Phylogenetic Analysis of the NLP Family in B. napus
Multiple sequence alignments of NLP coding sequences between B. napus and A. thaliana and of protein sequences in the nine plant species (A. thaliana, B. napus, B. rapa, B. oleracea, rice, maize, sorghum, grape and A. trichopoda) mentioned above were conducted using ClustalW2 with default parameters (http://www.genome.jp/tools-bin/clustalw). The phylogenetic trees were generated using the Molecular Evolutionary Genetics Analysis (MEGA) 7 program (Tokyo Metropolitan University, Tokyo, Japan) [47] with the NJ method, the p-distance + G substitution model, 1000 bootstrap replications, and conserved sequences with a coverage of 70%. The phylogenetic trees were visualized using FigTree v1.4.0 (http://tree.bio.ed.ac.uk/software/figtree/). The coding sequence alignments were imported into KaKs_calculator2.0 (Chinese Academy of Sciences, Beijing, China) to calculate the synonymous mutation rate (Ks) and non-synonymous mutation rate (Ka) using the NG method [22].

Chromosomal Locations and Protein Property Analysis
Chromosomal locations of the BnaNLP genes and the orthologous relationships between A. thaliana and Brassica species were determined based on the results of reciprocal BLASTP analysis and the general feature format (GFF) files downloaded from the Brassica database (http://brassicadb.org/ brad). The chromosome distribution was plotted with MapChart2.0 (https://mapchart.net/) [48].

Motif Identification and TF Binding Site Analysis
Conserved motifs in the proteins were identified using the Expectation Maximization for Motif Elucidation program (MEME v4.12.0, http://meme-suite.org) with the following parameter settings: The maximum number of motifs was 15, and the optimum width was set from 6 to 200. Only motifs with an e-value of 1e-10 were retained for further analysis. The promoter sequences (2-kb upstream region) of the BnaNLP genes were obtained from the B. napus genome (http://www.genoscope.cns.fr/ brassicanapus). PlantPAN2.0 (http://plantpan2.itps.ncku.edu.tw) was used to analyze TF binding sites (TFBSs).

Three-Dimensional Structure Prediction of BnaNLP7-1
The three-dimensional (3D) structure of BnaNLP7-1 protein was predicted using the I-TASSER program (https://zhanglab.ccmb.med.umich.edu/I-TASSER/) [32]. To identify the best structural template for BnaNLP7-1 in the Protein Data Bank (PDB) database [52], the query sequence was subjected to multiple rounds of threading using LOMETS [53]. RNA-dependent RNA polymerases of transcribing cypoviruses (PDB ID: 3JA4) [33] were the best structural template for our queried protein. The newly generated 3D model was aligned to the template using TM-align, and COFACTOR was used to predict the binding sites of ligands to the protein structure [54,55]. Chimera 1.2 (https: //www.cgl.ucsf.edu/chimera/) was used to examine and visualize the newly generated 3D model of BnaNLP7-1.

RNA Isolation, RNA-Sequencing, and Quantitative Reverse-Transcription PCR
Total RNA was extracted from the organ using an RNA Extraction Kit (Tiangen, Beijing, China), and cDNA was synthesized from 1 µg of total RNA using M-MLV transcriptase (TaKaRa Biotechnology, Dalian, China) according to the manufacturer's instructions.
To determine the organ-specific expression profiles of BnaNLP genes, publicly available B. napus RNA-seq data (BioProject ID PRJNA358784) were used to quantify the expression levels of the B. napus genes in different organs as fragments per kilobase of exon per million reads mapped (FPKM) values using Cufflinks with default parameters. Two independent biological replicates were analyzed per sample. The expression values of the 31 BnaNLP genes were extracted from the RNA-seq results and normalized by Log2 (FPKM + 1).
Reverse transcription quantitative PCR (RT-qPCR) was performed on a Bio-Rad CFX96 Real Time System (USA) according to a previously described method [20]. Gene-specific primer sequences for the BnaNLP genes were obtained from the qPrimerDB database [56]. (Table S1). Each reaction was carried out in biological triplicate in a reaction volume of 20 µL containing 1.6 µL of gene-specific primers (1.0 µM), 2.0 µL of cDNA, 10 µL of SYBR green, 0.2 µL of ROX Reference Dye II, and 6 µL of sterile distilled water. The PCR program was as follows: 95 • C for 3 min; 45 cycles of 10 s at 95 • C and 30 s at 60 • C; and then melt curve 65 • C to 95 • C, increment 0.5 • C for 5 s. Melting curves were generated to estimate the specificity of these reactions. Relative expression levels were calculated using the 2 −∆∆Ct method, with BnaACT7 and BnaUBC21 used as internal controls [57].