Unravelling Cotton Nonexpressor of Pathogenesis-Related 1(NPR1)-Like Genes Family: Evolutionary Analysis and Putative Role in Fiber Development and Defense Pathway

The nonexpressor of pathogenesis-related 1 (NPR1) family plays diverse roles in gene regulation in the defense and development signaling pathways in plants. Less evidence is available regarding the significance of the NPR1-like gene family in cotton (Gossypium species). Therefore, to address the importance of the cotton NPR1-like gene family in the defense pathway, four Gossypium species were studied: two tetraploid species, G. hirsutum and G. barbadense, and their two potential ancestral diploids, G. raimondii and G. arboreum. In this study, 12 NPR1-like family genes in G. hirsutum were recognized, including six genes in the A-subgenome and six genes in the D-subgenome. Based on the phylogenetic analysis, gene and protein structural features, cotton NPR-like proteins were grouped into three different clades. Our analysis suggests the significance of cis-regulatory elements in the upstream region of cotton NPR1-like genes in hormonal signaling, biotic stress conditions, and developmental processes. The quantitative expression analysis for different developmental tissues and fiber stages (0 to 25 days post-anthesis), as well as salicylic acid induction, confirmed the distinct function of different cotton NPR genes in defense and fiber development. Altogether, this study presents specifications of conservation in the cotton NPR1-like gene family and their functional divergence for development of fiber and defense properties.


Introduction
The sessile nature of plants makes them susceptible to sporadicity in environmental and biotic stresses. To protect themselves against intruding pathogens, plants have evolved several barrier systems for defense. Systemic acquired resistance (SAR) is an induced immune mechanism activated against a broad spectrum of pathogens and provides long-lasting immunity to the distal uninfected tissues from the synchronized activation of various pathogenesis-related genes [1,2]. Salicylic acid (SA) is a crucial signaling molecule for the initiation of the SAR response and expression of pathogenesis-related genes (PRs) [3,4]. Screening of Arabidopsis mutants that are dysfunctional in mounting salicylic acid-mediated defense response leads to the identification of a master regulator of the SAR pathway, NPR1 (nonexpressor of pathogenesis-related 1) [5]. Interestingly, NPR1 has not only emerged as the   (Table S2).

Phylogenetic and Synteny Analysis of Cotton NPR1-Like Protein Family
In order to gain insight into evolutionary association and phylogenetic relationships of the NPR1-like protein family in cotton, a phylogenetic tree was generated using full-length protein   (Table S2).

Phylogenetic and Synteny Analysis of Cotton NPR1-Like Protein Family
In order to gain insight into evolutionary association and phylogenetic relationships of the NPR1-like protein family in cotton, a phylogenetic tree was generated using full-length protein

Phylogenetic and Synteny Analysis of Cotton NPR1-Like Protein Family
In order to gain insight into evolutionary association and phylogenetic relationships of the NPR1-like protein family in cotton, a phylogenetic tree was generated using full-length protein sequences of dicot plants (A. thaliana, Persea americana, Carica papaya, Glycine max, Malus domestica, Populus trichocarpa, and Lycopersicon esculentum), monocot plants (Sorghum bicolor, Oryza sativa, and Zea mays), bryophytes (Physcomitrella patens), and lycophytes (Selaginella moellendorffii) ( Table S2). The phylogenetic analysis showed that all putative Gossypium NPR1-like proteins were divided into three clades based on the homologous sequence. In clade I, all the NPR1 homologs in Gossypium species were located in a similar cluster with AtNPR1 and AtNPR2. Similarly, in Gossypium species NPR3 and NPR4 marked their presence with AtNPR3 and AtNPR4 in clade II. In clade III, Gossypium species NPR5 and NPR6 were clustered together with AtNPR5 and AtNPR6 ( Figure 3). The approach of elucidating three groups of cotton NPR1-like protein was also similar to previous reports in Arabidopsis, papaya, apple, or avocado [11,25,27,35]. In all the three groups, members belonging to dicots or monocots were clustered together, while bryophytes and pteridophytes forming a similar cluster. The NPR1-like proteins homologous to Gossypium species were closely clustered together with different dicots than with the members of monocot species, indicating closer relationships with members of dicots as compared to monocots. Further, NPR1-like proteins from Gossypium species were closer to tree-like plants such as C. papaya, M. domestica, and P. trichocarpa ( Figure 3).  (Table S2). The phylogenetic analysis showed that all putative Gossypium NPR1-like proteins were divided into three clades based on the homologous sequence. In clade I, all the NPR1 homologs in Gossypium species were located in a similar cluster with AtNPR1 and AtNPR2. Similarly, in Gossypium species NPR3 and NPR4 marked their presence with AtNPR3 and AtNPR4 in clade II. In clade III, Gossypium species NPR5 and NPR6 were clustered together with AtNPR5 and AtNPR6 ( Figure 3). The approach of elucidating three groups of cotton NPR1-like protein was also similar to previous reports in Arabidopsis, papaya, apple, or avocado [11,25,27,35]. In all the three groups, members belonging to dicots or monocots were clustered together, while bryophytes and pteridophytes forming a similar cluster. The NPR1-like proteins homologous to Gossypium species were closely clustered together with different dicots than with the members of monocot species, indicating closer relationships with members of dicots as compared to monocots. Further, NPR1-like proteins from Gossypium species were closer to tree-like plants such as C. papaya, M. domestica, and P. trichocarpa ( Figure 3).  The NPR1-like genes are mostly located on four chromosomes of the G. hirsutum A-and D-subgenomes ( Figure S2). Events of gene duplication have an effect on the amplification of gene families. The allotetraploid cotton species G. hirsutum, and G. barbendense originated from G. raimondii and G. arboreum. The event of gene duplication was executed across the four genomes of G. hirsutum, G. barbendense, G. arboreum, and G. raimondii (Figure 4). The duplicate genes of the NPR1-like gene family in the two allotetraploid cotton species were categorized into segmental duplication ( Figure 4). The NPR1-like genes are mostly located on four chromosomes of the G. hirsutum A-and Dsubgenomes ( Figure S2). Events of gene duplication have an effect on the amplification of gene families. The allotetraploid cotton species G. hirsutum, and G. barbendense originated from G. raimondii and G. arboreum. The event of gene duplication was executed across the four genomes of G. hirsutum, G. barbendense, G. arboreum, and G. raimondii (Figure 4). The duplicate genes of the NPR1-like gene family in the two allotetraploid cotton species were categorized into segmental duplication ( Figure  4).

Cotton NPR1-Like Gene Family Protein and Gene Structural Properties
The results of domain architecture represent the fact that all the members of the cotton NPR family have BTB/POZ and ankryin repeat domains, almost at a similar position with reference to the Arabidopsis NPR family ( Figure 5 and Figures S3-S5). The NPR1-like C-terminal region (a transactivation domain) is absent from cotton NPR5 and NPR6, similar to Arabidopsis AtNPR5 and AtNPR6 homologs, respectively ( Figure 5 and Figures S3-S5). Similar to AtNPR1 and AtNPR2, certain essential serine phosphorylated sites, like S11, S15, S55, and S59, were also present in all putative cotton NPR1 homologs. All the cotton NPR homologs contain conserved cysteine residues at C82, C150, C155, and C160, yet only cotton NPR1A/1D possess conserved cysteine at 216 positions.

Cotton NPR1-Like Gene Family Protein and Gene Structural Properties
The results of domain architecture represent the fact that all the members of the cotton NPR family have BTB/POZ and ankryin repeat domains, almost at a similar position with reference to the Arabidopsis NPR family ( Figure 5 and Figures S3-S5). The NPR1-like C-terminal region (a transactivation domain) is absent from cotton NPR5 and NPR6, similar to Arabidopsis AtNPR5 and AtNPR6 homologs, respectively ( Figure 5 and Figures S3-S5). Similar to AtNPR1 and AtNPR2, certain essential serine phosphorylated sites, like S11, S15, S55, and S59, were also present in all putative cotton NPR1 homologs. All the cotton NPR homologs contain conserved cysteine residues at C82, C150, C155, and C160, yet only cotton NPR1A/1D possess conserved cysteine at 216 positions. These cysteine residues play an important role in maintaining the oligomeric state of NPR1 in the cytoplasm [36]. Furthermore, none of the cotton NPR proteins possess conserved cysteine residue at C521 and C529, which is required for the binding of salicylic acid through transition metal copper [7]. C521 is absent in GhNPR1A/D, whereas C529 is present in GhNPR1A/1D. All cotton NPR proteins, similar to Arabidopsis, have a conserved motif-like penta-amino acid (LENRV), VDLNETP motif, and NIMIN1/2 binding site at the C-terminus. Cotton NPR1, 3, and 4 contain three out of five conserved basic amino acid motifs in their nuclear localization signal (NLS) except for GhNPR4A.1, GhNPR4D.1, GbNPR4A.1, GbNPR4D.1, GrNPR4.1, and GaNPR4, which have four substitutions in this region ( Figure 5 and Figures S3-S5) [37].
GhNPR1-like genes share a similar level of exon-intron structural organization of each gene, except for GhNPR3A.2, GhNPR3D.2, and GhNPR6A. Based on the exon-intron structure, the NPR genes were classified into three groups (Figure 1 and Figure S1). The homologs of NPR genes in groups I and II had four exons, whereas those in group III had two. Furthermore, the sizes of corresponding exons in each group were nearly similar. The protein charge of putative GhNPR proteins displays high variation and ranges from −8.5 to 4.0, suggesting the presence of acidic and basic amino acids (Table S4). However, theoretical pI varied from 5.3 to 6.7, indicating that these proteins could possess neutral side-chain amino acids. Consistently, protein charge and pI were observed in G. barbadense, G. arboreum, and G. ramondii. However, Arabidopsis showed a different nature of protein charge and pI in NPR family homologs. Furthermore, the presence of a negative grand average of hydropathy (GRAVY) score for cotton NPR1-like genes reflected their hydrophilic nature, revealing higher variability in hydrophilicity (Table S4).

Prediction of the Cis-Regulatory Elements in Cotton NPR1-Like Gene Family Promoter
The occurrence of cis-regulatory element in the promoter of genes plays major roles in regulating gene expression during developmental or environmental changes. To examine the putative function of cotton NPR1-like genes, cis-regulatory elements in the 2000 bp upstream region from ATG (translational start codon) in the promoter of the GhNPR homologs in defense hormonal or stress conditions or developmental stages of cotton were analyzed from the PLACE and PlantCare databases ( Figure 6, Table S5). The cis-regulatory element analysis suggested a large number of defense elements are present in GhNPR1A/D, 3A.1/D.1, 4D, and 5A as compared with GhNPR3A.2/D.2, 4A, 5D, and 6A/D. ASF1MOTIF, WRKY motif, and W-Box, well-known defense cis-regulatory elements, are generously present in GhNPR1, 3, and 4 homologs compared to GhNPR5/6 homologs ( Figure 6A). Furthermore, hormone-related cis-regulatory elements were also profusely present in the upstream region of GhNPR1, GhNPR3, and GhNPR5, with comparatively fewer found in GhNPR4 and GhNPR6 homologs ( Figure 6B). Additionally, the cis-regulatory elements associated with development were less present in GhNPR1A, GhNPR1D, GhNPR3A.2, and GhNPR5D as compared to others ( Figure 6C). Furthermore, hormone-related cis-regulatory elements were also abundantly present in GhNPR genes, except for the GhNPR4 and GhNPR6 homologs. These results indicated that GhNPR homologs have important roles in the defense and hormonal signaling pathways, as well as developmental conditions, according to the prediction of in silico analysis on upstream regions of the cotton GhNPR genes.

In Silico Gene Expression Pattern of Cotton NPR1-Like Gene Family
Understanding the gene expression pattern of identified GhNPR genes in different tissues can provide insightful information regarding their probable functional roles. To address gene expression patterns, we examined the expression profiles of GhNPR genes in different developmental stages, such as the leaf, calycle, stem, torus, stamen, root, pistil, petal, and seed stages using available transcriptome data [34] ( Figure 7A). Most of the putative cotton NPR1-like genes showed either a very low or moderate level of transcripts. Interestingly, the lowest expression for GhNPR1A/1D genes was observed in all of the analyzed cotton tissues, suggesting no role in the development process. A higher expression was observed for GhNPR3A.1 and GhNPR6A/6D in the seed stage, whereas GhNPR3A.2/3D.2 and GhNPR6A were amongst the most highly expressed in the pistil stage. GhNPR3A.2 was among the more expressed genes in the root. In tissues like the stem and stamen, there was a high expression of GhNPR3A.2 and GhNPR5A.2 genes.

In Silico Gene Expression Pattern of Cotton NPR1-Like Gene Family
Understanding the gene expression pattern of identified GhNPR genes in different tissues can provide insightful information regarding their probable functional roles. To address gene expression

Expression Analysis of the Cotton NPR1-Like Gene Family in Developmental Tissue
The gene expression analysis of representative GhNPR genes in different tissues-young leaf, mature leaf, stem, flower, and root-were determined by qRT-PCR analysis ( Figure 8). The qRT-PCR analysis result showed that the gene expression level of all GhNPR genes was declined in flower, with respect to young leaf tissue. The expression level of GhNPR1A and GhNPR1D was low in stem, flower, and root tissue, in comparison with young leaf tissue. GhNPR3A.2 showed relatively lower expression in the mature leaf and stem, whereas GhNPR3A.1 and GhNPR3D.2 exhibited higher expression in root and mature leaf tissue, respectively. The expression level of GhNPR4A and GhNPR4D was increased in the mature leaf and root as compared with the young leaf. The gene expression of GhNPR5A and GhNPR5D was downregulated in the mature leaf as compared with young leaf and upregulated in the root tissue, whereas GhNPR6A and GhNPR6D expression levels were increased in the mature leaf and stem as compared with the young leaf. We further analyzed the expression profiles for putative cotton NPR1-like genes at different cotton fiber development stages, such as 0 days post-anthesis (DPA), 5 DPA, 10 DPA, 20 DPA, and 25 DPA, from the available transcriptome data [34] ( Figure 7B). Similarly, like in the cotton development tissue, we observed the lowest expression for GhNPR1A/1D genes in various fiber stages, except for the high expression of GhNPR1D at the 20 DPA fiber stage. GhNPR6A and GhNPR6D were highly expressed throughout the 0 DPA to 25 DPA fiber stages, except for GhNPR6A at 0 DPA and GhNPR6D at 10 DPA. GhNPR3A.2 and GhNPR3D.2 were highly expressed throughout the 0 DPA to 20 DPA fiber stages, except for GhNPR3A.2 in 10 DPA. The transcriptomic analysis of the fiber also suggests that GhNPR5A and GhNPR5D were highly expressed at the 5 DPA and 10 DPA fiber stages. We can broadly conclude from the in silico analysis of the transcriptomic data of the cotton fiber that a few cotton NPR1-like genes are probably involved in various stages of the fiber development.

Expression Analysis of the Cotton NPR1-Like Gene Family in Developmental Tissue
The gene expression analysis of representative GhNPR genes in different tissues-young leaf, mature leaf, stem, flower, and root-were determined by qRT-PCR analysis (Figure 8). The qRT-PCR analysis result showed that the gene expression level of all GhNPR genes was declined in flower, with respect to young leaf tissue. The expression level of GhNPR1A and GhNPR1D was low in stem, flower, and root tissue, in comparison with young leaf tissue. GhNPR3A.2 showed relatively lower expression in the mature leaf and stem, whereas GhNPR3A.1 and GhNPR3D.2 exhibited higher expression in root and mature leaf tissue, respectively. The expression level of GhNPR4A and GhNPR4D was increased in the mature leaf and root as compared with the young leaf. The gene expression of GhNPR5A and GhNPR5D was downregulated in the mature leaf as compared with young leaf and upregulated in the root tissue, whereas GhNPR6A and GhNPR6D expression levels were increased in the mature leaf and stem as compared with the young leaf.

Expression Analysis of the Cotton NPR1-Like Gene Family in Different Fiber Stages
To address the impact of GhNPR genes on fiber development, qRT-PCR was performed at different fiber development stages: 0 DPA, 5 DPA, 10 DPA, 15 DPA, 20 DPA, and 25 DPA (Figure 9). The result of qRT-PCR analysis showed a decrease in expression of GhNPR1A in all fiber stages and GhNPR1D at later stages (15 DPA, 20 DPA, and 25 DPA) of fiber development ( Figure 9A). Interestingly, enhanced expression of GhNPR3, GhNPR4, GhNPR5D, and GhNPR6 was seen at 10 DPA, except for GhNPR5A ( Figure 9B,C). Similarly, gene expression for GhNPR3D.1, GhNPR3A.2/3D.2, and GhNPR6A were increased at 25 DPA. However, at 15 DPA, expression of GhNPR3A.2, GhNPR4A, GhNPR6A, and GhNPR6D were increased, whereas the expression of GhNPR3D.2 was decreased. Likewise, there is either a decrease or change in expression level at 5 DPA in GhNPR genes, except for GhNPR3D.2, which showed enhanced expression. It could be concluded that GhNPR3, GhNPR4, GhNPR5, and GhNPR6 are involved in fiber development, contingent upon a detailed experimental validation.

Expression Analysis of the Cotton NPR1-Like Gene Family in Different Fiber Stages
To address the impact of GhNPR genes on fiber development, qRT-PCR was performed at different fiber development stages: 0 DPA, 5 DPA, 10 DPA, 15 DPA, 20 DPA, and 25 DPA (Figure 9). The result of qRT-PCR analysis showed a decrease in expression of GhNPR1A in all fiber stages and GhNPR1D at later stages (15 DPA, 20 DPA, and 25 DPA) of fiber development ( Figure 9A). Interestingly, enhanced expression of GhNPR3, GhNPR4, GhNPR5D, and GhNPR6 was seen at 10 DPA, except for GhNPR5A ( Figure 9B,C). Similarly, gene expression for GhNPR3D.1, GhNPR3A.2/3D.2, and GhNPR6A were increased at 25 DPA. However, at 15 DPA, expression of GhNPR3A.2, GhNPR4A, GhNPR6A, and GhNPR6D were increased, whereas the expression of GhNPR3D.2 was decreased. Likewise, there is either a decrease or change in expression level at 5 DPA in GhNPR genes, except for GhNPR3D.2, which showed enhanced expression. It could be concluded that GhNPR3, GhNPR4, GhNPR5, and GhNPR6 are involved in fiber development, contingent upon a detailed experimental validation. Internal control gene GhUBQ4 was used to normalize the expression of GhNPR1-like genes. Data are the mean ± SE of three biological replicates with two technical replicates in each set. Significant differences are indicated by * p < 0.05; ** p < 0.005.

Expression Analysis of the Cotton NPR1-Like Gene Family During SA-Induction
The members of the NPR1-like gene family have a prominent role in the SAR pathway [38,39]. Therefore, the expression pattern of G. hirsutum NPR1-like genes were addressed in response to an exogenous application of salicylic acid ( Figure 10, Figure S6). To assess the response from spraying 2 mM of salicylic acid on the leaves, expressions of the GhNPR-like genes were studied at 12 h and 24 h with respect to the water treated samples, which acted as a control. The qRT-PCR for samples with SA treatment showed a change in gene expression level for the GhNPR1D homolog, which was upregulated 2.7-fold (a statistically-significant change) at 12 h and downregulated 2.12-fold at 24 h ( Figure 10A). The gene expression for GhNPR3A.1, GhNPR3D.1, GhNPR4A, and GhNPR4D showed downregulation at 12 h after SA induction, followed by an increase in the gene expression at 24 h of

Expression Analysis of the Cotton NPR1-Like Gene Family During SA-Induction
The members of the NPR1-like gene family have a prominent role in the SAR pathway [38,39]. Therefore, the expression pattern of G. hirsutum NPR1-like genes were addressed in response to an exogenous application of salicylic acid ( Figure 10, Figure S6). To assess the response from spraying 2 mM of salicylic acid on the leaves, expressions of the GhNPR-like genes were studied at 12 h and 24 h with respect to the water treated samples, which acted as a control. The qRT-PCR for samples with SA treatment showed a change in gene expression level for the GhNPR1D homolog, which was upregulated 2.7-fold (a statistically-significant change) at 12 h and downregulated 2.12-fold at 24 h ( Figure 10A). The gene expression for GhNPR3A.1, GhNPR3D.1, GhNPR4A, and GhNPR4D showed downregulation at 12 h after SA induction, followed by an increase in the gene expression at 24 h of SA induction, significantly ( Figure 10B-D). The gene expression of both the At and Dt subgenomes of GhNPR5 and GhNPR6 homologs was down-regulated at both 12 h and 24 h post-induction of salicylic acid, but the change for GhNPR6A/D gene expression was statistically insignificant at 12 h ( Figure 10E-F). These results suggested that GhNPR genes GhNPR1 to GhNPR4 were regulated by SA, whereas GhNPR5A/D genes were repressed, indicating that SA has a different effect on different members of G. hirsutum NPR1-like genes.  Figure 10E-F). These results suggested that GhNPR genes GhNPR1 to GhNPR4 were regulated by SA, whereas GhNPR5A/D genes were repressed, indicating that SA has a different effect on different members of G. hirsutum NPR1-like genes.

Discussion
NPR1-like genes have been recognized as an important regulator of SA-dependent signaling and also of development pathways [5,11,[40][41][42]. The present study is illustrative of the NPR1-like gene family in Gossypium species and provides a foundation for functional and evolutionary characterization of the cotton NPR1-like protein family. The protein and domain similarities between Arabidopsis and cotton were very high, suggestive of the evolutionary conservation of the proteins and their domain. Interestingly, the diploid genome tree of cotton G. arboreum (GaNPR genes) contains only five NPR1-like genes, whereas G. raimondii possesses six NPR1-like genes. Twelve NPR1-like genes in both allotetraploid cotton species of upland cotton (GhNPR genes) and Pima or sea-land cotton (GbNPR genes) were identified in an attempt to better understand signaling in the defense and fiber-related developmental response, which probably better facilitates these processes. Instead, NPR1 and BOP1 were identified earlier in G. hirsutum and their role in defense was identified, yet information about their isoforms and the functions of other members of the family is still elusive [23,39].

Discussion
NPR1-like genes have been recognized as an important regulator of SA-dependent signaling and also of development pathways [5,11,[40][41][42]. The present study is illustrative of the NPR1-like gene family in Gossypium species and provides a foundation for functional and evolutionary characterization of the cotton NPR1-like protein family. The protein and domain similarities between Arabidopsis and cotton were very high, suggestive of the evolutionary conservation of the proteins and their domain. Interestingly, the diploid genome tree of cotton G. arboreum (GaNPR genes) contains only five NPR1-like genes, whereas G. raimondii possesses six NPR1-like genes. Twelve NPR1-like genes in both allotetraploid cotton species of upland cotton (GhNPR genes) and Pima or sea-land cotton (GbNPR genes) were identified in an attempt to better understand signaling in the defense and fiber-related developmental response, which probably better facilitates these processes. Instead, NPR1 and BOP1 were identified earlier in G. hirsutum and their role in defense was identified, yet information about their isoforms and the functions of other members of the family is still elusive [23,39].
Perusal of the phylogenetic tree and the exon-intron structural organization suggested that the putative NPR1-like genes of all four cotton species, Arabidopsis and other known species (dicot, monocot, and lower group plants), were clustered in three different clades/groups (Figures 1-3). As previously reported in Arabidopsis, members of clade I and II NPR proteins are implicated in activating or repressing the SAR signaling response, respectively, whereas clade III NPR members are mainly involved in leaf and flower development [41,[43][44][45]. Identified NPR1 proteinds in Gossypium species have belonged to clade I, whereas NPR3 and NPR4 were clustered in clade II, and NPR5 and NPR6 belong to clade III. Those Gossypium species NPR proteins that fall in the same clade could function in a similar manner. Notably, NPR2 was not identified in either allotetraploid or diploid cotton, suggesting that it might be removed in the course of evolution. Further, phylogenetic analysis suggested that NPR1-like homologs probably originated from the bryophytes, as no evidence of these proteins exits in the algal genome, which is consistent with the previous findings [25]. We also identified several NPR1-like genes collinearity blocks among G. hirsutum (At and Dt-genome), G. barbadense (At and Dt-genome), G. raimondii (D-genome), and G. arboreum (A-genome) (Figure 4), revealing that segmental duplication occurred during expansion and evolution of NPR1-like genes in cotton. NPR1-like proteins comprise two essential domains mediating protein-protein interaction, the N-terminus BTB/POZ, and the central region ankyrin repeat domain, which are well conserved in all cotton species in this study. In contrast, the essential features characteristic of defense-related NPR1-like C-terminal domain were exclusively present in the group members of NPR1 to NPR4. Another DUF domain was also revealed in NPR-like proteins by CDD analysis, which is required to explore its function experimentally.
GhNPR1A/D homologs possess a phosphorylation site at S11/S15 and a dephosphorylation site at S55/S59, which is similar to the distinguished features of AtNPR1/2 and orchestrates salicylic acid induction for the regulation of PR1 gene expression and proteasome-mediated turnover of NPR1 ( Figure 5) [10,46]. SA induction turns oligomeric NPR1 into monomeric NPR1, which leads to nuclear localization of the NPR1 protein [37,47]. In normal conditions, cysteine residues (marked as • in Figure 5) at C82 (conserved in all cotton species) and C216 (present only in GhNPR1, GhNPR3.1, and GhNPR4) allow the NPR1 to exist in an oligomer form in the cytoplasm [36,48]. However, we also found some deviations from the conserved amino acid of Arabidopsis in cotton species, for example, cotton GhNPR1A/D proteins do not harbor the transactivation domain of Arabidopsis NPR1 at C521, but rather possess C529, which is essential for coactivator function and required for the binding of salicylic acid [7,9]. In addition, highly conserved arginine residues present in the penta-amino acid motif (LENRV), which are important for SA sensitivity in AtNPR1, AtNPR3, and AtNPR4, are also present in cotton NPR1, NPR3, and NPR4 [45]. Phosphorylated monomeric AtNPR1 at S589/T373 by SnRK2.8 which is significant for nuclear import, is well conserved in all cotton species and Arabidopsis NPR proteins [49]. The NLS motif is responsible for nuclear localization of AtNPR1/2, which is highly conserved in cotton NPR1A/1D, and a few basic amino acids (marked as ∆ in Figure 5), and is also similar in the cotton NPR3/4 group of NPR proteins. However, the C-terminal of cotton NPR5/6 lacks certain indispensable properties of defense-related NPR1-like proteins, for instance, a clear bipartite NLS and the motif for NIMIN1/2 binding [37,50].
Distinct cis-elements present in the promoter or upstream regions of genes, which are binding sites for transcription factors, are also key factors in gene regulation [51][52][53]. The density of specific or diverse cis-elements present in promoter regions provide indications for the tissue-specific or stress-responsive expression patterns in diverse challenging environmental conditions [53][54][55][56]. Accumulating evidence has shown that the promoters of pathogen-or SA-inducible genes comprise a higher number of defense-related cis-elements [35,57,58]. By analogy, the promoters of development or tissue-specific genes comprise a higher number of cis-elements, accordingly [53,54]. In silico cis-element analysis suggested a higher enrichment of defense related cis-elements in the GhNPR1 to GhNPR4 group, whereas development-related motifs were enriched in the GhNRP3 to GhNPR6 group, suggesting that the putative role of the cotton NPR family is conserved throughout evolution in these important biological processes [11,[40][41][42]. Chen et al., 2019, revealed that NPR1 protein positively regulates its expression by binding to its own promoter, and communicating with RNA polymerase II. AtNPR1 directly interacts and recruits cyclin-dependent kinase 8, which phosphorylates the RNA polymerase II C-terminal domain [59,60]. Similarly, a motif occurring in the AtNPR1 promoter GhNPR1A/D contains several defense-regulated cis-elements, like W-box or Asf1 motifs for the binding of WRKY or TGA proteins, respectively, suggesting that GhNPR1A/D probably regulates its own expression.
In addition to the cotton seed, which is used for oil and protein, the natural cotton fiber is also a valuable resource as a raw material in industrial textile manufacturing, and is thereby a major contributor to a state's economy. The in silico expression profiling of nine different tissues revealed that GhNPR3A.1/D.1, GhNPR4A/D, and GhNPR6 have higher expression levels in the developmental tissue, relative to the GhNPR1A/D. Previously, several studies also reported the NPR1-like genes family expression pattern in various tissues at different levels [25,26,35,38,39,[61][62][63]. The higher expression level of GhNPR3A.2/D.2, GhNPR5A/D, and GhNPR6A/D showed in fiber cell initiation and elongation stages (0-10 DPA), whereas GhNPR3A.2/D.2 and GhNPR6A/D have higher expression levels during the rapid elongation stage (10)(11)(12)(13)(14)(15)(16)(17)(18)(19)(20). Interestingly, in silico expression profiling in the fiber reasonably correlated with qRT-PCR and analysis of the promoter cis-elements for GhNPR3A.2/D.2, GhNPR5A/D, and GhNPR6A/D, indicating that these genes are preferentially expressed in fiber and could potentially function for the initiation and elongation stages of fiber development. Additionally, for the proper development of the plant, its resistance/tolerance should be strong enough for a different kind of biotic or abiotic stress. NPR genes are mostly involved in the defense signaling pathway and their expression patterns are also variable at different time periods of SA induction or pathogen infestation [8,25,26,35,40]. The qRT-PCR analysis suggested that GhNPR1D, GhNPR3A.1/D.1, and GhNPR4A/D were significantly activated after SA exogenous treatment. Although Zhang et al., 2008, emphasized GhNPR1 expression in SA induction at different time intervals, no information was provided about SA's effect on GhNPR1 homologs and their functions in fiber development [23]. Notably, GhNPR3A.2/D.2, GhNPR5A/D, and GhNPR6A/D were not expressed after exogenous SA treatment, which suggests no probable significant contribution in the defense pathway, and conversely showed participation in fiber development. Interestingly, a lower number of defense-regulated cis-elements were present in the upstream region of GhNPR3A.2/D.2, GhNPR5A/D, and GhNPR6A/D genes.
In conclusion, a set of homologs of cotton NPR1-like genes were identified in G. hirsutum, including six genes in the A-subgenome and six genes in the D-subgenome. Based on the gene and protein structural characteristics and comparison with homologs from other cotton and plant species, the NPR-like proteins were grouped into three different clades, signifying evolutionary conservation and functional divergence. Consequently, in silico and quantitative expression profile analysis of GhNPR genes suggest the probable functions of different aspects of G. hirsutum for biotic stress tolerance and various fiber development stages. Therefore, the present study could be a useful resource in using the cotton NPR1-like gene family for the future challenges, such as developing varieties for pathogen tolerance and higher fiber yield.

Plant Materials
G. hirsutum was grown in a soil mixture under long-day conditions (16/8 day/night cycle at 28 • C ± 2 • C, 1600 µmol m −2 sec −1 ) in a controlled glasshouse environment. Arabidopsis thaliana seeds (Col-0 ecotype) were stratified on Soilrite for four days at 4 • C, after which seeds were allowed to grow under controlled environmental conditions (22 • C ± 1 • C, 120 µmol m −2 sec −1 , 16/8 day/night cycle). Leaves from G. hirsutum and Arabidopsis plants 8 weeks and 4 weeks post-sowing were taken for study, respectively.

Protein Structure Analysis and Domain Distribution
Multiple sequence alignment was performed using identified putative cotton NPR and Arabidopsis NPR proteins with Clustal-X version 2 [64]. The locations of conserved BTB/POZ, ankyrin repeat domain, transactivation domain, and nuclear localization sequences (NLS) were determined using the Conserved Domain Database [37,65]. The predicted NPR1-like sequences were further confirmed for the presence of the conserved domain of NPR family members-BTB/POZ and ankyrin repeat domain through SMART and pfam databases. Percentages of amino acid identity and similarity between Arabidopsis and cotton sequences were identified by BioEdit software.

Physical Property Analysis of Cotton NPR Genes
The parameters of the identified putative cotton NPR genes-deduced amino acid length, molecular weight, theoretical pI, and grand average of hydropathy (GRAVY)-were obtained from the CottonFGD.

Analysis of Gene Structure and Chromosomal Localization
The chromosomal localization of putative G. hirsutum NPR genes in A-and D-subgenomes to particular chromosomes, distinctly, was marked using Mapchart 2.3 software. For the gene structure analysis, coordinates of exon-intron for cotton and Arabidopsis NPR1 gene families were diagrammatically represented by the Gene Structure Display Server 2.0 [66].

Phylogenetic and Synteny Analysis
The protein sequences of NPR proteins from different plant species were retrieved and their multiple sequence alignment (MSA) were carried out with identified cotton NPR proteins using Clustal-X version 2 with default parameters. Subsequently, the obtained aligned protein sequences were used as inputs to construct unrooted phylogenetic trees following the neighbor-joining method with a Jones-Taylor-Thornton model and pairwise gap deletion by MEGA-X software with 1000 bootstrap replicates [67]. Other NPR1-like gene family homologs from different plant species were identified by executing BLASTP search or from published research articles [24,25,27,28] by taking protein sequences of 6 Arabidopsis NPR1 family members as a query for generation of a phylogenetic tree. MCScanX software was used for the syntenic analysis with the default parameters [68]. Circos was used to plot the segmental duplication events diagram on chromosomes [69].

Conserved Cis-Element Analysis in the Promoter
Sequences from the 2 kb upstream region of translation sites were retrieved for GhNPR genes from CottonFGD and AtNPR genes through the TAIR database for identification of the cis-regulatory elements. The promoter cis-regulatory elements present in the NPR1-like genes were identified by PLACE and PlantCare [70,71].

Analysis of RNA-Seq Data for Putative GhNPR Genes in Different Tissues and Fiber Development Stages
The RNA-Seq data of G. hirsutum TM1 in different tissues and fiber development stages were downloaded from NCBI (PRJNA248163). The obtained raw reads were quality filtered and further aligned to the reference genome of G. hirsutum using TopHat2 with default parameters [72]. The aligned reads' fragments per kilobase million (FPKM) values were identified and heat maps were generated by Multi Experiment Viewer (MEV V.4.9.0) [73,74].

RNA Extraction and Real-Time PCR Analysis
To evaluate the putative GhNPR gene expression in the different fiber development stages, the flower was tagged on the day of flowering. The fiber tissues were harvested at 0 DPA (ovule), 5 DPA, 10 DPA, 15 DPA, 20 DPA, and 25 DPA. Similarly, samples were also collected for RNA isolation from different tissues: young leaf (top leaves from 8-week-old plants), mature leaf (lowest leaves from 8-week-old plants), stem (8-week-old plants), flower (10-week-old plants), and root (8-week-old plants) of G. hirsutum. For the induction of salicylic acid, 2 mM of salicylic acid was sprayed on G. hirsutum leaf tissue and harvested at 12 h and 24 h after induction for RNA extraction. Similarly, 2 mM of salicylic acid was sprayed and Arabidopsis leaf tissue was harvested after 12 h from 30 DAS (days after sowing) for RNA extraction [75]. RNA was isolated by the Spectrum Plant Total RNA Kit (Sigma-Aldrich). For the removal of DNA contamination, approx. 10 µg of isolated total RNA was treated with DNase1 (Turbo DNA-free, Ambion). The 2 µg of DNase1-treated RNA was used to make cDNA by SuperScript II RT-kit (Invitrogen, Life technologies). The quantitative reaction was executed on the 7500 Fast Real-Time PCR system (Applied Biosystem) by a fast SYBR Green master mix. Three biological and two technical replicates were taken for each experiment, and the 2 -∆∆CT method was used for analyzing the relative expression of target genes. For the validation of all genes, gene-specific primers (forward and reverse) were made by Primer Express 3.0.1 software (Applied Biosystems) (Table S1). Similarly, Real-time PCR analysis was also done for AtNPR genes in leaves of water-and salicylic acid-treated Arabidopsis thaliana by using the same procedure described above. Internal control genes like GhUbiquitin4 [76] and AtUbiquitin10 (At4g05320) [51] were used for normalizing the expression level of target genes in GhNPR and AtNPR genes, respectively.