Evolution and Identification of the WRKY Gene Family in Quinoa (Chenopodium quinoa)

The WRKY gene family plays a unique role in plant stress tolerance. Quinoa is a cultivated crop worldwide that is known for its high stress tolerance. The WRKY gene family in quinoa has not yet been studied. Using a genome-wide search method, we identified 1226 WRKY genes in 15 plant species, seven animal species, and seven fungi species. WRKY proteins were not found in animal species and five fungi species, but were, however, widespread in land plants. A total of 92 CqWRKY genes were identified in quinoa. Based on the phylogenetic analysis, these CqWRKY genes were classified into three groups. The CqWRKY proteins have a highly conserved heptapeptide WRKYGQK with 15 conserved elements. Furthermore, a total of 25 CqWRKY genes were involved in the co-expression pathway of organ development and osmotic stress. The expression level of more than half of these CqWRKY genes showed significant variation under salt or drought stress. This study reports, for the first time, the findings of the CqWRKY gene family in quinoa at the genome-wide level. This information will be beneficial for our understanding of the molecular mechanisms of stress tolerance in crops, such as quinoa.


Introduction
The WRKY gene family is an important transcription factor, playing a unique regulatory role in plants' defense responses to abiotic and biotic stresses. They can positively or negatively regulate the expression of other genes to increase the stress tolerance of plants [1]. In wild diploid woodland strawberry, the expression of FvWRKY42 was induced and interacted with different stress-response proteins under powdery mildew, salt, drought, salicylic acid (SA), methyl jasmonate, abscisic acid (ABA) and ethylene (ET) treatment [2]. The overexpression of FvWRKY42 in Arabidopsis resulted in the enhanced powdery mildew resistance, salt and drought stress tolerance [2]. Previous study also found that VvWRKY30 in grape had a positive effect on salt stress and transgenic Arabidopsis had super salt stress resistance by regulating a series of glycol-metabolism dependent genes [3]. ZmWRKY40 and ZmWRKY106 were induced in maize by salt, drought, ABA and high temperatures. Overexpression of ZmWRKY40 regulates the diverse stress-related genes, including STZ, DREB2B and RD29A in Arabidopsis and rice, to improve drought stress tolerance, and overexpression of ZmWRKY106 regulates ABA-signaling pathway related genes to improve the drought and heat tolerance compared to wild type, respectively [4,5]. What is more, GmWRKY27 can interact with GmMYB174 and also suppresses GmNAC29 expression. Overexpression and knockout of GmWRKY27 demonstrated that it played crucial roles in response to salt and drought stress in soybean [6]. The WRKY transcription factor also plays vital roles in regulating grain yield in transgenic crops. For example, Gao et al., reported that overexpression of TaWRKY2 could significantly enhanced grain yield in transgenic wheat, which had longer panicle length and more kernels per spike contrast to wild type [7]. In rice, knockout of OsWRKY47 displayed lower drought tolerance and reduced yield, while overexpressing of OsWRKY47 increased the drought tolerance [8]. The tiller number, grain weight and P concentration were also increased in OsWRKY74 overexpressing plants in contrast to WT under P-deficient stress [9].
Generally, the WRKY proteins possess 60 amino acid domains known as WRKY domains. These domains, together with the zinc-finger motif (Cys2-His2) at the C-terminal, act as a specific DNA-binding peptide sequence WRKYGQK at the N-terminal. The WRKY domains interact specifically with a DNA motif termed the W-box (TGACC(A/T)) or SURE (cis-responsive element) in the target gene promoter, then regulates their biotic and abiotic stress responses [10,11]. Based on the sequence features of the zinc-finger motif and number of WRKY domains, WRKY genes were further classed into three groups: WRKY I, II, and III. Members of group I have the C2H2 (C-X4-C-X22-23-HXH) zinc finger motif and two WRKY domains. Members of group II have a similar zinc finger motif to group I, but only have one WRKY domain. All members of group III contain the C2CH (C-X7-C-X23-HXC) motif and one WRKY domain [12,13]. Group II can be further divided into five subgroups: II-a, II-b, II-c, II-d, and II-e. The WRKY gene families play a unique role in regulating developmental and multiple physiological processes. For instance, in Arabidopsis, the suppression of WRKY75 expression, and lateral root number and root length were increased significantly when under abiotic stresses [14]. Similarly, the group I WRKY gene, GhWRKY25 was cloned from cotton, and the expression level of GhWRKY25 was reduced under abiotic stresses. Overexpression of GhWRKY25 not only enhanced sensitivity to fungal pathogen B. cinerea in Nicotiana benthamiana by reducing the expression of SA or ET signaling related genes, but also reduced tolerance to stress caused by drought [15]. The expression level of VfWRKY1 and VfWRKY2 was significantly increased in various plant organs of faba bean when under drought or salt stress [16]. Furthermore, CsWRKY2 in tea (Camellia sinensis L.) was up-regulated by cold or drought stress, which suggested that CsWRKY2 plays an important role in the response to cold or drought stress by participating in the ABA signaling pathway [17].
WRKY II and III proteins play important roles in leaf senescence and plant stress response. Two WRKY transcription factors, WRKY6 and WRKY11, have been shown to play a role in leaf senescence of Arabidopsis [14]. In comparison, WRKY III members, WRKY54 and WRKY70, act as negative regulators of leaf senescence [18]. Overexpression of MtWRKY76 in cotton in rice have also been reported to enhance their salt and drought tolerance [19]. Furthermore, WRKY III proteins can bind to cis elements in the promoter regions of other genes to enhance their expression in response to pathogenic stresses [20].
Quinoa (Chenopodium quinoa Wild.) is one of the world's oldest cultivated crops. It was first domesticated as the staple food in South America more than 7000 years ago [21,22]. The grains of quinoa are high in nutritional value for humans and animals. The protein, total fat and dietary fiber contents ranged from 9.1 g to 15.7 g, 4.0 g to 7.6 g, 8.8 g to 14.1 g per 100 g of fresh quinoa grain, respectively [23,24].
The sequence of the plant whole genome is useful to identify gene families, and WRKY proteins have been widely identified in diverse plant species [25,26]. There are 102 WRKY genes found in rice, as well as 32 in broomcorn millets and 77 peanut strains by using HMMER search program [1, 6,27]. However, to date, the role/function of the WRKY gene family is poorly understood in quinoa. Here, we systematically characterized the WRKY gene family in quinoa by genome-wide search, and identifiedthe evolutionary relationship of CqWRKY with other plant WRKYs. The phylogeny evolutionary relationships, conserved motifs, subgroup classification, regulatory network, and the expression patterns of WRKY proteins were analyzed. These results will be beneficial for our understanding of the molecular mechanisms of stress tolerance in crops such as quinoa. These results will not only provide the potential candidate for further functional analysis, but also be beneficial for our understanding of the molecular mechanisms of stress tolerance in quinoa and beyond.

Identification of WRKY Gene Family and Chromosomal Localizations
Animal and fungi protein sequences were downloaded from NCBI, plant protein sequences were downloaded from Ensembl plants (http://plants.ensembl.org/index.html), thequinoa genome and protein sequences were downloaded from the Phytozome database (http://www.phytozome.net/) and NCBI, and the WRKY protein sequences were screened to identify the WRKY genes [21,28]. These protein sequences were firstly used to create the local database. The database was then searched against known WRKY protein sequences collected from Arabidopsis. We used the local BLASTP program (https://blast.ncbi.nlm.nih.gov) with an E-value cut-off <1e -5 in our database search [29]. After manual curation, the hidden Markov model (HMM) profile of the conserved WRKY domain (PF03106) sequences were obtained from the PFAM 31.0 database(http://pfam.xfam.org/) and the HMM 3.2 software (http://hmmer.org/) was used to search all WRKY proteins from the 15-plant species, 7 fungi species, and 7 animal species [30]. Redundant sequences and incomplete residual sequences were removed from protein sequences containing complete WRKY domains by using DNAMAN 5.0 (LynnonBioSoft, Quebec, Canada). The molecular weight, amino acid length, and isoelectric point were computed for each WRKY in quinoa by using the ExPASy online database. Cello v2.5 software, pLoc-mPlant (http://www.jci-bioinfo.cn/pLoc-mPlant/) and PSORT (https://psort.hgc.jp/) were used to predict the subcellular localization of these CqWRKY proteins. All parameters were set as the default. The genome locations were mapped on scaffolds and chromosomes of the quinoa genome by using the location BLASTN + 2.8.1 program (NCBI, Bethesda, MD, USA) with the E-value <10−5 and the best hits were obtained [21].

Phylogenetic Analysis, Gene Structure, Protein Conserved Motifs Identification and Gene Duplication
The ClutsalW 1.83 program was used for multiple protein sequence alignments between the plants, fungi, and animals. MEGA6.0 (Phoenix, AZ, USA) was used to construct the phylogenetic tree with the neighbor-joining (NJ) method with 1000 bootstrap replications.
In quinoa, conserved motifs of CqWRKYs were investigated using MEME (Multiple EM for Motif Elicitation, http://meme-suite.org/tools/meme) with the following parameter setup: number of repeat elements was set to any, maximum motifs were set to 15, and the best motif widths were set to 6-200 residues [31]. The gene structure information including intron, exon, and genome location of the CqWRKY genes were obtained from the quinoa genome database and displayed in the GSDS (Gene Structure Display Server, http://gsds.cbi.pku.edu.cn/) [32]. Gene duplication events of WRKY genes in quinoa were investigated as described by Wang et al. By using three criteria: the alignment covered >80% of the longer gene, the aligned region had an identity >80% and the tightly linked genes was counted as one duplication event [33]. Then, the duplicated regions were visualized by the Circos tool (http://circos.ca/) in the quinoa genome.

Expression Profile Analysis in Various Tissues and under Different Abiotic Stresses of TaHDZ Genes
Publicly available quinoa RNA-Seq datasets were obtained from the SRA (Sequence Read Archive, https://www.ncbi.nlm.nih.gov/sra). These data were analyzed for the expression patterns of the identified CqWRKY genes. Materials from dry seed (SRS2464892), 1-week seedling (SRS2434487, SRS2464890), leaf (SRS4026093, SRS4026094, SRS4026095), stem (SRS2464891), and inflorescence (SRS2464889) were used to identify the tissue specific expression profiles. Moreover, salt (SRS2458686, SRS2458681, SRS2458685) and drought (SRS1204205) stress treatments were used to examine the stress response genes. The expression level was normalized and log10-transformed were used for producing the heat map using the heatmap Development Package in R software 3.5.1 (https: //www.r-project.org/) [36].

Global Identification and Evolution of WRKY Proteins from Eukaryotes to Plants
To understand the evolutionary history and relationship of WRKY, we identified 1226 WRKY genes using profile HMM searches and BLASTP of 15 representative plant species, 7 animal species, and 7 fungi species (Table A1, Supplementary Material 1). The results showed that WRKY proteins were not identified in the 7 animal species including flatworms, molluscs, horsehair worm, fish, amphibians, and reptiles. Furthermore, no WRKY werefound from five of the fungi species. However, five and one WRKY proteins were identified in Rhizopus azygosporus and Dictyostelium discoideum, respectively. Five WRKY proteins from Rhizopus azygosporus had one WRKY domain, and one WRKY protein from Dictyostelium discoideum had two WRKY domains. This phenomenon was also found in Giardia lamblia [37]. A total of 15 plant species including three dicotyledon species, two gymnosperm species, five monocotyledon species, one pteridophyta species, two bryophytes species, and two algae, were used to investigate WRKY. The results showed that WRKY were widespread in land plants. Among them, angiosperm plants showed the most abundant WRKY genes, followed by gymnosperm species, pteridophyta species, and bryophytes species. In particular, a total of 296 WRKY genes were found in wheat. This difference between these species of mainly the WRKY gene family did expand with wheat polyploidization and genome evolution. The number of WRKY genes in algae was particularly low when compared to that of other plant species. Interestingly, the evolution of Chlamydomonas reinhardtii was before the divergence of land plants, so we found that only two WRKY proteins were identified and classed into group I. At the same time, two WRKY proteins were found in Volvox carteri (Figure 1). This phenomenon may have happened due to algae with only a single copy, so the WRKY gene duplication events occurred during the evolution from lower to higher plants [37].
Additionally, the number of WRKY in each subgroup of these species was investigated ( Figures A1-A9). In monocotyledon species, group III was the larger subgroup except for quinoa while group II-c was the larger subgroup in dicotyledon and gymnosperm species except for Solanum tuberosum. Group II-a was present only in angiosperm and gymnosperm, but not from ancient plants, and group III arose early in the evolution of pteridophyta plants and was conserved across angiosperm, gymnosperm, and pteridophyta ( Figure 1).

Identification of WRKY Genes in Quinoa
A total of 92 non-redundant genes containing the complete WRKY domains were identified. These protein sequences were named as CqWRKY1A to CqWRKY58A and regarded as the putative quinoa WRKY genes. These WRKY genes had large compositional differences. The molecular weights of these genes ranged from 21.8 kDa to 91.0 kDa, their length from 191 to 639 amino acids, and the isoelectric points were from 5.3 to 9.8. Among these proteins, 86 CqWRKY were located in the nuclear, only CqWRKY30A-2 and CqWRKY51A-1 in the extracellular, CqWRKY47B-1/2 in the peroxisome, CqWRKY54 in the chloroplast and CqWRKY58A in the cytoplasm, respectively. 92 CqWRKY genes were unevenly distributed on all the 18 quinoa chromosomes, of which chromosome 7A contained the most CqWRKY genes with the number of 18, followed by 1B with the number of 16, then 10B with the number of 11, while the chromosome 3B had not CqWRKY gene. In total, 38 and 40 CqWRKY were located on the A and B sub-genome, respectively (Table A2).

Multiple Sequence Alignment of CqWRKY
All CqWRKY proteins were classified into three groups. Group I was comprised of 16 CqWRKY proteins, each containing two WRKY domains and C2H2 zinc-finger motif. Group II was comprised of 62 proteins, each with one WRKY domain and a similar zinc-finger motif. Group III was comprised of 14 proteins, each containing only one WRKY domain and a C2CH zinc-binding motif. Seven CqWRKY proteins had a difference by a single amino acid in the WRKY domain. Four CqWRKY proteins including CqWRKY30A-1 and CqWRKY30A-2 of group II and CqWRKY51A-1 and CqWRKY51A-3 of group III contained the common variant sequence WRKYGKK. Three CqWRKY proteins, CqWRKY44A, CqWRKY45B, and CqWRKY51A-2, had the less common variant sequence WRKYGEK. The remaining 85 CqWRKY proteins had the highly conserved sequence WRKYGQK ( Figure A10).

Phylogenetic Analysis of CqWRKY Genes
A previous study reported that the WRKY transcription factor family had an early origin in eukaryotes, where this ancestral gene seems to have duplicated many times during the evolution of plants, resulting in a large gene family for WRKY proteins. However, the WRKY gene family was highly conserved during the evolution process in plants, regardless of being in monocots or eudicots, which could be classified into three groups according to the WRKY motif and zinc-finger sequence [37]. In order to explore the phylogenetic relationships of the WRKY gene family in quinoa, the 92 CqWRKY proteins of quinoa and 72 available from Arabidopsis were selected for phylogenetic analysis. Using the same classification criteria as in Arabidopsis, the WRKY proteins of quinoa were classified into three groups, I, II, and III, containing 16, 62, and 14 CqWRKY proteins, respectively. Group II was further classified into five sub-groups, from II a-e, containing 4, 12, 22, 17, and 7 CqWRKY proteins, respectively ( Figure 2). Group II proteins were the most abundant type in quinoa, accounting for 67.4% of all CqWRKY proteins. It was similar to that of Arabidopsis with a percentage of 65.2%. This phenomenon was also found in peanut and sesame [6,38]. However, group III was the largest group of the WRKY gene family in broomcorn millet and wheat, which comprised about 50% and 41%, respectively [1,39]. Generally, group-III members accounted for 20% of the WRKY gene family in higher plants [40]. The WRKY genes from quinoa and Arabidopsis showed an interspersed distribution in groups I-III suggesting that the expansions of WRKY occurred before the divergence of the two species. Furthermore, group I members of the WRKY genes are the most ancient, with loss or gain of the N-terminal domain during the evolution process, and group II/III evolved late in land plants that contained only the C-terminal domain [41]. Furthermore, the number of CqWRKY proteins in group II-c was higher than the other subgroups. Group II-c was shown to be relatives of group I in quinoa, and this phenomenon was also found in wheat [42].
sequence [37]. In order to explore the phylogenetic relationships of the WRKY gene family in 214 quinoa, the 92 CqWRKY proteins of quinoa and 72 available from Arabidopsis were selected for 215 phylogenetic analysis. Using the same classification criteria as in Arabidopsis, the WRKY proteins of 216 quinoa were classified into three groups, I, II, and III, containing 16, 62, and 14 CqWRKY proteins, 217 respectively. Group II was further classified into five sub-groups, from II a-e, containing 4, 12, 22, 218 17, and 7 CqWRKY proteins, respectively ( Figure 2). Group II proteins were the most abundant 219 type in quinoa, accounting for 67.4% of all CqWRKY proteins. It was similar to that of Arabidopsis 220 with a percentage of 65.2%. This phenomenon was also found in peanut and sesame [6,38].

221
However, group III was the largest group of the WRKY gene family in broomcorn millet and wheat, 222 which comprised about 50% and 41%, respectively [1,39]. Generally, group-III members accounted 223 for 20% of the WRKY gene family in higher plants [40]. The WRKY genes from quinoa and

224
Arabidopsis showed an interspersed distribution in groups I-III suggesting that the expansions of 225 WRKY occurred before the divergence of the two species. Furthermore, group I members of the 226 WRKY genes are the most ancient, with loss or gain of the N-terminal domain during the evolution 227 process, and group II/III evolved late in land plants that contained only the C-terminal domain [41].

228
Furthermore, the number of CqWRKY proteins in group II-c was higher than the other subgroups.

229
Group II-c was shown to be relatives of group I in quinoa, and this phenomenon was also found in 230 wheat [42].

Conserved Motifs and Protein Structure Analysis
The conserved motifs of CqWRKY were examinedby using the MEME program. We identified 15 conserved motifs (Figure 3). The identified amino acids length of CqWRKY motifs ranged from 8 to 50. All sequence details of the conserved motifs are shown in Figure A11. The number of motifs was different in those proteins, ranging from two to eight. The results indicated that motif 1 was found in the WRKY domain, and motif 2 was defined as the zinc-finger domain. Similar motif compositions were found in the same group of CqWRKY proteins. Motif 7 was found in the group II-a and II-b subgroups and motif 13 in group III. In general, the location of introns and exons in the genome can provide important evidence for the evolutionary relationships of quinoa. The quinoa genome database was used to obtain the intron and exon distribution of CqWRKY. Results showed that the number of exons in CqWRKYs ranged from 2-6, and most contained 5 exons (Figure 2), while 88% of OsWRKY genes contained 2-6exons in rice. Among these genes, 47 OsWRKY genes (48%) contained 3exons [43]. For members of group II, the II-a and II-b subgroups contained 3-6 exons, group II-c contained 2 or 3 exons, and subgroups II-d and II-e contained 3 exons. All members of group III contained 3-4 exons (Figure 3). Similar to rice, 3 exon genes were the most common, but all of the group I OsWRKY genes contained more than 3 exons. Interestingly, it was surprising that three OsWRKY genes contained 1 exon and one OsWRKY gene contained 20 exons. In summary, this was the first time that the exons distributed in the CqWRKY were explored, which provide valuable information for the study of the evolutionary processes and expansion of the WRKY gene family in quinoa and other species. genome can provide important evidence for the evolutionary relationships of quinoa. The quinoa 242 genome database was used to obtain the intron and exon distribution of CqWRKY. Results showed 243 that the number of exons in CqWRKYs ranged from 2-6, and most contained 5 exons (Figure 2), 244 while 88% of OsWRKY genes contained 2-6exons in rice. Among these genes, 47 OsWRKY genes 245 (48%) contained 3exons [43]. For members of group II, the II-a and II-b subgroups contained 3-6 246 exons, group II-c contained 2 or 3 exons, and subgroups II-d and II-e contained 3 exons. All 247 members of group III contained 3-4 exons (Figure 3). Similar to rice, 3 exon genes were the most 248 common, but all of the group I OsWRKY genes contained more than 3 exons. Interestingly, it was 249 surprising that three OsWRKY genes contained 1 exon and one OsWRKY gene contained 20 exons.

250
In summary, this was the first time that the exons distributed in the CqWRKY were explored, which 251 provide valuable information for the study of the evolutionary processes and expansion of the

Gene Duplication Analysis of CqWRKY
Gene duplication, arising from polyploidization or during tandem and segmental duplication associated with replication, is a major factor causing gene family expansion [33,44]. In this study, 2 CqWRKY genes pairs, including CqWRKY51-1/2/3 and CqWRKY39B/40B/49B, were found to have three copies. 37 CqWRKY genes pairs contains two copies in the A and B homoeologous chromosome. Among them, the two copies of 25 genes pairs were existed together on the same chromosome. Just one copy of remaining 12 CqWRKY genes was identified in quinoa chromosomes by sequence similarity and chromosome localization (Figure 4). These results suggested gene loss may also happen in quinoa WRKY gene family, causing loss of some homologous copies.

Interaction Network between CqWRKY Genes and Other Genes in Quinoa
In order to understand the interaction relationship between CqWRKY and other genes, the co-expression network was created by an orthology-based method. We found 25 CqWRKY genes involved in 193 related genes of network interactions, including MYB, ZAT, NAC, ERF and WRKY gene family members ( Figure 5). These results suggest that CqWRKY proteins maybe involved in a wide range of regulatory and stress related traits in quinoa. Previous studies have suggested that lateral organ boundaries domain (LBD) proteins are plant-specific transcription factors with a highly conserved LOB domain and play important roles in plant growth and development [45]. For example, LBD37, LBD38, and LBD39 not only act as novel repressors of anthocyanin biosynthesis and N availability signals, but also take part in metabolic regulation [46]. In the present study, CqWRKY1A was found to interact with LBD26 and other growth-related IDD2 genes. Furthermore, CqWRKYs were also involved in the response to abiotic stresses. For instance, CqWRKY51A, CqWRKY26B, and CqWRKY7A were found to interact with many quinoa stress-responsive genes including NAC102, ERF15, and MYB86/108 [47][48][49].

256
Gene duplication, arising from polyploidization or during tandem and segmental duplication 257 associated with replication, is a major factor causing gene family expansion [33,44]. In this study, 2

268
In order to understand the interaction relationship between CqWRKY and other genes, the 269 co-expression network was created by an orthology-based method. We found 25 CqWRKY genes and N availability signals, but also take part in metabolic regulation [46]. In the present study, CqWRKY26B, and CqWRKY7A were found to interact with many quinoa stress-responsive genes 280 including NAC102, ERF15, and MYB86/108 [47][48][49].

Expression Profile Analysis of CqWRKY Genes
In order to discover the tissue specificity and stress response genes in quinoa, the expression pattern of 92 CqWRKY genes were analyzed. RNA-seq data of various tissues including seedlings, stems, leaves, inflorescences, and seeds, and salt or drought stress were downloaded from the SRA database. The FPKM values were calculated by Hisat v2.0.4 (http://ccb.jhu.edu/software/ hisat2/index.shtml) and Tophat v2.1.1 software (http://ccb.jhu.edu/software/tophat/index.shtml). Almost a quarter of the CqWRKY genes had no significant differences. Others have shown that clear tissue-specific expression. The CqWRKY gene family is involved in a wide range of growth and development processes in quinoa. Most CqWRKY shared a similar expression pattern between different copies in the A and B genome. CqWRKY19A-1/2 in group III had the lowest expression levels in mature seeds, while the strongest expressions of these genes were detected in seedlings, stems, leaves, and inflorescences. In addition, the strongest expressions of CqWRKY9B-1/2 were found in seedling and leaf, but showed low expression in other tissues. However, different pattern of CqWRKY10B expression was found between two copies in the B genome. CqWRKY10B-1 showed strongest expression in 1-week seedling, while CqWRKY10B-2 has the strongest expression in stem. These genes could be used as candidate genes for further functional studies ( Figure 6A).
To study the stress response of CqWRKY, the expression level of each CqWRKY under salt or drought stress were analyzed. Almost half of the genes significantly induced expression under salt or drought stress, suggesting that these genes play an important role in response to abiotic stress. The expression levels of CqWRKY52A-1/2 was similar in group II-d under salt stress was significantly higher than that of the control condition. The different copies of CqWRKY genes have different expression patterns in quinoa under abiotic stress. In particular, CqWRKY21A-1 and CqWRKY56A-2 had the highest expression levels under salt stress compared to normal conditions, while CqWRKY21B-1 and CqWRKY56A-1 had no significant expression difference, respectively ( Figure 6B).

Discussion
The WRKY gene family plays vital roles in developmental process, diverse defense and abiotic stress responses [3,25,26]. Besides, WRKY proteins were not found in animal species and almost fungi species, while being widespread in land plants ranging from algae to angiosperm also make them fascinating candidates for the evolution of organism. The complex related features and biological functions of the WRKY gene family in Arabidopsis and rice have been extensively identified. Nevertheless, no data set of WRKY is available for quinoa. The WRKY gene family has been identified in 15 plant species by genome sequencing. We found that lower plants had smaller numbers of WRKY genes when compared to higher plants. Groups II-a and III were present in higher plants but not in ancient plants. This suggests an early evolution origin of these two groups in land plants. There were specific WRKY domain loss events in the evolution of WRKY from lower plants to higher plants. Groups II and III mainly regulate leaf senescence and responses to environmental stress. WRKY gene evolution might increase environment adaptability in higher plants.
In this study, using WRKY transcript factors of Arabidopsis as reference genes, we identified 92 CqWRKY genes in quinoa. The size of the WRKY gene family in quinoa was much higher than that of Cucumis sativus (57 members), grapevine (59 members), and Hevea brasiliensis (81 members), but lower than that of Populus trichocarpa (104 members), and rice (102 members) [27,[50][51][52][53]. The abundance of transcription factor has been found to be largely dependent on sequence duplications during genome evolution [44]. The relatively high number of CqWRKY genes in quinoa indicates that the duplication events may have occurred during the genome evolution. It can be hypothesized that the presence of most WRKY genes in quinoa genome may reveal the specific requirements of these WRKY genes to be involved in the complicated mechanism of transcriptional regulation.
Additionally, 16, 62, and 14CqWRKY proteins were clustered into groups I, II, and III, respectively. Phylogenetic analysis showed that group I was considered as the original ancestor of the other two groups of the WRKY gene family, with structural variations and changes in the number of WRKY domains [54]. At the same time, the WRKYGQK heptapeptide variation and zinc finger structure changes also demonstrated the diversity of the evolutionary process of the WRKY gene family. Here, we found that the structure of the WRKY motif was highly conserved, and only seven CqWRKYs had the WRKYGQK mutation. Many species including broomcorn millet and wheat also showed a single amino acid variation [1,55]. Variance or variations in the WRKY motif can affect the normal activity of DNA binding. However, most closely related members in same subgroup share similar motif structure, also indicating these conserved motifs might have a similar role in subgroup functions ( Figure 3).
Recent studies suggest that gene duplications not only play a vital role in the expansion and rearrangement of genome during the evolution process, but also induce gene function diversification [56,57]. Segmental duplication, tandem duplication, and transposition events are the three principal evolutionary patterns [58]. It has been reported that many gene duplications events had happened in apple, soybean and maize, which caused to the expansion of several gene classes [57][58][59]. Here, we found that among the 92 CqWRKY genes, 39 gene pairs duplication, consistent with previous report of gene duplication in quinoa (Figure 4) [21].
RNA-Seq data revealed the expression patterns of 92 CqWRKY genes in different tissues under drought and salt stress treatment [60,61]. The CqWRKY gene family in quinoa is involved in growth and development by RNA-seq data. Almost a third of CqWRKY genes had high expression in seedlings, stems, leaves, and inflorescences. Similar results were also found in poplar, Salix suchowensis, and cotton [62][63][64]. Interestingly, one CqWRKY gene (CqWRKY10B-1/2) has the different expression pattern between different copies in the B genome. Further studies are needed to determine the functions of the CqWRKY genes in quinoa ( Figure 6A). Half of the CqWRKY genes were considered as stress response genes as these genes could be induced under salt and drought stress. CqWRKY18B-1, CqWRKY21A-1, CqWRKY51A-1 and CqWRKY56A-2 were highly regulated by salt stresses, suggesting that these genes might act as important stress regulators responses factors. Besides, the different copies of these CqWRKY genes have different expression patterns. CqWRKY18B-2 was significantly down-expressed under epidermal bladder cell (EBC)-salt stress. CqWRKY21B-1, CqWRKY51A-2 and CqWRKY51A-3, CqWRKY56A-1 was no difference under abiotic stress compared to normal conditions ( Figure 6B).

Conclusions
Here, we studied the evolution analysis of the WRKY gene family in quinoa and other species, and systematically identified the CqWRKY gene family in quinoa at the genome-wide level. A total of 92 CqWRKY genes were identified. Based on the phylogenetic relationship and conserved motif and protein structure analysis, these genes were classified into three groups. We constructed the interaction network between CqWRKYs and other quinoa genes. A total of 193 gene interactions were identified. The CqWRKY genes were involved in a wide range of biological processes and stress responses in quinoa. These genes are good candidates for future functional analysis. These results will be beneficial for our understanding of the molecular mechanisms of stress tolerance in crops such as quinoa.

Conflicts of Interest:
The authors declare no conflict of interest. The funding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, and in the decision to publish the results.                  WRKY transcription factor from the diploid woodland strawberry (Fragaria vesca), enhances resistance to 423 powdery mildew, improves osmotic stress resistance, and increases abscisic acid sensitivity in Figure A11. Logos of the CqWRKY protein conserved motifs. A total of 15 motifs were identified by the MEME (Multiple EM for Motif Elicitation) tool.