Genome-Wide Identification of WRKY Transcription Factors in the Asteranae

The WRKY transcription factor family, which participates in many physiological processes in plants, is one of the largest transcription factor families. The Asterales and the Apiales are two orders of flowering plants in the superorder Asteranae. Among the members of the Asterales, globe artichoke (Cynara cardunculus var. scolymus L.), sunflower (Helianthus annuus L.), and lettuce (Lactuca sativa L.) are important economic crops worldwide. Within the Apiales, ginseng (Panax ginseng C. A. Meyer) and Panax notoginseng (Burk.) F.H. Chen are important medicinal plants, while carrot (Daucus carota subsp. carota L.) has significant economic value. Research involving genome-wide identification of WRKY transcription factors in the Asterales and the Apiales has been limited. In this study, 490 WRKY genes, 244 from three species of the Apiales and 246 from three species of the Asterales, were identified and categorized into three groups. Within each group, WRKY motif characteristics and gene structures were similar. WRKY gene promoter sequences contained light responsive elements, core regulatory elements, and 12 abiotic stress cis-acting elements. WRKY genes were unevenly distributed across chromosomes. Evidence of segmental and tandem duplication events was found in the Asterales and the Apiales, with segmental duplication inferred to play a major role in WRKY gene evolution. Synteny analysis uncovered 54 syntenic gene pairs between globe artichoke and lettuce. These two species are thus relatively closely related, consistent with their traditional taxonomic placement in the Asterales. This study, based on traditional species classifications, was the first to identify WRKY transcription factors in six species from the Asteranae. Our results lay a foundation for further understanding of the role of WRKY transcription factors in species evolution and functional differentiation.


Introduction
WRKY transcription factors, which constitute one of the largest transcription factor families in plants, are involved in growth, development, and responses to biotic and abiotic stress. They participate in various plant hormone signaling pathways, such as the gibberellin (GA) and abscisic acid (ABA) pathways, and regulate other physiological processes, including fruit ripening and leaf senescence [1,2]. For example, HaWRKY10 is regulated by ABA and GA, thereby reducing carbohydrate metabolism and improving lipid metabolism in sunflower seeds [3]. Genes encoding WRKY transcription factors in lettuce are mainly expressed in leaves and can regulate bolting [4]. ZmWRKY106, a drought-responsive gene identified in the monocotyledon maize, enhanced the drought tolerance of Arabidopsis thaliana when overexpressed [5]. AtWRKY53 in A. thaliana belongs to group III. Under drought conditions, it can promote the metabolism of starch in guard cells and reduce the H2O2 content, which ultimately promotes stomatal movement [6]. Furthermore, WRKY transcription factors can regulate plant secondary metabolic processes, such as artemisinin biosynthesis. The AaGSW1 gene, from Artemisia annua L. (Asterales), is a positive regulator of the artemisinin biosynthetic pathway, and its overexpression can significantly increase artemisinin and dihydroartemisinic acid contents [7]. WRKY proteins have one or two WRKY domains containing the conserved heptapeptide WRKYGQK and a zinc-finger motif. WRKY proteins can be divided into groups I, II, and III according to the number of WRKY domains and the type of zinc-finger motif [8]. Group I, which contains two WRKY domains, can be further divided into two subgroups: Ia with a C2H2 zinc-finger motif, and Ib with a C2HC zinc-finger motif [9]. Group II contains a single WRKY domain and a C2H2-type zinc-finger motif and is further divided into five subgroups (IIa to IIe). Group III contains a single WRKY domain and a C2HC-type zinc-finger motif. The WRKY domain binds to a cis-acting element, the W box. The core sequence of the W box, TTGAC(C/T), is necessary for WRKY binding, which reflects the conservation of the WRKY domain [8].
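The grouping rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the pipeline used in this study: the function name classify_wrky and its two inputs (a WRKY domain count and a zinc-finger type, both assumed to have been determined by an upstream annotation step) are hypothetical. The five subgroups of group II are not resolved here, because they are separated phylogenetically rather than by these two features.

```python
def classify_wrky(n_domains: int, zinc_finger: str) -> str:
    """Assign a WRKY protein to a group from two pre-computed features.

    n_domains   -- number of WRKY domains in the protein (hypothetical input)
    zinc_finger -- zinc-finger type, "C2H2" or "C2HC" (hypothetical input)
    """
    if zinc_finger not in ("C2H2", "C2HC"):
        return "unclassified"
    if n_domains == 2:
        # Group I: two WRKY domains; the zinc-finger type separates Ia from Ib.
        return "Ia" if zinc_finger == "C2H2" else "Ib"
    if n_domains == 1:
        # One WRKY domain: C2H2 gives group II, C2HC gives group III.
        return "II" if zinc_finger == "C2H2" else "III"
    return "unclassified"
```

For example, a protein with two WRKY domains and a C2HC zinc finger would be reported as subgroup Ib, whereas a single-domain C2HC protein would fall into group III.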
In this study, we identified WRKY transcription factors genome-wide in the sequenced genomes of six species of the superorder Asteranae and analyzed their phylogenetic relationships, gene structures, cis-acting elements, and gene duplication events. Our results establish a foundation for further studies of WRKY transcription factors in other Asteranae species, including their evolutionary relationships and systematic taxonomy.

Phylogenetic Analysis and Classification of WRKY Genes
We used the HMMER program with the HMM profile of the WRKY domain (PF03106) as a query to search for WRKY genes in A. thaliana and six species of the Asteranae. We then confirmed the presence of WRKY conserved domains using the Pfam, CDD, and SMART databases. As a result, 490 WRKY genes were identified (Table S1) in the following species: ginseng (123 genes), P. notoginseng (56), carrot (65), sunflower (112), globe artichoke (60), and lettuce (74). Moreover, we identified 72 WRKY genes in the reference species A. thaliana using the same pipeline. According to their chromosomal distributions, these genes were designated as PgWRKY1-PgWRKY123, PnWRKY1-PnWRKY56, DcWRKY1-DcWRKY65, HaWRKY1-HaWRKY112, CcWRKY1-CcWRKY60, LsWRKY1-LsWRKY74, and AtWRKY1-AtWRKY72. Multiple sequence alignment of WRKY domains from the six Asteranae species showed that the conserved heptapeptide of the WRKY domain was WRKYGQK (Figure S1). A number of variants were also uncovered, including WRKYGDK, WRKYGKK, WSKYGQK, WKKYGEK, WKKYDQK, WKKYDHK, WKKYTHK, WRKYGEK, WRKYRQK, WKKYGKK, WRKYDQK, and WRKYYQK (Figure 1), which were distributed among groups Ia, Ib, IIc, IIe, and III and the unclassified genes. Except in group IIc, where all variants had the sequence WRKYGKK, no uniform trend was observed in the distribution of variants among groups.
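To make the heptapeptide variants listed above easier to compare, the following Python sketch reports which positions of a variant differ from the canonical WRKYGQK sequence. It is a minimal illustration, not part of the analysis in this study; the function name heptapeptide_substitutions is hypothetical, and only three of the twelve reported variants are used as examples.

```python
CANONICAL = "WRKYGQK"  # conserved WRKY domain heptapeptide


def heptapeptide_substitutions(variant: str, canonical: str = CANONICAL):
    """Return (position, canonical_residue, variant_residue) for each mismatch.

    Positions are 1-based. Both sequences must have the same length.
    """
    if len(variant) != len(canonical):
        raise ValueError("heptapeptides must have equal length")
    return [
        (pos, ref, alt)
        for pos, (ref, alt) in enumerate(zip(canonical, variant), start=1)
        if ref != alt
    ]


# Three of the variants reported in the text.
for variant in ("WRKYGKK", "WSKYGQK", "WKKYDHK"):
    print(variant, heptapeptide_substitutions(variant))
```

In this sketch, WRKYGKK differs from the canonical sequence only at position 6 (Q to K), whereas WKKYDHK differs at positions 2, 5, and 6.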
The phylogenetic tree was constructed from the WRKY domain, which spans about 60 amino acids. On the basis of a previous study of A. thaliana WRKY genes, phylogenetic relationships, and WRKY conserved domain characteristics, the WRKY genes of ginseng, P. notoginseng, carrot, sunflower, globe artichoke, lettuce, and A. thaliana could be divided into three groups (Figure 2) [8,21]. Group I contained 98 WRKY genes classified into subgroups Ia and Ib. Subgroup Ib comprised CcWRKY25 and HaWRKY67, which have a C2HC zinc-finger motif; these genes clustered with group III members in the phylogenetic tree because they share the same type of zinc-finger motif. Group II could be divided into five subgroups and included 313 (63.8%) WRKY genes. Groups IIc and I were relatively closely related, and WRKY genes in groups IId and IIe clustered together and thus also had a relatively close evolutionary relationship. Group III contained 74 WRKY genes with a C2HC zinc-finger motif; their position in the phylogenetic tree is consistent with the conclusion of Brand et al. [19] that this group appeared relatively recently during WRKY evolution. Because their domain characteristics diverged from those of the others, the remaining five WRKY genes could not be classified into the three main WRKY groups (Figure S1). In each of the six species, the largest proportion of WRKY genes was in group II (ginseng, 69.9%; P. notoginseng, 66.1%; carrot, 76.9%; globe artichoke, 55%; sunflower, 57.1%; lettuce, 58.1%). In the phylogenetic tree, DcWRKY60 was the homolog of AtWRKY53 (AT4G23810) in group III. Because AtWRKY53 has been implicated in drought tolerance [6], DcWRKY60 may have a similar function. More generally, WRKY homologs placed in the same group may have similar functions.

Conserved Motif and Gene Structure Analysis of WRKY Genes
We analyzed the conserved motifs of WRKY proteins in the six species using MEME online software. We found 10 conserved motifs (Table S2) involving the WRKY domain conserved heptapeptide (WRKYGQK), zinc-finger motifs, and the remaining WRKY conserved proteins. The motifs identified in globe artichoke included motifs 1 and 3 containing the WRKY domain conserved heptapeptide, and motifs 2 and 9 with a zinc-finger motif (Figure 3a). Motifs 1 and 3 were present in globe artichoke in group I members containing two WRKY domains. In contrast to other members of group I, CcWRKY25 (Ib) possessed motif 9. Motif 7 was only found in group I, and members of group III only contained motifs 1, 3 and 9. Motif 10, the rarest motif, was distributed in all three groups (Figure 3b). Among the motifs found in carrot were motifs 1 and 3 containing the WRKY domain conserved heptapeptide, and motifs 2 and 10 containing a zinc-finger motif ( Figure S2a). Motifs 1 and 3 in carrot were found in group I. Motif 9 was only found in group IIb; this motif was the least abundant of the seven different types of motifs present in the group ( Figure S2b). Motifs uncovered in sunflower included 1, 2 and 5 containing WRKY domain conserved heptapeptide, and 3 and 4 containing a zinc-finger motif ( Figure S3a). HaWRKY34 (in group I) possessed six WRKY domain conserved heptapeptides. HaWRKY67 (in group Ib) contained motif 4; its zinc-finger motif was different from that of other group members, similar to CcWRKY25 mentioned above. The greatest similarity in motifs in sunflower was between groups IIa and IIb, which had a relatively close evolutionary relationship. Motif 10, the rarest motif, was distributed in HaWRKY domains ( Figure S3b). In lettuce, identified motifs were WRKY domain conserved heptapeptide containing motifs 1 and 3, and motifs 2 and 4 ( Figure S4a). Group I WRKY in lettuce possessed motifs 5 and 6, which also contained WRKY domain conserved heptapeptide and a zinc-finger motif. 
Motif 6 was only present in lettuce in group I. Group IIb, with seven motif types, possessed the greatest diversity of conserved motifs. LsWRKY45 did not contain motifs 1 and 3, possibly because of the absence of a WRKY domain (Figure S4b). In ginseng, motifs 1, 2 and 5 contained the WRKY domain conserved heptapeptide, and motifs 3, 4, 6 and 9 contained a zinc-finger motif (Figure S5a). Motif 7 was also mainly found in ginseng in group I, along with motifs 1, 2, 4, 5 and 6 (Figure S5b). Motifs found in P. notoginseng included motifs 1 and 3, containing the WRKY domain, and motifs 2 and 4, containing a zinc-finger motif (Figure S6a). PnWRKY40 and PnWRKY51 in group I possessed a large number of copies of motif 10. Group IIb contained seven types of motifs. Motif 8 was mainly distributed in groups IId and IIe. In all of the six studied species other than sunflower, conserved motifs were most diverse in group IIb. Among the 10 motifs, motifs 9 and 10 were the least abundant in the six species, and the remaining WRKY conserved protein sequences exhibited no uniformity.
WRKY intron-exon structures were characterized by analysis of genomic data using TBtools. The number of introns in globe artichoke WRKY genes ranged from 1 to 12, with an average of 2.9 per gene (Figure 3b). CcWRKY25 contained 12 introns, the largest number detected. The intron distribution of CcWRKY genes in group III was conserved, with each gene containing two introns. The number of introns in carrot ranged from two to nine, with an average of 2.6 introns per DcWRKY gene (Figure S2b). The number of introns per HaWRKY gene in sunflower ranged from two to nine, with an average of 2.6 (Figure S3b). HaWRKY genes in group IIa contained four introns, while group IIe members possessed one. Most HaWRKY genes in group III contained two introns. Group Ib WRKY genes were present in both globe artichoke and sunflower, and their gene structures were similar, with more introns than those of other group members. The number of introns in lettuce ranged from one to six, with an average of 2.6 introns per LsWRKY gene (Figure S4b). Group IId gene structures were similar, with all members containing two introns. With one exception, all members of group III contained two introns; LsWRKY45 contained only one. The number of introns in ginseng ranged from 1 to 36, with an average of 3.4 (Figure S5b). PgWRKY genes in group I contained four introns on average. PgWRKY85 (IIb), PgWRKY90 (IId), and PgWRKY110 (IIe) contained 21, 36, and 28 introns, respectively, and both their intron numbers and their sequence lengths were much larger than those of other PgWRKY genes. PnWRKY genes in P. notoginseng possessed one to nine introns, with an average of 3.3 introns per gene (Figure S6b). PnWRKY49 (IIe) contained nine introns, the most in its group. The number of introns was conserved in group III PnWRKY genes, each of which contained two introns. 
WRKY genes in group I contained at least three introns on average, the highest of any group in the six species; in globe artichoke, WRKY genes in group I contained an average of six introns (Figure 3b). The structure of WRKY genes in group Ib differed from that of other group members, and their number of introns was higher than that of other WRKY genes (Figure 3b and Figure S3b). The gene structure of group III members was the most conserved among the six species, while that of group Ib was the most distinct. Except for five WRKY genes in ginseng, group III members contained only two introns, making this the most conserved group of WRKY genes (Figure S5b).
Conserved motif and intron-exon distribution patterns of WRKY genes in the six species were generally group specific, and WRKY gene structures were similar within each group, which supports the phylogenetic grouping described above. The distribution of introns in groups I and III was also similar across the six species and thus also reflects their evolutionary relationships.
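The intron statistics reported in this section follow from simple counting once exon coordinates are available. The Python sketch below illustrates the arithmetic under stated assumptions: the gene names and exon coordinates are made up for illustration (they are not data from this study), and each gene is assumed to be represented by a single transcript whose exons are given as (start, end) intervals.

```python
from statistics import mean


def intron_count(exons):
    """Number of introns implied by the exon intervals of one transcript."""
    # n exons are separated by n - 1 introns; a gene with no exons yields 0.
    return max(len(exons) - 1, 0)


# Hypothetical exon coordinates for three made-up genes.
genes = {
    "geneA": [(100, 250), (400, 520), (700, 910)],
    "geneB": [(50, 300), (450, 600)],
    "geneC": [(10, 500)],
}

counts = {name: intron_count(exons) for name, exons in genes.items()}
average = mean(counts.values())
print(counts, average)
```

Grouping the per-gene counts by WRKY group before averaging would give per-group summaries of the kind reported above.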

Analysis of WRKY Gene Promoters
To study the expression and regulation of WRKY transcription factors, we used PlantCARE to analyze cis-acting elements in WRKY promoters (data not shown). Our analysis revealed that the WRKY promoters of carrot, globe artichoke, sunflower, and lettuce contained a large number of promoter core regulatory elements (CAAT-box), light responsive elements (GT1-motif, Sp1, and ACE), and W box elements. We also uncovered a variety of stress- and hormone-responsive elements, such as drought-inducibility elements (MBS), flavonoid biosynthetic gene regulators (MBS I), low-temperature responsive elements (LTR), defense and stress responsive elements (TC-rich repeats), abscisic acid responsive elements (ABRE), methyl jasmonate (MeJA) responsive elements (CGTCA-motif and TGACG-motif), auxin-responsive elements (TGA-element and AuxRR-core), gibberellin-responsive elements (GARE-motif, P-box, and TATC-box), circadian control elements (circadian), salicylic acid responsive elements (TCA-element), and wound-responsive elements (WUN-motif).
Some abiotic stress response elements and W box elements were visualized using TBtools. In carrot, W box elements were found in 33 promoters of DcWRKY genes ( Figure 4). The most common elements were ABRE, contained in 49 DcWRKY promoters, followed by CGTCA motif elements; in contrast, MBS I were the least abundant elements. DcWRKY38 genes (in group III) contained 13 MBS elements, the highest of any group in DcWRKY genes. W box elements were found in 41 promoters of CcWRKY genes from globe artichoke ( Figure S7). LTR elements were present in 23 promoters of CcWRKY. MeJA responsive elements (CGTCA-motif), which were found in 45 CcWRKY promoters, accounted for the largest proportion of cis-acting elements, followed by ABRE elements. The rarest cis-acting element in CcWRKY promoters was MBS I. In sunflower, 76 promoters of HaWRKY genes contained W box elements ( Figure S8). CGTCA-motif elements, which were present in 96 HaWRKY genes, accounted for the largest proportion of abiotic stress elements, followed by ABRE elements. The least common cis-acting elements in HaWRKY promoters were the circadian elements. In lettuce, 40 promoters of LsWRKY genes contained W box elements ( Figure S9). A total of 55 LsWRKY genes possessed ABRE elements; these were the most abundant abiotic stress elements, followed by CGTCA-motif elements. In contrast, only six LsWRKY promoters contained MBS I elements. The LsWRKY26 gene did not contain these abiotic stress elements or any other elements. The most common cis-acting elements among globe artichoke, carrot, sunflower, and lettuce were MeJA responsive elements and ABRE elements, whereas MBS I elements were the rarest in three of these species.
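Counting promoters that carry a W box, as reported above, amounts to a motif scan on both DNA strands. The Python sketch below is a minimal illustration and not the PlantCARE procedure used in this study. It assumes the commonly cited W box consensus TTGAC(C/T), so the reverse-strand pattern is (A/G)GTCAA; the function names and the example sequences are hypothetical.

```python
import re

# Assumption: W box consensus TTGAC[CT]; its reverse complement is [AG]GTCAA.
# The lookahead lets overlapping occurrences be counted.
W_BOX = re.compile(r"(?=(TTGAC[CT]|[AG]GTCAA))")


def count_w_boxes(promoter: str) -> int:
    """Count W box matches on either strand of a promoter sequence."""
    return len(W_BOX.findall(promoter.upper()))


def promoters_with_w_box(promoters: dict) -> list:
    """Return the names of promoters that contain at least one W box."""
    return [name for name, seq in promoters.items() if count_w_boxes(seq) > 0]


# Made-up promoter fragments for illustration only.
example = {
    "promoter1": "aattgaccgtggtcaat",  # one forward and one reverse match
    "promoter2": "ACGTACGTACGT",       # no match
}
print(promoters_with_w_box(example))
```

The length of the returned list corresponds to the kind of per-species tally given in the text (for example, the number of DcWRKY promoters containing W box elements).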

Chromosomal Distribution and Duplication of WRKY Genes
We further studied the evolution of the WRKY gene family by analyzing the chromosomal distribution of WRKY genes. Since the ginseng and P. notoginseng genome assemblies are incomplete, only the other four species were analyzed at the chromosome level. In sunflower, 112 HaWRKY genes were distributed on 17 chromosomes (Figure 5). Chromosome 10 contained 18 HaWRKY genes, whereas chromosomes 2 and 13 contained only 2. The proportion of HaWRKY genes in groups I and III was 21.4%, whereas group IIa accounted for the smallest proportion (4.4%). Five gene clusters (comprising HaWRKY2, HaWRKY3; HaWRKY12, HaWRKY13; HaWRKY36, HaWRKY37; HaWRKY59, HaWRKY60; HaWRKY86, HaWRKY87, and HaWRKY88) were found on chromosomes 1, 3, 7, 10, and 14. A total of 60 CcWRKY genes were distributed on the 17 chromosomes of globe artichoke (Figure S10). Chromosome 2 had the most CcWRKY genes, while chromosomes 4, 5, and 10 contained only one. On the 17 chromosomes, the proportion of CcWRKY genes in group III was 11.4%, while groups IIa and IId were the least common (8.3%). A cluster of genes (CcWRKY53, CcWRKY54, and CcWRKY55), all belonging to group IIa, was present on chromosome 16. A total of 74 LsWRKY genes were distributed on the nine chromosomes of lettuce (Figure S11). Chromosome 9 harbored the most LsWRKY genes (nine), while chromosomes 1 and 6 had the fewest. LsWRKY genes in group IIa were the least common (4.0%). Two clusters of genes belonging to group III (LsWRKY5, LsWRKY6, LsWRKY7; LsWRKY65, LsWRKY66, LsWRKY67, and LsWRKY68) were uncovered on chromosomes 2 and 9. In carrot, 65 DcWRKY genes were distributed across nine chromosomes (Figure S12). Chromosome 2 had the highest number of DcWRKY genes (16 genes), while chromosome 9 contained only three. DcWRKY genes in group IIe accounted for the largest proportion on the chromosomes (29.2%), whereas group IIa accounted for the smallest (4.6%). A cluster of group IIc genes (DcWRKY42 and DcWRKY43) was present on chromosome 5. 
Across the four species, WRKY genes in group IIa were the least common, and each species contained at least one gene cluster.
In addition, 189 pairs of orthologous genes were identified among the four species. To study evolutionary pressures acting on WRKY genes, we aligned the coding sequences of orthologous gene pairs and tandem duplicated (paralogous) gene pairs of the four species (Table S4) and calculated the ratio of non-synonymous to synonymous substitution rates (Ka/Ks). A Ka/Ks > 1 was obtained for the CcWRKY53-CcWRKY55 gene pair in globe artichoke, which indicates that positive selection may have acted on these CcWRKY genes. No Ka/Ks ratios were calculated for tandem duplicated genes in lettuce or carrot because these genes had low sequence similarity. Six tandem duplicated WRKY gene pairs with Ka/Ks < 1 were identified in sunflower, which suggests that purifying selection has acted on WRKY genes in this species (Table 1). A total of 76.1% of orthologous gene pairs had Ka/Ks values < 1, indicating that WRKY genes in globe artichoke, sunflower, carrot, and lettuce may have been subjected to purifying selection during species evolution.
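The interpretation of Ka/Ks used above (a ratio above 1 suggesting positive selection and a ratio below 1 suggesting purifying selection) can be written as a small helper. The Python sketch below is illustrative only: the function name selection_regime, the tolerance parameter, and the numeric inputs in the usage line are hypothetical, and a ratio equal to 1 is treated as neutral evolution, following the usual convention. The Ka and Ks values themselves would come from a dedicated tool such as the one named in the Methods.

```python
def selection_regime(ka: float, ks: float, tol: float = 1e-9) -> str:
    """Interpret a Ka/Ks ratio for one gene pair.

    ka  -- non-synonymous substitution rate
    ks  -- synonymous substitution rate
    tol -- hypothetical tolerance for treating the ratio as exactly 1
    """
    if ks <= 0:
        # The ratio is undefined when no synonymous substitutions are observed.
        return "undefined"
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "positive" if ratio > 1.0 else "purifying"


# Made-up values for illustration; not results from this study.
print(selection_regime(0.30, 0.20), selection_regime(0.05, 0.40))
```

Applying such a function to every orthologous pair and counting the "purifying" labels would give a proportion comparable to the percentage of pairs with Ka/Ks < 1 reported above.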

Discussion
The evolution, function, abiotic stress response, and other aspects of WRKY, one of the largest plant transcription factor families, have been investigated in many plant species. In the present study, 490 WRKY gene family members were identified from six species of the superorder Asteranae. We identified 65 WRKY genes in carrot, compared with 67 in a previous study of the same variety [29]. This difference may reflect the different E-value thresholds used when the domains were screened. In addition, candidates in this study were rechecked against the domain databases to remove sequences with incomplete WRKY domains or zinc-finger motifs. In another study, 95 WRKY genes were identified in a different variety (D. carota L. cv. Kuroda) [30].
Among the 490 WRKY genes uncovered in the six species, 485 could be divided into three groups. The remaining five WRKY members could not be assigned to any of the three groups because of differences in their domain characteristics. Group Ib includes the CcWRKY25 and HaWRKY67 genes. Previous studies have reported the presence of group Ib in some monocotyledonous plants; for example, eight group Ib WRKY genes were found in rice, six in O. officinalis Wall ex Watt, and one (PheWRKY61) in moso bamboo [9,24,31]. The only group Ib WRKY genes uncovered in our study were in globe artichoke and sunflower (both dicotyledonous plants). Group Ib is a new group produced by the duplication of the DNA binding domain in group III and had previously been reported only in monocotyledonous plants [19]. The existence of group Ib in dicotyledonous plants can therefore provide a basis for studying the evolution of monocotyledonous and dicotyledonous plants.
The proportion of WRKY genes represented by each group differed among the six species. In other species, such as rice (46 genes, 56.8%) and grape (40, 69.0%), the largest proportion of WRKY genes is also in group II. We identified 12 WRKY domain variants, including WRKYGKK, WKKYDQK, and WKKYGEK, in WRKY genes of the six species. Such variation, which has been reported in many species [9,32,33], may affect the normal physiological and metabolic functions of WRKY genes. The conserved nature of the WRKY domain heptapeptide of group IIc may reflect the ancestral position of this group in the evolutionary tree. In contrast, group Ib appears to have arisen more recently via DNA binding domain duplication, with subsequent base mutations taking place during duplication [19]. MEME analysis of WRKY protein sequences revealed obvious group specificities. For example, WRKY proteins belonging to group IIc in ginseng are relatively conserved and rarely contain other motifs. We observed similar phenomena in the remaining five species. In the two group Ib genes, CcWRKY25 and HaWRKY67, C2HC zinc-finger motifs are present in the N- and C-terminal regions of the encoded proteins, similar to the zinc-finger motifs of group III. Although CcWRKY25 and HaWRKY67 fell into group III in the phylogenetic tree, they contain two WRKY domains and thus belong to group I. Group Ib has arisen from group III via the duplication of WRKY genes [19]; this evolutionary pattern has also been reported in wild rice [31]. Although PgWRKY42 and PgWRKY122 of ginseng cannot be classified into any of the three groups in the phylogenetic tree, they cluster with group I and may have originated through the loss of a WRKY domain during the course of evolution. Similarly, LsWRKY2, DcWRKY52, and CcWRKY8 are also closely related to group I, but they contain only one WRKY domain.
Our examination of WRKY intron-exon structural characteristics in the six species indicated that the gene structure of group III is the most conserved. Moreover, the number of WRKY gene introns in group I is higher than that in the other two groups, possibly because of gene duplication in ancestral group IIc. The number of introns in the group Ib genes CcWRKY25 and HaWRKY67 (12 and 9, respectively) differs from that of other group I members, which may be due to an increase in the number of introns caused by gene duplication during evolution. The structural characteristics of PgWRKY85, PgWRKY90, and PgWRKY110 are different from those of other members in ginseng, and their number of introns is higher. The functions of these three genes may therefore also differ from those of other members.
Cis-acting elements play an important role in gene transcription and expression. We analyzed cis-acting elements of four species in the Asterales and the Apiales and found a large number of hormone response elements (ABRE, CGTCA-motif, TGA-element, GARE-motif, and TCA-element), stress response elements (MBS, LTR, TC-rich repeats, and WUN-motif), W box elements, and secondary metabolic pathway-related elements (MBS I). DcWRKY38, which like AtWRKY53 belongs to group III, contains 13 MBS elements and may also be involved in plant drought resistance [6]. MeJA responsive elements are among the most abundant of the 12 types of cis-acting elements in globe artichoke, sunflower, carrot, and lettuce. Exogenous MeJA induces WRKY genes containing MeJA responsive elements and may enhance the activity of WRKY gene promoters in the six species, thereby participating in the regulation of other plant physiological processes. In Conyza blinii H. Lév, expression of the CbWRKY24 gene leads to increased total saponin content and subsequent upregulation of the transcription of key enzyme genes in the mevalonate pathway. In tomato, expression of CbWRKY24 downregulates the expression of key genes in the lycopene pathway [34]. CrWRKY1, identified in Catharanthus roseus, was overexpressed under the induction of plant hormones such as MeJA, resulting in upregulation of key enzyme genes in the terpenoid indole alkaloid (TIA) pathway [35]. WRKY transcription factors can specifically bind to W box elements in gene promoters and affect gene transcription [36]. The WsWRKY1 transcription factor in Withania somnifera can directly regulate the triterpenoid metabolic pathway by binding to the W box elements in the squalene synthase and squalene epoxidase gene promoter regions and enhance synthesis of triterpenoids [37]. The TaWRKY2 and TaWRKY19 proteins of wheat, when overexpressed in A. thaliana, bind to the promoter regions of their downstream genes to regulate gene expression and plant tolerance under stress conditions [38]. WRKY genes are induced by flg22, and the induced WRKY proteins can rapidly bind to their own promoters or to those of other WRKY genes to establish a regulatory network [39]. In addition, WRKY transcription factors can bind to W box elements in their own gene promoters. In chickpea infected with Fusarium, for example, WRKY40 has been found to bind to its own promoter to regulate its gene expression [40]. Many WRKY gene promoters in the three Asterales species examined in this study contain W box elements. For instance, 41 CcWRKY genes in globe artichoke include W box elements, suggesting that these CcWRKY genes are also regulated by other WRKY transcription factors or by themselves [36].
Gene duplication plays a key role in species evolution, genome amplification, and gene family evolution [41]. The three main types of gene duplication are whole genome duplication, tandem duplication, and segmental duplication [42]. In this study, we mainly analyzed tandem duplication and segmental duplication events in the four species. We identified 19, 23, 11, and 24 segmental duplicated WRKY gene pairs in globe artichoke, carrot, sunflower, and lettuce, respectively. Tandem duplicated genes were less common, a finding consistent with observations in watermelon and pineapple [43,44]. This result also implies that segmental duplication is the main type of gene duplication occurring in the four species. In the four species, the members of each tandem duplicated gene pair are located on the same chromosome. Two, one, five, and two tandem clusters (gene clusters) were found in globe artichoke, carrot, sunflower, and lettuce, respectively. Gene clusters have also been found in radish, O. officinalis, and other species [31,45]. Duplicated genes may differ in their expression patterns, may exhibit functional diversity, and may be subject to altered regulatory mechanisms. These mechanisms can be identified by sequencing [46].
We analyzed the synteny of WRKY genes among the genomes of four species: carrot, globe artichoke, sunflower, and lettuce. The largest number of syntenic relationships was found between globe artichoke and lettuce. This result indicates that these two species have a relatively close evolutionary relationship, consistent with their traditional classification in the Asterales. The smallest number of syntenic relationships was found between carrot and globe artichoke, which belong to different orders. This outcome may also be due in part to the incomplete assembly of chromosomes in the globe artichoke genome. Across the six pairwise syntenic comparisons, 23 WRKY genes were common among the four species and may provide insights into the evolution of WRKY genes among different species.
Purifying selection has acted on tandem duplicated gene pairs in sunflower, which is in line with findings in peanut and pineapple [28,44]. Only one pair of tandem duplicated genes with Ka/Ks > 1 (CcWRKY53-CcWRKY55, belonging to group IIa) was uncovered in globe artichoke, which suggests that positive selection has acted on this pair. Most orthologous gene pairs among the four species had a Ka/Ks < 1. Compared with paralogous pairs, orthologous WRKY genes in carrot, sunflower, lettuce, and globe artichoke may have been subjected to purifying selection following whole genome duplication or species differentiation. In a comparison of WRKY orthologous pairs of watermelon, melon, and cucumber, strong purifying selection was inferred to have operated during the evolution of these three species of Cucurbitaceae [47]. A similar conclusion has been drawn for the evolution of pineapple WRKY genes [44].

Classification and Phylogenetic Analysis of WRKY Genes
WRKY domains were aligned using ClustalW, and the alignment was displayed using GeneDoc. A neighbor-joining phylogenetic tree was then constructed in MEGA7.0 with default parameters (http://www.megasoftware.net/). The WRKY genes were then divided into three groups on the basis of domain characteristics and phylogenetic relationships.

Analysis of WRKY Gene Duplication and Synteny Among Species
To analyze gene duplication events, each WRKY protein sequence from carrot, sunflower, globe artichoke, and lettuce was aligned against itself using BLASTp with an E-value threshold of <1e-10 and otherwise default parameters. MCScanX was used to analyze WRKY gene duplication events and detect syntenic relationships among species. The WRKY genes were then mapped to chromosomes to illustrate segmental duplication gene pairs using Circos (http://circos.ca/). Kaks_Calculator 2.0 software was used to calculate the Ka/Ks ratio of orthologous and paralogous WRKY gene pairs.

Supplementary Materials: The following are available online at http://www.mdpi.com/2223-7747/8/10/393/s1: Table S1: Identification of WRKY genes in globe artichoke, sunflower, lettuce, carrot, ginseng and P. notoginseng; Figure S1: Alignment of the WRKY domain uncovered in ginseng, P. notoginseng, carrot, globe artichoke, sunflower, and lettuce; Table S2: WRKY protein sequences of globe artichoke, sunflower, lettuce, carrot, ginseng, and P. notoginseng; Figure S2a: Ten types of conserved motifs in carrot; Figure S2b: Phylogenetic relationships, conserved motifs, and intron-exon structures of DcWRKY genes in carrot; Figure S3a: Ten types of conserved motifs in sunflower; Figure S3b: Phylogenetic relationships, conserved motifs, and intron-exon structures of HaWRKY genes in sunflower; Figure S4a: Ten types of conserved motifs in lettuce; Figure S4b: Phylogenetic relationships, conserved motifs, and intron-exon structures of LsWRKY genes in lettuce; Figure S5a: Ten types of conserved motifs in ginseng; Figure S5b: Phylogenetic relationships, conserved motifs, and intron-exon structures of PgWRKY genes in ginseng; Figure S6a: Ten types of conserved motifs in P. notoginseng; Figure S6b: Phylogenetic relationships, conserved motifs, and intron-exon structures of PnWRKY genes in P. notoginseng; Figure S7: Cis-acting elements in globe artichoke CcWRKY promoters; Figure S8: Cis-acting elements in sunflower HaWRKY promoters; Figure S9: Cis-acting elements in lettuce LsWRKY promoters; Figure S10: Distribution of CcWRKY genes on globe artichoke chromosomes; Figure S11: Distribution of LsWRKY genes on lettuce chromosomes; Figure S12: Distribution of DcWRKY genes on carrot chromosomes; Figure S13: Synteny analysis of globe artichoke CcWRKY genes; Figure S14: Synteny analysis of carrot DcWRKY genes; Figure S15: Synteny analysis of sunflower HaWRKY genes; Table S3: WRKY synteny gene pairs between carrot, globe artichoke, sunflower, and lettuce genomes; Figure S16: Synteny between genomes of carrot (Daucus carota) and globe artichoke (Cynara cardunculus); Figure S17: Synteny between genomes of carrot (Daucus carota) and sunflower (Helianthus annuus); Figure S18: Synteny between genomes of sunflower (Helianthus annuus) and lettuce (Lactuca sativa); Figure S19: Synteny between genomes of sunflower (Helianthus annuus) and globe artichoke (Cynara cardunculus); Table S4: WRKY coding sequences of globe artichoke, sunflower, lettuce, carrot, ginseng and P. notoginseng.

Conflicts of Interest:
The authors declare no conflicts of interest.