Evolutionary Analysis of Calcium-Dependent Protein Kinase in Five Asteraceae Species

Calcium-dependent protein kinases (CPKs) are crucial in Ca2+ signal transduction and form a large gene family in plants. In our previous work, we reported that Hevea brasiliensis CPKs are important for natural rubber biosynthesis. However, the CPK gene family in other rubber-producing plants has not been investigated. Here, we report the CPKs in five representative Asteraceae species, including three rubber-producing and two non-rubber species. A total of 34, 34, 40, 34 and 30 CPKs were identified from Taraxacum koksaghyz, Lactuca sativa, Helianthus annuus, Chrysanthemum nankingense and Cynara cardunculus, respectively. All CPKs were classified into four groups (group I to IV). In addition, 10 TkCPK, 11 LsCPK, 20 HaCPK, 13 CnCPK and 7 CcCPK duplicated paralog pairs were identified. Further evolutionary analysis showed that, compared to the other groups, group III has expanded in the Asteraceae species, especially in the rubber-producing species. Meanwhile, the expanded group III CPKs from Asteraceae species tend to have low calcium-binding capacity. This study provides a systematic evolutionary investigation of the CPKs in five representative Asteraceae species, suggesting that the subfamily-specific expansion of CPKs might be related to natural rubber production.


Introduction
Calcium (Ca2+) participates as a second messenger in diverse signal transduction pathways, such as stress and immune signaling. To date, three major classes of Ca2+-binding proteins have been characterized in higher plants: calcium-dependent protein kinases (CPKs), calmodulins (CaMs) and CaM-like proteins (CaMLs), and calcineurin B-like proteins (CBLs) [1,2]. The CPKs constitute one of the largest protein kinase families that sense the calcium signal in plants [3]. CPKs are monomeric proteins that contain four conserved domains: the N-terminal variable domain, the serine/threonine kinase domain, the auto-inhibitory junction domain and the calmodulin-like domain [4,5]. The N-terminal domain is highly variable and contains myristoylation or palmitoylation sites for subcellular targeting [6]. The protein kinase domain is the catalytic domain with an adenosine triphosphate (ATP) binding site. It is often followed by the auto-inhibitory domain, which switches CPKs between inactive and active forms depending on the calcium concentration [7]. Moreover, the calmodulin-like domain often contains four EF-hands for Ca2+ binding [8,9]. The palmitoylation and myristoylation sites were predicted using TermiNator (https://bioweb.i2bc.paris-saclay.fr/terminator3/).

Phylogenetic Analysis of CPK Members
The CPKs of the five Asteraceae species, the rubber tree CPKs and the CPKs of five representative model plants were used to investigate the evolutionary relationship of CPKs in Asteraceae plants and rubber-producing plants. The result revealed that all CPK genes fell into four groups: group I (yellow), group II (blue), group III (green) and group IV (pink) (Figure 1). No species-specific or rubber-producing-specific clades were identified. However, the number of CPKs in each group differs among the eleven species. Group I usually contains the most CPKs and group IV the fewest.

Evolutionary Analyses of Duplicated Gene Pairs in Asteraceae Species
To explore the evolution of the CPK family in detail, we analyzed the duplication events of the Asteraceae CPK gene family. A total of 10, 11, 20, 13 and 7 duplicated paralog pairs were identified in Tks, L. sativa, H. annuus, C. nankingense and C. cardunculus, respectively. The Ka/Ks ratio was calculated to assess the selection pressure on each duplicated paralog pair. The results showed that most duplicated Asteraceae CPK paralogs are under purifying selection, except six duplicated gene pairs (HaCPK2/HaCPK23, HaCPK9/HaCPK19, HaCPK9/HaCPK28, CnCPK5/CnCPK28, CnCPK16/CnCPK31 and CnCPK18/CnCPK26), which are under positive selection (Figure 2, Tables S1-S5). It therefore appears that CPKs play critical roles during plant development, which requires highly conserved sequences. The paralogs under positive selection might have undergone functional divergence related to specific tissues and developmental processes in H. annuus and C. nankingense after the emergence of Asteraceae. We also addressed the question of whether these duplicated paralogs of Asteraceae CPKs are under an accelerated evolutionary rate. To this end, we performed Tajima relative rate tests on the Asteraceae CPK paralogs. The TkCPK26/TkCPK27 and TkCPK27/TkCPK31 duplication pairs have markedly accelerated evolutionary rates (Table 2).
The number of duplicated gene pairs from group III was greater than that from the other three groups in the five Asteraceae species. Meanwhile, a total of 16 duplicated gene pairs (LsCPK19/LsCPK28, LsCPK20/LsCPK31, LsCPK11/LsCPK18, LsCPK15/LsCPK29, LsCPK16/LsCPK18, HaCPK33/HaCPK34, HaCPK33/HaCPK38, HaCPK10/HaCPK16, HaCPK19/HaCPK28, CnCPK14/CnCPK25, CnCPK29/CnCPK34, CnCPK16/CnCPK31, CnCPK20/CnCPK24, CnCPK2/CnCPK30, CnCPK1/CnCPK26, CnCPK18/CnCPK26) are under accelerated evolutionary rates in the other four Asteraceae species, suggesting that they potentially play specific roles (Tables S6-S9).
a The Tajima relative rate test was used to examine the equality of evolutionary rates between Tks paralogs; b Mt is the number of sites that are identical in all three sequences tested; c M1 is the number of unique differences in the first paralog; d M2 is the number of unique differences in the second paralog; e If p < 0.05, the test rejects equal substitution rates between the two duplicates and infers that one of the two duplicates has an accelerated evolutionary rate.
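The quantities in the footnote above (Mt, M1, M2 and the p-value) can be illustrated with a minimal sketch of Tajima's relative rate test. This is not the MEGA implementation used in the study; it assumes three pre-aligned sequences of equal length (two paralogs plus an outgroup), skips any column that contains a gap, and uses the standard one-degree-of-freedom chi-square statistic (M1 - M2)^2 / (M1 + M2). The function name and the toy sequences are hypothetical.

```python
import math


def tajima_relative_rate(paralog1, paralog2, outgroup):
    """Sketch of Tajima's relative rate test on three aligned sequences.

    Returns Mt (sites identical in all three sequences), M1 and M2 (sites
    where only the first or only the second paralog differs), the
    chi-square statistic and its p-value (1 degree of freedom).
    """
    if not (len(paralog1) == len(paralog2) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")

    mt = m1 = m2 = 0
    for a, b, c in zip(paralog1, paralog2, outgroup):
        if "-" in (a, b, c):
            continue  # assumption: columns with gaps are ignored
        if a == b == c:
            mt += 1
        elif b == c:
            m1 += 1  # only the first paralog differs
        elif a == c:
            m2 += 1  # only the second paralog differs

    if m1 + m2 == 0:
        return {"Mt": mt, "M1": m1, "M2": m2, "chi2": 0.0, "p": 1.0}

    chi2 = (m1 - m2) ** 2 / (m1 + m2)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return {"Mt": mt, "M1": m1, "M2": m2, "chi2": chi2, "p": p}


# Toy sequences (made up for illustration, not real CPK data).
result = tajima_relative_rate("MKTAWWWWWW", "MKTAYIAKQR", "MKTAYIAKQR")
print(result)
```

In this toy case all unique differences fall on the first sequence, so the test rejects equal rates at p < 0.05, which corresponds to the "accelerated evolutionary rate" reading in the footnote.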

Syntenic Analysis of CPKs from Five Asteraceae Species
A syntenic analysis of CPK members from Tks, L. sativa, H. annuus, C. nankingense, C. cardunculus and S. lycopersicum was performed. The Circos program was used to visualize the syntenic relationships. A total of 10, 11, 20, 13 and 7 duplicated CPK pairs were identified in Tks, L. sativa, H. annuus, C. nankingense and C. cardunculus, respectively. A total of 10, 4, 7 and 1 TkCPKs from group I, group II, group III and group IV, respectively, had syntenic relationships with CPKs from the other four Asteraceae species and tomato (Figure 3). Overall, there is a close CPK syntenic relationship among the five Asteraceae species, especially among different subgroups.

The CPKs in Group II and Group III Are Expanded
To investigate the evolution of the CPK gene family, 39 GmCPKs in Leguminosae were chosen to represent CPKs in eurosids I. In addition, 34 AtCPKs in Brassicaceae represented CPKs in eurosids II, and 29 Solanum lycopersicum CPKs, 26 Solanum tuberosum CPKs, 28 Nicotiana tabacum CPKs, 31 Capsicum annuum CPKs and 40 Ipomoea nil CPKs in Solanaceae represented CPKs from euasterids I. The five Asteraceae species represented CPKs from euasterids II, and 30 OsCPKs in Gramineae represented CPKs in monocotyledons. By comparing the CPK numbers in all of these species, we found that group I contains the largest number of CPKs and group IV the smallest. The AtCPKs and GmCPKs in rosids are significantly expanded in group II, while CPKs in Asteraceae are expanded in group III (Figure 4), indicating potential functional divergence of the expanded CPKs in group II and group III in rosids and Asteraceae, respectively. Notably, the rubber-producing Asteraceae plants have the most group III members (12, 11 and 11 for Tks, H. annuus and L. sativa, respectively), implying that members of group III might be involved in natural rubber (NR)-related metabolic processes.

Gene Structure and Motif Distribution of CPKs
Gene structure divergence plays a considerable role in gene family evolution and can be used to assess phylogenetic relationships [33,34]. To further investigate the expansion mechanism in group III, maps of exon-intron structure and motif distribution were constructed based on the coding DNA sequences and protein sequences of group III CPKs from five model plants and five Asteraceae species. The result showed very similar exon-intron structures among the group III CPKs of the ten species. The first exon in most CPK members was the longest, followed by several shorter exons. Meanwhile, the gene structure and motif distributions showed similar patterns in the other three groups (Figure 5, Figures S1-S3).

Motif Sequence Analysis of CPKs in Group III from Asteraceae
The CPK members of Asteraceae were significantly expanded in group III. The exon-intron structure and motif distribution analysis showed that all CPKs in group III from Asteraceae were conserved (Figure 5). To further explore the tendency of the CPK expansion in group III, the amino acid sequences of the conserved motifs were analyzed in detail (Figure 6). The result showed that the DLK motif and auto-inhibitory domain of all four groups were highly conserved. However, lower conservation was observed in EF-hand 1 to EF-hand 3 of group III CPKs from the five Asteraceae species. In Asteraceae, the "D1-X-D3-X-S5" regions of group III in EF-hand 1 to EF-hand 3 are clearly less conserved than those of groups I, II and IV (framed in Figure 6). The Ca2+-binding sites of the EF-hands were reported to be D1-D3-S5-E12 (EF-hand 1), D1-D3-S5-E12 (EF-hand 2), D1-D3-S5-E12 (EF-hand 3), and D1-D3-D5-E12 (EF-hand 4) [35]. It seems that group III CPKs from Asteraceae might have lower calcium-binding capacity than the other groups, since the amino acids of the EF loop region participate in Ca2+ binding.
Figure 6. Comparison of sequence logos for conserved motifs of CPKs from the five Asteraceae species. Sequence logos of the consensus motifs were created using MEME online software. The height of each letter represents the frequency of the amino acid at the corresponding position. A red star means that all CPK members have exactly the same amino acid at the corresponding site. Black frames show the amino acid sites with much lower conservation in EF-hands 1, 2 and 3 of group III.
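The reported Ca2+-binding positions (D1-D3-S5-E12 for EF-hands 1 to 3 and D1-D3-D5-E12 for EF-hand 4) lend themselves to a simple positional check. The sketch below is only an illustration of that idea, not the MEME-based logo analysis used in the study. It assumes a 12-residue EF loop has already been extracted from the protein sequence; the function name, the dictionary keys and the example loops are hypothetical.

```python
# Expected residues at the reported Ca2+-binding positions (1-based)
# of the 12-residue EF loop, as listed in the text.
BINDING_SITES = {
    "EF-hand 1-3": {1: "D", 3: "D", 5: "S", 12: "E"},
    "EF-hand 4": {1: "D", 3: "D", 5: "D", 12: "E"},
}


def altered_binding_positions(loop, pattern):
    """Return the 1-based binding positions where `loop` deviates from `pattern`."""
    if len(loop) != 12:
        raise ValueError("expected a 12-residue EF loop")
    return [pos for pos, residue in sorted(pattern.items())
            if loop[pos - 1] != residue]


# Made-up loops for illustration only (not taken from any CPK in the study).
intact_loop = "DKDGSGTIDYEE"
altered_loop = "DKNGSGTIDYEQ"

print(altered_binding_positions(intact_loop, BINDING_SITES["EF-hand 1-3"]))   # → []
print(altered_binding_positions(altered_loop, BINDING_SITES["EF-hand 1-3"]))  # → [3, 12]
```

A loop with deviations at these positions would be a candidate for reduced Ca2+ binding, in line with the hedged interpretation above; confirming any actual change in binding capacity would require experimental evidence.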

Identification and Characteristics of CPKs in Asteraceae Species
Genome-wide identification of the CPK family has been conducted in various higher plants [28][29][30][31][32][36][37][38][39]. A total of 34, 34, 40, 34 and 30 novel CPKs and 10, 11, 20, 13 and 7 duplicated gene pairs were identified in Tks, L. sativa, H. annuus, C. nankingense and C. cardunculus, respectively (Table 2, Tables S1-S5). Four of the species have a similar number of CPKs to Arabidopsis and rice. The exception is H. annuus, in which the CPK number is significantly expanded. This may be ascribed to the more complex evolutionary history of sunflower, which experienced a lineage-specific whole-genome duplication (WGD) event around 29 million years ago [14]. Among the five Asteraceae species in this study, this additional WGD event has been identified only in H. annuus and not in the other four species [14][15][16][17][18].
Gene structure analysis of CPKs showed that the first exon in most CPKs was the longest one, followed by several shorter exons. The exon number differed among the four groups: CPKs in group IV had more but shorter exons than those in groups I-III. The exon-intron patterns were similar between CPKs belonging to the same evolutionary group (Table 1, Figure 5). Further, duplicated CPK gene pairs had highly conserved exon-intron patterns, which may also contribute to functional similarity and/or redundancy between these duplicated genes.
The CPK sequences among all higher plants are highly conserved, particularly in the protein kinase domain, the auto-inhibitory domain and the four EF-hand domains [40]. The CPKs of five Asteraceae species in this study are also highly conserved. Gene structure and motif distribution analyses of CPKs showed that the members in the same group shared similar distribution patterns of exon-introns and motifs ( Figure 5 and Figures S1-S3).
Moreover, the Ka/Ks ratio among paralogs in five Asteraceae species demonstrated that evolutionary pressure for these sequences was maintained as most Ka/Ks ratios are less than 1 (Figure 2), indicating that these CPKs are under purifying selection.

Phylogenetic Analysis and Group-Specific Expansion of CPKs in Asteraceae Species
The CPKs in group IV appeared to diverge from the common ancestor with algae; group III formed a clade separate from groups I and II, while the split between groups I and II appeared to be the most recent evolutionary event. The phylogenetic tree of CPK members in the eleven species revealed that group IV has the longest main branch, followed by group III, and that the branch of group II is the shortest (Figure 1). Furthermore, the exon-intron numbers and distribution in group IV also differed from those of the other three groups (Table 1, Figure 5 and Figures S1-S3), supporting the hypothesis that group IV CPKs form a separate clade of an earlier lineage [32].
The CPK gene family has expanded greatly, from four genes in the land plant ancestor and fewer than 11 genes in green algae to approximately 30-40 members among angiosperms. Our phylogenetic analysis provides insights into the evolutionary relationship and group-specific expansion of CPKs from Asteraceae and rosids. CPKs in group II and group III are significantly expanded in rosids and Asteraceae, respectively (Figure 4). Gene duplication contributes to the expansion of gene families. In these five Asteraceae species, the number of paralogs in group III was much higher than in the other groups, indicating that gene duplication was the main reason for the expansion of the group III CPKs.
AtCPK10 and AtCPK30 play a central role in regulating primary nitrate responses and in controlling primary transcription, as shown by RNA sequencing [41]. This suggests that TkCPK25/TkCPK33, located on the same branch of the phylogenetic tree as AtCPK10 and AtCPK30, might also take part in primary transcription regulation. In addition, TkCPK4/TkCPK19 and TkCPK4/TkCPK32 may participate in drought stress regulation, since the orthologous gene AtCPK8 functions in ABA-mediated stomatal movement in response to drought stress through the regulation of catalase 3 [42]. AtCPK24 negatively regulates pollen tube growth by inhibiting inward K+ currents [43], indicating that TkCPK9, TkCPK15 and TkCPK16 might also be involved in pollen tube development. Notably, within the five Asteraceae species, more group III members were observed in the three rubber-producing plants (12 for Tks, 11 for H. annuus and 11 for L. sativa) than in the non-rubber species (Figure 4). Among rubber-producing plants, the HbCPKs also show a slight expansion in group III (nine members) compared with the other groups [32], indicating that group III CPKs might have potential roles in NR biosynthesis.
Previous research investigated the sequence degeneration of group III CPK Ca2+-binding sites, showing that five AtCPKs (CPK7, 8, 10, 13 and 32) have lower or no calcium sensitivity [44]. All of these weakly calcium-sensitive CPKs carry one or two altered EF-hand motifs, suggesting that degeneration of the EF-hand motifs can greatly influence calcium loading. AtCPK13, a member of group III, inhibits stomatal opening under light-induced conditions [45], indicating that its orthologs TkCPK8/TkCPK21 may also be involved in a similar pathway. The expanded group III CPKs from Asteraceae were less conserved in the "D-X-D-X-S" region of the EF-hands than those of the other three groups (framed in Figure 6). Unlike EF-hands 1 to 3, EF-hand 4 of the Asteraceae group III CPKs still showed high conservation, implying its importance for the Ca2+-binding capacity of CPKs.

Identification and Characteristics of CPK Members in Five Representative Asteraceae Species
The protein sequences of CPKs from Arabidopsis and rice served as query sequences for a local BLASTP search to identify CPK members in Tks, L. sativa, H. annuus, C. nankingense and C. cardunculus (e-value < 1 × 10^-5).
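As an illustration of the e-value cutoff described above, the following sketch filters BLASTP hits below 1 × 10^-5. It is not the authors' pipeline. It assumes the search results were saved in the standard 12-column BLAST tabular format (-outfmt 6), in which the subject ID is the second column and the e-value is the eleventh; the function name and all sequence identifiers in the example are made up.

```python
import csv
import io

# Column positions in the standard 12-column BLAST tabular output (-outfmt 6).
SUBJECT_ID_COL = 1
EVALUE_COL = 10


def filter_blast_hits(tabular_text, max_evalue=1e-5):
    """Return sorted subject IDs with at least one hit whose e-value is below max_evalue."""
    candidates = set()
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        if float(row[EVALUE_COL]) < max_evalue:
            candidates.add(row[SUBJECT_ID_COL])
    return sorted(candidates)


# Made-up hits for illustration; "geneA" and "geneB" are hypothetical IDs.
example = "\n".join([
    "\t".join(["AtCPK1", "geneA", "78.5", "480", "100", "2", "1", "480", "5", "484", "1e-150", "520"]),
    "\t".join(["AtCPK1", "geneB", "35.0", "60", "39", "0", "10", "70", "3", "62", "0.002", "40.1"]),
    "\t".join(["OsCPK2", "geneA", "75.1", "470", "115", "1", "3", "472", "8", "477", "3e-140", "495"]),
])

print(filter_blast_hits(example))  # → ['geneA']
```

Candidates that pass such a filter would normally still need domain-level confirmation before being counted as CPK family members.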

Duplication Event and Syntenic Analysis
Paralogs of CPKs from the five Asteraceae species were determined by multiple sequence alignment with an amino acid identity > 80%. The Ka/Ks ratios for these CPK paralogs were calculated to evaluate the selection pressure; a ratio >1, <1 or =1 indicates positive selection, negative (purifying) selection or neutral evolution, respectively. The Ka/Ks ratios of these paralogs were calculated using DnaSP 4.0 software [47]. Tajima relative rate tests were performed in MEGA 7.0 using the amino acid sequences of the duplicated CPK pairs [48]. The results of the local BLASTp program (with an E-value setting of 1 × 10^-10) and the sorted GFF profiles (with four columns: chromosome name, gene name, gene start position and gene end position) were then submitted to the MCScan program to identify the syntenic relationships of paralogs and/or orthologs of CPKs among the six species [49]. The Circos program [50] was used to visualize the syntenic results.
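The Ka/Ks interpretation rule stated above can be written as a small helper. This is a sketch of the classification step only, assuming Ka and Ks have already been estimated by a tool such as DnaSP; the function name, the tolerance used to decide that a ratio equals 1 and the example values are assumptions made for illustration.

```python
import math


def classify_selection(ka, ks, rel_tol=1e-9):
    """Classify a duplicated gene pair from its Ka and Ks estimates.

    Ka/Ks > 1 indicates positive selection, < 1 purifying (negative)
    selection, and = 1 neutral evolution.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form a Ka/Ks ratio")
    ratio = ka / ks
    if math.isclose(ratio, 1.0, rel_tol=rel_tol):
        return ratio, "neutral"
    if ratio > 1.0:
        return ratio, "positive"
    return ratio, "purifying"


# Made-up Ka and Ks values, not taken from Tables S1-S5.
for pair, ka, ks in [("pairA", 0.05, 0.50), ("pairB", 0.30, 0.20)]:
    ratio, label = classify_selection(ka, ks)
    print(f"{pair}: Ka/Ks = {ratio:.2f} ({label} selection)")
```

In practice an exact ratio of 1 is rare, so the tolerance mainly guards against floating-point noise; deciding whether a ratio differs significantly from 1 would need a statistical test that this sketch does not attempt.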

Gene Structure and Motif Distribution Analysis
The gene structures of CPK members from ten species were constructed using TBtools JRE1.6 [51] based on the genomic sequence and coding DNA sequences corresponding to each predicted gene. The conserved motifs for all CPK protein sequences and conserved CPK sequence motif logos of five Asteraceae species were detected by Multiple Expectation Maximization for Motif Elicitation (MEME) online tools (http://meme.sdsc.edu/meme/intro.html).

Conclusions
In summary, our study provides a comprehensive and systematic evolutionary analysis of CPK members in five representative Asteraceae species. For the representative rubber-producing Asteraceae plant Tks, duplicated gene pairs were under purifying selection pressure and two TkCPK duplication pairs had an accelerated evolutionary rate. By comparing the CPK numbers in the four groups, we found that CPKs in group II and group III were significantly expanded in rosids and rubber-producing Asteraceae plants, respectively, indicating potential functional divergence of the expanded CPKs in these lineages. Further gene structure and motif distribution analyses in group III revealed that the exon-intron structures and motif distributions were similar and conserved. Detailed analysis of the conserved motif logos revealed that CPKs in group III of the Asteraceae species have lower amino acid conservation in EF-hands 1 to 3, indicating that they might have lower calcium-binding ability than the other three groups. Our data provide a systematic evolutionary investigation of the CPKs in five representative Asteraceae species, suggesting that the subfamily-specific expansion of CPKs might be related to natural rubber production.
Supplementary Materials: The following are available online at http://www.mdpi.com/2223-7747/9/1/32/s1, Figure S1: Gene structure and conserved motif distribution of CPKs from group I., Figure S2: Gene structure and conserved motif distribution of CPKs from group II., Figure S3: Gene structure and conserved motif distribution of CPKs from group IV., Table S1: The Ka/Ks ratios for duplicated CPK genes in T. koksaghyz,

Conflicts of Interest:
The authors declare no conflict of interest.