Complex Molecular Evolution and Expression of Expansin Gene Families in Three Basic Diploid Species of Brassica

Expansins are a kind of structural proteins of the plant cell wall, and they enlarge cells by loosening the cell walls. Therefore, expansins are involved in many growth and development processes. The complete genomic sequences of Brassica rapa, Brassica oleracea and Brassica nigra provide effective platforms for researchers to study expansin genes, and can be compared with analogues in Arabidopsis thaliana. This study identified and characterized expansin families in B. rapa, B. oleracea, and B. nigra. Through the comparative analysis of phylogeny, gene structure, and physicochemical properties, the expansin families were divided into four subfamilies, and then their expansion patterns and evolution details were explored accordingly. Results showed that after the three species underwent independent evolution following their separation from A. thaliana, the expansin families in the three species had increased similarities but fewer divergences. By searching divergences of promoters and coding sequences, significant positive correlations were revealed among orthologs in A. thaliana and the three basic species. Subsequently, differential expressions indicated extensive functional divergences in the expansin families of the three species, especially in reproductive development. Hence, these results support the molecular evolution of basic Brassica species, potential functions of these genes, and genetic improvement of related crops.


Introduction
The cell wall is one of the main differences between plant cells and animal cells, and it is a highly dynamic and complex network, which is closely related to many processes of plant growth and development. The size and shape of various plant organs are precisely controlled through regional cell division, and subsequent cell expansion during plant growth [1]. While inhibiting the expansion of protoplasts, the cell wall undergoes stress relaxation due to various factors. The cells are always in a dynamic equilibrium of protoplast expansion and cell wall restraint, and undergo irreversible expansion [2]. The enlargement of the cell walls begins with the stress relaxation of the walls, which enables the cells to absorb water and physically enlarge. Expansins are a class of nonenzymatic  [27] The expansin gene family is a large family with four subfamilies, namely, EXPA, EXPB, EXLA, and EXLB. To clarify which subfamily these expansin genes belong to, we employed MEGA 6.0 to construct a phylogenetic tree with the maximum likelihood algorithm using the entire expansin protein sequences of A. thaliana, B. rapa, B. oleracea, and B. nigra. Given that the expansin genes of A. thaliana have already been classified, we were able to classify the expansin genes according to the clustering exhibited on the phylogenetic tree. The number of genes in each subfamily is shown in Table 1. On the basis of the nomenclature rules proposed by a previous study [6], we named the expansin genes in B. rapa, B. oleracea, and B. nigra using their genome locations and the subfamily to which they belonged (Table S2).
We compared the number of expansin genes in the three basic diploid species of Brassica to other available genomes and found that EXPA and EXPB subfamilies accounted for the largest proportion of the entire family and appeared first in the evolutionary process of plants ( Figure 1). The EXPB subfamilies in monocotyledonous plants were also larger than the dicotyledonous plants, whereas those of the three basic species of Brassica were similar in size to A. thaliana and larger than other dicotyledons currently known. Although B. rapa, B. oleracea and B. nigra had more expansin genes than A. thaliana, none of them had tripled that of A. thaliana. This result indicated that approximately 50% of BrEXPs, BolEXPs, and BniEXPs had been deleted during the evolution process after WGT (Table 1).
Int. J. Mol. Sci. 2020, 21, x FOR PEER REVIEW 4 of 22 subfamilies in monocotyledonous plants were also larger than the dicotyledonous plants, whereas those of the three basic species of Brassica were similar in size to A. thaliana and larger than other dicotyledons currently known. Although B. rapa, B. oleracea and B. nigra had more expansin genes than A. thaliana, none of them had tripled that of A. thaliana. This result indicated that approximately 50% of BrEXPs, BolEXPs, and BniEXPs had been deleted during the evolution process after WGT (Table 1).  Table 1.
A detailed characterization of the these expansin genes, including gene names, gene IDs, positions, predicted amino acids, isoelectric points (pIs), molecular weights (Mw), and signal peptide positions, is provided in Table S2. The isolated BrEXPs encode proteins ranging from 170 to 360 amino acids in size with an average length of 259 amino acids and a molecular weight ranging from 17.9 to 40.9 kD, except BrEXPA20 (Br016981), which contained not only DPBB_1 and Pollen_allege_1 but also two different macro domains, namely, PF01661 and PF01283. The 58 and 60 expansins in B. oleracea and B. nigra were 141-342 and 190-372 amino acids long, with molecular weights ranging from 14.9 kD to 38.5 kD and 19.9 kD to 42.2 kD, respectively. Most BrEXPs, BolEXPs, and BniEXPs contained signal peptides with a length of 17 to 30 amino acids, except for 7 BrEXPs, 8 BolEXPs, and 2 BniEXPs. The pI values ranged from 4.52 to 10.74 in A. thaliana and three basic species of Brassica. These characteristics of the EXLB subfamily in the four species were very conserved. Among them, the average of pI values was less than 7.0 in the EXLB subfamily and above 7.0 in other subfamilies. Meanwhile, the lengths of the signal peptides and proteins of the EXLB subfamily were the same, except for BolEXLB1, which was 251 amino acids (Table S2).  Table 1. A detailed characterization of the these expansin genes, including gene names, gene IDs, positions, predicted amino acids, isoelectric points (pIs), molecular weights (Mw), and signal peptide positions, is provided in Table S2. The isolated BrEXPs encode proteins ranging from 170 to 360 amino acids in size with an average length of 259 amino acids and a molecular weight ranging from 17.9 to 40.9 kD, except BrEXPA20 (Br016981), which contained not only DPBB_1 and Pollen_allege_1 but also two different macro domains, namely, PF01661 and PF01283. The 58 and 60 expansins in B. oleracea and B. nigra were 141-342 and 190-372 amino acids long, with molecular weights ranging from 14.9 kD to 38.5 kD and 19.9 kD to 42.2 kD, respectively. Most BrEXPs, BolEXPs, and BniEXPs contained signal peptides with a length of 17 to 30 amino acids, except for 7 BrEXPs, 8 BolEXPs, and 2 BniEXPs. The pI values ranged from 4.52 to 10.74 in A. thaliana and three basic species of Brassica. These characteristics of the EXLB subfamily in the four species were very conserved. Among them, the average of pI values was less than 7.0 in the EXLB subfamily and above 7.0 in other subfamilies. Meanwhile, the lengths of the signal peptides and proteins of the EXLB subfamily were the same, except for BolEXLB1, which was 251 amino acids (Table S2).

Phylogenetic and Structural Analyses of BrEXPs, BolEXPs, and BniEXPs
The phylogenetic tree of expansin genes in the three basic diploid species of Brassica and A. thaliana was constructed based on their deduced amino acid sequences from the multiple sequence alignment.

Phylogenetic and Structural Analyses of BrEXPs, BolEXPs, and BniEXPs
The phylogenetic tree of expansin genes in the three basic diploid species of Brassica and A. thaliana was constructed based on their deduced amino acid sequences from the multiple sequence alignment. Phylogenetic analysis showed that members of four subfamilies were clustered together according to their evolutionary relationships ( Figure 2). According to the grouping methods used for A. thaliana and rice expansin gene families [5] and phylogenetic tree clustering, we divided expansin genes of the four species into EXPA with 12 subgroups, EXPB with 2 subgroups, and EXLA and EXLB with 1 subgroup. These subfamilies were sorted in Roman alphabetical order. AtEXPA19 was classified as EXPA-XI, which showed a different concept from the previous definition of EXPA-XI as a specific subgroup in rice [5]. Among these subgroups, EXPA-IV was the largest clade with 38 EXPAs, which was consistent with the research in tobacco [26], whereas EXPA-VIII was the smallest clade and contained only four EXPAs (Figure 3).
To obtain additional insights into the possible gene structural evolution and provide valuable information concerning duplication events when interpreting phylogenetic relationships within gene families, the exon-intron structures of expansin family genes in A. thaliana, B. rapa, B. oleracea and B. nigra were analyzed ( Figure 3C). The number of introns varied from 0 to 5, except for BrEXPA20. Basically, expansin genes in the same subfamily had similar exon-intron structures. In accordance with ancient exon-intron pattern, the introns of the expansin genes were named A, C, B, F, H, and I [5,28]. Most expansin genes in the EXPA subfamily contained two introns, A or B, or only one of A and B. However, most members of the EXPB subfamily belonged to two exon-intron structural patterns, namely, A-B-F and A-C-B. In particular, the members of the EXLA subfamily contained both A and B introns, whereas those of the EXLB subfamily had A, C, and F introns ( Figure 4). A previous study reported that most of the genes encoding EXLA and EXLB proteins belong to the exon-intron According to the grouping methods used for A. thaliana and rice expansin gene families [5] and phylogenetic tree clustering, we divided expansin genes of the four species into EXPA with 12 subgroups, EXPB with 2 subgroups, and EXLA and EXLB with 1 subgroup. These subfamilies were sorted in Roman alphabetical order. AtEXPA19 was classified as EXPA-XI, which showed a different concept from the previous definition of EXPA-XI as a specific subgroup in rice [5]. Among these subgroups, EXPA-IV was the largest clade with 38 EXPAs, which was consistent with the research in tobacco [26], whereas EXPA-VIII was the smallest clade and contained only four EXPAs (Figure 3).
To obtain additional insights into the possible gene structural evolution and provide valuable information concerning duplication events when interpreting phylogenetic relationships within gene families, the exon-intron structures of expansin family genes in A. thaliana, B. rapa, B. oleracea and B. nigra were analyzed ( Figure 3C). The number of introns varied from 0 to 5, except for BrEXPA20. Basically, expansin genes in the same subfamily had similar exon-intron structures. In accordance with ancient exon-intron pattern, the introns of the expansin genes were named A, C, B, F, H, and I [5,28]. Most expansin genes in the EXPA subfamily contained two introns, A or B, or only one of A and B. However, most members of the EXPB subfamily belonged to two exon-intron structural patterns, namely, A-B-F and A-C-B. In particular, the members of the EXLA subfamily contained both A and B introns, whereas those of the EXLB subfamily had A, C, and F introns ( Figure 4). A previous study reported that most of the genes encoding EXLA and EXLB proteins belong to the exon-intron structure pattern of A-C-B-F, which was inconsistent with our conclusion [5]. This result showed that unique exon-intron patterns were present in the EXLA and EXLB subfamilies of the four species. structure pattern of A-C-B-F, which was inconsistent with our conclusion [5]. This result showed that unique exon-intron patterns were present in the EXLA and EXLB subfamilies of the four species.

Chromosomal Distribution, Duplication Mechanism and Retained Proportion in Three Basic Species of Brassica
To explore the underlying duplication mechanism accounting for the expansion of the expansin gene family, we explored the genomic distributions of the expansin genes in three basic species. Except for 2 BrEXPs, 14 BolEXPs, and 5 BniEXPs, 54, 44, and 55 expansin genes were located on the chromosomes of B. rapa, B. oleracea, and B. nigra, respectively. Except for the C01 chromosome of B. oleracea, every other chromosome had uneven distribution of expansin genes ( Figure S1). In B. rapa, B. oleracea, and B. nigra genomes, 12.5%, 22.4%, and 15% expansin genes were formed by tandem duplication and distributed in 3, 5, and 4 tandem arrays of 2-5 genes, respectively. However, this ratio was up to 27.8% in A. thaliana. ( Figure S1). Using A. thaliana as a reference, we also traced these orthologous expansin gene pairs between A. thaliana and Brassica species to detect their evolutionary history. From the analysis of the orthologous regions for comparative analysis, we obtained 49, 29, and 39 sets of orthologous genes between A. thaliana and B. rapa, A. thaliana and B. oleracea, and A. thaliana and B. nigra, as shown in Figure 5. In addition, we performed collinearity analyses of expansin genes among the three Brassica species and their individual genomes. We acquired 67 homologous gene pairs between B. rapa and B. oleracea, 88 between B. rapa and B. nigra and 56 between B. oleracea and B. nigra ( Figure 5). These results indicated that segmental duplications played important roles in expansin gene family expansion in Brassica genomes.
Further analyses on expansin genes' retention in Brassica species after WGT were performed. On the one hand, EXPA subfamily genes occupied the major part of the entire family in each basic species of Brassica ( Figure 6A). On the other hand, the retention rates of the EXPA and EXLB subfamilies in three species significantly exceeded the overall rate of the corresponding family, whereas those of the EXLA subfamilies were lower ( Figure 6B).

Chromosomal Distribution, Duplication Mechanism and Retained Proportion in Three Basic Species of Brassica
To explore the underlying duplication mechanism accounting for the expansion of the expansin gene family, we explored the genomic distributions of the expansin genes in three basic species. Except for 2 BrEXPs, 14 BolEXPs, and 5 BniEXPs, 54, 44, and 55 expansin genes were located on the chromosomes of B. rapa, B. oleracea, and B. nigra, respectively. Except for the C01 chromosome of B. oleracea, every other chromosome had uneven distribution of expansin genes ( Figure S1). In B. rapa, B. oleracea, and B. nigra genomes, 12.5%, 22.4%, and 15% expansin genes were formed by tandem duplication and distributed in 3, 5, and 4 tandem arrays of 2-5 genes, respectively. However, this ratio was up to 27.8% in A. thaliana. ( Figure S1). Using A. thaliana as a reference, we also traced these orthologous expansin gene pairs between A. thaliana and Brassica species to detect their evolutionary history. From the analysis of the orthologous regions for comparative analysis, we obtained 49, 29, and 39 sets of orthologous genes between A. thaliana and B. rapa, A. thaliana and B. oleracea, and A. thaliana and B. nigra, as shown in Figure 5. In addition, we performed collinearity analyses of expansin genes among the three Brassica species and their individual genomes. We acquired 67 homologous gene pairs between B. rapa and B. oleracea, 88 between B. rapa and B. nigra and 56 between B. oleracea and B. nigra ( Figure 5). These results indicated that segmental duplications played important roles in expansin gene family expansion in Brassica genomes.
Further analyses on expansin genes' retention in Brassica species after WGT were performed. On the one hand, EXPA subfamily genes occupied the major part of the entire family in each basic species of Brassica ( Figure 6A). On the other hand, the retention rates of the EXPA and EXLB subfamilies in three species significantly exceeded the overall rate of the corresponding family, whereas those of the EXLA subfamilies were lower ( Figure 6B).

Coding Sequence and Promoter Evolution Analyses of Expansin Genes in the Three Species of Brassica
To understand the evolution of coding sequences of different branch members of expansin families in the three species of Brassica, we calculated the ω ratios (the number of nonsynonymous substitutions per nonsynonymous site (Ka)/the number of synonymous substitutions per synonymous site (Ks)) for different subfamilies of expansin genes with A. thaliana as the reference ( Table 2). For BrEXPs, BolEXPs, and BniEXPs, the LRT p-values between one and free-ratio models were higher than 0.05. This result indicated that no significant differences were among ω values in different expansin subfamilies. Their ω ratios were also between 0 and 1, indicating that they were undergoing purification selection. EXLB subfamilies also had the smallest ω values, suggesting that their sequences were the most conserved, which was also consistent with the results of previous physical and chemical analyses. Afterward, we calculated the ω ratio of each subgroup in the three species and found that only a few subgroups (EXPA-IV and EXPB-II for B. rapa, EXPA-IV for B. oleracea and EXPA-XI for B. nigra) had significant differences (Table S5).

Coding Sequence and Promoter Evolution Analyses of Expansin Genes in the Three Species of Brassica
To understand the evolution of coding sequences of different branch members of expansin families in the three species of Brassica, we calculated the ω ratios (the number of nonsynonymous substitutions per nonsynonymous site (Ka)/the number of synonymous substitutions per synonymous site (Ks)) for different subfamilies of expansin genes with A. thaliana as the reference ( Table 2). For BrEXPs, BolEXPs, and BniEXPs, the LRT p-values between one and free-ratio models were higher than 0.05. This result indicated that no significant differences were among ω values in different expansin subfamilies. Their ω ratios were also between 0 and 1, indicating that they were undergoing purification selection. EXLB subfamilies also had the smallest ω values, suggesting that their sequences were the most conserved, which was also consistent with the results of previous physical and chemical analyses. Afterward, we calculated the ω ratio of each subgroup in the three species and found that only a few subgroups (EXPA-IV and EXPB-II for B. rapa, EXPA-IV for B. oleracea and EXPA-XI for B. nigra) had significant differences (Table S5).

Coding Sequence and Promoter Evolution Analyses of Expansin Genes in the Three Species of Brassica
To understand the evolution of coding sequences of different branch members of expansin families in the three species of Brassica, we calculated the ω ratios (the number of nonsynonymous substitutions per nonsynonymous site (Ka)/the number of synonymous substitutions per synonymous site (Ks)) for different subfamilies of expansin genes with A. thaliana as the reference (Table 2). For BrEXPs, BolEXPs, and BniEXPs, the LRT p-values between one and free-ratio models were higher than 0.05. This result indicated that no significant differences were among ω values in different expansin subfamilies.
Their ω ratios were also between 0 and 1, indicating that they were undergoing purification selection. EXLB subfamilies also had the smallest ω values, suggesting that their sequences were the most conserved, which was also consistent with the results of previous physical and chemical analyses. Afterward, we calculated the ω ratio of each subgroup in the three species and found that only a few subgroups (EXPA-IV and EXPB-II for B. rapa, EXPA-IV for B. oleracea and EXPA-XI for B. nigra) had significant differences (Table S5). The Ka and Ks substitution rates and Ka/Ks ratios for orthologous gene pairs were used to detect the evolutionary selection pattern of expansin genes among the four species (Figure 7, Table S3). According to distributions, several "abnormal" expansins were observed in several subfamilies, such as BrEXPA33 and BrEXPA13, which showed Ka and Ks of up to 0.126 and 1.070, respectively. Without considering abnormal values, in EXPA subfamilies, the orthologous Ka and Ka/Ks values between B. rapa and B. oleracea, and between B. rapa and B. nigra were significantly different, but there was no significant difference between B. oleracea and B. nigra. In other subfamilies, the differences among the three species were not significant, either. The sequences of BrEXPAs exhibited the largest sequence changes, whereas BniEXPAs were highly conserved, with mean Ka and Ka/Ks values as low as 0.048 and 0.114, respectively (Table 3). For the Ks values of the expansin gene pairs in the three species, the discrepancies among species or subfamilies were not significant.
The promoter divergence of the orthologs were determined by SharMot. At the species level, we found that the distribution of expansin gene pairs in the three species increased with the growth of d SM values and the peak value of d SM ranged from 0.8 to 1.0, indicating that the degree of promoter divergence was considerably high ( Figure 8A). At the subfamily level, EXPA subfamilies had the least distribution between 0 and 0.2, and it was evenly distributed in the remaining intervals. However, the three other subfamilies were mainly distributed between 0.6 and 1.0, especially for EXLA and EXLB subfamilies ( Figure 8B-D, Table 3), demonstrating that the promoters of the two subfamilies showed a higher overall divergence during the evolutionary process. For the relatively limited number of paralogs, we found that the distribution of d SM values was relatively uniform, both at the species and family level. However, among the three species, the Ka/Ks values of the EXPA subfamilies were the smallest among the subfamilies, indicating that their coding sequences were the most conservative ( Figures S2 and S3, Tables S4 and S6).   To explore whether a correlation exists between the promoter and coding sequence evolution of the expansin family members, we performed a multiple regression analysis involving Ka, Ks, Ka/Ks, and dSM. We observed weak but extremely significant correlations between dSM and Ka/Ks (dSM vs. (Ka/Ks), rs ≤ 0.42, p ≤ 0.0004; multiple regression formula: dSM~Ka+Ks+(Ka/Ks), Pearson correlation) between A. thaliana and B. rapa, A. thaliana and B. oleracea, and A. thaliana and B. nigra (Figure 9). This result indicated that when the promoter sequence of an expansin gene had a large variation, its corresponding coding sequence was also less conservative. However, no significant correlations were detected between dSM and Ka/Ks in the paralogs of B. rapa, B. oleracea, and B. nigra (multiple regression formula: dSM~Ka + Ks + (Ka/Ks)) ( Figures S2 and S3).

Expression Patterns of Expansin Family Members in Various Tissues and Organs of B. rapa, B. nigra, and B. oleracea
RNA-seq data were used to investigate the tissue-specific expression profile of BrEXPs in six tissues (root, stem, leaf, inflorescences, silique, and callus). However, the lack of RNA-Seq data for nine expansin genes (BrEXPB2/6/7, BrEXPA4/9/10/16/38/39) indicated that these genes may express only at specific developmental stages (e.g., BrEXPA9 can be detected at the pollen development stage) or under special conditions ( Figure 10A). Simultaneously, the expression patterns of 58 BolEXPs and 60 BniEXPs were studied by quantitative reverse transcription polymerase chain reaction (qRT-PCR) in five major tissues, including roots, stems, leaves, inflorescences and siliques. A total of 58 BolEXPs and 59 BniEXPs can be detected in at least one tissue (except BniEXPA21) ( Figure 10B,C). Through the analysis of relevant data, the expression levels of expansin genes were related to various tissues,  To explore whether a correlation exists between the promoter and coding sequence evolution of the expansin family members, we performed a multiple regression analysis involving Ka, Ks, Ka/Ks, and d SM . We observed weak but extremely significant correlations between d SM and Ka/Ks (d SM vs. (Ka/Ks), r s ≤ 0.42, p ≤ 0.0004; multiple regression formula: d SM~K a+Ks+(Ka/Ks), Pearson correlation) between A. thaliana and B. rapa, A. thaliana and B. oleracea, and A. thaliana and B. nigra (Figure 9). This result indicated that when the promoter sequence of an expansin gene had a large variation, its corresponding coding sequence was also less conservative. However, no significant correlations were detected between d SM and Ka/Ks in the paralogs of B. rapa, B. oleracea, and B. nigra (multiple regression formula: d SM~K a + Ks + (Ka/Ks)) ( Figures S2 and S3).

Expression Patterns of Expansin Family Members in Various
Tissues and Organs of B. rapa, B. nigra, and B. oleracea RNA-seq data were used to investigate the tissue-specific expression profile of BrEXPs in six tissues (root, stem, leaf, inflorescences, silique, and callus). However, the lack of RNA-Seq data for nine expansin genes (BrEXPB2/6/7, BrEXPA4/9/10/16/38/39) indicated that these genes may express only at specific developmental stages (e.g., BrEXPA9 can be detected at the pollen development stage) or under special conditions ( Figure 10A). Simultaneously, the expression patterns of 58 BolEXPs and 60 BniEXPs were studied by quantitative reverse transcription polymerase chain reaction (qRT-PCR) in five major tissues, including roots, stems, leaves, inflorescences and siliques. A total of 58 BolEXPs and 59 BniEXPs can be detected in at least one tissue (except BniEXPA21) ( Figure 10B,C). Through the analysis of relevant data, the expression levels of expansin genes were related to various tissues, and the expression pattern of each expansin gene was different. In B. rapa, 35.7% of expansins were constitutively expressed in all six tissues. Meanwhile, due to the high precision of qRT-PCR, 73.2% and 85% of expansin genes were expressed in these five tissues in B. oleracea and B. nigra, but the level of expression varied. According to these results, we speculated that expansin genes play important roles in multiple development stages of B. rapa, B. oleracea, and B. nigra. Most of expansin genes appeared to be predominantly expressed in certain tissues. We found that 50% of BrEXPs, 55.4% of BolEXPs, and 58.3% of BniEXPs had peak expressions in a specific tissue. This result demonstrated that expansin genes had diverged, and they relaxed cell walls in corresponding organs.  73.2% and 85% of expansin genes were expressed in these five tissues in B. oleracea and B. nigra, but the level of expression varied. According to these results, we speculated that expansin genes play important roles in multiple development stages of B. rapa, B. oleracea, and B. nigra. Most of expansin genes appeared to be predominantly expressed in certain tissues. We found that 50% of BrEXPs, 55.4% of BolEXPs, and 58.3% of BniEXPs had peak expressions in a specific tissue. This result demonstrated that expansin genes had diverged, and they relaxed cell walls in corresponding organs.        Seventeen BrEXPs, 24 BolEXPs, and 31 BniEXPs had high expression levels in inflorescences and were used to study potential roles in the reproductive development of B. rapa, B. oleracea, and B. nigra ( Figure 10A-C). Therefore, we used our previous RNA-Seq data of 'Bcajh97-01A/B' GMS line of B. rapa [40]and qRT-PCR method for B. oleracea and B. nigra to analyze their potential roles in pollen development. Through the study of BolEXP and BniEXP expression in five different stages of floral buds (pollen mother cell, tetrad, uninucleate pollen, binucleate pollen, and trinucleate pollen), we found that except for high expression levels of BolEXPA7/10/27/29/31/38, BolEXPB12, and BniEXPA20/43/46 in mature pollen, other expansin genes were more uniformly expressed at different stages of pollen development ( Figure 10E,F). For B. rapa, BrEXPA8 was highly expressed at all stages, whereas BrEXPA30 and BrEXPB4/9 were prominent at the mature pollen stage ( Figure 10D).
When comparing the expression levels of expansin genes in five pollen development stages of the fertile line 'Bcajh97-01B' and the sterile line 'Bcajh97-01A', we found that the expression levels of BrEXPA1/22/23 and BrEXLB2 in the sterile line were lower than those in the fertile line during the tetrad period, suggesting that these genes had crucial roles in early anther development. Seventeen BrEXPs, 24 BolEXPs, and 31 BniEXPs had high expression levels in inflorescences and were used to study potential roles in the reproductive development of B. rapa, B. oleracea, and B. nigra ( Figure 10A-C). Therefore, we used our previous RNA-Seq data of 'Bcajh97-01A/B' GMS line of B. rapa [40] and qRT-PCR method for B. oleracea and B. nigra to analyze their potential roles in pollen development. Through the study of BolEXP and BniEXP expression in five different stages of floral buds (pollen mother cell, tetrad, uninucleate pollen, binucleate pollen, and trinucleate pollen), we found that except for high expression levels of BolEXPA7/10/27/29/31/38, BolEXPB12, and BniEXPA20/43/46 in mature pollen, other expansin genes were more uniformly expressed at different stages of pollen development ( Figure 10E,F). For B. rapa, BrEXPA8 was highly expressed at all stages, whereas BrEXPA30 and BrEXPB4/9 were prominent at the mature pollen stage ( Figure 10D).
When comparing the expression levels of expansin genes in five pollen development stages of the fertile line 'Bcajh97-01B' and the sterile line 'Bcajh97-01A', we found that the expression levels of BrEXPA1/22/23 and BrEXLB2 in the sterile line were lower than those in the fertile line during the tetrad period, suggesting that these genes had crucial roles in early anther development. Simultaneously, nine genes were expressed in all five stages, implying their continued roles in pollen development. Particularly, BrEXPA8 was highly expressed during these periods, but the difference between fertile and sterile lines was not significant. However, BrEXPB4/9 and BrEXPA14/22 had high expression levels in the mature pollen stage of the fertile line but were comparatively low or even not expressed in the sterile line. Generally, genes that were highly expressed in mature pollen could continue to be expressed during pollination and fertilizations ( Figure 11A). At 0, 1, 3, and 10 h after pollination, the expression levels of BrEXPs in pistil were observed. Particularly, for two copies of AtEXPB5, BrEXPB4 and BrEXPB9, the expression level of BrEXPB9 in the mature pollen stage and pollination and fertilization was higher than that of BrEXPB4. Hence, BrEXPB4 was in the process of gradual degradation ( Figure 11B). Simultaneously, nine genes were expressed in all five stages, implying their continued roles in pollen development. Particularly, BrEXPA8 was highly expressed during these periods, but the difference between fertile and sterile lines was not significant. However, BrEXPB4/9 and BrEXPA14/22 had high expression levels in the mature pollen stage of the fertile line but were comparatively low or even not expressed in the sterile line. Generally, genes that were highly expressed in mature pollen could continue to be expressed during pollination and fertilizations ( Figure 11A). At 0, 1, 3, and 10 h after pollination, the expression levels of BrEXPs in pistil were observed. Particularly, for two copies of AtEXPB5, BrEXPB4 and BrEXPB9, the expression level of BrEXPB9 in the mature pollen stage and pollination and fertilization was higher than that of BrEXPB4. Hence, BrEXPB4 was in the process of gradual degradation ( Figure 11B).

Scale of Expansin Families after WGT in B. rapa, B. oleracea, and B. nigra
The ancestor of diploid Brassica species and A. thaliana lineages diverged approximately 20 MYA. Subsequently, a WGT event occurred in the Brassica ancestor approximately 16 MYA. As the WGT of the Brassica ancestor, expansin genes in the A. thaliana genome might have triplicated orthologous copies in B. rapa, B. oleracea and B. nigra [30]. However, the total number of expansins in the three basic species of Brassica did not triple that of A. thaliana suggesting that expansin genes experienced loss and functional differentiation after WGT. The rapid loss of paralogs in genome-wide duplication might be due to the imbalance issue of gene dosage [41][42][43].
Recent research studies assumed that 60-90% of angiosperms underwent duplication events [44,45]. The size of a genome was expanded by duplication and provided a hotbed for the emergence Figure 11. Expression profile analyses of representative expansin genes by RNA-Seq data in five development stages of pollen in 'Bcajh97-01A/B' (A) and pollinated pistils at 0, 1, 3, and 10 h after pollination (B). PMC pollen mother cell, Tds tetrads, UC uninuclear pollen, BC binucleate pollen, TC trinucleate pollen.

Scale of Expansin Families after WGT in B. rapa, B. oleracea, and B. nigra
The ancestor of diploid Brassica species and A. thaliana lineages diverged approximately 20 MYA. Subsequently, a WGT event occurred in the Brassica ancestor approximately 16 MYA. As the WGT of the Brassica ancestor, expansin genes in the A. thaliana genome might have triplicated orthologous copies in B. rapa, B. oleracea and B. nigra [30]. However, the total number of expansins in the three basic species of Brassica did not triple that of A. thaliana suggesting that expansin genes experienced loss and functional differentiation after WGT. The rapid loss of paralogs in genome-wide duplication might be due to the imbalance issue of gene dosage [41][42][43].
Recent research studies assumed that 60-90% of angiosperms underwent duplication events [44,45]. The size of a genome was expanded by duplication and provided a hotbed for the emergence of new gene functions. In duplication, several genes are likely to be retained as duplicates because of the need for certain functions in a particular organism [46]. These genes can follow one of three functional outcomes, including gene loss, neo-functionalization, and sub-functionalization [47]. Generally, the emergence of these new gene functions helped plants in adapting to the changing environment. Hence, they would not become extinct.
Of the three principal evolutionary patterns, segmental duplication, tandem duplication, and transposition [48], the expansion of expansin families in B. rapa and B. nigra seemed to be achieved mainly by segmental duplication (62.5% in B. rapa, 50.0% in B. nigra). Meanwhile, tandem duplication seemed to play only a minor role. Approximately 12.5% of expansin genes in B. rapa, 22.4% in B. oleracea and 15.0% in B. nigra were distributed in tandem arrays. However, in B. oleracea, the proportions of the segmental and tandem duplications were similar. According to a previous research, expansin families in different species showed species-specific expansion patterns. For instance, segmental duplication seemed to be the predominant form of expansin families in most dicots, such as A. thaliana, B. rapa, B. nigra, and soybean [25]. Inversely, tandem duplication is popular in monocots, such as rice and Triticum aestivum [5,35]. Intriguingly, for the expansin family, we observed that both segmental and tandem duplication played even roles in B. oleracea.

Conservative and Large Family Size of EXLB in Three Basic Species of Brassica
The EXLB subfamily is much larger in soybean than in A. thaliana and rice [25]. By comparing the size of expansin families in different species, we found that the size of EXLB subfamilies in monocotyledons was the smallest, accounting for 0%-1.79%, and the proportion of cruciferous plants was in the range of 2.78%-5.36%. Meanwhile, the EXLB subfamilies of Solanaceae were larger, reaching 10.5%-16.67% (Figure 1, Table 3). The different scales of the EXLB subfamilies suggested that they changed with the evolution of different families, but divergences among species in the same family were small. For the EXLB subfamilies in A. thaliana and three basic species of Brassica, we found that the pI values, length of proteins, and signal peptides were conserved, and they had the smallest ω values, suggesting that their sequences were the most conserved. The larger scales of EXLB subfamilies in the three species of Brassica might reflect adaptations to certain functions or environments. Hence, EXLB members have special functions in growth and development; for example, recent studies provided evidence to establish their functions in responding to abiotic stresses [15,49] and improving phosphorus acquisition [50].

Promoter Divergence is Closely Related to the Coding Sequence Evolution of Expansin Genes in Three Basic Species of Brassica
Although promoters can control gene expression at appropriate times, places, and levels and are related to phenotypic evolution [51], little is known about their own evolutionary divergences and their relationship to the evolution of coding sequences of specific gene families. Therefore, we applied SharMot [52], which effectively quantified differences between promoters by finding regions of high local similarity but did not consider order, direction, or spacing, to resolve whether or not the divergences of the promoters are correlated with the evolutionary rates of the coding sequences [53][54][55]. The method was also used to reveal the formation of pseudogenes in A. thaliana [56], similar trends in the upstream regions to the coding sequences of H3K27me3 [57], and relationship between promoters and protein sequences of the PG family in B. rapa [58].
In the present study, SharMot was used to explore evolutionary dynamics between upstream 500 bp sequences and sequences encoding the expansin genes of the orthologs and paralogs among A. thaliana and the three species of Brassica. We detected weak but significant positive correlations between d SM and Ka/Ks in orthologs of the three Brassica species (Figure 9). On the basis of the Ka/Ks results, expansin genes in the three species were under purification selection. According to a previous study [52], if Ka/Ks reflected the action of purifying selection, then a correlation in promoter and protein divergence indicated that the selective consequences of a deleterious mutation in either the promoter or protein coding sequence of a given gene were similar. However, this correlation was not found in paralogs. This result may be related to the selection of B. rapa, B. oleracea, and B. nigra after the common ancestor of Brassica diverged, leading to the departure of the promoter evolutionary pathway of the expansin genes to some extent. The distribution of d SM values also supported differences among four subfamilies (Figure 8).

Speculative Roles of Expansin Genes during Reproductive Development
EXPB genes are particularly numerous and abundantly expressed in grass, but the size of EXPB subfamilies is small in dicotyledons [59]. In the present study, we also found similar conclusion by comparing the size of expansin families of different species (Figure 1, Table 3). EXPB genes were previously thought to be members of pollen allergens and had cell wall-relaxing activity in herbaceous plants [59,60]. This phenomenon is explained by the fact that scales of EXPB subfamilies were much larger in monocots than in dicots. A recent study showed that Zea m 1 (EXPB1 of maize) and orthologous group-1 pollen allergens in other grasses were abundant in pollen and they could promote pollen tubes to enter the stigma and style by softening the maternal cell walls. The decreased expression levels of the EXPBs in maize can cause pollen accumulation, resulting in the poor dispersal of pollen when the anther is cracked, and pollen tubes cannot enter the corn whisker easily [61]. This result was also validated in a maize transgenic line that silenced all members of the EXPB subfamily [62]. However, we do not believe that the EXPB subfamily plays no role or play a minor role in the dicotyledon development. A tobacco EXPB subfamily gene, PPAL, was specifically expressed in the stigma secretion area and epidermal layer of the placenta, and expression levels increased during pollination and fertilization, indicating that it participated in the penetration of the pollen tube into the stigma [63]. The analysis of the expression profile of A. thaliana also showed that AtEXPA4 and AtEXPB5 are strongly expressed in dried pollen grains, swollen pollen, and pollen tubes. However, further research on their biological functions is needed [64].
Transcript abundance at specific times and organs is an important factor in characterizing the corresponding proteins required for subsequent developmental processes [65]. Our analysis of expansin gene expression in B. rapa, B. oleracea, and B. nigra indicated that they played indispensable roles in pollen development. This process was essential for plant reproduction, and significant implications for the selection of excellent vegetable varieties were observed. By analyzing the expression data, we found that the expression levels of 30.4% of BrEXPs (17 of 56), 44.8% of BolEXPs (26 of 58) and 51.7% of BniEXPs (31 of 60) in inflorescence were higher than those in other organs, suggesting the important roles of expansin genes in reproductive development ( Figure 10D-F). Furthermore, we explored the potential roles of expansin genes during pollen development through the specific expression profiles of these BrEXPs at different flower bud stages and periods after pollination. Among them, two expansin genes (BrEXPB4 and BrEXPB9) possessed similar expression patterns and showed significant differences in the fertile and sterile lines ( Figure 11A). These expansin genes were also homologous genes of AtEXPB5, which maintained high expression level in pollen grains, suggesting that they had analogical biological functions in pollen development. Meanwhile, the expression level of BrEXPB4 was lower than that of BrEXPB9, which demonstrated that BrEXPB4 is in a stage of degradation. However, another A. thaliana gene, AtEXPA4, which was highly expressed in pollen grains, had two orthologous genes in B. rapa, namely BrEXPA19 and BrEXPA22, which showed different expression patterns ( Figure 11B). This result indicates that since the divergence of A. thaliana and B. rapa, orthologous genes gradually evolved different functions under the pressure of selection. Nonetheless, evidence in expression data alone was insufficient to clarify their functions during reproductive development. Hence, additional works are needed to provide evidence for this view.

Identification and Physicochemical Property Predictions of Expansin Family Members in Three Basic Diploid Species of Brassica
To identify all expansin gene family members in B. rapa, B. oleracea, and B. nigra, we performed a three-step analysis. First, the 36 protein sequences of the A. thaliana expansin family [66] were downloaded from the TAIR website (http://www.arabidopsis.org/) and compared with the sequences of B. rapa (genome v1.5), B. oleracea (genome v1.1), and B. nigra (genome v1.1) (BRAD, http://brassicadb. org/brad/) to obtain the first round of the candidates. Second, Pfam (http://www.sanger.ac.uk/Software/ Pfam/), Simple Modular Architecture Research Tool (SMART, http://smart.embl-heidelberg.de/smart/ batch.pl), and CDD/SPARCLE (https://www.ncbi.nlm.nih. gov/Structure/bwrpsb/bwrpsb.cgi) were used to confirm whether each predicted sequence was an expansin family member, sharing DPBB_1 (PF03330) and Pollen_allerg_1 (PF01357). Finally, the DNA sequences of candidates with missing domains and abnormal lengths, together with their 5 kb flanking regions of both sides, were analyzed and reannotated using FGENESH (http://linux1.softberry.com/berry.phtml?topic=fgenesh&group= programs&subgroup=gfind). The expansin genes of the three basic diploid species were nominated according to A. thaliana [6] and followed the order in which they appeared on chromosomes.
The computation of the pI values and molecular weights were carried through the Compute pI/Mw tool in ExPASy (http://web.expasy.org/compute_pi/), and signal peptide sequences were analyzed by SignalP 5.0 Server (http://www.cbs.dtu.dk/services/SignalP/).

Phylogenetic Genetic Tree Construction and Structural Analysis
A data file containing all the information from target genes (including the locations on the chromosomes, genomic sequences, coding sequences, protein sequences, and 1500 bp of the nucleotide sequences upstream of the translation initiation codon) was extracted from the files of B. rapa, B. oleracea, and B. nigra, which were downloaded from BRAD using the tools of TBtools [67].
Multiple sequence alignment was analyzed by MUSCLE in MEGA 6.0 software. According to amino acid sequence alignment, a phylogenetic tree was constructed by the maximum likelihood algorithm containing the protein sequences of AtEXPs and Br/Bol/BniEXPs. WAG + G (the WAG model + the "Gamma Distributed" option) was selected as the most suitable model through the model selection tool of MEGA 6.0. Other parameters were also set to the bootstrap replications of 1000 and "partial deletion" option. Finally, the visualization and beautification of the phylogenetic tree was accomplished by iTOL online tool (https://itol.embl.de/).
The composition and position of exons and introns for expansin genes of the four species were obtained by comparing the predicted coding sequences with their corresponding DNA sequences using TBtools. The online program MEME (http://meme-suite.org/) was used to detect eight conserved motifs contained for each gene with optimum motif width of ≥6 and ≤200 [68]. Furthermore, their conserved domains and relative positions on sequences were obtained from Pfam (http://www.sanger.ac.uk/ Software/Pfam/). Lastly, the above results were visualized by TBtools.

Chromosome Location, Synteny and Retained Rate Analysis
The chromosomal locations of expansins in the three species of Brassica were retrieved from the genome data and mapped to chromosomes by TBtools with the Amazing Gene Location program.
For synteny analysis between A. thaliana and the three species, we used the Quick MCScanX Wrapper program of TBtools, which can be used in the Windows environment without any command line. Moreover, the results were visualized by the Amazing Super Circos tool.
Retention rates were calculated using the term "locus" instead of "gene" according to a previous study, which eliminated the disturbance of tandem duplication after WGT [33].

Evolution Analysis of Coding Sequences and Promoters
To understand the evolutionary dynamics of the expansin gene coding sequences, we performed comparisons from two dimensions, that is, horizontal and vertical. On the one hand, ω ratios (Ka/Ks), which represented selective pressure for different subfamilies in the four species were calculated separately using branch models of CODEML in PAML [69,70], and p-values were obtained from the R Project. Afterward, the ω ratios of every subgroup in these species were calculated using the branch models of EasyCodeML, which was a wrapper tool that provided a user-friendly graphical interface for using CODEML [71]. Two a priori assumptions were considered: the one-ratio model assumed that all branches had the same ω-parameter, whereas the free-ratio model allowed different branch values. To verify which model was best for the data, we applied likelihood ratio tests by comparing twice the difference in log-likelihood values between the pairs of models using a χ 2 distribution. On the other hand, the Simple Ka/Ks Calculator program in TBtools was used to estimate Ks, Ka, and Ka/Ks ratios between homologous expansin gene pairs.
The d SM scores were used to evaluate the divergence between the upstream sequences of each homologous gene pair, which were obtained by SharMot software. By definition, d SM is the fraction of both sequences that do not contain a region of significant local similarity by specific criteria. For example, a d SM of 0 indicated a complete sharing of motifs between the sequences, whereas a d SM of 1.0 indicated the absence of shared motifs [52]. According to previous studies, we chose to analyze upstream 500 bp sequences from the start of translation, and determined that the optimal minimum length of a conservative upstream sequence was 18 bp [57,58,72]. SPSS 23.0 was used to calculate descriptive and inferential statistics and GraphPad Prism 5.01 was used to display the statistical results.

Expression Analysis of Expansin Family Members
The RNA-seq data of B. rapa were obtained from expression profile database (https://www.ebi. ac.uk/gxa/plant/experiments) and used to analyze the expression levels of the expansin genes in six tissues, including root, stem, leaf, inflorescences, silique, and callus [73].
However, given that the RNA-Seq data of B. nigra and B. oleracea are not available at present, the expression levels of expansin genes were measured using qRT-PCR. B. nigra (B. nigra cv. 1411-02) and B. oleracea (B. oleracea cv. Jingfengyihao) were kindly provided by Dr. Xiaolin Yu (Institute of Vegetable Science, Zhejiang University, Hangzhou, China) and grown in a 22 ± 1 • C growth chamber under long-day conditions (16 h light/8 h dark). The total RNA was extracted from five tissues including roots, stems, leaves, inflorescences, and siliques, and five different flower bud stages. To ensure the specificities, we designed the primers of each BniEXP and BolEXP by Primer Premier 5.0, and UBC10 genes in B. nigra and B. oleracea were chosen as reference genes (Table S1) [58,75]. qRT-PCR was performed as described by [76]. For each sample, three biological replicates were conducted with three technical replicates, and the results were calculated using the 2 −∆∆Ct method [77].

Conclusions
Through the analyses and discussion of the above results, we can draw the following four conclusions. Firstly, through the construction of phylogenetic tree of the three basic species of Brassica and A. thaliana, we identified 56, 58 and 60 expansin genes in B. rapa, B. oleracea and B. nigra, and they were divided into four subfamilies. Each subfamily was divided into different subgroups, and there were 16 subgroups in total. Proper classification of gene family members facilitates exploration of evolution, expression, and function. Secondly, based on the analyses of the physicochemical properties, gene structures and conserved domains of expansin genes in the three basic Brassica species, and combined with previous studies, a brief model of an expansin family including typical elements of each subfamily was established (Figure 4). In particular, we noticed that EXLB subfamilies in B. rapa, B. oleracea and B. nigra were more conservative than other subfamilies. Further synteny analyses showed that the expansion of expansin families in these species was mainly due to segmental duplication rather than tandem duplication. Thirdly, the results of evolutionary rates in different clusters indicated that expansin families of the three basic species of Brassica were in the state of purification selection. In addition, by exploring the relationship between the promoter sequences and coding sequences of orthologous gene pairs, it showed that there were significant but weak positive correlations between upstream regulatory sequences and coding sequences of orthologs. Finally, studies on the expression patterns of B. rapa, B. oleracea, and B. nigra demonstrated that expansin genes had indispensable roles in most processes of plant growth and development, especially in reproductive development. Because EXPB subfamily proteins are abundant in the pollen of monocotyledons, they have received widespread attention in reproductive development, especially in the development of pollen. Although the scale of EXPB subfamilies was reduced in dicotyledons, similar gene functions may still be retained, such as BrEXPB4 and BrEXPB9, whereas their specific biological functions should be further explored.
Supplementary Materials: Supplementary materials can be found at http://www.mdpi.com/1422-0067/21/10/3424/s1. Table S1. Primers used in qRT-PCR analysis of expansin genes in B. nigra and B. oleracea. Table S2. Identification and physicochemical property predictions of expansin genes in A. thaliana, B. rapa, B. oleracea, and B. nigra and nomenclature for the three basic species of Brassica. Table S3. Evolutionary rate analyses of orthologous expansin genes and upstream 500 bp promoters between A. thaliana and B. rapa, A. thaliana and B. oleracea, and A. thaliana and B. nigra. Table S4. Evolutionary rate analyses of paralogous expansin genes and upstream 500 bp promoters in B. rapa, B. oleracea, and B. nigra, respectively. Table S5. Detection of selection for expansin genes using branch model of EasyCodeML in different subgroups. Table S6. The mean evolutionary rates (Ka, Ks and Ka/Ks) and promoter divergences (d SM ) of different subfamilies' paralogous expansin genes in B. rapa, B. oleracea, and B. nigra. Figure S1. Genomic distribution of expansin genes on chromosomes of A. thaliana, B. rapa, B. oleracea, and B. nigra. Figure S2

Conflicts of Interest:
The authors declare no conflict of interest.