Transcriptome-Wide Analysis of SAMe Superfamily to Novelty Phosphoethanolamine N-Methyltransferase Copy in Lonicera japonica

The S-adenosyl-l-methionine-dependent methyltransferase superfamily plays important roles in plant development. The buds of Lonicera japonica are used as a Chinese medicinal material and in foods; Chinese people began domesticating L. japonica thousands of years ago. Compared to the wild species, L. japonica var. chinensis, L. japonica gives a higher yield of buds, a fact closely related to positive selection over the long cultivation period of the species. Genome duplications, which are always detected in the domestic species, are the source of the multifaceted roles of the functional gene. In this paper, we investigated the evolution of the SAMe genes in L. japonica and L. japonica var. chinensis and further analyzed the roles of the duplicated genes among special groups. The SAMe protein sequences were subdivided into three clusters and several subgroups. The difference in transcriptional levels of the duplicated genes showed that seven SAMe genes could be related to the differences between the wild and the domesticated varieties. The sequence diversity of seven SAMe genes was also analyzed, and the results showed that different gene expression levels between the varieties could not be related to amino acid variation. The transcriptional level of duplicated PEAMT could be regulated through the SAM–SAH cycle.

sunflowers [13], as well as other species. However, few studies have detailed the mechanisms through which duplications in a protein superfamily produce special characteristics in domesticated species.
In this study, we investigated the evolution of SAMe genes in 21 species, including L. japonica and L. japonica var. chinensis. We further analyzed the roles that gene duplicates in special groups played during domestication, and we suggest that differences in the transcript levels of gene duplicates are related to variations in amino acid sequence or to the SAM–SAH cycle.

Global Phylogeny of SAMe Proteins
Using SUPERFAMILY, InterPro, and BlastP, as well as information from public genome databases and our own transcriptome databases of L. japonica, we gathered 2354 non-redundant sequences that encode SAMe proteins from 21 different species (Table S1), representing a diverse taxonomic background. The results show that SAMe proteins are widely distributed among bacteria, fungi, animals, and plants. Among the 2354 sequences, 288 putative SAMe protein sequences were identified in Selaginella moellendorffii, compared with only 6 SAMe proteins in Escherichia coli (Table 1).
We classified all SAMe protein sequences into three clusters (Figure S1). Some 48% of them are in cluster I. Those from the gymnospermae species are almost all found in cluster III; only a few sequences from Pinus taeda and Pseudotsuga menziesii appear in cluster I. Pteridophyta, algae, monocotyledoneae, and dicotyledoneae species appear in all three clusters.

Expression of SAMe Genes in L. japonica Flowers
In order to further study the functional fate of the duplicated SAMe genes, we analyzed the transcript levels of the SAMe genes based on L. japonica transcriptome data and real-time PCR. Except for subgroup II-2, the reads per kilobase per million mapped reads (RPKM) of SAMe genes in subgroups with redundant copies (Table S2) was greater in L. japonica than in L. japonica var. chinensis. In subgroup II-11, the total RPKM of the SAMe genes was 7.62-fold greater in L. japonica than in L. japonica var. chinensis.
After duplication, both copies continue functioning when natural selection favors duplicated protein function or expression, or when mutations make them functionally distinct before one copy is silenced [15]. In a soybean RNA-seq study, approximately 50% of paralogs were differentially expressed and thus had undergone expression sub-functionalization [16]. The RPKM values of the paralogs FLJSAMT37 and FLJSAMT132 (subgroup I-1) in the buds of L. japonica were 52.42 and 12.81, respectively. These differentially expressed duplicated genes in L. japonica could have undergone expression sub-functionalization or neo-functionalization.
We also analyzed the difference in gene expression between buds and flower 1 of L. japonica. Buds have white or red petals that have not yet opened into a full-sized flower, whereas flower 1 has white petals that have opened into a full-sized flower. Because flower yield is stable between the bud and flower 1 stages, differences in transcript levels between buds and flower 1 should not be related to flower yield. Among the total of eight SAMe genes, seven, including FLJSAMT37, had different transcriptional levels between L. japonica and L. japonica var. chinensis but not between buds and flower 1 (Table 3), suggesting that these genes could be related to the difference in flower yield between the wild and the domesticated varieties.
We further validated some of the above-mentioned SAMe genes as representatives using qRT-PCR, and the results are consistent with the RNA-seq data (Table S3).
Subgroup labels (see Figure S1): II-2, subgroup 2 in the second cluster; II-8, subgroup 8 in the second cluster; II-11, subgroup 11 in the second cluster; I-1, subgroup 1 in the first cluster.

Phosphoethanolamine N-Methyltransferase (PEAMT) in L. japonica Domestication
In plants, SAMe occurs as small superfamilies with defined roles for each of its members in flower development. Arabidopsis histone methyltransferase is crucial for both sporophyte and gametophyte development [17]. The transcript pattern showed that Arabidopsis O-methyltransferase is related to the developmental changes in flowers [9]. Acyltransferase was shown to be specifically expressed in anther tapetum cells in the early stages of flower development [18]. The in vitro substrate specificity and the in vivo RNAi-mediated suppression data of the corresponding gene suggest a role of this cation-dependent CCoAOMT-like protein in the stamen/pollen development of A. thaliana [19].
Because of a higher transcript level and greater number of copies in L. japonica than in L. japonica var. chinensis, a phosphoethanolamine N-methyltransferase (PEAMT, FLJSAMT37, Table S5) was selected to determine whether or not the duplicated gene in L. japonica domestication affects flower development. PEAMT has a central role in phosphatidylcholine biosynthesis via the methylation pathway [7]. Studies have shown that the synthesis of phosphatidylcholine is affected by the plant growth regulator indole-3-acetic acid [20], suggesting that phosphatidylcholine has a fundamental function in plant growth and development.
Phosphatidylcholine is also the immediate precursor of many phospholipids [21] and a substrate of phospholipase D, which catalyzes the hydrolysis of phospholipids in the cell membrane into phosphatidic acid and free polar head groups [22]. Increased expression of phospholipase D, which hydrolyzes membrane lipids to generate phosphatidic acid and associated lipid changes, promotes root growth, flowering, and stress avoidance [23]. A phospholipase A1 catalyzes the initial step of jasmonic acid biosynthesis, which synchronizes pollen maturation, anther dehiscence, and flower opening in Arabidopsis [24]. PEAMT could play a role in cell division and in the inflorescence meristem. Inhibition of PEAMT biosynthesis led to necrotic lesions in leaves, multiple inflorescences, sterility in the flower, and early flowering in short-day conditions [25]. Conversely, an increase in endogenous phosphocholine content during plant development increases root meristem size, cell division, and cell elongation in Arabidopsis [26]. Two phosphatidylinositol/phosphatidylcholine transfer protein genes are predominantly transcribed during the development of the male gametic cells and/or the fertilization process [27]. Thus, we suggest that the duplicated PEAMT in L. japonica domestication could affect inflorescence development and flower yield.
Flower development has two phases: (1) the transition from the vegetative to the flower-producing phase and (2) flower morphogenesis. We selected sequences of FVE, FCA, APETALA, PISTILLATA, AGAMOUS, and SEPALLATA from Arabidopsis thaliana and obtained sequences of their orthologs from L. japonica and L. japonica var. chinensis using BlastX, Pfam, and InterPro analysis. FVE and FCA act in the single-phase transition between the vegetative and the flower-producing phases [28]. APETALA and PISTILLATA control the formation of petals and stamens during Arabidopsis flower development [29]. APETALA and AGAMOUS-like act redundantly to control the identity of the floral meristem [30]. The SEPALLATA subfamily also plays a crucial role in the development of all types of floral organs [31]. The transcript levels of FVE and FCA in buds of L. japonica were lower than those in L. japonica var. chinensis, whereas those of APETALA, PISTILLATA, AGAMOUS, and SEPALLATA were greater, suggesting stronger floral meristem activity and morphogenesis in L. japonica than in its wild variety (Table 2). This is consistent with the results of PEAMT expression.

Sequence Diversity of SAMe between L. japonica and Its Wild Variety
In order to investigate the reason for the differential expression of SAMe in L. japonica and its wild variety, the sequence diversity of SAMe proteins from subgroups I-1, I-12, II-2, II-8, and II-11 was analyzed. Consensus contigs developed from L. japonica served as the basis for alignment to detect single-nucleotide polymorphisms (SNPs). The reads of the individual sequences, realigned to the consensus contigs, enabled detection of 57 SNPs and 5 indels in the SAMe of L. japonica and L. japonica var. chinensis, based on a total uniquely aligned read number > 20 and a contingency test p-value < 0.01. Only 13 amino acid residues differ between the SAMe proteins of L. japonica and L. japonica var. chinensis, whereas the novel PEAMT has neither an SNP nor an indel, and no change in its amino acid residues was seen (Table 3).
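The two thresholds above (more than 20 uniquely aligned reads and a contingency test p-value below 0.01) can be illustrated with a minimal sketch. The text does not say which contingency test was used or how the table was built, so the sketch assumes a two-sided Fisher's exact test on a 2 x 2 table of reference and alternative allele read counts in the two varieties, and assumes the read threshold applies to the sum over both varieties. The function names are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, total)


def is_candidate_variant(ref_a, alt_a, ref_b, alt_b, min_reads=20, alpha=0.01):
    """Apply the read-depth and p-value thresholds quoted in the text.

    Assumption: rows are the two varieties, columns are reference and
    alternative allele read counts; the source does not specify this layout.
    """
    total_reads = ref_a + alt_a + ref_b + alt_b
    if total_reads <= min_reads:
        return False
    return fisher_exact_two_sided(ref_a, alt_a, ref_b, alt_b) < alpha
```

A position where all 15 reads of one variety carry the reference allele and all 15 reads of the other carry the alternative allele passes both thresholds, whereas a position with identical allele counts in both varieties does not.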

SAM-SAH Cycle Regulates PEAMT Activity
Phosphocholine is synthesized by three successive S-adenosyl-Met (SAM)-dependent N-methylations of the phospho-base phosphoethanolamine [22]. This pattern is presumably due to the very active re-synthesis of SAM from ATP and Met made possible by recycling adenosine and homo-Cys derived from SAH [32]. Poulton and Butt [33] suggested that the ratio of SAM to SAH could regulate caffeic acid O-methyltransferase activity in the leaves of sugar beet (Beta vulgaris). The expression and activities of two enzymes, adenosine kinase (ADK) and S-adenosylhomocysteine hydrolase (SAHH), are both required for the maintenance and recycling of S-adenosylmethionine-dependent methylation in plants [34,35]. SAHH and ADK co-localize in the cytosol with the phospho-base N-methyltransferase activities in spinach [33]. Higher transcript levels of SAHH and ADK were also found in buds of L. japonica than in those of L. japonica var. chinensis (Table 3). SAHH and ADK were found to accumulate in a similar pattern and were also found at high levels in inflorescence meristems, likely to support their higher rates of cell division [36]. Greater amounts of PEAMT transcript were observed in buds of L. japonica than in its wild variety, a finding consistent with the abundance of ADK and SAHH observed in these samples. These results indicate a positive correlation among the transcript levels of PEAMT, ADK, and SAHH, reflecting their respective contributions to methyl metabolism.

Plant Material
Buds and leaves of six plants each of L. japonica and L. japonica var. chinensis were sampled in May 2012. The plants were 5 years old and grown in the field in the Linyi planting garden, Yate Co, Shandong, China. The bud samples had similar morphology and had not yet opened into full-sized flowers.

SAMe Classification
We searched the adenosyl-L-methionine-dependent methyltransferase sequences of 21 species (Table S1) using the SUPERFAMILY [4] and InterPro databases. The species include one animal, one bacterium, two fungi, two algae, three gymnospermae, two pteridophyta, seven dicotyledoneae, and three monocotyledoneae. The L. japonica database was derived from five normalized transcriptome libraries. Flower samples (corollas or all petals) were randomly collected from five independent 3-year-old FLJ and rFLJ plants in the Doudian plantation (Beijing, China) to construct the transcriptome libraries.
A total of 16,723 plant adenosyl-L-methionine-dependent methyltransferases (SAMe) were extracted from the NCBI non-redundant protein database, the SCOP SUPERFAMILY database [37], and the UniProt database [38]. We compared all retrieved sequences against these plant SAMe sequences using BlastP [39] with an e-value cut-off below 1e-15 to determine the SAMe proteins from the best reciprocal hits. The resultant ESTs were processed with Perl scripts to remove any repeated sequences.

SAMe Annotation
Domain and motif analyses were performed by InterPro [40] and Pfam [41]. The protein sequence similarities of SAMe were analyzed by DNAMAN. All the L. japonica and L. japonica var. chinensis SAMe sequences were submitted to COG [42] to cluster the SAMe orthologous groups with a p-value cut-off below 1e-5.

SAMe Phylogeny
We used the SAMe sequences to construct neighbor-joining trees using MEGA 5.0 [43] and ClustalW2 [44], respectively, each with 1000 bootstrap replicates. Furthermore, we reconciled the preliminary trees, retaining only branches with bootstrap values greater than 50%, to yield a more credible consensus tree.

Orthologs and Paralogs
To identify orthologs, we performed an all-against-all sequence comparison using BLAST with an e-value cut-off below 1e-20. The orthologs were then determined based on the best reciprocal hits [45]. We implemented a more stringent criterion: the alignment length percentage against the longer protein had to be above 80%.
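The ortholog criteria above (e-value below 1e-20, best reciprocal hits, and an alignment covering more than 80% of the longer protein) can be sketched as follows. This is an illustration under stated assumptions, not the pipeline actually used: hits are assumed to be pre-parsed tuples of (query, subject, e-value, bit score, alignment length), the "best" hit is assumed to be the one with the highest bit score, and the hit list is assumed to contain only comparisons between the two varieties or species of interest. All names are hypothetical.

```python
def best_hits(hits, lengths, max_evalue=1e-20, min_coverage=0.8):
    """Return the best-scoring subject for each query after filtering.

    hits: iterable of (query, subject, evalue, bitscore, aln_len) tuples
    lengths: dict mapping each protein ID to its length in residues
    """
    best = {}
    for query, subject, evalue, bitscore, aln_len in hits:
        if query == subject:
            continue  # ignore self-hits from the all-against-all search
        if evalue >= max_evalue:
            continue  # e-value must be below the cut-off
        longer = max(lengths[query], lengths[subject])
        if aln_len / longer <= min_coverage:
            continue  # alignment must cover more than 80% of the longer protein
        # Assumption: "best" means highest bit score among surviving hits.
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(hits, lengths, max_evalue=1e-20, min_coverage=0.8):
    """Return sorted pairs (a, b) where a and b are each other's best hit."""
    best = best_hits(hits, lengths, max_evalue, min_coverage)
    pairs = set()
    for query, subject in best.items():
        if best.get(subject) == query:
            pairs.add(tuple(sorted((query, subject))))
    return sorted(pairs)
```

Applying the coverage filter before choosing the best hit means a short, high-scoring local match cannot displace a full-length match; applying it afterwards would be an equally defensible reading of the text.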

Gene Expression Analyses and Experimental Validation
The gene expression profiling of L. japonica flowers was performed in a previous study [8]. The expression level was normalized by the total number of mapped reads and the contig length, similar to the RPKM method [46]. The RPKM value for each transcript was calculated as the number of reads per kilobase of the transcript sequence per million mapped reads [47].
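As a minimal sketch of the RPKM normalization described above, the calculation can be written as a short function. The function and parameter names are hypothetical; the fold-change helper simply takes the ratio of two RPKM values, as used when comparing paralogs or varieties.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    # reads / (length in kb) / (mapped reads in millions)
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def fold_change(rpkm_a, rpkm_b):
    """Ratio of two RPKM values; returns None when the denominator is zero."""
    if rpkm_b == 0:
        return None
    return rpkm_a / rpkm_b
```

For instance, the RPKM values reported for FLJSAMT37 and FLJSAMT132 in L. japonica buds (52.42 and 12.81) correspond to a ratio of about 4.1.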
Individual RNA samples extracted from the buds of six plants each of L. japonica and L. japonica var. chinensis were used to produce cDNAs for qRT-PCR, including control reactions without reverse transcriptase. The PrimeScript 1st Strand cDNA Synthesis Kit from Takara (Tokyo, Japan) was used according to the manufacturer's instructions. Gene-specific primers were designed using Primer 3 [48]. The primers are shown in Table S4. The amplifications were carried out with a 1 min incubation at 95 °C followed by 35 cycles of 95 °C for 15 s, 57-60 °C for 30 s, and 68 °C for 30 s. The lengths of the PCR products ranged from 100 to 250 bp. FLJ18S was chosen as an endogenous control for studying gene expression in the bud samples of L. japonica and L. japonica var. chinensis. The specificity of amplification was assessed by melting curve analysis, and the relative abundance of genes was determined using the comparative Ct method as suggested in ABI 7500 Software v2.0.1 (ABI, California, CA, USA).
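The comparative Ct calculation can be sketched as below. This is the standard 2^(-ddCt) form, which assumes roughly 100% amplification efficiency for both the target gene and the endogenous control (FLJ18S here); the instrument software may apply additional corrections that are not reproduced. The function names and the choice of the wild variety as calibrator are assumptions made for illustration.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Mean Ct of the target gene minus mean Ct of the endogenous control."""
    return mean(target_cts) - mean(reference_cts)


def relative_abundance(sample_target, sample_ref, calibrator_target, calibrator_ref):
    """Fold difference of a sample relative to a calibrator (2^-ddCt).

    Each argument is a list of Ct values from replicate reactions.
    Assumption: both amplicons amplify with about 100% efficiency.
    """
    ddct = delta_ct(sample_target, sample_ref) - delta_ct(
        calibrator_target, calibrator_ref
    )
    return 2.0 ** (-ddct)
```

With this convention, a target gene whose control-normalized Ct is two cycles lower in L. japonica than in the calibrator variety is reported as four-fold more abundant.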

SNP Identification, Validation and Sequences Diversity
Reads of L. japonica var. chinensis in the transcriptomes were mapped to FLJSAMes nucleotide sequences by BWA [49], and homozygous FLJSAMes SNPs were called with SAMtools [50]. Homozygous rFLJSAMes SNPs were also called by mapping the L. japonica reads to the FLJSAMes nucleotide sequences. The nucleotide sequences of FLJSAMes and rFLJSAMes were compared against each other with BLAST to find the best hits, and matching positions showing the same variants were counted as candidate SNPs. SNPs at positions where more than two reads showed genotypic variants, or with flanking SNPs within 60 bp, were removed. The filters we used to identify the variants we could consider true SNPs were as follows: the minimum coverage of the position was eight reads, and the minimum average quality of the bases was 20.
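The coverage, quality, and spacing filters can be sketched as follows. The record layout and function name are hypothetical, and two readings of the text are assumed: that the "within 60 bp" criterion removes every candidate that has another candidate SNP within 60 bp on the same contig, and that the coverage and quality filters are applied before the spacing check. The source does not state the order, and the opposite order would remove more candidates.

```python
from collections import defaultdict


def filter_snps(candidates, min_coverage=8, min_mean_quality=20, min_spacing=60):
    """Keep candidate SNPs that pass the coverage, quality, and spacing filters.

    candidates: list of dicts with keys "contig", "pos", "coverage",
    and "mean_quality" (a hypothetical record layout).
    """
    # Step 1: minimum coverage of eight reads and mean base quality of 20.
    passed = [
        snp
        for snp in candidates
        if snp["coverage"] >= min_coverage and snp["mean_quality"] >= min_mean_quality
    ]

    # Step 2: group by contig so that spacing is only compared within a contig.
    by_contig = defaultdict(list)
    for snp in passed:
        by_contig[snp["contig"]].append(snp)

    kept = []
    for contig in sorted(by_contig):
        snps = sorted(by_contig[contig], key=lambda s: s["pos"])
        for i, snp in enumerate(snps):
            near_prev = i > 0 and snp["pos"] - snps[i - 1]["pos"] <= min_spacing
            near_next = (
                i + 1 < len(snps) and snps[i + 1]["pos"] - snp["pos"] <= min_spacing
            )
            # Assumption: a SNP with any neighbor within 60 bp is discarded.
            if not (near_prev or near_next):
                kept.append(snp)
    return kept
```

The homozygosity filter (positions with variant genotypes in more than two reads) needs per-read genotype information and is not covered by this sketch.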

Conclusions
All SAMe protein sequences were classified into three clusters and several subgroups. In almost all subgroups, L. japonica and L. japonica var. chinensis have the same number of SAMe gene copies, whereas copy numbers in L. japonica are higher than in L. japonica var. chinensis in subgroups I-1, I-12, II-2, II-8, and II-11. The difference in transcriptional levels of the duplicated genes showed that seven SAMe genes could be related to the differences between the wild and the domesticated varieties. The sequence diversity of seven SAMe genes showed that the different expression levels between varieties could be related to variations in the amino acid sequences. However, in the case of PEAMT, which had neither SNPs/indels nor changes in amino acid residues, the transcript levels could be related to ADK and SAHH, reflecting their respective contributions to methyl metabolism.

Acknowledgments
The project was funded by the Natural Science Foundation of China (81001605, 81373959).

Author Contributions
Yuan Yuan and Luqi Huang contributed to the study design. Linjie Qi, Jun Yu and Xumin Wang performed the research and conducted the data analysis. Yuan Yuan wrote the manuscript.

Conflicts of Interest
The authors declare no conflict of interest.