Organ- and Growing Stage-Specific Expression of Solanesol Biosynthesis Genes in Nicotiana tabacum Reveals Their Association with Solanesol Content

Solanesol is a noncyclic terpene alcohol that is composed of nine isoprene units and mainly accumulates in solanaceous plants, especially tobacco (Nicotiana tabacum L.). In the present study, RNA-seq analyses of tobacco leaves, stems, and roots were used to identify putative solanesol biosynthesis genes. Six 1-deoxy-d-xylulose 5-phosphate synthase (DXS), two 1-deoxy-d-xylulose 5-phosphate reductoisomerase (DXR), two 2-C-methyl-d-erythritol 4-phosphate cytidylyltransferase (IspD), four 4-diphosphocytidyl-2-C-methyl-d-erythritol kinase (IspE), two 2-C-methyl-d-erythritol 2,4-cyclo-diphosphate synthase (IspF), four 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate synthase (IspG), two 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate reductase (IspH), six isopentenyl diphosphate isomerase (IPI), and two solanesyl diphosphate synthase (SPS) candidate genes were identified in the solanesol biosynthetic pathway. Furthermore, the two N. tabacum SPS proteins (NtSPS1 and NtSPS2), which possessed two conserved aspartate-rich DDxxD domains, were highly homologous with SPS enzymes from other solanaceous plant species. In addition, the solanesol contents of three organs and of leaves from four growing stages of tobacco plants corresponded with the distribution of chlorophyll. Our findings provide a comprehensive evaluation of the correlation between the expression of different biosynthesis genes and the accumulation of solanesol, thus providing valuable insight into the regulation of solanesol biosynthesis in tobacco.


Introduction
Solanesol is a medically important noncyclic terpene alcohol that is synthesized by the condensation of nine isoprene units. The molecule is a precursor of ubiquinone (also known as coenzyme Q10), vitamin K2, and anticancer drugs, including N-solanesyl-N,N′-bis(3,4-dimethoxybenzyl) ethylenediamine (SDB) [1][2][3]. Coenzyme Q10 possesses both antioxidant and antiaging properties and is reported to strengthen the body's immune system and cardiovascular function, to improve brain health, and to moderate blood lipids. As a result, coenzyme Q10 has potential usefulness in the treatment of migraines, neurodegenerative diseases, hypertension, and cardiovascular diseases [2][3][4][5], and it is also being used as a dietary supplement by patients with type-2 diabetes [6]. Meanwhile, vitamin K2 promotes bone formation and mineralization, inhibits bone resorption, prevents and mitigates osteoporosis, promotes blood coagulation, and improves arterial stiffness [7]; and SDB, mediated by P-proteins, is used in the treatment of several types of drug resistance in tumours and plays a synergistic role with certain antitumor drugs [8,9]. Recently, Yao et al. [10] reported that solanesol can protect human hepatic L02 cells from ethanol-induced oxidative injury via upregulation of HO-1 and Hsp70 expression. Thus, the medical benefits of solanesol and its derivatives are well established.

Currently, nothing is known about the expression of genes involved in the biosynthetic pathway of solanesol in tobacco, and although some studies have investigated the solanesol content of tobacco leaves [3,12], little is known about the relative contents of different states of solanesol in different organs or at different growing stages. Therefore, in the present study, we investigated the expression of solanesol biosynthesis genes and the accumulation of solanesol in three organs (leaves, stems, and roots) of tobacco plants, as well as in the leaves of tobacco plants at four growing stages. This work provides a starting point for further research regarding solanesol biosynthesis genes in tobacco, as well as in other solanaceous plants, and provides valuable insight into the regulation of solanesol biosynthesis in tobacco.

Organ- and Growing Stage-Specific Solanesol Content
Leaf tissues were harvested from four growing stages of tobacco plants (S1, S2, S3, and S4) at 10, 20, 40, and 60 days after transplanting, respectively; and root, stem, and leaf samples were harvested from S3-stage plants.
The solanesol content was measured using ultra-high performance liquid chromatography (UPLC). In S3-stage plants, the solanesol content was highest in the leaves, followed by the stems (p < 0.05), and no measurable solanesol was detected in the roots. The levels of total, free-state, and bound-state solanesol in the leaves were 21.45-, 21.23-, and 21.70-fold those in the stems, respectively (Figure 1A). The total, free-state, and bound-state solanesol contents of leaves from the four growing stages were significantly different: the content was lowest at the S1 stage, increased at the S2 stage, reached a maximum at the S3 stage, and then decreased slightly at the S4 stage (Figure 1B).
Figure 1. Values and error bars represent means ± SD. Different lowercase letters indicate significant differences (p < 0.05) between organs or growing stages. S1, 10 days after transplanting; S2, 20 days after transplanting; S3, 40 days after transplanting; S4, 60 days after transplanting; DW, dry weight.

Organ-Specific Expression of Solanesol Biosynthesis Genes
To identify candidate genes in the solanesol biosynthetic pathway, RNA-seq analyses of the leaves, stems, and roots of S3-stage tobacco plants were conducted. Six DXS, two DXR, two IspD, four IspE, two IspF, four IspG, two IspH, six IPI, and two SPS candidate genes were identified based on the similarity of the predicted amino acid sequences to known sequences in the National Center for Biotechnology Information (NCBI) GenBank database (Figure 2). For gene expression analysis, the number of expressed sequence tags was calculated, and the numbers were normalized to fragments per kilobase of exon per million fragments mapped (FPKM).

[...] to 226% (DXS4) higher in the stems than in the roots and from 26% (DXS4) to 35% (IPI3) higher in the roots than in the leaves. For IPI1, IPI2, and IPI6, the FPKM values ranged from 14% (IPI6) to 209% (IPI6) higher in the roots than in the stems and from 170% (IPI2) to 598% (IPI1) higher in the stems than in the leaves. For IPI4 and IPI5, the FPKM values ranged from 8% (IPI4) to 44% (IPI5) higher in the roots than in the leaves and from 4% (IPI4) to 103% (IPI5) higher in the leaves than in the stems.

Organ- and Growing Stage-Specific NtSPS Expression
The expression of two N. tabacum SPS genes (NtSPS1 and NtSPS2) was analysed to validate the results of the RNA-seq analyses. The relative expression of the NtSPS genes was significantly higher in the leaves of the tobacco plants than in the stems and roots (p < 0.05), between which the NtSPS levels were statistically similar (p > 0.05) (Figure 3A), and the relative expression of NtSPS1 and NtSPS2 in the leaves was 13.19- and 10.17-fold that in the stems, respectively.
In addition, the relative expression of NtSPS1 and NtSPS2 also differed significantly among the leaves from the four growing stages. The expression was lowest in the leaves from S1-stage plants, intermediate in leaves from S2-stage plants, greatest in the leaves from S3-stage plants, and low again in the leaves from S4-stage plants (Figure 3B). Therefore, the relative expression of NtSPS1 and NtSPS2 was consistent with the observed solanesol contents.
Figure 3. Values and error bars represent means ± SD. Different lowercase letters indicate significant differences (p < 0.05) between organs or growing stages. S1, 10 days after transplanting; S2, 20 days after transplanting; S3, 40 days after transplanting; S4, 60 days after transplanting.

Phylogenetic Analysis of NtSPS
To define the phylogenetic relationships among the SPS proteins from tobacco (NtSPS1 and NtSPS2) and other plant species, we downloaded 37 plant SPS sequences from the NCBI database and used these data to construct a phylogenetic tree (Figure 4). In the phylogenetic analysis, the NtSPS1 and NtSPS2 sequences clustered with those from other solanaceous plants, i.e., SlSPS from Solanum lycopersicum, SpSPS from S. pennellii, and StSPS from S. tuberosum (Figure 4), which suggests that the biological function of the tobacco SPS proteins is similar to that reported for other solanaceous plants. Similarly, the SPS sequences from brassicaceous plants (e.g., Arabidopsis thaliana, Brassica napus, B. oleracea var. oleracea, and B. rapa) also formed a distinct cluster (Figure 4). In addition, a multiple sequence alignment of the SPS amino acid sequences from A. thaliana, B. napus, tomato, and tobacco revealed the presence of two conserved aspartate-rich DDxxD domains (Figure 5).
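As an illustration of what locating the aspartate-rich DDxxD domains involves, the following minimal sketch scans a protein sequence for the pattern D-D-x-x-D. The function name `find_ddxxd` and the toy sequence are hypothetical; the sequence is not an NtSPS sequence, and the study identified the domains from a multiple sequence alignment rather than by pattern search.

```python
import re

# Lookahead pattern so that overlapping DDxxD matches are also reported.
# "x" is taken here to mean any single residue (an assumption of this sketch).
DDXXD = re.compile(r"(?=(DD[A-Z]{2}D))")


def find_ddxxd(protein: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched motif) for each DDxxD hit."""
    seq = protein.upper()
    return [(m.start() + 1, m.group(1)) for m in DDXXD.finditer(seq)]


# Toy sequence for illustration only (hypothetical, not from the study).
toy_sequence = "MKLADDVLDGSTAAQRDDMIDNK"
hits = find_ddxxd(toy_sequence)
print(hits)  # [(5, 'DDVLD'), (17, 'DDMID')]
```

A sequence carrying both conserved domains would be expected to return two hits, as the toy example does.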

Organ- and Growing Stage-Specific Chlorophyll Content
The contents of total chlorophyll, chlorophyll a, and chlorophyll b were highest in the leaves of the S3-stage tobacco plants, followed by the stems (p < 0.05) (Figure 6A). In the leaves, the contents of total chlorophyll, chlorophyll a, and chlorophyll b were 23.05-, 28.33-, and 14.78-fold those observed in the stems, respectively (Figure 6A), and no chlorophyll was detected in the roots. Significant differences in chlorophyll content were also observed in the leaves collected from the four growing stages, and all parameters were lowest in the leaves from S1-stage plants, intermediate in leaves from S2-stage plants, greatest in the leaves from S3-stage plants, and slightly lower than the observed maximum in the leaves from S4-stage plants (Figure 6B). These changes were consistent with the distribution of solanesol in the three organs and the levels of solanesol detected at the four growing stages.
Figure 6. Values and error bars represent means ± SD. Different lowercase letters indicate significant differences (p < 0.05) between organs or growing stages. S1, 10 days after transplanting; S2, 20 days after transplanting; S3, 40 days after transplanting; S4, 60 days after transplanting; FW, fresh weight.

Solanesol Content of Tobacco Plants
Solanesol is a long-chain polyisoprenoid alcohol that mainly accumulates in solanaceous plants, especially tobacco [1,3,12], and is an important intermediate in the synthesis of ubiquinone and anticancer drugs. Because the chemical synthesis of solanesol is difficult [11], we assessed some aspects of its biosynthesis in tobacco plants and observed the differential accumulation of solanesol in the leaves, stems, and roots of tobacco plants, with the content in leaves being the highest (Figure 1A). Our results also revealed that the chlorophyll content in the leaves, stems, and roots of tobacco plants (Figure 6A) corresponded with the distribution of solanesol (Figure 1A), which is consistent with other studies that have suggested that solanesol is synthesized in chloroplasts [3,12,14]. Our results also indicate that solanesol mainly accumulates in green plant tissues (e.g., leaves and stems). Thus, tobacco leaves are likely an ideal source material for solanesol extraction [3,12].
The results of the present study also indicated that the solanesol content of leaves varies as the plants develop, and the solanesol content was highest in leaves from S3-stage plants (40 days after transplanting; Figure 1B). Therefore, the S3 stage can be considered the most appropriate period during which to harvest tobacco leaves for solanesol extraction. However, since the accumulation of solanesol is reportedly influenced by genetic and environmental factors [1,3,12,15], the optimal harvest period for solanesol extraction may vary, depending on individual tobacco varieties and environmental conditions.

Solanesol Biosynthesis Genes in Tobacco
Based on the similarity of amino acid sequences, six DXS, two DXR, two IspD, four IspE, two IspF, four IspG, two IspH, six IPI, and two SPS candidate genes were identified in the solanesol biosynthetic pathway of tobacco ( Figure 2).
DXS is the first enzyme in the MEP pathway and catalyses the conversion of pyruvate and glyceraldehyde 3-phosphate to 1-deoxy-D-xylulose 5-phosphate (DXP) (Scheme 1) [3,12]. In the present study, we assessed the FPKM values of different DXS genes in the leaves, stems, and roots of tobacco plants, and the relative expression levels of DXS1, DXS2, DXS3, DXS5, and DXS6 were higher in the tobacco leaves and stems than in the roots (Figure 2). In Medicago truncatula, MtDXS1 was preferentially expressed in several aboveground tissues (e.g., leaves and stems) but not in the roots, whereas MtDXS2 transcript levels were low in most tissues but strongly stimulated in roots upon colonization by mycorrhizal fungi [16]. Compared with non-transgenic wild-type plants, transgenic Arabidopsis plants that over- or underexpressed the DXS gene accumulated different levels of various isoprenoids, including chlorophylls, tocopherols, carotenoids, abscisic acid, and gibberellins [17]; and when the A. thaliana DXS gene was constitutively expressed in spike lavender, the leaves and flowers of the transgenic plants accumulated significantly more essential oils (monoterpenes) [18]. Thus, as an important regulatory factor in the MEP pathway, DXS is a crucial enzyme in the solanesol biosynthetic pathway, and its overexpression or inhibition can influence the content of downstream metabolites.
Meanwhile, DXR catalyses the intramolecular rearrangement of DXP's straight-chain carbon skeleton to MEP (Scheme 1) [3,12]. In the present study, the variation in FPKM values of DXR1 and DXR2 in leaves, stems, and roots (Figure 2) was consistent with the distribution of solanesol (Figure 1). Zhang et al. [19] cloned two DXR genes from N. tabacum and found that the expression of NtDXR1 and NtDXR2 was highest in tobacco leaves, followed by that in the stem and roots. In addition, the downregulation of DXR in A. thaliana results in variegation, reduced pigmentation, and defects in chloroplast development, whereas DXR-overexpressing lines exhibit increased accumulation of MEP-derived plastid isoprenoids, such as chlorophylls and carotenoids [20]. Similarly, the overexpression of the tobacco DXR gene in chloroplasts has been shown to increase the production of isoprenoids, including solanesol, chlorophyll a, beta-carotene, lutein, antheraxanthin, and beta-sitosterol [21]. Therefore, DXR appears to be another key enzyme in the biosynthesis of solanesol in tobacco, and its overexpression in chloroplasts can promote the accumulation of solanesol.
IspD, IspE, IspF, IspG, and IspH sequentially catalyse the transformation of MEP to IPP and DMAPP (Scheme 1) [3,22]. In the present study, we observed variation in the FPKM values of Isp genes in leaves, stems, and roots (Figure 2), and the relative expression levels of IspE, IspF, IspG, and IspH were higher in the tobacco leaves and stems than in the roots. Similarly, Hsieh et al. [23] observed that IspD1, IspD2, and IspE1 are mainly expressed in the leaves and stems of A. thaliana. However, Kim et al. [24] cloned an IspF gene from Ginkgo biloba and found that its expression was higher in embryonic roots than in embryonic leaves. Gao et al. [25] also reported that the expression of IspF was highest in G. biloba roots, followed by that in the leaves and seeds, respectively, and Lu et al. [26] cloned the IspH gene from G. biloba and found that its expression was highest in roots, as well, followed by its expression in stems and leaves. In contrast, Kim et al. [27] cloned two IspH genes from G. biloba and found that the expression of IspH1 was higher in the leaves than in the roots, whereas the expression of IspH2 was higher in the roots than in the leaves. Thus, our results regarding the distribution of IspD, IspE, IspF, IspG, and IspH expression are not fully consistent with those of previous studies in other plants, and further investigation is warranted.
IPI catalyses the interconversion of IPP and DMAPP, which requires Mg2+, and the resulting IPP and DMAPP can then condense to form isoprenoids, such as solanesol (Scheme 1) [3,28]. In the present study, we assessed the FPKM values of different IPI genes in the leaves, stems, and roots of tobacco plants, and among the three organs, the relative expression levels of IPI1, IPI2, IPI4, IPI5, and IPI6 were highest in the roots (Figure 2). Nakamura et al. [29] cloned two N. tabacum IPI genes and found that the expression of IPI1 increases under high-salt and high-light stress conditions, whereas the expression of IPI2 increases under high-salt and cold stress conditions. Sun et al. [30] cloned an IPI gene from S. lycopersicum and found that its expression was highest in roots, followed by that in stems and leaves, which was consistent with the distribution of IPI1, IPI2, and IPI6 expression observed in the present study. Thus, our results regarding the expression of different IPI genes are not fully consistent with those of previous studies, and further investigation is warranted.
SPS catalyses the reaction of IPP and DMAPP to form SPP, which is a precursor of solanesol (Scheme 1) [3,12]. In the present study, both qRT-PCR and RNA-seq analyses indicated that the relative expression of NtSPS1 and NtSPS2 was significantly higher in the leaves than in the stems and roots (p < 0.05) (Figures 2 and 3). To date, SPS homologs have been identified in A. thaliana [31][32][33], Hevea brasiliensis [34], Oryza sativa [35], and S. lycopersicum [36]. Hirooka et al. [32] cloned two SPS genes from A. thaliana and found that the expression of AtSPS1 and AtSPS2 was significantly higher in stems and leaves than in roots, which is similar to the expression of NtSPS1 and NtSPS2 observed in the present study (Figure 3). The homology of NtSPS1 and NtSPS2 with SPS proteins from other solanaceous plants was higher than that with SPS proteins from other families (Figure 4), and our results suggest that the biological function of SPS proteins in solanaceous plants is similar to that reported in other plants [37,38]. Like SPS proteins from other plants, NtSPS1 and NtSPS2 contain two conserved aspartate-rich DDxxD domains (Figure 5), which are involved in the coordination of divalent metal ions with the diphosphate groups in substrates and play a key role in substrate positioning [39]. Jones et al. [36] suggested that the constitutive overexpression of SlSPS in tobacco could significantly increase the solanesol content of mature leaves, and also reported that the solanesol content of mature leaves from transgenic tobacco plants was positively correlated with the expression of SlSPS. Thus, SPS is a key enzyme in the solanesol biosynthetic pathway, and its overexpression can promote the accumulation of downstream metabolites, such as solanesol [3,12,36].
To measure the solanesol and chlorophyll contents and to perform RNA-seq and quantitative real-time PCR (qRT-PCR) analyses, leaves were harvested from four growing stages (S1, S2, S3, and S4) of tobacco plants at 10, 20, 40, and 60 days after transplanting, respectively; and root, stem, and leaf samples were harvested from S3-stage plants. All the experiments were performed in triplicate.

Analysis of Total, Free-State, and Bound-State Solanesol Contents
The tobacco leaves, stems, and roots were dried to constant weight with a freeze-dryer (Alpha 1-2 LD Plus; Christ, Osterode am Harz, Germany), ground, and sifted through a 40-mesh sieve. Portions (2 g) of the powdered samples were transferred to individual 50-mL centrifuge tubes with stoppers, and 20 mL hexane was added. Ultrasonic extraction was performed at 65 °C for 15 min, and the resulting homogenates were centrifuged at 3000 rpm for 10 min, after which the supernatants were transferred to 50-mL volumetric flasks. The extraction steps were then repeated by extracting the precipitated layers two additional times, each with 15 mL hexane, and the two additional extracts for each sample were combined with the corresponding initial extracts.
To prepare the samples for quantification of free-state solanesol, 4 mL aliquots of the hexane extracts were transferred to individual 10-mL stoppered centrifuge tubes and supplemented with 6 mL distilled water. The mixtures were vortexed for 3 min to remove the water-soluble impurities and then centrifuged at 3000 rpm for 10 min. Next, the upper layers were removed, diluted with a 50:50 (v:v) methanol-acetonitrile solution in brown volumetric flasks, and then filtered through a 0.2-µm membrane.
Meanwhile, to prepare the samples for quantification of total solanesol, 4 mL aliquots of the hexane extracts were transferred to individual 100-mL brown stoppered flasks and supplemented with 4 mL 0.02 M NaOH (diluted in ethanol). After thorough mixing, the samples were oscillated in a water bath for 30 min at 60-65 °C to allow saponification and then incubated in an 83-87 °C water bath to allow the solvent to evaporate. Subsequently, 2 mL hexane was added to each sample, and the mixtures were subjected to ultrasonication for 2 min to dissolve the residue, transferred to clean 10-mL stoppered centrifuge tubes, and dissolved with another 2 mL hexane. Next, 6 mL distilled water was added, and the mixtures were vortexed for 3 min to remove the water-soluble impurities. After a 10-min centrifugation at 3000 rpm, the upper layers were removed, diluted with a 50:50 (v:v) methanol-acetonitrile solution in brown volumetric flasks, and then filtered through a 0.2-µm membrane.
The amounts of total and free-state solanesol were measured using ultra-high performance liquid chromatography (ACQUITY UPLC H-Class; Waters, Milford, MA, USA) with an Atlantis T3-C18 column (4.6 mm × 150 mm, 3 µm; Waters) that was maintained at 35 °C. A 50:50 (v:v) methanol-acetonitrile solution was used as the mobile phase at a flow rate of 1.0 mL/min, and a diode array detector was used for detection at 213 nm. Meanwhile, the amount of bound-state solanesol in each extract was calculated as the total solanesol level minus the free-state solanesol level.
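The arithmetic behind the bound-state value and the leaf-to-stem fold differences reported in the results can be sketched as follows. The function names and all numeric inputs are hypothetical and serve only to show the calculation; they are not measurements from this study.

```python
def bound_state(total: float, free: float) -> float:
    """Bound-state solanesol as total minus free-state content (same units)."""
    if free > total:
        raise ValueError("free-state content cannot exceed total content")
    return total - free


def fold_difference(numerator: float, denominator: float) -> float:
    """How many times larger one content is than another (e.g., leaf vs. stem)."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator


# Hypothetical contents (arbitrary units per g dry weight), for illustration only.
leaf_total, leaf_free = 20.0, 12.5
stem_total, stem_free = 1.0, 0.5

leaf_bound = bound_state(leaf_total, leaf_free)
stem_bound = bound_state(stem_total, stem_free)
print(leaf_bound, stem_bound)
print(fold_difference(leaf_total, stem_total))
print(fold_difference(leaf_bound, stem_bound))
```

Because the bound-state value is a difference of two measurements, its uncertainty combines the uncertainties of both, which is worth keeping in mind when fold differences are compared.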

Preparation of Digital Gene Expression Library, Sequencing, and Analysis
The roots, stems, and leaves collected from S3-stage tobacco plants were used to generate three digital gene expression libraries, in order to identify the genes in the solanesol biosynthetic pathway in tobacco plants, and plants with possible microbial contamination were excluded. Total RNA was extracted using TRIzol (Invitrogen, Carlsbad, CA, USA), and RNA degradation and contamination were assessed using electrophoresis on 1% agarose gels. The purity of the extracted RNA was checked spectrophotometrically, using a NanoPhotometer (Implen, Inc., Westlake Village, CA, USA), and the RNA was quantified using the Qubit RNA Assay Kit and a Qubit 2.0 fluorometer (Life Technologies, Carlsbad, CA, USA). The integrity of the RNA was assessed using an RNA Nano 6000 Assay Kit and the BioAnalyzer 2100 system (Agilent Technologies, Santa Clara, CA, USA).
Three mixtures that contained equal amounts of RNA from the three organs were prepared for each sample and subsequently used to construct the library. The library quality was assessed using the Agilent Bioanalyzer 2100 system (Agilent Technologies), and the clustering of index-coded samples was performed on a cBot Cluster Generation System using the TruSeq PE Cluster Kit v3-cBot-HS (Illumina, San Diego, CA, USA), according to the manufacturer's instructions. After cluster generation, the library preparations were sequenced using an Illumina HiSeq platform to generate paired-end reads.
Dirty raw reads (i.e., reads with adapters, unknown nucleotides, or quality values ≤5 that accounted for >50% of the read) were discarded, since they would negatively affect downstream analyses, and de novo assembly of the short reads was performed using the Trinity assembly program, following the method described by Grabherr et al. [40]. Gene functions were identified by BLASTing the unigene sequences against protein databases, including the NCBI non-redundant (Nr) database (http://www.ncbi.nlm.nih.gov), Universal Protein Resource (UniProt) database (http://www.uniprot.org), and Cluster of Orthologous Groups of proteins (COG) database (http://www.ncbi.nlm.nih.gov/COG). For gene expression analysis, the number of expressed sequence tags was calculated, and the numbers were normalized to FPKM, according to the method of Mortazavi et al. [41]. The annotations and FPKM values of the unigenes were obtained from the China Tobacco Genome (CTG) database (http://218.28.140.17/).
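For readers unfamiliar with the normalization, a minimal sketch of the standard FPKM definition and of the "percent higher" comparisons used in the results is given below. The function names and the counts are hypothetical; the study itself obtained FPKM values from the CTG database rather than computing them this way.

```python
def fpkm(fragments: int, exon_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of exon per million fragments mapped."""
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("exon length and library size must be positive")
    # fragments / (length in kb) / (library size in millions)
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


def percent_higher(a: float, b: float) -> float:
    """By how many percent a exceeds b, relative to b."""
    if b <= 0:
        raise ValueError("reference value must be positive")
    return (a - b) / b * 100.0


# Hypothetical counts for one gene in two libraries (not data from the study).
leaf_fpkm = fpkm(fragments=500, exon_length_bp=2000, total_mapped_fragments=10_000_000)
root_fpkm = fpkm(fragments=200, exon_length_bp=2000, total_mapped_fragments=20_000_000)
print(leaf_fpkm, root_fpkm, percent_higher(leaf_fpkm, root_fpkm))
```

Normalizing by both gene length and library size is what makes values comparable across genes and across the leaf, stem, and root libraries.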

Screening and Cloning of NtSPS1 and NtSPS2
To clone the two NtSPS genes, total RNA was isolated from tobacco leaves, and its quality was assessed as described in Section 4.3 above. To obtain first-strand cDNA, reverse transcription was performed using a PrimeScript RT-PCR kit (Takara Bio, Inc., Shiga, Japan), and the cDNA was used as a template for PCR amplification, using gene-specific primers to amplify the complete coding sequences of both NtSPS1 (upstream primer: 5′-ATGATGTCTGTGACTTGCCATAATC-3′, downstream primer: 5′-CTATTCAATTCTCTCCAGATTATACTTCAC-3′) and NtSPS2 (upstream primer: 5′-ATGATGTCTGTGAGTTGCCATAATC-3′, downstream primer: 5′-CTATTCAATTCTCTCCAGATTATACTTCAC-3′). The PCR amplification was performed as described by Block et al. [31], and the amplified products were gel extracted and transformed into competent Escherichia coli DH5α cells. Finally, the positive clones were screened and sequenced at the Beijing Genomics Institute (Shenzhen, China).

RNA Extraction, cDNA Synthesis, and qRT-PCR Analysis
Total RNA was extracted from the roots, stems, and leaves of S3-stage tobacco plants, as well as from the leaves of S1-, S2-, and S4-stage plants, as described above. Two micrograms of total extracted RNA from each sample was reverse transcribed to generate first-strand cDNA, using the PrimeScript kit (Takara Bio, Inc.) according to the manufacturer's instructions, and the experiments were performed in triplicate. Gene-specific primer pairs were designed for sequence analysis of NtSPS1 (upstream primer: 5′-TGTCTGTGACTTGCCATAA-3′, downstream primer: 5′-CATTGAATCCTCCTCTACTT-3′) and NtSPS2 (upstream primer: 5′-CAGTGTTGGGTTTGAATA-3′, downstream primer: 5′-CTTGTTTAGAGTAAGGAGGTC-3′), using the Primerquest software (http://www.idtdna.com/pages/scitools). qRT-PCR was performed using an ABI 7500 Real-time system (Applied Biosystems, Foster City, CA, USA) with the SYBR Premix Ex Taq™ Kit (TaKaRa Bio, Inc.), according to the manufacturer's protocol, with the following amplification conditions: 95 °C for 2 min, followed by 40 cycles of 95 °C for 15 s and 60 °C for 1 min, and plate reading after each cycle. Two fragments of the constitutively expressed Ntactin gene were amplified as a reference, using the gene-specific upstream and downstream primers 5′-CATTCCAAATATGAGATGCGTTGT-3′ and 5′-TGTGGACTTGGGAGAGGACT-3′, respectively.
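The text does not state how relative expression was derived from the qRT-PCR data. One widely used option for a single reference gene is the 2^(-ΔΔCt) approach, sketched below purely as an assumption; the function name, the choice of the stem sample as calibrator, and all Ct values are hypothetical.

```python
from statistics import mean


def relative_expression(
    ct_target_sample: list[float],
    ct_ref_sample: list[float],
    ct_target_calibrator: list[float],
    ct_ref_calibrator: list[float],
) -> float:
    """Relative expression by 2^(-ddCt), using the mean Ct of technical replicates.

    Assumes roughly 100% amplification efficiency for both primer pairs.
    """
    delta_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    delta_calibrator = mean(ct_target_calibrator) - mean(ct_ref_calibrator)
    return 2.0 ** -(delta_sample - delta_calibrator)


# Hypothetical triplicate Ct values (not data from the study):
# target gene vs. actin reference, leaf sample relative to a stem calibrator.
leaf_vs_stem = relative_expression(
    ct_target_sample=[22.0, 22.0, 22.0],
    ct_ref_sample=[18.0, 18.0, 18.0],
    ct_target_calibrator=[25.0, 25.0, 25.0],
    ct_ref_calibrator=[18.0, 18.0, 18.0],
)
print(leaf_vs_stem)  # 8.0
```

If primer efficiencies differ noticeably from 100%, an efficiency-corrected calculation would be more appropriate than this simplified form.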

Phylogenetic Analysis of NtSPS
We aligned SPS amino acid sequences from different plant species using CLUSTAL W, computed the evolutionary distances between the sequences using the Poisson correction method, and constructed a neighbour-joining (NJ) tree using MEGA 5.0 [38]. The reliability of the topology was assessed using the bootstrap re-sampling method with 1000 bootstrap replications.
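The Poisson correction converts the observed proportion of differing residues, p, into an estimated number of substitutions per site, d = -ln(1 - p). The sketch below computes it for one pair of aligned sequences. The function names and toy sequences are hypothetical, and skipping gap columns pairwise is an assumption of this sketch rather than a documented setting of the analysis.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing residues between two aligned sequences.

    Columns containing a gap ("-") in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not columns:
        raise ValueError("no comparable columns")
    differences = sum(1 for a, b in columns if a != b)
    return differences / len(columns)


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected amino acid distance, d = -ln(1 - p)."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        raise ValueError("distance is undefined when all sites differ")
    return -math.log(1.0 - p)


# Toy aligned fragments (hypothetical): 2 differences over 10 sites, so p = 0.2.
d = poisson_distance("MKLADDVLDG", "MKLSDDVLEG")
print(round(d, 4))  # 0.2231
```

A matrix of such pairwise distances is the input that a neighbour-joining algorithm then uses to build the tree.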

Determination of Chlorophyll Content
Chlorophyll was extracted from leaf, stem, and root samples by grinding 0.5 g of each sample in 1 mL of 100% acetone with a pinch of calcium carbonate in a mortar. The individual extracts were transferred to test tubes, after which the mortars were rinsed with 100% acetone, the rinsates were added to the extracts, and each sample extract was further diluted to 5 mL with acetone. Afterward, each of the extracts was filtered through a 0.45-µm syringe filter to remove the debris, and the absorbance of the filtered extracts at 663 nm ($A_{663}$) and 645 nm ($A_{645}$) was determined using a UV-2410PC spectrophotometer (Shimadzu, Tokyo, Japan). The chlorophyll a and chlorophyll b contents were then calculated using the following equations: $C_A = 0.25 \times (12.7 \times A_{663} - 2.69 \times A_{645})$ and $C_B = 0.25 \times (22.9 \times A_{645} - 4.68 \times A_{663})$, where $C_A$ and $C_B$ are the contents (mg/g) of chlorophyll a and b, respectively [42], and the total chlorophyll content was calculated as the sum of the chlorophyll a and chlorophyll b contents.
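The two equations above translate directly into code. The sketch below uses the coefficients exactly as given in the text; the function name and the absorbance readings are hypothetical.

```python
def chlorophyll_contents(a663: float, a645: float) -> tuple[float, float, float]:
    """Return (chlorophyll a, chlorophyll b, total chlorophyll) in mg/g.

    Coefficients and the 0.25 factor are taken from the equations in the text.
    """
    chl_a = 0.25 * (12.7 * a663 - 2.69 * a645)
    chl_b = 0.25 * (22.9 * a645 - 4.68 * a663)
    return chl_a, chl_b, chl_a + chl_b


# Hypothetical absorbance readings, for illustration only.
chl_a, chl_b, chl_total = chlorophyll_contents(a663=0.8, a645=0.4)
print(round(chl_a, 3), round(chl_b, 3), round(chl_total, 3))  # 2.271 1.354 3.625
```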

Statistical Analysis
The data were analysed using analysis of variance (ANOVA) with SPSS 18.0 statistical software (SPSS, Inc., Chicago, IL, USA), and significant differences between the treatments were investigated using Tukey's multiple comparison test at the p < 0.05 significance level.
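To make the first step of this analysis concrete, the sketch below computes the one-way ANOVA F statistic from group data using only the Python standard library. It is not a substitute for the SPSS workflow used here: it omits the p-value and Tukey's test, which require the F and studentized range distributions. The function name and the triplicate values are hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups: list[list[float]]) -> float:
    """F statistic for a one-way ANOVA (between-group MS / within-group MS)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("within-group variance is zero; F is undefined")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical triplicate measurements for three organs (not data from the study).
f_value = one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
print(round(f_value, 3))  # 13.0
```

A large F value indicates that differences between group means are large relative to the scatter within groups, which is the condition under which a post hoc test such as Tukey's is then used to identify which groups differ.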

Conclusions
The present study demonstrates that solanesol is abundant in tobacco leaves and provides a starting point for further research regarding solanesol biosynthesis genes in tobacco plants. Six DXS, two DXR, two IspD, four IspE, two IspF, four IspG, two IspH, six IPI, and two SPS candidate genes were identified in the solanesol biosynthetic pathway of tobacco, and the two N. tabacum SPS proteins (NtSPS1 and NtSPS2), which possessed two conserved aspartate-rich DDxxD domains, were highly homologous with the SPS enzymes of other solanaceous plant species. In addition, the solanesol contents of three organs (leaves, stems, and roots) and leaves from four growing stages of tobacco plants corresponded with the distribution of chlorophyll. Our findings provide a comprehensive evaluation of the correlation between the expression of different biosynthesis genes and the accumulation of solanesol, thus providing valuable insight into the regulation of solanesol biosynthesis in tobacco.