Transcriptome Analysis in Haematococcus pluvialis: Astaxanthin Induction by High Light with Acetate and Fe2+

Haematococcus pluvialis is a commercial microalga, that produces abundant levels of astaxanthin under stress conditions. Acetate and Fe2+ are reported to be important for astaxanthin accumulation in H. pluvialis. In order to study the synergistic effects of high light stress and these two factors, we obtained transcriptomes for four groups: high light irradiation (HL), addition of 25 mM acetate under high light (HA), addition of 20 μM Fe2+ under high light (HF) and normal green growing cells (HG). Among the total clean reads of the four groups, 156,992 unigenes were found, of which 48.88% were annotated in at least one database (Nr, Nt, Pfam, KOG/COG, SwissProt, KEGG, GO). The statistics for DEGs (differentially expressed genes) showed that there were more than 10 thousand DEGs caused by high light and 1800–1900 DEGs caused by acetate or Fe2+. The results of DEG analysis by GO and KEGG enrichments showed that, under the high light condition, the expression of genes related to many pathways had changed, such as the pathway for carotenoid biosynthesis, fatty acid elongation, photosynthesis-antenna proteins, carbon fixation in photosynthetic organisms and so on. Addition of acetate under high light significantly promoted the expression of key genes related to the pathways for carotenoid biosynthesis and fatty acid elongation. Furthermore, acetate could obviously inhibit the expression of genes related to the pathway for photosynthesis-antenna proteins. For addition of Fe2+, the genes related to photosynthesis-antenna proteins were promoted significantly and there was no obvious change in the gene expressions related to carotenoid and fatty acid synthesis.


Introduction
Haematococcus pluvialis is a unicellular freshwater microalga that is rich in astaxanthin. The red ketocarotenoid astaxanthin (3,30-dihydroxy-β,β-carotene-4,40-dione) is the most powerful biological antioxidant and is highly demanded in health care and aquaculture, with the latest research showing that astaxanthin may play roles in anti-depression and preventing UVA-induced skin photoaging [1][2][3][4]. Astaxanthin is a 40-carbon isoprenoid and is synthesized from isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP). Three IPP molecules are added to DMAPP sequentially to form 20-carbon geranylgeranyl pyrophosphate (GGPP) and two GGPP molecules bind head-to-tail to form a tetraterpene phytoene. The phytoene is transformed to lycopene by dehydrogenation and then to β-carotene by cyclization. Finally, astaxanthin is generated from β-carotene [5][6][7]. When the cells of H. pluvialis are under stress conditions, such as nutrient deprivation, high irradiation, or high salt concentration, astaxanthin is produced in the form of fatty acyl mono-or diesters [7].
H. pluvialis is now the best source of natural astaxanthin, constituting up to 4% of cellular dry weight [8][9][10]. The life cycle of H. pluvialis can be divided into four stages: I, vegetative cell growth; II, encystment (vegetative to immature cyst cells); III, maturation (immature to mature cyst cells); and IV, germination (mature cyst to vegetative cells) [11]. Astaxanthin is accumulated mainly in stage II and stage III. Many studies attempting to increase astaxanthin accumulation in H. pluvialis by optimization of induction conditions have been reported. Jeon found that 30 mM acetate and 170 µEm −2 ·s −1 light intensity was good for astaxanthin accumulation in H. pluvialis [12]. Kobayashi pointed out that algal astaxanthin formation was more enhanced by the addition of acetate plus Fe 2+ than by the addition of acetate alone in H. pluvialis, from which he concluded that oxidative stress was involved in the posttranslational activation of carotenoid biosynthesis in acetate-induced cyst cells [13]. Hong successfully used 50 µM Fe 2+ to enhance the production of astaxanthin [14]. Following-up such studies, we chose light, acetate and Fe 2+ as variables to explore the molecular mechanism of astaxanthin production in the red-cell stage of H. pluvialis by transcriptome analysis.
Moreover, it has been reported that microalgae can produce fatty acids with unusual chain lengths and a high degree of unsaturation, which are not found in significant amounts in nature elsewhere. Prominent examples are the polyunsaturated fatty acids (PUFAs) eicosapentaenoic acid (EPA, 20:5) or docosahexaenoic acid (DHA, 22:6), ω-3 fatty acids, that are commercially produced as human health supplements [15,16]. In H. pluvialis, the accumulation of astaxanthin is correlated with increasing fatty acid biosynthesis under stress conditions [17]. Gwak indicated that astaxanthin mono-and diesters form lipid globules with triacylglycerol (TAG) based on an analysis of lipidomes and transcriptomes of H. pluvialis cultured under high irradiance [7]. In H. pluvialis, the lipid content is close to 35% of the dry weight under stress conditions and the content of polyunsaturated fatty acids (PUFAs) is closed to 50% of the neutral lipid [18]. This suggests that H. pluvialis could also produce fatty acids as a byproduct which could be used in human health supplements.
RNA sequencing (RNA-seq) is a newly developed deep-sequencing technology for transcriptome analysis, which has been a key technique for biological inquiry for decades [19]. In the present study, the transcriptomes of H. pluvialis were sequenced using Illumina Hiseq2000 (Illumina, Santiago, CA, USA). Through transcriptome analysis of H. pluvialis, we attempt to find the specific changes in metabolic pathways of algal cells grown with acetate and Fe 2+ under high light, which may lay a foundation for insight into the astaxanthin synthesis mechanism in H. pluvialis and also provide guidance for production of astaxanthin in industrial farming.

Short-Read De Novo Sequencing and Assembly
By sequencing the transcriptome of H. pluvialis, fifty to sixty-five million raw reads of each sample were obtained. To guarantee the quality of data used for the analyses, stringent parameters were used to measure the quality of reads and all reads with undetermined information greater than 10% were removed. After this filtering, about sixty million clean reads of each sample were generated, with more than 95% bases showing quality greater than Q20, while the GC percentage was around 60% (Table 1). Then, the k-mer (k = 25) technique was used to measure the nucleotide sequence of reads. As a no-reference genome project, these clean reads were assembled into contigs using overlap information until the contigs could not be further extended. After a series of subsequent steps, 156,992 unigenes (>200 bp) were generated, with an N50 of 1620 bp and N90 of 456 bp. The size of the unigenes varied from 201 bp to 35,138 bp with a mean length of 1047 bp and a total length of 164,437,994 bp ( Figure 1).
After the GO annotation, the annotated unigenes were classified according to the three GO categories: Biological Process (BP), Cellular Component (CC) and Molecular Function (MF). All of the 60,846 unigenes annotated in the GO database were classified into 56 sub-categories. The top three sub-categories for BP were cellular process, metabolic process and single-organism process, for which the numbers of genes were 35,416 (58.21% of the 60,846 unigenes annotated to GO), 32,315 (53.11%) and 26,630 (43.77%) respectively (some of the unigenes could annotated in more than one category). The top three sub-categories of CC were cell part, cell and organelle, for which the numbers of genes were 20,557 (33.79%), 20,577 (33.79%) and 7693 (12.64%) respectively. The top three sub-categories of MF were binding, catalytic activity and transporter activity, for which the numbers of genes were 32,522 (53.45%), 27,253 (44.79%) and 3974 (6.53%) respectively ( Figure 2).
After the GO annotation, the annotated unigenes were classified according to the three GO categories: Biological Process (BP), Cellular Component (CC) and Molecular Function (MF). All of the 60,846 unigenes annotated in the GO database were classified into 56 sub-categories. The top three sub-categories for BP were cellular process, metabolic process and single-organism process, for which the numbers of genes were 35,416 (58.21% of the 60,846 unigenes annotated to GO), 32,315 (53.11%) and 26,630 (43.77%) respectively (some of the unigenes could annotated in more than one category). The top three sub-categories of CC were cell part, cell and organelle, for which the numbers of genes were 20,557 (33.79%), 20,577 (33.79%) and 7693 (12.64%) respectively. The top three sub-categories of MF were binding, catalytic activity and transporter activity, for which the numbers of genes were 32,522 (53.45%), 27,253 (44.79%) and 3974 (6.53%) respectively ( Figure 2).    According to the annotations in the KOG database, 21,569 annotated unigenes were subdivided to 25 groups. The top KOG category was "Posttranslational modification, protein turnover, chaperones" (2931, 13.58%), followed by "General function prediction only" (2557, 11.85%). The smallest groups were "Cell motility" (21, 0.09%) and "Extracellular structures" (21, 0.09%) (Figure 3). According to the annotations in the KOG database, 21,569 annotated unigenes were subdivided to 25 groups. The top KOG category was "Posttranslational modification, protein turnover, chaperones" (2931, 13.58%), followed by "General function prediction only" (2557, 11.85%). The smallest groups were "Cell motility" (21, 0.09%) and "Extracellular structures" (21, 0.09%) ( Figure 3). The 23,632 unigenes annotated using the KO database were classified into five pathways of "hierarchy 1." These are Cellular Processes, Environmental Information Processing, Genetic Information Processing and Organismal Systems. As for the number of unigenes in "hierarchy 2," the largest number was 2513 unigenes (10.63%) in the pathway for "Translation," followed by 1899 unigenes (8.03%) in the pathway for "Carbohydrate metabolism" (Figure 4).  The 23,632 unigenes annotated using the KO database were classified into five pathways of "hierarchy 1." These are Cellular Processes, Environmental Information Processing, Genetic Information Processing and Organismal Systems. As for the number of unigenes in "hierarchy 2," the largest number was 2513 unigenes (10.63%) in the pathway for "Translation," followed by 1899 unigenes (8.03%) in the pathway for "Carbohydrate metabolism" (Figure 4). According to the annotations in the KOG database, 21,569 annotated unigenes were subdivided to 25 groups. The top KOG category was "Posttranslational modification, protein turnover, chaperones" (2931, 13.58%), followed by "General function prediction only" (2557, 11.85%). The smallest groups were "Cell motility" (21, 0.09%) and "Extracellular structures" (21, 0.09%) ( Figure 3). The 23,632 unigenes annotated using the KO database were classified into five pathways of "hierarchy 1." These are Cellular Processes, Environmental Information Processing, Genetic Information Processing and Organismal Systems. As for the number of unigenes in "hierarchy 2," the largest number was 2513 unigenes (10.63%) in the pathway for "Translation," followed by 1899 unigenes (8.03%) in the pathway for "Carbohydrate metabolism" (Figure 4).

DEG Profiles in Response to the Treatments
To investigate the effect of a different inducing factor of H. pluvialis, six comparison groups of DEGs were set as follows: HLvsHG-high light treatment group (HL) compared with green growth group (HG); HAvsHG-addition of acetate together with high light treatment (HA) compared with green growth group; HAvsHL-addition of acetate together with high light treatment compared to only high light treatment; HFvsHG-addition of Fe 2+ together with high light treatment (HF) compared with green growth group; HFvsHL-addition of Fe 2+ together with high light treatment compared to only high light treatment; HAvsHF-acetate treatment compared with Fe 2+ treatment. The DEG numbers for each comparison group are shown in Table 3. When using HG as control, there were more DEGs in HAvsHG, HFvsHG and HLvsHG groups than using HL as control, like HAvsHL and HFvsHL groups. The Venn Diagram of DEGs ( Figure 5) showed that there were 9731 similar DEGs compared to HG group and 245 similar DEGs compared to HL group. Cluster analysis of DEGs was used for judging the expression pattern of DEGs under different experimental conditions ( Figure 6). The cluster of DEGs showed that HL, HF and HA had significantly different expression patterns than HG and HF was more similar to HL, which means that stress, high light, acetate and Fe 2+ , caused a pronounced change in gene expression and that the effect of acetate was greater than that of Fe 2+ .

DEG Profiles in Response to the Treatments
To investigate the effect of a different inducing factor of H. pluvialis, six comparison groups of DEGs were set as follows: HLvsHG-high light treatment group (HL) compared with green growth group (HG); HAvsHG-addition of acetate together with high light treatment (HA) compared with green growth group; HAvsHL-addition of acetate together with high light treatment compared to only high light treatment; HFvsHG-addition of Fe 2+ together with high light treatment (HF) compared with green growth group; HFvsHL-addition of Fe 2+ together with high light treatment compared to only high light treatment; HAvsHF-acetate treatment compared with Fe 2+ treatment. The DEG numbers for each comparison group are shown in Table 3. When using HG as control, there were more DEGs in HAvsHG, HFvsHG and HLvsHG groups than using HL as control, like HAvsHL and HFvsHL groups. The Venn Diagram of DEGs ( Figure 5) showed that there were 9731 similar DEGs compared to HG group and 245 similar DEGs compared to HL group. Cluster analysis of DEGs was used for judging the expression pattern of DEGs under different experimental conditions ( Figure  6). The cluster of DEGs showed that HL, HF and HA had significantly different expression patterns than HG and HF was more similar to HL, which means that stress, high light, acetate and Fe 2+ , caused a pronounced change in gene expression and that the effect of acetate was greater than that of Fe 2+ .

DEG Analysis in the High Light Treatment
There were 17,089 DEGs composed of 8243 up-regulated genes and 8846 down-regulated genes in the pairwise comparison of HLvsHG. To generate an overview of the high-light-responsive DEGs, a GO analysis was performed. The GO enrichment analysis revealed that 46 GO terms were significantly enriched (corrected p value < 0.05) in the DEGs responding to high light irradiation. The main categories were in the "Biological Process" domain, including "metabolic process," "singleorganism process" and "single-organism cellular process." The main categories in the "Molecular Function" domain were "hydrolase activity, acting on glycosyl bonds," "hydrolase activity, hydrolyzing O-glycosyl compounds" and "phosphorus-oxygen lyase activity." There was only one category in the "Cellular Component" domain, which was "ribonucleotide-diphosphate reductase complex" (Table S1).
To identify the biological pathways that were active under high light irradiation, the upregulated and down-regulated DEGs were mapped to canonical signaling pathways in the KEGG database. The 8243 up-regulated DEGs were significantly enriched (p < 0.05) in 22 KEGG terms. Among these, numerous genes were involved in nucleotide metabolism, such as "Amino sugar and nucleotide sugar metabolism," "DNA replication," "Purine metabolism," "Pyrimidine metabolism." Another large DEG group was enriched in lipid-related pathways, such as "Fatty acid elongation," "Fatty acid degradation," "Folate biosynthesis," "Ascorbate and aldarate metabolism," "alpha-Linolenic acid metabolism," "One carbon pool by folate." The 8846 down-regulated DEGs were significantly enriched in 19 KEGG terms, which were related to metabolism ("Purine metabolism," "Pyruvate metabolism," "Propanoate metabolism," "Glycerolipid metabolism," "Taurine and hypotaurine metabolism") and biosynthesis ("Carotenoid biosynthesis," "Fatty acid biosynthesis"). In summary, the genes related to fatty acid pathways and the carotenoid pathway are changed under high light condition.

DEG Analysis in the High Light Treatment
There were 17,089 DEGs composed of 8243 up-regulated genes and 8846 down-regulated genes in the pairwise comparison of HLvsHG. To generate an overview of the high-light-responsive DEGs, a GO analysis was performed. The GO enrichment analysis revealed that 46 GO terms were significantly enriched (corrected p value < 0.05) in the DEGs responding to high light irradiation. The main categories were in the "Biological Process" domain, including "metabolic process," "single-organism process" and "single-organism cellular process." The main categories in the "Molecular Function" domain were "hydrolase activity, acting on glycosyl bonds," "hydrolase activity, hydrolyzing O-glycosyl compounds" and "phosphorus-oxygen lyase activity." There was only one category in the "Cellular Component" domain, which was "ribonucleotide-diphosphate reductase complex" (Table S1).
To identify the biological pathways that were active under high light irradiation, the up-regulated and down-regulated DEGs were mapped to canonical signaling pathways in the KEGG database. The 8243 up-regulated DEGs were significantly enriched (p < 0.05) in 22 KEGG terms. Among these, numerous genes were involved in nucleotide metabolism, such as "Amino sugar and nucleotide sugar metabolism," "DNA replication," "Purine metabolism," "Pyrimidine metabolism." Another large DEG group was enriched in lipid-related pathways, such as "Fatty acid elongation," "Fatty acid degradation," "Folate biosynthesis," "Ascorbate and aldarate metabolism," "alpha-Linolenic acid metabolism," "One carbon pool by folate." The 8846 down-regulated DEGs were significantly enriched in 19 KEGG terms, which were related to metabolism ("Purine metabolism," "Pyruvate metabolism," "Propanoate metabolism," "Glycerolipid metabolism," "Taurine and hypotaurine metabolism") and biosynthesis ("Carotenoid biosynthesis," "Fatty acid biosynthesis"). In summary, the genes related to fatty acid pathways and the carotenoid pathway are changed under high light condition.

DEG Analysis for Addition of Acetate under High Light Condition
There were 15,555 (7699 up-regulated and 7856 down-regulated) DEGs in the comparison of HAvsHG (acetate treatment group under high light condition compared with green growth group) and 1824 (1121 up-regulated and 703 down-regulated) DEGs in the comparison of HAvsHL (acetate treatment group compared with only high light irradiation group). GO enrichment analysis showed that 78 GO terms were significantly enriched in the comparison of HAvsHG. The main categories were in "Biological Process" and "Molecular Function" domain, including "single-organism process," "single-organism cellular process," "single-organism cellular process" and "catalytic activity." There were also some categories related to fatty acid, such as "fatty acid biosynthetic process" and "fatty acid metabolic process." There were only five significant terms in the "Cellular Component" domain, these were viral envelope, viral membrane, exodeoxyribonuclease VII complex, obsolete flagellum, nematocyst. The top 3 significantly enriched terms were "cyclic nucleotide biosynthetic process," "cyclic nucleotide metabolic process" (Biological Process) and "phosphorus-oxygen lyase activity" (Molecular Function) (Table S2). Further analysis of DEGs in HAvsHL could lead us to directly understand the function of acetate by subtracting the expression effects due to the common high light factor. The 1824 DEGs were significantly enriched to 139 GO terms, in which the main categories were also in the "Biological Process" and "Molecular Function" domains, including "metabolic process," "catalytic activity," "single-organism process," "single-organism metabolic process," "oxidoreductase activity" and "oxidation-reduction process." Besides the categories related to fatty acid ("fatty acid biosynthetic process," "fatty acid metabolic process"), there were some categories related to oxidation-reduction processes ("oxidoreductase activity, acting on the CH-CH group of donors," "oxidoreductase activity, acting on CH-OH group of donors," etc.) and response processes ("response to abiotic stimulus," "response to light intensity," etc.). Moreover, there were 19 terms significantly enriched in the "Cellular Component" domain, most of which were related to photosynthesis components, such as "chloroplast," "thylakoid part," "photosynthetic membrane," "chloroplast stroma," etc. (Table S3).
Using the KEGG database to analyze the DEGs in the HAvsHL comparison, the up-regulated genes were significantly enriched in 18 KEGG terms, including "Carbon fixation in photosynthetic organisms," "Fatty acid elongation," "Taurine and hypotaurine metabolism," "Biosynthesis of unsaturated fatty acids," "Carotenoid biosynthesis," "Linoleic acid metabolism" and so on. The down-regulated genes were significantly enriched in 11 KEGG terms, including "DNA replication," "Amino sugar and nucleotide sugar metabolism," "Oxidative phosphorylation," "Photosynthesis," "Photosynthesis-antenna proteins" and so on. The 21,271 DEGs in HFvsHG were significantly enriched in 43 GO terms, in which the main categories were in "Biological Process" and "Molecular Function" domain, including "metabolic process," "catalytic activity," "organic substance metabolic process," "single-organism process," "primary metabolic process" and no categories were in the "Cellular Component" domain. The top 5 significantly enriched categories were "cyclic nucleotide biosynthetic process," "nucleoside phosphate biosynthetic process," "nucleotide biosynthetic process," "cyclic nucleotide metabolic process" (Biological Process) and "phosphorus-oxygen lyase activity" (Molecular Function) ( Table S4). The 1900 DEGs in HFvsHL comparison were significantly enriched in 57 GO terms, in which the main categories were in the "Biological Process" and "Cellular Component" domains, including "metabolic process," "cellular metabolic process," "cell" and "cell part." There were only five categories in the "Molecular Function" domain, including "acireductone dioxygenase [iron(II)-requiring] activity," "beta-glucosidase activity," "nutrient reservoir activity," "structural constituent of ribosome," "structural molecule activity." The top 5 significantly enriched categories were "thylakoid," "photosynthetic membrane," "thylakoid part," "chloroplast" (Cellular Component) and "photosynthesis" (Biological Process), which were related to photosynthesis (Table S5).
2.3.5. KEGG Analysis of DEGs between Two Groups: Addition of Acetate or Fe 2+ The above DEG analyses indicated that the two salts had different influences on H. pluvialis under high light stress. The 4864 (2844 up-regulated and 2020 down-regulated) DEGs in the HAvsHF comparison were analyzed using the KEGG database. The 2844 up-regulated DEGs were significantly enriched in 17 KEGG terms, such as "Fatty acid elongation," "Fatty acid biosynthesis," "Linoleic acid metabolism," "Glycerolipid metabolism," "Biosynthesis of unsaturated fatty acids," which related to lipid substances and "Carotenoid biosynthesis," which related to astaxanthin synthesis. The 2020 down-regulated DEGs were significantly enriched in 12 KEGG terms, in which the main terms were related to DNA replication ("DNA replication," "Mismatch repair," "Pyrimidine metabolism," "Porphyrin and chlorophyll metabolism") and photosynthesis ("Photosynthesis," "Photosynthesis-antenna proteins," "Carbon fixation in photosynthetic organisms," "Porphyrin and chlorophyll metabolism"). All the above shows that acetate promotes the up-regulated expression of genes related to lipid and carotenoid pathways in H. pluvialis. As for Fe 2+ , the expression of genes that related to photosynthesis were significantly promoted compared with the inhibition effect of sodium acetate.

The Effects of Induction Treatment on Specific Biological Pathways
In order to explore the specific effects of the three induction treatments (high light, acetate, Fe 2+ ) on H. pluvialis, based on the DEG analysis in KEGG, three specific pathways were chosen: "carotenoid biosynthesis," "fatty acid biosynthesis" and "photosynthesis-antenna proteins".

The Effects on the Carotenoid Biosynthesis Pathway
In HLvsHG, the 22 DEGs related to carotenoid biosynthesis (pathway map: ko00906) were annotated as 8 genes: phytoene desaturase (PDS), lycopene β-cyclase (lcyB), prolycopene isomerase (crtISO), carotene epsilon-monooxygenase (LUT1), zeaxanthin epoxidase (ZEP), beta-carotene hydroxylase (crtZ), β-ring hydroxylase (LUT5), carotenoid cleavage dioxygenase 8 (CCD8). In HAvsHL, the 7 DEGs related to carotenoid biosynthesis were annotated as 3 genes: CCD8, crtZ and lycopene epsilon-cyclase (lcyE) and there were no DEGs in HFvsHL (Table 4). CrtZ and LUT5 are genes related to the conversion of β-carotene to astaxanthin and in addition, they also play a role in the conversion of lycopene to lutein with LUT1 (Figure 7). The expression of crtZ was increased under high light and significantly increased in the acetate treatment. LUT5 and LUT1 were up-regulated under high light but did not change in the comparison groups of HAvsHL and HFvsHL. The genes in the conversion of geranylgeranyl pyrophosphate (GGPP) to β-carotene, including PDS, crtISO, lcyB, were down-regulated under high light in this study. The results indicate that not all the genes involved in astaxanthin synthesis were up-regulated under the stress condition, maybe due to the stress duration, stress condition or other reasons. LcyE is an important gene in the lycopene to lutein synthesis pathway and it was down-regulated in HAvsHL, which would decrease competition for the precursor of β-carotene. ZEP converts zeaxanthin to violaxanthin under low-light condition, which would further generate abscisate. ZEP was down-regulated in HLvsHG, which means that the conversion of zeaxanthin to violaxanthin was strongly inhibited under high light. The last gene CCD8 is related to the conversion of β-carotene to strigol (Figure 7). CCD8 was significantly up-regulated both in HLvsHG and HAvsHL, which indicates that strigol production is enhanced, together with astaxanthin, for both the high light and the high light with acetate treatments. No significantly enriched DEGs were related to carotenoid biosynthesis in HFvsHL, which indicates that addition of Fe 2+ probably did not influence the accumulation of astaxanthin. enriched DEGs were related to carotenoid biosynthesis in HFvsHL, which indicates that addition of Fe 2+ probably did not influence the accumulation of astaxanthin. "Up" means all of the DEGs were up-regulated, "down" means all of the DEGs were down-regulated, "ns" means the no significant change, with the DEGs both up-regulated and down-regulated, "/" means no DEGs.

The Effects on the Fatty Acid Elongation Pathway
Fatty acid elongation (pathway map: ko00062) can be divided into two pathways, one to form palmitic acid in mitochondria, (0 < n < 16, where "n" is the number of carbon atoms in the fatty acid chain) and the second to form long-chain fatty acids (n > 16) in the endoplasmic reticulum ( Figure 8). There were 28 differentially expressed unigenes related to the fatty acid elongation pathways in the HLvsHG comparison group and all of them were up-regulated. Among those, 25 unigenes annotated

The Effects on the Fatty Acid Elongation Pathway
Fatty acid elongation (pathway map: ko00062) can be divided into two pathways, one to form palmitic acid in mitochondria, (0 < n < 16, where "n" is the number of carbon atoms in the fatty acid chain) and the second to form long-chain fatty acids (n > 16) in the endoplasmic reticulum ( Figure 8). There were 28 differentially expressed unigenes related to the fatty acid elongation pathways in the HLvsHG comparison group and all of them were up-regulated. Among those, 25 unigenes annotated as 3-ketoacyl-CoA synthase (KCS) and the other three were mitochondrial trans-2-enoyl-CoA reductase (MECR), very-long-chain (3R)-3-hydroxyacyl-CoA dehydratase (PHS1) and palmitoyl-protein thioesterase (PPT), respectively (Table 5). In the comparison group HAvsHL, there were 10 up-regulated unigenes annotated as KCS, 1 up-regulated unigene annotated as MECR and 1 down-regulated unigene annotated as PPT. And in the comparison group HFvsHL, there were only 4 down-regulated unigenes, annotated as KCS and PPT. KCS and PHS1 promote the formation of long-chain fatty acids from Malonyl-CoA. MECR and PPT promote the formation of palmitic acid from Acetyl-CoA. The results clearly show that high light promotes fatty acid elongation, especially through an increase in the expression of KCS and that acetate addition further increases the expression of KCS, whereas Fe 2+ addition inhibited the expression of KCS to a certain extent. as 3-ketoacyl-CoA synthase (KCS) and the other three were mitochondrial trans-2-enoyl-CoA reductase (MECR), very-long-chain (3R)-3-hydroxyacyl-CoA dehydratase (PHS1) and palmitoylprotein thioesterase (PPT), respectively (Table 5). In the comparison group HAvsHL, there were 10 up-regulated unigenes annotated as KCS, 1 up-regulated unigene annotated as MECR and 1 downregulated unigene annotated as PPT. And in the comparison group HFvsHL, there were only 4 downregulated unigenes, annotated as KCS and PPT. KCS and PHS1 promote the formation of long-chain fatty acids from Malonyl-CoA. MECR and PPT promote the formation of palmitic acid from Acetyl-CoA. The results clearly show that high light promotes fatty acid elongation, especially through an increase in the expression of KCS and that acetate addition further increases the expression of KCS, whereas Fe 2+ addition inhibited the expression of KCS to a certain extent. "Up" means all of the DEGs were up-regulated, "down" means all of the DEGs were down-regulated, "/" means no DEGs.

The Effects on Photosynthesis-Antenna Proteins Pathway
The antenna proteins that exist in light-harvesting chlorophyll protein complexes in green plants act as peripheral antenna systems, enabling more efficient absorption of light energy. The DEGs related to the photosynthetic-antenna protein pathway (pathway map: ko00196) is shown in Table  6. The genes encoding light-harvesting complex I chlorophyll a/b binding proteins 1-5 (LHCA1-5)

The Effects on Photosynthesis-Antenna Proteins Pathway
The antenna proteins that exist in light-harvesting chlorophyll protein complexes in green plants act as peripheral antenna systems, enabling more efficient absorption of light energy. The DEGs related to the photosynthetic-antenna protein pathway (pathway map: ko00196) is shown in Table 6. The genes encoding light-harvesting complex I chlorophyll a/b binding proteins 1-5 (LHCA1-5) encode binding proteins in light-harvesting complex I and those encoding light-harvesting complex II chlorophyll a/b binding proteins 1-7 (LHCB1-7) encode binding proteins in light-harvesting complex II. In HLvsHG, the significantly up-regulated genes were LHCA1, LHCA3 and LHCB5. More up-regulated genes were found in HFvsHL, including LHCA1, LHCA3, LHCA4, LHCB2, LHCB3, LHCB5, LHCB6 and LHCB7. But in HAvsHL, all of the genes related to photosynthesis-antenna proteins were down-regulated (Figure 9). These results indicate that under high light condition, the genes related to photosynthesis-antenna proteins are partially up-regulated. Adding Fe 2+ to cultures under high light condition up-regulated the expression of additional photosynthesis-antenna genes, whereas acetate down-regulated the expression of photosynthesis-antenna genes. Table 6. The up-and-down regulated genes related to photosynthetic-antenna protein pathway. encode binding proteins in light-harvesting complex I and those encoding light-harvesting complex II chlorophyll a/b binding proteins 1-7 (LHCB1-7) encode binding proteins in light-harvesting complex II. In HLvsHG, the significantly up-regulated genes were LHCA1, LHCA3 and LHCB5. More up-regulated genes were found in HFvsHL, including LHCA1, LHCA3, LHCA4, LHCB2, LHCB3, LHCB5, LHCB6 and LHCB7. But in HAvsHL, all of the genes related to photosynthesis-antenna proteins were down-regulated ( Figure 9). These results indicate that under high light condition, the genes related to photosynthesis-antenna proteins are partially up-regulated. Adding Fe 2+ to cultures under high light condition up-regulated the expression of additional photosynthesis-antenna genes, whereas acetate down-regulated the expression of photosynthesis-antenna genes. Table 6. The up-and-down regulated genes related to photosynthetic-antenna protein pathway.

Analysis of the Results of Real-Time Quantitative PCR
To validate the sequencing results obtained by RNA-seq, four genes related to the carotenoid pathway were chosen for real-time PCR: crtZ, lcyB, crtISO and ZEP. The results of qRT-PCR showed that crtZ was significantly up-regulated in three experimental groups (HL, HA and HF) relative to the control group (HG). The level increased more than 10 fold in HAvsHL and HFvsHL and 6 fold in

Analysis of the Results of Real-Time Quantitative PCR
To validate the sequencing results obtained by RNA-seq, four genes related to the carotenoid pathway were chosen for real-time PCR: crtZ, lcyB, crtISO and ZEP. The results of qRT-PCR showed that crtZ was significantly up-regulated in three experimental groups (HL, HA and HF) relative to the control group (HG). The level increased more than 10 fold in HAvsHL and HFvsHL and 6 fold in HLvsHG. CrtISO and LcyB were down-regulated in three experimental groups, decreasing 1.5-7.5 fold. ZEP was significantly down-regulated in the HAvsHG group (7.5 fold) ( Figure 10). Overall, the expression levels of the four genes were similar in the results of RNA-seq and qRT-PCR. There were also some small differences between the two measurements, such as the up-regulated expression of ZEP in HLvsHG. This outcome might have been affected by different measurement methods. HLvsHG. CrtISO and LcyB were down-regulated in three experimental groups, decreasing 1.5-7.5 fold. ZEP was significantly down-regulated in the HAvsHG group (7.5 fold) ( Figure 10). Overall, the expression levels of the four genes were similar in the results of RNA-seq and qRT-PCR. There were also some small differences between the two measurements, such as the up-regulated expression of ZEP in HLvsHG. This outcome might have been affected by different measurement methods.

Discussion
In recent years, transcriptome analysis has become a hot tool for studying the level of gene transcription [19]. It has been reported in another transcriptome analysis of H. pluvialis that in the case of three genes-crtZ, ZDS, PDS-involved in the carotenoid biosynthesis pathway, crtZ was upregulated after induction with both salicylic acid and jasmonic acid [9]. Using transcriptome analysis, Gwak found that beta-carotene hydroxylase (BKT or crtZ), phytoene synthase (PSY) and phytoene desaturase (PDS) were all up-regulated under high irradiance [7]. In our research, the expression of crtZ was affected by high light and was significantly promoted by acetate. It can be seen that βcarotene hydroxylase (crtZ) played a fundamental role in the astaxanthin biosynthesis pathway and many factors can promote its up-regulated expression, such as high light, acetate and specific hormones. High light impacted the expression of more genes involved in the carotenoid biosynthesis pathway, including PDS, crtISO, LcyB, LUT1, LUT5 and ZEP, than the other two induction conditions, which shows that high light is the main driver for changes in expression of genes related to carotenoid biosynthesis. Acetate further promotes astaxanthin biosynthesis by enhancing the expression of crtZ and inhibiting the expression of LcyE. With regard to lipids, using a comparison of lipidomes and transcriptomes, Gwak found that synthesis of the storage lipid, triacylglycerol (TAG), was substantially stimulated under high irradiance and that it was regulated by expression of de novo fatty acid biosynthesis at the gene level [7]. Here, the genes involved in fatty acid related pathways, especially the fatty acid elongation pathway, were significantly affected by high light conditions. The addition of acetate further enhanced expression of the related genes (KCS, MECR). From all of the above, it can be seen that there was a similar effect of high light and acetate on the expression of genes involved in the carotenoid pathway and fatty acid pathway. There may be some link between the two processes where affecting one also affects the other. As for Fe 2+ , Kobayashi pointed out that Fe 2+ Figure 10. The results of qRT-PCR. "RQ" represented 2 −∆∆Ct and "Log (RQ) > 0" means the gene was up-regulated, "Log (RQ) < 0" means the gene was down-regulated. (A-D) represented crtZ, crtISO, lcyB and ZEP, respectively.

Discussion
In recent years, transcriptome analysis has become a hot tool for studying the level of gene transcription [19]. It has been reported in another transcriptome analysis of H. pluvialis that in the case of three genes-crtZ, ZDS, PDS-involved in the carotenoid biosynthesis pathway, crtZ was up-regulated after induction with both salicylic acid and jasmonic acid [9]. Using transcriptome analysis, Gwak found that beta-carotene hydroxylase (BKT or crtZ), phytoene synthase (PSY) and phytoene desaturase (PDS) were all up-regulated under high irradiance [7]. In our research, the expression of crtZ was affected by high light and was significantly promoted by acetate. It can be seen that β-carotene hydroxylase (crtZ) played a fundamental role in the astaxanthin biosynthesis pathway and many factors can promote its up-regulated expression, such as high light, acetate and specific hormones. High light impacted the expression of more genes involved in the carotenoid biosynthesis pathway, including PDS, crtISO, LcyB, LUT1, LUT5 and ZEP, than the other two induction conditions, which shows that high light is the main driver for changes in expression of genes related to carotenoid biosynthesis. Acetate further promotes astaxanthin biosynthesis by enhancing the expression of crtZ and inhibiting the expression of LcyE. With regard to lipids, using a comparison of lipidomes and transcriptomes, Gwak found that synthesis of the storage lipid, triacylglycerol (TAG), was substantially stimulated under high irradiance and that it was regulated by expression of de novo fatty acid biosynthesis at the gene level [7]. Here, the genes involved in fatty acid related pathways, especially the fatty acid elongation pathway, were significantly affected by high light conditions. The addition of acetate further enhanced expression of the related genes (KCS, MECR). From all of the above, it can be seen that there was a similar effect of high light and acetate on the expression of genes involved in the carotenoid pathway and fatty acid pathway. There may be some link between the two processes where affecting one also affects the other. As for Fe 2+ , Kobayashi pointed out that Fe 2+ promoted carotenoid formation by providing oxidative stress in acetate-induced cyst cells (through the method of inhibiting by potassium iodide) and the effect of Fe 2+ could be substituted by active oxygen species [13]. Here, the genes most significantly affected by Fe 2+ were related to photosynthesis, especially the photosynthesis -antenna proteins. The effect of Fe 2+ on H. pluvialis was not directly on the genes involved in astaxanthin biosynthesis but rather indirectly by promoting oxidative stress, affecting the genes related to photosynthesis and thereby promoting astaxanthin accumulation.

Cell Culture and Treatments
H. pluvialis HOUC9 was stored in the Laboratory of Phycology at the Ocean University of China. 1000 mL conical flasks were used to culture 600 mL of algal cells in modified BBM (Bold's Basal Medium) medium, with a 12 h:12 h light/dark cycle at 22 • C. The light intensity used for the green growth period was 30 µEm −2 ·s −1 . When the cell concentration reached to (4-5) × 10 5 cells/mL, induction treatments were initiated.
The inductions were performed in 250 mL conical flasks with 50 mL inducing medium added to 50 mL of algal culture. The induction medium contained either acetate or Fe 2+ as inducing salt. The experiments are grouped as follows: group HL was cultured under high light intensity (195 µEm −2 ·s −1 ) without any addition of inducing salt; group HA had acetate added to a total concentration of 2 g/L under the same high light intensity as group HL; group HF had Fe 2+ added to a total concentration of 20 µM under the same high light intensity as group HL; group HG was the control group grown under normal light intensity (30 µEm −2 ·s −1 ) without addition of an inducing salt. Each group was comprised of three biological repeats and were cultured for 48 h. The algal cells were then collected and frozen in liquid nitrogen for RNA extraction.

RNA Extraction and cDNA Library Construction
Total RNA of all samples was extracted using the EasyPure ® Plant RNA Kit (TransGen, Beijing, China). Then, four different methods were used to detect the quality and quantity of the total RNA. First, agarose gel electrophoresis was used to determine the RNA content. Second, a Nanodrop (Thermo Fisher Scientific, Waltham, MA, USA) spectrophotometer OD260/280 ratio, was used to measure its purity. Then, the concentration of the RNA was quantified using a Qubit fluorimeter (Thermo Fisher Scientific, Waltham, MA, USA). Finally, an Agilent 2100 bioanalyzer (Agilent, Beijing, China) was used to determine the integrity of the RNA accurately.
After the samples were characterized, the eukaryotic mRNA was enriched using Oligo(dT) magnetic beads. The mRNA was cut into short fragments using fragment Buffer and then used as the template for single-strand cDNA synthesis, after adding 6-bp random hexamers. The second-strand cDNA synthesis was performed by adding buffer, dNTPs, DNA polymerase I and RNase H. After purification of the double-stranded cDNA using AMPure XP beads, the ends were repaired and poly (A) was linked to the 3 -cDNA. Sequencing adapters, which contained a recognition site, were added to the end of the poly (A) sequence. After amplifying the double-stranded cDNA, the PCR amplification product was purified by AMPure XP beads. Finally, the cDNA library was constructed and the preliminary initial quantitation of the library was performed by Qubit 2.0 (Thermo Fisher Scientific, Waltham, MA, USA). Then, the cDNA concentration was diluted to 1.5 ng/µL. Subsequently, the insert size was checked with an Agilent 2100. After ascertaining that the insert size met the requirement, the effective concentration of the library was accurately quantified by Q-PCR (Effective concentration of library >2 nM) to guarantee the quality of the library. After detection of the libraries of all samples, the different libraries were pooled according to the effective concentration and the required amount of data and then sequencing was performed by Illumina HiSeq™ (Illumina, Santiago, CA, USA).

Quality Controls, De Novo Assembly and Clustering
The raw data were obtained and the base quality was tested using Q phred software. If the sequence error rate was expressed in e, the base-quality value of Illumina HiSeq™ was expressed in Q phred , then the Q phred = −10Log 10 (e). And the distribution of the sequence error rate can reflect the quality of the sequenced data. Q10, Q20, Q30, Q40 represent an accuracy of 90%, 99%, 99.9%, 99.99%, respectively. The low-quality reads were filtered out, (i.e., reads containing primer/adaptor sequences, reads in which the proportion of undetermined bases were greater than 10%, reads with more than 50% proportion of bases with Q phred ≤ 20 and so on) and the remaining clean reads were retained [20].
For transcriptome without reference sequence, the clean reads were de novo assembled by using Trinity [21] to obtain a reference sequence for subsequent analysis. Trinity combined three independent software modules (Inchworm, Chrysalis and Butterfly) to deal with a large amount of RNA-seq data successively. Inchworm was used to form contigs by decomposing reads, constructing k-mer (k = 25) dictionary, selecting seed k-mer and extending on both sides. Chrysalis was used to form components through the cluster of overlapped contigs and every component became a collection of possible characterization of variable shear isoform or homologous genes. And each component had the corresponding de Bruijn graph. Butterfly was used to output full-length transcripts of variable shear isoforms and comb transcripts that corresponded to paralogous genes through simplifying each component's de Bruijn graph. After these three steps, the assembled results of TRINITY.fasta were finally obtained.
Then Corset was used to hierarchically cluster transcripts by expression patterns and the number of reads which had been blasted to transcripts [22]. After hierarchical clustering by Corset, the longest Cluster sequence was used for further analysis (Corset website: https://code.google.com/p/corsetproject/).

Gene Function Annotation and Classification
Gene functional annotation was performed using seven databases, including Nr, Nt, Pfam, KOG/COG, SwissProt, KEGG and GO. By using Blast to search the Nr database, we could find functional information for genes of H. pluvialis and the similarity of gene sequences between H. pluvialis and related species. By GO annotation, the annotated genes were classified to the next terms of the three GO categories (BP biological process, CC cellular component, MF molecular function). By KO annotation, the annotated genes could be classified according to their KEGG metabolic pathway.

Analysis of Differentially Expressed Genes (DEGs)
For biologically repeated samples, DESeq [23] was used to do the differential analysis based on the negative binomial distribution. Using padj < 0.05 (Padj was the corrected value of p-value) as threshold value to screen out DEGs in different comparison groups, which had been set to 6 groups: HLvsHG, HAvsHG, HAvsHL, HFvsHG, HFvsHL and HAvsHF. Then two comparison combinations of the Venn diagram: HLvsHG-HAvsHG-HFvsHG and HAvsHL-HFvsHL, were set to show the number of total genes and specific genes between those comparison groups. As for the expression level of DEGs, cluster analysis was used to determine the expression patterns of DEGs under different experimental conditions based on the FPKM (expected number of Fragments Per Kilobase of transcript sequence per Millions base pairs sequenced) values. It is the most common method of gene expression level estimation [24]. Then the DEG enrichment analysis using GO and KEGG was used to identify gene function and metabolic pathway in the microalgae under different conditions. The Gene Ontology (GO) database (http://www.geneontology.org/) was used to map the DEGs to the term types and calculate the number of every term type. With the termed GOseq, the significantly enriched terms were found by comparing them to the whole genome background [25]. KEGG is the main public database for pathway determination [26]. KEGG enrichment could identify the principal metabolic pathways and signal transduction pathways of the DEGs. The KEGG significant enrichment was analyzed using a hypergeometric test to look for significantly enriched pathways of DEGs, compared to the all annotated genes by using KEGG Pathway as a unit. Computational formula of the analysis is: N stands for the number of genes that can be annotated by pathways and n is the number of DEGs in N. M is the number of genes annotated to a certain pathway among all the genes and m is the number of DEGs in M. Those pathways with FDR (false discovery rate) ≤ 0.05 were defined as the significant enrichment pathways for DEGs. In this paper, FDR was a corrected p-value obtained by the BH method using KOBAS (2.0) [27].

Conclusions
Based on the DEGs involved in carotenoid metabolism, the expression pattern of 9 genes (PDS, crtISO, LUT1, LUT5, lcyB, lcyE, crtZ, CCD8, ZEP) are different under different induction conditions. The results indicated that acetate and Fe 2+ have different mechanisms for inducing astaxanthin production in H. pluvialis. High light is the direct factor for the changes in gene expression during the induction period. Acetate could further enhance expression of the genes involved in carotenoid and fatty acid synthesis. Fe 2+ mainly influences the genes involved in photosynthesis and especially promoted the expression of genes related to photosystem I and photosystem II. Taken together, these results obtained through application of RNA-seq provide a foundation and orientation for future studies on H. pluvialis.