De Novo Biosynthesis of p-Coumaric Acid in E. coli with a trans-Cinnamic Acid 4-Hydroxylase from the Amaryllidaceae Plant Lycoris aurea

p-Coumaric acid is a commercially available phenolcarboxylic acid with many important applications in the nutraceutical, pharmaceutical, material and chemical industries. p-Coumaric acid has been biosynthesized in some engineered microbes, but the potential of the plant CYP450-involved biosynthetic route has not been investigated in Escherichia coli. In the present study, a novel trans-cinnamic acid 4-hydroxylase (C4H) gene, LauC4H, was isolated from Lycoris aurea (L’ Hér.) Herb via rapid amplification of cDNA ends. The N-terminal 28 amino acids of LauC4H were then shown to direct subcellular localization to the endoplasmic reticulum membrane in protoplasts of Arabidopsis thaliana. In E. coli, LauC4H without the N-terminal membrane anchor region was functionally expressed when fused with the redox partner of A. thaliana cytochrome P450 enzyme (CYP450), and was verified to catalyze the conversion of trans-cinnamic acid to p-coumaric acid by whole-cell bioconversion, HPLC detection and LC-MS analysis. Further, with phenylalanine ammonia-lyase 1 of A. thaliana, p-coumaric acid was biosynthesized de novo from glucose as the sole carbon source via the phenylalanine route in the recombinant E. coli cells. By regulating the level of intracellular NADPH, the production of p-coumaric acid was improved 9.18-fold, reaching a titer of 156.09 μM in shake flasks. The recombinant cells harboring functional LauC4H afford a promising chassis for the biological production of p-coumaric acid, and potentially its derivatives, via a plant CYP450-involved pathway.


Introduction
p-Coumaric acid is a commercially available phenolcarboxylic acid with many important applications in the nutraceutical, pharmaceutical, material and chemical industries. It possesses potent antioxidant, antibacterial and anti-inflammatory properties, and serves as a conventional precursor for the production of flavors and fragrances used in edible and daily chemical products. p-Coumaric acid is also a starting material for the preparation of environmentally degradable thermoplastics with liquid crystalline behavior [1]. Recently, p-coumaric acid has been found to have many novel bioactivities, such as an antiproliferative effect [2], an anxiolytic effect [3], a nephroprotective role [4], melanogenesis inhibition [5], and neuroprotective effects [6]. By virtue of the plant biosynthetic route PAL-C4H, p-coumaric acid has been biosynthesized from phenylalanine only in engineered S. cerevisiae [18]. The enzyme C4H is a member of the CYP73A subfamily of cytochrome P450 enzymes (CYP450s), which catalyze a series of oxidation reactions with a CYP450 reductase as the redox partner supplying electrons from NADPH, and is thought to localize commonly at the cytoplasmic side of the endoplasmic reticulum (ER) membrane [14]. Given that prokaryotic microbes such as E. coli do not possess compartmentalized organelles, the functional expression of CYP450s in them is difficult [24,25]. Vannelli and co-workers functionally co-expressed the C4H and the CYP450 reductase from Helianthus tuberosus with a fungal PAL enzyme in S. cerevisiae cells, and produced p-coumaric acid from the central metabolite L-phenylalanine via the PAL-C4H route [18].
Though p-coumaric acid can be produced directly from tyrosine, it was necessary to investigate the potential of the plant CYP450-involved biosynthetic route from phenylalanine in E. coli, which could to some extent expand the biosynthetic pathway of p-coumaric acid and offer an alternative approach to producing p-coumaric acid from phenylalanine as well as tyrosine. In the present study, based on our previous transcriptome data of Lycoris aurea (L' Hér.) Herb [26], an ornamentally and medicinally important plant of the Lycoris genus of the Amaryllidaceae family, a novel C4H-encoding gene was isolated from L. aurea and designated LauC4H. Full-length and N-terminally truncated forms of LauC4H were then expressed in protoplasts of A. thaliana to identify the amino acids responsible for the subcellular localization. Moreover, in E. coli, LauC4H without the N-terminal membrane anchor region was heterologously expressed for functional identification. PAL1 of A. thaliana was further introduced into the recombinant E. coli for de novo biosynthesis of p-coumaric acid from glucose via the phenylalanine route. By regulating the level of intracellular NADPH, the production of p-coumaric acid was further increased 9.18-fold, and a titer of 156.09 µM was achieved in shake flasks.

C4H Homology of L. aurea Transcriptome
Previously, de novo transcriptome sequencing was performed to produce a comprehensive expressed sequence tag (EST) dataset for L. aurea using high-throughput sequencing technology [26]. The EST dataset of L. aurea provides a platform that is critical for speeding up the identification of a large number of genes related to secondary metabolite production. Further batch alignment results revealed that 226 contigs and unigenes were annotated as belonging to the phenylpropanoid biosynthetic pathway. Of these, one unigene, CL5217, showing high similarity with plant C4Hs was retrieved. Unigene CL5217 was 1794 bp long with a predicted 1518 bp open reading frame (ORF), and was selected for further molecular cloning and functional characterization for the biosynthesis of p-coumaric acid.

Cloning of Full-length C4H Genes in L. aurea
Using quantitative real-time polymerase chain reaction (qRT-PCR) and rapid amplification of cDNA ends (RACE), a cDNA encoding a C4H homolog was isolated from L. aurea. First, quantitative PCR primer pairs annealing at the 3′-terminus of the predicted ORF and the 3′-untranslated region were designed and used to determine the abundance of unigene CL5217 in various tissues of L. aurea. The qRT-PCR results revealed that the level of unigene CL5217 was highest in the scape (Figure 2).
Figure 2. The abundance of the C4H candidate gene (unigene CL5217) in various tissues of L. aurea. The values and error bars represent the mean ± standard error from three independent samples with three replicates per sample.
Then, based on the sequence of unigene CL5217, the predicted ORF was cloned with the scape cDNA pool as the template. Four gene variants were obtained with above 99% identity to each other at the nucleic acid level (Appendix A Figure A1), corresponding to two protein variants at the amino acid level (Appendix A Figure A2). Based on the genome size observed by flow cytometry, the plant material L. aurea used in this study is diploid (2n = 16) [27]. Therefore, a number of C4H paralogs were expected in the genome of L. aurea. Moreover, the plant used in this study was flowering and the bulb had been vegetatively propagated. For these reasons, the existence of multiple similar C4H transcripts was not surprising. This phenomenon was also reported in cloning the coding genes of norbelladine 4′-O-methyltransferase and the para-para' C-C phenol coupling CYP450 in the Amaryllidaceae plant Narcissus sp. aff. pseudonarcissus [28,29]. The deduced LauC4H protein (variant 1) had a predicted molecular mass of 58.19 kDa and a pI of 9.04.
LauC4H possessed all the diagnostic primary-structure features of CYP450s as well as of the CYP73A subfamily. A hydrophobic membrane-spanning region was predicted in the N-terminus of LauC4H between amino acid residues 9 and 26 by the TMpred online program (Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland) and the SignalP 3.0 server [30]. Following the N-terminal transmembrane sequence, a proline-rich region (PPGPLPVP) was present (Figure 3A). A conserved heme-binding motif (PFGVGRRSCPG) was also found near the C-terminus of LauC4H (Figure 3A). In addition, the "PERF" consensus sequence and some conserved helices such as the I-helix (AAIET), J-helix (PDIQQKLRNE), K-helix (KETLR) and K'-helix (AWWLANN) were also identified in the LauC4H sequence (Figure 3A). In the phylogenetic tree, plant C4Hs were grouped into two classes, as implied by the differences at the N-terminus and C-terminus in the amino acid sequence alignment (Figure 3A,B), and LauC4H showed higher identity with the C4Hs of Class I than with those of Class II (Figure 3B).
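The motif checks described above can be sketched as a simple string scan. This is only an illustrative sketch: the motif strings are those quoted in the text, but the toy sequence is synthetic (it is not the LauC4H sequence, which is not reproduced here), the motif order in it is arbitrary, and the function name is hypothetical.

```python
import re

# Motifs quoted in the text for LauC4H (Figure 3A).
MOTIFS = {
    "proline-rich region": "PPGPLPVP",
    "I-helix": "AAIET",
    "J-helix": "PDIQQKLRNE",
    "K-helix": "KETLR",
    "PERF consensus": "PERF",
    "K'-helix": "AWWLANN",
    "heme-binding motif": "PFGVGRRSCPG",
}


def locate_motifs(protein: str) -> dict:
    """Return the 1-based start position of each motif, or None if absent."""
    hits = {}
    for name, motif in MOTIFS.items():
        match = re.search(re.escape(motif), protein)
        hits[name] = match.start() + 1 if match else None
    return hits


# Synthetic toy sequence (assumption, NOT the real LauC4H): a 28-residue
# placeholder N-terminus followed by the motifs joined by "GS" spacers.
toy_sequence = "M" + "L" * 25 + "KK" + "GS".join(MOTIFS.values())

if __name__ == "__main__":
    for name, position in locate_motifs(toy_sequence).items():
        print(f"{name}: {position}")
```

With a real protein sequence in place of the toy string, a `None` entry would flag a motif that is absent or divergent and worth inspecting in the alignment.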

Subcellular Localization of LauC4H
Further, the subcellular localization of LauC4H in plant cells was investigated. LauC4H was first predicted to localize in the ER using the WoLF PSORT program [31], and its N-terminal sequence was predicted to anchor the protein to the membrane using the online software TMpred (SIB) and SignalP 3.0 [30]. Then, the whole ORF, the truncated sequence encoding the N-terminal 28 amino acids (LauC4H1-28) and the sequence encoding LauC4H without the N-terminal 28 amino acids (LauC4H∆2-28) were each fused with the enhanced green fluorescent protein (EGFP) gene. These fusion transgenes were transiently expressed in protoplasts prepared from tender leaves of A. thaliana. Meanwhile, the red fluorescent fusion protein HDEL-mCherry was used as the ER marker [32]. In Arabidopsis protoplasts containing LauC4H-EGFP and HDEL-mCherry, the green fluorescence overlapped with the red fluorescence of the ER marker (Figure 4), showing a typical fluorescence pattern of ER localization. A similar pattern was also observed for the LauC4H1-28-EGFP fusion protein, although a small portion of the green fluorescence did not overlap with the red fluorescence (Figure 4). In Arabidopsis protoplasts containing LauC4H∆2-28-EGFP, the fluorescence pattern was distinctly different from those of LauC4H-EGFP and LauC4H1-28-EGFP, and there was scarcely any overlap between the green fluorescence and the red fluorescence of the ER marker (Figure 4). Our observations therefore suggest that the N-terminal 28 amino acids spanning the hydrophobic transmembrane region are responsible for directing LauC4H to the ER of plant cells. Previous studies showed that the CYP450 redox partners from A. thaliana and hybrid poplar were localized at the ER membrane of Arabidopsis protoplasts [33,34], which supports the notion that the N-terminus may enable LauC4H to co-localize with the redox partner at the ER membrane and guide electron transfer in plant cells (Figure 5A).

Functional Identification of LauC4H
Given that the oxygenation process of CYP450s requires a redox partner as electron supplier, the A. thaliana CYP450 redox partner ATR2 was employed for the functional exploration of LauC4H in E. coli. Since E. coli has no internal ER-like membrane, both LauC4H and ATR2 were expressed heterologously without the N-terminal membrane anchor region, rather than as full-length sequences, to avoid incompatibility between the membrane recognition signal of the heterologous proteins and the prokaryotic host [25]. In detail, the truncated ATR2 (ATR2∆2-74) was fused to the N-terminus of the truncated LauC4H (LauC4H∆2-28) via a flexible octapeptide linker for electron supply (Figure 5B). The linker and the following proline-rich region in LauC4H could allow an optimal orientation between ATR2 and LauC4H [25,35]. When the recombinant E. coli cells harboring the chimeric ATR2∆2-74-LauC4H∆2-28 protein were incubated with the substrate trans-cinnamic acid, a new product peak with the same retention time as p-coumaric acid was detected by HPLC analysis (Figure 5C). UV-Vis absorption spectra confirmed that the new product was identical to p-coumaric acid (Figure 5D). The corresponding product was further examined using liquid chromatography-mass spectrometry (LC-MS) in positive ion mode. LC-MS analysis of the targeted peak of the two LauC4H paralogs displayed [M + H]+ ions at m/z 165 (Figure 5E,F), corresponding to the calculated molecular weight of p-coumaric acid (MW = 164.16). In contrast, no visible peak with the same retention time as p-coumaric acid was observed in the conversion systems with ATR2∆2-74 only or with the empty vector pET29a, even with trans-cinnamic acid supplied as the substrate. These results indicated that the new product synthesized with the chimeric ATR2∆2-74-LauC4H∆2-28 protein was p-coumaric acid. Thus, LauC4H is an authentic trans-cinnamic acid 4-hydroxylase able to participate in the conversion of trans-cinnamic acid into p-coumaric acid with the plant CYP450 redox partner (Figure 1).
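The mass assignment above can be checked with a short calculation. The sketch below assumes the molecular formula C9H8O3 for p-coumaric acid and uses standard atomic masses; the variable and function names are illustrative only.

```python
# Atomic masses (u). Average values give the molecular weight quoted in the
# text; monoisotopic values give the ion mass seen by the mass spectrometer.
AVERAGE_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}
MONOISOTOPIC_MASS = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON_MASS = 1.00727647  # mass added on protonation to form [M + H]+

# Assumed formula of p-coumaric acid: C9H8O3.
P_COUMARIC_ACID = {"C": 9, "H": 8, "O": 3}


def formula_mass(formula: dict, masses: dict) -> float:
    """Sum the atomic masses over all atoms of the formula."""
    return sum(masses[element] * count for element, count in formula.items())


average_mw = formula_mass(P_COUMARIC_ACID, AVERAGE_MASS)
mz_protonated = formula_mass(P_COUMARIC_ACID, MONOISOTOPIC_MASS) + PROTON_MASS

if __name__ == "__main__":
    print(f"Average MW: {average_mw:.2f}")
    print(f"[M + H]+ m/z: {mz_protonated:.4f}")
```

The average molecular weight reproduces the 164.16 quoted above, and the protonated monoisotopic mass rounds to the nominal m/z of 165 observed in positive ion mode.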

p-Coumaric Acid De Novo Biosynthesis Using LauC4H in E. coli
To apply LauC4H to the de novo biosynthesis of p-coumaric acid, the E. coli cells harboring the functional chimera ATR2∆2-74-LauC4H∆2-28 were cultured in mineral medium supplemented with glucose as the sole carbon source. However, no p-coumaric acid was detected (Table 1), indicating that E. coli could not produce trans-cinnamic acid. When A. thaliana phenylalanine ammonia-lyase 1 (AthPAL1) was further introduced into these E. coli cells, p-coumaric acid was produced and accumulated in the medium of the recombinant cells with ATR2∆2-74-LauC4H∆2-28 and AthPAL1 (Table 1). Therefore, the plant CYP450-involved p-coumaric acid biosynthesis pathway was successfully established in prokaryotic E. coli. Within 42 h of induction with IPTG, the recombinant E. coli produced 17.22 µM of p-coumaric acid. The expression level and solubility of the chimeric ATR2∆2-74-LauC4H∆2-28 fusion protein were examined in Ec/LauC4H and Ec/LauC4H-AthPAL (Figure A3). As shown in Figure A3, the expression level of the chimeric fusion protein in Ec/LauC4H was higher than that in Ec/LauC4H-AthPAL. However, the level of the soluble fusion protein was comparable between Ec/LauC4H and Ec/LauC4H-AthPAL. For comparison, E. coli cells with only AthPAL1 or only PAL/TAL from the yeast Rhodotorula glutinis (RglPAL/TAL) [18] were also cultured as controls. In cells with only AthPAL1, trans-cinnamic acid rather than p-coumaric acid was detected in the medium (Table 1). In cells with only RglPAL/TAL, both trans-cinnamic acid and p-coumaric acid accumulated, at concentrations of 91.94 µM and 46.09 µM, respectively (Table 1). Notably, 342.82 µM of trans-cinnamic acid was detected in the medium of the recombinant cells with ATR2∆2-74-LauC4H∆2-28 and AthPAL1 (Table 1).
Table 1 notes: data represent the mean ± standard error from three independent experiments of each strain. ND, not detected.

p-Coumaric Acid Production Improved by Intracellular NADPH Regulation
The LauC4H-catalyzed conversion of trans-cinnamic acid into p-coumaric acid needs two electrons per molecule of substrate. ATR2, as a CYP450 redox partner, is NADPH-dependent [33,34]. Since about 20-fold more LauC4H substrate than product remained in the medium of strain Ec/LauC4H-AthPAL (Table 1), we hypothesized that the output of p-coumaric acid was limited by the level of intracellular NADPH (Figure 1). To test this hypothesis, we attempted to elevate the level of intracellular NADPH. A synthetic small regulatory RNA (srRNA), anti(sthA) [36], was used to specifically repress the translation of the soluble transhydrogenase SthA (also referred to as UdhA) [37], so that the conversion of NADPH to NADH would be down-regulated and the content of NADPH would be relatively enhanced. As shown in Figure 6, the introduction of the srRNA anti(sthA) accelerated cell growth and glucose utilization, and resulted in an about 2-fold increase in p-coumaric acid production. In addition, we also pursued overexpression of the membrane-bound transhydrogenase PntAB, which catalyzes the NADH to NADPH conversion [37]. When five-copy pntAB driven by a T7 promoter was overexpressed, the production of p-coumaric acid was improved 7.93-fold, up to 136.53 µM (Figure 6C). Combining overexpressed PntAB with the srRNA anti(sthA) resulted in a 9.18-fold increase, along with relatively lower biomass and slower glucose consumption. Thus, overexpressed PntAB and the srRNA anti(sthA) had a synergistic positive effect on the LauC4H-mediated de novo biosynthesis of p-coumaric acid. The expression level and solubility of the chimeric ATR2∆2-74-LauC4H∆2-28 fusion protein in these p-coumaric acid producers were comparable (Figure A3).
Under these circumstances, a considerable amount of trans-cinnamic acid still remained in the medium, indicating that, besides NADPH supply, the step catalyzed by the chimera ATR2∆2-74-LauC4H∆2-28 was rate-limiting for the formation of p-coumaric acid. More approaches should therefore be tested for improving the output of p-coumaric acid in E. coli. For example, the turnover of LauC4H could be enhanced either by increasing the expression level of the chimera or by modulating the spatial structure of the chimera using modularized bioengineering tools in synthetic biology.
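The ratios quoted in this section follow from the reported concentrations. The sketch below is a minimal check that assumes the 17.22 µM titer of Ec/LauC4H-AthPAL is the baseline for the PntAB fold change; the variable names are illustrative.

```python
# Concentrations reported in the text (µM).
baseline_p_coumaric = 17.22    # Ec/LauC4H-AthPAL, 42 h after IPTG induction
residual_t_cinnamic = 342.82   # trans-cinnamic acid left in the same culture
pntab_p_coumaric = 136.53      # titer with five-copy pntAB overexpression

# Excess of unconverted substrate over product ("about 20-fold").
substrate_to_product = residual_t_cinnamic / baseline_p_coumaric

# Fold change with PntAB, assuming the 17.22 µM strain is the reference.
pntab_fold = pntab_p_coumaric / baseline_p_coumaric

if __name__ == "__main__":
    print(f"Substrate/product ratio: {substrate_to_product:.1f}")
    print(f"PntAB fold change: {pntab_fold:.2f}")
```

A substrate-to-product ratio near 20 is what motivated testing cofactor supply, since PAL was clearly delivering far more trans-cinnamic acid than the chimera hydroxylated.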

Plant Materials and Chemicals
The Lycoris aurea plants used in this study were collected from Nanjing Botanical Garden Mem. Sun Yat-Sen (Nanjing, China), and were about three years old and in flower unless otherwise stated. A. thaliana wild-type (Columbia ecotype) plants used in this study were grown at 22 °C for four weeks after germination. Chemicals and reagents used in this study were purchased from either Sigma-Aldrich (St. Louis, MO, USA) or Sangon Biotech (Shanghai, China).

RNA Extraction and Isolation of LauC4H Genes
Total RNA from different tissues of L. aurea was extracted using the RNAprep pure Plant Kit (TIANGEN, Beijing, China). The cDNA pool was then synthesized with the PrimeScript™ RT reagent Kit (TaKaRa, Dalian, China). To quantify unigene CL5217 expression in different tissues, quantitative real-time polymerase chain reaction (qRT-PCR) was performed using AceQ® qPCR SYBR® Green Master Mix (High ROX Premixed) (Vazyme Biotech, Nanjing, China) with the gene LauTIP41 as the internal reference [38]. To obtain the desired sequences, 5′- and 3′-rapid amplification of cDNA ends (RACE) were carried out using the SMARTer™ [...] Table A1.
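Relative quantification against an internal reference gene is often computed with the 2^-ΔCt approach. The sketch below illustrates that calculation together with the mean ± standard error reporting used for Figure 2. The text does not state which quantification formula was used, so the method, the Ct values and the function names here are all assumptions for illustration.

```python
import math
import statistics


def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Expression of the target relative to the reference gene (2^-dCt)."""
    return 2.0 ** -(ct_target - ct_reference)


def mean_and_se(values):
    """Mean and standard error of the mean for a list of measurements."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, se


# Hypothetical Ct pairs (target CL5217, reference LauTIP41) for one tissue;
# these numbers are invented purely to show the calculation.
scape_samples = [(22.1, 20.0), (22.4, 20.1), (21.9, 19.8)]
scape_levels = [relative_expression(t, r) for t, r in scape_samples]
scape_mean, scape_se = mean_and_se(scape_levels)

if __name__ == "__main__":
    print(f"Relative expression: {scape_mean:.3f} ± {scape_se:.3f}")
```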

Subcellular Localization Analysis of LauC4H
To determine the subcellular localization of LauC4H, enhanced green fluorescent protein (EGFP) was fused in frame to the C-terminus of the LauC4H protein sequence under the control of the dual cauliflower mosaic virus (CaMV) 35S promoter in the pAN580 vector. The plasmid pAN580 was digested with NcoI (TaKaRa, Dalian, China). The DNA fragments, lacking a termination codon and encoding full-length LauC4H, the N-terminal 28 amino acids of LauC4H (LauC4H1-28) and the N-terminally truncated LauC4H (LauC4H∆2-28), were prepared with primer pairs pAN580-LauC4H-PF and EGFP-LauC4H-PR, pAN580-LauC4H-PF and EGFP-LauC4H(N28)-PR, and pAN580-LauC4H(∆N28)-PF and EGFP-LauC4H-PR, respectively. The DNA fragments were assembled into the linearized pAN580 with the ClonExpress One Step Cloning Kit (Vazyme Biotech) to create pDual35S::LauC4H-EGFP, pDual35S::LauC4H1-28-EGFP and pDual35S::LauC4H∆2-28-EGFP. Primers used to make these constructs were designed according to the product manual and are listed in Appendix A Table A1. The well-established fluorescent protein marker mCherry-HDEL [32] was used to indicate the endoplasmic reticulum (ER). Each transient expression construct was co-transformed with the mCherry-HDEL construct into Arabidopsis protoplasts according to a published method [40]. The transformed samples were incubated for 16-18 h before examination. Fluorescence images were acquired with a Zeiss LSM780 laser scanning confocal microscope (Carl Zeiss Microscopy GmbH, Jena, Germany).

Bioconversion of LauC4H with trans-Cinnamic Acid
A whole-cell bioconversion strategy was adopted to identify the function of LauC4H. Cell cultures (50 mL) were harvested by centrifugation at 8000 rpm and 4 °C for 5 min, washed twice and re-suspended in 40 mL of nitrogen-free M9Y mineral medium. The re-suspended cells were supplemented with 20 g L−1 glucose and 100 µM trans-cinnamic acid, and incubated at 28 °C and 250 rpm for the whole-cell bioconversion. Samples were taken at intervals of 12 h; an equal volume of methanol was added to terminate the reaction, and the samples were then subjected to HPLC analysis.

p-Coumaric Acid De Novo Biosynthesis in E. coli
To biosynthesize the substrate of LauC4H in vivo in E. coli cells, the AthPAL1 gene encoding A. thaliana phenylalanine ammonia-lyase 1 was overexpressed in the pACYC184 vector [42,43] under an IPTG-inducible trc promoter. For gene overexpression, pACYC184 was digested with NcoI and EcoRI (TaKaRa, Dalian, China) to obtain the plasmid backbone with the p15A origin of replication and the tetracycline resistance gene. Primer pairs 184-trc-lacO-PF and BBa_B0034-lacO-PR were annealed and elongated, and then purified to obtain the DNA fragment trcO-RBS containing the core trc promoter (−10 box and −35 box), the lacI binding site and the BioBrick ribosome binding site (RBS) BBa_B0034 chosen from the MIT Registry of Standard Biological Parts (http://parts.igem.org/Main_Page) (The International Genetically Engineered Machine (iGEM), Cambridge, MA, USA). Primer pairs BBa_B0034-BBa_B0015-PF and 184-BBa_B0015-PR were annealed and elongated, and then purified to obtain the DNA fragment BBa_B0015 containing the BioBrick terminator BBa_B0015, also chosen from the MIT Registry of Standard Biological Parts (iGEM, Cambridge, MA, USA). The three fragments (plasmid backbone, trcO-RBS and BBa_B0015 terminator), which share overlapping ends, were assembled with the ClonExpress Ultra One Step Cloning Kit (Vazyme Biotech). The new plasmid was named p15A-trcO3415. Then, the AthPAL1 gene was amplified from the A. thaliana cDNA pool with primers BBa_B0034-AthPAL1-PF and BBa_B0015-AthPAL1-PR and assembled into p15A-trcO3415 linearized with NdeI restriction endonuclease (TaKaRa) via the ClonExpress One Step Cloning Kit (Vazyme Biotech) to create p15A-trcOAthPAL1. Then p15A-trcOAthPAL1 and the plasmid pET29a-ATR2∆2-74LauC4H∆2-28 were transformed into E. coli Rosetta (DE3) cells to obtain the recombinant strain Ec/LauC4H-AthPAL. The empty vector p15A-trcO3415 and the plasmid pET29a-ATR2∆2-74LauC4H∆2-28 were transformed into E. coli Rosetta (DE3) cells to obtain the recombinant strain Ec/LauC4H. The plasmid p15A-trcOAthPAL1 and the empty vector pET29a were transformed into E. coli Rosetta (DE3) cells to obtain the recombinant strain Ec/AthPAL. The primers involved in this experiment are summarized in Appendix A Table A1.
The DNA sequence encoding the PAL/TAL from the yeast Rhodotorula glutinis [18] was optimized according to the codon preference of E. coli, and synthesized in the cloning vector pUC57 to obtain the pUC57-OptRglPAL/TAL plasmid by Sangon Biotech (Shanghai) Co., Ltd (Shanghai, China). The RglPAL/TAL gene was amplified from the plasmid pUC57-OptRglPAL/TAL with primers BBa_B0034-RglPAL/TAL-PF and BBa_B0015-RglPAL/TAL-PR and assembled into p15A-trcO3415 linearized with NdeI restriction endonuclease (TaKaRa) via the ClonExpress One Step Cloning Kit (Vazyme Biotech) to create p15A-trcORglPAL/TAL. Then the p15A-trcORglPAL/TAL plasmid and the empty vector pET29a were transformed into E. coli Rosetta (DE3) cells to obtain the recombinant strain Ec/RglPAL(TAL). The primers are summarized in Appendix A Table A1 and the optimized DNA sequence of RglPAL/TAL is shown in Appendix A Figure A4.
A single colony of each recombinant E. coli strain was incubated in 3 mL of LB medium for 24 h. Then 1 mL of the cell culture was collected, the supernatant was discarded, and the cell pellet was transferred into 50 mL of fermentation medium and grown at 37 °C. The cells were induced with 0.1 mM IPTG at 30 °C when grown to an A600 of 0.6-0.8. The pH of the medium was maintained at about 7.0 by adding ammonium hydroxide as needed. Samples were taken at intervals of 6 h and analyzed by HPLC.

Intracellular NADPH Regulation
For down-regulation of the EcoSthA gene, the synthetic sRNA-based strategy [36] was applied. The plasmid backbone with the pSC101 origin of replication and the spectinomycin resistance gene was obtained from pCL1920 [44] as previously described [45]. The PR-MicC fragment, including the constitutive PR promoter and the MicC scaffold, was amplified from the wild-type E. coli MG1655 genome with primer pairs BBa_R0051-MicC-PF and BBa_B0015-MicC-PR. The BBa_B0015 terminator was amplified from the plasmid p15A-trcO3415 with primer pairs BBa_B0015-PF and BBa_B0015-PR. The PR-MicC-BBa_B0015 fragment was amplified with primer pairs pCL-BBa_R0051-PF and pCL-BBa_B0015-PR using both the PR-MicC fragment and the BBa_B0015 terminator as templates. Then, the plasmid backbone and the PR-MicC-BBa_B0015 fragment were assembled with the ClonExpress One Step Cloning Kit (Vazyme Biotech) to construct the new plasmid pSC101-sRNA for synthetic sRNA production. Subsequently, the first 24 bp of the EcoSthA gene (encoding the N-terminus) was obtained by annealing and elongation with primer pairs anti(sthA)-PF and anti(sthA)-PR, and assembled into pSC101-sRNA linearized with NsiI restriction endonuclease (TaKaRa) via the ClonExpress One Step Cloning Kit (Vazyme Biotech) to create pSC101-anti(sthA).
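In the synthetic sRNA strategy, the target-binding region is generally antisense to the start of the target coding sequence. A minimal sketch of deriving such a 24-nt antisense sequence is given below. The coding sequence used is an invented placeholder, not the real E. coli sthA sequence, and the function name is hypothetical.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_target(cds: str, length: int = 24) -> str:
    """Reverse complement of the first `length` bases of a coding sequence."""
    region = cds[:length].upper()
    if len(region) < length:
        raise ValueError("coding sequence is shorter than the target length")
    return region.translate(COMPLEMENT)[::-1]


# Placeholder coding sequence (assumption): NOT the real sthA gene.
placeholder_cds = "ATGAAACCCGGGTTTAAACCCGGGTTT"

if __name__ == "__main__":
    print(antisense_target(placeholder_cds))
```

With the real sthA coding sequence in place of the placeholder, the returned string would correspond to the binding region placed upstream of the MicC scaffold.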

Protein Detection
Cells induced for 12 h were harvested by centrifugation at 8000 rpm for 5 min at 4 °C, washed twice with TNG buffer (20 mM Tris-HCl pH 7.9, 0.5 M NaCl, 10% glycerol), and re-suspended in the same buffer containing 1 mM phenylmethylsulfonyl fluoride (PMSF). The suspended cells were sonicated in an ice bath and then centrifuged at 12,000 rpm for 30 min at 4 °C. The supernatant was collected, and the pellet was re-suspended in the PMSF-containing TNG buffer. Proteins were assessed by SDS-PAGE.

High Performance Liquid Chromatography (HPLC) Analysis
All samples taken from the cultures were centrifuged at 12,000 rpm for 2 min, and the supernatants were filtered through a 0.22 µm polytetrafluoroethylene (PTFE) filter and analyzed by HPLC (Shimadzu LC-20A, Kyoto, Japan) with a reverse-phase Shimadzu InertSustain C18 column (5 µm, 4.6 mm × 250 mm) and a Shimadzu SPD-M20A photodiode array detector. The mobile phase was a gradient of solvent A (H2O containing 1.3% acetic acid) and solvent B (100% acetonitrile) applied according to the following time program: 0-20 min, 10-100% B linear; 20-20.5 min, 100-10% B linear; 20.5-30 min, 10% B isocratic. The flow rate was set at 1.0 mL·min−1, and the injection volume was 10 µL. The column was maintained at 35 °C, and the eluted compounds were monitored at 309 nm for p-coumaric acid and at 274 nm for trans-cinnamic acid. The concentrations of trans-cinnamic acid and p-coumaric acid were quantified by fitting the peak area to a standard curve (R2 > 0.999) of the corresponding standard.
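The gradient program and the standard-curve quantification can be sketched as follows. The gradient breakpoints come from the text; the calibration points, the `dilution` parameter and the function names are hypothetical and serve only to illustrate the calculation.

```python
def percent_b(t_min: float) -> float:
    """Percentage of solvent B at time t (min) in the 30-min program."""
    if not 0 <= t_min <= 30:
        raise ValueError("time is outside the 0-30 min program")
    if t_min <= 20:                       # 0-20 min: 10 -> 100% B, linear
        return 10 + 90 * t_min / 20
    if t_min <= 20.5:                     # 20-20.5 min: 100 -> 10% B, linear
        return 100 - 90 * (t_min - 20) / 0.5
    return 10.0                           # 20.5-30 min: 10% B, isocratic


def fit_standard_curve(conc, area):
    """Least-squares line area = slope * conc + intercept, with R squared."""
    n = len(conc)
    mean_x, mean_y = sum(conc) / n, sum(area) / n
    sxx = sum((x - mean_x) ** 2 for x in conc)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(conc, area))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(conc, area))
    ss_tot = sum((y - mean_y) ** 2 for y in area)
    return slope, intercept, 1 - ss_res / ss_tot


def quantify(peak_area, slope, intercept, dilution=1.0):
    """Concentration from a peak area, corrected for sample dilution."""
    return (peak_area - intercept) / slope * dilution


# Hypothetical calibration standards (µM) and peak areas (arbitrary units).
standards_um = [10.0, 50.0, 100.0, 200.0]
standard_areas = [1000.0, 5000.0, 10000.0, 20000.0]
slope, intercept, r_squared = fit_standard_curve(standards_um, standard_areas)

if __name__ == "__main__":
    print(percent_b(10.0), r_squared, quantify(7500.0, slope, intercept))
```

A `dilution` of 2 would apply to bioconversion samples that were mixed with an equal volume of methanol before injection.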

Conclusions
In the present study, p-coumaric acid was produced de novo via the plant CYP450-involved biosynthetic route in Escherichia coli. First, a novel CYP73A gene, LauC4H, from the Amaryllidaceae plant Lycoris aurea was cloned based on transcriptome data, and its N-terminal 28 amino acids were shown to be responsible for localization at the endoplasmic reticulum membrane in protoplasts of Arabidopsis thaliana. Then, LauC4H was expressed functionally in E. coli when fused with the CYP450 redox partner from A. thaliana, and shown to catalyze the conversion of trans-cinnamic acid into p-coumaric acid. Further, p-coumaric acid was biosynthesized de novo by introducing a phenylalanine ammonia-lyase from A. thaliana into the recombinant E. coli cells. The production of p-coumaric acid was markedly improved by regulating the intracellular NADPH level of the constructed cell factory. The producer reported herein affords a promising chassis for the biosynthesis of p-coumaric acid-derived molecules with great application value.