Genome Mining Reveals Two Missing CrtP and AldH Enzymes in the C30 Carotenoid Biosynthesis Pathway in Planococcus faecalis AJ003T

Planococcus faecalis AJ003T produces glycosyl-4,4′-diaponeurosporen-4′-ol-4-oic acid as its main carotenoid. Five carotenoid pathway genes were presumed to be present in the genome of P. faecalis AJ003T; however, 4,4-diaponeurosporene oxidase (CrtP) was non-functional, and a gene encoding aldehyde dehydrogenase (AldH) was not identified. In the present study, a genome mining approach identified the two missing enzymes, CrtP2 and AldH2454, in the glycosyl-4,4′-diaponeurosporen-4′-ol-4-oic acid biosynthetic pathway. Moreover, the CrtP2 and AldH enzymes were functional in heterologous Escherichia coli and generated two carotenoid aldehydes (4,4′-diapolycopene-dial and 4,4′-diaponeurosporene-4-al) and two carotenoid carboxylic acids (4,4′-diaponeurosporenoic acid and 4,4′-diapolycopenoic acid). Furthermore, the genes encoding CrtP2 and AldH2454 were located at a distance from the carotenoid gene cluster of P. faecalis.


Introduction
Carotenoids are the most common isoprenoid pigments, with colors ranging from yellow through bright orange to red [1,2]. They act as precursors for several hormones and perform diverse biological functions in nature, such as light-harvesting, photoprotection, and coloration [3]. Biotechnologically, carotenoids have been utilized mainly as food colorants, antioxidants, and animal feed supplements [4], and their applications have been extended to nutraceuticals, cosmetics, and pharmaceuticals. Compared with C40 carotenoids, which have a backbone of 40 carbons and include lycopene and β-carotene, C30 carotenoids are rare in nature, and their biological activities, biosynthetic pathways, and regulation of gene expression remain unclear [5]. A recent study on the biological functions of C30 carotenoids, including stem cell proliferation and antioxidative activities, has drawn attention to C30 carotenoids [6].
Herein, we report that a genome mining approach successfully identified two missing genes, crtP2 and aldH2454, encoding CrtP and AldH, respectively, in the glycosyl-4,4′-diaponeurosporen-4′-ol-4-oic acid biosynthetic pathway of P. faecalis. Complementation-based functional studies of the CrtP2 and AldH enzymes were performed in heterologous E. coli.

Identification of crtP Encoding 4,4-diaponeurosporene Oxidase
Our previous study [7] revealed that, although a crtP gene encoding 4,4-diaponeurosporene oxidase was present in the glycosyl-4,4′-diaponeurosporen-4′-ol-4-oic acid biosynthetic gene cluster (Figure 1A), the CrtP1 enzyme, unlike the other three enzymes (CrtM, CrtN1, and CrtN2), was not active in heterologous E. coli expressing both crtM and crtN of P. faecalis, as illustrated in Figure 2A. Therefore, we assumed the presence of other active crtP genes located remotely from the carotenoid gene cluster. To identify the active crtP genes, the genome of P. faecalis was searched computationally for putative CrtP-like enzymes on the basis of amino acid similarity scores with the CrtP enzyme of S. aureus [8]. One putative crtP-like gene was identified and named crtP2 to distinguish it from the inactive crtP gene (renamed crtP1) present in the carotenoid gene cluster.

To verify the functionality of crtP2, the gene was cloned into the high copy number pUCM expression vector, generating pUCM_crtP2 PF, and expressed in E. coli [pACM_crtM SA-crtN SA], which produces 4,4′-diaponeurosporene (1) and 4,4′-diapolycopene (6). As illustrated in Figure 2B, two main peaks were detected in the high performance liquid chromatography (HPLC) chromatogram of the cell extract, with retention times similar to those of the two peaks from the control E. coli strain [pACM_crtM SA-crtN SA-crtP SA] (Figure 2C). Further analysis using UV/VIS spectroscopy and LC/MS (Figure 2E […] Figure 1B), but peak 2 is not. Several studies have reported that recombinantly expressed carotenoid pathway enzymes exhibit broad substrate specificities, in contrast to those endogenous to the hosts [7-13], which might be the case for the CrtP2 enzyme of P. faecalis in heterologous E. coli. Therefore, detection of 4,4′-diapolycopene-dial (8) and 4,4′-diaponeurosporene-4-al (2) in E. coli strongly indicates that crtP2 is the first missing gene, involved in the oxidation of 4,4′-diaponeurosporene (1) into 4,4′-diaponeurosporene-4-al (2) (Figure 1B) during the biosynthesis of glycosyl-4,4′-diaponeurosporen-4′-ol-4-oic acid (5) in P. faecalis.
Notably, formation of the dialdehyde carotenoid 4,4′-diapolycopene-dial (8) (Figure 2F) indicates that the CrtP2 enzyme could introduce a second aldehyde group into a carotenoid monoaldehyde. Moreover, crtP2 was expressed together with crtM PF-crtN PF of P. faecalis from a low copy number plasmid (pACM_crtM PF-crtN PF-crtP2 PF) in E. coli to investigate the effect of the expression level of the three enzymes on the resulting carotenoid profile. Unlike in the two-plasmid system, which combines a high and a low copy number plasmid, 4,4′-diapolycopene-dial (8) was the dominant carotenoid, with small amounts of 4,4′-diaponeurosporene-4-al (2), 4,4′-diaponeurosporene (1), and 4,4′-diapolycopene (6) (Figure 2D), suggesting that balanced expression of pathway enzymes could influence the carotenoid profile. Finally, although the CrtP enzyme has dual functions of oxygenase and AldH [14], carotenoid carboxylic acid intermediates such as 4,4′-diaponeurosporenoic acid (3) or 4,4′-diapolycopenoic acid (10) were not detected in either plasmid system. This suggests that CrtP2 may exhibit only oxidase activity, adding aldehyde groups to 4,4′-diaponeurosporene (1) and 4,4′-diapolycopene (6), similar to CrtP from S. aureus.

Identification of aldH Encoding Aldehyde Dehydrogenase
As CrtP2 exhibited only oxygenase activity, adding aldehyde groups, other gene(s) encoding AldH enzymes that catalyze the oxidation of the aldehyde group of 4,4′-diaponeurosporene-4-al (2) should be present in the genome of P. faecalis. It has been reported that aldH genes encoding carotenoid aldehyde dehydrogenase are located remotely from the corresponding carotenoid pathway gene clusters in S. aureus and Methylomonas sp. [8,15]. Therefore, using an approach similar to the computational identification of CrtP2, the genome of P. faecalis was searched for putative AldH-like enzymes with the amino acid sequence of the AldH enzyme of S. aureus. Genome mining, based on high amino acid sequence similarity scores, identified four putative aldH genes, namely aldH420, aldH905, aldH1759, and aldH2454, in the genome of P. faecalis.
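The similarity-based selection of candidates described above can be illustrated with a minimal sketch. It assumes that the protein search was exported in BLAST+ tabular format (-outfmt 6, default columns); the E-value cutoff, the number of candidates kept, and all locus names and scores in the example are placeholders rather than values from this study.

```python
import csv
import io

# Column order of BLAST+ tabular output (-outfmt 6) with default fields.
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def rank_hits(tabular_text, max_evalue=1e-5, top_n=4):
    """Return up to top_n hits that pass the E-value cutoff, best bit score first.

    max_evalue and top_n are illustrative thresholds (assumptions),
    not parameters reported in the study.
    """
    hits = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        # Skip blank lines, comment lines, and truncated rows.
        if len(row) < len(FIELDS) or row[0].startswith("#"):
            continue
        rec = dict(zip(FIELDS, row))
        rec["pident"] = float(rec["pident"])
        rec["evalue"] = float(rec["evalue"])
        rec["bitscore"] = float(rec["bitscore"])
        if rec["evalue"] <= max_evalue:
            hits.append(rec)
    hits.sort(key=lambda r: r["bitscore"], reverse=True)
    return hits[:top_n]


# Made-up rows for demonstration only; names and numbers are placeholders.
example = "\n".join([
    "query\tlocus_A\t45.2\t480\t250\t6\t1\t475\t3\t480\t1e-120\t360",
    "query\tlocus_B\t48.9\t482\t240\t4\t1\t478\t2\t481\t2e-140\t410",
    "query\tlocus_C\t24.0\t110\t80\t3\t20\t128\t15\t121\t0.5\t95.5",
    "query\tlocus_D\t41.0\t470\t270\t5\t3\t470\t5\t472\t3e-98\t300",
    "query\tlocus_E\t39.5\t465\t275\t6\t4\t466\t6\t468\t1e-90\t280",
    "query\tlocus_F\t33.1\t300\t195\t7\t10\t305\t12\t309\t1e-40\t150",
])

for hit in rank_hits(example):
    print(hit["sseqid"], hit["pident"], hit["evalue"], hit["bitscore"])
```

Ranking by bit score after an E-value filter is one common convention; candidates shortlisted in this way still require functional verification, as performed here by complementation in E. coli.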

Bacterial Strains, Culture Conditions, and Plasmids
The bacterial strains and plasmids used in the present study are listed in Table 1. E. coli strain Top10 was used for gene cloning, and XL1-Blue was used for the expression of C30 carotenoid biosynthetic pathway genes. E. coli strains were cultured aerobically in Luria-Bertani (LB) medium at 30 °C on a rotary shaker at 250 rpm. The antibiotics ampicillin (100 µg/mL), chloramphenicol (50 µg/mL), and kanamycin (30 µg/mL) were added as required. For carotenoid production, a preculture was grown overnight in a 4 mL tube of Terrific Broth (TB) medium supplemented with 100 µg/mL ampicillin and/or 50 µg/mL chloramphenicol at 30 °C with shaking at 250 rpm. Thereafter, the preculture was inoculated into a 300 mL baffled flask containing TB medium supplemented with the required antibiotics and cultured at 30 °C with shaking at 250 rpm for 36 h.

Genome Mining
A standalone Basic Local Alignment Search Tool program package (BLAST+) v2.2.31 (http://www.ncbi.nlm.nih.gov/) was installed locally and used to identify the missing pathway enzymes of P. faecalis. A local protein BLAST database of the P. faecalis genome (GenBank accession number CP019401) was generated by running the makeblastdb program [17]. Putative CrtP-like enzymes (4,4-diaponeurosporene oxidases) were searched for by running the blastp program with default parameters against the local protein database, using the amino acid sequence of CrtP (GenBank accession number ALY16520.1) from Staphylococcus aureus as the query. Similarly, putative AldH-like enzymes were searched for against the local protein database using the amino acid sequence of the AldH enzyme (GenBank accession number BAF68130.1) from S. aureus as the query.
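For illustration, the two BLAST+ steps described above can be sketched as command lines assembled in Python. This is a minimal sketch rather than the exact invocation used in the study: all file and database names are hypothetical, and the tabular output option (-outfmt 6) is an added assumption, as the search itself was run with default parameters.

```python
import shlex
import subprocess


def makeblastdb_cmd(protein_fasta, db_name):
    """Command that builds a local protein BLAST database from a FASTA file."""
    return ["makeblastdb", "-in", protein_fasta, "-dbtype", "prot", "-out", db_name]


def blastp_cmd(query_fasta, db_name, out_path):
    """Command that searches a protein query against the local database.

    -outfmt 6 (tabular output) is an assumption added for easier parsing;
    the remaining search parameters are left at their defaults.
    """
    return ["blastp", "-query", query_fasta, "-db", db_name,
            "-out", out_path, "-outfmt", "6"]


def run_search(protein_fasta, query_fasta, db_name, out_path):
    """Run both steps; requires BLAST+ to be installed and on the PATH."""
    subprocess.run(makeblastdb_cmd(protein_fasta, db_name), check=True)
    subprocess.run(blastp_cmd(query_fasta, db_name, out_path), check=True)


# Hypothetical file names: proteins predicted from CP019401 and the
# S. aureus CrtP query sequence (ALY16520.1) saved as FASTA files.
print(shlex.join(makeblastdb_cmd("pfaecalis_proteins.faa", "pfaecalis_db")))
print(shlex.join(blastp_cmd("crtP_saureus.faa", "pfaecalis_db", "crtP_hits.tsv")))
```

Calling run_search with the same arguments would execute both commands; the AldH search differs only in the query file.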

Cloning and Construction of Expression Modules of Carotenoid Pathway Genes
Genomic DNA of P. faecalis was extracted using the Genomic DNA extraction kit (Macrogen, Seoul, South Korea). A crtP2 gene encoding 4,4-diaponeurosporene oxidase (CrtP2) and four aldH-like genes encoding aldehyde dehydrogenase (AldH) were amplified from the genomic DNA using specific PCR primers (Table 2). Each PCR product was cloned into the corresponding sites of the constitutive expression vector pUCM [9], resulting in pUCM_X Y (where X is the cloned gene name and the subscript Y is the bacterial source name) (Table 1). To construct the 4,4′-diaponeurosporen-4′-al biosynthetic pathway, the crtP2 gene on pUCM_crtP2 PF was subcloned together with the promoter and terminator sequences into pACM_crtM PF-crtN PF [7], generating pACM_crtM PF-crtN PF-crtP2 PF.

Isolation of Carotenoids
For carotenoid isolation, the cells and media were separated via centrifugation (4 °C, 4000 rpm). The pelleted cells were repeatedly extracted with 30 mL of acetone until all visible pigments were removed. Colored supernatants were pooled after centrifugation (4 °C, 3000 rpm) and concentrated to a small volume (1-2 mL) using a Genevac™ EZ2-Plus centrifugal evaporator (Genevac, New York, NY, USA). Thereafter, 5 mL of ethyl acetate (EtOAc) was added to the concentrated solution, and the mixture was re-extracted after the addition of 5 mL of NaCl (5 N) solution. Next, the upper organic phase containing carotenoids was collected, washed with distilled water, dehydrated by the addition of 0.5 g sodium sulfate, and completely dried using the EZ2-Plus evaporator. Dried samples were stored at −80 °C until further analysis.

Analysis of Carotenoids
A 10 µL aliquot of the carotenoid extract was applied to a C18 reverse phase column and eluted under isocratic conditions with a solvent system (acetonitrile:methanol:2-propanol, 80:15:5) at a flow rate of 1 mL/min using an Agilent 1200 HPLC system (Agilent Technologies, Santa Clara, CA, USA) equipped with a photodiode array detector, as described previously [5]. The mass fragmentation spectra of carotenoids were monitored in both positive and negative ion modes in the mass range of 200-900 m/z on a liquid chromatography/mass spectrometry system (LC/MS; Agilent 6150, Agilent Technologies) equipped with an atmospheric pressure chemical ionization ion source, as described previously [8]. For structural elucidation, carotenoids were identified using a combination of HPLC retention times, UV/VIS absorption spectra, and mass fragmentation spectra.
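As a worked illustration of the mass-based part of this identification, the sketch below computes the expected m/z of protonated and deprotonated ions from a molecular formula and checks whether it falls inside the 200-900 m/z scan range. The formula C30H42 for 4,4′-diaponeurosporene (1) is an assumption that is not stated in this text, and the choice of [M+H]+ and [M−H]− as the ions to check is likewise illustrative; other ion species may be observed under atmospheric pressure chemical ionization.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes, and the proton mass.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503207, "O": 15.99491461956}
PROTON = 1.00727646688


def monoisotopic_mass(formula):
    """Monoisotopic mass of a simple formula such as 'C30H42' (C, H, O only)."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass


def ion_mz(formula, mode):
    """m/z of the singly charged [M+H]+ (positive) or [M-H]- (negative) ion."""
    mass = monoisotopic_mass(formula)
    if mode == "positive":
        return mass + PROTON
    if mode == "negative":
        return mass - PROTON
    raise ValueError("mode must be 'positive' or 'negative'")


def in_scan_window(mz, low=200.0, high=900.0):
    """True if the ion falls within the acquisition range used for LC/MS."""
    return low <= mz <= high


# Assumed formula for 4,4'-diaponeurosporene (1); not taken from the text.
formula = "C30H42"
for mode in ("positive", "negative"):
    mz = ion_mz(formula, mode)
    print(mode, round(mz, 4), in_scan_window(mz))
```

A computed m/z of this kind narrows the candidates but does not identify a carotenoid on its own, which is why retention times and UV/VIS absorption spectra were used together with the mass spectra.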