Probing a Coral Genome for Components of the Photoprotective Scytonemin Biosynthetic Pathway and the 2-Aminoethylphosphonate Pathway

Genome sequences of the reef-building coral, Acropora digitifera, have been decoded. Acropora inhabits an environment with intense ultraviolet exposure and hosts the photosynthetic endosymbiont, Symbiodinium. Acropora homologs of all four genes necessary for biosynthesis of the photoprotective cyanobacterial compound, shinorine, are present. Among metazoans, these genes are found only in anthozoans. To gain further evolutionary insights into biosynthesis of photoprotective compounds and associated coral proteins, we surveyed the Acropora genome for 18 clustered genes involved in cyanobacterial synthesis of the anti-UV compound, scytonemin, even though it had not previously been detected in corals. We identified candidates for only 6 of the 18 genes, including tyrP, scyA, and scyB. Therefore, it does not appear that Acropora digitifera can synthesize scytonemin independently. On the other hand, molecular phylogenetic analysis showed that one tyrosinase gene is an ortholog of vertebrate tyrosinase genes and that the coral homologs of scyA and scyB are similar to the bacterial metabolic genes for phosphonopyruvate (ppyr) decarboxylase and glutamate dehydrogenase (GDH), respectively. Further genomic searches for biosynthetic components related to ppyr decarboxylase indicate that the coral possesses a metabolic pathway similar to the bacterial 2-aminoethylphosphonate (AEP) biosynthetic pathway. The results suggest that de novo synthesis of carbon-phosphorus compounds is performed in corals.

The indole-alkaloid scytonemin is a UV-blocking compound found exclusively in cyanobacteria, and it has been evaluated for biomedical applications [17]. Recently, Soule et al. [18,19] showed that scytonemin synthesis is controlled by an 18-gene cluster in the cyanobacterium Nostoc punctiforme (Figure 1). The Nostoc operon includes scyA, scyB, scyC, scyD, scyE, scyF, NpR1270 (glycosyltransferase), tyrA, dsbA, and aroB. Although scytonemins have not been found in corals, the presence of symbiotic cyanobacteria in coral species has been reported [20]. Furthermore, some cyanobacteria have been implicated in coral disease [21], and the roles of microbial communities associated with coral are being discussed [22]. Therefore, in this study, we investigated whether the coral genome contains genes encoding proteins that are homologous to cyanobacterial enzymes involved in scytonemin synthesis. In relation to the homologs of scyA, we also surveyed the Acropora genome for genes encoding enzymes of the 2-aminoethylphosphonate (AEP) pathway. AEP is a natural carbon-phosphorus compound, first reported by Horiguchi & Kandatsu [23]. This study will provide a basis for natural product surveys of anthozoans.
Figure 1. (a) [...] [6,16]. Gene homologs encoding enzymes indicated with asterisks were identified in the A. digitifera genome. (b) Schematic showing the organization of the scytonemin gene cluster. Genes indicated by red arrows encode enzymes involved in the biosynthesis of aromatic amino acids. The presence of corresponding genes in various organisms is indicated by "+", meaning that a TBLASTN search with N. punctiforme sequences as queries returned significant hits. Anthozoan genomes encode a gene homologous to aroB, involved in aromatic amino acid metabolism, which is not found in higher metazoans.
Screening of the A. digitifera genome via BLAST and domain structure comparisons led to the identification of candidates for six of the 18 genes involved in scytonemin synthesis: scyA, scyB, scyF, dsbA, aroB, and tyrP (Figure 1b). Analysis of aroB (DHQS) in a previous study identified an aroB homolog in the Acropora genome [15]. Molecular phylogenetic analyses group the aroB-like sequences of Acropora and Nematostella with those of several dinoflagellates, consistent with the possibility that the aroB-like genes of cnidarians originated by horizontal transfer from dinoflagellates [12]. Here we describe results of molecular phylogenetic analyses of scyA, scyB, dsbA, and tyrP. Detailed analyses of scyF homologs were not performed, for reasons explained in Section 2.3.

scyA (TPP-Dependent Enzyme)
scyA encodes a TPP (thiamine pyrophosphate)-dependent enzyme [25], a protein similar to human 2-hydroxyacyl-CoA lyase, which has close homologs in a variety of organisms, including Drosophila and Arabidopsis (Figure 1b; Table 1). It is also similar to acetolactate synthase, which is found in plants and micro-organisms. Acetolactate synthase is involved in synthesis of the essential amino acids valine, leucine, and isoleucine [26]. Biosynthesis of 2-aminoethylphosphonate (AEP) from phosphoenolpyruvate (PEP) requires just three enzymes: PEP mutase, phosphonopyruvate decarboxylase, and AEP transaminase, which together constitute the AEP biosynthetic pathway [27] (Figure 2; see Section 2.6). Phosphonopyruvate (ppyr) decarboxylase is also similar to both 2-hydroxyacyl-CoA lyase and acetolactate synthase.
Figure 2. [...] Table 1, is uncommon in metazoans. Homologs of the other two enzyme genes involved, indicated by asterisks, are also found in coral; see Table 2 for details.
Molecular phylogenetic analysis showed that two Acropora proteins containing a TPP enzyme domain fall into two separate clades: one is ppyr decarboxylase, orthologous to the Bacteroides fragilis enzyme, and the other is 2-hydroxyacyl-CoA lyase, orthologous to the human protein (Table 1; Figure S1). Both enzymes have Nematostella counterparts, and in each case the Acropora and Nematostella sequences were closely related (Figure S1). The 2-hydroxyacyl-CoA lyase group formed a clade that includes Homo, Drosophila, and Arabidopsis orthologs. In contrast, ppyr decarboxylase was not found in other metazoan genomes. The Acropora ppyr decarboxylase gene has six introns and was located at the 5′ terminus of scaffold 12471. Its neighbor was a gene for an ephrin-like protein, which belongs to the tyrosine kinase receptor subfamily. mRNA corresponding to ppyr decarboxylase, but not 2-hydroxyacyl-CoA lyase, was present in EST databases (Table 1). The gene for acetolactate synthase was not found. Neither of the two Acropora genes formed a clade with scyA of the cyanobacteria Nostoc and Nodularia.
On the other hand, gdh-1-1 and gdh-1-2 form another clade with the corresponding Nematostella genes (Figure S2). This group includes bacterial and Arabidopsis genes, but not those of metazoans (Figure S2). All trees (Bayesian inference, Neighbor joining, and Maximum likelihood) supported the clade (Figure S2). gdh-1-1 has no introns, while gdh-1-2 has one. The expression of gdh-1-1 was confirmed in the EST database. gdh-1-2 was located at the 5′ terminus of scaffold 16875, and the neighboring gene is similar to human caseinolytic peptidase B, a hexameric chaperone. This analysis indicates that corals have two GDH class 1 and two GDH class 2 enzymes. Because GDH class 1 has not been found in metazoans [29], corals may have unknown GDH metabolic pathways.

scyF (NHL Repeat Containing)
NHL is a conserved structural motif present in a large family of growth regulators. Many NHL-containing proteins also possess additional domains, e.g., RING fingers, B-box zinc fingers, and coiled-coil motifs. According to structural model analysis, NHL domain-containing proteins could be involved in protein-protein interactions and/or protein-nucleic acid interactions [30]. scyF encodes a protein that contains an NHL repeat, a motif named for and defined by amino acid sequence similarity to the Ncl-1, HT2A, and Lin-41 proteins [30].
Most animal and plant genomes contain scyF-like genes (Figure 1b). A Pfam domain search for the NHL domain revealed that the Acropora genome contains 107 genes encoding NHL-containing proteins. In addition, the three Acropora genes most similar to Nostoc scyF, aug-v2a.11071, aug-v2a.01011, and aug-v2a.06686, included other domains such as Filamin, SGL, and zf-B Box. Therefore, it was difficult to clarify the relationship among NHL-repeat-containing genes. Only three genes encode proteins with one NHL repeat each. Some of these may be members of novel metabolic pathways.

dsbA
DsbA (disulfide bond A) is a subfamily of the thioredoxin family [31,32]. Efficient, correct folding of bacterial disulfide-bonded proteins in vivo is dependent upon a class of periplasmic oxidoreductase proteins called DsbA. The bacterial protein-folding factor DsbA is the most oxidizing member of the thioredoxin family.
dsbA genes with similarity to Nostoc dsbA have been identified in each of the cnidarians (A. digitifera, Nematostella vectensis, and Hydra magnipapillata) and in Trichoplax (Phylum Placozoa), but are not found in Drosophila and Homo (Table 1). Metazoan dsbA genes have diverged greatly from bacterial dsbA genes; therefore, it was difficult to align the sequences. Such low similarity may be related to selenoproteins, for which the open reading frame is difficult to predict [33]. By domain search, we found three candidates, aug-v2a.12085, aug-v2a.05997, and aug-v2a.00764, in the Acropora genome. However, the gene models aug-v2a.05997 and aug-v2a.00764 were likely partial and were excluded from further analyses. These models may be artifacts of insufficient assembly or inaccurate gene prediction. The four cnidarian dsbA sequences formed discrete clades in molecular phylogenetic analyses (Figure S3), suggesting diversification of these genes in the cnidarian lineage. In addition, DSBA domain-containing gene-1 was positioned in a subgroup different from that of the cyanobacterial dsbA.

TyrP
TyrP (Tyrosinase-related Protein) has a well-established role in melanin biosynthesis in mammals and is involved in several biological functions [34]. We found six candidate tyrosinases, but four of them were partial sequences. Therefore, we used only the two complete candidates for molecular phylogenetic analysis. Interestingly, TyrP 1 forms a clade with its vertebrate equivalents, although we could not find any Nematostella or Hydra orthologs in this clade (Figure S4). On the other hand, TyrP 2 is a member of a group that includes the tyrosinase-related proteins of cnidarians (Figure S4). No Acropora tyrosinase gene forms a clade with cyanobacterial TyrP, but further studies will be needed to understand the relationships of the four partial genes.

Genes for AEP Pathway
Because it has been reported that ppyr decarboxylase catalyzes one of the three steps in the AEP biosynthetic pathway in protists and bacteria [35], we surveyed homologs of the enzyme genes for the other two steps. Interestingly, we found candidate genes for phosphoenolpyruvate mutase and aminoethylphosphonate transaminase (Table 2; Figures S5 and S6). Our gene survey suggests that Acropora digitifera has a complete AEP biosynthetic pathway from phosphoenolpyruvate (PEP) (Figures 2, S1, S5 and S6), which is the shortest known pathway for construction of a natural phosphonate [35]. Therefore, corals may be important producers of carbon-phosphorus compounds in marine ecosystems.
It is possible that reported draft genome sequences of metazoans include sequences from other organisms as a result of contamination. However, the A. digitifera genome was sequenced from purified sperm genomic DNA of a single individual and was reported not to contain contaminating sequences [15]. The following observations indicate that the genes annotated in this study are encoded by the A. digitifera genome: (1) orthologs of these genes, which formed a clade in molecular phylogenetic analysis, were found in Nematostella; (2) expression of most genes was confirmed by embryonic transcriptome analysis; and (3) some of the gene orders, including annotated genes, were conserved between A. digitifera and N. vectensis.
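The completeness argument for the AEP pathway can be expressed as a simple set check. The following Python snippet is an illustrative sketch only and was not part of the analysis; the function name missing_steps and the data layout are hypothetical, and the intermediate phosphonoacetaldehyde is included from general knowledge of the pathway rather than from the gene survey itself.

```python
# Illustrative sketch: the three AEP biosynthetic steps as ordered
# (substrate, enzyme, product) records. Labels are hypothetical names.
AEP_PATHWAY = [
    ("phosphoenolpyruvate", "PEP mutase", "phosphonopyruvate"),
    ("phosphonopyruvate", "ppyr decarboxylase", "phosphonoacetaldehyde"),
    ("phosphonoacetaldehyde", "AEP transaminase", "2-aminoethylphosphonate"),
]


def missing_steps(found_enzymes):
    """Return pathway enzymes (in order) that have no candidate homolog."""
    found = set(found_enzymes)
    return [enzyme for _, enzyme, _ in AEP_PATHWAY if enzyme not in found]


# Enzymes for which candidate genes were reported in the survey.
acropora_candidates = {"PEP mutase", "ppyr decarboxylase", "AEP transaminase"}
pathway_complete = not missing_steps(acropora_candidates)
```

A genome lacking one of the three candidates would yield a non-empty list from missing_steps, which is the situation reported for most other metazoans with respect to ppyr decarboxylase.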

Gene Search
We used two methods to search the A. digitifera database [36,37] for genes encoding components of the scytonemin biosynthetic pathway. First, BLAST searches with cyanobacterial protein sequences as queries (BLASTP) were used to probe A. digitifera gene models for putative orthologs. Genome sequences of Nematostella vectensis [11], Hydra magnipapillata [13], Drosophila melanogaster [38], Homo sapiens [39], and Arabidopsis thaliana [40] were also surveyed. In addition, several bacterial and eukaryotic genes with high similarity to A. digitifera models were retrieved from the NCBI genome database [41] for molecular phylogenetic analysis. The second method was the characterization of specific protein domains. To screen and identify protein domains in the gene models, we searched the Pfam database [42], which contains 11,912 conserved domains, using HMMER (hmmer3) [43]. In order to avoid eliminating cnidarian- or coral-specific domains, we first used an E-value cutoff of 10⁻³, as previously suggested [44], and subsequently an E-value cutoff of 1.
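The two-step E-value screening described above can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the script used in the study: the function name two_tier_screen, the tuple layout, and the example hits are hypothetical placeholders, and only the two cutoff values come from the text.

```python
# Illustrative sketch of a two-tier E-value screen over domain hits.
# Each hit is (gene_model, pfam_domain, e_value). The records below are
# made-up placeholders, not results from the study.
STRICT_CUTOFF = 1e-3   # first pass, as described in the text
RELAXED_CUTOFF = 1.0   # second pass, to retain divergent domains


def two_tier_screen(hits, strict=STRICT_CUTOFF, relaxed=RELAXED_CUTOFF):
    """Split hits into confident (<= strict) and marginal (relaxed-only)."""
    confident = [h for h in hits if h[2] <= strict]
    marginal = [h for h in hits if strict < h[2] <= relaxed]
    return confident, marginal


example_hits = [
    ("model_A", "NHL", 2e-12),
    ("model_B", "DSBA", 0.04),
    ("model_C", "Tyrosinase", 5.0),
]
confident, marginal = two_tier_screen(example_hits)
```

Keeping the marginal hits in a separate list makes it possible to inspect lineage-specific, weakly scoring domains by hand instead of silently mixing them with high-confidence assignments.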

Molecular Phylogenetic Analysis
Amino acid sequences found in gene searches were aligned using ClustalX [45] with default parameters. Gaps and ambiguous areas were excluded manually, using Se-Al v2.0 [46]. For Bayesian inference analysis, the alignment datasets were analyzed using PhyloBayes 3.3 [47] with the site-heterogeneous mixture CAT model and two independent Markov chains. Phylogenetic trees were also constructed by Neighbor-Joining (NJ). NJ bootstrap values (1000 trials) were calculated using ClustalX, and trees were drawn in SeaView [48] or Njplot [49]. Maximum likelihood analyses employed TREEFINDER version October 2008 [50] and Aminosan [51]; the maximum likelihood bootstrap values were calculated using 100 trials.
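The bootstrap values mentioned above rest on resampling alignment columns with replacement. The programs cited perform this internally; the Python sketch below only illustrates the general idea on a toy alignment and does not reproduce their implementations. The function name bootstrap_alignment and the sequences are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one bootstrap replicate by resampling columns with replacement.

    `alignment` maps sequence name -> aligned string; all strings must
    have the same length. The same sampled columns are used for every
    sequence, so each replicate column is an intact original column.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to equal length")
    n_cols = lengths.pop()
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[c] for c in cols)
            for name, seq in alignment.items()}


# Toy alignment; these sequences are placeholders, not data from the study.
toy = {"seqA": "MKVLAT", "seqB": "MKILST", "seqC": "MRVLAT"}
rng = random.Random(0)  # fixed seed so the replicates are reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(100)]
```

A tree would then be inferred from each replicate, and the bootstrap value of a clade is the fraction of replicate trees in which that clade appears.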

Conclusion
We have previously identified environmental response genes in corals. These included genes unique to metazoans, such as fluorescent proteins [52] and enzymes involved in shinorine synthesis [15]. The present gene survey does not support the hypothesis that A. digitifera can synthesize scytonemin independently. Although the A. digitifera genome contains homologs of several genes that function in scytonemin synthesis in Nostoc, these genes may have acquired new functions in Acropora that remain to be elucidated. The homologs of scyA (ppyr decarboxylase) and scyB (gdh-1-1 and gdh-1-2) are similar to genes involved in general bacterial metabolic pathways. Our genome-wide surveys for genes of enzymes involved in synthesis of photoprotective compounds indicate that corals retain genes for some enzymes not found in Homo and Drosophila. Therefore, it is likely that not only marine bacteria but also marine invertebrates produce many unknown natural compounds, as suggested by the presence of the AEP pathway. Further genomic surveys are likely to provide more clues regarding natural product synthesis.