The Draft Genome Sequence of Actinokineospora bangkokensis 44EHWT Reveals the Biosynthetic Pathway of the Antifungal Thailandin Compounds with Unusual Butylmalonyl-CoA Extender Units

We report the draft genome sequence of Actinokineospora bangkokensis 44EHWT, the producer of the antifungal polyene compounds thailandins A and B. The sequence comprises 7.45 Mb with a GC content of 74.1% and contains 35 putative gene clusters for the biosynthesis of secondary metabolites. Three of these gene clusters encode large type I polyketide synthases. Annotation of the ORF functions and targeted gene disruption enabled us to identify the cluster for thailandin biosynthesis. We propose a plausible biosynthetic pathway for thailandin in which the unusual extender unit butylmalonyl-CoA is incorporated, resulting in an atypical side chain.


Introduction
Next-generation sequencing and genome mining are powerful and rapid technologies for identifying the genetic potential of a strain to synthesize secondary metabolites with various biological activities. One order known to produce many secondary metabolites with different bioactivities is the Actinomycetales. Under laboratory conditions a strain produces only a few compounds, although its genome often comprises more than 20 biosynthetic gene clusters. Cryptic clusters have been activated by heterologous expression [1], by changing growth conditions [2] or by manipulating regulatory genes [3,4]. Knowledge of the genome sequence and of the composition of a biosynthetic gene cluster gives insight into the biosynthetic pathway of a secondary metabolite. It is therefore a valuable tool for metabolic engineering, either to increase the production of a specific compound or to generate novel metabolites by combinatorial biosynthesis.
The genus Actinokineospora is a member of the order Actinomycetales and was introduced in 1988 as a separate genus [5]. The characteristics of this genus include meso-diaminopimelic acid as a component of the cell wall and the occurrence of menaquinone MK-9 (H4), type II phospholipids and iso-C 16:0 fatty acids in the cell membrane. Until now only 16 strains of this genus have been identified. Thus, Actinokineospora belongs to the rare actinomycetes. Draft genome sequences are only available for A. spheciospongiae (GCA_000564855.1) [6], A. enzanensis (GCA_000374445.1) and A. inagensis (GCA_000482865.1).
The strain Actinokineospora bangkokensis 44EHW T was isolated from the rhizosphere soil of an elephant ear plant (Colocasia esculenta) in Bangkok (Thailand) [7]. It produces thailandins A and B, antifungal polyenes with a 28-membered macrocyclic lactone ring carrying two methyl groups, seven free hydroxyl groups and five conjugated double bonds. In addition, thailandin A is O-rhamnosylated at position C15, whereas thailandin B has only a hydroxyl group at this position. Both compounds show significant inhibition of anthracnose fungi and pathogenic yeast strains [8].
In this study, we performed whole genome sequencing of A. bangkokensis 44EHW T and successfully identified and verified the thailandin biosynthetic gene cluster. Herein, we report the putative biosynthetic pathway where the unusual butylmalonyl-CoA extender unit is incorporated into the polyketide chain.

Draft Genome Sequence of Actinokineospora bangkokensis 44EHW T
A draft genome of A. bangkokensis 44EHW T was sequenced and assembled into 32 scaffolds and 79 contigs. The largest scaffold has 931,456 nucleotides. The genome sequence consists of 7,453,713 nucleotides with an overall GC content of 74.1%, ranging from 44.2% to 87.4% (calculated for 500 bp fragments). The sequence contains 6287 coding sequences (CDS), 50 tRNAs, and four clustered regularly interspaced short palindromic repeats (CRISPRs), as predicted by the NCBI prokaryotic genome annotation pipeline [9]. The genome coding density is 87.9% and the average gene length is 1030 bp. This Whole Genome Shotgun project has been deposited at DDBJ/ENA/GenBank under the accession MKQR00000000. The version described in this paper is version MKQR01000000. An antiSMASH 3.0 [10] analysis revealed the presence of 35 gene clusters encoding secondary metabolite biosynthetic pathways (Table 1, Figure 1 and Appendix A, Table A1).

Identification and Verification of the Thailandin Biosynthetic Gene Cluster
Thailandins A and B were assumed to be synthesized by a type I polyketide synthase (PKS I). On the basis of the carbon chain length of the polyketide backbone, the PKS I is expected to have 14 modules. In the draft genome sequence of A. bangkokensis, three large PKS I clusters could be identified. Among these, cluster #11 encodes four PKSs comprising 14 modules in total. Cluster #16 has 20 modules and cluster #19 has at least 24 modules, although the latter cluster appears to be interrupted by the end of the scaffold. All three clusters have a higher GC content than the genome average: 75.4% for cluster #11, 75.2% for cluster #16 and 76.9% for cluster #19 (Figure 1, circles B/C).
Figure 1. Circle B: GC% content of 500 bp range, in 250 bp steps, between 50%-100%, line indicates average GC content of 74%. Circle C: Localization of putative secondary metabolite gene clusters, illustrated in PKS I (green), other PKS (purple), NRPS/PKS I (blue), other NRPS (pink), terpene (orange), siderophore (brown), lantipeptide (yellow) and other kinds of cluster (grey); * Thailandin biosynthetic gene cluster is highlighted. Circle D: Localization of ORFs of general metabolism, subdivided into metabolism of amino acids (green), aromatic compounds (purple), fatty acids (blue), carbohydrates (pink), secondary metabolites (orange), and cofactors, vitamins and pigments (brown). Circle E: Localization of ORFs with putative modifying functions as carboxylases (green), dehydrogenases (purple), esterases (blue), hydratases (pink), hydrolases (orange), involved in redox reactions (brown), reductases (yellow) and transferases (grey). Circle F: Localization of ORFs putatively involved in ion metabolism, subdivided into metabolism of iron (green), phosphate (purple), sulfur (blue), nitrogen (pink) and other ions (orange). Circle G: Localization of ORFs putatively involved in replication/transcription/translation, subdivided into ORFs from nucleotide metabolism (green), protein turnover and chaperones (purple), replication and repair (blue), transcription (pink), translation (orange) and tRNA metabolism (brown). Circle H: Localization of ORFs putatively encoding membrane proteins (green), transporters (purple), proteins involved in cell separation (blue) and proteins from cell wall or membrane biosynthesis (pink). Circle I: Localization of ORFs putatively involved in defense and (stress-)response (green), from (pro-)phages (purple), for sporulation (blue) and communication (pink). Circle J: Localization of ORFs with unknown functions (grey).
To verify that cluster #11 is responsible for thailandin biosynthesis, we interrupted the first PKS gene by a single crossover using a 3 kb internal fragment. Conjugation, electroporation and protoplast transformation were attempted for strain manipulation. However, only conjugation on MS plates supplemented with 10 mM CaCl2 generated transconjugants, which were hygromycin resistant.
The recombinant strain A. bangkokensis PKS11::pKCLP2 was tested for the production of thailandins in comparison to the wildtype strain. HPLC/MS analysis of organic extracts demonstrated that thailandins were not produced in the mutant strain (Figure 2). The result confirms that cluster #11 is the thailandin biosynthetic gene cluster.
The thailandin biosynthetic gene cluster was further analyzed in detail (Figure 3, Table 2). It spans 96.1 kb with 25 open reading frames (ORFs), but the main cluster probably contains only 13 genes (thaRI-thaOII, thaT). The polyketide synthase is encoded by the four genes thaBI, thaBII, thaBIII and thaBIV. In addition, the cluster encodes a crotonyl-CoA carboxylase/reductase (thaC), two monooxygenases (thaOI, thaOII), seven regulatory genes (thaRI-thaRIV; orf8/11/12), one MFS transporter (thaT), and further genes with various functions. For five ORFs, no function could be inferred by BLAST analysis.
All acyltransferase (AT) domains have the typical GHSxG motif [11]. Except for the ATs of modules 6 and 13, they all show specificity for the extender unit malonyl-CoA (x = LVIFAM). The AT domain of the loading module is also malonyl-CoA-specific. There, malonyl-CoA is probably decarboxylated by the KS of the loading module to provide an acetyl starter unit for transfer onto the first extension module. As in other PKS systems with a KS domain in the loading module, the active-site cysteine common to "condensing" KS domains is replaced by a glutamine [12].
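The motif rule described in the text can be illustrated with a minimal sketch. It assumes only what the text states: the residue x in GHSxG is one of L, V, I, F, A or M for malonyl-CoA-specific ATs, and x = Q is read as methylmalonyl-CoA specificity (as stated for module 6 in the following section). The function name and the two toy sequence fragments are hypothetical and are not taken from the thailandin PKS; real predictions rest on more than a single residue.

```python
import re

# Residues at position x of the GHSxG motif that the text associates
# with malonyl-CoA specificity (x = LVIFAM).
MALONYL_RESIDUES = set("LVIFAM")


def classify_at_domain(protein_seq: str) -> str:
    """Return the extender unit suggested by the first GHSxG motif.

    Hypothetical helper: it applies only the single-residue rule quoted
    in the text and ignores all other specificity-conferring motifs.
    """
    match = re.search(r"GHS(.)G", protein_seq.upper())
    if match is None:
        return "no GHSxG motif found"
    x = match.group(1)
    if x in MALONYL_RESIDUES:
        return "malonyl-CoA"
    if x == "Q":
        return "methylmalonyl-CoA"
    return f"unassigned (x = {x})"


# Toy fragments invented for illustration only.
print(classify_at_domain("AAGHSLGEYA"))  # x = L
print(classify_at_domain("TTGHSQGEIA"))  # x = Q
```

A rule of this kind is only a first filter; the alignment discussed later in the paper suggests that a second motif also contributes to specificity.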

Proposed Thailandin Biosynthetic Pathway
The sequence of the AT domain of module 6 shows specificity for methylmalonyl-CoA (x = Q), which is consistent with the thailandin structure. For the last module, module 13, incorporation of the extender unit ethylmalonyl-CoA is predicted. Based on the polyene structure, however, we propose butylmalonyl-CoA as the unusual extender unit. The gene thaC is located downstream of thaBIV. The encoded protein is assumed to have crotonyl-CoA carboxylase/reductase activity. Crotonyl-CoA carboxylases/reductases have been shown to be essential for the biosynthesis of various substituted malonyl-CoA extender units. They catalyze the NADPH-dependent carboxylation of α,β-unsaturated acyl-thioesters [13][14][15][16][17]. In the thailandin gene cluster, thaC encodes such an enzyme, which is likely involved in the biosynthesis of butylmalonyl-CoA. Module 7 contains a KS and a DH domain. Because of the hydroxyl group at C15 in the thailandin molecule, the DH domain is apparently inactive. This hydroxyl group is later used for the attachment of the rhamnose moiety. The DH7 domain contains the conserved HxxxGxxxxP motif found in active DH domains, but possesses alterations of the GYxYGPxF, LPFxW, and Dxxx(Q/H) motifs [18].
After biosynthesis, the mature polyketide chain is released from the PKS and cyclized by a thioesterase domain located at the C-terminal end of ThaBIV. The other cyclization takes place between C9 and C13. In other polyene biosynthetic pathways, this cyclization involves a keto and a hydroxyl group that form a hemiketal ring [19,20]. We therefore assume that the KR domain of module 8 must be inactive, although it contains all conserved amino acids of type A ketoreductases [21][22][23]. Subsequently, the hydroxyl group could be transferred from C13 to C14 via an epoxide intermediate, a step that could be catalyzed by one of the two cytochrome P450 monooxygenases ThaOI or ThaOII. Alternatively, the tetrahydropyran ring could be generated by oxa-Michael addition on an α,β-unsaturated thioester intermediate. This route, however, requires special dehydratases and pyran-forming cyclases [24][25][26], which are not present within the cluster.
The other monooxygenase is likely responsible for hydroxylation of C26. For their activity, the monooxygenases require electrons from NADH, often mediated by ferredoxin. In the thailandin cluster, the gene thaF, which encodes a ferredoxin, is located downstream of thaOI. Finally, thailandin B is rhamnosylated to yield thailandin A. No gene within the cluster encodes an enzyme with glycosyltransferase activity. A putative major facilitator superfamily (MFS) transporter encoded near the structural genes, ThaT, is suggested to be responsible for the transport of thailandin out of the producing organism. The other ORFs near the biosynthetic genes are unlikely to have an important role in thailandin biosynthesis.

Discussion
Actinokineospora bangkokensis 44EHW T produces the polyene compounds thailandin A and thailandin B [8]. Thailandin B is probably the precursor of the A-form, because thailandin A is further rhamnosylated. Both compounds show activity against pathogenic fungal strains with minimum inhibitory concentrations ranging between 16-32 µg/mL [8].
Polyene compounds are efficient antibiotics because they directly target the fungal plasma membrane by interacting with its main sterol, ergosterol, which often results in membrane permeabilization [27]. In addition, further activities of various polyene compounds have been demonstrated. The clinical application of these antifungal compounds is complicated by their low water solubility and dose-dependent side effects, notably nephrotoxicity [28]. Many studies have aimed to modify existing molecules in order to improve them, and these studies identified important structural elements. Accordingly, the polyol [29] and polyene regions [30], the sugar modification, mainly by the aminoglycoside D-mycosamine [31,32], the exocyclic carboxyl group [30,33], and an additional aromatic heptaene side chain, which leads to haemolytic activity [34], seem to be particularly important for selective toxicity and activity. Modification or loss of the D-mycosamine sugar moiety led to significantly reduced antifungal activity [31,32]. Thailandin A carries a rhamnose instead of a D-mycosamine modification and has a higher MIC than amphotericin B. Surprisingly, although thailandin B is not modified by a sugar moiety, it has even better antifungal activity than thailandin A [8]. In addition, neither compound has the exocyclic carboxyl group. In contrast to other polyenes, thailandins have an additional short side chain at C2, similar to chainin [35], filipin [36], fungichromin [37] and antifungalmycin [38] (Figure 5).

The genus Actinokineospora belongs to the group of rare Actinomycetales with great potential to produce novel secondary metabolites. Only 16 strains of this genus are known to date. Many studies have been carried out to categorize these strains, yet studies of their secondary metabolite production are limited (Table A2). Recently, the new antitrypanosomal and antioxidant compounds actinosporins were isolated from A. spheciospongiae EG49 T [39,40]. Co-cultivation of this strain with Nocardiopsis sp. RV163 induced the biosynthesis of further secondary metabolites [41]. Furthermore, only three other Actinokineospora genomes have been sequenced: A. enzanensis DSM 44649 T (ID 1120934) (8,119,858 bp, GC 70.8%, 37 predicted gene clusters), A. inagensis DSM 44258 T (ID 1120935) (7,278,759 bp, GC 70.2%, 34 predicted gene clusters) and A. spheciospongiae EG49 T (ID 909613) (7,529,476 bp, GC 72.8%, 36 predicted gene clusters). In this study, we sequenced the genome of A. bangkokensis 44EHW T. The draft genome has 7.5 Mb with 74.1% GC content, which is noticeably higher than that of the other Actinokineospora genomes. It is also remarkable that many regions within the genome have a GC content <50%, whereas the three large PKS I gene clusters have an overall higher GC content of 75.4% (cluster #11, thailandin cluster), 75.2% (cluster #16) and 76.9% (cluster #19). This indicates a high frequency of horizontal DNA transfer during evolution.
The antiSMASH analysis of the genome revealed 35 putative secondary metabolite gene clusters. The detailed evaluation of the PKS-encoding clusters led to the assumption that cluster #11 is responsible for thailandin biosynthesis. This hypothesis was confirmed by targeted inactivation of the first PKS gene, thaBI. Several protocols for genetic manipulation were tried without success. Finally, supplementation with 10 mM CaCl2 yielded the mutant strain via conjugation. The addition of CaCl2 has also been reported to increase the conjugation frequency of several Streptomyces strains [42]. Notably, A. bangkokensis is resistant to the commonly used antibiotics apramycin and spectinomycin.
In addition to the PKS enzymes, the thailandin biosynthetic gene cluster encodes proteins with regulatory functions, enzymes for post-polyketide modification, one transporter and a few other proteins. Remarkably, no gene within the cluster encodes a glycosyltransferase. In other polyene biosynthetic gene clusters, the genes for biosynthesis of the sugar moiety and for the glycosyltransferase are present [43]. In the genome of A. bangkokensis, 45 glycosyltransferase genes could be identified, of which two are located in PKS cluster #19 and nine in cluster #16. One of these 45 enzymes should catalyze the rhamnosylation of the thailandin aglycon.
Except for modules 6 and 13, the bioinformatic analysis identified malonyl-CoA as the extender unit, which corresponds to the chemical structure of thailandin. In module 6, methylmalonyl-CoA should be incorporated. For the last module, we postulate the incorporation of butylmalonyl-CoA. The high diversity among polyketides arises from the number of modules in the PKS assembly line, the presence of reducing domains, and other modifying enzymes, whereas the extender units incorporated are usually malonyl-CoA, methylmalonyl-CoA or ethylmalonyl-CoA. Butylmalonyl-CoA is therefore an unusual extender unit. From structural analysis, we would also expect butylmalonyl-CoA to be incorporated in chainin biosynthesis, but so far no cluster information is available. The use of unusual extender units is also postulated in a few other biosynthetic pathways, but there the acyltransferases often seem to be less specific and incorporate more than one defined unit. The AT domain of RevA in reveromycin biosynthesis putatively uses butylmalonyl-CoA, isopentylmalonyl-CoA, pentylmalonyl-CoA or hexylmalonyl-CoA [44]. In the biosynthesis of neoansamycin A-C, pentylmalonyl-CoA or butylmalonyl-CoA is incorporated [45]. Different acylmalonyl-CoA extender units are proposed on the basis of various derivatives of both antimycin [46] and nemadectin [47]. In polyoxypeptin, methylbutylmalonyl-CoA is putatively used as an extender unit [48]. In the polyene compounds fungichromin, filipin and antifungalmycin, an unusual extender unit such as hydroxyl-hexylmalonyl-CoA should be incorporated. These secondary products with their unusual extender units are shown in Figure 5. The incorporation of the lengthened extender unit in thailandin, chainin, filipin, fungichromin and antifungalmycin leads to the particular C2 side chain of these molecules, which is not common in other polyene compounds. Further structural studies would shed new light on polyene-fungus interaction.
The sequences of the described acyltransferases were aligned (Figure A1). The alignment indicates that a motif located further downstream may determine this specificity. Whereas acyltransferases with malonyl-CoA specificity have an HAFH motif, the sequence at this position differs in the acyltransferases that use unusual extender units. The AT13 of the thailandin biosynthetic pathway has a GHSQH and an AAGH motif.
Studies on rare actinomycetes such as Actinokineospora are very promising for the identification of novel secondary metabolites. The genome of A. bangkokensis 44EHW T revealed 34 other secondary metabolite biosynthetic gene clusters, indicating that this strain can produce more compounds. In addition, the examination of PKS systems with unusual extender units is important. Knowledge of the specificity of acyltransferases and of the underlying sequence motifs provides a basis for modifying compounds by combinatorial biosynthesis.

Bacterial Growth Condition
A. bangkokensis 44EHW T was incubated at 28 °C in TSB medium (CASO Bouillon 30 g/L; Carl Roth GmbH, Karlsruhe, Germany) for 3-5 days at 180 rpm. For production analysis, the strain was pre-cultivated in TSB medium for 2 days at 28 °C and then further cultivated in 100 mL HA medium (0.4% yeast extract, 1% malt extract, and 0.4% glucose; pH 7.4) for 8 days.

Extraction of Secondary Metabolites
After cultivation in HA medium, the culture broth was centrifuged (3500× g, 10 min, 4 °C). The supernatant was adjusted to pH 4 by addition of HCl (1 M) and extracted by shaking vigorously with an equal volume of ethyl acetate for 30 min. The organic phase was evaporated to dryness by rotary evaporation at 240 mbar. The extract was dissolved in MeOH and analyzed by HPLC/MS.

Analysis of Secondary Metabolite Production by HPLC/MS
The extract was analyzed on an HPLC system equipped with a photodiode array detector (200-600 nm) and a mass spectrometer (1100 Series, Agilent Technologies, Waldbronn, Germany). Separation was performed on an XBridge™ C18 column (4.6 × 100 mm) with precolumn (4.6 × 20 mm) using a non-linear 0.5% AcOH-CH3CN:H2O gradient ranging from 20% to 95% at a flow rate of 0.5 mL/min. Thailandin A has UV/vis maxima at 325, 340, and 358 nm and a mass of 754 g/mol; thailandin B has a mass of 608 g/mol.
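As a plausibility check on the two masses given above, the following sketch compares their difference with the mass of a rhamnosyl residue. It assumes the standard formula of rhamnose (C6H12O5), rounded average atomic masses, and the loss of one water molecule on glycoside formation; none of these values are taken from the paper, and the helper name is illustrative.

```python
# Rounded average atomic masses in g/mol (assumed standard values).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}


def formula_mass(counts: dict) -> float:
    """Sum the atomic masses for an element -> count mapping."""
    return sum(ATOMIC_MASS[element] * n for element, n in counts.items())


rhamnose = {"C": 6, "H": 12, "O": 5}  # free 6-deoxyhexose
water = {"H": 2, "O": 1}              # lost when the glycosidic bond forms

# Mass added to the aglycon by attaching one rhamnose unit.
rhamnosyl_residue = formula_mass(rhamnose) - formula_mass(water)

# Masses reported in the text for thailandin A and thailandin B.
observed_difference = 754 - 608

print(f"rhamnosyl residue: {rhamnosyl_residue:.2f} g/mol")
print(f"thailandin A - thailandin B: {observed_difference} g/mol")
```

Under these assumptions the two numbers agree to the nearest integer, which is consistent with thailandin A being the mono-rhamnosylated form of thailandin B.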

Isolation of Genomic DNA
The strain A. bangkokensis 44EHW T was incubated in TSB medium for 4 days at 28 °C and 180 rpm. Subsequently, 15 mL of culture was centrifuged (3500× g, 10 min, 4 °C) and the pellet was washed in 15 mL H2O and finally resuspended in 15 mL SET buffer (75 mM NaCl, 25 mM EDTA, 20 mM Tris/HCl pH 8) with lysozyme (100 µg/mL). After incubation for 30 min at 37 °C, RNase (100 µg/mL) was added and the mixture was incubated for an additional 2 h at 37 °C. Afterwards, proteinase K and SDS were added to final concentrations of 100 µg/mL and 0.5%, respectively, and the mixture was incubated at 50 °C overnight. The DNA was extracted by adding an equal volume of phenol/chloroform/isoamyl alcohol (25:24:1) and inverting carefully for 10 min. After centrifugation (2000× g, 20 min, 4 °C), the aqueous phase was transferred into a new tube and the extraction step was repeated twice. Two volumes of isopropanol (100%) were added to the aqueous phase and the genomic DNA was spooled onto a glass Pasteur pipette. The DNA was washed in EtOH (70%), dried and finally dissolved in 1 mL H2O.

Genome Sequencing of A. bangkokensis 44EHW T
The genome of A. bangkokensis 44EHW T was sequenced twice. Eurofins Genomics GmbH (Ebersberg, Germany) sequenced the genome using 454 technology and assembled the reads with Newbler into 247 contigs and 56 scaffolds. In addition, the genome was sequenced at the Center for Biotechnology at the University of Bielefeld, Bielefeld, Germany, using Illumina HiSeq 1000 technology. Initially, all reads were assembled into a draft genome of 206 contigs and 105 scaffolds using GS de novo assembler version 3.0 (Roche, Branford, CT, USA). The genome was then further assembled into a genome sequence of 7,453,713 bp.

Genome Annotation and Identification of the Thailandin Biosynthetic Gene Cluster
The automatic functional annotation results were obtained using the NCBI prokaryotic genome annotation pipeline [9]. In addition, the ORFs were categorized into different subsystems manually. Secondary metabolite gene clusters were identified by antiSMASH 3.0 [10]. The genes were further analyzed by BLAST [49].

Visualization of the Genome of A. bangkokensis 44EHW T
The genome was visualized as a Circos plot generated with the R package circlize [50,51]. The plot shows the scaffolds with gaps, the GC content in 500 bp windows (250 bp steps), the predicted secondary metabolite gene clusters and the annotated ORFs grouped by function.
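The windowed GC calculation can be sketched as follows. This is an illustration of the 500 bp window and 250 bp step described above, not the script used for the figure; the function name is hypothetical, and ambiguous bases such as N are simply counted as non-GC, which is a simplification.

```python
def gc_windows(seq: str, window: int = 500, step: int = 250):
    """Return (start, GC percent) for each full window along seq.

    Windows that would run past the end of the sequence are skipped.
    Any base other than G or C (including N) counts as non-GC.
    """
    seq = seq.upper()
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc_count = chunk.count("G") + chunk.count("C")
        results.append((start, 100.0 * gc_count / window))
    return results


# Toy 1000 bp sequence: 500 bp of pure GC followed by 500 bp of pure AT.
toy = "GC" * 250 + "AT" * 250
for start, gc in gc_windows(toy):
    print(start, gc)
```

Overlapping windows smooth the profile slightly while still resolving short low-GC islands, which is useful when looking for regions that deviate from the genome average.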

Cloning of Single Crossover Vector pKCLP2_PKS11
For the inactivation of thaBI of the PKS cluster #11, a 3087 bp internal fragment of the PKS I gene was amplified by PCR using the primers TTATACTGCAGACCGAGGACGAGGTCATC and ATCGGGAGAACTAGACGAACAG. The PCR product was first cloned into pUC19 (Stratagene, La Jolla, CA, USA). Subsequently, it was excised with PstI/EcoRV and cloned into the final vector pKCLP2 [52]. E. coli XL1 Blue (Stratagene) was used for the cloning experiments.
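A quick way to see which restriction sites the two primers themselves carry is to search them for the recognition sequences. The sketch below assumes the standard recognition sequences of PstI (CTGCAG) and EcoRV (GATATC); both are palindromic, so searching one strand is sufficient. The function and variable names are illustrative, and the sketch makes no statement about where the EcoRV site used for cloning originates.

```python
# Standard recognition sequences (assumed; both are palindromic).
SITES = {"PstI": "CTGCAG", "EcoRV": "GATATC"}


def find_sites(primer: str) -> dict:
    """Map enzyme name -> 0-based position of its first site in primer."""
    primer = primer.upper()
    hits = {}
    for enzyme, site in SITES.items():
        position = primer.find(site)
        if position != -1:
            hits[enzyme] = position
    return hits


# Primer sequences as given in the text.
primer_1 = "TTATACTGCAGACCGAGGACGAGGTCATC"
primer_2 = "ATCGGGAGAACTAGACGAACAG"

print(find_sites(primer_1))
print(find_sites(primer_2))
```

The first primer carries a PstI site near its 5' end, whereas the second primer contains neither recognition sequence in full.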

Alignment of Sequences
The sequences of the thailandin acyltransferases and of other acyltransferases from biosynthetic pathways with proposed unusual extender units were aligned with Clustal Omega [54] and compared in detail.