Halogenase-Targeted Genome Mining Leads to the Discovery of (±) Pestalachlorides A1a, A2a, and Their Atropisomers

Genome mining has become an important tool for discovering new natural products and identifying the cryptic biosynthesis gene clusters. Here, we utilized the flavin-dependent halogenase GedL as the probe in combination with characteristic halogen isotope patterns to mine new halogenated secondary metabolites from our in-house fungal database. As a result, two pairs of atropisomers, pestalachlorides A1a (1a)/A1b (1b) and A2a (2a)/A2b (2b), along with known compounds pestalachloride A (3) and SB87-H (4), were identified from Pestalotiopsis rhododendri LF-19-12. A plausible biosynthetic assembly line for pestalachlorides involving a putative free-standing phenol flavin-dependent halogenase was proposed based on bioinformatics analysis. Pestalachlorides exhibited antibacterial activity against sensitive and drug-resistant S. aureus and E. faecium with MIC values ranging from 4 μg/mL to 32 μg/mL. This study indicates that halogenase-targeted genome mining is an efficient strategy for discovering halogenated compounds and their corresponding halogenases.


Introduction
Halogenated compounds play a profound role in the pharmaceutical industry, as halogen substituents can significantly affect the bioactivity and reactivity of organic compounds [1][2][3]. According to an economic report, 88% of the 100 top-selling drugs employed chlorine in their final pharmaceutical products or in the manufacturing process [4]. Nature is an important source of halogenated compounds. To date, over 5000 halogenated natural products have been discovered from fungi, bacteria, algae, cyanobacteria, plants, and other organisms [5]. Among these, fungi, as the third kingdom in nature, have contributed nearly one-fifth (988) of halogenated metabolites [6] and are expected to harbor many more halogenated natural products yet to be identified [7].
Nature usually orchestrates halogen-carbon bond formation through a variety of halogenases. Several types of halogenases have been identified so far, including heme- or vanadium-dependent haloperoxidases, S-adenosyl-L-methionine-dependent halogenases, nonheme-iron α-ketoglutarate-dependent halogenases, and flavin-dependent halogenases (FDHs) [2,8-10]. Among halogenases, FDHs are widely distributed across all kingdoms of life [11] and are particularly notable for their strong regioselectivity and substrate diversity [2,12]. Almost all FDHs have the following two conserved motifs: a flavin-binding motif, GxGxxG, for binding of the diffusible flavin adenine dinucleotide (FAD) [3], and a structural motif, WxWxIP, thought to prevent a monooxygenation reaction by blocking direct contact between the substrate and the hydroperoxy flavin [13,14]. These signature motifs can be used as probes for promptly identifying putative FDHs from genomic sequences [2,3]. Fungi are a rich source of FDHs. Up to now, twenty-three halogenases have been reported from fungi, twenty of which are FDHs [7].
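As an illustration of how the two signature motifs can serve as sequence probes, the following minimal sketch scans a protein sequence for GxGxxG and WxWxIP with regular expressions. The function names and the toy sequence are hypothetical and are not part of this study; because both motifs are short, a hit only flags a candidate and does not establish that a protein is an FDH.

```python
import re

# Signature motifs named in the text; "x" (any residue) is written as "." in the regex.
FDH_MOTIFS = {
    "flavin-binding GxGxxG": re.compile(r"G.G..G"),
    "structural WxWxIP": re.compile(r"W.W.IP"),
}


def find_fdh_motifs(protein_seq):
    """Return {motif name: [(1-based start, matched text), ...]} for one sequence."""
    seq = protein_seq.upper()
    return {
        name: [(m.start() + 1, m.group()) for m in pattern.finditer(seq)]
        for name, pattern in FDH_MOTIFS.items()
    }


def has_both_motifs(protein_seq):
    """True only if every signature motif occurs at least once."""
    return all(find_fdh_motifs(protein_seq).values())


# Hypothetical toy sequence for demonstration only (not a real halogenase).
toy_sequence = "MSTKIVIVGGGTAGWMAAKLLPEDRWSWEIPLGHAAT"
toy_hits = find_fdh_motifs(toy_sequence)
```

In practice such a filter would be applied to the translated hits of a homology search rather than used on its own.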

Genome Mining of the Halogenase-Containing Biosynthesis Gene Cluster
Flavin-dependent halogenases (FDHs), the most thoroughly characterized halogenases, can be categorized according to their substrates into the following five main classes: free-standing phenol, free-standing indole, carrier protein-dependent phenol, carrier protein-dependent pyrrole, and aliphatic FDHs [9]. GedL is a free-standing phenol FDH from Aspergillus terreus NIH2624 [26]. It is involved in the biosynthesis of geodin and halogenates its substrate at a late stage of biosynthesis [26]. Here, we used GedL as the probe to conduct tBlastp analysis on our in-house fungal genome sequences. An antiSMASH analysis was subsequently performed, and a gene, ptlK, encoding a putative flavin-dependent halogenase with 51% amino acid sequence identity to GedL [26], was found in a cryptic BGC of the endolichenic fungus Pestalotiopsis rhododendri LF-19-12. Subsequent phylogenetic analysis showed that PtlK grouped with free-standing phenol FDHs (Figure 2), suggesting that its substrate might contain a phenol moiety.

Figure 2.
Phylogenetic tree based on amino acid sequences of PtlK and the selected flavin-dependent halogenases (FDHs). GenBank, UniProtKB, or PDB accession numbers are given in parentheses. The three dominant categories of FDHs: free-standing phenol, free-standing indole, and carrier protein-dependent phenol FDHs are highlighted in orange, blue, and green, respectively. Representative products are shown beside. The phylogenetic tree was constructed using the UPGMA method. Visualization was conducted with MEGA7.
Subsequently, LC-MS and OSMAC strategies were employed to explore the production of halogenated secondary metabolites. Pestalotiopsis rhododendri LF-19-12 was cultured in four different media (M1, M2, PDB, and YES) and then extracted using MeOH. The obtained material was subjected to LC-MS analysis. As a result, a group of potential halogenated compounds with the characteristic isotope pattern of two chlorine atoms was detected in the crude extract of the Pestalotiopsis rhododendri LF-19-12 culture in the M2 medium (Figure 3).
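The isotope pattern used as a screening criterion here can be estimated from the natural abundances of 35Cl and 37Cl. The sketch below is a simplified illustration, not the authors' data-processing workflow: it treats the chlorine atoms binomially and ignores the contributions of 13C and other isotopes, and the abundance values are approximate.

```python
from math import comb

# Approximate natural isotopic abundances of chlorine.
P_CL35 = 0.7576
P_CL37 = 0.2424


def chlorine_isotope_pattern(n_cl):
    """Relative intensities of the M, M+2, ..., M+2n peaks for n chlorine atoms.

    Only the chlorine contribution is modeled; the most intense peak is scaled to 100.
    """
    raw = [
        comb(n_cl, k) * P_CL35 ** (n_cl - k) * P_CL37 ** k
        for k in range(n_cl + 1)
    ]
    top = max(raw)
    return [round(100 * value / top, 1) for value in raw]


# Two chlorine atoms give roughly a 100 : 64 : 10 pattern for M : M+2 : M+4.
dichloro_pattern = chlorine_isotope_pattern(2)
```

A monochlorinated compound would instead show only two peaks, in a ratio of roughly 3:1, which is why the three-peak pattern is diagnostic for dichlorination.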
The HR-ESIMS spectrum of compound 1a revealed the characteristic isotope pattern of a dichlorinated compound (Figure 3). Analysis of the HR-ESIMS and 13C NMR data established the molecular formula of 1a as C23H25Cl2NO6 ([M+H]+ m/z 482.1121, calcd. for C23H26Cl2NO6, 482.1137). Interpretation of the 1H NMR, 13C NMR, and HSQC data for 1a (Table 1, Figures S1-S3) disclosed a carbonyl group (δC 168.6), 12 aromatic carbons, one of which is protonated, a trisubstituted olefin, a methine, three methylene units, one of which is attached to an oxygen atom, four methyl groups, one of which is a methoxy, and three phenolic hydroxyl groups. These features accounted for 8 of the 11 degrees of unsaturation and required 1a to incorporate three rings, two of which should be aryl rings.
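The ring count in the preceding argument follows from simple bookkeeping, sketched below. The sketch assumes that the neutral molecule is C23H25Cl2NO6, that is, the reported [M+H]+ composition C23H26Cl2NO6 minus the added proton; the function names are illustrative only.

```python
def degrees_of_unsaturation(c, h, n=0, halogens=0):
    """Rings plus pi bonds for a neutral CHNO(X) formula (oxygen does not contribute)."""
    return (2 * c + 2 + n - h - halogens) / 2


def ppm_error(observed_mz, calculated_mz):
    """Mass accuracy of an observed m/z relative to the calculated value, in ppm."""
    return (observed_mz - calculated_mz) / calculated_mz * 1e6


# Neutral 1a, assumed to be C23H25Cl2NO6.
total_unsaturation = degrees_of_unsaturation(c=23, h=25, n=1, halogens=2)

# Unsaturations assigned from the NMR data: one C=O, six C=C equivalents
# from the 12 aromatic carbons, and one trisubstituted olefin.
accounted_for = 1 + 6 + 1

rings_required = total_unsaturation - accounted_for

# Agreement between the observed and calculated [M+H]+ values quoted in the text.
mass_error_ppm = ppm_error(482.1121, 482.1137)
```

Under these assumptions the formula carries 11 degrees of unsaturation, so the remaining three must be rings, in line with the two aryl rings and one additional ring inferred in the text.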
1H-1H COSY correlations (Figures 4 and S3) revealed two isolated proton spin systems attributed to -CH2-CH2-OH and -CH2-CH= (Figure 4). Furthermore, an isoprenyl unit in 1a was established by HMBC correlations (Figures 4 and S5) from H-4' and H-5' to the vinylic carbons C-3' and C-2'. HMBC correlations from H-1' and H-2' to C-6 suggested that the isoprenyl group was connected to the aromatic ring at C-6. Two phenolic hydroxyl groups, at C-3 and C-5, were inferred from the downfield chemical shifts of C-3 and C-5. Further correlations from H-4 to C-2, C-6, C-3, C-5, and C-1, from H-8 to C-6, C-2, and C-1, as well as from H-16 to C-8 and C-1, allowed construction of the substituted isoindole-1-one scaffold.
HMBC correlations from H-1" to C-11, C-12, and C-13, from the phenolic proton at δ 10.03 to C-9, C-10, and C-11, from the methoxy protons at δ 3.05 to C-14, and from H-8 to C-9, C-10, and C-14 indicated that a hexasubstituted benzene ring was attached to C-8 via a C-C bond. As a result, the two chlorine atoms in 1a could only be located at C-11 and C-13. Therefore, the planar structure of 1a was assembled as shown in Figure 4.
The structure of 1a was further confirmed by single-crystal X-ray analysis. The crystallographic data disclosed that 1a crystallized in the centrosymmetric space group P12₁/c1, suggesting that it is a racemate of the 8R and 8S enantiomers (Figure 5).
Compound 1b, an isomer of 1a according to HR-ESIMS analysis, was quickly converted into 1a in aqueous acetonitrile. Therefore, only a mixture of 1a and 1b was obtained. The 1H NMR spectrum of the mixture displayed two sets of signals (Figure S6): one set is identical to that of 1a, and the other is nearly identical to that of 1a except for the methoxy proton chemical shift (δ 3.97, deshielded, for 1b vs. δ 3.05, shielded, for 1a), suggesting that 1b is an atropisomer of 1a. The 13C NMR spectrum (Figure S7) further supported this hypothesis. To our knowledge, pestalachloride A (3) from Pestalotiopsis adusta, an analog of 1a and 1b, also exhibits axial chirality due to hindered rotation around the C8-C9 bond, but its two atropisomers could not be chromatographically separated [27].
The  Figures S11 and S12). The upfield methoxyl proton signals at δ 3.04 indicated that the methoxy was located in the shielded area of the isoindole-1-one residue. A careful examination of the NMR spectra of 2a disclosed the presence of a minor component 2b, which was subsequently proved to be an atropisomer of 2a. Compound 2a showed no optical activity, suggestive of it also being a racemate.
HR-ESIMS analysis revealed 2b to be an isomer of 2a. The 1H NMR signals of 2b were nearly identical to those of the minor component observed in the spectrum of 2a (Figure S13). The methoxyl proton signal of 2b (δ 3.97), downfield relative to that of 2a (δ 3.04), indicated that the methoxy group in 2b was located in the deshielded area of the isoindole-1-one residue. 2a and 2b also interconvert at room temperature.
As shown above, pestalachlorides A, A1, and A2 possess axial chirality, which results in time-dependent atropisomerism. To assess the stability of the pestalachloride atropisomers, we calculated the relative Gibbs energy barriers for their interconversion at the M062X/def2TZVP/SMD (H2O)//B3LYP/6-31G(d)/PCM (H2O) level. The barriers for 1a to 1b and 1b to 1a were 24.6 kcal/mol and 24.4 kcal/mol, corresponding to interconversion half-lives of 34 h and 24 h at room temperature, respectively, in agreement with the observation that 1b is slightly less stable than 1a. The barriers between the two atropisomers of pestalachloride A were 21.4 kcal/mol and 21.6 kcal/mol, corresponding to half-lives of 0.15 h and 0.23 h, respectively, which supports their inseparability. The barriers for the interconversion of 2a and 2b were 26.9 kcal/mol and 27.4 kcal/mol, respectively, indicating that they can also interconvert [29].
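The relationship between a Gibbs energy barrier and an interconversion half-life can be sketched with the Eyring equation. The example below is an illustrative estimate rather than the authors' stated procedure: it assumes a temperature of 298.15 K, a transmission coefficient of 1, and irreversible first-order kinetics in each direction.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
H = 6.62607015e-34     # Planck constant, J*s
R = 8.314462618        # gas constant, J/(mol*K)
KCAL_TO_J = 4184.0     # joules per kilocalorie


def eyring_half_life_hours(barrier_kcal_mol, temperature_k=298.15):
    """First-order half-life (h) for a Gibbs energy barrier, via the Eyring equation.

    Assumes a transmission coefficient of 1 and no back-reaction.
    """
    rate = (K_B * temperature_k / H) * math.exp(
        -barrier_kcal_mol * KCAL_TO_J / (R * temperature_k)
    )
    return math.log(2) / rate / 3600.0


# Barriers quoted in the text (kcal/mol).
half_life_1a_to_1b = eyring_half_life_hours(24.6)
half_life_1b_to_1a = eyring_half_life_hours(24.4)
half_life_pestalachloride_a = eyring_half_life_hours(21.4)
```

Under these assumptions the estimate approximately reproduces the reported half-lives, and it shows how steeply the half-life depends on the barrier: a difference of about 3 kcal/mol separates minutes from more than a day.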

Antimicrobial Activities of Pestalachlorides
Pestalachloride A, an analog of pestalachlorides A1a and A2a, was previously reported to show antibacterial activity against standard and methicillin-resistant Staphylococcus aureus (MIC = 10 μg/mL) and the plant pathogenic fungus Fusarium culmorum (MIC = 3.2 μg/mL). To preliminarily explore the bioactivity of the new compounds and the structure-activity relationship of pestalachlorides, pestalachlorides A1a and A2a, together with pestalachloride A (3), were evaluated for their activity against standard Staphylococcus aureus ATCC 29213 and methicillin-resistant Staphylococcus aureus (MRSA), as well as other human pathogenic microbes, namely Enterococcus faecium ATCC 35667, vancomycin-resistant Enterococcus faecium (VRE), and Candida albicans ATCC 10231. Pestalachloride A showed moderate activity against Staphylococcus aureus ATCC 29213, MRSA, and VRE, with minimum inhibitory concentrations (MICs) of 8 μg/mL, 4 μg/mL, and 8 μg/mL, respectively. Pestalachloride A1a showed weak antibacterial activity against the four Gram-positive bacteria (MIC = 32 μg/mL), while pestalachloride A2a showed no antibacterial activity within the tested concentration range, suggesting that bulky N-substituents can reduce the antibacterial activity of these compounds (Table 3). None of the three tested compounds showed activity against the fungus Candida albicans.

Discussion
With the development of sequencing and bioinformatics, genome mining has increasingly become an important strategy for identifying new compounds and cryptic enzymes and for exploring new biosynthetic logic. Here, we succeeded in discovering new pestalachloride analogs, and thus in unearthing their biosynthesis gene cluster, by combining halogenase-targeted genome mining with the characteristic isotope patterns of halogen atoms. Pestalachlorides A1a and A2a and their analog pestalachloride A share an isoindolin-1-one core structure that occurs in a number of bioactive compounds [37]. From a biosynthetic perspective, pestalachlorides belong to the pestalone-type benzophenones [37]. This class of compounds features a prenyl group attached to a benzophenone that is often chlorinated. Although a total of 21 natural analogs of pestalone have been discovered, including SB87-Cl and SB87-H from Chrysosporium sp. [38], pestalone from Pestalotia sp. CNL-365 [39], pestalachlorides A-C from Pestalotiopsis adusta [27], (±)-pestalachloride D from Pestalotiopsis sp. ZJ-2009-7-6 [40], pestalachlorides E and F from Pestalotiopsis sp. ZJ-2009-7-6 [41], pestalones B-H from Pestalotiopsis neglecta F9D003 [42], and pestalotinones A-D from Pestalotiopsis trachicarpicola SC-J551 [28], no biosynthesis gene clusters responsible for their assembly have been reported. To our knowledge, this is the first report of the biosynthesis gene cluster of pestalachlorides and their analogs, the pestalone-type benzophenones. Many known natural metabolites have still not been connected with their biosynthesis gene clusters, which hinders the further mining of natural products. Given that a large proportion of them contain halogen atoms, the halogenase-targeted genome mining reported here might be an efficient strategy for uncovering their biosynthetic origin.
PtlK, which is proposed to install two chlorine atoms on the phenol residue of pestalachlorides at a late stage of biosynthesis, was reasoned to be a free-standing phenol FDH. Free-standing FDHs, including indole and phenol FDHs, have gained broad interest because they are easier to use in biotransformation. Among them, free-standing indole FDHs have been investigated and engineered in depth [13,14,43-46]; however, corresponding research on free-standing phenol FDHs is still scarce. Although free-standing phenol FDHs are widely distributed in fungi, only a few have been connected with their products, and none of their structures has been determined [7], which hinders the application of these enzymes. Further mining of fungal free-standing phenol FDHs and their products will benefit their structural determination and their engineering for biocatalytic applications.

Genome Mining of the Halogenase-Containing Biosynthesis Gene Clusters
TBlastp analysis was performed using the fungal FDH GedL as the probe to explore new halogenated secondary metabolites from our in-house fungal genomic database. The hit-containing sequences were further analyzed by antiSMASH, and the putative halogenases potentially involved in secondary metabolite biosynthesis were selected for phylogenetic analysis with characterized FDHs. The (6QGM). The amino acid sequences of the putative halogenase PtlK and the selected known halogenases were aligned using MUSCLE [48], and a phylogenetic tree was constructed using the UPGMA method [49] and visualized with MEGA 7.0.26 [50].
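For readers unfamiliar with UPGMA, the clustering step can be sketched as follows. This is a minimal pure-Python illustration on a hypothetical distance matrix; it is not the MUSCLE and MEGA workflow used in this study, and it returns only the tree topology without branch lengths.

```python
from itertools import combinations


def upgma(names, distance):
    """Build a UPGMA topology as nested tuples.

    `distance[a][b]` must hold the pairwise distance for every pair (a, b)
    in the order produced by itertools.combinations(names, 2).
    """
    # Each cluster is (tree, size); a tree is a leaf name or a (left, right) tuple.
    clusters = [(name, 1) for name in names]
    dist = {frozenset((a, b)): distance[a][b] for a, b in combinations(names, 2)}

    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest average distance.
        (tree_a, size_a), (tree_b, size_b) = min(
            combinations(clusters, 2),
            key=lambda pair: dist[frozenset((pair[0][0], pair[1][0]))],
        )
        merged_tree = (tree_a, tree_b)
        clusters = [c for c in clusters if c[0] not in (tree_a, tree_b)]
        # Distance to the merged cluster is the size-weighted mean of the two
        # parent distances, which equals the unweighted mean over all leaves.
        for other_tree, _ in clusters:
            dist[frozenset((merged_tree, other_tree))] = (
                size_a * dist[frozenset((tree_a, other_tree))]
                + size_b * dist[frozenset((tree_b, other_tree))]
            ) / (size_a + size_b)
        clusters.append((merged_tree, size_a + size_b))

    return clusters[0][0]


# Hypothetical distances between four sequences labeled A to D.
toy_names = ["A", "B", "C", "D"]
toy_distance = {
    "A": {"B": 2.0, "C": 4.0, "D": 8.0},
    "B": {"C": 4.0, "D": 8.0},
    "C": {"D": 8.0},
}
toy_tree = upgma(toy_names, toy_distance)
```

UPGMA assumes a constant rate of change across lineages, so the grouping it produces is best read as a similarity-based classification of the halogenases rather than a strict evolutionary history.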

Culture Condition Prioritization for the Production of Chlorinated Compounds
Pestalotiopsis rhododendri LF-19-12 was originally isolated from a lichen sample collected in Tibet, China, and was identified using a neighbor-joining (NJ) phylogenetic tree based on ITS sequences (Figure S14). To explore the production of chlorinated compounds, four culture media were selected for culturing Pestalotiopsis rhododendri LF-19-12: M1 (peptone 2 g, yeast powder 4 g, starch 10 g, 1 L distilled water), M2 (mannitol 40 g, maltose 40 g, yeast powder 10 g, K2HPO4 2 g, MgSO4·7H2O 0.5 g, FeSO4·7H2O 0.01 g, 1 L distilled water), PDB (200 g potato, 20 g glucose, 1 L distilled water), and YES (sucrose 150 g, yeast powder 20 g, MgSO4·7H2O 0.5 g, ZnSO4·7H2O 0.01 g, CuSO4·5H2O 0.005 g, 1 L distilled water). The fungus was first cultured in 250 mL Erlenmeyer flasks containing 50 mL of potato dextrose broth (PDB) medium and incubated on a rotary shaker at 220 rpm and 28 °C for 48 h to yield the seed culture. Then 50 mL of the seed culture was inoculated into a 500 mL Erlenmeyer flask containing 100 mL of fermentation medium and incubated at 220 rpm and 28 °C for 9 days. The fermentation for each culture medium was carried out in triplicate. Subsequently, 2 mL of culture was filtered, and the obtained mycelia were extracted using methanol. The crude extract was pretreated with ODS and then analyzed using HR-ESIMS/MS.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.