Dereplication, Annotation, and Characterization of 74 Potential Antimicrobial Metabolites from Penicillium Sclerotiorum Using t-SNE Molecular Networks

Microorganisms associated with termites are an original resource for identifying new chemical scaffolds or active metabolites. A molecular network was generated from a collection of strain extracts analyzed by liquid chromatography coupled to tandem high-resolution mass spectrometry, and activities against the human pathogens methicillin-resistant Staphylococcus aureus, Candida albicans and Trichophyton rubrum were mapped onto it, leading to the selection of a single active extract of Penicillium sclerotiorum SNB-CN111. This fungal species is known to produce azaphilones, a colorful family of polyketides with a wide range of biological activities and economic interest in the food industry. By exploring the molecular network data, it was shown that the chemical diversity of the P. sclerotiorum metabolome largely exceeded the data already reported in the literature. According to the described fragmentation pathways of protonated azaphilones, the annotation of 74 azaphilones was proposed, including 49 never isolated or synthesized thus far. Our hypothesis was validated by the isolation and characterization of eight azaphilones, three of which were new: chlorogeumasnol (63), peniazaphilone E (74) and 7-deacetylisochromophilone VI (80).


Introduction
For decades, natural products have been the most productive source of leads for new drugs, including antimicrobials [1]. Nevertheless, new chemical scaffolds are always required to extend therapeutic arsenals in order to address global public health problems, such as antibiotic resistance. To this end, new ecological niches must be explored, and their relative chemodiversity must be evaluated [2].
Among the ecological niches that have been little studied, the microorganisms associated with insects are rising in interest. One million five thousand insect species (Arthropods) have been formally described to date [3]. Arthropods colonize almost all terrestrial habitats, including forests, deserts and coasts. These organisms are also colonized by microorganisms located in different compartments, such as cuticles, digestive systems and glands [4]. Insect-microorganism interactions have been widely studied within apocrites (bees, wasps, ants), but few studies are related to termite-microorganism interactions outside the trophobiosis [5][6][7][8]. However, examples of antimicrobial compounds produced by microorganisms associated with termites from French Guiana were previously published in the literature, especially by the research group of D. Stien and V. Eparvier [9][10][11][12].

Subsequently, a second step in the dereplication process was carried out. First, molecules described in the Atlas of Natural Products as analogs of compounds 1 to 7 were searched. Then, a second search was carried out in the Reaxys database for compounds structurally similar to molecules isolated from natural resources [24]. Structures proposed through the literature review were annotated at level 3 using only exact mass and taxonomic information. Five molecules, i.e., geumsanol A-C and G (14-17) and eupeniazaphilone C (18), were therefore annotated in cluster C, where hypocrellone (7) was first dereplicated [25][26][27]. Two molecules, i.e., isochromophilone IX (6) or penazaphilone F (19) and penazaphilone D (20), were also annotated in cluster B, which includes compounds 2 and 5 [28,29]. Isochromophilone IV (21) and sclerketide B isomer (22) completed the annotation of cluster A, which includes sclerotiorin (1) [30,31]. Finally, 5-chloroisorotiorin (23) completed cluster D, which features molecules 3 and 4 [32]. Using the literature review, it was thus possible to annotate 10 more azaphilones produced by P. sclerotiorum SNB-CN111 (Figures S10, S14-S26). Despite these first two in silico steps, more than one hundred molecules related to azaphilones remained unannotated.

In Silico Azaphilone Structure Prediction Using MS/MS Data and a t-SNE Molecular Network
According to the first in silico dereplication steps, four distinct subclasses of azaphilone produced by the P. sclerotiorum SNB-CN111 strain were uncovered. These azaphilone subclasses presented one or several modifications of the same scaffold, including acylations on R1, a lactone ring at R1-R2, the presence of an oxygen or a functionalized nitrogen on Y, a diol or a methylethylene at R3-R4 and finally chlorine or hydrogen at position X (Figure 2). Isolated or synthetic azaphilones reported in the literature do not display all this combinatorial diversity [21]. To further annotate the azaphilone-related metabolome from the P. sclerotiorum SNB-CN111 strain, all available chemical information, including MS/MS data, chemical formulas, biosynthetic pathways and literature surveys, was gathered and combined. Five distinct clusters were analyzed in more detail (Figures 3-5).
First, cluster A, containing annotated sclerotiorin (1), isochromophilone IV (21) and sclerketide B isomer (22), was clearly separated from the others on the t-SNE molecular network. Two nonchlorinated analogs, 24 and 25, of these three compounds were first annotated according to exact mass measurements within 5 ppm, isotopic patterns, MS/MS data and high cosine score values of 0.93 and 0.81 for pairs 1/24 and 22/25, respectively. In particular, three common neutral losses of 42 [33][34][35][36] suggest that the nature of the acyl group can be identified from this typical fragmentation pattern, as described in Figure 4. According to this first rule of azaphilone fragmentation, compounds 26 and 27 were annotated as sclerotiorin analogs with butanoyl groups at R1 (Figures S29 and S30), whereas compounds 28 and 29 were annotated as sclerotiorin analogs with pentanoyl groups (Figures S31 and S32). Protonated nonacetylated compounds 30 and 31 were also detected (Figures S33 and S34).
For these two particular compounds, fragments at m/z 181.0051 were observed for chlorinated molecule 30 and at m/z 147.0441 for hydrogenated molecule 31 (Figures S33 and S34). Dechlorosclerotiorin (32), dechlorobenzoylsclerotiorin (33), aminobenzoylsclerotiorin (34), 5-epoxysclerotiorin (35), benzoylsclerotiorin (36) and methoxysclerotiorin (37) were similarly annotated in cluster A. Finally, compound 38 was annotated as an analog of molecule 26 with hydroxylation on the butanoyl moiety (Figures S35-S41).
Using the same methodology, subclusters B1 and B2, initially containing molecules 2 and 5, respectively, were further annotated (Figure 4) with compounds 39-46 in cluster B1 (Figure 5) and compounds 47-49 in cluster B2 (Figures S42-S52). These two subclusters were close in the t-SNE MN, allowing us to deduce that they exhibit strong structural similarities. Because nitrogenated azaphilones spontaneously transform from their oxygenated analogs, molecule 42 can be annotated by comparison with compound 22 [21]. The two molecules had a cosine score of 0.61 and shared eight neutral losses. It was also observed in cluster B2 that compound 5 is an analog of compound 2 with an ethanol function on its nitrogen. Moreover, nitrogen-substituted azaphilones 5, 6, 19 and 20 were annotated on the northern part of the t-SNE MN. Thus, cluster B2 only contained azaphilones 50 to 60 with substituted nitrogen (Figures S53-S63). Molecules 61 and 62 in cluster B3, initially containing molecule 6, were finally annotated, taking into account the same hypothesis (Figures S64-S65).
Cluster C, containing hypocrellone A (7), was then annotated. As previously described in the other clusters, a chlorinated analog of 7 was observed (63) (Figure 6 and Figure S66

Isolation and Characterization of Compounds
To confirm the in silico annotation, azaphilones from the fractions most active against T. rubrum (Figure S8) were isolated for complete structural elucidation. Annotated and known compounds 1, 2, 5, 23 and 75 and newly annotated compounds 63, 74 and 80 were thus purified and structurally characterized (Figure 7). The known compounds were identified by 1H and 13C NMR and data comparison with the literature (Tables S1 and S2).
Compound 63 was obtained as a yellow oil, and its molecular formula was determined to be C 23 .1 ppm). The similarity of its NMR data with those of compound 3 (Tables S1 and S2) indicated the presence of a lactone ring and a chlorine atom on carbon 5. Thus, the 1H and 13C NMR spectroscopic data of 63 were analyzed and compared with the literature [25,41]. The azaphilone scaffold was identified by HMBC correlations of H-1/C-3, C-4a, C-5 and C-8, H-4/C-3, C-5, C-8, C-8a and H-18/C-6, C-7, C-8, by the downfield chemical shifts of C-1 (δC 146.5) and C-3 (δC 157.3) and by the chemical shift of ketocarbonyl carbon C-6 (δC 184.5). The side chain was connected to C-3 by the HMBC correlation of H-9 and H-10. The lactone moiety was confirmed by the presence of four additional carbon resonances comprising two carbonyls (δC 199.7 and 168.1), one methine (δC 57.3) and one methyl group (δC 30.3), by a COSY correlation between H-8 and H-3" and by an HMBC correlation of H-3"/C-2". Ketocarbonyl carbon C-4" and C-5" were connected to C-3" by HMBC correlations of H-3"/C-4" and H-5"/C-3", C-4". The chlorine atom was positioned on C-5 because this carbon showed no HSQC correlation. Finally, a typical correlation of trans-coupled olefinic protons was observed in COSY, with correlations between H-9/H-10, H-17/H-12/H-13/H-16 and H-13/H-14/H-15 (43). Two carbons, C-11 (δC 75.9 ppm) and C-12 (δC 78.4 ppm), displayed hydroxylated carbon chemical shifts, which permitted the establishment of the side chain as 3,5-dimethylhept-1-ene-3,4-diol (Figures S97-S102, S117, Tables S1 and S2).
This attribution is in accordance with the reported NMR characterization of 15, an analog of 63 without chlorine at position X [25]; compound 63 was named chlorogeumasnol.
Molecule 74 was isolated as a purple amorphous oil, and its molecular formula was determined to be C 23  . The 3,5-dimethyl-1,3-heptadienyl unit, the azaphilone scaffold and their connection were identified as described for 63. The lactone ring was established with the HMBC correlation of H-5" with C-4", C-3" and C-2" and by comparison with its analog described in the literature [28,32] (Figures S103-S108, S117, Tables S1 and S2). Compound 74 was named peniazaphilone E.
Compound 80 exhibited a high peak intensity in LC-MS and was close to cluster D in the t-SNE MN (Figure S109). By studying MS/MS fragmentation, compound 80 was expected to be an analog of 5 with a hydroxyl group at R1 (Figure S110). Compound 80 was obtained as a red oil, and its molecular formula was determined to be C21H26ClNO4 based on the ESI-HRMS experiment ([M + H]+ peak at m/z 392.1604, calcd for C21H26ClNO4H+, 392.1623, err. 4.9 ppm). The 3,5-dimethyl-1,3-heptadienyl unit, the azaphilone scaffold and their connection were identified as described for 63. The ethanol chain was established by the COSY correlation between H1 and H2, as well as the C1 (δC 56.6) and C2 (δC 60.5) chemical shifts and HMBC correlation of H-1/C-1 (Figures S111-S117, Tables S2 and S3). Molecule 80 was named 7-deacetylisochromophilone VI.
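The mass accuracy quoted for compound 80 can be checked with a short calculation. The sketch below is illustrative only: it assumes the neutral formula C21H26ClNO4 implied by the calculated ion, uses standard monoisotopic masses, and the helper names `protonated_mz` and `ppm_error` are hypothetical rather than part of any software used in this study.

```python
# Monoisotopic masses (u) of the most abundant isotopes; standard values.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "Cl": 34.96885268,  # 35Cl
}
ELECTRON = 0.00054858  # electron mass (u)


def protonated_mz(formula):
    """Theoretical m/z of [M + H]+ for a neutral formula given as {element: count}."""
    neutral = sum(MONOISOTOPIC[element] * count for element, count in formula.items())
    # Adding a proton = adding a hydrogen atom and removing one electron.
    return neutral + MONOISOTOPIC["H"] - ELECTRON


def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


# Neutral formula assumed for compound 80 (taken from the calculated ion in the text).
compound_80 = {"C": 21, "H": 26, "Cl": 1, "N": 1, "O": 4}
theoretical = protonated_mz(compound_80)
error = ppm_error(392.1604, theoretical)
print(f"calcd m/z {theoretical:.4f}, error {error:.1f} ppm")
```

The sign of the error is negative because the measured value lies below the calculated one; the text reports its absolute value.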
Crystal structures for molecules 1 and 5 were obtained, allowing us to determine the absolute configuration of each chiral carbon atom (Figures S118-S121, Tables S3-S15). Thus, the absolute configuration of C-7 of all other isolated compounds was determined by comparison of the circular dichroism of each isolated molecule with compounds 1 and 5.
All isolated compounds were tested on two fungal human pathogens and showed moderate MICs (Table 1)

Discussion
The specialized metabolism of P. sclerotiorum SNB-CN111 was examined in depth among 109 extracts of microorganisms associated with termites from French Guiana. This strain showed both antimicrobial activity and a wide variety of molecules from the azaphilone class.
The LC-MS/MS and in silico dereplication process provided an example of how an extended analysis of MS/MS data can lead to large-scale azaphilone annotation. First, it could be demonstrated that the fractionation of the crude extract allowed a better dereplication of the azaphilones. The number of features increased from 382 to 2953, leading to a more complete MN using t-SNE visualization. However, only a few azaphilones were dereplicated by querying the MS/MS. To enlarge our annotation, a literature survey was performed. Eight azaphilones were annotated after this second dereplication step coming from the isolation process (14-21). Eight additional reported compounds not detected by the first dereplication round were then identified thanks to our in silico strategy: 24, 34, 35, 40, 72 and 75-77 [32,39].
In silico strategies were tested, but the results were not satisfactory enough for azaphilone analogs [43][44][45][46]. Indeed, in silico strategies display 17-25% to 87-93% annotation accuracy for "known-unknown" metabolites, depending on the database boost [46]. However, these strategies do not perform correctly for "unknown-unknown" metabolites derived from enzymatic and chemical transformation or for original structures, as was the case for azaphilones extracted from P. sclerotiorum SNB-CN111. At present, there is a need to increase our ability to accurately annotate "unknown-unknown" characteristics to identify compounds that cannot be isolated for various reasons.
The primary annotation based on HRMS data permitted us to differentiate analogs with and without chlorine atoms on the carbon at position 5 due to a typical mass difference of 33.9610 combined with a change in isotopic pattern related to the characteristic abundance of 37Cl [47,48]. Interestingly, only two unchlorinated N-azaphilones (41 and 53) were observed. It was supposed that chlorine atoms contribute to the affinity of the azaphilone scaffold for primary amines through electron-withdrawing effects. The nature of the acylation on the hydroxyl at position 7 was simply identified according to typical neutral losses, as described in Figure 4. In this way, four different acylations of azaphilone, i.e., acetylation, propionylation, butanoylation and pentanoylation, were systematically associated with three different azaphilone scaffolds (1/22/26/28, 2/42/45/44, 24/25/27/29). As expected when performing reverse-phase LC, the CH2 increment linearly increased the retention time of each molecule (Figure S121). More unique modifications, such as benzoyl and hydroxylation of the acyl moiety, were also identified. Moreover, for azaphilones bearing a benzoylation (33, 34, 36), the benzoyl moiety leads to the formation of the main ion fragment, whereas the azaphilone scaffold constitutes the neutral loss (Figures S36, S37 and S39). In contrast, for azaphilones with linear acylation, the acyl moiety is the main neutral loss, and the azaphilone scaffold is the main ion fragment. This particular fragmentation pathway, due to the presence or absence of a labile proton at position α of the carbonyl, was previously described for the azaphilone mitorubrin [35].
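The chlorine/hydrogen mass difference used above can be turned into a simple pairing rule between features. The following sketch is an illustration only and not the workflow used in this study: the function name `chlorinated_pairs`, the 0.005 tolerance and the example m/z values are assumptions chosen for demonstration.

```python
H = 1.00782503      # monoisotopic mass of 1H (u)
CL35 = 34.96885268  # monoisotopic mass of 35Cl (u)

# Replacing H by 35Cl shifts the monoisotopic mass by about 33.9610.
CL_H_SHIFT = CL35 - H


def chlorinated_pairs(mz_values, tol=0.005):
    """Return (lighter, heavier) m/z pairs whose spacing matches the Cl/H shift.

    `tol` is an assumed absolute tolerance on the mass difference.
    """
    ordered = sorted(mz_values)
    pairs = []
    for index, low in enumerate(ordered):
        for high in ordered[index + 1:]:
            if abs((high - low) - CL_H_SHIFT) <= tol:
                pairs.append((low, high))
    return pairs


# Hypothetical feature list: the first two values differ by the Cl/H shift.
features = [357.1697, 391.1307, 400.0]
print(round(CL_H_SHIFT, 4))
print(chlorinated_pairs(features))
```

A mass match alone is not sufficient for annotation; as stated in the text, the change in isotopic pattern due to 37Cl has to be checked as well.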
One of the specificities of azaphilone chemistry is its capacity for spontaneous conversion of oxygen atoms into nitrogen groups at position 2. Thus, a mass shift of 0.9843 between the compounds is directly related to the spontaneous exchange of O by NH. Neutral losses of 71.9844 corresponding to the loss of a CO2 group from the lactone ring and CO from the azaphilone scaffold were then observed (3, 4, 15, 63, 75, 76).
Another property of azaphilones is their strong absorbance, which results in yellow, orange, red or violet molecules. For example, compound 1 shows a maximum absorbance at 361 nm, 5 at 370 nm and 23 at 356 nm [7,30,49]. When using acetonitrile as the solvent, molecules 74 and 75 also exhibit two maximal absorbances at 424/548 nm and 428/541 nm, respectively.
The highest activities related to fractions F6, F7 and F8 were linked to features depicted in clusters A, B1, B3 and D. Many of the azaphilones grouped in these clusters were minor compounds, and only 1, 2, 5, 23, 63, 74, 75 and 80 were isolated and structurally characterized. Experimental MIC measurements using these pure compounds confirmed moderate activities against human pathogens. It can be assumed either that the most active molecule(s) have not been isolated thus far due to low abundance in the fractions or that azaphilones have synergistic activities.

General Experimental Procedures
Optical rotations were measured at 20 °C in acetonitrile using an Anton Paar MCP 300 polarimeter in a 100-mm-long 350 µL cell. UV spectra were recorded at 20 °C in acetonitrile or methanol using a PerkinElmer Lambda 5 spectrophotometer. Electronic circular dichroism spectra were acquired at 20 °C in acetonitrile on a JASCO J-810 spectropolarimeter. NMR spectra were recorded on Bruker 300, 500, 600 and 700 MHz spectrometers (Bruker, Rheinstetten, Germany). The chemical shifts (δ) are reported in ppm based on the solvent signal, and coupling constants (J) are in hertz. Preparative HPLC was conducted with a Gilson system equipped with a 322 pumping device, a GX-271 fraction collector, a 171 diode array detector and a prep ELSII. All solvents were HPLC grade, purchased from Sigma-Aldrich (Saint-Quentin-Fallavier, France).

General Identification Procedure
The taxonomic marker analyses were performed externally by BACTUP, France. The fungi were identified by amplification of the ITS4 or ITS1 region of ribosomal DNA, and the bacterial isolates were identified on the basis of 16S rDNA sequence analysis. The sequences were aligned with DNA sequences from GenBank, NCBI (http://www.ncbi.nlm.nih.gov, accessed on 7 June 2021), using BLASTN 2.2.28. The sequences were deposited in GenBank to obtain accession numbers.

Isolation and Identification of Penicillium Sclerotiorum SNB-CN111
The strain was isolated from a Nasutitermes similis termite aerial nest sampled in Piste de Saint-Elie (N 05°01,838 W 052°44,606) in French Guiana. The strain SNB-CN111 from the strain library collection at ICSN was identified as Penicillium sclerotiorum. A sample submitted for amplification and nuclear ribosomal internal transcribed spacer region ITS4 sequencing allowed for strain identification by NCBI sequence comparison. The sequence has been registered in the NCBI GenBank database (http://www.ncbi.nlm.nih.gov, accessed on 4 September 2013) under registry number KJ023726.

General Cultivation and Extraction Procedure
All strains, including bacteria, were cultivated on solid PDA medium at 26 °C for 15 days, on 3 Petri dishes of 14 cm diameter (150 cm²). On a large scale, the microorganisms were cultivated under identical conditions with 130 Petri dishes of 14 cm diameter (2 m²). The contents of the Petri dishes were transferred into a large container and macerated with EtOAc for 24 h. The organic solvent was collected by filtration under vacuum, washed with water in a separating funnel and evaporated to dryness under reduced pressure.

Extraction of SNB-CN111
P. sclerotiorum was cultivated on 330 Petri dishes (14 cm diameter) at 28 °C for 15 days on potato dextrose agar (PDA) medium (Dominique Dutscher SAS, Brumath, France). The culture medium containing the mycelium was cut into small pieces and macerated three times at room temperature with ethyl acetate (EtOAc) on a rotary shaker (70 rpm) for 24 h. The contents were extracted with 10 L of EtOAc using a separatory funnel. Insoluble residues were removed via filtration, and the organic phase was washed three times with an equivalent volume of water (H2O), dried with anhydrous solid Na2SO4 and then evaporated using a rotary evaporator under reduced pressure at a temperature of 30 °C to yield a crude extract (6.5 g).

LC-MS/MS Analysis
Crude extracts of all SNB-CN strains, cultivated on PDA and extracted as previously described for Penicillium sclerotiorum SNB-CN111, together with fractions from Penicillium sclerotiorum SNB-CN111, were prepared at 1 mg·mL⁻¹ in methanol and filtered on a 0.45 µm PTFE membrane. LC-MS/MS experiments were performed with a 1260 Prime HPLC (Agilent Technologies, Waldbronn, Germany) coupled with an Agilent 6540 Q-ToF (Agilent Technologies, Waldbronn, Germany) tandem mass spectrometer. LC separation was achieved with an Accucore RP-MS column (100 × 2.1 mm, 2.6 µm, Thermo Scientific, Les Ulis, France) with a mobile phase consisting of H2O/formic acid (99.9/0.1) (A) and acetonitrile/formic acid (99.9/0.1) (B). The column oven was set at 45 °C. Compounds were eluted at a flow rate of 0.4 mL·min⁻¹ with a gradient from 5% B to 100% B in 20 min and then 100% B for 3 min. The injection volume was fixed at 5 µL for all analyses. For the electrospray ionization (ESI) source, mass spectra were recorded in positive ion mode with the following parameters: gas temperature 325 °C, drying gas flow rate 10 L·min⁻¹, nebulizer pressure 30 psi, sheath gas temperature 350 °C, sheath gas flow rate 10 L·min⁻¹, capillary voltage 3500 V, nozzle voltage 500 V, fragmentor voltage 130 V, skimmer voltage 45 V, octopole 1 RF voltage 750 V. For ESI, internal calibration was achieved with two calibrants, purine and hexakis(1H,1H,3H-tetrafluoropropoxy)phosphazene (m/z 121.0509 and m/z 922.0098), providing a mass accuracy better than 3 ppm. The data-dependent MS/MS events were acquired for the five most intense ions detected by full-scan MS, from the 200-1000 m/z range, above an absolute threshold of 1000 counts. Selected precursor ions were fragmented at a fixed collision energy of 30 eV and with an isolation window of 1.3 amu. The mass range of the precursor and fragment ions was set as m/z 200-1000.
Isolated compounds from Penicillium sclerotiorum SNB-CN111 fractions were prepared at 0.1 mg·mL⁻¹ in methanol and filtered on a 0.45 µm PTFE membrane. The isolated compounds were analyzed according to the same procedure.
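The data-dependent selection rule described above (five most intense ions, m/z 200-1000, above 1000 counts) can be sketched as follows. This is a minimal illustration of the stated rule and not the instrument firmware logic; the function name `select_precursors` and the example scan are hypothetical, and details such as dynamic exclusion are not modeled because the text does not describe them.

```python
def select_precursors(peaks, top_n=5, mz_min=200.0, mz_max=1000.0, min_counts=1000.0):
    """Pick precursor m/z values from one full-scan MS spectrum.

    `peaks` is a list of (mz, intensity) tuples. Ions must lie in the
    m/z window and exceed the absolute intensity threshold; the `top_n`
    most intense survivors are returned, most intense first.
    """
    eligible = [
        (mz, intensity)
        for mz, intensity in peaks
        if mz_min <= mz <= mz_max and intensity > min_counts
    ]
    eligible.sort(key=lambda peak: peak[1], reverse=True)
    return [mz for mz, _ in eligible[:top_n]]


# Hypothetical full-scan spectrum used only to exercise the rule.
scan = [
    (150.0, 9000), (250.1, 500), (301.2, 1500), (391.1, 8000), (420.2, 3000),
    (515.3, 2500), (610.4, 1200), (720.5, 4000), (1100.0, 7000),
]
selected = select_precursors(scan)
print(selected)
```

In this example the ions at m/z 150.0 and 1100.0 are ignored despite their intensity because they fall outside the acquisition range, and the ion at m/z 250.1 is below the threshold.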

Data Processing and Analysis
The data files were converted from the .d standard data format (Agilent Technologies) to mzXML format using MSConvert software, part of the ProteoWizard package 3.0 (21). All mzXML files were processed using MZmine 2 v51 as previously described [16]. Mass detection was performed with an MS1 noise level of 1000 and an MS/MS noise level of 0. The ADAP chromatogram builder was employed with a minimum group size of 3 scans, a group intensity threshold of 1000, a minimum highest intensity of 1000 and an m/z tolerance of 0.008 (or 20 ppm). Deconvolution was performed with the ADAP wavelet algorithm according to the following settings: S/N threshold = 10, minimum feature height = 1000, coefficient/area threshold = 10, peak duration range 0.01-1.5 min and tR wavelet range 0.00-0.04 min. MS/MS scans were paired using an m/z tolerance range of 0.05 Da and a tR tolerance range of 0.5 min. Isotopologs were grouped using the isotopic peak grouper algorithm with an m/z tolerance of 0.008 (or 20 ppm) and a tR tolerance of 0.2 min. Peaks were filtered using a feature list row filter, keeping only peaks with MS/MS scans (GNPS). Adduct identification, i.e., sodium- or potassium-cationized species, was performed on the peak list with a retention time tolerance of 0.1 min, an m/z tolerance of 0.008 (or 20 ppm) and a maximum relative peak height of 150%. A search for complexes, such as dimers, was performed with a retention time tolerance of 0.1 min, an m/z tolerance of 0.008 (or 20 ppm) and a maximum relative peak height of 150%. Peak alignment was performed using the join aligner with an m/z tolerance of 0.008 (or 20 ppm), a weight for m/z of 20, a retention time tolerance of 0.2 min and a weight for tR of 50. The MGF file and the metadata were generated using the export/submit to GNPS option.
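The paired tolerance "0.008 (or 20 ppm)" used throughout this processing can be read as an absolute and a relative limit applied together. The sketch below assumes that the larger of the two values is used at each m/z, which is our understanding of how MZmine 2 combines them and should be verified against the software documentation; the function names are hypothetical.

```python
def mz_tolerance(mz, abs_tol=0.008, ppm_tol=20.0):
    """Effective tolerance at a given m/z: the larger of the absolute
    and the ppm-based value (assumed combination rule)."""
    return max(abs_tol, mz * ppm_tol / 1e6)


def within_tolerance(mz_a, mz_b, abs_tol=0.008, ppm_tol=20.0):
    """True if two m/z values match within the effective tolerance at mz_a."""
    return abs(mz_a - mz_b) <= mz_tolerance(mz_a, abs_tol, ppm_tol)


# Below m/z 400 the absolute value dominates; above it the ppm value does.
for mz in (200.0, 400.0, 1000.0):
    print(mz, round(mz_tolerance(mz), 4))
```

Under this assumption, the absolute limit governs the lower part of the m/z 200-1000 acquisition range and the 20 ppm limit governs the upper part, with the crossover at m/z 400.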
Molecular networks were calculated and visualized using MetGem 1.3 software [14]. MS/MS spectra were window-filtered by choosing only the top 6 peaks in the ±50 Da window throughout the spectrum. The data were filtered by removing all peaks in the ±17 Da range around the precursor m/z. The m/z tolerance window used to find the matching peaks was set to 0.02 Da, and cosine scores were retained only for spectra sharing at least 2 matching peaks. The number of iterations, perplexity, learning rate and early exaggeration parameters were set to 5000, 25, 200 and 12, respectively, for the t-SNE view.
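The spectral filtering and scoring parameters listed above can be illustrated with a small sketch. It is not MetGem's implementation: it uses a plain cosine on raw intensities with greedy peak matching, ignores any precursor-mass-shifted matching or intensity scaling that the software may apply, and does not cover the t-SNE embedding. The function names and toy spectra are hypothetical.

```python
import math


def filter_spectrum(peaks, precursor_mz, exclusion=17.0, window=50.0, top_k=6):
    """Drop peaks within ±exclusion Da of the precursor, then keep a peak only
    if it ranks among the top_k most intense peaks within ±window Da of itself."""
    kept = [(mz, i) for mz, i in peaks if abs(mz - precursor_mz) > exclusion]
    result = []
    for mz, intensity in kept:
        neighbours = sorted((i for m, i in kept if abs(m - mz) <= window), reverse=True)
        threshold = neighbours[min(top_k, len(neighbours)) - 1]
        if intensity >= threshold:
            result.append((mz, intensity))
    return result


def cosine_score(spec_a, spec_b, tol=0.02, min_matches=2):
    """Plain cosine between two peak lists; peaks match within `tol` Da and
    each peak is used at most once (greedy, highest intensity product first)."""
    candidates = [
        (int_a * int_b, ia, ib)
        for ia, (mz_a, int_a) in enumerate(spec_a)
        for ib, (mz_b, int_b) in enumerate(spec_b)
        if abs(mz_a - mz_b) <= tol
    ]
    candidates.sort(reverse=True)
    used_a, used_b = set(), set()
    dot, matches = 0.0, 0
    for product, ia, ib in candidates:
        if ia not in used_a and ib not in used_b:
            used_a.add(ia)
            used_b.add(ib)
            dot += product
            matches += 1
    if matches < min_matches:
        return 0.0  # too few shared peaks: score is not retained
    norm_a = math.sqrt(sum(i * i for _, i in spec_a))
    norm_b = math.sqrt(sum(i * i for _, i in spec_b))
    return dot / (norm_a * norm_b)


# Toy spectra used only to exercise the functions.
toy = [(100.0, 10.0), (150.0, 5.0), (200.0, 1.0)]
print(cosine_score(toy, toy))
```

Pairs of spectra with fewer than two matching peaks receive a score of zero in this sketch, which mirrors the minimum-matched-peaks criterion stated above.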

X-ray Structure Determination of Compounds 1 and 5
Crystals of compounds 1 and 5 were obtained by slow evaporation at 4 °C. A suitable crystal was selected for each of them, mounted on a nylon loop and fixed with oil. Then, X-ray diffraction and crystallographic data were collected at room temperature using redundant ω scans on a Rigaku XtaLabPro single-crystal diffractometer using microfocus Mo Kα radiation and an HPAD PILATUS3 R 200K detector.
Using Olex2 [50], the structures were readily solved by intrinsic phasing methods (SHELXT [51]) and refined by full-matrix least-squares methods on F² using SHELXL [52]. The nonhydrogen atoms were refined anisotropically, and most of the hydrogen atoms were identified in difference maps and were treated as riding on their parent atoms.
For each structure, the Flack parameter [53] was refined. The determination of the absolute structure was confirmed by using Bayesian statistics on Bijvoet differences [54] based on the Olex2 results.
All the molecular graphics presented here were computed with Mercury 2020.3.0 [55]. Crystallographic data for the two structures (1 and 5) have been deposited in the Cambridge Crystallographic Data Centre database (the deposition numbers are CCDC 2085749 and CCDC 2085750, respectively). Copies of the data can be obtained free of charge from the CCDC at www.ccdc.cam.ac.uk (accessed on 7 July 2021).

Biological Assays
The crude extracts and pure isolated compounds were tested on the human pathogenic microorganisms Candida albicans (ATCC 10213), methicillin-resistant Staphylococcus aureus (ATCC 33591) and Trichophyton rubrum (SNB-TR1). The test was performed in conformance with reference protocols from the European Committee on Antimicrobial Susceptibility Testing [56]. The minimal inhibitory concentration value was obtained after 48 h for C. albicans, 24 h for MRSA and 72 h for T. rubrum. Vancomycin (for bacteria) and itraconazole (for fungi) were used as positive controls.
For cytotoxicity assays, the crude extracts and isolated compounds were tested in triplicate at concentrations of 10 µg·mL⁻¹ and 1 µg·mL⁻¹ in the MRC5 cell line (ATCC CCL-171, Human Lung Fibroblast Cells), following the procedure described by Tempête et al. [57].

Conclusions
We annotated 74 azaphilone analogs, including 49 new molecules. Among them, eight azaphilones were isolated, including three new azaphilones. The structural data are in agreement with the structure predictions from MS/MS data organized in molecular networks. According to the newly established fragmentation pathways, azaphilones can now be efficiently dereplicated in silico in complex mixtures, even if only a small amount of sample is available. Large collections of Penicillium can now be screened for the production of undescribed azaphilones, allowing us to better understand their biosynthetic pathways.
This article represents the first study showing the correct and robust annotation, by molecular network methodology, of more than 70 structurally similar azaphilones in a fungal extract. This result is due, in part, to the similar structure of the azaphilone-like compounds, but also to the highly specific fragmentation pathways during MS2 experiments. It should also be noted that the annotation was facilitated by the fact that many analogs had already been described in the literature.