Spectral Grouping of Nominally Aspergillus versicolor Microbial-Collection Deposits by MALDI-TOF MS

Historical microbial collections often contain samples that have been deposited over extended time periods, during which accepted taxonomic classification (and also available methods for taxonomic assignment) may have changed considerably. Deposited samples can, therefore, have historical taxonomic assignments (HTAs) that may now be in need of revision, and subdivisions of previously-accepted taxa may also be possible with the aid of current methodologies. One such methodology is matrix-assisted laser-desorption and ionization time-of-flight mass spectrometry (MALDI-TOF MS). Motivated by the high discriminating power of MALDI-TOF MS coupled with the speed and low cost of the method, we have investigated the use of MALDI-TOF MS for spectral grouping of past deposits made to the Centre for Agriculture and Bioscience International (CABI) Genetic Resource Collection under the HTA Aspergillus versicolor, a common ascomycete fungus frequently associated with soil and plant material, food spoilage, and damp indoor environments. Despite their common HTA, the 40 deposits analyzed in this study fall into six clear spectral-linkage groups (containing nine, four, four, four, four, and two members, respectively), along with a group of ten spectrally-unique samples. This study demonstrates the clear resolving power of MALDI-TOF MS when applied to samples deposited in historical microbial collections.


Introduction
Historical microbial collections often contain samples that have been deposited over extended time periods, sometimes many decades. Over this time, accepted taxonomic classification (and also the available methods for taxonomic assignment) may have changed considerably. Prior to the 1990s, common methods for taxonomic assignment of fungi were based predominantly upon microscopy and the analysis of morphological features, often coupled with taxonomic keys. Whilst DNA-based methods have increased considerably in significance since, samples deposited earlier may have historical taxonomic assignments (HTAs) that may now be in need of revision, and the subdivision of previously-accepted taxa may also be possible with the aid of current methodologies. One such methodology is matrix-assisted laser-desorption and ionization time-of-flight mass spectrometry (MALDI-TOF MS), which is rapid and relatively inexpensive, and has found widespread use in the characterization and identification of biological samples.
MALDI-TOF MS exploits the simple yet elegant laser-initiated 'MALDI' soft-ionization process [1], which enables the desorption of large proteins into the gas phase without fragmentation. In addition, the MALDI process adds a single positive charge to a significant proportion of the desorbed proteins [2]. This positive charge allows the gas-phase proteins to be accelerated over a short distance by means of

Materials and Methods
The 40 strains used in this study (Table 1) were obtained from the CABI Genetic Resources Collection, a recognized microbial repository, an International Depositary Authority under the Budapest Treaty, and part of the global World Federation for Culture Collections network of public-service culture collections providing authenticated microorganism and reference material to the scientific community. All cultures were grown for 3 days at 25 °C on duplicate Potato Dextrose Agar (Oxoid, Thermo Fisher Scientific, Waltham, MA, USA) plates. Ethanol (≥99.8%), α-cyano-4-hydroxycinnamic acid (HCCA) matrix (≥98%, TLC-grade), LC-MS-grade acetonitrile, and 99% ReagentPlus®-grade trifluoroacetic acid (TFA) were purchased from Sigma (Gillingham, UK). CHROMASOLV™ LC-MS-grade water was purchased from Fluka (Loughborough, UK).
Fungal biomass was mixed with 60 µL of MALDI reagent 1 (11 mg/mL HCCA matrix in 65% (v/v) acetonitrile, 2.5% (v/v) TFA, and 32.5% (v/v) water) using a plastic inoculating loop coated in biomass. Cell lysis and acid-soluble-protein extraction were carried out at room temperature (20 °C), and samples were left for at least one minute before further processing. One microliter of the resulting crude lysate was then pipetted onto the Bruker sample plate, air dried, and loaded into the spectrometer.
Mass spectrometry covering the mass range between 2 kDa and 20 kDa was carried out using a Bruker Microflex LT linear-mode instrument running the MALDI Biotyper 4.0 applications (Bruker Daltonik, Bremen, Germany) as described in Reeve and Seehausen [15]. All spectra are shown baseline-subtracted, smoothed, y-axis-autoscaled, and covering the mass range 2 kDa to 20 kDa (with x-axis scale increments of 2 kDa). Calibration was carried out using the manufacturer's 'BTS' controls (E. coli proteins supplemented with ribonuclease A and myoglobin), using peaks with masses at 3,637. 8 Sample preparations from plate-1 and plate-2 replicates were carried out as described above, from which a database of 40 plate-1 reference spectra was generated. For spectral comparison, plate-2 test samples were compared against the database of plate-1 reference spectra and Bruker identification scores were generated as described in Reeve and Seehausen [15]. In these molecular-weight-based spectral comparisons, Bruker identification scores were derived using the standard Bruker algorithm, which first converts raw mass spectra into peak lists that are then compared between spectra. Three separate values are computed: the proportion of peaks in the reference spectrum that have a closely-matching partner in the test spectrum (value range 0-1), the proportion of peaks in the test spectrum that have a closely-matching partner in the reference spectrum (value range 0-1), and the peak-height symmetry of the matching peaks (value range 0-1). These three values are multiplied together and normalized to 1000, and the base-10 logarithm is then taken to give the final Bruker score (range 0-3). Bruker scores between 2.3 and 3.0 indicate very close relatedness, scores between 2.0 and 2.3 indicate close relatedness, scores between 1.7 and 2.0 indicate intermediate relatedness, and scores below 1.7 indicate low relatedness.
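The score calculation described above can be illustrated with a minimal Python sketch. It follows only the verbal description given here (three values in the range 0-1, multiplied together, scaled to 1000, and log-transformed) and is not the vendor's implementation. The function names, the relative mass tolerance `tol_ppm`, and the clamping of very small products to a score of zero are assumptions introduced for illustration, and the peak-height symmetry value is taken as an input rather than computed.

```python
import math


def matched_fraction(peaks_a, peaks_b, tol_ppm=500.0):
    """Fraction (0-1) of masses in peaks_a with a partner in peaks_b.

    tol_ppm is a hypothetical relative mass tolerance; the tolerance used
    by the real software is not stated in the text.
    """
    if not peaks_a:
        return 0.0
    hits = sum(
        1
        for m in peaks_a
        if any(abs(m - n) <= m * tol_ppm * 1e-6 for n in peaks_b)
    )
    return hits / len(peaks_a)


def log_score(frac_ref, frac_test, symmetry):
    """Combine three 0-1 values into a 0-3 log score.

    The product is scaled to 1000 and the base-10 logarithm is taken.
    Products at or below 0.001 are clamped to a score of 0.0 (an
    assumption made here so that the result stays within 0-3).
    """
    scaled = 1000.0 * frac_ref * frac_test * symmetry
    if scaled <= 1.0:
        return 0.0
    return math.log10(scaled)


# Hypothetical peak lists (masses in Da), for illustration only.
reference = [3000.0, 5000.0]
test = [3000.5, 9000.0]
frac_ref = matched_fraction(reference, test)   # one of two peaks matched
frac_test = matched_fraction(test, reference)  # one of two peaks matched
print(round(log_score(frac_ref, frac_test, 0.8), 3))
```

Under these assumptions a perfect match (all three values equal to 1) gives the maximum score of 3, and a spectrum with no matching peaks gives 0.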
Methods used to undertake morphological identification were based on Klich [33]. Cultures were recovered from preserved stock and three-point inoculations were prepared on 90 mm plates of Czapek Yeast Autolysate Extract Agar (CYA formulation according to Samson and Pitt [34]). The cultures were incubated in darkness at 25 °C for 7 days. Growth rate was then measured and colony colors (upperside and reverse) were recorded. Using a Nikon D40 camera with a DX Nikkor 18-55 mm f/3.5-5.6 G ED II lens and zoom setting at 45, photographs were taken of the top and base of the 7-day culture plates.
Microscopic examination was performed by removing a small quantity of material from the 7-day plates using a sterile needle, mounting on a glass slide in a drop of lactofuchsin stain (0.2 g acid fuchsin, 50 mL glycerol, and 150 mL lactic acid), adding a cover slip, and examining structures at 400× using an Olympus BH-2 microscope. From the features observed, including vesicle diameter and shape, presence/absence of metulae, colony diameter, and the size, shape, color, and ornamentation of conidia, the taxonomic key to the Aspergillus species [33] was used to determine provisional morphological identification.
Figures 1 and 2 show that no spectra were obtained from IMI 129489 (plate 1), IMI 194967 (plate 1), and IMI 96330 (plate 2), and that poor-quality spectra were obtained from IMI 129488 (plate 2) and IMI 194967 (plate 2), which were not included for further analysis. For the remaining 75 samples (94%), peak-rich spectra with good duplication were obtained. Although the HTA for each sample is nominally the same (A. versicolor), there are visible differences between many of the spectra. In order to discriminate at high resolution between the samples, pairwise spectral comparisons were made. Table 2 shows the Bruker scores generated from spectral comparisons between plate-1 reference-sample spectra and plate-2 test-sample spectra, showing all Bruker scores of 2.0 or greater obtained for each test sample unless the highest score was below 2.0, in which case the highest score obtained is shown (indicated in parentheses).
Bruker scores of between 2.300 and 3.000 indicate very close relatedness ('highly-probable species-level identification'), scores between 2.000 and 2.299 indicate close relatedness ('secure genus-level identification and probable species-level identification'), scores between 1.700 and 1.999 indicate intermediate relatedness ('probable genus-level identification'), and scores below 1.699 indicate low relatedness ('no reliable identification').
From the 100 spectral comparisons shown in Table 2, one score of zero was obtained (because IMI 96330 plate 2 failed to generate a spectrum), and three comparisons were obtained with Bruker scores falling below 2.0 (samples 4, 7, and 24). For the remaining 96 comparisons, Bruker scores exceeding 2.0 were obtained, with an average score of 2.319 and a standard deviation of 0.193.
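The score bands listed above amount to a simple threshold lookup, sketched below in Python for illustration. The function name `relatedness` is hypothetical, and the use of 2.3, 2.0, and 1.7 as inclusive lower cut-offs is an assumption consistent with scores being reported to three decimal places.

```python
def relatedness(score):
    """Map a 0-3 Bruker-style score to the relatedness band given in the text.

    The cut-offs 2.3, 2.0, and 1.7 are treated as inclusive lower bounds
    (an assumption; the text lists bands such as 2.000-2.299).
    """
    if score >= 2.3:
        return "very close relatedness"
    if score >= 2.0:
        return "close relatedness"
    if score >= 1.7:
        return "intermediate relatedness"
    return "low relatedness"


# The average score reported for the 96 comparisons exceeding 2.0.
print(relatedness(2.319))
```

Applied to the reported average of 2.319, this places the typical high-scoring comparison in the 'very close relatedness' band.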

Results
The data in Table 2 enable the construction of spectral-linkage groups (SLGs), within which every member is linked to at least one other member by a spectral comparison in Table 2 with a Bruker score exceeding 2.0, and between which no members share a spectral comparison in Table 2 with a Bruker score exceeding 2.0. Six SLGs are apparent from the data in Table 2. In addition to these six SLGs, the data in Table 2 show that there are ten spectrally-unique samples (SUSs) that generated no Bruker score exceeding 2.0 other than in the plate-1 against cognate plate-2 comparison. The ten high-scoring SUSs are IMI 211385, IMI 226507, IMI 314386, IMI 366228, IMI 381617, IMI 381685, IMI 49124, IMI 91859, IMI 91883, and IMI  The remaining three samples either failed to generate a plate-2 test spectrum (IMI 96330) or failed to give spectral comparison scores of greater than 2.0 (IMI 16041 ii and IMI 194967).
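The grouping rule described above is equivalent to finding connected components in a graph whose nodes are samples and whose edges are non-cognate comparisons scoring above 2.0. The Python sketch below illustrates this with a small union-find structure. It is one possible way of implementing the rule, not necessarily how the groups were assembled in this study; the function name, the input format (a dictionary keyed by test-sample and reference-sample labels), and the sample labels A to G are hypothetical and do not correspond to any strain in Table 2.

```python
from collections import defaultdict


def spectral_linkage_groups(scores, threshold=2.0):
    """Group samples linked by non-cognate scores above the threshold.

    scores maps (test_sample, reference_sample) -> score.
    Returns (slgs, sus, unresolved): multi-member groups, singletons whose
    cognate comparison exceeds the threshold, and all other singletons.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (test, ref), score in scores.items():
        root_test, root_ref = find(test), find(ref)
        if test != ref and score > threshold and root_test != root_ref:
            parent[root_ref] = root_test  # merge the two groups

    groups = defaultdict(set)
    for sample in list(parent):
        groups[find(sample)].add(sample)

    slgs = sorted(
        (sorted(g) for g in groups.values() if len(g) > 1),
        key=lambda g: (-len(g), g),
    )
    singles = sorted(s for g in groups.values() if len(g) == 1 for s in g)
    sus = [s for s in singles if scores.get((s, s), 0.0) > threshold]
    unresolved = [s for s in singles if s not in sus]
    return slgs, sus, unresolved


# Hypothetical scores, for illustration only.
example = {
    ("A", "A"): 2.5, ("A", "B"): 2.1, ("B", "B"): 2.4,
    ("C", "C"): 2.6, ("C", "B"): 1.8, ("D", "D"): 1.5,
    ("E", "F"): 2.2, ("F", "G"): 2.05,
}
print(spectral_linkage_groups(example))
```

In this toy input, E, F, and G form one group even though E and G are never compared directly with a score above 2.0, which reflects the 'linkage' character of the grouping; C is a spectrally-unique sample, and D remains unresolved because its cognate comparison does not exceed 2.0.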
In order to assess visually the spectral consistency within each SLG, Figure 3 shows the MALDI-TOF MS spectra of acid-soluble fungal proteins from the plate-2 samples comprising SLG 1 to SLG 6, along with the ten SUSs. Figure 3 shows a considerable degree of spectral variation in the SUSs; good consistency between spectra within SLGs 1, 2, 5, and 6; and some variation (mainly due to additional peaks) within SLGs 3 and 4. Figure 4 shows, for ready visual comparison between the six SLGs, MALDI-TOF MS spectra of acid-soluble fungal proteins from plate-2 examples of each of the six SLGs.

Table 3 shows the Bruker scores for spectral comparison against the Bruker database of filamentous fungal samples for the SLGs and SUSs, where Bruker scores between 1.700 and 1.999 ('probable genus-level identification') are indicated in parentheses, and Bruker scores below 1.699 ('no reliable identification') are indicated by strike-through. The results of independent and blind post-MALDI-TOF MS taxonomic-key-based morphological identifications at seven days (see Figure S1 for example images) are also given for comparison.

Discussion
Motivated by the high discriminating power of MALDI-TOF MS coupled with the speed and low cost of the method, we have investigated the use of MALDI-TOF MS for spectral grouping of past deposits made to the CABI Genetic Resource Collection under the HTA A. versicolor. Despite their common HTA, the 40 deposits analyzed fall into six clear SLGs (SLGs 1, 2, 3, 4, 5, and 6, with nine, four, four, four, four, and two members respectively), along with a group of ten high-scoring SUSs.
Comparison between the spectra obtained and the Bruker database has been carried out but, whilst the grouping of samples on the basis of spectral similarity is clear, the results shown in Table 3 should not be interpreted as definitive taxonomic classifications for the samples analyzed in this study, as relatively few species of Aspergillus are represented in the Bruker database spectra. Those  westerdijkiae. In addition, spectra found in the Bruker database will have been generated following sample-preparation methods other than the method used in the present study, which may reduce the scores obtained from spectral comparisons. Bearing these caveats in mind, however, the spectral comparisons do reveal a number of interesting observations. Firstly, members of SLG 2 show a consistent 'identification' of A. ustus (supported by taxonomic-key-based morphological identifications) rather than A. versicolor as is the case for the remaining five SLGs, with IMI 360877, IMI 360878, and IMI 360879 matching most closely to A. ustus database-entry DSM 1349 DSM and IMI 360880 matching most closely to A. ustus database-entry DSM 63535 DSM. Secondly, whilst SLG 1 is the largest SLG observed in this study, in contrast to all of the other SLGs, the Bruker scores against database entries for members of SLG 1 all fall below 1.7, suggesting that members of this SLG, whilst closely related to each other, are spectrally the most remote from any of the Bruker database entries. Thirdly, all members of SLG 4 show a consistent 'identification' of A. versicolor 2009_137364 MUZ, with higher Bruker scores that suggest closer relatedness of these SLG members to the database entry. Fourthly, higher Bruker scores against database entries are again observed in SLG 6, where the highest 'identification' scores are to A. versicolor database-entry F51 LLH.
Finally, the SUS IMI 91859 is 'identified' as Penicillium italicum DSM 2754NT DSM, which is supported by the different growth morphology observed for this strain on the agar plates prior to sampling and by taxonomic-key-based morphological identification. As such, it is clear that this isolate was misidentified at the time of its deposit in the CABI collection.
It is clear from the above that the HTA given to the 40 strains used in this study (A. versicolor) covers a wide range of subtypes that can be grouped on the basis of MALDI-TOF MS spectra obtained after growth in culture followed by a simple and inexpensive method of sample preparation. MALDI-TOF MS, therefore, offers a rapid and inexpensive method for the classification of past deposits made to microbial collections and has great potential alongside complementary methodologies based upon morphological assessment [27,33] and nucleic-acid analysis [27,29-32] to assist taxonomists working with deposits made under HTAs that now may be in need of further revision or clarification.