A Preliminary Study to Classify Corn Silage for High or Low Mycotoxin Contamination by Using near Infrared Spectroscopy

Mycotoxins should be monitored in order to properly evaluate corn silage safety quality. In the present study, corn silage samples (n = 115) were collected in a survey, characterized for concentrations of mycotoxins, and scanned by a NIR spectrometer. Random Forest classification models for NIR calibration were developed by applying different cut-offs to classify samples for concentration (i.e., μg/kg dry matter) or count (i.e., n) of (i) total detectable mycotoxins; (ii) regulated and emerging Fusarium toxins; (iii) emerging Fusarium toxins; (iv) Fumonisins and their metabolites; and (v) Penicillium toxins. An over- and under-sampling re-balancing technique was applied and performed 100 times. The best predictive model for total sum and count (i.e., accuracy mean ± standard deviation) was obtained by applying cut-offs of 10,000 µg/kg DM (i.e., 96.0 ± 2.7%) or 34 (i.e., 97.1 ± 1.8%), respectively. Regulated and emerging Fusarium mycotoxins achieved accuracies slightly less than 90%. For the Penicillium mycotoxin contamination category, an accuracy of 95.1 ± 2.8% was obtained by using a cut-off limit of 350 µg/kg DM as a total sum or 98.6 ± 1.3% for a cut-off limit of five as mycotoxin count. In conclusion, this work was a preliminary study to discriminate corn silage for high or low mycotoxin contamination by using NIR spectroscopy.


Introduction
Silage is produced to preserve forage with high moisture content by controlled fermentation and represents an important feed for cattle. Corn (Zea mays L.) is the most widely grown crop for silage [1,2] mainly due to its high-yielding, consistent source of forage, high digestibility, good animal palatability, and ability to be successfully ensiled. As a result, most dairy farms in northern Italy have adopted a cropping plan mainly based on the cultivation of corn hybrids for whole-crop plant production, thus increasing the self-sufficiency of the energy requirements [2] The microbiota in ensiled mass changes during conservation due to modification of temperature, humidity, oxygen availability, and pH conditions that normally occur in different ensiling stages. Further, poor storage conditions can also lead to undesirable mound contamination, mycotoxin production, and a reduction in nutritional value [3,4].
In particular, mycotoxins are secondary metabolites produced by fungi mainly belonging to Aspergillus, Fusarium, Alternaria, and Penicillium strains [5,6]. Corn silage can be contaminations from Aspergillus and Fusarium are also investigated via the sensor-based electronic nose. Near-infrared spectroscopy (NIR) and an electronic tongue and electronic nose have been at the forefront of quality control technologies in the food industry due to their real-time data processing capabilities [32]. As an example, the use of an electronic nose was recently applied for assessing aflatoxin B1 and fumonisins contamination in 316 and 229 corn samples, respectively, achieving the best classification result with the artificial neural network. The aim was to investigate whether the electronic nose was capable of grouping samples contaminated at levels above or below the legal limits, achieving correct estimation for aflatoxin B1 in about 77% of samples and 78% of correct classifications for Fumonisins [33].
The first application of infrared spectroscopy to the analysis of microorganisms can be traced back to the 1950s [34,35]. Moreover, in the 1980s, papers on the use of NIR to detect fungal contamination [36] or for quantifying mycotoxins in barley, corn, and wheat [37] were published. More recently, a linear discriminant analysis was applied to determine Aspergillus spp. contamination levels in peanuts by comparing NIR spectroscopy and an electronic nose [38]. Other studies conducted on brown rice samples artificially inoculated with Aspergillus flavus and A. parasiticus strains of fungus, demonstrated that NIR and MIR technology had the potential to simultaneously detect aflatoxin B1, B2, G1, and G2 while achieving good predictive accuracy for both NIR (R 2 = 0.936-0.973, RPD = 2.5-4.0) and MIR spectroscopy (R 2 = 0.922-0.970, RPD = 2.5-4.0) [39]. Therefore, Infrared (IR) spectroscopy can be used to quickly assess the hygiene quality of the feeds, other than their chemical composition. In particular, the IR spectroscopy is rapid, non-destructive, versatile, often applicable on-field, and requires no chemicals. All these characteristics have led to its use in multiple applications [40].
As reported by Min and Cho [41], spectroscopic techniques have proven to be alternative tools for the early detection of mycotoxins in agricultural products, and their advantages over conventional invasive methods are related to their rapidity and nondestructive characteristics. Because of the deleterious health effects of mycotoxins, the quick monitoring of animal feed contamination can avoid economic losses while preserving animal health. The use of this technology favors quick feedback in the field, and a larger and more frequent number of analyses may be performed, ensuring greater safety in the use of the cereal in both animal and human nutrition [42]. Moreover, the use of so-called "rapid methods" is highly relevant for improving knowledge on the presence and distribution of mycotoxins in food and feed. NIR analysis represents one of the most promising tools for farm screening silage mycotoxin tools as affirmed by Cheli et al. [18].
Quantification models can lack the sensitivity necessary to determine mycotoxin contamination, due to the small size of mycotoxins molecules [25] present in low concentrations in the range of ppm (mg/mL) or a few parts per billion (ppb; ng/mL), and infrared spectroscopy currently is not always sufficiently sensitive for this quantitative purpose [37].
According to previous reports describing the strengths and criticalities of the application of NIR spectroscopy in the quantification or discrimination of mycotoxins, an alternative approach was proposed to develop qualitative discriminant models for the classification of mycotoxins [19,20]. Due to the difficulty of the NIR spectrometer to develop robust regression models for the quantification of individual mycotoxins, and in relation to the fact that silages are usually subject to co-contamination of these toxins, the aim of the current study was therefore to obtain qualitative models able to discriminate silage for total concentrations or total counts of groups of mycotoxins. In order to rank silage based on mycotoxins co-contamination, we calibrate a NIR apparatus using the Random Forest (RF), a machine learning classification technique.

Results
The acquired spectra were assessed using Principal Component Analysis (PCA) to identify putative outliers. No samples had values both outside the F-residuals and Hotelling's T2 threshold values. Consequently, no spectrum was removed (Figure 1a).

Results
The acquired spectra were assessed using Principal Component Analysis (PCA) to identify putative outliers. No samples had values both outside the F-residuals and Hotelling's T2 threshold values. Consequently, no spectrum was removed (Figure 1a).  The samples included in calibration were characterized for chemical, fermentative, and organoleptic traits and mycotoxin contamination. Chemical and fermentative traits are reported in Table 1 and data were adapted by Gallo et al. [20]. The variables included in the NIR calibration were the count and sum (i.e., concentration) of different secondary metabolites produced by some fungal organisms. The NIR calibration concerned total counts of mycotoxins and the sum of single mycotoxin concentrations of mycotoxigenic fungi species such as Aspergillus, Fusarium, Penicillium, and Alternaria. Other variables included in calibration were regulated and emergent Fusarium-produced mycotoxins, only emerging Fusarium-produced mycotoxins, Fumonisins mycotoxins produced by fungi of the Fusarium genus, and lastly, Penicillium produced mycotoxins.
The contamination levels of the original 115 corn samples are reported in Table 2. The samples included in this study rank between low-l to high-contamination levels from a minimum of 14 to a maximum of 45 types of mycotoxins detected in the 115 corn silages, with a mean of 26.20 ± 6.42 and a 25% and 75% quartile of 21.50 and 30.50, respectively. Moreover, the average concentration of total mycotoxins was 5895.70 ± 7252.46 µg/kg DM. The mean concentration of emerging Fusarium toxins was 2453.83 ± 3571.47 µg/kg DM. Conversely, the mean concentration of regulated and emerging Fusarium toxins was 4781.04 ± 6539.44 µg/kg DM, whereas the counts of this mycotoxin category were 15.33 ± 3.61. Concerning all fumonisins mycotoxins, the range of concentration values was 16,104.84 µg/kg DM for a maximum of eight co-occurrence of these, and high fumonisins occurrences were observed for non-regulated FB3, FB4, FA1, and masked forms of FA1. Penicillium toxins ranked between a minimum of 2.12 to a maximum of 1286.49 µg/kg DM. Ochratoxin A was not detected in any samples, and mycophenolic acid and/or its metabolite mycophenolic acid IV was found in only five samples.

Calibration Model Results
In Table 3, we reported the performances of the total sum and mycotoxin count calibrations for each cut-off value for the balanced validation dataset. The average accuracy ranked from 82.2 ± 5.9%, using a cut-off of 4000 µg/kg DM, to 96.0 ± 2.7%, adopting the highest cut-off value of 10,000 µg/kg DM. Sensitivity and specificity were >80%, independently, if the threshold applied to discriminate the two classes was 4000, 7000, or 10,000 µg/kg DM. The lowest sensitivity and specificity (80.2 ± 8.5% and 81.3 ± 8.0%, respectively) were obtained using 4000 µg/kg DM as a threshold, whereas the best model performance was achieved using 10,000 µg/kg DM as a threshold. Table 3. Near-infrared spectroscopy calibration parameters from the prediction set on the total sum (µg/kg DM) and count (n) of Mycotoxin class 1 . In regard to the total count of mycotoxins in corn silage samples, the best model was achieved using 34 co-occurring mycotoxins to discriminate the high vs. low classes. In this model, misclassification of the upper contamination was negligible with 0.03 ± 0.30 samples wrongly attributed to class 1 and 2.7 ± 1.7 to class 2.

TOTAL SUM of Mycotoxins
In Table 4, we report all the calibration models related to regulated or emerging Fusarium-produced mycotoxin (i.e., R&E-Fusarium toxins). Inferior performances were achieved using cut-off limits of 1500 µg/kg DM and 13 for sum or count, respectively. Accuracy levels were lower than 80% in these two models, although we recorded an improvement in ranking samples within the class with contamination levels of 2500 µg/kg DM for the Fusarium-mycotoxin contamination sum and 15 for the Fusarium-mycotoxin count. In both cases, the sum and count NIR calibration showed a relatively narrow confidence interval (95%) of about 77.9 ± 4.7%, 93.2 ± 2.9% and 71.1 ± 5.6%, 90.1 ± 3.8%, respectively. Table 4. Near-infrared spectroscopy calibration parameters from the prediction set on the sum (µg/kg DM) and count (n) of the Regulated and Emerging Fusarium-toxins class 1 . Regarding mycotoxins that have no specific regulations (Table 5), satisfactory calibration models for sum parameters were developed using the three different thresholds of 700, 1000, and 1200 µg/kg DM; however, in the latter case, the accuracy achieved was~90% with a 95% CI of 83.2 ± 3.8%, 95.4 ± 2.2% and a precision of 94.9%. Similarly, calibration for a count of emerging Fusarium-mycotoxin showed good performances when the limit of class determination was eight co-occurring emerging mycotoxins. In this case, only 3.9 ± 2.2 low-contamination samples were erroneously attributed to the high-contamination category.

SUM of R&E-Fusarium Toxins
Concerning the fumonisins sum NIR calibration (Table 6), the average level of accuracy was 88.3 ± 4.3% with a precision average of 92.4%. In the same classification, the allocation by the model to class 2 achieved, on average, a precision of 84.6%.
Regarding the same mycotoxin contamination parameter, setting 6 as the classification limit allowed us to obtain a good classification RF model. The accuracy was 92.8 ± 3.3%; a few samples (i.e., 1.0 ± 1.4) were misclassified in class, with low co-occurrence contamination, and the specificity for the high level was 97.6 ± 3.1%. Table 5. Near-infrared spectroscopy calibration parameters from the prediction set on the sum (µg/kg DM) and count (n) of Emerging Fusarium-toxins class 1 .  Table 6. Near-infrared spectroscopy calibration parameters from the prediction set on the sum (µg/kg DM) and count (n) of fumonisins mycotoxin class 1 . The last calibration was about Penicillium-produced mycotoxin concentration with the count reported in Table 7. The accuracy values of both threshold contaminations of 250 and 350 µg/kg DM ranged from 90 to 95%, with an average of 92.9% and 95.1%, respectively. Better results were obtained both for sensitivity and specificity using 350 µg/kg DM concentration as the class limit (i.e., 91.0 ± 4.9% and 99.7 ± 1.6%, respectively). Table 7. Near-infrared spectroscopy calibration parameters from the prediction set on the sum (µg/kg DM) and count (n) of Penicillium mycotoxin class 1 . Adopting three or five as a cut-off in terms of count class division increased the accuracy by 18.3% with no misclassified samples in class 1.

Discussion
The assessment of multiple mycotoxin contamination in feedstuffs and total mixed rations should be considered when formulating dairy cow diets, albeit those mycotoxins may be below regulatory limits [43]. In the current manuscript, we developed different NIR calibrations to provide a tool to rapidly assess mycotoxin co-contamination risk to animals, with a special focus on corn silage total contamination. The mean values of nutritional composition or fermentative traits were typical for corn silage produced in this geographical area [20]. The mycotoxins included in the calibration belong to the main mycotoxigenic fungi Aspergillus spp., Fusarium spp., and Penicillium spp. which are prevalent in corn silage [44]. In the current manuscript, we aggregated mycotoxins produced by the same fungal genera and quantitatively measured them using the LC-MS/MS reference method.
In order to have more information regarding the specific mycotoxins that were detected in our samples, we referred Gallo et al. [20] for details. The proposed calibration aims to develop a rapid and non-destructive method to classify corn silage into low or high contamination classes, representing a qualitative clustering technique.

Mycotoxin's Contamination Occurrences and the NIR Calibration Approach
In particular, Aspergillus produced the mycotoxins rugulusovin or brevianamide F, as detected in most of the corn silages, with an incidence >88.9%. Despite these compounds being assigned to this class, they represent the unspecific diketopiperazine produced by many microbial species among several Aspergillus strains. No aflatoxins or nigragillin were detected, but emodin was detected at low concentrations in more than 50.0% of samples. Kojic acid was found in most of the corn silages, and incidences lower than 21% were calculated for 3-nitropropionic acid, averufin, bis(methylthio)gliotoxin, and asperphenamate [20]. Fusarium produced mycotoxins were quantified and the regulated FB1 and FB2 were detected in the majority of the corn silages with occurrence values higher than 95.5%; high fumonisin occurrences were also observed for non-regulated FB3, FB4, FA1, and masked forms of FA1, their incidences being higher than 54.2%. The average concentrations of FB6 and FA2 results were lower than 12.4 µg/kg DM, and ZEA was detected in all samples with incidences ranging from 22.2% to 100.0%. No trichothecenes type-A, regulated T-2, and HT-2 toxins were detected; trichothecenes type-B and DON were detected as well as nivalenol and deoxynivalenol-3-glucoside, but at low concentrations and low incidences. Other than regulated Fusarium produced mycotoxins, siccanol, moniliformin, equisetin, epiequisetin, and bikaverin were detected in the majority of analyzed corn silages. Finally, enniatins were detected in some samples, while Beauvericin was quantified in the majority of corn silages with an incidence of >83.3%. Regarding Penicillium produced mycotoxins, ochratoxin A and barceloneic acid were not detected in any samples, whereas 7-hydroxypestalonic, pestalotin, oxaline, phenopyrrozin, and questiomycin A were detected at low concentrations (<50.0 µg/kg DM) in several corn silages, with incidences higher than 60%. Mycophenolic acid, as well as its metabolite mycophenolic acid IV, were detected only in two samples.
Since corn silage could be contaminated by a high number of regulated and emerging mycotoxins, it was decided to calibrate the NIR spectrometer in relation to their concentration or count. The samples included in the study were randomly collected from livestock production realities representative of the intensive dairy farm system in Italy and consequently, their distribution in class was biased towards low levels of contamination. One of the major obstacles in mycotoxin modeling is related to mycotoxin monitoring data often being unbalanced towards low mycotoxin concentrations [45]. To obtain more robust and accurate calibrations, a re-balancing approach was adopted to obtain class-balanced distribution. Re-balancing has been previously adopted in NIR calibration using different mathematical approaches, but ultimately by under-sampling the majority class and/or generating a new representative of the minority class [46][47][48]. Moreover, a recent study evidenced that the combination of oversampling and downsampling techniques performed better than using exclusively one or the other [49].
Here, we applied the "smote and undersample" algorithm to oversample the minority class by SMOTE while undersampling the majority class, as this method was specifically conceived to handle extremely imbalanced data [50]. The RF classification model was then applied as it has been successfully applied by others to classify single corn kernels with aflatoxin contamination from NIR spectra [51]. To evaluate real model performances not related to the possible random good distribution of samples, the process was repeated 100 times and the results showed were the mean of these repetitions.

NIR Calibrations
The model developed in this study allowed us to obtain a good calibration in terms of accuracy, sensitivity, and specificity thanks to its ability to balance errors in datasets where classes are not equally distributed [52,53]. Such good performances were obtained thanks to the correlation between the NIR spectra for various functional groups and major fungal constituents.
Fusarium mycotoxins are primarily produced by F. proliferatum and F. verticillioides and FB1 and FB2 appeared to be the most predominant among fumonisins [54]. FB1 and FB2 were detected in almost all samples included in this calibration; FB6 and FBA2, on the contrary, showed low incidences in our data. The ability of the NIR spectrometer to see Fusarium mycotoxins was supported by previous relatively good qualitative prediction using NIR spectrometer obtained for Fusarium verticillioides fungal infection, ergosterol, and Fumonisin B1 content with model performances from R 2 = 0.78 for F. verticillioides and Fumonisin B1 to R 2 = 0.81 for ergosterol content [53].
Fumonisin contamination in corn silage is usually ascribed to pre-harvest crop conditions [4,18], but management strategies before harvest, can increase Fumonisin contaminations. Here we see the importance of discriminating corn silage based on total Fumonisin mycotoxin contamination. This kind of forage is often infected by mycotoxin-producing Fusarium fungi, first among all DON and ZEN but also their modified forms. In fact, positive correlations were established between concentrations of the co-occurring mycotoxins and their modified forms, and to avoid underestimation, it is necessary to also quantify modified mycotoxins; more than half of all forage corn samples (i.e., 57%) included in the study were co-contaminated with DON, ZEA, and other modified forms because DON and ZEN are produced by the same Fusarium species (F. graminearum and F. culmorum) [55]. Moreover, the great accuracy in predicting total contamination of mycotoxin in terms of count or concentrations was supported by comparative studies using NIRS and imaging methods and concluding that NIR has good recognition of heavily mold-infected and lower infected kernels [56]. Further, the proliferation level of these toxic compounds in forage was strictly related to milk in its chemical composition with an accumulation of sphingolipids, together with purine and pyrimidine derivatives, in bad quality corn samples [57], which supports the importance of quickly determining mycotoxin corn silage contamination.
The performances achieved for fumonisin mycotoxin contamination were in line with those reported by Levasseur-Garcia et al. [58] on 117 corn samples collected in Italy, Denmark, France, Hungary, The Netherlands, and Poland with a percentage of wellclassified samples of 96%. Similar predictive capacities of NIR towards contamination by both fumonisins and ZEA were found by Tyska et al. [59]. In this study, a total of 676 Brazilian corn samples were employed to calibrate the NIR using 236 samples for FBs and 440 for ZEN contamination, achieving an R 2 of 0.899 and 0.984, respectively.
The complex mixture of mycotoxins in silages can originate from pre-harvest mold contamination, in particular for Fusarium spp., as well as from post-harvest contamination with toxins produced by fungal species such as Aspergillus and Penicillium [4]. Because of Penicillium's ability to grow in silage conditions (i.e., low oxygen and high carbon dioxide concentrations, low temperatures, and high concentrations of volatile fatty acids), the determination of these fungi mycotoxins is particularly important. Importantly, P. roqueforti is the most frequently occurring toxicogenic fungal species in Europe [60]. Despite the concentration in samples included in the present study being lower than levels associated with animal disorders, the classification risk of Penicillium toxins is important to evaluate the co-contamination in corn silage.
In all calibrations, the ability of NIR to differentiate between low contamination and high contamination toxins was greater using higher threshold classes both for the count and for the sum. NIR spectroscopy was able to provide information about the chemical functional groups in the molecules [37]. A fungal attack damages tissues and cells so the changes in corn silage properties, such as protein, carbohydrates, and lipid content, were related to changes in fungal contamination as reflected in their spectral signature [61]. Adopting a greater level of cut-off allows for better sorting between low and high contaminated corn silages.
Critically, the current NIR calibrations were proposed as practical tools useful to classify corn silage by their contamination by developing qualitative predictive models. Both mycotoxins and fungi cannot be directly detected by NIR spectroscopy due to the lack of sensitivity of this method, but indirect information regarding the contaminations could be noted, e.g., changes related to alterations due to fungal attack are visible using the NIR system [62]. Another strength of this method is the possibility of performing simultaneous analysis of the different types of mycotoxins arising from the contamination of different fungi. To improve the proposed method, and make it more robust and applicable, it is necessary to expand the initial database of 115 samples. Additionally, the use of Vis/NIR spectroscopy can further improve the calibrations [63]. Functional groups (CH, NH, and OH) in chemical components could exhibit characteristic absorption in the Vis/NIR region which would result in significant changes in specific spectral bands during fungal growth. These changes could be reflected by Vis/NIR spectrum and NIR hyperspectral imaging (HSI) as reported by Mishra et al. [64].

Practical Application
In order to better explain the potentialities of our method, we report some recent studies published in the last three years showing the great practical-applicative potentialities of the use of NIR in monitoring mycotoxin contamination in several matrices used in animal or human nutrition in Table 8. In particular, the results obtained in our study were in line with the classification studies reported below [65][66][67][68][69][70][71]. The study reported here gave a practical application of NIR calibration in order to underline the importance of this type of approach as a practical tool to assess the safety of silage. For a deep overview of the previous studies using NIR for the prediction of mycotoxin in different feeds and food, please refer to Levasseur-Garcia [37].

Conclusions
NIR spectroscopy has several advantages with respect to traditional analytical methods, mainly due to its capability of being a rapid, non-destructive, and economical tool capable of estimating several chemical-biological parameters simultaneously. Its application to classify or estimate fungal and mycotoxin contamination has received more and more attention, however, the very low concentration levels of mycotoxin could limit the effective detection and quantification of corn silage contamination. The NIR method developed in the current paper was based on the detection of the sums or counts of a different group of mycotoxins, while not focusing on a specific fungi toxin. Coupled with the contamination class re-balancing step, this method produced good calibration models in terms of accuracy, sensitivity, and specificity and appears to be a suitable screening method to provide rapid information regarding silage mycotoxin contamination. Samples belonging to a high contaminated class should be successively analyzed by conventional methods in order to assess the real risk of mycotoxin contamination for regulated animals in particular.

Sample Collection, Preparation, and Analysis
A total of 115 corn silage samples were collected in a survey from dairy farms located in the Po Valley (Italy) and Sardinia. These farms were randomly selected and visited over the years 2017-2019. The collection was conducted by adopting a sampling procedure based on methods described by Undersander et al. 2005 from horizontal bunker silos. The wet sample was about 2 kg and was sampled from at least four different points of each bunker feed out face, placed in a clean plastic jar, hermetically closed, and analyzed as reported below.
Each sample was dried at 60 • C in a ventilated oven until achieving a constant weight, then milled through a 1-mm and 0.5-mm screen using a laboratory mill (Thomas-Wiley, Arthur H. Thomas Co., Philadelphia, PA, USA), and stored for analysis. The 1-mm milled aliquot was intended for chemical, fermentative, and mycotoxin analysis, while the 0.5-mm milled aliquot was intended for NIR analysis in order to remove spectral noise caused by particle size [72,73].
All 1-mm corn silages sampled were characterized for the presence and quantification of fungal metabolites by using LC-MS/MS at the Department of Agrobiotechnology according to Sulyok et al. [74].
These authors validated a Liquid Chromatography coupled to a tandem Mass Spectrometry (LC-MS/MS) based method to quantify more than 500 secondary microbial metabolites thanks to its high selectivity, sensitivity, robustness, and multi-analyte capability that allow for the determination of a large number of analytes simultaneously.
The analytical procedure consists of weighing 5 g of samples and the subsequent extraction with 20 mL acetonitrile/water/acetic acid (79:20:1, v/v/v) for 90 min on a rotary shaker (GFL, Burgwedel, Germany) for 90 min at room temperature in a horizontal position. A volume of 500 µL of the extract was diluted with 500 µL of dilution solvent composed of acetonitrile:water:acetic acid (20:79:1, v/v/v) in vials. Then, 5 µL was injected into an LC-MS/MS system with QTrap 5500 MS/MS (Sciex, Foster City, CA, USA) coupled with an Agilent 1290 series UHPLC system (Agilent Technologies, Waldbronn, Germany). Chromatographic separation was performed at 25 • C on a Gemini C18 column (150 × 4.6 mm i.d.,5 µm particle size) equipped with a 4 × 3 mm precolumn with the same characteristics (Phenomenex, Torrance, CA, USA). The eluents used were composed of methanol/water/acetic acid (10:89:1, v:v:v) as eluent A and methanol/water/acetic acid (97:2:1, v:v:v) as eluent B. Scheduled multiple reaction monitoring was used for analyte detection. Stock solutions of each reference standard of mycotoxin and fungal metabolite were prepared by dissolving the substance, at 250 µg/mL, in acetonitrile with a few compounds dissolved in acetonitrile/water 1:1 (v/v), methanol, or water instead.
Sixty-two intermediate mixes were prepared by mixing the stock solutions of 10 analytes each, but the final multi-analyte standard was freshly prepared prior to spiking experiments by mixing the intermediate mixes. All solutions were stored at −20 • C. External calibration using a serial dilution of a multi-component stock solution was used for the quantification of the mycotoxins. The accuracy of the method was verified by the authors on a continuous basis by participating in ring trials with a current success rate (z-scores between −2 and +2) of approximately 95% for over 1400 results submitted. Moreover, the ion ratio had to agree with the corresponding values of the standards within 30% whereas, for the retention time, a stricter criterion of ±0.03 min was applied in this study.
Samples were also characterized for chemical and fermentative traits, as previously reported by Gallo et al. [20]. The DM was determined by gravimetric loss of free water by heating at 105 • C for 3 h (Association of Official Analytical Chemists or AOAC 1995, method 945.15); ash was determined as a gravimetric residue after incineration at 550 • C for 2 h (AOAC 1995, method 942.05), and ether extract (EE) was obtained following the method 920.29 of AOAC (1995). The crude protein (CP, N × 6.25) was determined using the Kjeldahl method (AOAC 1995, method 984.13). The soluble fraction of CP (expressed on a DM basis) was determined according to Licitra et al. [75]. Neutral detergent fiber (NDF), acid detergent fiber (ADF), and lignin (ADL) were determined using the AnkomII Fiber Analyzer (AnkomTechnology Corporation, Fairport, NY, USA) according to the method described by VanSoest et al. [76]. The NDF analysis utilized a neutral detergent solution containing sodium sulfite and a heat-stable amylase (activity of 17.400 Liquefon units/mL, Ankom Technology, Fairport, NY, USA). NDF, ADF, and ADL contents were corrected for the residual ash content. Starch was measured by polarimetry (Polax 2L, Atago ® , Tokyo, Japan). The disappearance of NDF after 24 h of rumen incubation (24 h NDFD) was measured in situ by incubating nylon bags in the rumen of two cannulated dairy cows for 24 h [77]. Regarding fermentative parameters, 50 g of wet samples were extracted with a Stomacher blender (Seward Ltd., West Sussex, Worthing, UK) for 3 min in distilled water (water:sample fresh weight ratio: 3:1). After filtering using gauze, an aliquot was centrifuged at 3000 r.p.m. for 15 min at room temperature. A liquid phase added with an internal standard (i.e., pivalic acid) was injected into a gas chromatographic-flame ionization detector (GC/FID) system that was equipped with a capillary column DB-WAX UI (60 m × 0.250 mm; 0.25 µm; Agilent Technologies S.p.A., Milano, Italy). Lactic acid was instead determined using HPLC [78].
A Foss NIRSystem 5000 spectrometer (Foss 5000 NIR systems; Foss Electric, York, UK) was used to collect near-infrared (NIR) spectra. The instrument had a scanning range from 1100 to 2500 nm and diffuse reflectance NIR spectra were obtained with a 2 nm step acquisition (i.e., 700 variables) from 2-3 g of samples ground at 0.5 mm. Ground samples were scanned using a standard ring cup of the FOSS NIR instrument, a small round quartz cup with a 3.75 cm diameter, as a sample holder.
Each sample was scanned twice and then an average of the two acquisitions was used for the present calibration (Appendix A).

Discriminant Cut-Off Limits Applied to Mycotoxin Groups
To compare different classification approaches, three different cut-off levels were adopted related to each mycotoxin contamination in order to assign for each sample a high or low level of contamination with respect to these values. With this aim, a high contamination class and a low contamination class were created with samples that had values of mycotoxin infection higher or lower to identified cut-off limits.
The mycotoxin contamination parameters ( Table 2) that were used were (i) the total concentration (i.e., µg/kg dry matter or DM) and total count of detectable mycotoxins (i.e., Aspergillus toxins, Alternaria toxins, zearalenone and its metabolites, trichothecenes type B, fumonisins and their metabolites, enniatins, beauvericin, other emerging Fusarium toxins, penicillium toxins, and other fungi toxins and unspecified fungi toxins); (ii) regulated (i.e., fumonisins B1 and B2, deoxynivalenol, and zearalenone, being those detected in the samples) and emerging Fusarium toxins called R&E-Fusarium toxins count and sum; (iii) only emerging Fusarium toxins called E-Fusarium toxins count and sum; (iv) Fumonisins and their metabolites; and (v) Penicillium toxins count and sum as reported in a previous work by Gallo et al. [20].
In particular, for the total sum of mycotoxin, the cut-off limits were arbitrarily set to 4000, 7000, or 10,000 µg/kg dry matter (DM), and counts were set to 28, 31, or 34. Regulated and emerging Fusarium-produced mycotoxins had 1500, 2000, or 2500 µg/kg DM as sum cut-off limits while 13, 14, or 15 were set as count limits. Only emerging Fusarium-produced mycotoxins were distinguished with high or low contamination levels with limits of 700, 1000, or 1200 µg/kg DM and 6, 7, or 8. Fumonisin sum and count classes were identified with 400, 700, or 1000 µg/kg DM and 4, 5, or 6 levels, respectively.
Finally, the Penicillium-produced mycotoxin classes were identified with contamination levels greater or less than 150, 250, or 350 µg/kg DM and 3, 4, or 5. For each cut-off limit, samples that have mycotoxin concentration values or counts below the threshold are classified as class 1, or if higher than the used threshold, as class 2.

Outlier Spectra Detection, Re-Sampling Procedure, Development, and Evaluation of Classification Models
PCA was used to analyze spectra and detect outliers according to the influence plot, based on Hotelling's T2 statistic with a 99% confidence interval [79]. This statistical approach is used as a diagnostic tool for out-of-scope sample detection during multivariate model development [80]. The acquired spectra were processed by Unscrambler X version 10.5.1 (CAMO Software AS, Oslo, Norway).
The thresholds previously specified created an imbalanced class distribution with most of the samples in the lower contamination group, whereas few samples were labeled in the high contamination class, characterized by a higher risk of the presence of mycotoxins.
To provide comparable size between classes in the data, we applied the Synthetic Minority Over-sampling Technique (SMOTE) algorithm [81] as implemented in the 'smote_ and_undersample' function from the HyperSMURF v2.0 R package [82]. SMOTE is specifically conceived to handle imbalanced data and re-balance classes. Letting I min be the instances of the minority class and I maj , the more abundant class, be the high and low mycotoxins contaminated samples, respectively, SMOTE generates realistic new instances by linear interpolation between randomly chosen pairs of close samples in the minority class. Specifically, for every minority instance x ∈ I min the k nearest-neighbor instances KNN ⊂ I min are computed and x ∈ KNN is randomly drawn. Then, a new synthetic instance x S is computed by linear interpolation between them: where λ is a random number between 0 and 1. The process is repeated and results in new "synthetic" data I min S . The majority class I maj is then resized by random subsample to obtain I maj where I maj = I min + I min S . A classification model to predict high vs. low contamination using wavelength as predictors was fitted through a random forest (RF) as implemented in the 'randomForest' R package Version 4.6-14 [83].
Here, we applied an RF to a classification problem (categorical response), rather than a regression problem (continuous response). For classification, RF uses Gini impurity (Gini) to determine the spit at each node of each decision tree, as in: where p i represents the relative frequency of the class observed and c represents the number of classes. Each decision tree selected a set of important predictors f b , which were bagged (bootstrapped and aggregated), and the majority voted to a final model fi as in where x' is the individual decision tree of B trees. Different classification models (e.g., PCA, partial least squares discriminant analysis, neural network, etc.) were tested in a preliminary phase of the study, with RF considered the most promising one. Here, we reported the results obtained by applying RF for classification. In particular, the class-balanced dataset was split into training and testing subsets including 70% and 30% of the samples, respectively. The RF generated several uncorrelated decision trees, each evaluating a random subset of features. The results of the trees were then averaged to provide the final model.
To evaluate the accuracy of the RF classification model, sensitivity and specificity were calculated [84]. Accuracy was the ability of the model to differentiate the high and low contamination samples. The sensitivity of the model was its ability to determine the low contamination cases correctly. To estimate it, we calculated the proportion of correctly low classified samples with the total of low classified samples by model. Specificity was the same as sensitivity but regarded the highest class.