Using Molecular Spectroscopic Techniques (NIR and ATR-FT/MIR) Coupling with Various Chemometrics to Test Possibility to Reveal Chemical and Molecular Response of Cool-Season Adapted Wheat Grain to Ergot Alkaloids

The objective of this study was to explore the possibility of using near-infrared (NIR) and attenuated total reflectance Fourier transform mid-infrared (ATR-FT/MIR) molecular spectroscopy as non-invasive and rapid methods for quantifying six major ergot alkaloids (EAs) in cool-season wheat. In total, 107 wheat grain samples were collected, and the concentrations of the six major EAs were analyzed by liquid chromatography-tandem mass spectrometry. The mean contents of total EAs, ergotamine, ergosine, ergometrine, ergocryptine, ergocristine, and ergocornine were 1099.3, 337.5, 56.9, 150.6, 142.1, 743.3, and 97.5 μg/kg, respectively. The NIR spectra were taken from 680 to 2500 nm, and the MIR spectra were recorded from 4000 to 700 cm⁻¹. The spectral data were transformed by various preprocessing techniques (FD: first derivative; SNV: standard normal variate; FD-SNV: first derivative + SNV; MSC: multiplicative scattering correction; SNV-Detrending: SNV + detrending; SD-SNV: second derivative + SNV; SNV-SD: SNV + second derivative), and sensitive wavelengths were selected. Partial least squares (PLS) regression models were developed for EA prediction and assessed with calibration and validation statistics. The constructed models obtained weak calibration and cross-validation parameters, and none of them was able to accurately predict external samples. The relatively low levels of EAs in the contaminated wheat samples might be below the detection limits of NIR and ATR-FT/MIR spectroscopy. More research is needed to determine the limitations of the ATR-FT/MIR and NIR techniques for quantifying EAs in various sample matrices and to develop acceptable models.


Introduction
Ergot alkaloids (EAs) are toxic compounds produced by Claviceps fungi species, which can parasitize the seed heads of some small grains and grasses at the time of flowering [1,2]. Numerous monocotyledonous plants can be attacked by these fungi, including durum wheat, oats, barley, rye, corn, and forage grasses [3]. During infection, the healthy grain or seed is replaced by ergots (i.e., sclerotia). The ergots are brown to purple-black in color and usually contain high concentrations of EAs [4].
More than 40 different EAs have been reported. Generally, they can be classified into three groups: clavine alkaloids, peptide alkaloids, and lysergic acid derivatives [5]. Ergometrine, ergotamine, ergosine, ergocristine, ergocryptine, and ergocornine are the main EAs produced by Claviceps species [4]. Ergots are usually harvested together with the contaminated grass or grains. Even when the sclerotia were removed from wheat and rye samples by hand cleaning, EAs could still be found [6].
A widespread presence of ergot and EAs contamination in western Canadian cereals has been reported. In 1999, 4% of Western Durum and 12% of Canadian Western Red Spring wheat samples were positive for ergot, and similar outbreaks have been reported in Manitoba in 2005 and in all three Prairie provinces (Alberta, Saskatchewan, and Manitoba) in both 2008 and 2011 [7]. EAs can pose great health risks to humans and animals. For instance, EAs can harm the health and productivity of animals, such as lactation performance, growth, reproductive performance, pregnancy rates, sperm motility, etc. [8].
Due to analytical limitations, the monitoring of ergot contamination of grain is mainly focused on controlling the content of ergot bodies, while regulations for the concentration of individual EAs in grain are still unavailable [6]. However, the content of EAs and the proportion of individual EAs are extremely variable within ergot bodies and can be significantly affected by geographic region, harvest time, crop species, and variety [8]. To date, the popular methods for determining EAs in agricultural commodities are based on wet chemistry, such as high-performance liquid chromatography, liquid chromatography-tandem mass spectrometry (LC-MS/MS), and thin-layer chromatography [9,10]. These methods usually require professional technicians and are expensive and time-consuming. Infrared (IR) spectroscopy has been widely used as a fast and noninvasive approach in feed and food research [10]. Moreover, IR spectroscopy has been reported as a promising technique for estimating mycotoxins in agricultural commodities [11]. Preprocessing techniques commonly used to transform infrared spectral data include: FD: first derivative; SNV: standard normal variate; FD-SNV: first derivative + SNV; MSC: multiplicative scattering correction; SNV-Detrending: SNV + detrending; SD-SNV: second derivative + SNV; SNV-SD: SNV + second derivative. Shi et al. (2019) [10] reported limitations of vibrational spectroscopy in a barley study and suggested additional research on vibrational spectroscopy in grain research. Some studies have explored the potential of infrared-based techniques for ergot detection. Roberts et al. (1997) [12] applied the NIR method for the quantitative analysis of ergovaline concentration in tall fescue. In another study, Vermeulen et al. (2012) [4] established a model for quantifying the content of ergot bodies (0-10,000 mg/kg) based on hyperspectral imaging techniques.
Our previous study (Shi et al., 2019) [10] applied vibrational spectroscopy to barley. Nevertheless, the possibility of using a spectroscopic method for the fast prediction of major EAs in cool-season-adapted wheat grown with low heat units has not been explored.
The aim of this research was to explore the possibility of using near- and mid-infrared spectroscopy combined with different spectral pretreatments and spectral regions to quantify the six major EAs in western Canadian wheat grown under low heat unit (cold) climate conditions.

Statistic Values of EAs
According to the LC-MS/MS analysis, EAs were detected in 75 of the collected samples. Table 1 shows the statistical summary of the ergot alkaloids, including the concentration ranges, averages, standard errors, skewness, and variance. The mean concentrations for total EAs, ergotamine, ergosine, ergometrine, ergocryptine, ergocristine, and ergocornine were 1099.3, 337.5, 56.9, 150.6, 142.1, 743.3, and 97.5 µg/kg, respectively. Most of the positive samples contained relatively low levels of EAs, and the median of the EA-positive samples was below 42.1 µg/kg. The positive samples also had rather broad EA concentration ranges. For instance, the ranges for total EAs and ergocristine were 21,969.2 and 12,414.6 µg/kg, respectively. It was difficult to develop proper PLS models with such broad ranges and low concentrations. During the model construction stage, various attempts were made to improve the frequency distribution of EA content, such as removing samples with extremely high (e.g., >8000 µg/kg) or low (e.g., <10 µg/kg) concentrations of total or individual EAs.

Overview of Spectral Data
The unprocessed IR spectra of wheat are shown in Figure 1. The peaks in the spectra were mainly due to absorption by moisture, protein, carbohydrates, and lipids. NIR spectroscopy deals with molecular combination bands and overtones, primarily of OH, CH, NH, and CO vibrations [13]. The broad and complex bands make it difficult to interpret the NIR spectrum visually and to assign specific bands to specific chemical components. The MIR spectrum contains information related to fundamental molecular vibrations and can be divided into four major sections: the X-H stretching region (4000-2500 cm⁻¹), the triple bond region (2500-2000 cm⁻¹), the double bond region (2000-1500 cm⁻¹), and the so-called "fingerprint" region (1500-400 cm⁻¹) [14]. In the NIR region, the characteristic absorption bands of proteins are located between 2148 and 2200 nm, which are related to combinations of C-O stretching, C-N stretching, N-H in-plane bending, the combination of C-H and C=O stretch, and the second N-H bend overtone [15]. The amide I (ca. 1700-1600 cm⁻¹) and amide II (ca. 1575-1480 cm⁻¹) bands were utilized for characterizing the primary structures and investigating the relative richness of protein molecules [10,16,17].
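Because the NIR ranges above are quoted in nanometres and the MIR ranges in wavenumbers, it can help to convert between the two scales when comparing regions. The following is a minimal sketch of that conversion; the function names are illustrative and not taken from any software used in this study.

```python
def nm_to_wavenumber(wavelength_nm: float) -> float:
    """Convert a wavelength in nm to a wavenumber in cm^-1 (1 cm = 1e7 nm)."""
    return 1e7 / wavelength_nm


def wavenumber_to_nm(wavenumber_cm: float) -> float:
    """Convert a wavenumber in cm^-1 to a wavelength in nm."""
    return 1e7 / wavenumber_cm


# The long-wavelength end of the NIR range (2500 nm) meets the
# high-wavenumber end of the MIR range (4000 cm^-1).
nir_upper_edge_cm = nm_to_wavenumber(2500)

# Protein band limits quoted in the text (2148 and 2200 nm), in cm^-1.
protein_band_cm = (nm_to_wavenumber(2200), nm_to_wavenumber(2148))
```

Under this conversion the two instruments cover adjacent, non-overlapping parts of the infrared spectrum.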

PLS Model Construction
The raw IR spectra usually contain undesirable noise and background variations, which can reduce the performance of multivariate models. Spectral variations unrelated to the chemical or physical properties of the samples can be removed by spectral pretreatment [15,18]. Both individual preprocessing methods and combinations of different methods can be applied to reduce side information and multiple types of interference and variation [10,19]. Various pretreatments are available to researchers. For instance, non-uniform particle sizes in samples can cause light scattering effects; MSC and SNV are commonly used to reduce such effects and adjust baseline offsets. Derivative algorithms can enhance resolution, reduce random noise, and highlight subtle band shapes. Detrending aims to correct the curvilinearity and baseline shift of spectra from powdered or densely packed samples [15,19,20].
In another study, several pretreatments, including detrending, first derivative, SNV, SNV-Detrending, spectral averaging, and baseline offset, were used to preprocess NIR spectra to develop a PLS model for predicting endophyte alkaloid concentrations in dried perennial ryegrass [10,21].
The irrelevant information contained in full spectra may distort models calibrated on full wavelengths [19]. The redundancy and collinearity of the spectral data can be reduced by selecting important wavelengths. Regression coefficient analysis (RCA) is an effective technique for detecting important wavelengths [22]. Wavelengths with high absolute regression coefficient (RC) values are suggested as important wavelengths for the specific models.
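To make the pretreatments described above concrete, the sketch below implements SNV, MSC, and a first derivative on plain Python lists. It is only an illustration under stated assumptions: the derivative is a simple central difference without smoothing, which may differ from the derivative algorithm used by the chemometric software in this study, and the two example spectra are invented.

```python
import math


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / sd for v in spectrum]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum of the set."""
    n_points = len(spectra[0])
    ref = [sum(s[j] for s in spectra) / len(spectra) for j in range(n_points)]
    ref_mean = sum(ref) / n_points
    ref_ss = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / n_points
        # Regress the spectrum on the reference: s ~ intercept + slope * ref
        slope = sum((r - ref_mean) * (v - s_mean) for r, v in zip(ref, s)) / ref_ss
        intercept = s_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in s])
    return corrected


def first_derivative(spectrum, step=1.0):
    """Central-difference first derivative (no smoothing; end points are dropped)."""
    return [(spectrum[i + 1] - spectrum[i - 1]) / (2 * step)
            for i in range(1, len(spectrum) - 1)]


# Invented example: the same absorbance profile with and without a
# multiplicative and additive scatter effect.
base = [0.10, 0.30, 0.80, 0.40, 0.20]
scattered = [1.5 * v + 0.2 for v in base]

snv_base, snv_scattered = snv(base), snv(scattered)
msc_base, msc_scattered = msc([base, scattered])
fd_snv = first_derivative(snv(base))  # combined pretreatment, FD applied after SNV
```

In this toy case both SNV and MSC map the two spectra onto the same curve, which is the scatter-removal behaviour the text attributes to them.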
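The wavelength selection step can be sketched as ranking wavelengths by the absolute value of their regression coefficients. The snippet below is a hedged illustration only: the wavelengths, coefficient values, and the choice of keeping a fixed number of wavelengths are hypothetical and do not reproduce the selection criteria of the software used here.

```python
def select_wavelengths(wavelengths, coefficients, top_k):
    """Return the top_k wavelengths with the largest absolute regression coefficients."""
    ranked = sorted(zip(wavelengths, coefficients),
                    key=lambda pair: abs(pair[1]), reverse=True)
    return sorted(w for w, _ in ranked[:top_k])


# Hypothetical wavelengths (nm) and PLS regression coefficients.
wavelengths_nm = [1200, 1450, 1700, 1940, 2100, 2300]
coefficients = [0.02, -0.41, 0.05, 0.37, -0.08, 0.29]

selected = select_wavelengths(wavelengths_nm, coefficients, top_k=3)
```

Note that the sign of a coefficient is ignored; a strongly negative coefficient marks a wavelength as important just as a strongly positive one does.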
The RCA chart of PLS models established based on spectra preprocessed by SNV for predicting total EAs concentrations is shown in Figure 2. Several conditions, such as different pre-processing and different wavelengths (e.g., the fingerprint bands of the MIR region, the classical NIR bands of 1000-2500 nm, and a variety of selected sensitive wavelengths), were used for calibration to obtain acceptable PLS models.

Evaluation of PLS Models
Partial least squares regression is capable of reducing the data dimension and overcoming the multicollinearity problem and has been suggested as an alternative technique to ordinary least squares since the 1960s [10,23]. It is particularly powerful in developing IR models because it could effectively remove irrelevant spectral variations [24,25]. Many researchers have developed calibration models for predicting ergosterol, aflatoxin, fumonisin, ochratoxin A, and deoxynivalenol content in different cereals by the PLSR method or its variants [11].
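For readers unfamiliar with the algorithm, the following is a minimal sketch of single-response PLS (PLS1) using the NIPALS procedure, written with the standard library only. It is not the implementation used in this study, and the small data set is invented so that the response is an exact linear function of two predictors.

```python
import math


def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS. X is a list of rows, y a list of floats."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centred X
    f = [v - y_mean for v in y]                                # centred y
    W, P, q = [], [], []
    for _ in range(n_components):
        # Weight vector: direction of maximum covariance between X and y
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p_load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q_a = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component
        for i in range(n):
            for j in range(p):
                E[i][j] -= t[i] * p_load[j]
            f[i] -= q_a * t[i]
        W.append(w)
        P.append(p_load)
        q.append(q_a)
    return x_mean, y_mean, W, P, q


def pls1_predict(model, x):
    """Predict the response for one new row x."""
    x_mean, y_mean, W, P, q = model
    e = [x[j] - x_mean[j] for j in range(len(x))]
    y_hat = y_mean
    for w, p_load, q_a in zip(W, P, q):
        t = sum(e[j] * w[j] for j in range(len(e)))
        y_hat += q_a * t
        e = [e[j] - t * p_load[j] for j in range(len(e))]
    return y_hat


# Invented data: y = 2*x1 + 3*x2 + 1, so two components reproduce y exactly.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [3.0, 4.0, 6.0, 8.0]
model = pls1_fit(X, y, n_components=2)
fitted = [pls1_predict(model, row) for row in X]
```

With real spectra the number of components is much smaller than the number of wavelengths, which is how PLS reduces dimensionality and sidesteps multicollinearity.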
PCA was performed on both the selected wavelength ranges and the full wavelength ranges to explore the sample spectral structures [10]. Nevertheless, the biplots of the PCA analysis revealed that the wheat samples could not be clustered clearly by EA concentration. The result of the PCA analysis of the infrared spectra (pretreated by MSC) of samples containing different total EA concentrations is shown in Figure 3. The significant differences in the spectra of samples might result from differences in major chemical constituents among samples.
The performance of multivariate models can be evaluated with a number of statistical criteria. The coefficient of determination (R²) is a primary criterion that indicates the goodness of fit [26]. Excellent models usually obtain an R² greater than 0.91; the R² of a good prediction ranges from 0.82 to 0.90; models with an R² between 0.66 and 0.81 can be used for approximate quantitative estimation; models with an R² of 0.50-0.65 can only discriminate between samples with high and low concentrations [27].
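The interpretation scale cited above can be expressed as a small lookup function. This is a sketch under two assumptions that are not in the cited scale: values are rounded to two decimals so that the published bands have no gaps, an R² of exactly 0.91 is treated as excellent, and values below 0.50 are labelled as falling outside the scale.

```python
def interpret_r2(r2: float) -> str:
    """Map a coefficient of determination to the interpretation bands cited in the text."""
    r2 = round(r2, 2)  # assumption: bands are read at two-decimal precision
    if r2 >= 0.91:
        return "excellent"
    if r2 >= 0.82:
        return "good"
    if r2 >= 0.66:
        return "approximate quantitative estimation"
    if r2 >= 0.50:
        return "high/low discrimination only"
    return "below the cited scale"  # assumption: label not given in the source


labels = [interpret_r2(v) for v in (0.95, 0.85, 0.77, 0.55, 0.41)]
```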
The statistical parameters of the NIR and MIR models constructed for predicting individual and total EA concentrations are listed in Table 2 (1-7). Most PLS models developed in the present study obtained rather low R²C values, and the R²CV was unavailable (NA). Although the R²C values of some models were higher (e.g., the model for ergocornine constructed with FD-pretreated NIR spectra), their cross-validation statistics were very poor, and none of them had external prediction capability (i.e., the R²P was unavailable). The results suggested that a good calibration fit did not automatically produce desirable external predictive ability. This was consistent with the findings of a previous study, which reported that a good fit during calibration does not imply that the resulting model can make satisfactory external predictions [28].
The statistical parameters obtained during the calibration and validation stages showed that the constructed models could not be used for quantification or discrimination of the EA content in wheat. When the models were developed using a variety of selected wavelength ranges, no improvement was observed. Roberts et al. (2005) [29] reported that the total EA concentrations in tall fescue can be determined by NIR spectroscopy. In their study, the EA content analyzed by commercial ELISA test kits was calibrated with NIR spectra (1110-2490 nm) to create PLS models (R² = 0.77-0.95). However, it should be noted that the actual values of total EAs were unavailable in their study, and the reference EA levels were replaced by absorbance values since there was no standard for converting the relative content into actual EA concentrations. Furthermore, no independent prediction result was reported.
Lolitrem B, peramine, and ergovaline are common endophyte alkaloids found in perennial ryegrass plants. NIR models for detecting and quantifying endophyte alkaloids in perennial ryegrass were constructed in a previous study [30]. The average levels of ergovaline, lolitrem B, and peramine were 0.71, 1.32, and 7.16 mg/kg, respectively. A modified PLS method was applied, and the authors obtained R²CV values of 0.76, 0.41, and 0.94 for ergovaline, lolitrem B, and peramine, respectively. Based on their results, the concentrations of peramine and ergovaline can be predicted by NIRS models, while the models for lolitrem B achieved undesirable results.
Abbreviations (Table 2): NC, sample count of calibration set; NP, sample count of prediction set; R²C, coefficient of determination for calibration; RMSEC, root mean square error of calibration (%); SEC, standard error of calibration (%); R²CV, coefficient of determination for cross-validation (%); RMSECV, root mean square error of cross-validation (%); SECV, standard error of cross-validation (%); R²P, coefficient of determination for prediction (%); RMSEP, root mean square error of prediction (%); SEP, standard error of prediction (%); MSC, multiplicative scattering correction; SNV, standard normal variate; SNV-Detrending, SNV + detrending; FD-SNV, first derivative + SNV; SD-SNV, second derivative + SNV.
In another study, an NIR hyperspectral imaging system was used to develop models for predicting ergot bodies in wheat [4]. The authors employed multivariate image analysis, PLS discriminant analysis, and support vector machine techniques to construct models. No false positives were observed with non-contaminated samples, and the LOD and LOQ were 145 mg/kg and 341 mg/kg, respectively. The greater ergot body concentration in those samples may have facilitated the calibration process, although it should be noted that the EA type and levels in the ergot body can vary greatly. Another study reported that the discrimination models between ergot bodies and cereal kernels depended mainly on the differences in fat and starch levels [2]. Cereal kernels contain a high level of starch and a low level of fat, while ergot bodies are characterized by a high lipid content.
Based on the previous studies, the concentrations of EAs in the majority of samples might be too low for developing a proper NIR or MIR model. Moreover, the difference in IR absorbance bands between fungal-infected grains and healthy grains mainly reflects the changes in major chemical constituents such as carbohydrate, protein, etc. Many calibration studies indicated that the prediction of mycotoxin levels was not based on the toxin directly but rather relied on the spectral changes related to major chemical components [10,11].
The EA concentration in grains could be too low for direct determination by conventional NIR and MIR methods. Besides, the chemical profiles of ergot bodies may be different from those of normal grains, and the uneven distribution of ergot particles in grains could make it more difficult to obtain appropriate spectra for calibration and prediction.

Conclusions
The possibility of using NIR and ATR-FT/MIR techniques combined with various spectral preprocessing methods and wavelength ranges for the quantification of EAs in cool-season wheat was evaluated. During the calibration process, numerous spectral regions of both raw and preprocessed spectra were selected and calibrated, but the validation parameters of all PLS models were undesirable, and no model could be used to perform independent prediction. The EA content in most samples was rather low and may have been below the detection limit of the employed IR spectroscopy. The frequency distribution of the EA concentrations was also unfavorable, which made the calibration more difficult. More research is needed to explore the direct detection limit of infrared spectroscopic methods for predicting EA concentrations in different grains.

Sample Preparation and LC-MS/MS Analysis
A total of 107 wheat samples grown in Western Canada were collected from May 2016 to August 2017. The determination of six major EAs was conducted at Prairie Diagnostic Services (PDS) using the LC-MS/MS approach developed by Krska et al. (2008). Reagent EA standards were supplied by Romer Labs (Union, MO, USA). Primary-secondary amines were supplied by Agilent Technologies (Palo Alto, CA, USA) and employed as materials for dispersive solid-phase extraction. Acetonitrile and ammonium acetate were purchased from Fisher Scientific (Fair Lawn, NJ, USA). The standards for EAs were dissolved in acetonitrile and stored in a freezer (−80 °C).
A grinder with a 1.0 mm screen was used for grinding the samples. A 5.0 g portion of each ground sample was weighed into an Erlenmeyer flask (125 mL). Twenty-five mL of 85:15 (v/v) acetonitrile/3.03 mM aqueous ammonium carbonate was added to the sample and stirred for 10 min. The supernatant was filtered into a clean beaker through Whatman No. 41 (ashless) filter paper. The filtrate (1 mL) was added to 50 mg of primary-secondary amine and agitated for 5 min to clean the matrix. The supernatant was used for EA analysis.
The LC-MS/MS system was composed of an Agilent 1100 HPLC system and a Micromass Quattro Ultima™ mass spectrometer (Waters, Milford, MA, USA). Multiple reaction monitoring was applied to identify the "parent" ions (first quadrupole) and the "daughter" ions (second quadrupole). The software used for data collection, processing, and curve construction was MassLynx 4.1 (Waters Corp., Milford, MA, USA). The standard curves were fitted by linear regression (y = ax + b), where x and y correspond to the alkaloid content and peak area, respectively. The recovery rates for ergometrine, ergotamine, ergocryptine, ergocornine, ergocristine, and ergosine were 51, 101, 81, 87, 75, and 82%, respectively. The limit of quantification was 1.25 µg/kg and the detection limit was 0.5 µg/kg. The concentration of total EAs is the sum of the six major EA concentrations. The detailed procedures and validation parameters of the methods can be found in another study [10,21].
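The standard-curve step can be illustrated with a short sketch that fits y = ax + b by ordinary least squares, inverts it to obtain a concentration from a peak area, and sums the six alkaloids to get total EAs. All numbers are hypothetical, and a single shared curve is used for every alkaloid purely to keep the example short; in practice each analyte would have its own curve.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    a = (sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
         / sum((xi - x_mean) ** 2 for xi in x))
    b = y_mean - a * x_mean
    return a, b


def concentration_from_area(area, a, b):
    """Invert the standard curve: x = (y - b) / a."""
    return (area - b) / a


# Hypothetical standards: concentration (ug/kg) and peak area.
std_conc = [1.25, 5.0, 25.0, 100.0]
std_area = [2.0 * c + 3.0 for c in std_conc]
a, b = fit_line(std_conc, std_area)

# Hypothetical peak areas for one sample.
peak_areas = {
    "ergometrine": 13.0, "ergotamine": 23.0, "ergosine": 7.0,
    "ergocryptine": 43.0, "ergocristine": 203.0, "ergocornine": 9.0,
}
concentrations = {name: concentration_from_area(area, a, b)
                  for name, area in peak_areas.items()}
total_eas = sum(concentrations.values())  # total EAs = sum of the six alkaloids
```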

NIR and MIR Spectra Collection
The Unity SpectraStar 2500XL-R NIR analyzer (Unity Scientific, Brookfield, CT, USA) was used to obtain the NIR spectra in reflectance mode from 680 nm to 2500 nm at an interval of 1 nm. The cool-season wheat samples were placed on the rotary sample-cup spinner. The NIR spectra (*.SPC format) were recorded with the built-in software (InfoStar, Unity Scientific, USA). The MIR spectra (ca. 4000-700 cm⁻¹) were collected with the Jasco FT/IR-4200 spectrometer in attenuated total reflectance mode (JASCO Corp., Tokyo, Japan). To eliminate noise arising from water and carbon dioxide, the background spectra were recorded. The generated MIR spectra (in JWS format) were transformed to JCAMP-DX files by the JASCO Spectra Manager II software. For each sample, three replicate spectra were taken, and they were averaged prior to chemometric modeling. More detailed information regarding the spectra collection has been summarized in another study [9,10].

Chemometric Analysis
The Unscrambler® X software (version 10.4, CAMO Software, Oslo, Norway) was used to preprocess the spectral data and perform the multivariate analysis.
Nine types of pretreatments were used to transform the raw spectra, including baseline offset, first and second order derivatives (FD and SD), the standard normal variate (SNV), multiplicative scattering correction (MSC), detrending, FD-SNV, SD-SNV, and SNV-detrending.
The spectral data structure and the potential outliers were explored by principal component analysis (PCA). After outliers were removed, the samples were divided into calibration and independent prediction sets in an approximate ratio of 3:1. Both the raw spectral data and the preprocessed spectral data were used to construct calibration models. The calibration models were developed on the calibration sets using the PLS algorithm. To investigate the important wavelength/wavenumber ranges, regression coefficient analysis (RCA) was carried out using the Unscrambler software. Recalibrations were conducted using the selected sensitive wavelengths to optimize the predictive ability of the original models that were generated from full wavelengths. Moreover, F-residuals and/or Hotelling's T² values were used for detecting the remaining outliers during the regression stages. A leave-one-out cross-validation was performed to validate the established models. Furthermore, calibration models that obtained valid cross-validation parameters were applied to the independent prediction sets to evaluate their potential for external prediction. More information regarding the modeling process is also available in another study [9,10].
To evaluate the PLS models, calibration statistics were calculated, including the coefficients of determination in calibration (R²C) and cross-validation (R²CV). The minimum values of root mean square error of calibration (RMSEC) and cross-validation (RMSECV) were used to select the best PLSR model [31]. The prediction determination coefficient (R²P), prediction root mean square error (RMSEP), and prediction standard error (SEP) were summarized for evaluating the prediction performance of the calibration models [10].
Author Contributions: H.S. was a postdoctoral fellow in P.Y.'s lab, performed the experiments, and wrote the paper; P.Y. is the principal investigator and supervisor, designed the project, provided resources and funding, and revised the manuscript. All authors have read and agreed to the published version of the manuscript.
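The sample split and the leave-one-out cross-validation described above can be sketched as follows. Several things here are assumptions made for illustration: the split is random with a fixed seed (the source does not state how samples were assigned), the sample count of 100 is hypothetical, and a one-variable least-squares line stands in for the PLS model so that the example stays short.

```python
import random


def split_calibration_prediction(sample_ids, ratio=3, seed=0):
    """Randomly split samples into calibration and prediction sets (about ratio:1)."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    n_pred = len(ids) // (ratio + 1)
    return ids[n_pred:], ids[:n_pred]


def fit_univariate(x, y):
    """Least-squares line y = a*x + b (placeholder for a PLS model)."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    a = (sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
         / sum((xi - x_mean) ** 2 for xi in x))
    return a, y_mean - a * x_mean


def loo_predictions(x, y):
    """Leave-one-out: refit without sample i, then predict sample i."""
    preds = []
    for i in range(len(x)):
        a, b = fit_univariate(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(a * x[i] + b)
    return preds


calibration_ids, prediction_ids = split_calibration_prediction(range(100))

# Invented, exactly linear data: every held-out sample is predicted perfectly.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0 * v + 1.0 for v in x]
cv_preds = loo_predictions(x, y)
```

The cross-validated predictions are what statistics such as R²CV and RMSECV are computed from, while the held-out prediction set is touched only after a model passes cross-validation.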
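The evaluation statistics named above can be computed as in the sketch below. The definitions are common ones but are assumptions with respect to this study: R² is taken as 1 − SSE/SST (some software reports the squared correlation instead), and the standard error is the bias-corrected standard deviation of the residuals. The same functions apply to calibration, cross-validation, or prediction results depending on which predicted values are supplied; the example values are invented.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination as 1 - SSE/SST."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error (RMSEC, RMSECV, or RMSEP depending on the inputs)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def standard_error(y_true, y_pred):
    """Bias-corrected standard error of the residuals (SEC, SECV, or SEP)."""
    residuals = [p - t for t, p in zip(y_true, y_pred)]
    bias = sum(residuals) / len(residuals)
    return math.sqrt(sum((r - bias) ** 2 for r in residuals) / (len(residuals) - 1))


# Invented reference and predicted concentrations (ug/kg).
reference = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

r2_p = r_squared(reference, predicted)
rmsep = rmse(reference, predicted)
sep = standard_error(reference, predicted)
```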