This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

Visible and near infrared (Vis/NIR) spectroscopy was investigated for the fast analysis of superoxide dismutase (SOD) activity in barley (

Barley (

Near infrared (NIR) spectroscopy is a well-established technique for both quantitative and qualitative analysis in the field of agriculture [^{2} = 0.66–0.96) in correlating the NIR spectral data of ground wheat and barley with their amino acid concentrations [

The objectives of this experiment were to study the feasibility of using NIR spectroscopy to predict SOD activity in barley leaves, and to compare the performance of different spectral preprocessing methods, effective wavelength (EW) selection methods and calibration methods (partial least squares, least squares-support vector machine and Gaussian process).

The experiments were conducted at the farm of Zhejiang University, Hangzhou (30°10′N, 120°12′E), China, in 2010. A herbicide (ZJ0273) was used as the stressor and applied at five concentrations (0, 50, 100, 500 and 1,000 mg/L) at the two-leaf stage. A total of 75 barley samples were collected during the growing period (5, 10 and 15 days after treatment). The samples were randomly divided into two sets: 50 samples for calibration and the remaining 25 samples for validation. No sample appeared in both the calibration set and the validation set.
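The random 50/25 division described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `split_samples`, the use of integer sample indices and the fixed seed are all assumptions made for the example.

```python
import random

def split_samples(n_samples=75, n_calibration=50, seed=0):
    """Randomly split sample indices into disjoint calibration and validation sets."""
    indices = list(range(n_samples))
    rng = random.Random(seed)  # fixed seed is an assumption, used only for reproducibility
    rng.shuffle(indices)
    calibration = sorted(indices[:n_calibration])
    validation = sorted(indices[n_calibration:])
    return calibration, validation

calibration, validation = split_samples()
print(len(calibration), len(validation))
```

Because both sets are slices of one shuffled list, no index can end up in both.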

NIR spectra of the barley leaves were obtained using a Handheld FieldSpec spectrometer (Analytical Spectral Device, Boulder, CO, USA). The wavelength range was 325 nm to 1,075 nm and the resolution of the instrument was 1.5 nm. Each sample spectrum was obtained by averaging three scans of that sample. Three software packages were employed: ASD View Spec Pro, Unscrambler V9.8 (CAMO AS, Oslo, Norway) and MATLAB V7.0 (The Math Works, Natick, MA, USA). Spectral pretreatment was necessary because it can remove spectral baseline shift, noise and light scatter effects [
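As an illustration of the scan averaging and of one scatter-correcting pretreatment, the sketch below averages three replicate scans per sample and then applies multiplicative scatter correction (MSC, one of the pretreatments named later in this article). The data are synthetic, and the 1 nm wavelength grid, the function names and the choice of the mean spectrum as the MSC reference are assumptions; the study itself used Unscrambler for pretreatment.

```python
import numpy as np

def average_scans(scans):
    """Average replicate scans: (n_samples, n_scans, n_wavelengths) -> (n_samples, n_wavelengths)."""
    return np.asarray(scans, dtype=float).mean(axis=1)

def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference (default: the mean spectrum)."""
    spectra = np.asarray(spectra, dtype=float)
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, spectrum in enumerate(spectra):
        # Fit spectrum = slope * ref + intercept, then undo the fitted scatter effect.
        slope, intercept = np.polyfit(ref, spectrum, 1)
        corrected[i] = (spectrum - intercept) / slope
    return corrected

# Synthetic stand-in data: 5 samples, 3 scans each, 325-1,075 nm on an assumed 1 nm grid.
rng = np.random.default_rng(0)
wavelengths = np.arange(325, 1076)
band = np.exp(-((wavelengths - 700) / 150.0) ** 2)
gains = rng.uniform(0.8, 1.2, size=(5, 1, 1))     # multiplicative scatter per sample
offsets = rng.uniform(-0.1, 0.1, size=(5, 1, 1))  # baseline shift per sample
scans = gains * band + offsets + rng.normal(0.0, 1e-3, size=(5, 3, wavelengths.size))

sample_spectra = average_scans(scans)
corrected = msc(sample_spectra)
```

After correction, the synthetic spectra collapse onto nearly the same curve, which is the intended effect when the between-sample differences are purely additive and multiplicative.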

Leaf superoxide dismutase (SOD) activity was analyzed by the method of Dhindsa

The successive projections algorithm (SPA) and regression coefficient (RC) analysis were used as variable selection strategies in this work. SPA is a forward selection method that starts with one wavelength and then incorporates a new one at each iteration, until a specified number
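The forward-selection step of SPA can be sketched as below. This is a simplified illustration under stated assumptions: it covers only the projection phase for one starting wavelength, the column centring and the function name `spa_select` are choices made here, and the step in which candidate subsets are compared by their prediction error is omitted.

```python
import numpy as np

def spa_select(X, n_select, start=0):
    """Pick columns of X one at a time, each having the largest norm
    after projecting out the columns already chosen."""
    X = np.asarray(X, dtype=float)
    residual = X - X.mean(axis=0)  # column centring (an assumption of this sketch)
    selected = [start]
    for _ in range(n_select - 1):
        v = residual[:, selected[-1]]
        # Project every column onto the subspace orthogonal to the last selected column.
        residual = residual - np.outer(v, v @ residual) / (v @ v)
        norms = np.linalg.norm(residual, axis=0)
        norms[selected] = -1.0  # never re-select a column
        selected.append(int(np.argmax(norms)))
    return selected

# Toy matrix with collinear pairs: columns 1 and 3 duplicate the information in 0 and 2.
rng = np.random.default_rng(0)
a, b, c = rng.normal(size=(3, 30))
X_demo = np.column_stack([a, 2 * a, b, 3 * b, c])
selected = spa_select(X_demo, n_select=3, start=0)
print(selected)
```

Because each new column is chosen after removing what the earlier ones explain, collinear wavelengths are not picked twice, which is the property that makes SPA attractive for spectra.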

Four regression methods: partial least squares (PLS), multiple linear regression (MLR), least squares-support vector machine (LS-SVM), and Gaussian process (GP) were used for comparison of prediction performance.

PLS was performed in Unscrambler V9.8. The latent variables (LVs) were used as the direct inputs of the PLS models to relate the spectral data to the SOD activity in barley leaves. The number of latent variables was selected by full cross-validation on the training set. MLR was also carried out in Unscrambler V9.8.
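The study ran PLS inside Unscrambler; the sketch below only illustrates the idea of choosing the number of LVs by cross-validation, using a small NIPALS PLS1 written in NumPy. Reading "full cross-validation" as leave-one-out is an assumption, as are the function names, the candidate range of 1 to 6 LVs and the synthetic data.

```python
import numpy as np

def pls1_fit(X, y, n_lv):
    """Fit PLS1 with NIPALS; returns (x_mean, y_mean, regression coefficients)."""
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_lv):
        w = E.T @ f
        w /= np.linalg.norm(w)      # weight vector
        t = E @ w                   # scores (one latent variable)
        tt = t @ t
        p = E.T @ t / tt            # X loadings
        c = f @ t / tt              # y loading
        E = E - np.outer(t, p)      # deflate X
        f = f - c * t               # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, coef

def pls1_predict(model, X):
    x_mean, y_mean, coef = model
    return (np.asarray(X, dtype=float) - x_mean) @ coef + y_mean

def loo_rmsecv(X, y, n_lv):
    """Leave-one-out root mean square error of cross-validation."""
    errors = []
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        model = pls1_fit(X[keep], y[keep], n_lv)
        errors.append(pls1_predict(model, X[i:i + 1])[0] - y[i])
    return float(np.sqrt(np.mean(np.square(errors))))

# Synthetic data: 40 "spectra" driven by 3 hidden factors plus noise.
rng = np.random.default_rng(0)
T = rng.normal(size=(40, 3))
X = T @ rng.normal(size=(3, 60)) + 0.05 * rng.normal(size=(40, 60))
y = T @ np.array([1.0, -0.5, 0.25]) + 0.05 * rng.normal(size=40)

rmsecv = {k: loo_rmsecv(X, y, k) for k in range(1, 7)}
best_lv = min(rmsecv, key=rmsecv.get)
model = pls1_fit(X, y, best_lv)
```

When the number of LVs equals the number of input variables, PLS reduces to MLR on those variables, which is why a 7-LV model on 7 selected wavelengths and an MLR model on the same wavelengths give the same predictions.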

The free LS-SVM v1.5 toolbox was used with MATLAB V7.0 to develop the LS-SVM models. Input variables, kernel function and model parameters are the three crucial elements of an LS-SVM model [^{2} (σ^{2}) were determined by a two-step grid search technique.
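For readers without the MATLAB toolbox, the sketch below shows LS-SVM regression as a single linear solve, followed by a coarse-then-fine grid search. It is an independent NumPy illustration, not the toolbox code: the RBF kernel convention `exp(-d²/(2σ²))`, the name `gamma` for the regularisation parameter, the grid ranges, the hold-out error used to score each grid point and the synthetic data are all assumptions.

```python
import numpy as np

def rbf_kernel(A, B, sigma2):
    """RBF kernel matrix; the 2*sigma2 scaling is one common convention (assumed here)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma2))

def lssvm_fit(X, y, gamma, sigma2):
    """Solve the LS-SVM linear system for the bias b and support values alpha."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma2) + np.eye(n) / gamma
    solution = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return solution[0], solution[1:]

def lssvm_predict(X_train, b, alpha, sigma2, X_new):
    return rbf_kernel(X_new, X_train, sigma2) @ alpha + b

def holdout_rmse(params, X_fit, y_fit, X_val, y_val):
    gamma, sigma2 = params
    b, alpha = lssvm_fit(X_fit, y_fit, gamma, sigma2)
    pred = lssvm_predict(X_fit, b, alpha, sigma2, X_val)
    return float(np.sqrt(np.mean((pred - y_val) ** 2)))

def two_step_grid_search(X_fit, y_fit, X_val, y_val):
    """Coarse log-spaced grid first, then a finer grid around the coarse optimum."""
    def score(p):
        return holdout_rmse(p, X_fit, y_fit, X_val, y_val)
    decades = 10.0 ** np.arange(-1, 4)
    g0, s0 = min(((g, s) for g in decades for s in decades), key=score)
    factors = 10.0 ** np.linspace(-0.5, 0.5, 5)
    return min(((g0 * fg, s0 * fs) for fg in factors for fs in factors), key=score)

# Synthetic one-dimensional example.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 2.0 * np.pi, size=(60, 1))
y = np.sin(x[:, 0]) + 0.05 * rng.normal(size=60)
X_fit, y_fit, X_val, y_val = x[:40], y[:40], x[40:], y[40:]

best_gamma, best_sigma2 = two_step_grid_search(X_fit, y_fit, X_val, y_val)
best_rmse = holdout_rmse((best_gamma, best_sigma2), X_fit, y_fit, X_val, y_val)
```

The fine grid includes the coarse optimum itself, so the second step can only keep or improve the first-step result.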

Gaussian process regression (GPR) is a relatively recent machine learning method that has been successfully applied to regression and classification problems. Gaussian processes (GPs) are non-parametric models in which a Gaussian process prior is defined directly over function values. The details of Gaussian process regression can be found in the literature [
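To make the idea of a prior over function values concrete, the following sketch computes the GP posterior mean and variance at new inputs with fixed hyperparameters. The squared-exponential kernel, the hyperparameter values, the centring of the targets and the toy data are assumptions for illustration; the article does not state which covariance function or software was used for GPR.

```python
import numpy as np

def sq_exp_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return signal_var * np.exp(-0.5 * d2 / length_scale ** 2)

def gp_predict(X_train, y_train, X_new, noise_var=1e-2, **kernel_args):
    """Posterior mean and variance of a zero-mean GP fitted to centred targets."""
    y_mean = y_train.mean()  # centring the targets is a choice made in this sketch
    K = sq_exp_kernel(X_train, X_train, **kernel_args) + noise_var * np.eye(len(y_train))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train - y_mean))
    K_star = sq_exp_kernel(X_new, X_train, **kernel_args)
    mean = K_star @ alpha + y_mean
    v = np.linalg.solve(L, K_star.T)
    var = sq_exp_kernel(X_new, X_new, **kernel_args).diagonal() - (v ** 2).sum(axis=0)
    return mean, var

# Toy data: noise-free sine curve sampled at 20 points.
X_train = np.linspace(0.0, 5.0, 20).reshape(-1, 1)
y_train = np.sin(X_train[:, 0])
X_new = np.array([[2.5], [50.0]])  # one point inside the data, one far outside

mean, var = gp_predict(X_train, y_train, X_new, noise_var=1e-2,
                       length_scale=1.0, signal_var=1.0)
```

Inside the range of the training data the predictive variance is small; far from any training point the prediction falls back to the prior, with the mean returning to the training average and the variance to the prior signal variance.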

The evaluation standards include correlation coefficients (
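A minimal sketch of two such statistics is given below. The correlation coefficient is named in the text; the root mean square error is included on the assumption that it is one of the remaining criteria, and the function names and example numbers are hypothetical.

```python
import math

def correlation_coefficient(measured, predicted):
    """Pearson correlation between measured and predicted values."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    s_mp = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    s_mm = sum((m - mean_m) ** 2 for m in measured)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    return s_mp / math.sqrt(s_mm * s_pp)

def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)

# Hypothetical values: every prediction is 0.5 units too high.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]
r_value = correlation_coefficient(measured, predicted)
rmse_value = rmse(measured, predicted)
```

The example shows why both statistics are reported together: a constant offset leaves the correlation perfect while the error is clearly non-zero.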

Seven different full-spectrum PLS models were developed to evaluate the effects of different preprocessing methods. As mentioned above, the correlation coefficients (

The wavelengths selected between 973 and 1,020 nm (975, 981, 982, 984, 986, 992, 997, 999 and 1,000 nm) could be attributed to the second overtone of N-H stretching vibration. This region was considered as one of the characteristic bands of protein [

All eight models achieved acceptable results.

Vis/NIR spectroscopy combined with multivariate analysis was successfully applied to the fast estimation of SOD activity in barley leaves. SG and MSC were selected as the optimal preprocessing methods by PLS. SPA and RC were successfully applied to select the most relevant EWs. Gaussian process regression performed well in this study, which indicates that it is a useful calibration method for NIR spectroscopy. The best prediction performance was achieved by the LV-LS-SVM model with SG spectra, whereby the correlation coefficient (

This work was supported by the 863 National High Technology Research and Development Program of China (2011AA100705, 2012AA101903), Natural Science Foundation of China (31071332), Zhejiang Provincial Natural Science Foundation of China (Z3090295), China Postdoctoral Science Foundation (2011M501009), the Science and Technology Department of Zhejiang Province (2011C32G2130011) and the Fundamental Research Funds for the Central Universities (2012FZA6005).

Selected EWs by SPA according to SG spectra.

The regression coefficients of PLS.

Predicted

Predicted

Statistical values of activity of SOD in Barley Leaves (U/mg pro).

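Data set | Number of samples | Range | Mean | Standard deviation
---|---|---|---|---
Calibration | 50 | 1.52–6.43 | 4.13 | 1.342
Validation | 25 | 1.56–6.21 | 4.08 | 1.273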

The prediction results of activity of SOD by PLS models with full-spectrum.

4 | 0.7742 | 0.8028 | 0.0816 | 0.6729 | 1.4151
4 | 0.8301 | 0.7060 | 0.0784 | 0.7481 | 1.1052
3 | 0.8156 | 0.7258 | −0.0421 | 0.7038 | 1.1654
5 | 0.8233 | 0.7179 | −0.0384 | 0.7508 | 0.9775
13 | 0.7943 | 0.7718 | −0.1123 | 0.6886 | 1.1573
3 | 0.5939 | 1.0417 | −0.2265 | 0.4290 | 2.1013
5 | 0.7629 | 0.8524 | 0.0178 | 0.7497 | 1.0383

Selected EWs by SPA and RC.

Method | Spectra | Number of EWs | Selected EWs (nm)
---|---|---|---
SPA | Raw | 18 | 453, 480, 970, 954, 408, 447, 469, 400, 1,000, 559, 497, 992, 406, 982, 404, 462, 434, 409
SPA | SG | 7 | 846, 997, 992, 560, 988, 409, 668
SPA | MSC | 10 | 869, 913, 984, 864, 749, 951, 854, 888, 918, 908
RC | Raw | 10 | 404, 419, 420, 442, 957, 975, 986, 999, 1,000, 962
RC | SG | 9 | 403, 419, 420, 443, 462, 957, 975, 986, 997
RC | MSC | 15 | 400, 412, 434, 442, 681, 716, 723, 731, 864, 912, 947, 954, 965, 981, 1,000

The prediction results by different models with optimal pretreatment.

Model | Spectra | LVs/EWs/LS-SVM parameters | r | RMSE
---|---|---|---|---
SPA-PLS | Raw | 12/18/- | 0.6165 | 1.1324
SPA-PLS | SG | 7/7/- | 0.7539 | 0.8627
RC-PLS | Raw | 2/10/- | 0.7035 | 0.8905
RC-PLS | MSC | 4/15/- | 0.6927 | 0.9416
SPA-MLR | Raw | -/18/- | 0.6489 | 1.1280
SPA-MLR | SG | -/7/- | 0.7539 | 0.8627
LV-LS-SVM | Raw | 6/-/(97.98, 326.27) | 0.8988 | 0.5521
LV-LS-SVM | SG | 6/-/(10.21, 55.56) | 0.9064 | 0.5336
SPA-LS-SVM | Raw | -/18/(2.10 × 10^{3}, 562.91) | 0.7203 | 0.9293
SPA-LS-SVM | SG | -/7/(361.12, 129.21) | 0.8267 | 0.7330
RC-LS-SVM | Raw | -/10/(3.14, 29.90) | 0.7798 | 0.7838
RC-LS-SVM | SG | -/9/(8.61, 78.30) | 0.7798 | 0.7838
SPA-GPR | Raw | -/18/- | 0.4771 | 1.1380
SPA-GPR | SG | -/7/- | 0.8200 | 0.7377
RC-GPR | Raw | -/10/- | 0.7840 | 0.7776
RC-GPR | SG | -/9/- | 0.7440 | 0.8326