Nitrogen Concentration Estimation in Tomato Leaves by VIS-NIR Non-Destructive Spectroscopy

Nitrogen concentration in plants is normally determined by expensive and time-consuming chemical analyses. As alternatives, chlorophyll meter readings and determination of N-NO3 concentration in petiole sap have been proposed, but these assays are not always satisfactory. Spectral reflectance values of tomato leaves obtained by visible-near infrared spectrophotometry are reported to be a powerful tool for the diagnosis of plant nutritional status. The aim of the study was to evaluate the feasibility and accuracy of estimating tomato leaf nitrogen concentration with a rapid, portable and non-destructive system, in comparison with standard chemical analyses, chlorophyll meter readings and N-NO3 concentration in petiole sap. Mean leaf reflectance values were compared to each reference chemical value by partial least squares chemometric multivariate methods. The correlation between the values predicted from spectral reflectance analysis and the observed chemical values showed a highly significant correlation coefficient in the independent test (r = 0.94). By increasing efficiency, the proposed system allows better knowledge of the nutritional status of tomato plants, with more detailed and precise information over wider areas. More detailed information in both space and time is an essential tool to increase and stabilize crop quality levels and to optimize nutrient use efficiency.

important to consider the critical N concentration, which is the minimum in the plant required for maximum growth [12]. Tei et al. [9] proposed a critical N dilution curve for processing tomato which can represent the reference to evaluate if the crop is at sub-optimal (<3.72%), optimal (between 3.72% and 4.81%) or luxury (between 4.81% and 5.2%) consumption at any time of the cycle.
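The ranges quoted above can be turned into a simple lookup. The sketch below is only illustrative: the function name is hypothetical, the treatment of values exactly on a boundary is an assumption, and any value above 4.81% is labelled "luxury" even beyond the 5.2% upper limit quoted in the text.

```python
def classify_n_status(n_percent, lower=3.72, upper=4.81):
    """Classify an N concentration (% of dry matter) against the ranges quoted from Tei et al. [9].

    Boundary handling (value == lower or value == upper counted as optimal) is an assumption.
    """
    if n_percent < lower:
        return "sub-optimal"
    if n_percent <= upper:
        return "optimal"
    return "luxury"


# Hypothetical example values, not data from the study
for value in (3.10, 4.20, 5.00):
    print(value, classify_n_status(value))
```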
In this study we propose a more complex and efficient opto-electronic method for the evaluation of N nutritional status in tomato leaves. It refers to the utilization of a visible-near infrared (VIS-NIR) portable spectrophotometer representing a rapid, non-destructive, cost-effective technique [13]. The aim of this study was to evaluate the feasibility and accuracy of this method as compared to SPAD readings and SAP test and to reference standard chemical analysis.

Data Collection
This study was carried out in 2008 at the Experimental Station of the Department of Agricultural and Environmental Sciences, in Papiano (Tiber Valley, Perugia province, Central Italy, 43°N, elev. 165 m) on a clay-loam soil. Processing tomato (Lycopersicon esculentum Mill., cv. PS1296) was grown in the field according to a randomized block design with three replicates, in which thirteen fertilisation treatments differing in application technique, N form and rate were compared. Five of them were green manures grown in the previous fall-winter and incorporated in early spring: hairy vetch (Vicia villosa Roth.) and barley (Hordeum vulgare L.) cultivated as monocultures at full sowing density (200 seeds m⁻² for vetch and 400 seeds m⁻² for barley) and as intercrops obtained by using a fraction of the full sowing density for each species according to a substitutional approach, namely 75% of vetch + 25% of barley, 50% + 50% and 25% + 75%. The total N supplied by green manures was 252, 183, 167, 160 and 154 kg ha⁻¹ as the proportion of vetch decreased from 100% (pure vetch) to 0% (pure barley). Previous experiments have shown that the actual timing of N release in the soil from these green manures differs considerably, with pure barley causing N deficiency during the early stages of the following cash crop [14,15]. The other eight treatments included: broadcast all-at-once application of two organic fertilisers (poultry manure and a by-product from leather factory, both at 100 kg N ha⁻¹); localised and split fertigation with one organic and one mineral fertiliser at two different rates (100 and 200 kg N ha⁻¹); two unfertilised controls, one with tomato in plots where no crop was grown in autumn-winter and one with tomato in plots where barley was grown and then mown and removed from the field before tomato transplanting in order to cause the maximum depletion of soil available N.
The supply of P and K was adjusted taking into account the amount supplied with organic fertilizers, in order to obtain the same rate for all N fertilization treatments (75 kg ha⁻¹ of P₂O₅ and 75 kg ha⁻¹ of K₂O). The same irrigation volume was applied in a two-times-per-week irrigation schedule for all treatments, according to potential crop evapotranspiration. The N nutritional status of the crop was evaluated in three sampling periods (s.p.), 25 June (37 Days After Transplanting, DAT), 9 July (51 DAT), and 23 July (65 DAT), coinciding with plant samplings for growth analysis (1st s.p., 2nd s.p. and 3rd s.p. respectively). Each sampling period corresponds to a specific phenological stage of the tomato crop: 1st vegetative growth, 2nd early flower fruit and 3rd fruit bulking.
At each sampling date eight plants per plot were harvested. The SPAD readings were taken on the apical leaflet blade of the youngest fully expanded leaf of those plants; the petioles of the same leaves were then collected and SAP nitrate concentration was measured by an ion-specific electrode meter. These eight leaflets, plus another 16 leaflets detached from other young fully expanded leaves of the same eight plants per plot, were then stored at 5 °C in plastic envelopes in the dark and carried to the CRA-ING laboratory (Lat. 42°06'11.00" N, Long. 12°37'40.81" E), where VIS-NIR measurements were performed within three hours. Thirteen to fifteen leaves were spectrally measured twice, randomly acquiring nearly 3,100 full spectra. The N concentrations of the leaflets used for VIS-NIR measurements and of the whole above-ground plant subsamples were then measured by analysis of dry matter; an automatic analyser (FlowSys, Systea, Italy) was used to measure organic-N concentrations on digests prepared according to Isaac and Johnson [16].

SPAD Analysis and SAP Test
The SPAD readings were taken from the apical leaflet of the youngest fully expanded leaf; the petioles of the same leaves were then collected and SAP nitrate concentration was measured by an ion-specific electrode meter (Cardy, Spectrum Technologies, Inc., Plainfield, IL, USA). The N concentration of the leaves and of the whole plant were then measured by analysis of dry matter; an automatic analyser (FlowSys, Systea, Italy) was used to measure organic-N concentrations on digests prepared according to Isaac and Johnson [16].

Spectrophotometric Analysis
For the VIS-NIR measurements, a portable single-channel spectrophotometer was used. The system is composed of five parts: (1) a Hamamatsu S 3904 256Q spectrograph in a special housing; a customized illumination system consisting of a 20 W halogen lamp and an optical fiber bundle of approx. 30 quartz glass fibers; (2) an optical entrance with input round: 70 µm × 2,500 µm and diameter 0.5 mm, NA = 0.22, mounted in a SubMiniature version A (SMA) coupling; (3) specific probes with a connecting quartz optical fiber; (4) a transmission device for transmitted or absorbed light for thin solids or liquids with variable optical length; (5) a notebook equipped with specific software to acquire, calibrate and process spectral data. The Hamamatsu spectrograph has the following characteristics: grating: flat-field, 366 lines/mm (centre); spectral range: 310-1,100 nm; absolute wavelength accuracy: 0.3 nm; temperature-induced drift: <0.02 nm/K; resolution (Rayleigh criterion): Δλ_Rayleigh ≈ 10 nm; sensitivity: ≈10^13 Counts/Ws (with 14-bit conversion); stray light: <0.8% with halogen lamp and 16-bit A/D converter.
For spectral acquisition, the 'pen' probe was used to measure the spectral reflectance response of each single leaf (spot area ≈ 10 mm²). On each leaf, two spot areas were acquired with the pen probe in the same areas used for the SPAD and SAP test analyses. Reflectance is the percentage of light that is reflected by the material and collected by an optical quartz fiber (0.7 mm in diameter) fixed at 45° inside a circular aperture 4 mm in diameter, relative to a white reference (100% of the available signal). Because the leaf surface is soft, it covered the entire circular aperture, preventing any interference from external light. The spectral measurements were performed in the laboratory, taking into account a white calibration (lower value with respect to the external light) and the instrumental integration time (light acquisition time), and subtracting the background noise (which varies with the instrument temperature). A very low signal-to-noise ratio was observed at the beginning and at the end of the spectral range, affecting measurement accuracy, so only the 400-800 nm range of each spectrum was taken into account for the analysis. All spectral values were expressed as relative reflectance. To remove drift effects, 12-15 leaves were chosen at random for each group, and a new white calibration was performed after every 30-35 spectral measurements.
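The correction steps described above (background subtraction, normalisation to the white reference, restriction to 400-800 nm) can be sketched as follows. This is a minimal illustration assuming the common formula R = (sample − dark) / (white − dark); the source does not state the exact formula used by the acquisition software, and the function name and synthetic counts are hypothetical.

```python
import numpy as np


def relative_reflectance(sample, white, dark, wavelengths, lo=400.0, hi=800.0):
    """Return wavelengths and relative reflectance restricted to [lo, hi] nm.

    Assumes R = (sample - dark) / (white - dark), a common convention;
    the instrument software may apply a different correction.
    """
    sample, white, dark, wavelengths = (
        np.asarray(a, dtype=float) for a in (sample, white, dark, wavelengths)
    )
    keep = (wavelengths >= lo) & (wavelengths <= hi)
    refl = (sample[keep] - dark[keep]) / (white[keep] - dark[keep])
    return wavelengths[keep], refl


# Synthetic raw counts on a 256-pixel grid spanning the 310-1,100 nm range
wl = np.linspace(310.0, 1100.0, 256)
dark = np.full(256, 50.0)               # background noise counts
white = np.full(256, 4050.0)            # white reference counts
sample = dark + 0.25 * (white - dark)   # a surface reflecting 25% of the signal
wl_cut, refl = relative_reflectance(sample, white, dark, wl)
print(wl_cut.min(), wl_cut.max(), refl[:3])
```

Repeating the white and dark acquisitions periodically, as the text describes for the white calibration, would simply mean calling the function with refreshed `white` and `dark` arrays.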

Chemometric Analysis of Spectral Data
Mean reflectance values of all leaves, considering all treatments together, were compared to each reference chemical value by chemometric multivariate methods (Partial Least Squares, PLS). The procedure includes the following steps (Figure 1): (1) extraction of the raw spectra dataset, to be used as X-block variables; (2) selection of X-block variables; (3) creation of the measured values dataset, to be used as reference or response variable (Y-block); (4) fusion of the two datasets (X- and Y-block) into one analysis dataset; (5) SPXY (sample set partitioning based on joint X- and Y-blocks) [17] partitioning of the dataset into two subsets, one for the model (85% of the whole dataset, used to calibrate the model) and one for the external validation test (the remaining 15%). Compared with the random partitioning method widely used in the literature, the SPXY approach assigns equal importance to the distribution of the samples within both the X- and Y-blocks; it returns more objective and replicable results and could find analytical applications involving complex matrices, in which the composition variability of real samples cannot be easily reproduced by optimized experimental designs; (6) application of different pre-processing algorithms to the X-block and Y-block; (7) application of the chemometric technique (PLS): modelling and testing; (8) calculation of the prediction efficiency parameters.
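Step (5) can be illustrated with a short sketch of SPXY-style partitioning: pairwise distances are computed separately in X and Y, each is divided by its maximum, the two are summed, and samples are then picked with a Kennard-Stone max-min rule. This is an illustrative reimplementation on synthetic data, not the code used in the study; the function name and data are hypothetical, and the full pairwise-difference array makes it suitable only for small datasets.

```python
import numpy as np


def spxy_split(X, y, cal_fraction=0.85):
    """Split sample indices into model and test sets with an SPXY-style rule."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(len(X), -1)
    n = len(X)
    n_cal = int(round(cal_fraction * n))
    # Pairwise Euclidean distances in X-space and Y-space
    dx = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    dy = np.linalg.norm(y[:, None, :] - y[None, :, :], axis=2)
    # Joint distance: both blocks get equal weight after normalisation
    d = dx / dx.max() + dy / dy.max()
    # Start from the two most distant samples
    i, j = np.unravel_index(np.argmax(d), d.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(n) if k not in selected]
    # Repeatedly add the sample farthest from its nearest selected neighbour
    while len(selected) < n_cal:
        min_d = d[np.ix_(remaining, selected)].min(axis=1)
        pick = remaining[int(np.argmax(min_d))]
        selected.append(pick)
        remaining.remove(pick)
    return np.array(selected), np.array(remaining)


# Synthetic example: 40 samples, 5 spectral variables
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 5))
y_demo = 2.0 * X_demo[:, 0] + rng.normal(scale=0.1, size=40)
cal_idx, test_idx = spxy_split(X_demo, y_demo)
print(len(cal_idx), len(test_idx))
```

Because the selection is deterministic given the data, the same split is obtained on every run, which is the replicability advantage the text attributes to SPXY over random partitioning.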
To divide the dataset into model (calibration and validation) and test subsets for the multivariate PLS analysis, the SPXY method [17] was used. This method employs a partitioning algorithm that takes into account the variability in both the X- and Y-spaces. To obtain the best prediction test, different X and Y pre-processing techniques were applied (Table 1), from the simpler ones (none, Log 1/R, diff1, mean centre, autoscale, median centre, baseline) to those more specific to spectral data (Savitzky-Golay, Multiplicative Scatter Correction, Orthogonal Signal Correction). Pre-processing of the X-block was applied both as a single pass and as a double pass, considering all possible combinations (e.g., Log 1/R + autoscale).
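As an illustration of a double-pass pre-processing such as Log 1/R + autoscale, a minimal sketch follows. It assumes the base-10 logarithm (the usual absorbance convention) and the sample standard deviation for autoscaling; the function names and reflectance values are hypothetical, and in practice the column means and standard deviations would be estimated on the calibration set and reused on the test set.

```python
import numpy as np


def log_inverse_reflectance(R):
    """Log 1/R transform (base 10 assumed) of a reflectance matrix."""
    return np.log10(1.0 / np.asarray(R, dtype=float))


def mean_centre(X):
    """Subtract the column (wavelength) means."""
    X = np.asarray(X, dtype=float)
    return X - X.mean(axis=0)


def autoscale(X):
    """Mean-centre each column and scale it to unit standard deviation."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)


# Hypothetical relative reflectance: 3 leaves x 3 wavelengths
R_demo = np.array([[0.10, 0.20, 0.40],
                   [0.12, 0.25, 0.35],
                   [0.08, 0.22, 0.45]])
X_pre = autoscale(log_inverse_reflectance(R_demo))  # double pass
print(X_pre.round(3))
```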
Prediction of leaf nitrogen concentration was performed by a PLS regression model, using PLS Toolbox in MATLAB V7.0 (The Math Works, Natick, MA, USA). PLS is a soft-modelling method [18] for constructing predictive models when the factors are many and highly collinear. The model works through a specific algorithm (SIMPLS) on the whole array of variables (input variables, X-block) and on the observed values (Y-block), after pre-processing. The model determines the minimum set of n estimation variables (LV, latent variables) by a recursive process. These variables can be represented in an n-dimensional space and are used by PLS to calculate the best regression matrix between X and Y. The calibration models were also validated using full cross-validation with Venetian blinds (Matlab rel. 7.1, PLSToolbox Eigenvector rel. 4.0). The model includes a calibration phase and a validation phase, and for both phases it calculates the residual errors (root mean square error in calibration, RMSEC, and in cross-validation, RMSECV). Modelling methods are subject to over-fitting, which occurs when a model is excessively complex, for example when it has too many parameters (LV) relative to the number of observations. Therefore, to avoid over-fitting, the model must be chosen so as to optimize the number of LV in relation to the efficiency parameters.
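The choice of the number of LV by cross-validation can be sketched as below. This is not the PLS Toolbox implementation: it uses a plain NumPy PLS1 fitted by NIPALS rather than SIMPLS (for a single response the two are generally reported to give the same predictions), a Venetian-blind split in which fold k takes every n_splits-th sample, and synthetic data. All function names and the number of splits are assumptions.

```python
import numpy as np


def pls1_fit(X, y, n_lv):
    """Fit a single-response PLS model by NIPALS; return coefficients and means."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_lv):
        w = Xr.T @ yr
        w = w / np.linalg.norm(w)      # weight vector
        t = Xr @ w                     # scores of the current latent variable
        tt = t @ t
        p = Xr.T @ t / tt              # X loadings
        c = (yr @ t) / tt              # y loading
        Xr = Xr - np.outer(t, p)       # deflate X
        yr = yr - c * t                # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, x_mean, y_mean


def rmsecv_venetian(X, y, n_lv, n_splits=5):
    """RMSECV with Venetian blinds: fold k holds samples k, k+n_splits, ..."""
    n = len(y)
    press = 0.0
    for k in range(n_splits):
        test = np.arange(k, n, n_splits)
        train = np.setdiff1d(np.arange(n), test)
        coef, x_mean, y_mean = pls1_fit(X[train], y[train], n_lv)
        pred = (X[test] - x_mean) @ coef + y_mean
        press += np.sum((y[test] - pred) ** 2)
    return np.sqrt(press / n)


# Synthetic collinear data: 60 samples, 20 variables driven by 2 latent factors
rng = np.random.default_rng(0)
T = rng.normal(size=(60, 2))
X_syn = T @ rng.normal(size=(2, 20)) + 0.05 * rng.normal(size=(60, 20))
y_syn = T @ np.array([1.0, -0.5]) + 0.05 * rng.normal(size=60)
rmsecv = {a: rmsecv_venetian(X_syn, y_syn, a) for a in range(1, 7)}
best_lv = min(rmsecv, key=rmsecv.get)
print(best_lv, {a: round(v, 3) for a, v in rmsecv.items()})
```

Taking the strict minimum of RMSECV is only one possible rule; a more parsimonious choice is the smallest number of LV after which RMSECV stops decreasing appreciably.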

Regression Analysis on SPAD and SAP Data
Different monovariate regressions (linear, power, exponential and logarithmic) were calculated between the petiole nitrate concentration measured by the SAP test and the total plant and leaf N concentrations measured by laboratory analysis. The same analyses were also performed between the chlorophyll concentration assessed by SPAD analysis and both the total plant and leaf N concentrations measured by laboratory analysis.
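The four monovariate forms can all be fitted by ordinary least squares after linearisation, as in the sketch below. The source does not state how the fits were computed, so the log-transform approach, the function name and the example values are assumptions; the correlation coefficient is computed in the transformed space, and x and y must be strictly positive.

```python
import numpy as np


def fit_monovariate(x, y):
    """Fit linear, power, exponential and logarithmic models via linearisation.

    linear:       y = a + b*x
    power:        y = a * x**b      (ln y = ln a + b ln x)
    exponential:  y = a * exp(b*x)  (ln y = ln a + b x)
    logarithmic:  y = a + b*ln(x)
    """
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    transforms = {
        "linear": (x, y, False),
        "power": (np.log(x), np.log(y), True),
        "exponential": (x, np.log(y), True),
        "logarithmic": (np.log(x), y, False),
    }
    results = {}
    for name, (u, v, log_intercept) in transforms.items():
        slope, intercept = np.polyfit(u, v, 1)
        a = np.exp(intercept) if log_intercept else intercept
        r = np.corrcoef(u, v)[0, 1]   # r in the transformed space
        results[name] = {"a": float(a), "b": float(slope), "r": float(r)}
    return results


# Hypothetical meter readings and leaf N (%), constructed to be exactly linear
readings = [30.0, 35.0, 40.0, 45.0, 50.0]
leaf_n = [0.5 + 0.08 * v for v in readings]
fits = fit_monovariate(readings, leaf_n)
for name, f in fits.items():
    print(name, f)
```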
The linear correlations between the chemically measured N concentration and the N concentration estimated through SPAD analysis were calculated separately for the three sampling periods (1st, 2nd and 3rd s.p.). To calculate the prediction efficiency, as in the chemometric analysis, the whole dataset was divided into model (calibration and validation) and test subsets by means of the SPXY method [17].

Predictive Accuracy of Models
The predictions obtained from the SPAD, SAP and PLS models in external validation subset were compared through linear regression analysis with the observed values.
Different accuracy parameters were extracted such as RMSE (Root Mean Square Error), SEP (Standard Error of Prediction) and correlation coefficient (r). The r was taken into consideration for distinguishing systematic errors and studying the correlation between the reference and predicted values. Generally, a good model should have high correlation coefficients r, low RMSE and SEP.
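A minimal sketch of these parameters follows. The source does not give the formulas, so the common chemometric definitions are assumed here: RMSE as the root of the mean squared error, SEP as the bias-corrected standard deviation of the errors (N − 1 in the denominator), and r as the Pearson correlation. The function name and example values are hypothetical.

```python
import math


def accuracy_parameters(observed, predicted):
    """Return RMSE, SEP (bias-corrected), bias and Pearson r for paired values."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    bias = sum(errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    sep = math.sqrt(sum((e - bias) ** 2 for e in errors) / (n - 1))
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    s_op = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    r = s_op / math.sqrt(s_oo * s_pp)
    return {"rmse": rmse, "sep": sep, "bias": bias, "r": r}


# Hypothetical leaf N values (%): predictions offset by a constant 0.1
obs = [3.0, 3.5, 4.0, 4.5]
pred = [3.1, 3.6, 4.1, 4.6]
stats = accuracy_parameters(obs, pred)
print(stats)
```

In this example the correlation is perfect and SEP is essentially zero although RMSE is 0.1, which illustrates why r alone cannot reveal a systematic offset between predicted and observed values.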
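Three other parameters were calculated following Gauch et al. [19]: squared bias (SB), nonunity slope (NU) and lack of correlation (LC). They are defined in formulae (1), (2) and (3) as follows:
$$SB = (\bar{X} - \bar{Y})^2 \qquad (1)$$
$$NU = (1 - b)^2 \, \frac{1}{N} \sum_{n=1}^{N} (x_n - \bar{X})^2 \qquad (2)$$
$$LC = (1 - r^2) \, \frac{1}{N} \sum_{n=1}^{N} (y_n - \bar{Y})^2 \qquad (3)$$
where $x_n$ are the model-based predicted and $y_n$ the measured values respectively, $\bar{X}$ and $\bar{Y}$ their means, $N$ the number of observations in the validation test, $b$ the slope of the least-squares regression of $y$ on $x$, and $r^2$ the square of the correlation.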
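The three components can be computed as in the sketch below, which follows the definitions of Gauch et al. [19]: SB is the squared difference of the means, NU is (1 − b)² times the mean squared deviation of the predicted values from their mean, and LC is (1 − r²) times the mean squared deviation of the measured values from their mean, with b the slope of the regression of measured on predicted. The function name and the example values are hypothetical.

```python
import numpy as np


def msd_components(predicted, measured):
    """Return SB, NU, LC and the mean squared deviation (MSD)."""
    x = np.asarray(predicted, dtype=float)
    y = np.asarray(measured, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()
    sxx, syy, sxy = np.mean(dx ** 2), np.mean(dy ** 2), np.mean(dx * dy)
    b = sxy / sxx                     # slope of measured regressed on predicted
    r2 = sxy ** 2 / (sxx * syy)       # squared correlation
    sb = (x.mean() - y.mean()) ** 2   # translation
    nu = (1.0 - b) ** 2 * sxx         # rotation
    lc = (1.0 - r2) * syy             # scatter
    msd = np.mean((x - y) ** 2)
    return sb, nu, lc, msd


# Hypothetical predicted and measured leaf N values (%)
pred_n = [3.2, 3.6, 4.1, 4.4, 4.9]
meas_n = [3.0, 3.8, 4.0, 4.6, 4.7]
sb, nu, lc, msd = msd_components(pred_n, meas_n)
print(round(sb, 4), round(nu, 4), round(lc, 4), round(msd, 4))
```

The three terms sum to the mean squared deviation, so each can be read as the share of the total prediction error due to translation, rotation and scatter.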
In a perfect prediction, i.e., on the 1:1 line of equality Y = X, SB, NU and LC are all equal to 0, while NU > 0 whenever b ≠ 1. In the accuracy analysis, SB is a good indicator of translation, NU of rotation and LC of the scatter about the correlation line [19].
In the figure, the three thresholds extracted from the critical-N curve proposed by Tei et al. [9] are also reported. These thresholds indicate the optimum N concentration of the plants depending on the phenological stage of the tomato crop. Approximately 80% of the samples were below the critical threshold of leaf N concentration during the first two sampling periods, and ~65% were below the threshold in the 3rd sampling period.

Results
The r values of the linear correlations between the chemically measured N concentration and the N concentration estimated through SPAD analysis, calculated separately for the three sampling periods (1st, 2nd and 3rd s.p.), were very low: 0.5, 0.2 and 0.3 respectively. Table 2 reports the results of the linear regression for the model and the test performed on the SPAD and SAP values. The correlation coefficient (r) of the test is 0.56 for both the SPAD and the SAP test. The SEP and RMSE values are slightly lower for the SPAD analysis (0.72 and 0.64 respectively). Table 2 also reports the squared bias (SB), the nonunity slope (NU) and the lack of correlation (LC) for the SPAD and SAP tests, which are respectively: SB = 0.12 and 0.12; NU = 0.0003 and 0.05; LC = 0.48 and 0.48.
Table 2. Results of linear regression prediction of N concentration in tomato leaves from SPAD (chlorophyll meter readings) and SAP (measurements of N-NO3 concentration in petiole) analysis. Efficiency parameters reported: correlation coefficient (r), Standard Error of Prediction (SEP), Root Mean Square Error (RMSE), squared bias (SB), nonunity slope (NU) and lack of correlation (LC).
Figure 3 shows the correlation between measured and predicted N values by SPAD and SAP analysis in the test set, i.e., the 15% of the whole sample dataset extracted by the SPXY method. Table 3 reports the values and results of the PLS models and test predictions of N concentration in tomato leaves from spectral reflectance analysis on the four spectral datasets (W, R1, R2 and R3).
The best model was R3, which uses only the central part of the spectra (496-694 nm). This model first applies a Log 1/R pre-processing to the X-block and then an SNV (standard normal variate) pre-processing to the already pre-processed X-block. The Y-block was not pre-processed. The r value of the test is very high: 0.94 (Table 3).
The SEP and RMSE values of the test are also very low (0.35 and 0.40 respectively). The prediction ability of the model proved to be high: a SEP of 0.43 and an RMSE of 0.43 indicate that predictions were on average within 0.43% N of the measured values. Moreover, the predictive accuracy parameters are: SB = 0.05, NU = 0.0188 and LC = 0.09.
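As a consistency check, and assuming that the three components above refer to the same external test set, the additive partition of the mean squared deviation given by Gauch et al. [19] (MSD = SB + NU + LC, with RMSE as its square root) can be worked through with the reported values:

```latex
\mathrm{RMSE} = \sqrt{SB + NU + LC}
             = \sqrt{0.05 + 0.0188 + 0.09}
             = \sqrt{0.1588}
             \approx 0.40
```

This agrees with the test RMSE of 0.40 reported above.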

Discussion and Conclusions
Usually N nutrition is determined by leaf chemical analysis, which has some disadvantages that limit its use, such as the length of sampling time, the use of hand labour, the need for specialized equipment and high cost [20]. Thus, according to Guimarães et al. [21], alternative methods are required that use portable gauges and permit faster, non-destructive diagnosis and monitoring of plant N nutrition in the field.
In this study the efficiency of estimating the N concentration of tomato leaves with a portable VIS-NIR spectrophotometer and chemometric procedures was always higher than that obtained with SPAD chlorophyll meter readings and SAP tests, as shown by the parameters r (0.94 vs. 0.56), SEP (nearly 40% lower), RMSE (nearly 35% lower), SB (0.05 vs. 0.12), NU (0.0188 for VIS-NIR vs. 0.0003 for SPAD and 0.0517 for the SAP test) and finally LC (0.09 for VIS-NIR vs. 0.48 for the SPAD and SAP tests). The N nutritional state of plants may be determined indirectly from the chlorophyll concentration in the leaves, as it is directly related to their N concentration. Many studies have found a high correlation between N and chlorophyll, because pigments determine most spectral features between 400 nm and 700 nm [22,23]. This was confirmed in this work: a restricted spectral dataset (R3 = 496-694 nm), which corresponds to the spectral range of chlorophyll, was highly correlated with the analysed leaf N concentration. Similar results were obtained by Min et al. [24], who detected the leaf N concentration of Chinese cabbage using VIS and NIR spectroscopy in combination with PLS regression, obtaining r = 0.92. The most significant wavelength correlated with chlorophyll was identified at 710 nm, but wavelengths near 550 and 840 nm also contributed to N prediction, as in our study.
Esposti et al. [25] reported the use of a SPAD chlorophyll meter for the multi-parametric estimation of chemical compound concentrations in leaves; they successfully estimated only N. However, the ability of this method to monitor crop N status in the field has been significantly enhanced by recent work analysing leaves of rice [26], corn [27] and cotton [28]. The same holds for the SAP test, which has been developed to measure nutrient concentrations in a number of vegetable crops including potato [29], tomato [30], cabbage [31], cauliflower [32] and capsicum [33]. Both the SPAD chlorophyll meter and the SAP test are inexpensive and give rapid results, whose accuracy mainly depends on the type, variety and phenological stage of the crop. The time of day at which petioles are collected for the SAP test is important, because SAP nutrient concentrations can show diurnal variation, as observed in beets [34] and also in tomato plants [35]. In addition, while plant N concentration declines with crop biomass accumulation, the N concentration per unit leaf area within the upper layer of the canopy would remain more or less constant [12]; moreover, a vertical gradient in the canopy N concentration can be observed [36]. The SAP test may therefore not be able to show a decrease in plant N accumulation during the crop cycle, since petioles are always collected at the top of the plants. Although many other factors can affect petiole nitrate concentration, such as cultivar, temperature and solar radiation [37], the petiole SAP test proved to be a reliable diagnostic tool for about 2/3 of the crop cycle (i.e., until the end of the linear growth phase), the period that really matters for N fertilizer management in processing tomato.
The tomato crop analyzed in this work was in an N-deficient condition compared with the critical N curve presented by Tei et al. [9], especially until the early flowering period. Therefore, the efficiency of the SPAD and SAP tests in estimating the N concentration of tomato leaves, even when the sampling periods were considered separately, was always lower than that reported in the literature. This could probably depend on the N deficiency of the crop with respect to the critical N curve [9]. The limited number of samples with a leaf N concentration exceeding the critical thresholds may have limited the predictive power of both tests (SPAD and SAP).
In this study the use of the portable VIS-NIR spectrophotometer, by increasing efficiency, allowed better knowledge of the nutritional status of tomato plants, with more detailed and precise information over wider areas. More detailed information in both space (increase in detail) and time (the system allowed spectral measurements to be performed with an acquisition time of 2 s per leaf, for 500-800 leaves and 100-150 plants) is an essential tool to increase and stabilize crop quality levels and to optimize nutrient use efficiency, mainly in low-input production models.