Assessing the Robustness of Vegetation Indices to Estimate Wheat N in Mediterranean Environments

Remotely sensed vegetation indices have been extensively used to quantify plant and soil characteristics. The objectives of this study were to: (i) compare vegetation indices developed at different scales for measuring canopy N content (g∙N∙m −2 ) and concentration (%); and (ii) evaluate the effects of soil background reflectance, cultivar, illumination and atmospheric conditions on the ability of vegetation indices to estimate canopy N content. Data were collected from two rainfed field sites cropped to wheat in Southern Italy (Foggia) and in Southeastern Australia (Horsham). From spectral readings, 25 vegetation indices were calculated. The Perpendicular Vegetation Index showed the best prediction of plant N concentration (%) (r 2 = 0.81; standard error (SE) = 0.41%; p < 0.001). The Canopy Chlorophyll Content Index showed the best predictive capability for canopy N content (g∙N∙m −2 ) (r 2 = 0.73; SE = 0.603; p < 0.001). Canopy N content was best related to indices developed at the canopy scale and containing a red-edge wavelength. Canopy-scale indices were related to canopy N%, but such relationships needed to be normalized with biomass. Geographical location influenced mainly simple ratio or normalized indices, while indices that contained red-edge wavelengths were more robust and able to estimate canopy parameters more accurately. OPEN ACCESS Remote Sens. 2014, 6 2828


Introduction
Vegetation indices (VIs) based on spectral reflectance have been used in agricultural research for finding functional relationships between canopy characteristics and remote sensing observations for nearly four decades [1,2].Most indices for detecting canopy chemical component (e.g., nitrogen, chlorophyll) have been developed at the leaf scale, because it is the first step for further up-scaling to the canopy level [3].In this case, indices are developed by using leaves of the same species, leaves from different crops or by comparing several indices with a large simulated database [4,5].
At the leaf level, spectral reflectance is a function of the chlorophyll absorption, internal structure, leaf thickness, air-water interface, distribution of pigments and chemical constituents [6][7][8], as well as leaf surface properties, such as waxy cuticle, and pigment concentrations and distribution [9].At the canopy level, it is a function of the Leaf Area Index (LAI), leaf clumping, leaf angle distribution, vegetation cover, soil background and source-target illumination geometry [10][11][12][13][14].As a consequence, a vegetation index performing satisfactorily when estimating nitrogen content at the leaf level might perform poorly for the estimation at the canopy level, because the factors that affect crop reflectance vary according to scale [15] and may be confounding at different scales.At canopy level, many of the factors mentioned above (e.g., vegetation cover, clumping and leaf angle distribution) change remarkably as a function of the crop phenological stage.Early in the growing season, the effects of low canopy cover resulting in a greater soil exposed to the sensor can make it difficult to isolate the plant signal from the soil reflectance, affecting the ability of indices to detect canopy nutritional content [16].Later in the season, LAI values of three or more cause some VIs to lose sensitivity for measuring canopy nutritional content [17].For example, [18] found a high correlation between leaf nitrogen and leaf optical properties (r 2 = 0.90).However, at the canopy level, the correlation between canopy nitrogen concentration and canopy reflectance decreased as a function of LAI.
Assessing the response of VIs developed at different scales for canopy N status across locations is challenging due to differences in canopy cover, the scale at which indices have been developed, different plant structures and biomass accumulation.To improve the quality of the estimation of early canopy nutritional response in rainfed environments where the canopy rarely reaches a complete cover, these factors need to be accounted for.
Improving the ability of remotely sensed indices to map canopy N near the time farmers decide to apply mid-season N fertilizer could help improve N management in rainfed environments.Quantification of canopy nutritional content after the time for mid-season application would result in a limited opportunity for farmers to adapt their fertilizer management tactics.Thus, the objective of N sensing for agricultural management should make use of indices that are accurate at the critical time of the season when farmers make management decisions [19].
In this study, we hypothesized that the scales (leaf vs. canopy) of the indices influenced the correlations between canopy nutritional status and Vis.Furthermore, the method of expressing canopy N (N concentration (%) vs. N (g/area)) will impact the relationships developed and their usefulness for N management.
The objectives of this study were to: (i) compare the response of VIs developed at different scales for measuring canopy N content (g•N•m −2 ) and concentration (%); (ii) evaluate the effects of soil background reflectance, cultivar, illumination and atmospheric conditions on the ability of VIs to estimate canopy N content; and (iii) assess the utility of the relationships developed for quantifying canopy N content at the time farmers need to make decisions for mid-season N applications.

Experiment Description
Data were collected from two rainfed field sites cropped to wheat, one in Southern Italy (Foggia) and the other in the south eastern wheat belt of Australia (Horsham).Data were collected during the growing season in 2006/2007 (December to June) for Foggia and during the 2007 (June to December) growing season for Horsham.The two growth stages targeted for both sites were the pseudo-stem elongation (Z30) and anthesis (Z65) [20].

Field Experiment in Foggia, Italy
The experiment was carried out at the Cereal Research Center, Foggia, Italy (41°28''N, 15°32''E; 75 m above sea level).Durum wheat (Triticum Durum, Desf.; cv.Ofanto) was sown on 10 December 2006 (DOY 344) on a north-west to south-east direction with a row spacing of 17 cm.The soil was a clay-loamy soil according to the USDA particle-size distribution limits (Black Vertisol).The experiment was part of a long-term (17 years) continuous wheat system, with two levels of nitrogen (0 and 90 kg•N•ha −1 ).The experimental design consisted of two plots in a completely random design, separated from each other by 5 m of bare soil.One plot received 90 kg•N•ha −1 as a split application for 17 years: one application at sowing with 25 kg•N•ha −1 as diammonium phosphate and the other at pseudo-stem elongation (Z30) with 65 kg•N•ha −1 as urea.The second plot had not been fertilized for 17 years.Each plot was divided into five sub-plots (each 6 × 20 m).Inside each sub-plot, measurements were collected at five different, 1-m −2 areas.Initially, non-destructive measurements were performed (remote sensing and LAI), then the plants were collected for determination of biomass and N. Two of those five areas were used for destructive determination of biomass content and canopy N content on 28 February 2007 (Z30), and 6 April 2007 (Z65), while the other three were harvested at the end of the growing season.Canopy and soil reflectance were measured using a FieldSpec ® HandHeld Pro portable spectroradiometer [21].The instrument's spectral range is from 325 to 1075 nm with a 10-nm bandwidth.With a sensor field of view (FOV) of 25°, the instrument was held at 1.5-m above the soil, and multiple measurements were made to overcome the problem of fixed height above the soil.The measurements were made under clear sky conditions around midday.All spectral measurements were converted to reflectance by referencing a 99% Spectralon [22] panel at various times during each sample date.
Leaf Area Index (LAI) was measured non-destructively with a portable LI-COR LAI 2000 [23] at the same five locations inside each sub-plot and on the dates mentioned above.
Aboveground dry biomass was determined by drying the fresh samples in an oven at 65 °C for 48 h.Canopy N concentration was measured on the whole plant with the Kjeldahl method and, canopy N content (g•N•m −2 ) was obtained by multiplying the N concentration (%) by dry weight.

Field Experiment in Horsham, Australia
The Horsham field experiment was located at the Victorian Department of Primary Industries, Crop Breeding Centre (36°45′S, 142°6′E; 120 m above sea level).Wheat (Triticum aestivum L., cultivar Yitpi) was sown on 18 June 2007, in a north to south direction with a row spacing of 21.7 cm on Horsham clay, a Grey Vertisol [24].
Measurements were taken on plots belonging to the AGFACE (Australian Grains Free Air CO 2 Enrichment) experiment [25].The statistical design was a randomized complete block (4 replications) with CO 2 (550 ppm and ambient 370 ppm) injected through rings around the plots and irrigation (rainfed and supplemental) randomized at the ring level.Time of sowing was randomized at the half ring level.Within each half ring, there were six, 1.7 × 4-m sub-plots randomized for cultivar (Yitpi and Janz) and nitrogen (0 and 80 kg•N•ha −1 ).The Janz variety was not sampled or considered in this study.The N was split with 30 kg•N•ha −1 as urea at sowing and 50 kg•N•ha −1 as urea at Z30.A FieldSpec ® Pro portable spectroradiometer [21] was used to measure reflected light from the canopy and soil on 6 September 2007 (Z30), and 29 October 2007 (Z65).The spectral range of the radiometer ranged from 350 to 2500 nm, and the sensor FOV was 25°, with measurements collected in all 12 plots for each ring at 2-m above the soil.Multiple measurements were made following the same procedure carried out in Italy.LAI was measured with an LI-3100 [23].
Three of the six plots in each half ring were used for destructive samplings, at Z30 and Z65, with the remaining three used for yield determination. Destructive measurements included removing crops to determine aboveground biomass, total canopy N content (g•N•kg −1 ) and N concentration (%).

Vegetation Indices
For this study, we calculated 24 VIs (Table 1), which were divided as functions of two parameters, the scale at which they were developed (leaf, canopy, regional), and the target variable (e.g., biomass, LAI, yield, nitrogen and chlorophyll).However, the main reason for selecting these indices was to allow a search for indices used for the estimation of canopy N concentration (N%) and content (g•N•m −2 ) that were robust across crop locations and growing conditions.The wavelength used to derive the indices, the formulae and references [16,[26][27][28][29][30][31][32][33][34][35][36][37][38][39][40][41][42][43][44] are reported in Table 1.Some of the VIs need information regarding the bare soil reflectance, which was determined for each site at the same time of the other measurements on a bare soil plot near the main plots.The spectral reflectance of the soil near-infrared (NIR) and red were linearly combined to obtain the soil line equation [10].For the Australian site, the soil line was NIR Australia = 1.06 × red + 0.03; for the Italian site, the soil line equation was the following: NIR Italy = 1.32 × red + 0.003.

Statistical Analyses
Regression analyses were performed to find functional relationships between each of the VIs (Table 1) and canopy N concentration and N content for the two sites and for two different developmental stages: Zadoks Z30 and Z65 [20].
The indices were analyzed by fitting several models, and the results of the model fitting were evaluated by studying the rate of change, the coefficient of determination, r 2 , and the standard error of the estimation.From the standard error of this estimate, the confidence limits for the true value can be eventually calculated for any probability level, and the values of the slopes of the regressions can be compared in the literature [45].
For the Italian and Australian sites, the correlations between VIs and crop parameters were obtained by considering the treatments 0 N and 90 N together in order to capture the variation in spectral response expected.The relationship obtained between VIs and N was tested using the k-fold cross-validation.Cross-validation is a technique for assessing how the results of a statistical analysis will generalize to an independent dataset and how accurately a predictive model will perform [46].The model is first tested on a known dataset (training dataset) and then evaluated on an unknown dataset (testing dataset).The k-fold cross-validation means that the dataset is randomly divided into k equal-sized sub-datasets.From these k sub-datasets, k − 1 sub-datasets are used as training dataset, and the single k sub-dataset is used as the validation dataset for testing the model.Then, the cross-validation process is then repeated k times, with each of the k sub-datasets used as the validation dataset.The results from each of the iterative processes are combined to produce a single estimation [46].In this way, all the data are used for both training and validation, and each single observation is used for validation exactly once.In this study, we used the 10-fold (k = 10) cross-validation on the combined dataset at the Z30 stage occurrence known to be critical to quantify canopy N content.
From the results of the cross-validation, the ability of index estimation was evaluated by examining the cross-validation estimate of prediction error (mean square (MS)), which is a corrected measure of the prediction error averaged across the k.In addition, the root mean square error (RMSE) was reported on the graph showing measured vs. simulated plant N content (g•N•m −2 ).The MS and the RMSE are calculated as follows: (1) (2) where, = the measured value , = the simulated value and N = the number of pairs of measured and simulated values.Statistical and regression analyses were performed using GENSTAT 10th edition [47]; the cross-validation was performed using the DAAG package [48].The results of cross-validation were evaluated by studying the rate of change, b (slope), the coefficient of determination, r 2 , and the coefficient of determination obtained by the cross-validation and the MS of the prediction error.

Results and Discussion
Biomass values at pseudo-stem elongation growth stage Z30 [20] were 104.5 g•m −2 for the Italian site and 52 g•m −2 for the Australian site for the 0 N treatments.The biomass values for the 90 N treatments were 191.2 g•m −2 for the Italian site and 52.6 g•m −2 for the Australian site (Tables 2 and 3).Later in the season, they showed the opposite behavior, with the Australian site having higher values.LAI values were higher for the Italian site at both growth stages (Table 3).Canopy N concentration (%) was higher for the Australian site at the Z30 stage, the values ranging between 2.3% and 5.07% for the 0 and 90 N against N% ranging between 1.8% and 3.01% for the Italian site.As biomass increases, canopy N% decreases, and this process is known as N dilution [49].To illustrate the concept of N dilution at different growth stages, the canopy N concentration (%) was plotted against canopy biomass (g•m −2 ) at the Z30 stage and the anthesis (Z65) stage for both sites.The upper and lower limit of the relationship between biomass and canopy N% was calculated using the algorithms developed by [50].In Figure 1, the canopy N% were related to canopy biomass accumulation; the values of N% of both locations fell within these limits, except for three points falling out of the upper limit at Z65 (Figure 1).

Figure 1.
The relationship between canopy nitrogen (N) concentration (%) and biomass (g•m −2 ) for Australia and Italy at the pseudo-stem elongation (Z30) and anthesis (Z65) stages.The N% max and N% min were calculated following the approach of [50].AUS, Australia; ITA, Italy.
The results of the relationships between VIs and canopy N (%) and N content (g•N•m −2 ) at Z30 are shown in Tables 4 and 5, respectively.The Perpendicular Vegetation Index (PVI) showed the highest coefficient of determination of canopy N (%) at Z30 (r 2 = 0.81; SE = 0.42%; p < 0.001) followed by the Visible Atmospherically Resistant Index (VARIgreen) (r 2 = 0.78; SE = 0.44%; p < 0.001), both showing a similar ability to predict N%.The results of the cross-validation for these indices indicate that they are good estimators of canopy N% with a mean square (MS) of 0.17 and 0.20%, respectively (Table 4).The first six indices in the ranking were all developed at the canopy scale.
Canopy N (g•N•m −2 ) at Z30 showed the best relationship with CCCI (r 2 = 0.73; SE = 0.60; p < 0.001), GI also showed good predictive ability but with a lower r 2 and higher standard error (r 2 = 0.63; SE = 0.71; p < 0.001) than the Canopy Chlorophyll Content Index (CCCI).The results of the cross-validation show that the CCCI was the best index in predicting canopy N content (g•m −2 ), with an MS of 0.37 g•m −2 , while the others had an MS of 0.51 g•m −2 or greater (Table 5).Overall, all the indices showed high standard errors in calibrating the models, although they showed the good predictive ability of canopy N content after performing the cross-validation, except indices like Gitelson 1 (GIT 1), Ratio Analysis of Reflectance Spectra (RARSb), GIT 3 and DATT 1, which showed high MS (Table 5).The relationship between measured canopy N content (g•N•m −2 ) and predicted canopy N content (g•N•m −2 ) with the CCCI using the cross-validation early in the season at Z30 is shown in Figure 2. At this growth stage, the points from the two locations fall in the same data space, demonstrating the good predictive ability of the CCCI for canopy N content (RMSE = 0.60 g•m −2 ) independent of canopy differences.At anthesis (Z65), the results of cross-validation show a lack of correlation with canopy N concentration and N content (data not shown).Canopy N concentration (%) was poorly predicted, while canopy N content (g•N•m −2 ) was predicted with lower accuracy by all the indices, with the CCCI showing the best N content estimation at this stage (MS = 14.56 g•N•m −2 ).
The good predictive ability of canopy N content by the CCCI is due to its reduced sensitivity to other factors that might influence the N signal from the canopy and the relationship between the index and canopy biomass at Z30.The relationship between CCCI, LAI and canopy biomass is shown in Figure 3a,b, with the Italian and Australian data falling in the same data space and showing a linear distribution.Later in the season, as biomass increased, the CCCI values reached a saturation level around 0.8 (Figure 3a).The comparison of VIs for canopy N content (g•N•m −2 ) and canopy N concentration (%) showed that indices originally derived at the canopy-scale indices performed better than those derived at the leaf-scale indices.In addition, red-edge-based or indices that minimize the soil reflectance showed the best performance, independent of location.These indices would include canopy-scale factors, such as shadows, soil background, variable illumination, etc., and, therefore, would be more representative of canopy conditions.
Canopy N estimation at Z30, the stage at which farmers apply mid-season N fertilizer, could significantly improve the opportunity for farmers to intervene with tactical fertilizer management.Therefore, a suitable index would be one that is not affected by the location, but that can be used across environments to measure canopy N content without any ground sampling [51,52].This would require taking into account the confounding effects of soil background, cultivar, canopy architecture, illumination and atmospheric conditions, accounting for canopy biomass, N dilution and particularly, the measurement scale.If the proper wavebands are available in satellites or on aircraft, this could be used to map N content across farmer fields without calibrations, allowing them to target N applications.
Among all the VIs studied, the CCCI was the best index for robustly estimating canopy N content (g•N•m −2 ).In fact, the CCCI was the only index to measure both N% and N content with r 2 values of about 0.7.It also showed a linear relationship with biomass and LAI at Z30 (Figures 2 and 3).This means that at Z30, the effects of the different location and cultivar type do not play as major confounding effects on the ability of the index to sense the biomass and LAI patterns.The CCCI is a two-dimensional index calculated from the Normalized Difference Red Edge (NDRE) as a surrogate of N% and the NDVI as a surrogate of canopy cover [52].The results of this study agree with previous findings [19,53,54] that the CCCI, which uses a two-dimensional approach to calculating canopy N, is the best multispectral predictor of N content in wheat.It is also interesting to notice that the VARIgreen, which was developed for the regional estimation of crop conditions, showed good correlation with either N concentration (%) or N content (g•N•m −2 ).It is calculated from visible bands (Table 1) and has been found to be a good indicator of the vegetation fraction [31]; since canopy response to N is a response to canopy cover (and vegetation fraction), this could explain why this is a good estimator of canopy N content.Similarly, the CCCI incorporates the NDVI in the two-dimensional method to account for canopy cover.
The vegetation fraction can be indirectly evaluated using LAI measures.The Italian site, for the 90 N treatments at Z30, showed LAI values of 2.7, while the Australian counterpart showed lower values.At LAI values higher than 2.5, the VIs based on visible and NIR reflectance "saturate", making the index insensitive to further changes in canopy biomass accumulation [55].The CCCI, on the other hand, showed good correlation with biomass and LAI, because it is based on a two-dimensional approach that compensates for an increasing of canopy biomass (Table 1).It also includes the red-edge, which is sensitive to slight changes in canopy N. The red-edge is a narrow region between the visible and NIR of the spectrum-boundary between chlorophyll absorption (red) and leaf scattering (NIR) [56].
It is found [53] that in rainfed environments, there is the need to normalize N (%) as a function of canopy biomass before finding a functional relationship for canopy N estimation.This approach provides actual canopy N amounts, which is more useful as a basis for making N fertilizer input decisions than N% alone.The decline of canopy N% is correlated with aboveground biomass accumulation, independent of weather, species and genotype, and this effect is called the "dilution effect" [49].Such a relationship is a function of canopy N uptake, crop growth rate and carbon (C) and N allocation between crop organs, which are all physiological processes [57,58].The goal of developing a remote index that can be used across locations and seasons would need to account for crop growth stage, relationships to canopy N physiology and biomass [51,53].
The findings presented here agree with the literature [53], where the CCCI was better correlated with canopy N (g•N•m −2 ) than N (%), because the CCCI can account for canopy-scale variations, especially at the critical stages near Z30, when farmers make fertilizer application decisions.In this study, we did not test the NDVI directly.The NDVI has been shown to be a good estimator of LAI or canopy biomass [1,11].The aim of this study was to test the validity of the indices for early canopy N discrimination.However, the NDVI is indirectly included into the CCCI calculation as a surrogate of canopy cover [52].
Later in the season, at Z65, the relationship between all the indices and canopy N% did not hold, probably because later in the growing season, the canopy is mostly senesced, with predominantly wheat ears visible, which have a different spectral response.In addition, canopy N content was poorly estimated after Z30, because VIs saturate at high biomass values [19].Thus, the use of the indices presented here should be targeted during early season vegetation growth, before the flowering stage, in order to target N input applications.
Ideally, the choice of an index for canopy N measurement will not depend on the geographical location, where measurements are made.Biomass has the greatest influence on the determination of canopy N content and N uptake; therefore, any index that estimates biomass and changes in N at the canopy scale will have the best chances of being robust across locations and variable crop conditions.The indices chosen here that were originally developed at the canopy scale presumably incorporate those canopy scale factors in their relationships to canopy N (e.g., soil background, internal canopy light scatter, differences in leaf angles and shadows), and they relate more strongly to field-based canopy characteristics.

Conclusions
Canopy N content (g•N•m −2 ) in wheat in Mediterranean environments was best related to indices developed at the canopy scale and contain a red-edge wavelength.Biomass accumulation influenced mainly simple ratio or normalized indices, while indices that contained red-edge wavelengths were more able to predict canopy N parameters.Among all the vegetation indices studied, the CCCI was the best index for robustly estimating canopy N content (g•N•m −2 ), because it is based on a two-dimensional approach that compensates for an increase in canopy biomass, and it includes the red-edge, which is sensitive to slight changes in canopy N. Recent findings [54] corroborate the two-dimensional index method for a robust multispectral solution to estimating canopy N in wheat.Further research is needed to validate the approach in other crops and multiple locations.

Figure 2 .Figure 3 .
Figure 2. Canopy N (g•N•m −2 ) measured and predicted with the Canopy Chlorophyll Content Index (CCCI) for the Australian (full symbols) and Italian (open symbols) site at Z30 using the cross-validation method.The line represent the 1:1 relationship.

Table 1 .
Vegetation indices used in this study.NDRE, Normalized Difference Red Edge; LAI, Leaf Area Index; NIR, near-infrared.
a treatment means; b standard error of the mean (n = 6).

Table 3 .
Mean wheat biophysical variables from Horsham (Australia) for the 2007 growing season at stages Z30 and Z65 for the cultivar "Yitpi".
a treatment means; b the standard error of the mean (n = 4).

Table 4 .
Calibration and validation of the estimation of plant N concentration (%) by vegetation indices for the pooled Australian and Italian site (AUS + ITA) at Z30. Regressions were significant at p < 0.001.Indices are ordered by the coefficient of determination of the cross-validation (CV r 2 ).Coefficient of determination of the regression; b standard error (%) of the regression; c slope of the regression line; d coefficient of determination from the evaluation obtained through cross-validation (CV); e estimate of the prediction error (MS). a

Table 5 .
Calibration and validation of the estimation of plant N content (g•m −2 ) by vegetation indices for the pooled Australian and Italian site (AUS + ITA) at Z30. Regressions were significant at p < 0.001.Indices are ordered by the coefficient of determination of the cross validation (CV r 2 ).Coefficient of determination of the regression; b Standard Error (%) of the regression; c Slope of the regression line; d Coefficient of determination from evaluation obtained through cross-validation (CV); a e Estimate of the prediction error (mean square (MS)).