Estimation of Leaf Area Index Using DEIMOS-1 Data : Application and Transferability of a Semi-Empirical Relationship between two Agricultural Areas

This work evaluates different procedures for the application of a semi-empirical model to derive time-series of Leaf Area Index (LAI) maps in operation frameworks. For demonstration, multi-temporal observations of DEIMOS-1 satellite sensor data were used. The datasets were acquired during the 2012 growing season over two agricultural regions in Southern Italy and Eastern Austria (eight and five multi-temporal acquisitions, respectively). Contemporaneous field estimates of LAI (74 and 55 measurements, respectively) were collected using an indirect method (LAI-2000) over a range of LAI values and crop types. The atmospherically corrected reflectance in red and near-infrared spectral bands was used to calculate the Weighted Difference Vegetation Index (WDVI) and to establish a relationship between LAI and WDVI based on the CLAIR model. Bootstrapping approaches were used to validate the models and to calculate the Root Mean Square Error (RMSE) and the coefficient of determination (R) between measured and predicted LAI, as well as corresponding confidence intervals. The most suitable approach, which at the same time had the minimum requirements for fieldwork, resulted in a RMSE of 0.407 and R of 0.88 for Italy and a RMSE of 0.86 and R of 0.64 for the Austrian test site. Considering OPEN ACCESS Remote Sens. 2013, 5 1275 this procedure, we also evaluated the transferability of the local CLAIR model parameters between the two test sites observing no significant decrease in estimation accuracies. Additionally, we investigated two other statistical methods to estimate LAI based on: (a) Support Vector Machine (SVM) and (b) Random Forest (RF) regressions. Though the accuracy was comparable to the CLAIR model for each test site, we observed severe limitations in the transferability of these statistical methods between test sites with an increase in RMSE up to 24.5% for RF and 38.9% for SVM.


Introduction
In the recent years, the use of satellite sensor data has become more common in precision agriculture technologies [1], and various operational services are being developed within the Global Monitoring for Environment and Security (GMES) initiative [2].In this context, the data users (i.e., farmers, large and small scale agri-businesses) are mostly interested in monitoring the spatial distribution of some crop characteristics over the growing season and time-series of satellite acquisitions at high spatial resolution are a major source of information.
A key vegetation parameter attracting most interest is the Leaf Area Index (LAI), defined as the total one-sided area of green leaf area per unit ground surface area [3].LAI is used to derive agronomical indicators for various crop management purposes.For instance, LAI maps are used in agro-meteorological models to derive the crop water needs (an example of operative application is given in Irrisat) [4], to monitor the nitrogen status and to apply fertilizer with variable rates (e.g., FarmSat), as input in crop models to derive agronomical variables [5,6].On a larger scale, LAI and other biophysical variables are used for example for yield predictions at administrative level [7][8][9].A general overview of remote sensing contributions to agriculture is given in [10].
Two groups of techniques have been commonly applied for the estimation of the LAI from optical satellite sensor data using semi-empirical/statistical approaches (i.e., vegetation indices, VI) or physical based approaches of leaf-canopy radiative transfer model (RTM) inversion [11,12].Most of the empirical or statistical equations, such as regressions between spectral reflectance, vegetation indices (VI) or shape indices (e.g., red edge) and field measurements [13][14][15], employ data in two or more wavebands, usually red and near-infrared [16,17].VIs are often the only option for the retrieval of LAI with limited spectral information (such as in the case of DEIMOS-1 data with only three spectral bands).
A prerequisite for the quantitative analysis of time-series of satellite sensor data is to perform radiometric and atmospheric corrections [18], if possible using reliable instantaneous atmospheric measurements (such as aerosol optical thickness, water vapor content) and/or the spectral reflectance of known ground targets either derived from ground measurements (surface and/or atmospheric conditions) or from consolidated library data [19,20].Several approaches have been proposed for performing atmospheric corrections.An operative procedure is based on the use of look-up-tables (LUT) with pre-calculated atmospheric RTM simulations for different satellite sensor types [20,21].However, model assumptions and simplifications often lead to inaccuracies in the estimated top-of-canopy (TOC) reflectance measurements [22].
The inherent inaccuracy of TOC reflectance data can be accounted for by using image-specific calibration procedures to estimate LAI.A typical example is the use of soil line-based VIs in place of intrinsic VIs [23]: in this case the image-specific tuning of the soil line slope parameter can be used to increase the consistency in the analysis of atmospherically corrected products, especially for time-series datasets.
In this study, the application of image-specific calibration procedures to estimate LAI was tested.We used satellite time-series from DEIMOS-1 data, an operational satellite in the DMC constellation, acquiring radiance in three spectral bands (green, red and near-infrared) at 22 m spatial resolution.DEIMOS-1 satellite is equipped with a wide-image-swath sensor (630 km), providing a large coverage and overlap between scenes and therefore increasing the revisit time and the probability of capturing cloud-free images.Considering the available spectral resolution of the sensor, we selected a simple LAI retrieval approach based on a VI using the CLAIR model [24].This approach has been tested using canopy reflectance model data [24], field-based reflectance measurements [25] and satellite data in a number of studies [4,26].
The main goal of this work was to evaluate different operational strategies to identify the parameters of the semi-empirical CLAIR model to estimate LAI.Within this main goal, we assessed the transferability of the model parameters between test sites.For comparison, two relatively novel statistical methods were also investigated using Random Forest (RF) and Support Vector Machine (SVM) regressions.
In addressing these issues, the study provides recommendations for deriving consistent time-series maps of LAI in operational frameworks at moderate (22 m) spatial resolution with limited spectral information.Other statistical LAI mapping approaches are presented in other papers of this special issue [27].

Overview
The described methodology provides an operational perspective to identify the parameters of the semi-empirical CLAIR model [24,28] for deriving consistent time-series LAI maps.A set of DEIMOS-1 satellite sensor data were acquired over two agricultural regions in Southern Italy and Eastern Austria during one growing season.Contemporaneous field estimates of LAI were collected for a number of fields and crops.Using this dataset, three operational procedures were tested (Table 1).First, we considered a test site specific application based on seasonal average (constant) values of the CLAIR model parameters, namely the soil line slope, the WDVI ∞ (representing the asymptotically limiting value for the Weighted Difference Vegetation Index (WDVI) when LAI tends to infinity) and the α extinction and scattering coefficient.Secondly, we derived the model parameters for each satellite acquisition separately.Finally, we tested an intermediate solution based on an image-specific estimation of the parameters that can be extracted directly from the images (i.e., soil line slope and WDVI ∞ ) and on a seasonal average (constant) value of the parameter that requires contemporaneous field measurements (i.e., the α coefficient).Additionally, we evaluated the transferability of local models between the two test sites.For comparison, two other statistical methods to estimate LAI were also investigated: Random Forest (RF) and Support Vector Machine (SVM) regressions.
Table 1.Summary of the three procedures to derive the CLAIR model parameters.The 'soil line slope' represents the slope of the linear relationship between red and near-infrared reflectance of bare soil pixels.The WDVI ∞ is the asymptotically limiting value for the Weighted Difference Vegetation Index (WDVI) when LAI tends to infinity and α is the extinction and scattering coefficient of the CLAIR model.A fourth procedure consisted in transferring the model parameters from one test site to another.

Field Experimental Campaigns
The field and satellite data used in this study were acquired in the context of two field campaigns carried out in two agricultural regions located in Southern Italy and in Eastern Austria during the 2012 growing season.The Italian field campaign was undertaken at the 500 km 2 'Piana del Sele' and at the 3820 km 2 'Piana del Volturno' sites in the Campania region of Southern Italy (Figure 1) (Lat.40.52°N,Long.15.00°E).The two sites are large agricultural regions and are characterized by irrigated agriculture (mainly forage crops, fruit trees and vegetables) with an average field size of about 2 ha [29].The first site is characterized by very different soil types including Mollisols, Alfisols, Inceptisols and Entisols [30] according to the United States Department of Agriculture (USDA) soil taxonomy; the second plain is an alluvial formation with soil types varying from Entisols to Vertisols [31].The average annual precipitation is about 800 mm, mostly concentrated during the winter months, with dry and warm summers.Maximum reference evapotranspiration rates, generally occurring during the second half of July, range between 3 and 5 mm/d [32].
The Austrian field campaign was undertaken at the 1000 km 2 'Marchfeld region' in Lower Austria (Lat.48.20°N, Long.16.72°E).The dominant soil types are Chernozem and Fluvisol, based on the Food and Agriculture Organization (FAO) World Soil Classification.The general soil conditions are characterized by a humus-rich A horizon and a sandy C horizon, followed by fluvial gravel from the former river bed of the Danube [33].The region is characterized by a semi-arid climate with an average annual precipitation of 500-550 mm that can drop to 300 mm making it the driest region of Austria.Precipitation during the growing season (April-September) is 200-440 mm.Irrigation in Lower Austria has made it possible to establish a variety of crops thus contributing to the importance of Marchfeld in agricultural production.About 65000 ha of the area in Marchfeld are used for agricultural production.The main crops are vegetables (11%), sugarbeet (10%) and potatoes (7%).
Within these two large and relatively flat regions, LAI was estimated within 400 m 2 elementary sampling units (ESU) that were relatively homogeneous in terms of both vegetation type and growth stage.The center of each ESU was geo-located by means of a GPS, with an accuracy of ±3-5 m.  2.
LAI was estimated using the Plant Canopy Analyzer LAI-2000 [34].Measurements were made during early morning and late afternoon under diffuse light conditions to minimize the effects of direct sunlight otherwise leading to LAI underestimation [35].A view cap of 180 degrees was used to physically limit the sensor field-of-view and thus to reduce interference due to the presence of an operator.There are some limitations in this type of indirect measurements technique [36].On the one hand, the LAI-2000 sensor does not distinguish photosynthetically active leaf tissue from other plant elements, such as stems, flowers or senescent leaves.This leads to overestimated LAI.On the other hand, the 'clumping effect', i.e., non-random positioning of canopy elements, is also neglected by the measurement device causing LAI underestimation.Hence, some compensation can be assumed [37].The LAI recorded using the LAI-2000 sensor represents a measure close to the effective Plant Area Index ('PAIe') for reasons discussed above [37].The average LAI, resulting from three replications of one above-canopy and nine below-canopy measurements, was taken as representative measure for each ESU.The three replications (for a total of 27 measurements) were sampled randomly within the ESU.The same sampling protocol was used in Austria and in Italy and throughout the three months field campaigns.

Satellite Data
Multispectral data were acquired from DEIMOS-1, a satellite in the DMC constellation.The sensor records radiance in three broad spectral bands corresponding to green, red, and near-infrared parts of the electromagnetic spectrum at a ground sampling distance (GSD) of 22 m.Five and eight scenes were acquired during the 2012 growing season (Table 2) for the Austrian and for the Italian case studies, respectively.Satellite images were provided orthorectified to sub-pixel accuracy (~10 m) using ground control points and the Shuttle Radar Topography Mission (SRTM v3) data as digital elevation model.The image data were atmospherically corrected by using ATCOR-2 [20].For the Austrian test site, during the image acquisition campaign, five reference measurements of ground spectral reflectance (bare soils, dry and dense vegetation) were taken in correspondence of one satellite acquisition (August 2nd).A subset of these measurements was used to perform the atmospheric correction.The other subset was used to validate the results of the correction algorithm.The atmospheric correction for the other images in the time-series was cross-checked by observation of pseudo-invariant targets within the study area [19].A similar procedure was used for the Italian test site and the accuracy of the atmospheric correction was checked against a set of reference pixels containing identifiable pseudo-invariant ground targets (i.e., asphalt, sea water, concrete and sand) with known reflectance values from a spectral library [20].

2)
or ly m he α coefficient can be calibrated using a regression analysis technique applied to observed and estimated LAI values [4].This latter parameter describes the canopy architecture and it is dependent on the crop type and the corresponding Leaf Angle Distribution (LAD) value.Different values of the parameters in Equation ( 2) can be found in the literature.For instance, Clevers [25] reported an α value ranging between 0.252 and 0.53 and WDVI ∞ ranging between 68.6 and 57.9 for vegetative and generative growth stages of barley respectively.
In this study, we visually selected about 40-60 sample points of bare soils within each of the acquired satellite image data and the reflectance of these points was used for calculation of the soil line slope.The soil line slope was then used to calculate the WDVI.The WDVI ∞ value was extracted for each image.Then, we derived the α coefficient using an unconstrained nonlinear optimization method, which minimizes the Root Mean Square Error (RMSE) (our cost function) between measurements and predictions of LAI values.An initial α value of 0.35 was used as first guess based on previous campaigns for the Italian test site.We checked different starting values, which always converged at the same final result.
We tested three different procedures (Table 1) to derive a set of α and WDVI ∞ values to be applied throughout the growing season.Firstly we considered a test site specific estimation of the empirical parameters of Equations ( 1) and ( 2) obtaining one seasonal average (constant) value of the soil line slope and of WDVI ∞ .The α coefficient was derived using the pooled set of LAI measurements acquired during the three months campaign.In this way, the time-series field and satellite datasets were considered as a unique dataset.This procedure was repeated independently for each of the two test sites under investigation.In this case, we would perform field measurements for each new test site only during the first year or during short field campaigns before the operational activities.Once calibrated, the model parameters could be applied to the newly acquired images, without further need for field work.
The second procedure consisted in deriving a set of image-specific model parameters for each new satellite acquisition.In this case, we calibrated the α coefficient using only the LAI measurements acquired in correspondence of each acquisition.To avoid introducing biases related to measurements taken on different dates, we did not consider antecedent measurements.This procedure represents the most intensive and time consuming approach, in which we would need field campaigns for each new satellite overpass for the estimation of image-specific α coefficients.Similarly, we would extract a certain number of points from each image in order to calculate the soil line slope and the WDVI ∞ .
Differently, a constant α value to be applied throughout the growing season for each test site was considered as third procedure.This represents an intermediate solution in terms of resources required since we retain the same approach of the second case for deriving image-specific soil line slope and WDVI ∞ values.
These three procedures reflect possible practical approaches for applying the CLAIR model in order to derive time-series of LAI maps in nearly real-time (48 h from satellite acquisitions) and for operational campaigns.For instance, the CLAIR model has been used to derive LAI for the calculation of crop water requirements in irrigation advisory services [4].
Considering the feasible size of the field datasets for this kind of analysis (74 and 55 measurements for Italy and Austria, respectively), the validation of the model performance was achieved by resampling the field dataset using a bootstrapping approach with 200 repetitions [39] in order to provide an unbiased estimation of the model accuracy [40].To quantify the model prediction accuracies we used the following statistical measures: RMSE, the coefficient of determination (R 2 ) between measured and predicted LAI and the confidence intervals.

Statistical Based Approaches for the Estimation of LAI: Support Vector Machine (SVM) and Random Forest (RF) Regression
A second group of techniques were applied for the estimation of LAI based on statistical models established using (1) a least-squares version of a support vector machine (SVM) regression and (2) a regression tree using Random Forest (RF) algorithm.
SVMs have been developed in the framework of statistical learning theory by Vapnik [41] for classification purposes and have been recently extended to regression problems [42].SVMs have been applied for the estimation of LAI in previous studies [43][44][45][46] providing satisfactory results.A comprehensive tutorial for SVM regression was provided by [47].In this study we used a SVM implemented by [48] as a Matlab toolbox.We selected a linear kernel function, which provided better results compared to polynomial or radial basis functions.
RF is an ensemble algorithm developed by Breiman [49].It consists of several regression trees; each tree is trained on a bootstrap sample of the training dataset and the majority vote is taken for the RF prediction.Random subsets of input variables are used to train the regression tree.The algorithm provides an internal unbiased estimate of the error using the so-called 'out-of-bag' (OOB) samples.RF has been successfully used in several classification problems but limited application to biophysical vegetation parameters retrieval can be found in the literature [50,51].In this study we used a Matlab implementation of RF [52] with a default set of model parameters (no. of trees = 500; no. of input variables randomly chosen at each split = 2).
For both SVM and RF, the regressions were established using LAI measurements and atmospherically corrected reflectance values in green, red and near-infrared bands.Additionally, date-specific WDVI and WDVI ∞ values were used as input variables.This set of variables provided the most stable and accurate results compared to using seasonal average of WDVI and WDVI ∞ values or reflectance only.In the case of SVM, we first optimized the regularization parameter γ of SVM model using a simplex search method and a leave-one-out cross validation approach with the experimental LAI dataset.Then, two models (one for each test site) were applied to the experimental datasets to calculate RMSE and R 2 .In the case of RF, the model performance was calculated using the RMSE obtained from the average OOB Mean Square Error (MSE) over 500 trees.A pseudo-coefficient of determination (pseudo-R 2 ) was calculated as follows: 1 − MSE/variance (LAI).

Identification of the CLAIR Model Parameters and LAI Estimation Accuracy
Procedure 1: the soil line slope and of the asymptotic limiting value for the WDVI (WDVI ∞ ), along with the dates for field campaigns and satellite acquisitions are presented in Table 3.The α coefficient was derived using a bootstrap approach with 200 resampling of the experimental LAI dataset.The RMSE and R 2 between measured and predicted LAI and the corresponding α coefficient were taken as the median values of the 200 bootstrap estimations.Prediction accuracies are presented in Table 4.The scatterplots between field and satellite estimates of LAI are shown in Figure 3. Applying this first procedure resulted in a RMSE of 0.817 and a R 2 of 0.57 for the study region in Italy.For Austria, the RMSE was 1.037 and R 2 was 0.57.It can be observed that the distance from the 1:1 line increases with higher values of both measured and estimated LAI.In addition, the variation of the actual estimates increases along with the LAI-value, an effect which was greatly reduced for the study region in Italy by using the second procedure.As expected, the RMSE for the Austrian case study was higher given the larger variability of the crop types in this area compared to Italy.
Procedure 2: the LAI measurements from the last two field campaigns (31/07 and 07/08, 13/08 and 10/09) and the corresponding image data for the Italian test site were combined in a single dataset due to a limited number of field measurements.Similarly, the LAI and image data from the last campaign (07/09 and 21/09) for the Austria test site were combined.The number of LAI measurements used for each image-specific calibration, including LAI statistics (minimum, maximum and mean values) is provided in Table 2.
The scatterplots between field and satellite estimates of LAI are shown in Figure 4. Varying the soil line slope and thus WDVI values helped to improve WDVI based LAI estimation.Using the parameters (soil line slope and WDVI ∞ ) listed in Table 3 resulted in an RMSE of 0.388 and R 2 of 0.89 for the test site in Italy (Table 4).In addition, the higher variation of LAI estimation could be reduced when compared to the first procedure.For the Austrian test site the RMSE was 0.82 and R 2 was 0.66.Analyzing the data presented in Table 4, we noticed that the range of variability of the α values (0.08) in the Austria test site is greater than the Italian one (0.04).This explains the larger RMSE observed in Austria as a consequence of the large crop variability in this area.In spite of this, larger errors are observed only for LAI values greater than 3.
Procedure 3: results are presented in Table 4 and in Figure 5.A constant α coefficient value throughout the growing seasons and image-specific variation of soil line slope and WDVI ∞ resulted in a RMSE of 0.407 and R 2 of 0.88 for Italy and a RMSE of 0.86 and R 2 of 0.64 for the Austrian test site.No substantial changes in the performance of the LAI estimation were observed when comparing this procedure to the full image-specific procedure.The slight increase in the RMSE is compensated by the reduced field work requirements of this approach.

Parameterization of Support Vector Machine and Random Forest (RF) Regressions and LAI Estimation Accuracy
The results of the SVM tuning provided a γ value of 0.5435 and 0.2998 for Italy and Austria, respectively.The application of tuned models (one for each test site) to the experimental datasets resulted in a RMSE of 0.394 and 0.785 for the Italian and for the Austria test sites, respectively.R 2 resulted in 0.86 for Italy and 0.69 for Austria.The application of the RF regression achieved a RMSE (OOB) of 0.502 ± 0.027 and R 2 of 0.82 for the Italian case study and a RMSE of 0.860 ± 0.017 and R 2 of 0.63 for the Austrian case study.Compared to the CLAIR model performance, SVM achieved comparable accuracy for procedures 2 (Figure 4) and 3 (Figure 5) with image-specific soil line slope and WDVI ∞ values.On the contrary, RF reported slightly lower accuracies for the Italian case study.Similarly, we tested the transferability of the SVM and RF regressions between test sites.The application of RF models provided a RMSE of 0.747 when transferring to Italian case study and a RMSE of 0.906 for the Austrian case study.In this case, we observed poorer prediction power compared to test site specific models, with a percentage increase in the RMSE up to 24.5%.
The mutual application of the SVM regression models provided a RMSE of 0.783 and of 0.808 for the Italian and Austrian case study respectively, with a percentage increase of the RMSE up 38.9% compared to test site specific models.A summary of the results is provided in Table 5.

Conclusions
In this study, we described a methodology to derive consistent time-series of Leaf Area Index (LAI) maps using an exponential relationship (CLAIR model) between LAI and the soil-line based Weighted Difference Vegetation Index (WDVI).The study analyzed different procedures for estimating the CLAIR model parameters (soil line slope, WDVI ∞ and α coefficient) and highlighted the operational aspects of these approaches.We tested a seasonal average (constant) and an image-specific tuning to increase the consistency in the analysis of atmospherically corrected time-series images.Additionally, we applied two other statistical methods based on Random Forest (RF) and Support Vector Machine (SVM) regressions.These two algorithms, applied in several remote sensing problems have proved particularly useful in recent years for land cover classification.However, few studies can be found in the remote sensing literature for their application to biophysical parameter prediction problems.
The experimental analysis was conducted using a set of DEIMOS-1 satellite sensor data acquired over two agricultural regions in Southern Italy and Eastern Austria.In particular, these data were acquired in the context of two satellite-based irrigation advisory services; LAI maps are used as input in agro-meteorological models to derive crop water needs.This information is delivered directly to the farmers for water management purposes in nearly real time.The Italian case study represents already an operational service, while the Austrian site is a test-bed for further development and testing transferability of models and procedures from different environments, agricultural and management practices.Although limited to one satellite type only, this dataset offered the possibility to evaluate the transferability of the model parameters from one test site to another.This is particularly relevant in case no field measurements are available.
Using the CLAIR model with an image-specific tuning of the soil line slope and WDVI ∞ and a constant α coefficient value provided the most accurate and consistent results (RMSE = 0.407 and 0.86 for Italy and Austria respectively).Notably, this approach has the minimum requirements for fieldwork.Considering this procedure, we also evaluated the transferability of the local model parameters between the two test sites observing a decrease in estimation accuracies smaller than 0.5%.RF and SVM regressions provided comparable results for each test site.However, we observed severe limitations in the transferability of these statistical methods between test sites with an increase in RMSE up to 24.5% for RF and 38.9% for SVM.
This work indicates that the CLAIR model can provide satisfactory accuracy if an adequate model parameterization is performed; results achieved here (with a limited number of spectral inputs) do not justify the use of advanced regression techniques (such as RF and SVM) compared to simple and robust semi-empirical relationships.One advantage of the semi-empirical model used in this study is that its parameters have a physical denotation and it is possible to interpret and explain the model behavior.In addition, it has the advantage of not requiring a priori crop map information.This additional layer could be used, if available, to perform a crop specific calibration of the extinction and scattering coefficient (α).However, this would require additional fieldwork to achieve a statistically significant number of measurements for each crop type.

Figure 1 .
Figure 1.Selected study regions.The study was carried out in an agricultural region located in the Eastern part of Austria ('Marchfeld region') and in two areas ('Piana del Sele' and 'Piana del Volturno') located in the Campania region in Southern Italy.
Figure March during using A

Figure 3 .Figure 4 .
Figure 3. Procedure 1: scatterplots of the LAI estimation for the Italian case study (a) and for the Austria case study (b).Vertical error bars correspond to the Standard Error on LAI (SEL) measurements for each ESU.Horizontal error bars correspond to the Standard Error on LAI predictions and it was estimated using resampling with 200 bootstraps.
Figure for the

Table 2 .
Timing of the field campaigns and satellite acquisitions with corresponding statistics of the LAI measurements.Note that acquisitions were not always performed during the same dates in the two areas (Sele and Volturno) of the Italian test site.

Table 5 .
Transferability between test sites (IT: Italy; AT: Austria) of the CLAIR model, Random Forest and Support Vector Machine regressions.The parameters of the CLAIR model used in this comparison were derived following the procedure 3.