Comparison of Four Machine Learning Methods for Generating the GLASS Fractional Vegetation Cover Product from MODIS Data

Long-term global land surface fractional vegetation cover (FVC) products are essential for various applications. Currently, several global FVC products have been generated from medium spatial resolution remote sensing data. However, validation results indicate that there are inconsistencies and spatial and temporal discontinuities in the current FVC products. Therefore, the Global LAnd Surface Satellite (GLASS) FVC product algorithm using general regression neural networks (GRNNs), which achieves an FVC estimation accuracy comparable to that of the GEOV1 FVC product with much improved spatial and temporal continuities, was developed. However, the computational efficiency of the GRNNs method is low and unsatisfactory for generating the long-term GLASS FVC product. Therefore, the objective of this study was to discover an alternative algorithm for generating the GLASS FVC product that has both an accuracy comparable to that of the GRNNs method and adequate computational efficiency. Four commonly used machine learning methods, back-propagation neural networks (BPNNs), GRNNs, support vector regression (SVR), and multivariate adaptive regression splines (MARS), were evaluated. After comparing its performance of training accuracy and computational efficiency with the other three methods, the MARS model was preliminarily selected as the most suitable algorithm for generating the GLASS FVC product. Direct validation results indicated that the performance of the MARS model (R2 = 0.836, RMSE = 0.1488) was comparable to that of the GRNNs method (R2 = 0.8353, RMSE = 0.1495), and the global land surface FVC generated from the MARS model had good spatial and temporal consistency with that generated from the GRNNs method. Furthermore, the computational efficiency of MARS was much higher than that of the GRNNs method. Therefore, the MARS model is a suitable algorithm for generating the GLASS FVC product from Moderate Resolution Imaging Spectroradiometer (MODIS) data.


Introduction
Fractional vegetation cover (FVC), generally defined as the fraction of green vegetation as seen from the nadir, is an important variable for describing land surface vegetation [1][2][3].FVC is also a key biophysical parameter for studying the atmosphere, pedosphere, hydrosphere, and biosphere, as well as their interactions [4], and is widely used in weather prediction, hydrological monitoring, and related research fields [5][6][7][8].Long-term and global-scale FVC datasets are highly important for land surface process and climate change studies and also for their extensive applications in the monitoring of agriculture, forestry, disaster risk, and drought [5,[8][9][10].Estimation of FVC from remote sensing data is the only effective way to generate FVC products, especially at the regional and global scale.
There are three main types of FVC estimation methods using remotely sensed data: empirical methods, pixel un-mixing modeling, and physical model-based methods.The empirical methods are based on the statistical relationships between FVC and spectral band reflectance or vegetation indices [11,12].Empirical methods are simple, computationally efficient, and widely used for estimating FVC at the regional scale.However, empirical methods are limited temporally and spatially because their statistical relationships are constructed using data acquired at specific times in distinct regions.Thus, although they are typically applicable to specific research areas and vegetation types, they may become invalid when they are expanded to larger areas.The pixel un-mixing model assumes that each pixel is composed of several components, and considers the fraction of vegetation compositions as the FVC of the pixel [13][14][15].The dimidiate pixel model, which assumes that the pixels are composed solely of vegetation and non-vegetation components, is the simplest and most widely used pixel un-mixing model [5,14,16].However, due to the complexity of land surfaces, it is difficult to use the pixel un-mixing model to determine the number of endmembers and the spectral responses of the endmembers over large areas for FVC estimation.The physical model-based methods for FVC estimation are based on the inversion of canopy radiative transfer models that simulate the physical relationships between vegetation canopy spectral reflectance and FVC.Such physical models have clear physical significance and can be adapted to a wide range of scenarios [17].However, because of the complexity of the physical models, direct inversion is generally complex, and artificial neural networks (ANNs) are usually used for indirect inversion of the physical model by training with a pre-computed reflectance database from the physical models to simplify the inversion [8,18].ANN methods have the advantages of computational efficiency and robustness to noisy data, and can approximate multivariate non-linear relationships, which make them popular choices for large-area FVC estimation from remote sensing data [1,[19][20][21].
Furthermore, the current global FVC products were largely generated using medium spatial resolution remotely sensed data.However, specific validation results indicate that there are inconsistencies among the different FVC products as well as spatial and temporal discontinuities within them [22][23][24].For example, the Spinning Enhanced Visible and Infrared Imager (SEVIRI) and MEdium-Resolution Imaging Spectrometer (MERIS) FVC products were found to have systematic bias, such that the MERIS FVC presents lower values (by approximately 0.1-0.2) [24].VEGETATION (VGT) FVC was found to be generally higher than SEVIRI FVC by approximately 0.15, and Carbon cYcle and Change in Land Observational Products from an Ensemble of Satellites (CYCLOPES) FVC values were found to be underestimates [25,26].With the underestimation problem corrected, the GEOV1 FVC product is an improved version of the CYCLOPES FVC product.However, the GEOV1 FVC product suffers from unsatisfactory temporal and spatial continuities [24].Therefore, a long-term global FVC product with both high accuracy and satisfactory spatial and temporal continuities is urgently needed.
Recently, Jia et al. [27] developed an operational algorithm for estimating FVC from Moderate Resolution Imaging Spectroradiometer (MODIS) surface reflectance data using general regression neural networks (GRNNs).Their method was used to generate the Global LAnd Surface Satellite (GLASS) FVC product, which was supported by China's National High Technology Research and Development Program.Validation results demonstrated that the GRNNs-generated GLASS FVC product had accuracy that was comparable to the accuracy of the GEOV1 FVC product, which was considered to be the best global FVC product available at the time.Moreover, the spatial and temporal continuities of the GLASS FVC were superior to those of the GEOV1 FVC [27].Balancing the estimation accuracy and the computational efficiency of the estimation algorithm is a key issue for generating lengthy time series, and high-resolution and global-scale FVC products.However, the computational efficiency of using the GRNNs method to generate the GLASS FVC product is currently unsatisfactory.It takes more than one hour to generate one tile of data from MODIS data, which severely restricts the production of the GLASS FVC product.Therefore, a highly efficient method with accuracy comparable to that of the GRNNs method is urgently needed for generating the GLASS FVC product.
Machine learning methods have been applied widely to land surface parameter product generation using remotely sensed data due to their capability of nonlinear fitting and computational efficiency.For example, Baret et al. used back-propagation neural networks (BPNNs) to produce the GEOV1 leaf area index (LAI), fraction of absorbed photosynthetically active radiation (FAPAR), and fraction of vegetation cover (FCOVER) products [1,21,27].Durbha et al. [28] used support vector regression (SVR) to retrieve LAI from Multi-angle Imaging SpectroRadiometer (MISR) data with satisfactory results.Jiang et al. [29] used multivariate adaptive regression splines (MARS) to generate the long-term GLASS Daytime All-Wave Net Radiation Product.To generate the global FVC product, a satisfactory balance between the accuracy and the computational efficiency of the FVC estimation algorithm is needed.Therefore, the objective of this study was to find an alternative algorithm for generating the GLASS FVC product having both accuracy comparable to that of the GRNNs method and relatively high computational efficiency.In this study, four commonly used and effective machine learning methods, BPNNs, GRNNs, SVR, and MARS, were evaluated to identify the most suitable method for generating the GLASS FVC product.

Data and Methods
This study was designed to find a suitable GLASS FVC product algorithm having accuracy comparable to that of the GRNNs method and also high computational efficiency (Figure 1).First, four machine learning methods, GRNNs, BPNNs, SVR, and MARS, were trained on identical samples of different sizes, and the performances of the four methods were evaluated.Then, the method with both suitable accuracy and high computational efficiency was used to generate the GLASS FVC product.method with accuracy comparable to that of the GRNNs method is urgently needed for generating the GLASS FVC product.
Machine learning methods have been applied widely to land surface parameter product generation using remotely sensed data due to their capability of nonlinear fitting and computational efficiency.For example, Baret et al. used back-propagation neural networks (BPNNs) to produce the GEOV1 leaf area index (LAI), fraction of absorbed photosynthetically active radiation (FAPAR), and fraction of vegetation cover (FCOVER) products [1,21,27].Durbha et al. [28] used support vector regression (SVR) to retrieve LAI from Multi-angle Imaging SpectroRadiometer (MISR) data with satisfactory results.Jiang et al. [29] used multivariate adaptive regression splines (MARS) to generate the long-term GLASS Daytime All-Wave Net Radiation Product.To generate the global FVC product, a satisfactory balance between the accuracy and the computational efficiency of the FVC estimation algorithm is needed.Therefore, the objective of this study was to find an alternative algorithm for generating the GLASS FVC product having both accuracy comparable to that of the GRNNs method and relatively high computational efficiency.In this study, four commonly used and effective machine learning methods, BPNNs, GRNNs, SVR, and MARS, were evaluated to identify the most suitable method for generating the GLASS FVC product.

Data and Methods
This study was designed to find a suitable GLASS FVC product algorithm having accuracy comparable to that of the GRNNs method and also high computational efficiency (Figure 1).First, four machine learning methods, GRNNs, BPNNs, SVR, and MARS, were trained on identical samples of different sizes, and the performances of the four methods were evaluated.Then, the method with both suitable accuracy and high computational efficiency was used to generate the GLASS FVC product.

Training Samples
The training samples used to develop the FVC estimation algorithm from MODIS surface reflectance data with GRNNs were used in this study [27].The sampling locations consisted of the

Training Samples
The training samples used to develop the FVC estimation algorithm from MODIS surface reflectance data with GRNNs were used in this study [27].The sampling locations consisted of the BEnchmark Land Multi-site Analysis and Intercomparison of Products (BELMANIP) sites, specific sites in FLUXNET that were not overlaid nor particularly proximal to the BELMANIP sites, and Validation of Land European Remote Sensing Instruments (VALERI) sites with ground measured FVC data.In total, approximately 500 global sampling locations were selected for use in the present study.The sample locations are located in relatively flat and homogeneous areas, are globally distributed, and cover all types of vegetation.The red and NIR band reflectance in the reprocessed MODIS data of the 25 pixels (5 × 5 = 25) surrounding the global sampling locations were extracted, and the corresponding Landsat thematic mapper (TM)/Enhanced Thematic Mapper plus (ETM+) FVC pixels for each extracted MODIS pixel were averaged as the sampling FVC of the MODIS pixel.The Landsat TM/ETM+ FVC data were derived from a dimidiate pixel model based on the terrestrial ecoregions and vegetation types [27].To remove abnormal samples and guarantee the reliability and stability of the training samples, the samples were refined further.The sampling FVC values were plotted against the normalized difference vegetation index (NDVI) values, which were computed from the MODIS reflectance in the red and NIR bands.Then, for each NDVI value class (20 classes over the [0, 1] domain of variation), the cases with FVC values that were lower (higher) than the 5% percentile (95% percentile) were removed from the sample datasets.After further optimization, 16,980 cases with consistent MODIS surface reflectance (reflectance in the red and NIR bands) paired with refined sampling FVC values were generated.The sampling dataset was split randomly into a training dataset composed of 90% of the available data and an essentially independent validation dataset composed of the remaining 10% of the sampling dataset, which was used to evaluate the theoretical performances of the four models.Further details regarding the training samples can be found in the study of Jia et al. [27].

GRNNs model
GRNNs, the generalization of radial basis function networks and probabilistic neural networks, were developed by Donald Specht [30,31].Jia et al. [27] applied GRNNs to the estimation of FVC and demonstrated that GRNNs were reliable for FVC estimation.GRNNs contain four layers, the input, pattern, summation, and output layers.The input layer provides all of the measurement variables to all of the neurons in the pattern layer; each neuron represents a training pattern, and the output of each neuron is a measure of the distance of the input from the stored patterns.The summation layer consists of two types of summation neurons: one type computes the summation of the weighted outputs of the pattern layer, and the other type calculates the unweighted outputs of the pattern neurons.Finally, the output layer performs a normalization step to compute the predicted value of the output variable.In this study, the input variables to the GRNNs for retrieving FVC were the reprocessed MODIS reflectance values in the red and NIR bands, and the output variable was the corresponding FVC.In this study, a Gaussian function was used as the kernel function of the GRNNs, and the fundamental formulation was expressed as follows: where D 2 i represents the squared Euclidean distance between the input vectors X and the i-th training input vector X i , Y i is the output vector corresponding to X i , Y (X) is the estimation corresponding to X, n is the number of samples, and σ is a smoothing parameter that controls the size of the receptive region.Because the architecture and weights of GRNNs are determined by the input, the training of GRNNs consists essentially of optimizing the smoothing parameter σ [27].The smoothing parameter significantly affects the prediction accuracy of the GRNNs, and a suitable smoothing parameter was found by using the holdout method.The holdout method for a particular σ consisted of removing one sample from the training data at a time and then constructing GRNNs based on all the other training samples.The GRNNs were then used to estimate Y for the removed sample.By repeating this process for each sample and storing each estimate, the mean squared error between the actual sample values Y i and the estimates could be evaluated.The value of σ giving the smallest error was used in the final GRNNs [30,32].The training process was completed when the minimum of the cost function of the smoothing parameter was reached as follows: where Ŷi (X i ) is the estimation corresponding to X i using the GRNNs trained over all of the training samples except the i-th sample.The widely used shuffled complex evolution method developed by the University of Arizona was selected to obtain the optimal smoothing parameter of the GRNNs [33,34].
The GRNNs method was implemented with the Visual Studio 2012 platform (Microsoft Corporation, Redmond, Washington, DC, USA).
• BPNNs BPNNs, a popular type of neural networks, have proven to be an effective algorithm for estimating land surface vegetation variables, such as LAI and FAPAR [1,35].Therefore, BPNNs were selected for the comparison of the performance among methods.The BPNNs training procedure is divided into two parts, a forward propagation of information and a backward propagation of error.The back-propagation algorithm network adjusts the weights in each successive layer to reduce the errors at each level.In the linkage of the layers, the transmission of information procedure is unidirectional transmission to the input layer, with treatment of the information in the input layer, the hidden layer, and finally transmission to the output layer.The status of each layer can only be affected by the next layer.If the anticipated outcome is not generated in the output layer, the algorithm switches to back-propagation, and the error between the outcome and the expected value is returned along the original path.In the present study, the BPNNs first learned from the training dataset and built relationships between reflectance and FVC, then the trained BPNNs could produce the optimal FVC estimates based on the actual reflectance of the remotely sensed data.The inputs of the BPNNs included the reflectance of the red and NIR bands, and the output consisted of the corresponding FVC.The number of nodes in the hidden layer was set to four.The BPNNs activation function in the hidden layer was set to "tansig", the transfer function for the output layer was set to "purelin", and the training function was set to "trainlm", respectively.Because of its efficient convergence capacity, the Levenberg-Marquardt minimization algorithm was used to calibrate the synaptic coefficients [36].The BPNNs modeling was implemented with the Matlab 2014a platform (Matlab 2014a, The MathWorks, Natick, MA, USA).

• SVR model
SVR is a commonly used machine learning method for solving nonlinear regression estimation problems [37].Given a set of data points G = (x i , d i ) n i (x i is the input vector, d i is the desired value and n is the total number of data patterns), SVR approximates the function f(x) using the following equation: where ω is the weight vector, b is the bias, and τ(x) is the kernel function, which is typically a non-linear function for transforming non-linear inputs into a linear mode in a high-dimensional feature space.
Unlike the traditional regression model, in which coefficients are estimated by minimizing the squared loss, SVR applies the so-called ε-insensitivity loss function (L ε ) to estimate its parameters: where y is the desired (target) output and ε is defined as the region of ε-insensitivity.When the predicted value falls into the band area, the loss is zero.In contrast, if the predicted value falls outside the band area, the loss is equal to the difference between the predicted value and the margin.
When empirical risk and structure risk are considered together, the SVR model can be constructed to minimize the following quadratic programming problem: where i = 1, . . .,n is the number of training data; (ξ i ± ξ * i ) is the empirical risk; 1 2 z T z is the structure risk preventing over-learning and the lack of applied universality; and C is referred to as the regularized constant, which determines the trade-off between the empirical risk and regularization terms.This study adopted the general form of the SVR-based regression function, defined as follows [38]: where α i and α i * are Lagrangian multipliers that satisfy α i × α i * = 0, n is the number of support vectors, and b is the bias.K is the kernel function, which in this study was the radial basis function (RBF) kernel, one of the most widely used SVR kernel functions [37][38][39], which is defined as K (x, , where σ denotes the width of the RBF.Parameters C, ε, and γ must be determined for SVR training.To obtain the optimal SVR model, the training dataset was randomly divided further into 90% and 10% proportions for model building and testing, respectively, and the grid search method [40] was applied to determine the best parameter set of C and γ that could generate the minimum forecasting mean square error.In the searching process, C and γ are increasing exponentially with the base 2 and the exponent located in [−8, 8] with the step of 0.8.Finally, ε was determined based on the training data and set to 0.1.In this study, the FVC estimation utilized the Library for Support Vector Machines (LIBSVM) and was implemented with the Matlab 2014a platform.

• MARS model
MARS is a nonparametric and multivariate regression analysis model presented by Jerome H. Friedman [39].MARS essentially builds flexible models by fitting piecewise linear regressions; that is, the non-linearity of a model is approximated through the use of separate linear regression slopes in distinct intervals of the independent variable space.In addition to searching for variables sequentially, MARS also searches for the interactions between variables, allowing any degree of interaction to be considered as long as it provides a better fit to the data.
MARS builds models of the form [39]: where B i (x) are piecewise linear basis functions, and each c i is a constant coefficient.Each piecewise linear basis function takes the following form: where c is a constant called the knot.
The inputs of the MARS included the reflectance of the red and NIR bands, and the output was the corresponding FVC.The optimal MARS model is determined by a two-stage process.First, MARS constructs a very large number of basis functions to fit the data, where variables can interact with each other to fit the data.Then, the basis functions are deleted in the order of least importance using the generalized cross-validation (GCV) criterion [40].The importance of a variable can be assessed by observing the decrease in the calculated GCV when it is removed from the model.MARS is capable of reliably tracking the extremely complex data structures hidden in high-dimensional data.Additional details regarding the MARS model building process can be found in [40].In this study, the best number of basis functions was determined using the GCV method.MARS was implemented with the Matlab2014a platform.
The four aforementioned methods were trained with identical training samples of varying sample sizes, and their performances were evaluated based on independent validation.The R-square (R 2 ) and root mean square error (RMSE) were selected to evaluate the comparison results.Moreover, reprocessed MODIS reflectance data for BELMANIP sites containing all kinds of vegetation types in the year 2003 were entered into the four machine learning algorithms, and the outputs of BPNNs, SVR, and MARS were compared with those of GRNNs, which have been proven to be very efficient for FVC estimation.In this study, all of the four methods were implemented using the Microsoft Windows 7 operating system (Microsoft Corporation, Redmond, Washington, DC, USA) on a 3.20 GHz Intel Core PC with 8 GB of memory.

Spatial-Temporal Comparison and Direct Validation
After training with the optimal training sample size, the four methods were used to generate the global land surface FVC from reprocessed MODIS reflectance data.The MODIS land cover product (MCD12Q1) was used as priori conditions: the FVC values in the vegetation regions, including eight main biomes, were estimated from the reprocessed MODIS surface reflectance data using the proposed method, and the FVC values in the non-vegetated regions were set to zero.Because the GRNNs method has accuracy comparable to that of the GEOV1 FVC product and improved spatial and temporal continuities [27], the accuracy and the spatial and temporal consistencies of the machine learning methods were compared with those of the GRNNs method to determine a suitable alternative to the GRNNs method for generating the GLASS FVC product.
Assessment and validation of moderate-resolution FVC products are generally difficult because ground point measurements are not suitable for direct comparisons due to surface heterogeneity.High spatial resolution remote sensing data can be used to scale the ground measurements up to moderate spatial resolution pixels for comparison and evaluation.In this study, high-resolution FVC maps from VALERI (accessed at http://w3.avignon.inra.fr/valeri)[41] were used to validate the four machine learning methods and inter-compare with the GRNNs method.Following the guidelines defined by the subgroup Land Product Validation (LPV) of Committee Earth Observing Satellites' Working Group on Calibration and Validation (CEOS WGCV), an empirical transfer function between the high-resolution reflectance data and the FVC ground measurements for a site was established to derive a high-resolution FVC map that was then aggregated to the moderate-resolution products for comparison [42].Most of these sites are sized at 3 km × 3 km (some are larger), and multi-date measurements are available at specific sites.In total, 44 high-resolution FVC maps over 27 VALERI sites [27] were selected to validate the FVC estimates of the four machine learning methods assessed in this study.Information about these locations, such as validation site positions, land cover types, dates of ground measurement, and mean FVC values from the validation site areas is given in Table 1.The sites cover various land cover types, including grassland, cropland, shrubs, forest, etc.Therefore, the validation results in this study are representative.
In this study, to obtain improved spatial matching of the FVC estimates from the MODIS data and high-resolution remote sensing data of the VALERI sites, the FVC data estimated using MODIS data were converted to the same projection as those of the corresponding high-resolution FVC maps.Then, the averaged FVC estimates corresponding to the scope of the high-resolution maps were extracted.Meanwhile, the extracted FVC values were linearly interpolated to the acquisition date of the corresponding high-resolution FVC maps.Finally, the performances of the machine learning methods were directly validated using these extracted FVC samples.

Training Accuracy and Computational Efficiency
The performances of the four machine learning methods with different training sample sizes were evaluated by R 2 and RMSE, and the results are shown in Figure 2. The smoothing parameter for the GRNNs method, the C and γ parameters for the SVR method, and the maximum number of model terms for the MARS method are determined in the training process.In this study, only the parameters with training sample size of 15,282, which achieved the best training results, are presented.The determined optimal smoothing parameter for FVC estimation from MODIS data for the GRNNs method was 0.0042, the C and γ for the SVR method were 27.8576 and 1.7411, respectively, and the best number of the basis functions for the MARS method was 21.In addition, the independent validation results from the 15,282 training samples are summarized in Table 2.
Remote Sens. 2016, 8, x FOR PEER 9 of 16 The performances of the four machine learning methods with different training sample sizes were evaluated by R 2 and RMSE, and the results are shown in Figure 2. The smoothing parameter for the GRNNs method, the C and  parameters for the SVR method, and the maximum number of model terms for the MARS method are determined in the training process.In this study, only the parameters with training sample size of 15,282, which achieved the best training results, are presented.The determined optimal smoothing parameter for FVC estimation from MODIS data for the GRNNs method was 0.0042, the C and  for the SVR method were 27.8576 and 1.7411, respectively, and the best number of the basis functions for the MARS method was 21.In addition, the independent validation results from the 15,282 training samples are summarized in Table 2.The computation time required for model training and estimation differed considerably among the four methods (Table 2).The computational efficiency of the GRNNs and SVR methods was very low when large datasets were used; therefore, these two methods were not suitable for generating a long time series FVC product.Additionally, when the training sample size was reduced, their estimation accuracy decreased.Although the four methods were implemented on different platforms, the comparison of computation time was still reasonable for the assertion that the computational efficiency of MARS and BPNNs was better than that of GRNNs and SVR.
One year of FVC estimates were also derived using the four methods at the BELMANIP sites containing all kinds of vegetation types.Due to a lack of ground truth data and the fact that GRNNs have proven to be very efficient for FVC estimation, the FVC estimates of the other three methods were compared with those of the GRNNs method.Scatterplots between the GRNNs derived FVC and the BPNNs, SVR and MARS derived FVCs are shown in Figure 3.The results show that the other three methods performed well in estimating FVC and had good consistency with the GRNNs method (R 2 > 0.98, RMSE < 0.037).However, the BPNNs method had a small overestimation problem for FVC values from 0.05 to 0.2.In contrast, the MARS method had better consistency with the GRNNs method compared with the BPNNs and SVM methods.Figure 2 shows that R 2 increases while RMSE decreases with sample size, with some fluctuations.Generally, GRNNs performed slightly better than SVR and slightly worse than MARS and BPNNs.All of the four methods had satisfactory performance (R 2 > 0.96, RMSE < 0.07) given sufficient training samples.The BPNNs and MARS had lower sensitivity than the GRNNs to training sample size variations, and SVR was very sensitive to training sample size.All of the four methods began to have stable performance when the training sample size was large enough.In addition, a turning point was noted between the sample sizes of 304 and 456.Prior to the sample size of 304, the values of R 2 and RMSE changed quickly, and the four models were sensitive to sample size variation.Subsequent to the training sample size of 456, the performances of the four methods, particularly the MARS method, began to stabilize, and the R 2 and RMSE values showed only slight improvement.With a training sample size of 15,282, over-fitting was not observed, and all of the four methods, particularly the MARS method, presented their best performance (R 2 > 0.96, RMSE < 0.07).Therefore, 15,282 training samples were used as the final training data set.
The computation time required for model training and estimation differed considerably among the four methods (Table 2).The computational efficiency of the GRNNs and SVR methods was very low when large datasets were used; therefore, these two methods were not suitable for generating a long time series FVC product.Additionally, when the training sample size was reduced, their estimation accuracy decreased.Although the four methods were implemented on different platforms, the comparison of computation time was still reasonable for the assertion that the computational efficiency of MARS and BPNNs was better than that of GRNNs and SVR.
One year of FVC estimates were also derived using the four methods at the BELMANIP sites containing all kinds of vegetation types.Due to a lack of ground truth data and the fact that GRNNs have proven to be very efficient for FVC estimation, the FVC estimates of the other three methods were compared with those of the GRNNs method.Scatterplots between the GRNNs derived FVC and the BPNNs, SVR and MARS derived FVCs are shown in Figure 3.The results show that the other three methods performed well in estimating FVC and had good consistency with the GRNNs method (R 2 > 0.98, RMSE < 0.037).However, the BPNNs method had a small overestimation problem for FVC values from 0.05 to 0.2.In contrast, the MARS method had better consistency with the GRNNs method compared with the BPNNs and SVM methods.Due to its high computational efficiency and estimation accuracy, the MARS method was preliminarily selected for subsequent direct validation and spatial-temporal comparison with the GRNNs method, and the BPNNs and SVR methods were not explored further.

Spatial-Temporal Comparison and Direct Validation
A direct validation of the MARS method was evaluated, and the spatial and temporal consistencies between the global FVC maps derived from the MARS and the GRNNs methods were compared.Figure 4a, b show the global FVC maps from the MARS and GRNNs methods on day 201 of 2003 (20 July 2003), respectively.A difference map between the two global FVC maps generated by the MARS and GRNNs methods is also presented in Figure 4c for comparison.Figure 4 shows that there was adequate spatial agreement between the MARS and GRNNs derived FVC from the MODIS data, and no missing data were observed.Quantitatively, approximately 99.95% of the difference values were located between −0.1 and 0.1.Furthermore, the distributions of the FVC values of both data sets were consistent with the actual conditions of the land cover distributions.Therefore, the global FVC map generated by the MARS model had sufficient spatial consistency and FVC values very similar to those generated by the GRNNs method, as well as satisfactory spatial continuity.Due to its high computational efficiency and estimation accuracy, the MARS method was preliminarily selected for subsequent direct validation and spatial-temporal comparison with the GRNNs method, and the BPNNs and SVR methods were not explored further.

Spatial-Temporal Comparison and Direct Validation
A direct validation of the MARS method was evaluated, and the spatial and temporal consistencies between the global FVC maps derived from the MARS and the GRNNs methods were compared.Figure 4a, b show the global FVC maps from the MARS and GRNNs methods on day 201 of 2003 (20 July 2003), respectively.A difference map between the two global FVC maps generated by the MARS and GRNNs methods is also presented in Figure 4c for comparison.Figure 4 shows that there was adequate spatial agreement between the MARS and GRNNs derived FVC from the MODIS data, and no missing data were observed.Quantitatively, approximately 99.95% of the difference values were located between −0.1 and 0.1.Furthermore, the distributions of the FVC values of both data sets were consistent with the actual conditions of the land cover distributions.Therefore, the global FVC map generated by the MARS model had sufficient spatial consistency and FVC values very similar to those generated by the GRNNs method, as well as satisfactory spatial continuity.Due to its high computational efficiency and estimation accuracy, the MARS method was preliminarily selected for subsequent direct validation and spatial-temporal comparison with the GRNNs method, and the BPNNs and SVR methods were not explored further.

Spatial-Temporal Comparison and Direct Validation
A direct validation of the MARS method was evaluated, and the spatial and temporal consistencies between the global FVC maps derived from the MARS and the GRNNs methods were compared.Figure 4a, b show the global FVC maps from the MARS and GRNNs methods on day 201 of 2003 (20 July 2003), respectively.A difference map between the two global FVC maps generated by the MARS and GRNNs methods is also presented in Figure 4c for comparison.Figure 4 shows that there was adequate spatial agreement between the MARS and GRNNs derived FVC from the MODIS data, and no missing data were observed.Quantitatively, approximately 99.95% of the difference values were located between −0.1 and 0.1.Furthermore, the distributions of the FVC values of both data sets were consistent with the actual conditions of the land cover distributions.Therefore, the global FVC map generated by the MARS model had sufficient spatial consistency and FVC values very similar to those generated by the GRNNs method, as well as satisfactory spatial continuity.
(a)  The FVC temporal profiles of the sampling sites in the year of 2003 were also generated to compare the temporal consistency of the MARS derived and GRNNs derived FVC.Several representative FVC temporal profiles for cropland, grassland, shrubland and forestland, are shown in Figure 5.These representative FVC temporal profiles show good temporal consistency between the two data sets.The two data sets also have similar magnitude and dynamic range, and no missing data are observed.Moreover, the temporal profiles extracted from the MARS method-derived FVC are very smooth, whereas those extracted from the GRNNs method show very slight fluctuations.Furthermore, the two data sets exhibit similar seasonal variations, which could reflect actual vegetation growth characteristics.These seasonal vegetation changes and the temporal consistency between the MARS derived FVC and the GRNNs derived FVC indicated that the MARS method for FVC estimation is reliable and capable of revealing actual earth surface variations.In summary, the results of this study indicate that the MARS method-derived FVC were temporally continuous and had good temporal consistency with the GRNNs derived FVC.The FVC temporal profiles of the sampling sites in the year of 2003 were also generated to compare the temporal consistency of the MARS derived and GRNNs derived FVC.Several representative FVC temporal profiles for cropland, grassland, shrubland and forestland, are shown in Figure 5.These representative FVC temporal profiles show good temporal consistency between the two data sets.The two data sets also have similar magnitude and dynamic range, and no missing data are observed.Moreover, the temporal profiles extracted from the MARS method-derived FVC are very smooth, whereas those extracted from the GRNNs method show very slight fluctuations.Furthermore, the two data sets exhibit similar seasonal variations, which could reflect actual vegetation growth characteristics.These seasonal vegetation changes and the temporal consistency between the MARS derived FVC and the GRNNs derived FVC indicated that the MARS method for FVC estimation is reliable and capable of revealing actual earth surface variations.In summary, the results of this study indicate that the MARS method-derived FVC were temporally continuous and had good temporal consistency with the GRNNs derived FVC.To evaluate the performance of the MARS method further, 44 ground measurement-based samples from VALERI sites were used to directly validate the results from the MARS method and compare them with those from the GRNNs method (Figure 6).There was a reliable agreement between the FVC estimated from both the MARS and GRNNs methods and the ground measurements.The performance of the MARS method (R 2 = 0.836, RMSE = 0.1488) was similar to that of the GRNNs method (R 2 = 0.8353, RMSE = 0.1459), which indicated an acceptable level of consistency between their FVC estimates.In addition, it also could be seen that there were small differences between the validation results of the GRNNs method in this study and the results in [27].The reason for the differences was that the FVC in [27] was first aggregated to the same scale as that of the GEOV1 FVC product (spatial resolution of 1 km) to compare their performances.Therefore, the small differences are reasonable.To evaluate the performance of the MARS method further, 44 ground measurement-based samples from VALERI sites were used to directly validate the results from the MARS method and compare them with those from the GRNNs method (Figure 6).There was a reliable agreement between the FVC estimated from both the MARS and GRNNs methods and the ground measurements.The performance of the MARS method (R 2 = 0.836, RMSE = 0.1488) was similar to that of the GRNNs method (R 2 = 0.8353, RMSE = 0.1459), which indicated an acceptable level of consistency between their FVC estimates.In addition, it also could be seen that there were small differences between the validation results of the GRNNs method in this study and the results in [27].The reason for the differences was that the FVC in [27] was first aggregated to the same scale as that of the GEOV1 FVC product (spatial resolution of 1 km) to compare their performances.Therefore, the small differences are reasonable.To evaluate the performance of the MARS method further, 44 ground measurement-based samples from VALERI sites were used to directly validate the results from the MARS method and compare them with those from the GRNNs method (Figure 6).There was a reliable agreement between the FVC estimated from both the MARS and GRNNs methods and the ground measurements.The performance of the MARS method (R 2 = 0.836, RMSE = 0.1488) was similar to that of the GRNNs method (R 2 = 0.8353, RMSE = 0.1459), which indicated an acceptable level of consistency between their FVC estimates.In addition, it also could be seen that there were small differences between the validation results of the GRNNs method in this study and the results in [27].The reason for the differences was that the FVC in [27] was first aggregated to the same scale as that of the GEOV1 FVC product (spatial resolution of 1 km) to compare their performances.Therefore, the small differences are reasonable.The MARS method had both high computational efficiency and estimation accuracy comparable to that of the GRNNs method.The global FVC maps generated from MARS were highly consistent with those generated from GRNNs.Furthermore, the MARS temporal profiles of various types of representative vegetation were smoother than those of GRNNs and capable of reflecting true growth trends.Overall, the MARS model was determined to be the most suitable method for generating the GLASS FVC product.

Conclusions
In this study, four commonly used machine learning methods were evaluated for their capability to generate the GLASS FVC product.The four machine learning methods were trained with identical sample data from global sampling locations, and the MARS model was preliminarily selected as the most suitable method after comparing the fitting accuracies and computational efficiencies of the four methods.Furthermore, a direct validation using ground FVC measurements and a comparison of the spatial and temporal consistencies of MARS derived and GRNNs derived FVC indicated that the accuracy of the MARS method was comparable to the accuracy of the GRNNs method for global FVC estimates.In addition, the computational efficiency of the MARS method is superior to that of the GRNNs method.Therefore, the MARS method is a suitable alternative to the GRNNs method for generating the GLASS FVC product from MODIS data.Further work should focus on an extensive assessment of the performance of the MARS method using additional ground measurement data.

Figure 2 .
Figure 2. Performance of the four methods with different training sample size.(a) Coefficient of determination (R 2 ); (b) Root mean squared error (RMSE).Figure 2. Performance of the four methods with different training sample size.(a) Coefficient of determination (R 2 ); (b) Root mean squared error (RMSE).

Figure 2 .
Performance of the four methods with different training sample size.(a) Coefficient of determination (R 2 ); (b) Root mean squared error (RMSE).

Figure 2
Figure 2 shows that R 2 increases while RMSE decreases with sample size, with some fluctuations.Generally, GRNNs performed slightly better than SVR and slightly worse than MARS and BPNNs.All of the four methods had satisfactory performance (R 2 > 0.96, RMSE < 0.07) given sufficient training samples.The BPNNs and MARS had lower sensitivity than the GRNNs to training sample size variations, and SVR was very sensitive to training sample size.All of the four methods began to have stable performance when the training sample size was large enough.In addition, a turning point was noted between the sample sizes of 304 and 456.Prior to the sample size of 304, the values of R 2 and RMSE changed quickly, and the four models were sensitive to sample size variation.Subsequent to the training sample size of 456, the performances of the four methods, particularly the MARS method, began to stabilize, and the R 2 and RMSE values showed only slight improvement.With a training sample size of 15,282, over-fitting was not observed, and all of the four methods, particularly the MARS method, presented their best performance (R 2 > 0.96, RMSE < 0.07).Therefore, 15,282 training samples were used as the final training data set.The computation time required for model training and estimation differed considerably among the four methods (Table2).The computational efficiency of the GRNNs and SVR methods was very low when large datasets were used; therefore, these two methods were not suitable for generating a long time series FVC product.Additionally, when the training sample size was reduced, their estimation accuracy decreased.Although the four methods were implemented on different platforms, the comparison of computation time was still reasonable for the assertion that the computational efficiency of MARS and BPNNs was better than that of GRNNs and SVR.One year of FVC estimates were also derived using the four methods at the BELMANIP sites containing all kinds of vegetation types.Due to a lack of ground truth data and the fact that GRNNs have proven to be very efficient for FVC estimation, the FVC estimates of the other three methods were compared with those of the GRNNs method.Scatterplots between the GRNNs derived FVC and the BPNNs, SVR and MARS derived FVCs are shown in Figure3.The results show that the other three methods performed well in estimating FVC and had good consistency with the GRNNs method (R 2 > 0.98, RMSE < 0.037).However, the BPNNs method had a small overestimation problem for FVC values from 0.05 to 0.2.In contrast, the MARS method had better consistency with the GRNNs method compared with the BPNNs and SVM methods.

Figure 3 .
Figure 3. Scatterplots of the comparison results of the fractional vegetation cover (FVC) estimates from the general regression neural networks (GRNNs) with the three other methods: (a) back-propagation neural networks (BPNNs); (b) support vector regression (SVR); (c) multivariate adaptive regression splines (MARS).

Figure 3 .
Figure 3. Scatterplots of the comparison results of the fractional vegetation cover (FVC) estimates from the general regression neural networks (GRNNs) with the three other methods: (a) backpropagation neural networks (BPNNs); (b) support vector regression (SVR); (c) multivariate adaptive regression splines (MARS).

Figure 4 .
Figure 4. Global land surface FVC maps generated using the MARS and GRNN methods for day 201 (20 July) of 2003: (a) the MARS method; (b) the GRNNs method; (c) the global FVC difference map between the MARS and GRNNs methods.

Figure 4 .
Figure 4. Global land surface FVC maps generated using the MARS and GRNN methods for day 201 (20 July) of 2003: (a) the MARS method; (b) the GRNNs method; (c) the global FVC difference map between the MARS and GRNNs methods.

Figure 5 .
Figure 5. Temporal profiles of the MARS derived FVC and the GRNNs derived FVC over sampling sites for year 2003: (a) Evergreen needle forest; (b) Grassland; (c) Cropland; and (d) Open shrubland.

Figure 6 .
Figure 6.Scatterplots of FVC estimated using the two methods ((a) MARS method; (b) GRNNs method) against the ground FVC measurements.

Figure 5 .
Figure 5. Temporal profiles of the MARS derived FVC and the GRNNs derived FVC over sampling sites for year 2003: (a) Evergreen needle forest; (b) Grassland; (c) Cropland; and (d) Open shrubland.

Figure 5 .
Figure 5. Temporal profiles of the MARS derived FVC and the GRNNs derived FVC over sampling sites for year 2003: (a) Evergreen needle forest; (b) Grassland; (c) Cropland; and (d) Open shrubland.

Figure 6 .
Figure 6.Scatterplots of FVC estimated using the two methods ((a) MARS method; (b) GRNNs method) against the ground FVC measurements.

Figure 6 .
Figure 6.Scatterplots of FVC estimated using the two methods ((a) MARS method; (b) GRNNs method) against the ground FVC measurements.

Table 1 .
Characteristics of the sites selected for accuracy assessment.

Table 2 .
The statistical performances of the four machine learning methods with 15,282 training samples.

Table 2 .
The statistical performances of the four machine learning methods with 15,282 training samples.