Assessment of Leaf Chlorophyll Content Models for Winter Wheat Using Landsat-8 Multispectral Remote Sensing Data

: The leaf chlorophyll content (LCC) is a critical index to characterize crop growth conditions, photosynthetic capacity, and physiological status. Its dynamic change characteristics are great signiﬁcance for monitoring crop growth conditions understanding the process of and energy exchange between crops and the environment. Extensive research on LCC retrieval with hyperspectral data onboard various sensor platforms. Nevertheless, limited attention has been paid to LCC inversion from multispectral data, such as the data from Landsat-8, and the potentials and capabilities of the data for crop LCC estimation have not been fully explored. The present study made use of Landsat-8 Operational Land Imager (OLI) imagery and the corresponding ﬁeld experimental data to evaluate their capabilities and potentials for LCC modeling using four di ﬀ erent retrieval methods: vegetation indices (VIs), machine learning regression algorithms (MLRAs), lookup-table (LUT)-based inversion, and hybrid regression approaches. The results showed that the modiﬁed triangular vegetation index (MTVI2) exhibited the best estimate accuracy for LCC retrieval with a root mean square error (RMSE) of 5.99 µ g / cm 2 and a relative RMSE (RRMSE) of 10.49%. Several other vegetation indices that were established from red and near-infrared (NIR) bands also exhibited good accuracy. Models established from Gaussian process regression (GPR) achieved the highest accuracy for LCC retrieval (RMSE = 5.50 µ g / cm 2 , RRMSE = 9.62%) compared with other MLRAs. Moreover, red and NIR bands outweighed other bands in terms of GPR modelling. LUT-based inversion methods with the “K(x) = − log (x) + x” cost function that belongs to the “minimum contrast estimates” family showed the best estimation results (RMSE = 8.08 µ g / cm 2 , RRMSE = 14.14%), and the addition of multiple solution regularization strategies e ﬀ ectively improved the inversion accuracy. For hybrid regression methods, the use of active learning (AL) techniques together with GPR for LCC modelling signiﬁcantly increased the estimation accuracy, and the combination of entropy query by bagging (EQB) AL and GPR had the best accuracy for LCC 12.43 µ cm 2 , RRMSE 21.77%).


Introduction
The interaction between global environmental change and terrestrial ecosystems has always been one of the central issues in the study of global change [1]. Vegetation, which covers 70% of the global land area, is an essential indicator of the change of the land ecological environment. It is also the major object of earth observation with remote sensing techniques. The ecological processes related to plant material energy exchange, for instance, photosynthesis, transpiration, respiration, and primary productivity, are in close connection with the biophysical and biochemical parameters of the vegetation. Among these parameters, chlorophyll is a crucial antenna pigment, which is responsible for light absorption and transfer in photosynthesis. Changes in the leaf chlorophyll content (LCC) thus directly affect biochemical processes such as photosynthesis and primary productivity [2]. In agricultural remote sensing research, chlorophyll is also used as an important index of crop growth conditions [3], and its content variations are related to crop stress, the aging process, and nitrogen nutrition [4]. Therefore, quantitative analysis of LCC has important significance, not only for understanding the process of material and energy exchange between plants and the environment, but also for monitoring crop growth, nutritional status, and stress conditions in agricultural applications.
Owning to its remarkable absorption characteristics in the visible range, nondestructive estimation of LCC is possible with spectral analysis and remote sensing techniques, and numerous studies have focused on chlorophyll retrieval methods [5][6][7][8]. Generally, the retrieval approaches can be classified into four methodological categories: parametric regression methods, nonparametric regression methods, physically based model inversion methods, and hybrid regression methods [9], and each method has captured varying degrees of attention in chlorophyll assessment when using multi-spectral or hyperspectral datasets acquired from ground-based, airborne, and space-borne sensors.
Parametric regression methods, such as vegetation indices (VIs), and spectra of first-order and second-order differential characteristics, have been extensively used for chlorophyll retrieval. For example, Pu and Gong [10] compared and analysed the relationship between hyperspectral reflectance, its first-order and second-order differential characteristics, and the leaf chlorophyll content, and it was found that the first-order differential value at 725 nm and the second-order differential value at 705 nm had the highest correlation with LCC, and the values of the correlation coefficients were both higher than 0.7. Based on Medium Resolution Imaging Spectrometer (MERIS) satellite data, Dash and Curran [11] proposed the MERIS terrestrial chlorophyll index (MTCI) using red and red-edge band data, and found that MTCI was suitable for accurate estimation of the crop chlorophyll content. Gitelson et al. [5] established two chlorophyll indices, i.e., green and red-edge chlorophyll indices (CIgreen and CIred-edge), respectively, using a conceptual model, and these two indices showed excellent performance in canopy chlorophyll content retrieval. Yu et al. [4] proposed a ratio of the reflectance difference index (RRDI) based on the multiple scatter correction (MSC) theory. The results indicated that RRDI was accurate for LCC assessment, and it could alleviate the effect of structural characteristics on LCC retrieval to some extent.
Different from parametric methods that use spectral features established from several specific bands, nonparametric methods take advantage of full-spectrum information based on training data to optimize regression algorithms [12]. For instance, Tang et al. [13] investigated and compared multiple linear regression (MLR), back propagation, radial basis function neural networks (BPNN, RBFNN), and partial least squares regression (PLSR) for assessing LCC in soybean plants. Their results suggested that these regression algorithms with wavelet analysis could achieve good estimation results. Among them, RBFNN and PLSR with a Gaussian kernel function showed the best accuracy and stability for LCC retrieval. Zhao et al. [6] utilized three methods, i.e., the Bayesian model average (BMA), PLS, and stepwise multiple regression (SMR), for LCC assessment with abundant measured leaf data. It was found that these three models achieved a good estimation accuracy. Moreover, the BMA algorithm could alleviate the overfitting problem and improve the generalization of the established LCC model compared with PLS and SMR; thus, it was more suitable for LCC retrieval. Based on spaceborne Compact High Resolution Imaging Spectrometer (CHRIS) data and airborne Compact Airborne Spectrographic Imager (CASI) data, Verrelst et al. [14] investigated and tested the Gaussian process regression (GPR) algorithm for LCC estimation. Their results suggested that GPR was suitable for LCC retrieval.
Physically based model inversion was established on the basis of radiative transfer models (RTMs). RTMs are quantitative models that explain the mechanism describing the relationship between spectral reflectance and vegetation biophysical and biochemical parameters. These models can be used to perform abundant simulations based on a robust understanding of physical, chemical, and biological processes [15]. The process with plant input parameters to simulate leaf-or canopy-level reflectance is called 'forward', and inversion is the inverse process. Among all RTMs, the leaf optical properties model PROSPECT and canopy bidirectional reflectance model SAIL (Scattering by Arbitrary Inclined Leaves) are widely used in the remote sensing community. Darvishzadeh et al. [16] tested the capability of PROSAIL RTM and ALOS AVNIR-2 multispectral image data using a lookup-table (LUT) approach for assessing the canopy chlorophyll content in paddy rice. Their results demonstrated the ability of the PROSAIL inversion method to estimate the canopy chlorophyll content in paddy rice using ALOS AVNIR-2 multispectral data. For the sake of alleviating the ill-posed issue of LUT-based RTM inversion methods, Rivera et al. [17] analyzed different regularization strategies, including varied cost functions (CFs), applying different levels of noise, and employing multiple best solutions, to relieve the problem of LCC estimation. Their results showed that LUT-based RTM inversion methods together with different regularization strategies evidently improved the estimation accuracy, and employment of a normalized "L1-estimate" CF in the inversion process achieved the best estimation with a relative error of 17.6%. Zhang and Wang [18] conducted research on the assessment of LCC in Tamarix ramosissima via inversion of PROSPECT RTM by introducing a merit function. They used its calibrated version instead of the original PROSPECT-4 and found that the calibrated PROSPECT-4 was more accurate for the retrieval of LCC with a root mean square error (RMSE) value of 28.79 mg/m 2 . Croft et al. [19] evaluated the capability of LUT-based RTM inversion methods for LCC assessment with multi-spectral Landsat-8 imagery. They adopted a two-step inversion process using coupled PROSPECT and SAIL RTMs, and it exhibited an accurate estimation (RMSE = 16.18 µg/cm 2 ) of LCC with Landsat-8 data.
Hybrid regression methods take advantage of both physically based techniques and machine learning regression algorithms (MLRAs). That is, these approaches utilize abundant synthetic data simulated by RTMs instead of measured data collected from field campaigns for training machine learning regression models, so as to improve the generalization and computational efficiency of the models. For instance, Malenovsky et al. [8] investigated the combination of continuum removal and RTM for LCC retrieval from the data acquired by Airborne Imaging Spectroradiometer (AISA) Eagle. They applied a continuum removal technique to PROSPECT-DART (discrete anisotropic radiative transfer) simulations and then used these data to train an artificial neural network (ANN). Their ground validation results showed that the ANN and PROSPECT-DART hybrid approach was accurate for LCC estimation, with an RMSE value of 2.18 µg/cm 2 and a relative RMSE (RRMSE) value of 4.18%.
To mitigate the problem of computational costs for MLRAs, especially when the amount of RTMs training data is extremely large, Verrelst et al. [20] employed active learning (AL) techniques so as to optimize sample selection from simulated Sentinel-3 Ocean and Land Color Instrument (OCLI) data for training Kernel-based MLRAs. Their results suggested that AL methods were more efficient than random sampling in choosing appropriate samples for training the MLRAs, since MLRAs together with AL techniques exhibited better estimation accuracy than the results with random sampling. Research conducted by Upreti et al. [21] for LCC retrieval with Sentinel-2 data also supported the conclusion that the AL technique was efficient in selecting samples for training MLRAs.
The above-mentioned literature has indeed enriched the methodologies for LCC assessment with remote sensing techniques. Nevertheless, each retrieval method had its own drawbacks that need to be avoided or overcome. For parametric regression methods, the representativeness of experimental samples and the physical mechanism of remote sensing models are crucial to the effectiveness and universality of these models. However, the problem of overfitting training data collected from field experiments may be incurred by flexible model definitions when nonparametric methods are used. In order to mitigate this overfitting issue, various advanced machine learning algorithms had been considered for LCC retrieval [12]. In terms of RTM inversion, the inversion process is actually an ill-posed problem, since different combinations of leaf-level and canopy-level parameters could lead to very similar simulations of canopy reflectance. Moreover, simplifications and idealization of some processes in RTMs could produce inaccuracies for canopy reflectance modeling [22]. LUT-based RTM inversion strategies and different regularization strategies might be efficient to mitigate the ill-posed issue, and to better handle the inversion process. For hybrid regression methods, it should be noted that these approaches do not alleviate the main issues of RTMs; they merely use all available data simulated by RTMs to train machine learning regression models. Nevertheless, the main shortcoming of these models with respect to adopting hybrid methods is the computation cost. AL approaches, which are intended for selection of optimal samples from a training data pool, can be promising for obtaining an optimized training set and increasing computational efficiency for hybrid methods. Thus, optimizing and improving different LCC retrieval methods are needed, particularly for the application of these methods to various new sensors for LCC estimation.
In recent years, with the rapid advance of earth observation technologies, newly launched satellite sensors, such as the Gaofen (GF) series in China, Sentinel series in Europe, and Landsat-8 in the US, offer huge potential for enrichment of LCC retrieval methodologies. The Landsat-8 Operational Land Imager (OLI) is the newest senor by far in the Landsat observation project. Compared to previous sensors, the Landsat-8 OLI sensor has advanced spectral bands and radiometric resolution, a better signal-to-noise ratio, and it has been used for various purposes in the terrestrial ecosystem [23]. Nevertheless, limited studies have reported an investigation of Landsat-8 OLI data or Landsat series datasets for plant LCC retrieval [19,24,25]. Research on the potential and capability of LCC modelling using Landsat-8 OLI data has a profound influence: on the one hand, robust and accurate LCC models from different satellite sensors could be used together for deriving high-frequency LCC products for rapid monitoring of agricultural crops; on the other hand, these models could provide methods and technical support for applications of similar multispectral sensors onboard unmanned aerial vehicles (UAVs) for LCC estimation at a specific fine scale. Therefore, the aim of the present study was to assess the capability of Landsat-8 OLI data for LCC modelling with different retrieval methods. The specific objectives were to: (i) investigate the performance of broadband vegetation indices in LCC assessment with Landsat-8 OLI data; (ii) inspect the ability of machine learning regression algorithms in LCC retrieval; (iii) establish LUT-based RTM inversion based on Landsat-8 OLI data using different regularization strategies to optimize LCC estimation; and iv) explore the feasibility of hybrid methods using computationally demanding MLRAs with different active learning strategies for LCC retrieval.

Field Experiments
Field experiments were conducted in Shunyi District (40 • 08 N, 116 • 39 E), Beijing, China ( Figure 1) during the 2016 growing season. The area has a warm temperate climate, with mean annual rainfall of 620.0 mm, a mean annual temperature of 11.2 • C, and it is mainly characterized by fluvo-aquic soil. The average topsoil nutrient status of the region (0-0.30 m depth) was as follows: organic matter 14.23 g/kg, total nitrogen 0.90 g/kg, available phosphorus 30.63 mg/kg, and rapidly available potassium 123.75 mg/kg. In 2015, three major cultivars of winter wheat (Nongda 212, Zhongmai 12, and Nongda 5181) were planted in this district during the period from 25 September to 5 October. Fertilization and irrigation were applied according to local standard practice managed by farmers. In the 2016 campaigns, twenty-four elementary sampling units (ESUs), in which a single cultivar had been planted, were established in farmers' fields. The size of the ESUs was approximately 30 m × 30 m, corresponding to the spatial resolution of Landsat-8 imagery, and the locations were within the fields and far from the field borders. Four field surveys at different growth stages, whose dates were close to Landsat-8 acquisitions, were conducted during the whole growing period of winter wheat. Detailed information on the experiment is listed in Table 1.
Remote Sens. 2020, 12, x FOR PEER REVIEW 5 of 18 and irrigation were applied according to local standard practice managed by farmers. In the 2016 campaigns, twenty-four elementary sampling units (ESUs), in which a single cultivar had been planted, were established in farmers' fields. The size of the ESUs was approximately 30 m × 30 m, corresponding to the spatial resolution of Landsat-8 imagery, and the locations were within the fields and far from the field borders. Four field surveys at different growth stages, whose dates were close to Landsat-8 acquisitions, were conducted during the whole growing period of winter wheat. Detailed information on the experiment is listed in Table 1.

Ground Data Measurements
On each sampling date, chlorophyll readings of winter wheat were made in five homogeneous crop areas (each in a 1 m × 1 m area) randomly distributed inside each ESU using a SPAD-502 (Konica-Minolta, Tokyo, Japan). Measurements were performed on the top-most leaves of different wheat plants per area, and a total of 50 measurements were taken to obtain a mean value of chlorophyll SPAD reading per ESU. These SPAD readings were converted to the leaf chlorophyll content (mass per unit leaf area, μg·cm −2 ) using the relationship "LCC = (99SPAD)/(144-SPAD)" proposed by Cerovic et al. [26], which achieved a conversion accuracy of approximately 4 μg·cm −2 for monocot (wheat and maize) species. Apart from chlorophyll SPAD measurements, the central position of each ESU was geo-located with GPS measurements for subsequently associating the leaf chlorophyll content estimation with the corresponding Landsat-8 OLI multispectral data.

Landsat-8 Imagery Processing
Landsat-8 OLI images, which were close to the dates of field experiments, were firstly downloaded from Earth Explorer (https://earthexplorer.usgs.gov/). These downloaded images were then preprocessed through radiometric calibration and atmospheric correction to obtain surface a b

Ground Data Measurements
On each sampling date, chlorophyll readings of winter wheat were made in five homogeneous crop areas (each in a 1 m × 1 m area) randomly distributed inside each ESU using a SPAD-502 (Konica-Minolta, Tokyo, Japan). Measurements were performed on the top-most leaves of different wheat plants per area, and a total of 50 measurements were taken to obtain a mean value of chlorophyll SPAD reading per ESU. These SPAD readings were converted to the leaf chlorophyll content (mass per unit leaf area, µg·cm −2 ) using the relationship "LCC = (99SPAD)/(144-SPAD)" proposed by Cerovic et al. [26], which achieved a conversion accuracy of approximately 4 µg·cm −2 for monocot (wheat and maize) species. Apart from chlorophyll SPAD measurements, the central position of each ESU was geo-located with GPS measurements for subsequently associating the leaf chlorophyll content estimation with the corresponding Landsat-8 OLI multispectral data.

Landsat-8 Imagery Processing
Landsat-8 OLI images, which were close to the dates of field experiments, were firstly downloaded from Earth Explorer (https://earthexplorer.usgs.gov/). These downloaded images were then preprocessed through radiometric calibration and atmospheric correction to obtain surface reflectance data using radiometric calibration and Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes (FLAASH) processing tools in ENVI 5.1 software (Exelis Visual Information Solutions, Boulder, CO, USA, 2014). For the sake of concentrating on the winter wheat area, processed Landsat-8 imagery was clipped to a 1483 × 1121 pixel size using the boundary of Shunyi district, and then multi-band spectral data of the ESUs were extracted from the clipped images. Nevertheless, owning to slight cloud contamination of the images acquired on the dates of 18 April, 20 May, and 5 June, a total of 3, 2, and 5 ESUs that were contaminated by cloud in these images were not included. Finally, a total of 86 ESUs with corresponding Landsat-8 multi-band spectral data and leaf chlorophyll contents was considered for subsequent modelling and analysis.

PROSAIL Simulated Dataset
Leaf-level RTM PROSPECT-5 [15] coupled with the canopy bidirectional reflectance model 4SAIL [27], which is referred to as the PROSAIL model, was used for the generation of a simulated dataset in the present study. PROSPECT-5 simulates leaf directional-hemispherical reflectance and transmittance from 400 to 2500 nm with six input variables: leaf chlorophyll content (LChl), leaf carotenoid content (LCar), leaf structure parameter (N), leaf mass per area (LMA), equivalent water thickness (EWT), and brown pigments (Cbrown), while parameters such as the leaf area index (LAI), leaf angle distribution (LAD), hot-spot parameter (hots), fraction of diffuse incoming solar radiation (skyl), sun zenith angle (θs), view zenith angle (θv), relative azimuth angle (φ), leaf directional-hemispherical reflectance and transmittance derived from PROSPECT-5, and reflectance spectrum of moist and dry soil are needed for 4SAIL to output canopy bidirectional reflectance. Here, LAD was characterized by the average leaf angle (ALA), and ALA was illustrated by an ellipsoidal distribution [28]. Reflectance of moist and dry soil was measured during the field experiments with an ASD FieldSpec 3 spectrometer (Analytical Spectral Devices, Inc., Boulder, CO, USA). Moreover, a scaling factor (α soil ) was employed in 4SAIL so as to consider soil brightness variations as a function of these two soil types [14]. Table 2 presents the ranges, statistical distribution, and number of classes of the input parameters for PROSAIL model simulation. These parameters statistics were based on values measured during field campaigns and other existing studies [21]. Gaussian input distributions were used for N, LChl, Cm, LAI, ALA, α soil , and hots to fit the actual distribution of these parameters for winter wheat during the growing stages. Sun and viewing conditions were set to the same situation as the Landsat-8 satellite overpass. In total, 121,500 simulated canopy bidirectional reflectance data points were produced by performing PROSAIL using a random combination of all the input parameters. Then, the simulated data were resampled, using Landsat-8 spectral response functions, to six bands, i.e., blue, green, red, near-infrared (NIR), short-wave infrared 1 (SWIR 1), and short-wave infrared 2 (SWIR 2) bands, (Coastal, Pan, and Cirrus bands were excluded). Since differences might exist between simulated and actual Landsat-8 data, Gaussian noise was added to the simulated data so as to better describe the actual Landsat-8 characteristics. The equation for computing the Gaussian noise was as follows [29]: where R * (λ) and R(λ) are the processed Landsat-8 reflectance with noise and the unprocessed simulated reflectance data, respectively. MD and MI are the multiplicative wavelength dependent noise and multiplicative wavelength independent noise, respectively. Similarly, AD and AI are the additive wavelength dependent noise and independent noise, respectively. Referring to Weiss and Baret [29] and Upreti et al. [21], A value of 0.01 was used for AD and AI, and values of MD and MI were set as 2% and 1%, respectively, for all ands.

Chlorophyll Modelling Methods
To evaluate the capability of Landsat-8 OLI data for chlorophyll assessment, different retrieval methods, including vegetation indices (VIs), machine learning regression algorithms (MLRAs), lookup -table (LUT)-based inversion, and hybrid regression, were used. The methodology adopted in this work is shown in Figure 2.
Remote Sens. 2020, 12, x FOR PEER REVIEW 7 of 18 actual Landsat-8 data, Gaussian noise was added to the simulated data so as to better describe the actual Landsat-8 characteristics. The equation for computing the Gaussian noise was as follows [29]: where R * (λ) and R(λ) are the processed Landsat-8 reflectance with noise and the unprocessed simulated reflectance data, respectively. MD and MI are the multiplicative wavelength dependent noise and multiplicative wavelength independent noise, respectively. Similarly, AD and AI are the additive wavelength dependent noise and independent noise, respectively. Referring to Weiss and Baret [29] and Upreti et al. [21], A value of 0.01 was used for AD and AI, and values of MD and MI were set as 2% and 1%, respectively, for all ands.

Chlorophyll Modelling Methods
To evaluate the capability of Landsat-8 OLI data for chlorophyll assessment, different retrieval methods, including vegetation indices (VIs), machine learning regression algorithms (MLRAs), lookup -table (LUT)-based inversion, and hybrid regression, were used. The methodology adopted in this work is shown in Figure 2.

Vegetation Indices
Since the available bands of the Landsat-8 OLI sensor were blue, green, red, NIR, SWIR 1, and SWIR 2 bands, VIs composed by blue, green, red, and NIR bands were considered for evaluation of Landsat-8 OLI imagery for chlorophyll modelling. The selected VIs included the normalized difference vegetation index (NDVI), green normalized difference vegetation index (GNDVI), simple ratio (SR), modified simple ratio (MSR), enhanced vegetation index (EVI), enhanced vegetation index 2 (EVI2), optimized soil adjusted vegetation index (OSAVI), modified soil adjusted vegetation index (MSAVI), and modified triangular vegetation index (MTVI2). Detailed information on these VIs is listed in Table 3. Table 3. Spectral indices used in this study.

Spectral Index Formula a Reference
Normalized difference vegetation index (NDVI) Modified soil adjusted vegetation index (MSAVI) Modified triangular vegetation index (MTVI2) a R λ is the reflectance value at band λ (nm).

Machine Learning Regression Algorithms
MLRAs, which capture nonlinear relationships between the input (e.g., band reflectance features) and output (e.g., biochemical parameters) through training flexible modes with input datasets, are effective approaches for agronomy parameter retrieval. Here, six MLRAs were selected on consideration of their fast training, good performance, and popularity in various application domains for the purpose of investigation and evaluation of Landsat-8 OLI data for leaf chlorophyll content modelling. These six MLRAs were partial least square regression (PLSR), random forest (RF), feedforward neural networks (FNN), support vector regression (SVR), kernel ridge regression (KRR), and Gaussian processes regression (GPR). Six reflectance bands, i.e., blue, green, red, NIR, SWIR 1, and SWIR 2 bands, were used as input features for these MLRAs. Here, PLSR was optimized using a leave-one-out cross-validation (LOOCV) scheme to determine the number of latent variables by minimizing the predicted residual sums of squares (PRESS). As for the RF algorithm, the number of regression trees and the numerical value of a random subset of variables were optimized based on a k-fold cross-validation scheme. For FNN optimization, the Levenberg-Marquardt learning algorithm with a squared loss function was selected, and a k-fold cross-validation procedure was used to avoid overfitting problems. Initial weights of the FNN model were generated by the Nguyen-Widrow method, and model regularization was conducted by limiting the maximum number of net weights to half the number of training samples. For the implementation of SVR and KRR, a radial basis function (RBF) kernel was used for these two algorithms, regularization parameter, tolerance value, and a kernel parameter were used for SVR, and regularization and kernel parameters for KRR were optimized with a k-fold cross-validation strategy. Regarding the GPR model, a scaled Gaussian kernel function was selected, and model hyperparameters and weights were optimized by maximizing the marginal likelihood in the training data. A list of brief introductions on these methods is presented in Table 4. Table 4. Summary of the MLRA algorithms investigated in this study.

Brief Description Reference
Partial least square regression (PLSR) PLSR combines principal component analysis with canonical correlation analysis, which could overcome the problem of multicollinearity between traditional independent variables, and the extracted PLS factors could explain most of the variation in both the predictors and response variables. [39] Random forest (RF) RF regression is a fusion algorithm based on a decision tree classifier, which uses a bootstrap resampling method to extract multiple samples, and decision trees are constructed for each sample; then, the predicted average values of all decision trees are taken as the final prediction results. [40]

Brief Description Reference
Feedforward neural networks (FNN) Neural networks (NN) refer to a complex network structure formed by the interconnection of a large number of processing units (neurons). Here, the standard multi-layer FNN model was adopted, and the Levenberg-Marquardt learning algorithm with a squared loss function was selected to optimize the established NN structure. [41] Support vector regression (SVR) SVR maps training samples to a high-dimensional space and transforms a nonlinear problem in low-dimensional space to a linear problem in high-dimensional space, and then carries on linear modeling. Here, a radial basis function was used to transform nonlinear problems to linear ones. [42] Kernel ridge regression (KRR) KRR is a regression algorithm based on the kernel method. It uses a kernel function to map original data to a high-dimensional space. The mapped data show a linear relationship in the high-dimensional space, and the established model has a strong generalization ability. [43] Gaussian processes regression (GPR) GPR is a statistical learning method under the Bayesian framework, which is often used in nonlinear modeling. It can transform a prior distribution into a posterior model by training historical data, so as to obtain prediction results with probability significance. [44]

LUT-Based Inversion Strategies
LUT-based inversion strategies are extensively-used solutions in physically-based model inversion methodologies, which identify a synthetic reflectance set that is most similar to an actual one, by inquiring the LUT generated from RTMs, applying a cost function, and setting various regularization strategies. A cost function is used to minimize the difference between simulated and measured data for all wavebands [12], while regularization strategies are aimed at alleviating the ill-posed problem, and to better handle the inversion process. In the present study, ten cost functions (CFs) that belong to three different families ("information measures", "M-estimates", and "minimum contrast estimates") were investigated and compared. Detailed information on these ten CFs is listed in Table 5. In addition, two regularization strategies were used: the adding of Gaussian noise to the simulated canopy reflectance (Section 2.4), and a range from 0 (single best solution) to the mean of the 30% best solutions at an increment of 2% was included.

Hybrid Regression Methods
Different from MLRAs and LUT-based inversion, hybrid regression approaches use simulated data generated by RTMs and stored in LUTs for training machine learning regression models instead of ground collected data. Nevertheless, the training process of machine learning regression models might be the computational cost when the amount of used RTM-simulated data increases. For the sake of computational efficiency, we investigated and tested six AL methods that belong to two different families (uncertainty and diversity), with a kernel-based MLRA (i.e., GPR) for the capability of hybrid regression methods in LCC modelling using Landsat-8 OLI data. These AL techniques were variance-based pool of regressors (PAL), entropy query by bagging (EQB), residual regression AL (RSAL), angle-based diversity (ABD), Euclidean distance-based diversity (EBD), and cluster-based diversity (CBD). For details on these six AL approaches, refer to Verrelst et al. [20]. For AL implementation, ten subsets each with a number of 2500 were randomly selected from the simulation in Section 2.4 as learning datasets. A random subset of 50 samples was firstly chosen from one learning dataset as initial training data for model training by GPR. Then, samples from the remaining learning dataset (2450 samples) were added to the preliminary training data (50 samples per iteration) using six different AL techniques, with a stopping criteria of 100 iterations or an RMSE decrease lower than 50%. This process was repeated 10 times with different learning datasets. In comparison, the full subset of 2500 samples was trained using GPR without ALs as a reference.

Statistical Analysis
For VI methods, linear regression models were adopted for establishing relationships between LCC and VIs derived from Landsat-8 OLI data. A k-fold (k = 10) cross-validation procedure and the same k-fold partitions were used for both VI methods and MLRAs. The performance of the VI methods and MLRAs was evaluated by examining the cross-validation estimation of RMSE and RRMSE. For LUT-based inversion strategies and hybrid regression methods, experimental measured data (n = 86) were used for ground validation with RMSE and RRMSE. The implementation of LUT-based inversion and hybrid regression was performed using the Automated Radiative Transfer Models Operator (ARTMO) Toolbox version 3.26 [9] within Matlab software.

LCC Estimation with VIs
The calibration results for LCC assessment with VIs are reported in Table 6. The performance of different VIs for LCC modelling varied. NDVI exhibited an acceptable relationship with LCC, with an R 2 value of 0.42, an RMSE of 6.61 µg/cm 2 , and an, RRMSE of 11.58%. In comparison, the ability of GNDVI to assess LCC was rather poor (R 2 = 0.28, RMSE = 7.39 µg/cm 2 , RRMSE = 12.95%). SR showed a poor relationship with LCC, with a R 2 value of 0.34, an RMSE of 7.09 µg/cm 2 , and an RRMSE of 12.41%. As for MSR, its relationship with LCC was slightly improved (R 2 = 0.38, RMSE = 6.87 µg/cm 2 , RRMSE = 12.04%). Compared with the performance of EVI (R 2 = 0.37, RMSE = 6.91 µg/cm 2 , RRMSE = 12.10%), EVI2 behaved much better in estimating LCC, with an R 2 value of 0.52, an RMSE of 6.02 µg/cm 2 , and an RRMSE of 10.54%. OSAVI and MSAVI exhibited good relationships with LCC, but MSAVI performed slightly better (R 2 = 0.52, RMSE = 6.01 µg/cm 2 , RRMSE = 10.53%). Among all VIs, MTVI2 exhibited the best relationship with LCC, with an R 2 value of 0.55, an RMSE of 5.82 µg/cm 2 , and an RRMSE of 10.19%. Note: x and y in the Equation column refer to VIs and LCC, respectively. ** indicates statistical significance at 0.01.
The cross-validation results of these VIs for LCC estimation are presented in Figure 3. The performances of all VIs during cross validation showed a good consistency with their behavior in the calibration process. MTVI2 showed the best estimation accuracy of LCC, with an R 2 value of 0.53, an RMSE of 5.99 µg/cm 2 , and an RRMSE of 10.49%. The fitted line between the measured and predicted LCC was the closest to the 1:1 line among all VIs. MSAVI, EVI2, and OSAVI exhibited satisfactory estimation results of LCC (R 2 > 0.4, RMSE < 6.5 µg/cm 2 , RRMSE < 11.3%). Among them, MSAVI and EVI2 showed a very similar prediction accuracy, which accorded with their behavior in the calibration results. NDVI, EVI, and MSR showed acceptable results in LCC estimation with R 2 around 0.35, and RMSE and RRMSE approximated 7.0 µg/cm 2 and 12%, respectively. SR and GNDVI showed the worst estimation of LCC with R 2 lower than 0.30, RMSE higher than 7.2 µg/cm 2 , and RRMSE higher than 12.7%. Although the estimation accuracy differed among selected VIs, it was obvious that all these VIs had obvious insensitivity to low LCC values (< 30 µg/cm 2 ). Furthermore, two sample points with LCC values of approximated 80 µg/cm 2 were evidently underestimated by all VIs.

MLRAs in LCC Assessment
The performance of different MLRAs in LCC estimation is shown in Figure 4. Overall, these MLRAs exhibited good estimation results compared with VI methods. PLSR, RF, FNN, and SVR showed a similar estimation accuracy; SVR performed slightly better (R 2 = 0.50, RMSE = 6.16 µg/cm 2 , RRMSE = 10.80%). Among them, the fitted line between measured and predicted LCC produced by FNN was the closest to the 1:1 line. Compared with these four MLRAs, KRR behaved even better with an R 2 value of 0.54, an RMSE of 5.94 µg/cm 2 , and an RRMSE of 10.39%. Moreover, the slope of its fitted line between measured and predicted LCC was nearer to 1, and the intercept value was closer to 0. Among all MLRAs, GPR showed the most accurate results for LCC estimation (R 2 = 0.60, RMSE = 5.50 µg/cm 2 , RRMSE = 9.62%). Furthermore, it showed more sensitivity to LCC values lower than 40 µg/cm 2 compared to other MLRAs. However, similar to VI methods, two sample points with LCC values close to 80 µg/cm 2 were also underestimated by all MLRAs.

LUT-Based Inversion for LCC Estimation
Since the mean of multiple best solutions that ranged from 0 to 30% at an increment of 2% was used for all selected CFs, only the top ten ranked ground validation results of LUT-based inversion for LCC retrieval are reported in Table 7. For different CFs, their performance varied markedly. "Pearson chi-square" CF with 8% multiple solutions had the best estimation results (RMSE = 8.94 µg/cm 2 , RRMSE = 15.65%), while "Negative exponential disparity" CF (30% multiple solutions) showed the worst results with an RMSE of 18.54 µg/cm 2 and an RRMSE of 32.48% among the "information measures" category. "K-divergence Lin" and "Jeffreys-Kullback-Leibler" CFs exhibited similar estimation results with RMSE close to 16 µg/cm 2 , and RRMSE approximated 28%. For CFs among the "M-estimates" type, "Geman and McClure" with 30% multiple solutions had slightly better results (RMSE = 10.20 µg/cm 2 , RRMSE = 17.86%) than those of the "Least absolute error". In comparison, "root mean square error" CF with 30% multiple solutions showed rather poor results, with an RMSE of 17.75 µg/cm 2 and an RRMSE of 31.08%. The "K(x) = −log (x) + x" CF, which belongs to the "minimum contrast estimates" family with 4% multiple solutions, showed the best estimation accuracy (RMSE = 8.08 µg/cm 2 , RRMSE = 14.14%) among all the selected 10 CFs. By contrast, the other two CFs from "minimum contrast estimates", i.e., "K(x) = log (x) + 1/x" and "K(x) = log (x) 2 ", showed slightly inferior estimates with values of RMSE and RRMSE close to 13 µg/cm 2 and 23%, respectively.

Hybrid Regression Methods in LCC Modelling
The results of GPR with different AL methods for LCC retrieval are presented in Table 8. The GPR approach for LCC estimation is also included for comparison. For AL approaches that belong to the diverse family, GPR with CBD behaved the best in the cross-validation process, with an RMSE of 13.83 µg/cm 2 and an RRMSE of 24.60%, while GPR with EBD showed the best results in the ground-validation process (RMSE = 14.46 µg/cm 2 , RRMSE = 25.33%). By contrast, GPR with ABD exhibited inferior estimations for both the cross-validation and ground-validation processes. Different from the results of diverse ALs, the estimation results of GPR with uncertainty ALs (i.e., PAL, EQB, and RSAL) in the ground validation were better than those in the cross validation. Furthermore, the ground-validation estimation results from uncertainty ALs were slightly better than those from diverse ALs. GPR with EQB showed the best estimation accuracy, with an RMSE of 12.43 µg/cm 2 and an RRMSE of 21.77% in the ground validation. Nevertheless, it behaved the worst among all the six ALs in the cross validation. GPR with PAL and RASL showed almost the same results in the cross-validation process. Regarding the ground-validation results, GPR with RASL behaved slightly better than PAL. As for GPR without ALs, it showed similar results to the above-mentioned GPR with ALs in the cross-validation process. Nevertheless, it exhibited relatively poor results in the ground-validation procedure.

Discussion
Landsat-8 OLI is one of the most remarkable sensors among the Earth Observation projects. Acquired data from this platform have been used for a variety of agricultural applications, such as crop leaf area index estimation, soil moisture retrieval, and crop monitoring [45][46][47]. Nevertheless, its potentials and capabilities for crop leaf chlorophyll content estimation have not been fully explored. The present study took advantage of Landsat-8 OLI imagery and the corresponding field experimental data to completely evaluate its capabilities and potentials for LCC modeling using four different retrieval methods including VIs, MLRAs, LUT-based inversion, and hybrid regression approaches. Overall, the LCC estimation results exhibited good accuracy, which accorded with the research of Croft et al. [19] and Yin et al. [48], suggesting that Landsat-8 OLI data are suitable for crop LCC retrieval.
For LCC assessment, VIs that consisted of blue, green, red, and NIR bands were considered on account of the band settings of the Landsat-8 OLI sensor. Even though some VIs, for instance, MTVI2, MSAVI, and EVI2, were not intended for chlorophyll retrieval, they still exhibited good accuracy among all the VIs for LCC estimation. MTVI2 was constructed for increasing the sensitivity to the leaf area index while minimizing chlorophyll influence [38]. MSAVI aims to increase the dynamic range of vegetation signals and minimize soil background influences [37]. EVI2 was put forward to increase the sensitivity of vegetation features to high biomass regions while decoupling background signals and reducing atmosphere influences [35]. Compared with the performance of NDVI, these three indices showed much better results for LCC estimation, suggesting that the modifications of these three indices improved LCC estimation accuracies, particularly for MSAVI and EVI2 since they are composed of red and NIR bands, which is the same as NDVI. The center of the OLI red band is close to the absorption peaks of chlorophyll a and b at 662 nm and 644 nm [2], which could partly explain the good performance of these VIs. Furthermore, the LCC values used in this study were converted from SPAD readings, while SPAD readings were calculated from the transmission features of red (650 nm) and infrared (940 nm) light [49]. This could also account for the good performance of MTVI2, MSAVI and EVI2 in LCC estimation despite that their original purposes were not for LCC assessment. It is worth noting that combination of red and NIR bands showed better results than that of the combination of green and NIR bands since NDVI exhibited more accurate results than GNDVI. Overall, all these VI results suggest that red and NIR bands are critical for LCC assessment with Landsat-8 OLI data.
Compared with VI methods, MLRAs generally had slightly better results, since they utilized all band information and nonlinear transforms. NN gained attention for agronomic parameter modelling and operational products in previous studies [29,50,51]. Here, FNN did not outperform other MLRAs and showed a rather similar estimation to that of PLSR, RF, and SVR, suggesting that it might not be the most adequate algorithm. The methodologies of PLSR, RF, and SVR are different from each other, and they exhibited different performances for varied agronomy parameter retrieval in previous studies using hyperspectral data [4,52,53]. Here, they exhibited very similar estimation results. This might be attributed to the confined broadbands (i.e., 6 bands) used in these models. In comparison, KRR and GPR showed even better estimation results. These accurate results accord with their performance in previous research [54,55]. Among all MLRAs, GPR is the most capable for not only maintaining very good numerical performance and stability but also for largely overcoming the blackbox issue, by providing ranking features (bands) that are used in the model [14]. According to GPR sigma band analysis, we found that the red band and NIR band are the top two bands frequently used in GPR models, which indicates that these two bands are critical for GPR modeling. This could also support the phenomenon that VIs composed of red and NIR bands showed good results for LCC estimation.
In terms of LUT-based inversion methods, ten CFs with different multiple best solution regularization strategies showed varied behaviors for LCC retrieval. The results suggest that the "root mean square error" CF, which was extensively-used in some previous studies [56,57], might not be the optimal CF for LCC inversion with Landsat-8 OLI data since it exhibited rather poor estimation. In comparison, CFs such as "Pearson chi-square", "Geman and McClure", and "K(x) = −log (x) + x" that belong to three different families, had much better estimates. Among them, "K(x) = −log (x) + x" showed the best inversion accuracy. These results accord with the works of Rivera et al. [17] and Verrelst et al. [58]. The use of multiple solution regularization strategies did improve the inversion accuracy of different CFs as compared with the cases without using them. However, it seems that high values of multiple solutions were more effective than low values in regulating LUT-based inversion since most CFs achieved good estimations when high values of multiple solutions (i.e., 30%) were used. For noise regularization, a Gaussian noise model was used, and the same noise criterion (details in Section 2.4) was adopted for both LUT-based inversion methods and hybrid regression approaches, in order to make a comparison between them. Generally, LUT-based inversion methods were more effective than hybrid regression approaches in LCC retrieval with Landsat-8 OLI data, since LUT-based inversion methods with most CFs exhibited better LCC estimation. Reasons for this might be largely connected with the data sizes that were different in using these two methods: LUT-based inversion used all the simulated data (n = 121,500) for modelling, whilst partial simulation (n = 2500) was used for establishing hybrid regression models. Compared with the results from the full training data set with GPR, the use of AL methods with GPR led to superior retrieval accuracies, and all AL techniques actually exhibited similar estimation results for ground validation. AL methods from the diversity family showed consistent results for cross-validation and ground-validation processes, whereas the uncertainty AL exhibited quite a difference between the two processes, especially for the results of EQB with GPR, suggesting that it might be unstable though it achieved the best accuracy for ground-validation. Even though diversities existed between different ALs for training GPR models, we can conclude that AL approaches were fairly effective and accurate for LCC retrieval with hybrid regression methods.

Conclusions
In this study, the potential and capability of Landsat-8 OLI multispectral data for LCC assessment in winter wheat were comprehensively investigated and evaluated, using different retrieval methods including broadband VIs, MLRAs, LUT-based inversion, and hybrid regression approaches. Overall, the LCC estimation results exhibited good accuracies except variations existed between different retrieval methods. Among the selected VIs, MTVI2 showed the best estimation accuracy with an RMSE of 5.99 µg/cm 2 and an RRMSE of 10.49%. VIs (i.e., MSAVI, EVI2, OSAVI) established from red and NIR bands also exhibited good accuracy for LCC estimation. MLRAs generally had slightly better results compared to those of VIs. GPR best captured the variations in LCC with the highest accuracy for LCC retrieval (RMSE = 5.50 µg/cm 2 , RRMSE = 9.62%). Furthermore, the red band and NIR bands outweighed other bands in GPR modelling, suggesting these two bands are of great importance for LCC retrieval. LUT-based inversion methods with different CFs exhibited varied results. "K(x) = −log (x) + x" CF that belongs to the "minimum contrast estimates" family had the best accuracy (RMSE = 8.08 µg/cm 2 , RRMSE = 14.14%), followed by the "Pearson chi-square" and "Geman and McClure" CFs from "information measures" and "M-estimates" families, respectively. Moreover, the addition of multiple solution regularization strategies improved the inversion accuracy compared with the cases without using them. Owing to the computational cost and limited simulated data for modelling, hybrid regression methods with GPR exhibited inferior estimation compared to the results of LUT-based inversion. Nevertheless, the use of AL techniques together with GPR for LCC modelling significantly increased the estimation accuracy compared with the results from the full training data set with GPR, and the combination of EQB and GPR had the best accuracy for ground validation (RMSE = 12.43 µg/cm 2 , RRMSE = 21.77%). On the basis of all tests carried out in this work with different retrieval methods, it can be concluded that Landsat-8 OLI multispectral data can be accurately used for crop LCC retrieval.