An Evaluation of Machine Learning Methods for Leaf Area Index Retrieval

Wang, Dong; Miao, Lijuan; Lu, Yutian; Jiang, Hanyang; Liu, Qiang

doi:10.3390/rs18121884

by Dong Wang ¹,², Lijuan Miao ³, Yutian Lu ³, Hanyang Jiang ² and Qiang Liu ¹,²,*

¹ State Key Laboratory of Climate System Prediction and Risk Management, Nanjing University of Information Science and Technology, Nanjing 210044, China

² School of Remote Sensing and Geomatic Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China

³ School of Geographical Science, Nanjing University of Information Science and Technology, Nanjing 210044, China

* Author to whom correspondence should be addressed.

Remote Sens. 2026, 18(12), 1884; https://doi.org/10.3390/rs18121884

Submission received: 11 May 2026 / Revised: 3 June 2026 / Accepted: 4 June 2026 / Published: 7 June 2026


Highlights

What are the main findings?

Multi-regional benchmarking shows ensemble machine learning (ML) methods, especially GBTR, achieve the highest LAI retrieval accuracy across diverse biomes.
ML-based LAI estimates outperform operational products but exhibit a systematic compression effect at low and high LAIs.

What are the implications of the main findings?

The compression effect reveals limitations of purely data-driven models, highlighting the need for physical constraints.
The results guide robust ML algorithm selection for scalable global LAI mapping, including data-sparse regions.

Abstract

The Leaf Area Index (LAI) serves as a vital biophysical parameter for quantifying vegetation dynamics and ecosystem functioning. While traditional LAI retrieval methods face challenges in handling nonlinear spectral-vegetation relationships, machine learning (ML) approaches offer promising alternatives through their data-driven adaptability. This study presents a comprehensive cross-site assessment of 13 ML algorithms for LAI estimation, leveraging ground observations from 98 sites worldwide. Our systematic assessment reveals three key findings. First, ensemble methods consistently outperformed other approaches, with Gradient Boosting Tree Regression (GBTR) achieving superior accuracy (R² = 0.647, RMSE = 0.899) and robustness (ΔR² < 0.05 beyond n = 69 training samples). Second, Gaussian Process Regression (GPR) exhibited exceptional stability across varying training sizes (R² = 0.607 ± 0.012), highlighting its reliability for data-limited scenarios. Third, all tested ML models substantially outperformed operational LAI products; the GBTR model (external validation R² = 0.647) exceeded the R² of the MODIS product by 0.489. This balance of accuracy, computational efficiency, and resistance to overfitting positions GBTR as a reasonable choice for large-scale LAI mapping. These findings underscore ML’s promising potential in vegetation monitoring while highlighting the need for hybrid approaches that combine physical principles with data-driven learning to address current limitations in extreme-value estimation and ecological generalizability.

Keywords:

Leaf Area Index (LAI); machine learning; remote sensing; Gradient Boosting Tree Regression (GBTR); MODIS

1. Introduction

The Leaf Area Index (LAI), defined as the total one-sided leaf area per unit ground surface area, serves as a fundamental biophysical parameter characterizing vegetation canopy structure and function [1,2]. As a pivotal indicator in terrestrial ecosystem studies, LAI plays a critical role in quantifying energy exchange, photosynthesis, and biogeochemical cycles [3,4,5]. Its applications extend to precision agriculture, where it aids in crop growth monitoring, yield prediction, and resource management [6,7]. Given the increasing pressures of climate change and anthropogenic disturbances, accurate LAI estimation is indispensable for understanding vegetation responses to environmental shifts and informing sustainable land-use policies [8]. Consequently, refining LAI inversion methodologies remains a critical research focus in fostering ecological preservation and climate change resilience.

Historically, LAI quantification relied on direct, albeit destructive, field measurements, wherein leaves were harvested and their area measured manually [8]. While this approach yields precise site-specific data, its scalability is severely limited by labor intensity, spatial representativeness constraints, and the inability to support continuous monitoring. The advent of remote sensing technology has revolutionized LAI estimation by enabling large-scale, non-destructive, and temporally consistent observations. Contemporary LAI inversion methods can be classified into two categories: empirical-statistical approaches and physically based radiative transfer models (RTMs), each with distinct advantages and limitations.

Empirical-statistical models establish regression relationships between ground-measured LAI and spectral vegetation indices (VIs) derived from satellite or aerial imagery [9,10]. These models typically involve preprocessing steps such as atmospheric correction, noise reduction, and spectral band normalization, followed by statistical calibration using techniques like multiple linear regression (MLR) or partial least squares regression (PLSR) [11]. Despite their computational efficiency and simplicity, empirical models are highly sensitive to the quality and representativeness of training data. Biases may arise when applied across heterogeneous landscapes or diverse phenological stages, limiting their generalizability.

In contrast, the physically based RTMs, rooted in the optical properties of vegetation and radiation transfer theory, estimate LAI by simulating the interplay between vegetation and light. This approach necessitates the selection of appropriate physical models (e.g., radiative transfer model, PROSPECT model) [12,13] to delineate the light-vegetation interaction process and determine the requisite model parameters (e.g., reflectivity, transmittance, structural parameters of the vegetation canopy). While physical models remain the gold standard for understanding radiation transfer mechanisms within vegetation canopies, achieving high retrieval accuracy typically requires complex iterative inversion algorithms. Consequently, as demonstrated by [14,15], balancing this mechanistic rigor with the computational efficiency required for global-scale, high-resolution mapping remains a primary challenge, justifying the exploration of alternative approaches.

Recent advancements in computational power and data availability have propelled machine learning (ML) techniques to the forefront of LAI inversion research [16,17,18]. Unlike traditional methods, ML algorithms (e.g., random forests, neural networks, and support vector machines) excel at capturing nonlinear relationships between multispectral data and LAI without explicit physical formulations. By leveraging high-dimensional feature spaces, including multi-temporal, multi-sensor, and ancillary environmental data, ML models can enhance inversion accuracy and robustness [19]. Notable studies demonstrate the superiority of ML over conventional approaches. For instance, Du et al. [20] reported that random forest regression outperformed other ML models in maize LAI estimation, attributing its success to inherent feature selection and ensemble learning mechanisms. Beyond agricultural systems, ML-based approaches have increasingly revealed their capacity to capture the nonlinear dynamics of LAI across natural ecosystems. For instance, in structurally complex forest ecosystems, ML algorithms have been instrumental in mitigating the saturation effects of optical sensors [21]. In grassland and shrubland biomes, nonparametric models have proven particularly effective in capturing the fine-scale heterogeneity of canopy structures and seasonal phenology [22]. A critical research gap persists in developing broad, regionally applicable ML frameworks capable of harmonizing diverse vegetation types and environmental conditions.

Global LAI products, such as MODIS LAI, integrate optical VIs (e.g., NDVI) with RTMs to generate spatially continuous datasets [23]. However, their accuracy is frequently compromised by atmospheric interference (e.g., clouds and aerosols), bidirectional reflectance effects, and the oversimplification of canopy structural heterogeneity [24]. RTM-driven products often employ static parameterizations, failing to adapt to regional variations in leaf biochemistry or canopy architecture [25]. Consequently, discrepancies arise in complex terrains or during rapid phenological transitions, undermining their utility in ecosystem research and global change monitoring [26,27].

Emerging ML techniques offer promising solutions to these limitations by enabling data-driven adaptation to environmental variability. However, systematic evaluations of ML-based LAI inversions at a global scale remain scarce [28], and their application in LAI inversion remains insufficiently validated [29]. Addressing this gap, our study utilizes MODIS multispectral remote sensing data and employs a combination of statistical and machine learning regression methods, including partial least squares regression (PLSR), artificial neural networks (ANN), and support vector regression (SVR), to analyze the relationship between remote sensing features and LAI. We aim to (1) evaluate model performance across the site coverage areas; (2) identify optimal methodologies balancing accuracy and generalizability; and (3) benchmark ML-derived LAI against established global products (e.g., MODIS LAI) to quantify improvements in precision and reliability. By integrating multi-source remote sensing data and advanced ML frameworks, this research seeks to advance the methodological understanding of large-scale LAI estimation, ultimately supporting enhanced ecosystem monitoring and climate resilience strategies.

2. Materials and Methods

2.1. Data Sources

2.1.1. In Situ LAI Observations

To ensure robust validation of remote sensing-derived LAI estimates, we compiled a comprehensive global dataset of in situ LAI measurements from five well-established research initiatives (Table 1). These datasets span diverse biomes and temporal scales, providing a critical foundation for model calibration and evaluation. The Bigfoot Project (1999–2003) [30] delivers high-quality LAI, Fraction of Absorbed Photosynthetically Active Radiation (FPAR), and nitrogen content measurements across North American biomes, including Arctic tundra, boreal forests, and temperate croplands. Field campaigns employed standardized protocols combining LAI-2000 canopy analyzers (LI-COR Biosciences, Lincoln, NE, USA) and destructive sampling. The VALERI Project focused on validating medium-resolution sensors (e.g., Landsat, SPOT) using a nested sampling design. Hemispherical photography and LAI-2000 measurements were collected within 3 km × 3 km sites, encompassing forests, grasslands, and agricultural systems across six continents. The Harvard Forest Project (2014–2018) [31], a flagship site of the National Ecological Observatory Network (NEON), provided long-term LAI dynamics in a temperate deciduous forest. Data integration from tower-based sensors and periodic ground surveys ensured temporal consistency. The GBOV Project (2013–2022) captured high-resolution LAI reference images over 5–7 years using digital hemispherical imagery and satellite data (Sentinel-2, Landsat-8) to refine LAI values [32]. The IMAGINE Project (2013–2016) collected ground data for validating Copernicus’ global land services, employing high-resolution satellite imagery to produce reference maps for LAI, Fraction of Absorbed Photosynthetically Active Radiation (FAPAR), and the Fraction of Vegetation Cover (FCOVER). In total, 172 LAI measurements from 98 sites, covering 11 vegetation types (e.g., deciduous broadleaf forest, shrubland, Arctic tundra; Figure 1), were analyzed.
Furthermore, to characterize the topographic envelope of our validation dataset, we extracted the elevation for all sites using a global Digital Elevation Model (DEM). The topographic analysis reveals that the in situ measurements are predominantly situated in low-relief areas. Over 80% of the validation sites are located at elevations below 500 m (ranging from 0 m to 482 m). Because the terrain is predominantly flat, the distribution of aspect is largely uniform and exerts minimal influence on the optical signals. The validation therefore primarily reflects model performance over flat to moderately undulating landscapes.

2.1.2. Surface Reflectance and Reanalysis Products

To address scale mismatch between satellite and ground observations, we extracted spectral predictors (Table S9) from the Moderate Resolution Imaging Spectroradiometer (MODIS) MOD09GA surface reflectance product (1 km spatial resolution, https://lpdaac.usgs.gov/products/mod09gav006/, accessed on 27 April 2026) [38] based on vegetation sensitivity. Our predictor selection followed a two-step criterion: (1) spectral indices must demonstrate strong correlation with field-measured LAI, and (2) bands should exhibit complementary sensitivity to vegetation characteristics. The Enhanced Vegetation Index (EVI) emerged as the optimal predictor through correlation analysis (Figure 2). Short-Wave Infrared (SWIR) bands contributed substantially to model predictive power, especially in high-biomass regions like tropical forests. Unlike visible bands, which saturate in dense canopies, SWIR reflectance is less affected by multiple scattering and provides critical information on canopy structural density. This sensitivity to leaf water content and dry matter, combined with spectral information distinct from visible absorption, enabled the algorithm to better discriminate vegetation types from background soil signals [39]. The feature importance analysis also indicated that the SWIR bands are of high importance (Figure S1), so we further incorporated SWIR bands (B6, B7). To characterize LAI from multiple perspectives, we also incorporated meteorological data as input parameters. The meteorological data come from the Global Land Data Assimilation System (GLDAS) reanalysis (https://disc.gsfc.nasa.gov/) and include parameters such as soil moisture, downward short-wave radiation, saturated vapor pressure deficit, vegetation canopy surface water content, and air temperature. Although canopy surface water content is physically linked to LAI within land surface models, the coarse-resolution GLDAS product functions here primarily as a regional moisture-stress factor.
Given its modest correlation with site-level LAI (r = 0.241; Table S1), it introduces complementary environmental constraints rather than causing target redundancy. All predictors were normalized to [−1, 1] to eliminate magnitude differences before being fed into the LAI inversion model. This curated feature set enabled robust LAI inversion modeling while maintaining physical interpretability. To align the point-based field measurements with the gridded MODIS and MERRA-2 products, we implemented a nearest-neighbor temporal matching scheme and a 3 km spatial buffering protocol. Specifically, ground observations were paired with the satellite pixel values from the nearest available timestamp. For spatial consistency, we averaged all valid pixels within a 3 km radius of each site, excluding any observations flagged for poor quality or cloud contamination.
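The rescaling of predictors to [−1, 1] can be sketched as below. This is a minimal illustration that assumes a linear min–max mapping; the text does not state which scaling formula was used, and the function name and sample values are hypothetical.

```python
def scale_to_symmetric_range(values):
    """Linearly rescale a list of numbers to [-1, 1] (min-max scaling, assumed)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant predictor carries no information; map it to 0.
        return [0.0 for _ in values]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


# Hypothetical EVI values for four sites (not taken from the study).
evi = [0.12, 0.35, 0.58, 0.81]
scaled_evi = scale_to_symmetric_range(evi)
```

In practice the minimum and maximum would normally be computed on the training sites only and then reused for the validation sites, so that no information leaks across the split.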

2.1.3. Benchmark LAI Products

To validate our results, we utilized two globally recognized LAI datasets: MODIS LAI (MOD15A2H V6.1) (https://lpdaac.usgs.gov/products/mod15a2hv006/, accessed on 27 April 2026) and MERRA-2 LAI (https://gmao.gsfc.nasa.gov/reanalysis/MERRA-2/, accessed on 27 April 2026). The MODIS LAI product (MOD15A2H V6.1) provides an 8-day composite at 500 m resolution, integrating Leaf Area Index (LAI) and fractions of photosynthetically active radiation (FPAR) retrievals [40,41]. This dataset employs NASA’s Radiative Transfer and Scaling Model (RTSM), which derives LAI by modeling canopy radiative transfer processes using multi-band reflectance data, particularly from visible and near-infrared spectra [24]. To minimize atmospheric contamination, the product aggregates Terra satellite observations over 8-day windows, retaining only the clearest pixels for inversion. The MERRA-2 LAI dataset is provided as part of the land surface boundary conditions within the Goddard Earth Observing System (GEOS-5.12.4) model. Unlike direct satellite retrievals, MERRA-2 does not perform an independent inversion of LAI from top-of-atmosphere radiance. Instead, it prescribes LAI using a monthly climatology derived from the MODIS LAI product. This climatological input is integrated into the Catchment Land Surface Model to regulate the partitioning of energy and moisture fluxes at the land-atmosphere interface [42]. Consequently, MERRA-2 LAI represents a consistent, globally forced representation of vegetation dynamics that is inherently tied to MODIS-based observations, serving as a standard reference for global-scale land surface modeling rather than an independent observational dataset.

2.2. The Implementation of Machine Learning (ML) Methods

To evaluate the performance of machine learning (ML) algorithms in Leaf Area Index (LAI) inversion, we compiled 13 mainstream ML methods, including artificial neural network (ANN), Gaussian process regression (GPR), random forest (RF), and others.

The modeling process consisted of two key stages: parameter configuration and model validation. To ensure a consistent baseline for comparison, we primarily adopted standard hyperparameters recommended in the literature and software implementations, applying only limited manual adjustments where necessary; specific parameters are shown in Table S2. While systematic optimization strategies such as nested cross-validation or Bayesian optimization could further refine local model boundaries, our uniform approach ensures that the observed performance differences reflect the structural characteristics of the algorithms themselves [43]. To evaluate the model’s performance and spatial transferability, we employed a hierarchical validation framework. We distinguish between internal validation, which assesses predictive skill within the training distribution, and external validation, which utilizes independent data to test true generalizability. Crucially, the data partitioning was executed at the site level rather than at the individual measurement level. In our primary evaluation, the available geographic sites were randomly partitioned into a 75% training set and a 25% independent validation set [44,45]. We used stratified sampling during this split to guarantee that all major vegetation types were adequately represented in both the training and validation subsets. This site-level random split effectively assesses the model’s accuracy in tracking LAI shifts within the known environmental and spectral spaces of the network. To supplement this approach and account for potential spatial autocorrelation bias in clustered regions, we further implemented a strict Leave-One-Site-Out Cross-Validation (LOSO-CV) scheme. In this protocol, entire geographic sites were systematically excluded from the training phase and used exclusively for testing. 
To balance technical rigor with readability, we grouped the 13 algorithms into 6 categories based on their shared mathematical frameworks. A comparative summary of these groups is provided in Table 2, while the full specifications for each individual model are detailed in Table S3 of the Supplementary Material.
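The two validation schemes described above (a stratified 75%/25% split at the site level, and Leave-One-Site-Out cross-validation) can be sketched as follows. This is an illustrative sketch rather than the authors' implementation; the site identifiers, vegetation labels, and record layout are hypothetical.

```python
import random
from collections import defaultdict


def site_level_split(site_to_type, train_frac=0.75, seed=0):
    """Split site IDs into train/validation sets, stratified by vegetation type."""
    rng = random.Random(seed)
    by_type = defaultdict(list)
    for site, veg_type in sorted(site_to_type.items()):
        by_type[veg_type].append(site)
    train, valid = set(), set()
    for sites in by_type.values():
        rng.shuffle(sites)
        # Keep at least one site per vegetation type in the training set.
        n_train = max(1, round(train_frac * len(sites)))
        train.update(sites[:n_train])
        valid.update(sites[n_train:])
    return train, valid


def leave_one_site_out(records):
    """Yield (held_out_site, train, test); each record is a (site, x, y) tuple."""
    for held_out in sorted({site for site, _, _ in records}):
        train = [r for r in records if r[0] != held_out]
        test = [r for r in records if r[0] == held_out]
        yield held_out, train, test


# Hypothetical toy network: 8 sites, 2 vegetation types, 2 measurements per site.
site_to_type = {f"S{i}": ("forest" if i % 2 else "grassland") for i in range(8)}
train_sites, valid_sites = site_level_split(site_to_type)
records = [(f"S{i % 8}", float(i), 0.1 * i) for i in range(16)]
folds = list(leave_one_site_out(records))
```

Because all measurements from a site travel together, repeated visits to the same location never appear on both sides of a split.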

2.2.1. Partial Least Squares Regression (PLSR)

Partial least squares regression (PLSR) is a multivariate statistical method designed to construct predictive models when predictor variables are highly correlated (i.e., multicollinear), when the number of predictors exceeds the number of observations, or when noisy variables are present [46,47].

In PLSR, X is assumed to be an n × p matrix, where n is the number of site LAI samples and p is the number of predictors (vegetation indices, meteorological parameters), and Y is an n × q matrix of response variables, where q is the number of response variables (LAI). The main goal of PLSR is to find latent variables (called components) T and U such that [46]:

X = T P^{T} + E

(1)

Y = U Q^{T} + F

(2)

where T and U are latent variable matrices, P and Q are loading matrices, and E and F are the residual matrices.

PLSR determines these latent variables by maximizing the covariance between T and U. Specifically, PLSR finds the latent variable that maximizes the trace of the covariance matrix described below:

\mathrm{cov} (T, U) = \mathrm{cov} (X W, Y V)

(3)

where W and V are the corresponding weight matrices.
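For a single response variable, the first weight vector that maximizes the covariance in Equation (3) is proportional to X^{T} y once X and y are centered. The sketch below illustrates only this first component, under that single-response assumption; the helper names and toy data are hypothetical, and a full PLSR would continue with deflation and further components.

```python
import math


def center(column):
    """Subtract the mean from a list of numbers."""
    mean = sum(column) / len(column)
    return [v - mean for v in column]


def centered_columns(X):
    """Return the columns of X, each centered to zero mean."""
    return [center([row[j] for row in X]) for j in range(len(X[0]))]


def first_pls_weight(X, y):
    """Unit weight vector w maximizing cov(Xw, y) for centered X and y."""
    cols, yc = centered_columns(X), center(y)
    w = [sum(c * t for c, t in zip(col, yc)) for col in cols]  # X^T y
    norm = math.sqrt(sum(v * v for v in w))
    return [v / norm for v in w]


def score_covariance(X, y, w):
    """Sample covariance between the score t = Xw and the response y."""
    cols, yc = centered_columns(X), center(y)
    n = len(y)
    t = [sum(cols[j][i] * w[j] for j in range(len(w))) for i in range(n)]
    return sum(ti * yi for ti, yi in zip(t, yc)) / (n - 1)


# Hypothetical predictors (two columns) and LAI values.
X = [[0.2, 0.5], [0.4, 0.4], [0.6, 0.7], [0.8, 0.6]]
y = [1.0, 2.1, 2.9, 4.2]
w = first_pls_weight(X, y)
```

Any other unit-length direction, such as using a single predictor alone, yields a score whose covariance with y is no larger than the one obtained with w.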

2.2.2. Support Vector Regression (SVR)

Support vector regression (SVR) is a regression method based on the principle of the support vector machine (SVM) [48,49]. SVR aims to find a function that is as close as possible to all training data points within a given tolerance ε, while maintaining the smoothness of the model and avoiding overfitting [50].

SVR handles nonlinear problems through kernel techniques, mapping data to a high-dimensional space to find linear decision boundaries within that space [51]. Commonly used kernel functions include linear, polynomial, and radial basis function (RBF) kernels. The goal of SVR is to minimize the following objective function:

\min_{w, b} \frac{1}{2} {‖w‖}^{2} + C \sum_{i = 1}^{n} L_{ε} (y_{i}, f (x_{i}))

(4)

where L_{ε} (y_{i}, f (x_{i})) is an ε-insensitive loss function, C is a regularization parameter, and f (x_{i}) is the prediction function.
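To make Equation (4) concrete, the sketch below evaluates the objective for a linear prediction function f(x) = w·x + b. The linear form, the function names, and the toy numbers are assumptions for illustration; a kernel SVR replaces the dot product with a kernel expansion.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the epsilon tube, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def svr_objective(w, b, X, y, C, epsilon):
    """Evaluate Eq. (4) for a linear model f(x) = w.x + b (assumed form)."""
    regularizer = 0.5 * sum(wj * wj for wj in w)
    total_loss = 0.0
    for xi, yi in zip(X, y):
        prediction = sum(wj * xj for wj, xj in zip(w, xi)) + b
        total_loss += epsilon_insensitive_loss(yi, prediction, epsilon)
    return regularizer + C * total_loss


# Hypothetical one-feature example: only the third sample leaves the tube.
X = [[1.0], [2.0], [3.0]]
y = [1.0, 2.0, 3.5]
objective = svr_objective(w=[1.0], b=0.0, X=X, y=y, C=10.0, epsilon=0.1)
```

Here the regularizer contributes 0.5 and the single residual of 0.5 exceeds ε by 0.4, so the objective is 0.5 + 10 × 0.4 = 4.5.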

2.2.3. Gaussian Process Regression (GPR)

Gaussian process regression (GPR) is a nonparametric Bayesian regression technique that predicts continuous outputs together with an uncertainty estimate, which makes it well suited to small datasets. The core idea of GPR is to treat the outputs at all points in the dataset as random variables that jointly follow a multivariate Gaussian distribution [52], with the mean and covariance structure defined by the Gaussian process. For an input x (reflectance, meteorological parameters, etc.), the output y (LAI) is assumed to follow a Gaussian process:

y (x) \sim \mathcal{GP} (m (x), k (x, x^{'}))

(5)

where m (x) is the mean function and k (x, x^{'}) is a kernel (covariance) function. A common choice is to set the mean function to zero, m (x) = 0, and the kernel function k (x, x^{'}) determines the similarity between the data points.
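The way a zero-mean GP turns training data into a prediction with an uncertainty estimate can be sketched as follows. This is a minimal illustration, assuming an RBF kernel with unit length scale and a small noise term; neither the kernel nor these settings are taken from the study, and all names are hypothetical.

```python
import math


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel (an assumed choice of k)."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq / (2.0 * length_scale ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def gp_predict(X_train, y_train, x_new, noise=1e-6):
    """Posterior mean and variance at x_new for a zero-mean GP."""
    n = len(X_train)
    K = [[rbf_kernel(X_train[i], X_train[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [rbf_kernel(x, x_new) for x in X_train]
    mean = sum(k * a for k, a in zip(k_star, solve(K, y_train)))
    var = rbf_kernel(x_new, x_new) - sum(
        k * v for k, v in zip(k_star, solve(K, k_star)))
    return mean, max(var, 0.0)


# Hypothetical one-dimensional training data.
X_train = [[0.0], [1.0], [2.0]]
y_train = [0.5, 2.0, 3.0]
mean_near, var_near = gp_predict(X_train, y_train, [1.0])
mean_far, var_far = gp_predict(X_train, y_train, [10.0])
```

At a training location the posterior mean reproduces the observation and the variance collapses towards zero, whereas far from the data the prediction reverts to the zero mean with variance close to the prior value of 1.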

2.2.4. Neural Networks

In this study, three neural network models were used: Artificial Neural Network (ANN), Radial Basis Function Network (RBFN), and Generalized Regression Neural Network (GRNN); model performance was optimized by five-fold cross-validation. An ANN is loosely modeled on the structure of biological neural networks; it consists of an input layer, a hidden layer, and an output layer, and represents complex nonlinear relationships by adjusting the weights between nodes [53]. We used a three-layer backpropagation neural network and determined the optimal parameters through cross-validation.

RBFN is a feedforward neural network that uses a radial basis function, usually a Gaussian, as the activation function [54]; it performs well in classification and regression tasks because of its fast training speed and strong approximation ability. The output of RBFN can be expressed as [55]:

y (x) = \sum_{i = 1}^{N} w_{i} \cdot ϕ (‖x - c_{i}‖)

(6)

Specifically, x is the input vector (reflectance, meteorological parameters, etc.), c_{i} is the center of the ith radial basis function, w_{i} is the weight, and ϕ (‖x - c_{i}‖) is a radial basis function, whose commonly used Gaussian form is:

ϕ (‖x - c_{i}‖) = \exp (- \frac{{‖x - c_{i}‖}^{2}}{2 σ_{i}^{2}})

(7)
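Equations (6) and (7) translate directly into a few lines of code. The sketch below only evaluates the forward pass for given centers, widths, and weights; how those parameters are fitted is not shown, and the numbers are hypothetical.

```python
import math


def gaussian_rbf(x, center, sigma):
    """Gaussian radial basis function, Eq. (7)."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def rbfn_output(x, centers, sigmas, weights):
    """Weighted sum of basis-function activations, Eq. (6)."""
    return sum(w * gaussian_rbf(x, c, s)
               for w, c, s in zip(weights, centers, sigmas))


# Hypothetical network with two hidden units in a two-feature space.
centers = [[0.0, 0.0], [1.0, 1.0]]
sigmas = [0.5, 0.5]
weights = [2.0, 3.0]
output = rbfn_output([0.0, 0.0], centers, sigmas, weights)
```

The input coincides with the first center, so that unit contributes its full weight of 2, while the second unit is strongly attenuated by its distance from the input.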

GRNN builds on RBFN and is suited to regression and prediction tasks. It trains quickly and has a simple structure, which makes it especially suitable for regression analysis of small samples and high-dimensional data. Given the input x and the training dataset {(x_{i}, y_{i})}, the output prediction \hat{y} of GRNN is:

\hat{y} (x) = \frac{\sum_{i = 1}^{N} y_{i} \cdot \exp (- \frac{{‖x - x_{i}‖}^{2}}{2 σ^{2}})}{\sum_{i = 1}^{N} \exp (- \frac{{‖x - x_{i}‖}^{2}}{2 σ^{2}})}

(8)

In this equation, y_{i} is the target value of the ith training sample, and σ is a smoothing parameter.
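Equation (8) is a kernel-weighted average of the training targets, which the following sketch implements directly. The function name and toy data are hypothetical.

```python
import math


def grnn_predict(x, X_train, y_train, sigma):
    """Kernel-weighted average of training targets, Eq. (8)."""
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi)) / (2.0 * sigma ** 2))
        for xi in X_train
    ]
    # Note: for inputs very far from all training points the weights can
    # underflow to zero; a production version would guard against that.
    return sum(w * yi for w, yi in zip(weights, y_train)) / sum(weights)


# Hypothetical one-feature training set.
X_train = [[0.0], [1.0]]
y_train = [1.0, 3.0]
midpoint_prediction = grnn_predict([0.5], X_train, y_train, sigma=0.5)
sharp_prediction = grnn_predict([0.0], X_train, y_train, sigma=0.1)
```

A query halfway between the two samples returns their mean, while a small σ makes the prediction snap to the nearest training target, which shows how σ controls the smoothness of the fit.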

2.2.5. Decision Tree

In this study, three tree-based regression techniques were used: Random Forest Regression (RFR), Gradient Boosting Tree Regression (GBTR), and Classification and Regression Tree (CART). RFR and GBTR are ensemble methods that improve prediction accuracy and stability by combining multiple decision trees, whereas CART fits a single tree. RFR trains each decision tree on a bootstrap sample with random feature subsets, which reduces overfitting and enhances generalization ability [56]. Its output is the average of the predicted values of the individual trees:

\hat{y} = \frac{1}{M} \sum_{m = 1}^{M} T_{m} (x)

(9)

Specifically, M is the number of trees, and T_{m} (x) is the prediction of the m-th tree on input x.

When building each tree, RFR randomly selects a subset of features for splitting, preventing all trees from over-relying on the same feature, thus increasing the robustness of the model.

GBTR progressively improves the accuracy of the model by iterative training, with each new tree correcting the residuals of the previous tree [57,58]. The GBTR model can be represented as an additive model of a series of trees:

\hat{y} = \sum_{m = 1}^{M} η \cdot T_{m} (x)

(10)

In this equation, η is the learning rate, which controls the contribution of each tree, and T_{m} (x) is the prediction of the m-th tree on the input x.

Each tree is built based on the residuals of the previous tree:

r_{i}^{(m)} = y_{i} - {\hat{y}}_{i}^{(m - 1)}

(11)

Specifically, r_{i}^{(m)} is the residual of the ith sample in round m, y_{i} is the actual value, and {\hat{y}}_{i}^{(m - 1)} is the sum of the predicted values of the previous m - 1 trees.
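The residual-fitting loop of Equations (10) and (11) can be illustrated with one-split trees (stumps) on a single feature. This is a simplified sketch, not the configuration used in the study: it assumes squared-error loss, starts from the mean of y as an initial constant prediction (a common convention that Equation (10) does not show), and uses hypothetical data.

```python
def fit_stump(x, residuals):
    """Fit a one-split regression tree on a single feature.

    Assumes x contains at least two distinct values.
    """
    best = None
    for s in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= s]
        right = [r for xi, r in zip(x, residuals) if xi > s]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, s, lm, rm)
    _, s, lm, rm = best
    return lambda v: lm if v <= s else rm


def fit_gbtr(x, y, n_trees=50, learning_rate=0.1):
    """Additive model of stumps, each fitted to the current residuals."""
    base = sum(y) / len(y)  # initial constant prediction (assumption)
    preds = [base] * len(y)
    trees = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, preds)]  # Eq. (11)
        tree = fit_stump(x, residuals)
        trees.append(tree)
        preds = [pi + learning_rate * tree(xi) for pi, xi in zip(preds, x)]
    return lambda v: base + sum(learning_rate * t(v) for t in trees)  # Eq. (10)


def training_mse(model, x, y):
    return sum((model(xi) - yi) ** 2 for xi, yi in zip(x, y)) / len(y)


# Hypothetical single-predictor data (e.g., a vegetation index against LAI).
x = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
y = [0.5, 0.8, 1.5, 2.2, 3.0, 3.4, 4.1, 4.5]
mse_baseline = training_mse(lambda v: sum(y) / len(y), x, y)
mse_1 = training_mse(fit_gbtr(x, y, n_trees=1), x, y)
mse_50 = training_mse(fit_gbtr(x, y, n_trees=50), x, y)
```

Each added tree lowers the training error a little, and the learning rate η sets how large that step is; this is why boosting needs a stopping rule or validation set to avoid fitting noise.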

CART generates a decision tree by recursively segmenting data to minimize the error of each segmented dataset [59]. For regression problems, the goal of CART is to minimize the following squared error by selecting the optimal features and segmentation points:

\min_{j, s} \left[ \sum_{i \in R_{1} (j, s)} {(y_{i} - {\bar{y}}_{R_{1}})}^{2} + \sum_{i \in R_{2} (j, s)} {(y_{i} - {\bar{y}}_{R_{2}})}^{2} \right]

(12)

where R_{1} (j, s) and R_{2} (j, s) are the two regions divided according to the feature j and the segmentation point s, and {\bar{y}}_{R_{1}} and {\bar{y}}_{R_{2}} are the sample averages within regions R_{1} and R_{2}, respectively.
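The split search in Equation (12) can be written as an exhaustive scan over features and candidate thresholds. The sketch below finds a single split only (a full CART applies it recursively); the function names and data are hypothetical.

```python
def sum_squared_error(values):
    """Sum of squared deviations from the mean of a region."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(X, y):
    """Return (feature j, threshold s, total SSE) minimizing Eq. (12)."""
    best = None
    for j in range(len(X[0])):
        # Every observed value except the largest is a candidate threshold.
        for s in sorted({row[j] for row in X})[:-1]:
            left = [yi for row, yi in zip(X, y) if row[j] <= s]
            right = [yi for row, yi in zip(X, y) if row[j] > s]
            total = sum_squared_error(left) + sum_squared_error(right)
            if best is None or total < best[2]:
                best = (j, s, total)
    return best


# Hypothetical data: feature 0 separates low and high LAI, feature 1 is noise.
X = [[0.1, 5.0], [0.2, 1.0], [0.7, 4.0], [0.8, 2.0]]
y = [1.0, 1.2, 3.9, 4.1]
feature, threshold, sse = best_split(X, y)
```

The search selects feature 0 with a threshold of 0.2, which leaves two nearly homogeneous regions.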

2.2.6. AdaBoost

AdaBoost is a boosting method designed to improve the performance of machine learning algorithms, especially in classification tasks. Proposed by Yoav Freund and Robert Schapire in 1995 [60], AdaBoost is based on an additive model fitted by a forward stagewise algorithm, constructing a strong classifier by combining multiple weak classifiers, such as decision stumps. This improves the accuracy and robustness of the model.

At its core, AdaBoost is the step-by-step training of a series of weak classifiers, each of which improves on the previous round. Specifically, AdaBoost adjusts the sample weight distribution so that misclassified samples receive more attention in the next round, improving overall performance.

AdaBoost can be combined with random forests, using random forests as weak learners to enhance the robustness of the model to noisy data and outliers, while maintaining the advantages of random forests. It can also be used to improve the performance of neural networks by training multiple neural networks as weak learners, reducing overfitting and enhancing generalization capabilities. Combined with GPR or SVM as a weak learner, prediction accuracy can also be improved, and uncertainty estimation can be provided.

Ensemble learning methods can help reduce model variance and improve stability, but they also increase model complexity and computational cost, so there is a trade-off between performance and efficiency in practical applications.

2.2.7. Performance Evaluation

In this study, a multi-dimensional synthesis was developed to evaluate the 13 machine learning methods based on both their inversion accuracy (e.g., R², RMSE, MAE) and their sensitivity to training sample sizes. Since these six metrics encompass different units and numerical scales, they were uniformly projected onto a comparable [0, 1] range using min–max normalization to facilitate radar chart visualization.

Specifically, for the accuracy metric R² (M1), where higher values indicate superior performance, we applied Equation (13):

x_{1, \mathrm{norm}} = \frac{x_{1} - x_{1, \min}}{x_{1, \max} - x_{1, \min}}

(13)

For the error metrics (RMSE, MAE; M2–M3) and the sensitivity of M1, M2, and M3 to the training sample size (M4–M6), where lower values indicate better performance or higher model stability, we applied Equation (14) to invert the scale:

x_{i, \mathrm{norm}} = \frac{x_{i, \max} - x_{i}}{x_{i, \max} - x_{i, \min}}

(14)

where x represents the original metric value for a given model, while x_{\max} and x_{\min} denote the maximum and minimum values observed across all 13 methods. Following this transformation, points near the outer boundary (1) consistently indicate superior accuracy, lower errors, and higher model stability. An optimal algorithm is geometrically characterized by a polygon that expands outward across all six axes, ensuring that a larger total area on the radar chart reflects overall superior model performance.
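The two normalization rules in Equations (13) and (14) can be sketched as a pair of helper functions. The metric values below are hypothetical placeholders, not results from this study.

```python
def normalize_higher_is_better(values):
    """Eq. (13): map the best (largest) value to 1 and the worst to 0."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def normalize_lower_is_better(values):
    """Eq. (14): invert the scale so the smallest value maps to 1."""
    lo, hi = min(values), max(values)
    return [(hi - v) / (hi - lo) for v in values]


# Hypothetical metric values for three models (illustration only).
r2_values = [0.40, 0.55, 0.65]
rmse_values = [1.30, 1.10, 0.90]
r2_axis = normalize_higher_is_better(r2_values)
rmse_axis = normalize_lower_is_better(rmse_values)
```

After the transformation the third model sits at 1 on both axes, so "further out" means "better" regardless of whether the underlying metric rewards high or low values.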

2.2.8. Model Evaluation

To assess the predictive accuracy of various models, we employed three standard statistical metrics: the coefficient of determination (R²), root mean square error (RMSE) and mean absolute error (MAE). R² represents the proportion of variance explained by the model, while RMSE and MAE quantify the magnitude of prediction errors, with RMSE being more sensitive to outliers. These metrics are defined as follows:

R^{2} = 1 - \frac{\sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}}{\sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}}

(15)

\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}}

(16)

\mathrm{MAE} = \frac{1}{n} \sum_{i = 1}^{n} |y_{i} - {\hat{y}}_{i}|

(17)

where y_{i} and {\hat{y}}_{i} are the observed and predicted values, respectively, \bar{y} is the mean of the observed values, and n is the total number of samples.
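Equations (15) to (17) can be computed with a few lines of code. The sketch below uses hypothetical observed and predicted LAI values purely to show the arithmetic.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination, Eq. (15)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error, Eq. (16)."""
    n = len(y_true)
    return math.sqrt(sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error, Eq. (17)."""
    n = len(y_true)
    return sum(abs(yt - yp) for yt, yp in zip(y_true, y_pred)) / n


# Hypothetical observed and predicted LAI values.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
scores = (r_squared(observed, predicted),
          rmse(observed, predicted),
          mae(observed, predicted))
```

For these values the residual sum of squares is 0.5 against a total of 5.0, giving R² = 0.9, RMSE = √0.125 ≈ 0.354, and MAE = 0.25; RMSE exceeds MAE because squaring weights the two larger errors more heavily.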

3. Results

3.1. Training Sample Size Effects on Machine Learning Accuracy in LAI Estimation

In this study, we evaluated thirteen machine learning models using three key metrics (R², RMSE, and MAE) across eight training datasets of incrementally increasing size (Table 3). To isolate the effect of training sample size, we held the test set constant (n = 30) while expanding the training set from 24 to 129 samples in 15-sample increments. Each configuration was repeated five times to ensure robust statistical assessment of model sensitivity to data availability.
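The sample-size protocol can be sketched as follows: eight training sizes from 24 to 129 in steps of 15, each drawn five times. How the five repetitions were generated is not stated in the text; random subsampling from a fixed pool is an assumption here, and the index pool is a placeholder for the actual training candidates.

```python
import random


def training_size_schedule(start=24, stop=129, step=15):
    """Training-set sizes from start to stop (inclusive) in fixed increments."""
    return list(range(start, stop + 1, step))


def repeated_subsamples(pool, size, repeats=5, seed=0):
    """Draw `repeats` random subsets of `size` items without replacement."""
    rng = random.Random(seed)
    return [rng.sample(pool, size) for _ in range(repeats)]


sizes = training_size_schedule()
pool = list(range(129))  # placeholder indices for training-candidate samples
draws = {n: repeated_subsamples(pool, n) for n in sizes}
```

Each model would then be fitted on every draw and scored on the same fixed test set, so that differences between sizes reflect data availability rather than changes in the evaluation data.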

We observed three key patterns (Figure 3): First, while all models showed substantial performance gains when initial sample sizes increased (average ΔR² = +0.18, ΔRMSE = −0.34 from minimum to maximum size), these improvements followed strongly diminishing returns beyond ~69 samples, with GBTR exhibiting particularly stable errors (<8% RMSE variation past this threshold). Second, model robustness varied markedly, with GPR and GPR-Adaboost demonstrating exceptional stability across sample sizes (ΔR² < 0.05 for n > 54), while other models displayed greater sensitivity to data volume. Third, the Gaussian Process Regressor emerged as the most reliable algorithm, achieving peak performance (R² = 0.607 ± 0.012, RMSE = 1.065 ± 0.038) with minimal metric fluctuations despite varying training sizes. To further demonstrate the robustness of the GPR model, we visualized its predictive uncertainty in Figure S2. The 95% confidence intervals successfully capture the majority of the true LAI observations, highlighting GPR’s unique capability to output confidence bounds for every estimate. These results demonstrate that while increased sample sizes initially enhance model accuracy, the marginal utility of additional training data decreases once a critical threshold (approximately 2–3 times the feature space dimensionality) is reached.

The learning curve analysis reveals a clear performance plateau once the training set exceeds approximately 69 samples. This trend suggests a state of “diminishing returns”, which likely reflects the exhaustion of available variance within the dataset rather than the models reaching their inherent computational capacity. Given the scarcity of high-quality ground measurements, the dominant spectral-LAI signals appear to be captured in the early stages of training. Consequently, further data inclusion yielded marginal gains, particularly for complex architectures like ANNs that typically require higher information density to excel. This indicates that the observed performance ceiling is primarily a function of information saturation within the available ground observation network.

3.2. Internal and External Validation of LAI Estimation Methods Based on Field Observations

“Internal validation” refers to the model’s performance evaluation within the training data. The internal validation results (Figure 4) reveal significant disparities in model performance, with R² values ranging from 0.365 to 0.868, highlighting the critical impact of algorithm selection on LAI estimation accuracy. The RF-Adaboost ensemble emerges as the top-performing model (R² = 0.868, RMSE = 0.580, MAE = 0.437), demonstrating its robust capacity to capture nonlinear relationships in spectral-vegetation dynamics. Closely following is the GBTR model (R² = 0.837, RMSE = 0.643, MAE = 0.502), which leverages gradient boosting to achieve comparable precision. These ensemble methods outperform simpler algorithms by 18–42% in R² while reducing prediction errors by 30–55%, underscoring the advantage of integrated learning frameworks in handling complex ecological datasets.

Interestingly, SVR-based models (SVR: R² = 0.707; SVM-Adaboost: R² = 0.711) maintain competitive accuracy despite their linear foundations, suggesting that kernel-space transformations effectively map spectral features to LAI variations. In contrast, the GRNN model exhibits notably poor performance (R² = 0.365, RMSE = 1.308, MAE = 1.085), likely due to architectural limitations in modeling high-dimensional vegetation reflectance patterns. Similarly, CART’s rigid decision-boundary constraints lead to suboptimal partitioning (R² = 0.476), reinforcing the need for more flexible algorithms in remote sensing applications.

A systematic bias is observed across all models: they consistently underestimate high LAI values (>2.2) while moderately overestimating low values (<1.5). The deviation from the 1:1 line is characterized by regression slopes ranging from 0.24 to 0.73 and consistently positive intercepts (0.65–1.8). This behavior is largely driven by the non-uniform distribution of the training data; the high density of samples in the 0–2 LAI range appears to pull predictions toward the mean, leading to the underestimation observed in high-biomass regions where data is comparatively sparse (Table S4). This “compression effect” suggests that current architectures struggle with extreme LAI ranges, possibly due to insufficient training samples at vegetation density extremes or unaccounted atmospheric interference.
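The link between a fitted slope below 1, a positive intercept, and the compression effect can be illustrated with a small ordinary least squares sketch. The toy predictions below are constructed by shrinking each observation halfway toward the sample mean. This is an assumption chosen to mimic the described behaviour, not data from the study.

```python
def fit_line(observed, predicted):
    """Ordinary least squares fit of predicted = slope * observed + intercept."""
    n = len(observed)
    mean_x = sum(observed) / n
    mean_y = sum(predicted) / n
    sxx = sum((x - mean_x) ** 2 for x in observed)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(observed, predicted))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Toy observations spanning low to high LAI (illustration only).
observed = [0.5, 1.0, 2.0, 3.0, 4.0, 5.0]
mean_obs = sum(observed) / len(observed)
# Hypothetical model that pulls every estimate halfway toward the mean.
predicted = [mean_obs + 0.5 * (x - mean_obs) for x in observed]
slope, intercept = fit_line(observed, predicted)
```

In this toy case the slope is 0.5 and the intercept is positive, so low values are overestimated and high values are underestimated, the same qualitative pattern as the reported slopes of 0.24 to 0.73 with positive intercepts.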

“External validation” refers to the evaluation using independent test data from different sites or conditions to assess generalizability. External validation (Figure 5) shows that GBTR achieved the highest generalizability (R² = 0.647, RMSE = 0.899, MAE = 0.725), accompanied by an expected performance decline (ΔR² = −0.19 vs. internal validation). The RF-Adaboost ensemble shows robust but slightly lower metrics (R² = 0.633, ΔR² = −0.23), while the base RF model demonstrates notable resilience (R² = 0.646, ΔR² = −0.12). To determine whether these differences were meaningful, we conducted Fisher’s r-to-z transformation analysis. The test revealed no statistically significant difference in predictive performance between the two top-performing models (GBTR and RF, p > 0.05). Therefore, GBTR and RF can be considered equally viable from a statistical perspective for regional LAI estimation. To isolate the contribution of environmental covariates and test for signal redundancy, we evaluated GBTR restricted to spectral predictors. The optical-only GBTR configuration achieved an R² of 0.29 and an RMSE of 1.34 (Figure S3), marking a severe performance deficit compared to the full model (R² = 0.65, RMSE = 0.90). This sharp contrast confirms that hydrometeorological variables from GLDAS reanalysis provide complementary environmental constraints missed by spectral reflectance alone, rather than introducing redundant target leakage. In contrast to the top-tier models, GRNN and CART replicate their poor internal performance (R² = 0.484 and 0.456, respectively), with error inflation exceeding 25% in external tests, rendering them unreliable for operational deployment.
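The Fisher's r-to-z comparison mentioned above can be sketched in its textbook form for two independent correlations. The text does not state which variant was used, and a test for dependent correlations may be more appropriate when both models are scored on the same samples. The sample size of 30 below is a placeholder assumption, and the correlations are taken as the square roots of the reported R² values.

```python
import math

def fisher_r_to_z_test(r1, n1, r2, n2):
    """Two-sided test for a difference between two independent correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)          # Fisher transformation
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # standard error of z1 - z2
    z = (z1 - z2) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Assumed inputs: r = sqrt(R²) for GBTR (0.647) and RF (0.646); n = 30 is hypothetical.
r_gbtr, r_rf = math.sqrt(0.647), math.sqrt(0.646)
z_stat, p_value = fisher_r_to_z_test(r_gbtr, 30, r_rf, 30)
```

With correlations this close, the test statistic is near zero for any realistic sample size, which is consistent with the reported lack of a significant difference between GBTR and RF.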

Regression analysis reveals that all models produce fitted slopes less than 1.0 during validation, indicating either incomplete feature representation in training data (e.g., missing key spectral bands or ancillary variables) or unmodeled interactions (e.g., canopy clumping effects or atmospheric scattering).

The compression effect persists but weakens for mid-range LAI values (e.g., 2.0–4.0 for GBTR, Table S4), implying that targeted sample augmentation for high and low LAI extremes could mitigate bias. The GBTR model’s high stability, characterized by low dispersion in predictions and minimal error inflation (ΔRMSE = +0.256), renders it a highly practical choice for large-scale LAI mapping.

3.3. Comparison Between Locally Tuned and Globally Generalized Approaches

Machine learning models demonstrated substantial variations in estimates when benchmarked against two operational satellite-derived LAI products, namely MODIS and MERRA-2. To minimize the impact of spatiotemporal differences on the comparison, we selected the product pixels that closely corresponded to the exact measurement timing at each observation site [26].

The validation results revealed inherent limitations within the operational products when evaluated against ground plots. MODIS LAI showed a moderate correlation with field measurements (R² = 0.254, RMSE = 1.426), while MERRA-2 exhibited a slightly lower agreement (R² = 0.193, RMSE = 1.483). These performance characteristics reflect the well-documented challenges in operational retrieval systems, including the need to balance global coverage with local accuracy [27], and the difficulties in accounting for site-specific vegetation structure and atmospheric conditions [61,62].

In contrast, machine learning models demonstrated marked improvements in estimation accuracy. The GBTR (R² = 0.742, RMSE = 0.785) outperformed MODIS, providing an R² increase of 0.489. Both the RF and SVR models yielded similar performance enhancements, reducing baseline estimation errors by 32% to 42%. Application of Fisher’s r-to-z transformation confirmed that the higher correlation coefficient of the GBTR model relative to the MODIS product was statistically significant (p < 0.05).

To evaluate performance without sampling bias, we further performed independent validation utilizing the 25% holdout dataset. In this independent test (Figure S4), our LAI retrievals maintained a notably higher R² value (GBTR: R² = 0.65) relative to MODIS (R² = 0.10) and MERRA-2 (R² = 0.36), confirming the spatial robustness of the regionalized approach on unseen data. This result underscores that the higher R² represents an improvement in capturing LAI variability rather than a marginal fluctuation in error metrics. Additionally, an uncertainty analysis across distinct LAI intervals (Figure S5) indicates that while MODIS and MERRA-2 exhibit increasing variance at high LAI values, our ML model maintains a more stable error profile. This consistent error distribution confirms that the observed improvements are driven by an enhancement in predictive stability across the entire data range rather than localized fitting.

The consistent performance advantage of machine learning methods highlights their potential to complement existing global products, particularly for regional applications requiring high precision. It is important to note, however, that satellite-derived grid products remain indispensable for global-scale monitoring and multi-decadal trend analysis, where their standardized retrieval algorithms and continuous temporal sampling provide unique structural advantages.

This performance gap largely reflects the logistical advantage of site-specific training. Because MODIS and MERRA-2 are optimized for global homogeneity, these results highlight the potential for significant precision gains when global products are augmented by data-driven frameworks calibrated to local vegetation characteristics.

Our comparison with MODIS and MERRA-2 LAI is structured as a cross-scale evaluation of spatial consistency rather than an independent validation of model accuracy. To maximize spatial coverage for this macro-scale consistency assessment, the comparison utilizing both products (Figure 6) incorporates the full dataset. While this is intended to evaluate the structural and distributional agreement across the entire network, these specific metrics should not be interpreted as site-held-out validation. Instead, the comparison highlights how the localized heterogeneity captured by our site-trained model scales against the smoother, coarse-resolution representations typical of global reanalysis frameworks such as MERRA-2.

3.4. Comprehensive Evaluation

The multi-dimensional assessment of the 13 machine learning methods, visualized via the radar chart analysis (Figure 7), demonstrates distinct variations in model capabilities across the six selected performance metrics (complete statistical indicators are tabulated in Table S5). GBTR exhibited the highest accuracy and strongest alignment with ground measurements, yielding a high coefficient of determination (R² of 0.6470) and low prediction errors, with an RMSE of 0.8992 and an MAE of 0.7253. This performance profile confirms the dual advantage of GBTR in precise LAI estimation (Section 3.2) and stability across varying training conditions (Section 3.1).

While Gaussian Process-based methods emerged as viable alternatives, GRNN and CART exhibited clear structural limitations. On a normalized scale, their scores fell below 0.4 across the first three evaluation dimensions, a constraint directly tied to their weaker validation outcomes. For instance, CART yielded an R² of 0.4674 and a high RMSE of 1.341, while RBFN performed poorly in the latter three indicators (Figure 7). Their poor performance typically manifests in three aspects: (1) inadequate fitting capability, with R² values falling below 0.50; (2) amplified error propagation, where RMSE values exceed 1.20; and (3) extreme sensitivity to training set size, where the standard deviation of R² reaches up to 0.21, as seen in RBFN. These constraints collectively limit their practical utility for large-scale, operational LAI estimation.

3.5. Spatial Distribution Pattern of the Global LAI Average State

Between 2001 and 2021, both LAI products and ML-based LAI estimates (e.g., GBTR LAI) showed a similar spatial pattern in which values were higher near the equator and decreased as latitude increased (Figure 8). Tropical rainforests, with their constant high temperatures, heavy rainfall, and dense canopies, had the highest annual LAI values worldwide. Next were the Siberian steppes at the junction of temperate and boreal zones, as well as the heavily cultivated regions of Eastern Europe and southern Asia, where vegetation coverage was high and leaves were dense. In contrast, the Qinghai–Tibet Plateau, the arid inland areas of Australia, and the tundra regions of northern Canada had lower LAI values due to cold weather, drought, or short growing seasons.

In comparisons across the three products, MODIS LAI estimates were generally lower for grasslands and cultivated areas compared to MERRA-2 and GBTR. MERRA-2 LAI yielded significantly higher values in tropical rainforest regions near the equator. GBTR LAI, in contrast, tended to underestimate values in equatorial regions. However, in the North American Great Plains, Eurasian temperate forests, and grassland ecosystems, GBTR LAI estimates fell between those of MODIS and MERRA-2, and they matched ground-based measurements more closely. While the overall global distribution is ecologically sound, the site distribution bias imposes clear limitations. In high-latitude regions where in situ sites are sparse, the ML method is forced to extrapolate, leading to a noticeable overestimation of LAI.

4. Discussion

4.1. Advancements and Comparative Performance of Machine Learning Algorithms in LAI Estimation

Over the past decade, machine learning (ML) algorithms have revolutionized the field of remote sensing, particularly in the estimation of critical biophysical parameters such as LAI [63]. The rapid advancement of remote sensing technology has enabled the acquisition of high-dimensional datasets encompassing spatial, temporal, and spectral information, thereby demanding increasingly sophisticated algorithms that balance efficiency, accuracy, and robustness [64,65]. In this context, ML-based approaches have shown exceptional potential for addressing the challenges inherent to LAI estimation [28], offering versatile solutions that outperform traditional physically based methods in many scenarios.

This study evaluated the performance of 13 types of ML methods for LAI estimation using MODIS multispectral imagery. Our findings align with prior research while also introducing novel insights. For instance, Wang et al. (2016) illustrated the superiority of RF over ANN in wheat LAI inversion across different growth stages (jointing, booting, and flowering), a conclusion consistent with our results, where the RF-Adaboost ensemble achieved the highest accuracy [66]. Similarly, Ma et al. (2021) employed UAV hyperspectral data to develop an RF-based cotton LAI model, reporting R² values of 0.74 (modeling) and 0.67 (validation), comparable to the performance of our RF (R² = 0.6455) and RF-Adaboost (R² = 0.6327) models [67]. These parallels underscore the robustness of ensemble methods, particularly RF, in handling diverse agricultural monitoring tasks. Furthermore, Houborg and McCabe (2018) highlighted the capability of ML techniques to effectively harness large-scale, high-dimensional remote sensing data [68], reinforcing the rationale behind our methodological approach.

A key distinction of this study lies in the introduction of GBTR, which exhibited top-tier performance among the evaluated methods (R² = 0.647, RMSE = 0.899, MAE = 0.725, Figure 5). Unlike standalone decision trees, GBTR iteratively optimizes prediction accuracy by combining weak learners, making it highly effective for complex, nonlinear relationships inherent in remote sensing data [57]. Additionally, ensemble methods (e.g., RF, GBTR) and GPR demonstrated notable advantages in feature importance interpretation, offering valuable insights for model refinement and variable selection [69].

Conversely, certain models exhibited limitations. The GRNN, for example, demands substantial training data, and the limited dataset in this study probably constrained its performance (Figure 3). While ANNs are broadly adaptable to diverse data types, GRNN remains underutilized in LAI estimation, possibly due to its high requirements for training data [70,71]. Similarly, the CART model, despite its interpretability, struggled with high-dimensional and nonlinear data, likely due to insufficient parameter tuning and inherent structural simplicity compared to more advanced ensemble methods [72].

When these algorithmic behaviors are examined across distinct vegetation types, performance fluctuates according to biome-specific canopy characteristics (Table S7). Advanced ensemble methods (e.g., GBTR, RF) maintain strong predictive stability in both forests and croplands, whereas simpler models (e.g., GRNN, CART) degrade significantly in sparse canopy environments like grasslands. A localized bin-wise error analysis indicates that the overarching compression effect is heavily concentrated at the extreme ends of the leaf area spectrum (Table S4). For example, in dense forest ecosystems where the LAI ranges from 4.0 to 6.0, a clear underestimation occurs due to signal saturation.

Our findings confirm that the inclusion of SWIR bands provides critical corrective leverage against this compression effect, though its importance varies by biome. Within forest ecosystems, visible reflectance saturates early during canopy closure, leaving the model dependent on SWIR bands to capture multi-layered foliage density and canopy water content. In contrast, for structurally simple grasslands, the predictive power shifts back to visible and near-infrared bands that track horizontal fractional cover rather than vertical canopy depth [39]. Integrating these hydrometeorological and SWIR constraints allows top-tier models like GBTR to successfully mitigate saturation limits that typically constrain operational products (Figure S3).

4.2. Machine Learning vs. Traditional LAI Products

Traditional LAI products, such as MODIS LAI (refer to Section 3.4), are predominantly derived from radiative transfer models (RTMs) that establish physical relationships between vegetation reflectance and canopy structure [40,73]. While theoretically sound, these approaches generally face some limitations: (1) they rely on idealized assumptions of surface homogeneity and atmospheric conditions, often failing to account for real-world complexities like topographic effects and mixed vegetation scenarios [74]; (2) their inversion processes typically require stringent parameterization that may not adapt well to diverse ecosystems [75,76]; and (3) they struggle to capture the full spectrum of nonlinear relationships present in multi-temporal, multi-sensor observations [77,78]. Our comparative analysis reveals that machine learning methods overcome these constraints by directly learning the complex mappings between spectral features and biophysical parameters from data. As demonstrated by the independent holdout validation in Section 3.3, this data-driven framework yields substantial accuracy enhancements, including an R² increase of up to 0.55 over operational products (GBTR R² = 0.65 versus MODIS R² = 0.10). Fisher’s r-to-z transformation confirms that these performance gains are statistically significant (p < 0.05), demonstrating the clear advantage of localized, nonlinear optimization particularly in heterogeneous landscapes.

While our current study focuses entirely on benchmarking these data-driven ML methods, the complementary strengths of physical and empirical approaches point toward a promising future avenue for synergistic integration. RTMs could conceptually provide physically constrained priors to regularize ML training, while ML can enhance RTMs by learning residual corrections for systematic biases. Such hybrid frameworks could leverage the interpretability of physical models while maintaining the predictive power of data-driven approaches, a direction that remains to be experimentally tested but holds significant value for operational applications requiring both accuracy and transparency [79]. Recent work [63] has demonstrated the feasibility of this paradigm, showing that ML-corrected LAI products exhibit improved consistency with ground measurements across different biome types. Looking forward, expanding our current workflow into a Physics-Informed Machine Learning (PIML) framework, whether by embedding physical constraints directly into neural network architectures or developing physics-informed loss functions, presents a necessary next step to further mitigate directional anisotropy and establish a more robust standard for scalable vegetation monitoring.

4.3. Perspectives, Limitations, and Future Developments

While our study demonstrates the robust performance of machine learning (ML) methods for LAI estimation, several limitations stemming from data constraints and algorithmic characteristics must be acknowledged.

First, the scarcity and uneven spatial distribution of in situ measurements pose significant challenges. The limited ground truth data restricted systematic validation across all vegetation growth cycles and biomes. Consequently, to maintain statistical significance, model performance was evaluated on an aggregated global basis rather than stratified by biome type. Furthermore, current models were validated predominantly on flat terrain, and their application in data-sparse or underrepresented biomes forces the models to extrapolate beyond training conditions, potentially leading to biased estimates.

Second, the effect of sampling bias and site distribution limitations inherently complicates the evaluation of true spatial generalizability. The geographic clustering of available sites in North America and Europe creates a highly uneven sampling density, which introduces spatial autocorrelation and can artificially inflate standard accuracy metrics. To analyze and quantify this effect, we implemented Leave-One-Site-Out Cross-Validation to test the model on entirely unsampled geographic locations (Figure S6). By removing spatial dependencies, this analysis provides a quantification of the performance penalty induced by sampling bias. While performance metrics naturally showed a moderate decline (e.g., R² from 0.65 to 0.46) compared to random splitting, the model remained robust (RMSE = 1.21), confirming its ability to capture broad environmental gradients rather than merely memorizing local spatial clusters. This finding demonstrates that the model’s predictive accuracy is constrained within the represented environmental space and should not be interpreted as a capacity for seamless spatial extrapolation to entirely unsampled regions.
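The Leave-One-Site-Out scheme described above can be sketched with a small generator that holds out every sample from one site per fold. The site labels and function name are hypothetical and only illustrate the splitting logic, not the authors' pipeline.

```python
def leave_one_site_out(site_ids):
    """Yield (held_out_site, train_indices, test_indices) for each distinct site."""
    for site in sorted(set(site_ids)):
        test_idx = [i for i, s in enumerate(site_ids) if s == site]
        train_idx = [i for i, s in enumerate(site_ids) if s != site]
        yield site, train_idx, test_idx

# Hypothetical site label for each sample (illustration only).
site_ids = ["A", "A", "B", "C", "C", "C"]
folds = list(leave_one_site_out(site_ids))
```

Because no site contributes samples to both the training and test sides of a fold, the resulting scores reflect performance at locations the model has never seen. scikit-learn's `LeaveOneGroupOut` splitter offers equivalent behaviour for readers who prefer a library implementation.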

Third, specific ML algorithms exhibited distinct structural limitations. For instance, the CART model produced discontinuous, step-like LAI estimates (Figure 3 and Figure 4) due to its piecewise constant nature [80]. Attempting to smooth these transitions by increasing tree depth resulted in severe overfitting, which is consistent with the known bias-variance trade-off in shallow tree models. Conversely, while the GBTR model offered better precision, its computational cost was higher. Our controlled benchmark testing indicated that GBTR required 3.4 s of execution time, which represents an approximate fourfold increase relative to CART (0.8 s) and PLSR (0.7 s), as detailed in Table S6. This computational overhead can constrain the viability of GBTR for real-time or continuous large-scale regional applications. Additionally, despite their robust performance, integrated methods often show accuracy variations due to differing experimental designs (e.g., sensors, scales), necessitating model recalibration for new environmental contexts.

To address these challenges, future research should focus on the following key directions. First, ground campaigns should be expanded to enable comprehensive biome-stratified evaluations. Second, future efforts should employ block cross-validation or site-independent testing to better quantify geographical transferability [81], and adopt systematic hyperparameter optimization strategies, such as grid search, nested cross-validation, or Bayesian optimization, to further enhance model accuracy and robustness [82]. Third, future work should prioritize incorporating topographic variables (e.g., elevation, slope, aspect) to enhance generalization in complex terrains, as well as implementing multi-sensor fusion and transfer learning techniques to improve spatial granularity and cross-sensor adaptability. Fourth, applying distributed cloud computing architectures (e.g., Google Earth Engine) or optimizing hyperparameter tuning cycles represents a necessary next step to enhance processing throughput across diverse geographic biomes [83]. Fifth, while our data-driven approach implicitly accommodates angular variance through multi-site ensemble training, incorporating explicit geometric predictors such as solar zenith angle (SZA), view zenith angle (VZA), and relative azimuth angle (RAA), or employing bidirectional reflectance distribution function (BRDF) normalization, remains a promising direction [84]. Finally, to further reduce uncertainties and unrealistic extrapolation in poorly sampled regions, future work should couple machine learning with physically based models. A Physics-Informed Machine Learning framework, embedding physical constraints and canopy radiative transfer relationships, can ensure physically consistent and highly generalizable LAI estimations across diverse ecosystems [85].

4.4. Research Implications and Ecological Significance

The systematic biases observed in our study—specifically the “compression” of the LAI dynamic range—reflect a convergence of intrinsic physical constraints and statistical phenomena common in ecological forecasting.

In high-biomass biomes, such as tropical rainforests, the observed systematic underestimation is primarily driven by the physical saturation of optical reflectance [86]. As canopy density increases, the sensitivity of top-of-atmosphere signals to further leaf area expansion diminishes. This saturation effect is further compounded by canopy clumping and complex shadowing, which obscure the signal from extreme LAI values [87]. When faced with these “flat” gradients, ML models naturally converge toward conservative estimates (the training mean) to minimize global loss, resulting in a failure to resolve peak leaf area.

Conversely, in sparse biomes like tundra or xeric shrublands, the model tends to overestimate the LAI. In these regions, the relatively high noise-to-signal ratio—often caused by confounding soil background reflectance and snow cover—obscures the subtle vegetation signal [88]. Consequently, the model pulls extremely low values upward toward the population mean. This “regression to the mean” in high-uncertainty scenarios highlights the challenge of maintaining accuracy at the lower end of the biomass spectrum where environmental noise dominates.

From a statistical perspective, these biases are exacerbated by the scarcity of “edge cases” in the global training distribution. The relative lack of samples representing extreme high or low LAI values inevitably penalizes the model’s performance at the tails of the distribution [89]. Our results suggest that standard ML approaches, while effective for central tendencies, inherently struggle to represent the full ecological range of LAI without specific interventions. To mitigate this compression effect in future research, it is essential to implement strategies such as stratified sampling, cost-sensitive learning to weight rare extreme values, or the integration of physical constraints to guide the model beyond purely statistical mapping [90]. Such refinements are critical for ensuring that ML-based LAI products can accurately capture the heterogeneous landscapes and phenological extremes essential for global carbon and water cycle modeling.
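One simple way to weight rare extreme values, in the spirit of the cost-sensitive learning mentioned above, is to assign each sample a weight inversely proportional to how many samples share its LAI bin. The bin width, the weighting formula, and the toy values below are assumptions for illustration; the study does not prescribe a specific scheme.

```python
from collections import Counter

def lai_bin(value, width=1.0):
    """Index of the LAI interval containing `value` (e.g., 0-1, 1-2, ...)."""
    return int(value // width)

def inverse_frequency_weights(lai_values, width=1.0):
    """Weight each sample by n / (k * count_in_bin), so weights average to 1."""
    bins = [lai_bin(v, width) for v in lai_values]
    counts = Counter(bins)
    n, k = len(lai_values), len(counts)
    return [n / (k * counts[b]) for b in bins]

# Toy LAI observations dominated by low values (illustration only).
lai = [0.4, 0.8, 1.2, 1.5, 1.9, 5.6]
weights = inverse_frequency_weights(lai)
```

In this toy case the single high-LAI sample receives the largest weight. Such weights could then be passed to any estimator that accepts per-sample weights during fitting, for example through the `sample_weight` argument supported by many scikit-learn regressors.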

5. Conclusions

In this study, the evaluation of 13 machine learning methods for LAI estimation using MODIS data reveals notable performance variations across different algorithms. Among the evaluated methods, both GBTR (R² = 0.837 in internal validation) and RF demonstrated the highest predictive accuracy. While their overall performance differences are not statistically significant (p > 0.05), GBTR exhibited slightly greater numerical stability across varying training sample sizes, making it a robust choice for this specific training distribution. GPR-based approaches, particularly GPR-Adaboost, also show promising results with their inherent error-control capabilities and consistent performance. In contrast, GRNN and CART exhibit limited applicability due to their sensitivity to sample size and poorer generalization ability. Our findings highlight two critical considerations for operational LAI estimation: (1) algorithm selection significantly impacts inversion accuracy, with ensemble methods like GBTR and RF establishing the top-performing tier; and (2) sufficient training data remains essential for model optimization, as prediction accuracy improves with larger sample sizes, yielding a 10% to 15% increase in R² when the sample size doubles. These insights advance the application of machine learning in vegetation monitoring, offering practical guidance for improving LAI product accuracy. The demonstrated enhancement in estimation precision over conventional methods (e.g., radiative transfer models, RTMs) can substantially benefit agricultural management, ecosystem monitoring, and climate modeling. Future work should focus on developing hybrid models that combine the strengths of top-performing algorithms while addressing current limitations in model interpretability and transferability across different vegetation types and climatic conditions.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/rs18121884/s1. Refs. [46,52,54,58,59,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106] are cited in the supplementary materials.

Author Contributions

Conceptualization, D.W., L.M. and Q.L.; methodology, D.W. and L.M.; software, Y.L.; validation, D.W., Y.L., H.J.; formal analysis, D.W.; investigation, Y.L. and H.J.; resources, D.W. and Y.L.; data curation, D.W.; writing—original draft preparation, D.W.; writing— review and editing, D.W., L.M., H.J. and Q.L.; visualization, Y.L.; supervision, Q.L.; project administration, Q.L.; funding acquisition, Q.L. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the Science Fund Program for Excellent Young Scientists (Overseas), the Startup Foundation for Introducing Talent of NUIST (No. 2023r132), and the National Natural Science Foundation of China (42571344). Q.L. wishes to express his gratitude to the Jiangsu Province Distinguished Professor Program for its generous support.

Data Availability Statement

All datasets used in this study are openly accessible from public repositories. The BigFoot dataset is available from the ORNL Distributed Active Archive Center (https://daac.ornl.gov/ (accessed on 27 April 2026)) [33]; the VALERI dataset from INRA (http://w3.avignon.inra.fr/valeri/ (accessed on 27 April 2026)) [34]; the Harvard Forest dataset from the Harvard Forest Data Archive (https://harvardforest.fas.harvard.edu/ (accessed on 27 April 2026)) [35]; the GBOV dataset from the GBOV portal (https://gbov.acri.fr/ (accessed on 27 April 2026)) [36]; and the IMAGINES dataset from the FP7 IMAGINES project website (https://fp7-imagines.eu/ (accessed on 27 April 2026)) [37]. The MODIS surface reflectance product (MOD09GA, Collection 6.1) can be obtained from NASA LP DAAC (https://doi.org/10.5067/MODIS/MOD09GA.061 (accessed on 27 April 2026)) [38]. The MODIS LAI product (MOD15A2H, Version 6) can be accessed at https://lpdaac.usgs.gov/products/mod15a2hv006/ (accessed on 27 April 2026) [40], and the MERRA-2 LAI and related reanalysis products are available via NASA GMAO (https://gmao.gsfc.nasa.gov/reanalysis/MERRA-2/ (accessed on 27 April 2026)) [42]. LAI Site Data and the machine-learning source code used for LAI estimation are openly available on GitHub (https://github.com/albenfds-cell/ideal-waddle (accessed on 27 April 2026)).

Acknowledgments

During the preparation of this manuscript, the authors used ChatGPT (GPT-5, OpenAI, San Francisco, CA, USA) to improve the language and clarity of the text. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Chen, J.M.; Black, T.A. Defining leaf area index for non-flat leaves. Plant Cell Environ. 1992, 15, 421–429.
2. Garrigues, S.; Lacaze, R.; Baret, F.; Morisette, J.T.; Weiss, M.; Nickeson, J.E.; Fernandes, R.; Plummer, S.; Shabanov, N.V.; Myneni, R.B.; et al. Validation and intercomparison of global Leaf Area Index products derived from remote sensing data. J. Geophys. Res. Biogeosci. 2008, 113, G02028.
3. Running, S.W.; Baldocchi, D.D.; Turner, D.P.; Gower, S.T.; Bakwin, P.S.; Hibbard, K.A. A Global Terrestrial Monitoring Network Integrating Tower Fluxes, Flask Sampling, Ecosystem Modeling and EOS Satellite Data. Remote Sens. Environ. 1999, 70, 108–127.
4. Zhang, P.; Anderson, B.; Tan, B.; Huang, D.; Myneni, R. Potential monitoring of crop production using a satellite-based Climate-Variability Impact Index. Agric. For. Meteorol. 2005, 132, 344–358.
5. Asner, G.P.; Scurlock, J.M.O.; Hicke, J.A. Global synthesis of leaf area index observations: Implications for ecological and remote sensing studies. Glob. Ecol. Biogeogr. 2003, 12, 191–205.
6. Jégo, G.; Pattey, E.; Liu, J. Using Leaf Area Index, retrieved from optical imagery, in the STICS crop model for predicting yield and biomass of field crops. Field Crops Res. 2012, 131, 63–74.
7. Baez-Gonzalez, A.D.; Kiniry, J.R.; Maas, S.J.; Tiscareno, M.L.; Macias, C.J.; Mendoza, J.L.; Richardson, C.W.; Salinas, G.J.; Manjarrez, J.R. Large-Area Maize Yield Forecasting Using Leaf Area Index Based Yield Model. Agron. J. 2005, 97, 418–425.
8. Fang, H.; Baret, F.; Plummer, S.; Schaepman-Strub, G. An overview of global leaf area index (LAI): Methods, products, validation, and applications. Rev. Geophys. 2019, 57, 739–799.
9. Zhao, Z.; Cai, X.; Huang, C.; Shi, K.; Li, J.; Jin, J.; Yang, H.; Huang, T. A novel semianalytical remote sensing retrieval strategy and algorithm for particulate organic carbon in inland waters based on biogeochemical-optical mechanisms. Remote Sens. Environ. 2022, 280, 113213.
10. Verrelst, J.; Malenovsky, Z.; Van der Tol, C.; Camps-Valls, G.; Gastellu-Etchegorry, J.P.; Lewis, P.; North, P.; Moreno, J. Quantifying Vegetation Biophysical Variables from Imaging Spectroscopy Data: A Review on Retrieval Methods. Surv. Geophys. 2019, 40, 589–629.
11. Campos-Taberner, M.; García-Haro, F.J.; Busetto, L.; Ranghetti, L.; Martínez, B.; Gilabert, M.A.; Camps-Valls, G.; Camacho, F.; Boschetti, M. A Critical Comparison of Remote Sensing Leaf Area Index Estimates over Rice-Cultivated Areas: From Sentinel-2 and Landsat-7/8 to MODIS, GEOV1 and EUMETSAT Polar System. Remote Sens. 2018, 10, 763.
12. Yang, Y.P.; Huang, Q.T.; Wu, Z.F.; Wu, T.J.; Luo, J.C.; Dong, W.; Sun, Y.W.; Zhang, X.; Zhang, D.Y. Mapping crop leaf area index at the parcel level via inverting a radiative transfer model under spatiotemporal constraints: A case study on sugarcane. Comput. Electron. Agric. 2022, 198, 107003.
13. Gonsamo, A.; Pellikka, P. The sensitivity based estimation of leaf area index from spectral vegetation indices. ISPRS J. Photogramm. Remote Sens. 2012, 70, 15–25.
14. Baret, F.; Weiss, M.; Lacaze, R.; Camacho, F.; Makhmara, H.; Pacholcyzk, P.; Smets, B. GEOV1: LAI and FAPAR essential climate variables and FCOVER global time series capitalizing over existing products. Part 1: Principles of development and production. Remote Sens. Environ. 2013, 137, 299–309.
15. Fuster, B.; Sánchez-Zapero, J.; Camacho, F.; García-Santos, V.; Verger, A.; Lacaze, R.; Weiss, M.; Baret, F.; Smets, B. Quality assessment of PROBA-V LAI, fAPAR and fCOVER collection 300 m products of Copernicus Global Land Service. Remote Sens. 2020, 12, 1017.
16. Maxwell, A.E.; Warner, T.A.; Fang, F. Implementation of machine-learning classification in remote sensing: An applied review. Int. J. Remote Sens. 2018, 39, 2784–2817.
17. Qin, G.X.; Wu, J.; Li, C.B.; Meng, Z.Y. Comparison of the hybrid of radiative transfer model and machine learning methods in leaf area index of grassland mapping. Theor. Appl. Climatol. 2024, 155, 2757–2773.
18. Zhang, W.; Li, Z.J.; Pu, Y.; Zhang, Y.T.; Tang, Z.J.; Fu, J.Y.; Xu, W.J.; Xiang, Y.Z.; Zhang, F.C. Estimation of the Leaf Area Index of Winter Rapeseed Based on Hyperspectral and Machine Learning. Sustainability 2023, 15, 12930.
19. Mananze, S.; Pôças, I.; Cunha, M. Retrieval of Maize Leaf Area Index Using Hyperspectral and Multispectral Data. Remote Sens. 2018, 10, 1942.
20. Du, L.; Yang, H.; Song, X.; Wei, N.; Yu, C.X.; Wang, W.T.; Zhao, Y. Estimating leaf area index of maize using UAV-based digital imagery and machine learning methods. Sci. Rep. 2022, 12, 15937.
21. Karlson, M. Remote Sensing of Woodland Structure and Composition in the Sudano-Sahelian Zone: Application of WorldView-2 and Landsat 8. Ph.D. Thesis, Linköpings Universitet, Linköping, Sweden, 2015.
22. Darvishzadeh, R.; Skidmore, A.; Atzberger, C.; van Wieren, S. Estimation of vegetation LAI from hyperspectral reflectance data: Effects of soil type and plant architecture. Int. J. Appl. Earth Obs. Geoinf. 2008, 10, 358–373.
23. Martínez, B.; García-Haro, F.J.; Camacho-de Coca, F. Derivation of high-resolution leaf area index maps in support of validation activities: Application to the cropland Barrax site. Agric. For. Meteorol. 2009, 149, 130–145.
24. Fang, H.; Jiang, C.; Li, W.; Wei, S.; Baret, F.; Chen, J.M.; Garcia-Haro, J.; Liang, S.; Liu, R.; Myneni, R.B. Characterization and intercomparison of global moderate resolution leaf area index (LAI) products: Analysis of climatologies and theoretical uncertainties. J. Geophys. Res. Biogeosci. 2013, 118, 529–548.
25. Strow, L.L.; Hannon, S.E.; De Souza-Machado, S.; Motteler, H.E.; Tobin, D. An overview of the AIRS radiative transfer model. IEEE Trans. Geosci. Remote Sens. 2003, 41, 303–313.
26. Liu, M.; Yu, W.; Li, D.; Shang, F.; Zhang, L.; Wang, S.; Yang, W.; Zhao, R.; Wang, X. Validation of Multi-Scale LAI Products in Heterogeneous Terrain-Based UAV Images. Remote Sens. 2025, 17, 3393.
27. Xiao, Z.; Liang, S.; Wang, J.; Chen, P.; Yin, X.; Zhang, L.; Song, J. Use of general regression neural networks for generating the GLASS leaf area index product from time-series MODIS surface reflectance. IEEE Trans. Geosci. Remote Sens. 2013, 52, 209–223.
28. Zhang, J.; Cheng, T.; Guo, W.; Xu, X.; Qiao, H.; Xie, Y.; Ma, X. Leaf area index estimation model for UAV image hyperspectral data based on wavelength variable selection and machine learning methods. Plant Methods 2021, 17, 49.
29. Liu, S.; Jin, X.; Nie, C.; Wang, S.; Yu, X.; Cheng, M.; Shao, M.; Wang, Z.; Tuohuti, N.; Bai, Y. Estimating leaf area index using unmanned aerial vehicle data: Shallow vs. deep machine learning algorithms. Plant Physiol. 2021, 187, 1551–1576.
30. Cohen, W.; Maiersperger, T.; Pflugmacher, D. BigFoot Leaf Area Index Surfaces for North and South American Sites, 2000–2003; Data set; Oak Ridge National Laboratory Distributed Active Archive Center: Oak Ridge, TN, USA, 2010.
31. Garvey, S.M.; Templer, P.H.; Pierce, E.A.; Reinmann, A.B.; Hutyra, L.R. Diverging patterns at the forest edge: Soil respiration dynamics of fragmented forests in urban and rural areas. Glob. Change Biol. 2022, 28, 3094–3109.
32. Brown, L.A.; Meier, C.; Morris, H.; Pastor-Guzman, J.; Bai, G.; Lerebourg, C.; Gobron, N.; Lanconelli, C.; Clerici, M.; Dash, J. Evaluation of global leaf area index and fraction of absorbed photosynthetically active radiation products over North America using Copernicus Ground Based Observations for Validation data. Remote Sens. Environ. 2020, 247, 111935.
33. Cohen, W.B.; Maiersperger, T.K.; Turner, D.P.; Ritts, W.D.; Pflugmacher, D.; Kennedy, R.E.; Kirschbaum, A.; Running, S.W.; Costa, M.; Gower, S.T. MODIS land cover and LAI collection 4 product quality across nine sites in the western hemisphere. IEEE Trans. Geosci. Remote Sens. 2006, 44, 1843–1857.
34. Baret, F.; Weiss, M.; Allard, D.; Garrigues, S.; Leroy, M.; Jeanjean, H.; Fernandes, R.; Myneni, R.; Privette, J.; Morisette, J. VALERI: A network of sites and a methodology for the validation of medium spatial resolution land satellite products. Remote Sens. Environ. 2005, 76, 36–39.
35. Caron, S.; Garvey, S.M.; Gewirtzman, J.; Schultz, K.; Bhatnagar, J.M.; Driscoll, C.; Hutyra, L.R.; Templer, P.H. Urbanization and fragmentation have opposing effects on soil nitrogen availability in temperate forest ecosystems. Glob. Change Biol. 2023, 29, 2156–2171.
36. Brown, L.A.; Fernandes, R.; Djamai, N.; Meier, C.; Gobron, N.; Morris, H.; Canisius, F.; Bai, G.; Lerebourg, C.; Lanconelli, C.; et al. Validation of baseline and modified Sentinel-2 Level 2 Prototype Processor leaf area index retrievals over the United States. ISPRS J. Photogramm. Remote Sens. 2021, 175, 71–87.
37. Camacho, F.; Lacaze, R.; Latorre, C.; Baret, F.; De la Cruz, F.; Demarez, V.; Di Bella, C.; Fang, H.; García-Haro, J.; Gonzalez, M.P. A network of sites for ground biophysical measurements in support of Copernicus Global Land Product Validation. In Proceedings of the IV RAQRS Conference, Torrent, Spain, 22–26 September 2014; pp. 22–26.
38. Vermote, E.; El Saleous, N.; Justice, C.; Kaufman, Y.; Privette, J.; Remer, L.; Roger, J.-C.; Tanre, D. Atmospheric correction of visible to middle-infrared EOS-MODIS data over land surfaces: Background, operational algorithm and validation. J. Geophys. Res. Atmos. 1997, 102, 17131–17141.
39. Xiao, Y.; Zhao, W.; Zhou, D.; Gong, H. Sensitivity analysis of vegetation reflectance to biochemical and biophysical variables at leaf, canopy, and regional scales. IEEE Trans. Geosci. Remote Sens. 2013, 52, 4014–4024.
40. Myneni, R.B.; Hoffman, S.; Knyazikhin, Y.; Privette, J.; Glassy, J.; Tian, Y.; Wang, Y.; Song, X.; Zhang, Y.; Smith, G. Global products of vegetation leaf area and fraction absorbed PAR from year one of MODIS data. Remote Sens. Environ. 2002, 83, 214–231.
41. Chen, H.; Zhu, G.; Zhang, K.; Bi, J.; Jia, X.; Ding, B.; Zhang, Y.; Shang, S.; Zhao, N.; Qin, W. Evaluation of evapotranspiration models using different LAI and meteorological forcing data from 1982 to 2017. Remote Sens. 2020, 12, 2473.
42. Gelaro, R.; McCarty, W.; Suárez, M.J.; Todling, R.; Molod, A.; Takacs, L.; Randles, C.A.; Darmenov, A.; Bosilovich, M.G.; Reichle, R.; et al. The Modern-Era Retrospective Analysis for Research and Applications, Version 2 (MERRA-2). J. Clim. 2017, 30, 5419–5454.
43. Cawley, G.C.; Talbot, N.L. On over-fitting in model selection and subsequent selection bias in performance evaluation. J. Mach. Learn. Res. 2010, 11, 2079–2107.
44. Adelabu, S.; Mutanga, O.; Adam, E. Testing the reliability and stability of the internal accuracy assessment of random forest for classifying tree defoliation levels using different validation methods. Geocarto Int. 2015, 30, 810–821.
45. Dube, T.; Mutanga, O.; Adam, E.; Ismail, R. Intra-and-inter species biomass prediction in a plantation forest: Testing the utility of high spatial resolution spaceborne multispectral RapidEye sensor and advanced machine learning algorithms. Sensors 2014, 14, 15348–15370.
46. Wold, S.; Sjöström, M.; Eriksson, L. PLS-regression: A basic tool of chemometrics. Chemom. Intell. Lab. Syst. 2001, 58, 109–130.
47. Wold, S.; Esbensen, K.; Geladi, P. Principal component analysis. Chemom. Intell. Lab. Syst. 1987, 2, 37–52.
48. Cortes, C.; Vapnik, V. Support-Vector Networks. Mach. Learn. 1995, 20, 273–297.
49. Dietrich, R.; Opper, M.; Sompolinsky, H. Statistical mechanics of support vector networks. Phys. Rev. Lett. 1999, 82, 2975.
50. Awad, M.; Khanna, R. Efficient Learning Machines: Theories, Concepts, and Applications for Engineers and System Designers; Springer Nature: Berlin/Heidelberg, Germany, 2015.
51. Mountrakis, G.; Im, J.; Ogole, C. Support vector machines in remote sensing: A review. ISPRS J. Photogramm. Remote Sens. 2011, 66, 247–259.
52. Rasmussen, C.E. Gaussian processes in machine learning. In Summer School on Machine Learning; Rasmussen, C.E., Ed.; Springer: Berlin/Heidelberg, Germany, 2003; pp. 63–71.
53. Van Gerven, M. Computational foundations of natural intelligence. Front. Comput. Neurosci. 2017, 11, 299674.
54. Montazer, G.A.; Giveki, D.; Karami, M.; Rastegar, H. Radial basis function neural networks: A review. Comput. Rev. J. 2018, 1, 52–74.
55. Mas, J.F.; Flores, J.J. The application of artificial neural networks to the analysis of remotely sensed data. Int. J. Remote Sens. 2008, 29, 617–663.
56. Scornet, E. Random forests and kernel methods. IEEE Trans. Inf. Theory 2016, 62, 1485–1500.
57. Friedman, J.H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 2001, 29, 1189–1232.
58. Natekin, A.; Knoll, A. Gradient boosting machines, a tutorial. Front. Neurorobot. 2013, 7, 21.
59. Loh, W.-Y. Classification and regression tree methods. In Encyclopedia of Statistics in Quality and Reliability; Ruggeri, F., Kenett, R.S., Faltin, F.W., Eds.; Wiley: Chichester, UK, 2008; Volume 1, pp. 315–323.
60. Freund, Y. Boosting a weak learning algorithm by majority. Inf. Comput. 1995, 121, 256–285.
61. Fang, H.; Zhang, Y.; Wei, S.; Li, W.; Ye, Y.; Sun, T.; Liu, W. Validation of global moderate resolution leaf area index (LAI) products over croplands in northeastern China. Remote Sens. Environ. 2019, 233, 111377.
62. Verger, A.; Baret, F.; Weiss, M. Near real-time vegetation monitoring with CYCLOPES, MERIS, and MODIS products. Remote Sens. Environ. 2011, 115, 2243–2255.
63. Azadbakht, M.; Ashourloo, D.; Aghighi, H.; Radiom, S.; Alimohammadi, A. Wheat leaf rust detection at canopy scale under different LAI levels using machine learning techniques. Comput. Electron. Agric. 2019, 156, 119–128.
64. Zhu, X.X.; Tuia, D.; Mou, L.; Xia, G.-S.; Zhang, L.; Xu, F.; Fraundorfer, F. Deep learning in remote sensing: A comprehensive review and list of resources. IEEE Geosci. Remote Sens. Mag. 2017, 5, 8–36.
65. Ma, L.; Liu, Y.; Zhang, X.; Ye, Y.; Yin, G.; Johnson, B.A. Deep learning in remote sensing applications: A meta-analysis and review. ISPRS J. Photogramm. Remote Sens. 2019, 152, 166–177.
66. Wang, L.; Zhou, X.; Zhu, X.; Guo, W. Inverting wheat leaf area index based on HJ-CCD remote sensing data and random forest algorithm. Trans. Chin. Soc. Agric. Eng. 2016, 32, 149–154.
67. Ma, Y.; Lyu, X.; Yi, X.; Ma, L.; Qi, Y.; Hou, T.; Zhang, Z. Monitoring of cotton leaf area index using machine learning. Trans. Chin. Soc. Agric. Eng. 2021, 37, 152–162.
68. Houborg, R.; McCabe, M.F. A hybrid training approach for leaf area index estimation via Cubist and random forests machine-learning. ISPRS J. Photogramm. Remote Sens. 2018, 135, 173–188.
69. Han, W.; Zhang, X.; Wang, Y.; Wang, L.; Huang, X.; Li, J.; Wang, S.; Chen, W.; Li, X.; Feng, R. A survey of machine learning and deep learning in remote sensing of geological environment: Challenges, advances, and opportunities. ISPRS J. Photogramm. Remote Sens. 2023, 202, 87–113.
70. Atzberger, C. Advances in remote sensing of agriculture: Context description, existing operational monitoring systems and major information needs. Remote Sens. 2013, 5, 949–981.
71. Yuan, H.; Yang, G.; Li, C.; Wang, Y.; Liu, J.; Yu, H.; Feng, H.; Xu, B.; Zhao, X.; Yang, X. Retrieving soybean leaf area index from unmanned aerial vehicle hyperspectral remote sensing: Analysis of RF, ANN, and SVM regression models. Remote Sens. 2017, 9, 309.
72. Loh, W.Y. Classification and regression trees. WIREs Data Min. Knowl. Discov. 2011, 1, 14–23.
73. Assiri, M.E.; Qureshi, S. A multi-source data fusion method to improve the accuracy of precipitation products: A machine learning algorithm. Remote Sens. 2022, 14, 6389.
74. Fang, H.; Liang, S.; Kuusk, A. Retrieving leaf area index using a genetic algorithm with a canopy radiative transfer model. Remote Sens. Environ. 2003, 85, 257–270.
75. Tian, Y.; Woodcock, C.E.; Wang, Y.; Privette, J.L.; Shabanov, N.V.; Zhou, L.; Zhang, Y.; Buermann, W.; Dong, J.; Veikkanen, B. Multiscale analysis and validation of the MODIS LAI product: I. Uncertainty assessment. Remote Sens. Environ. 2002, 83, 414–430.
76. Fang, H.; Wei, S.; Liang, S. Validation of MODIS and CYCLOPES LAI products using global field measurement data. Remote Sens. Environ. 2012, 119, 43–54.
77. Pisek, J.; Chen, J.M. Comparison and validation of MODIS and VEGETATION global LAI products over four BigFoot sites in North America. Remote Sens. Environ. 2007, 109, 81–94.
78. Huang, X.; Lin, D.; Mao, X.; Zhao, Y. Multi-source data fusion for estimating maize leaf area index over the whole growing season under different mulching and irrigation conditions. Field Crops Res. 2023, 303, 109111.
79. Jiang, Y.; Zhang, Z.; He, H.; Zhang, X.; Feng, F.; Xu, C.; Zhang, M.; Lafortezza, R. Research on Leaf Area Index Inversion Based on LESS 3D Radiative Transfer Model and Machine Learning Algorithms. Remote Sens. 2024, 16, 3627.
80. Hastie, T. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer: Berlin/Heidelberg, Germany, 2009.
81. Roberts, D.R.; Bahn, V.; Ciuti, S.; Boyce, M.S.; Elith, J.; Guillera-Arroita, G.; Hauenstein, S.; Lahoz-Monfort, J.J.; Schröder, B.; Thuiller, W. Cross-validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure. Ecography 2017, 40, 913–929.
82. Snoek, J.; Larochelle, H.; Adams, R.P. Practical Bayesian optimization of machine learning algorithms. In Advances in Neural Information Processing Systems 25; Curran Associates Inc.: Red Hook, NY, USA, 2012.
83. Gorelick, N.; Hancher, M.; Dixon, M.; Ilyushchenko, S.; Thau, D.; Moore, R. Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sens. Environ. 2017, 202, 18–27.
84. Lucht, W.; Schaaf, C.B.; Strahler, A.H. An algorithm for the retrieval of albedo from space using semiempirical BRDF models. IEEE Trans. Geosci. Remote Sens. 2000, 38, 977–998.
85. Karniadakis, G.E.; Kevrekidis, I.G.; Lu, L.; Perdikaris, P.; Wang, S.; Yang, L. Physics-informed machine learning. Nat. Rev. Phys. 2021, 3, 422–440.
86. Mutanga, O.; Skidmore, A.K. Narrow band vegetation indices overcome the saturation problem in biomass estimation. Int. J. Remote Sens. 2004, 25, 3999–4014.
87. Ryu, Y.; Baldocchi, D.D.; Kobayashi, H.; Van Ingen, C.; Li, J.; Black, T.A.; Beringer, J.; Van Gorsel, E.; Knohl, A.; Law, B.E. Integration of MODIS land and atmosphere products with a coupled-process model to estimate gross primary productivity and evapotranspiration from 1 km to global scales. Glob. Biogeochem. Cycles 2011, 25, GB4017.
88. Huete, A.R. A soil-adjusted vegetation index (SAVI). Remote Sens. Environ. 1988, 25, 295–309.
89. Meyer, H.; Pebesma, E. Machine learning-based global maps of ecological variables and the challenge of assessing them. Nat. Commun. 2022, 13, 2208.
90. Willard, J.; Jia, X.; Xu, S.; Steinbach, M.; Kumar, V. Integrating physics-based modeling with machine learning: A survey. arXiv 2020, arXiv:2003.04919.
91. Atkinson, P.M.; Tatnall, A.R. Introduction neural networks in remote sensing. Int. J. Remote Sens. 1997, 18, 699–709.
92. Durbha, S.S.; King, R.L.; Younan, N.H. Support vector machines regression for retrieval of leaf area index from multiangle imaging spectroradiometer. Remote Sens. Environ. 2007, 107, 348–361.
93. Verrelst, J.; Muñoz, J.; Alonso, L.; Delegido, J.; Rivera, J.P.; Camps-Valls, G.; Moreno, J. Machine learning regression algorithms for biophysical parameter retrieval: Opportunities for Sentinel-2 and-3. Remote Sens. Environ. 2012, 118, 127–139.
94. Belgiu, M.; Drăguţ, L. Random forest in remote sensing: A review of applications and future directions. ISPRS J. Photogramm. Remote Sens. 2016, 114, 24–31.
95. Elith, J.; Leathwick, J.R.; Hastie, T. A working guide to boosted regression trees. J. Anim. Ecol. 2008, 77, 802–813.
96. Freund, Y.; Schapire, R.E. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 1997, 55, 119–139.
97. Broomhead, D.S.; Lowe, D. Radial Basis Functions, Multi-Variable Functional Interpolation and Adaptive Networks; Controller HMSO: London, UK, 1988.
98. Specht, D.F. A general regression neural network. IEEE Trans. Neural Netw. 1991, 2, 568–576.
99. Chipman, H.A.; George, E.I.; McCulloch, R.E. Bayesian CART model search. J. Am. Stat. Assoc. 1998, 93, 935–948.
100. Farifteh, J.; Van der Meer, F.; Atzberger, C.; Carranza, E. Quantitative analysis of salt-affected soil reflectance spectra: A comparison of two adaptive methods (PLSR and ANN). Remote Sens. Environ. 2007, 110, 59–78.
101. Awad, M.; Khanna, R. Support vector regression. In Efficient Learning Machines: Theories, Concepts, and Applications for Engineers and System Designers; Apress: Berkeley, CA, USA, 2015; pp. 67–80.
102. Tang, Z.; Guo, J.; Xiang, Y.; Lu, X.; Wang, Q.; Wang, H.; Cheng, M.; Wang, H.; Wang, X.; An, J. Estimation of Leaf Area Index and Above-Ground Biomass of Winter Wheat Based on Optimal Spectral Index. Agronomy 2022, 12, 1729.
103. Gubbala, K.; Kumar, M.N.; Sowjanya, A.M. AdaBoost based Random forest model for Emotion classification of Facial images. MethodsX 2023, 11, 102422.
104. Zhan, X.; Yu, S.; Li, Y.; Zhou, Z.; Cao, H.; Tang, G. Reconstructing historical forest spatial patterns based on CA-AdaBoost-ANN model in northern Guangzhou, China. Landsc. Urban Plan. 2024, 242, 104950.
105. Liu, W.; Zhao, R.; Su, X.; Mohamed, A.; Diana, T. Development and validation of machine learning models for prediction of nanomedicine solubility in supercritical solvent for advanced pharmaceutical manufacturing. J. Mol. Liq. 2022, 358, 119208.
106. Lan, Y.; Zhang, Y.; Lin, W. Diagnosis algorithms for indirect bridge health monitoring via an optimized AdaBoost-linear SVM. Eng. Struct. 2023, 275, 115239.

Figure 1. The global spatial distribution of the Leaf Area Index (LAI) observation sites across diverse biomes. The sites cover Crops, Palm, Temperate Coniferous Forest, Desert, Forest, Deciduous Broad-leaved Forest, Mixed Forest, Shrub, Evergreen Broad-leaved Forest, Grass, and Arctic Tundra. Site locations were adapted from the BigFoot, VALERI, Harvard Forest, GBOV, and IMAGINES datasets.

Figure 2. The correlations between spectral indices and field-measured LAI. ** indicates that the correlation is significant at p < 0.01.

Figure 3. Sensitivity of the machine learning (ML) models to training sample size, which is incremented in intervals of 15. (a) Coefficient of determination (R²) as a function of training sample size; (b) Root Mean Square Error (RMSE) as a function of training sample size.

Figure 4. The scatter plot matrix presents internal validation results obtained using thirteen different machine learning methods for estimating LAI. Each subplot corresponds to a unique machine learning algorithm, including (a) PLSR, (b) ANN, (c) SVR, (d) GPR, (e) RF, (f) GBTR, (g) RF-Adaboost, (h) ANN-Adaboost, (i) GPR-Adaboost, (j) SVM-Adaboost, (k) RBFN, (l) GRNN, and (m) CART. In each panel, measured LAI values are plotted on the x-axis against estimated LAI values on the y-axis (N = 129). The dashed line represents the 1:1 reference line of perfect prediction, while the solid line indicates the linear regression fit between the measured and estimated values. The performance of each method is quantified using the coefficient of determination (R²), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE). The values of these metrics are displayed within each subplot, providing a direct comparison of the algorithms’ predictive capabilities.

Figure 5. The scatter plot matrix illustrates 43 points of external validation results for thirteen machine learning methods applied to estimate LAI. Each subplot is dedicated to a different algorithm, including (a) PLSR, (b) ANN, (c) SVR, (d) GPR, (e) RF, (f) GBTR, (g) RF-Adaboost, (h) ANN-Adaboost, (i) GPR-Adaboost, (j) SVM-Adaboost, (k) RBFN, (l) GRNN, and (m) CART. In each subplot, the x-axis represents the measured LAI values, while the y-axis shows the estimated LAI values by the respective machine learning method, with 43 points presented in every subplot. The dashed line represents the 1:1 reference line of perfect prediction, while the solid line indicates the linear regression fit between the measured and estimated values. The performance metrics for each method are displayed within the subplots: the coefficient of determination (R²), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE), which provide insights into the accuracy and precision of the LAI estimations.
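
The three accuracy metrics shown in the validation panels can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `regression_metrics` and the sample values are hypothetical, and R² is computed here as 1 − SS_res/SS_tot, which is an assumption, since the panels may instead report the squared correlation of the fitted regression line.

```python
import math


def regression_metrics(measured, estimated):
    """Return (r2, rmse, mae) for paired measured and estimated values."""
    if len(measured) != len(estimated) or len(measured) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(measured)
    residuals = [m - e for m, e in zip(measured, estimated)]
    mean_measured = sum(measured) / n
    ss_res = sum(r * r for r in residuals)                      # residual sum of squares
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)    # total sum of squares
    if ss_tot == 0:
        raise ValueError("measured values are constant; R2 is undefined")
    r2 = 1.0 - ss_res / ss_tot          # assumed definition of R2
    rmse = math.sqrt(ss_res / n)        # Root Mean Square Error
    mae = sum(abs(r) for r in residuals) / n   # Mean Absolute Error
    return r2, rmse, mae


# Hypothetical LAI values (m²/m²), for illustration only.
measured = [0.5, 1.0, 2.0, 3.0, 4.5]
estimated = [0.8, 1.1, 1.9, 2.7, 3.9]
r2, rmse, mae = regression_metrics(measured, estimated)
print(f"R2={r2:.3f} RMSE={rmse:.3f} MAE={mae:.3f}")
```

RMSE and MAE share the units of LAI, so they can be read directly against the axes of the scatter plots, whereas R² is unitless.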

Figure 6. Scatter plot comparing machine learning regression methods with MODIS LAI and MERRA-2 LAI products. Note that this comparison incorporates the full dataset (comprising both training and validation samples) to evaluate the overall consistency and distributional alignment between the two products across all ground sites, with 172 scatter points included in each panel. The dashed line represents the 1:1 reference line of perfect prediction, while the solid line indicates the linear regression fit between the measured and estimated values.

Figure 7. Comprehensive evaluation of the 13 machine learning regression models across six normalized performance dimensions using a radar chart. The six axes represent the normalized values of the coefficient of determination (M1: R²), Root Mean Square Error (M2: RMSE), Mean Absolute Error (M3: MAE), and the sensitivity of R², RMSE, and MAE to training sample sizes (M4, M5, and M6, respectively). Each model is depicted by a distinct colored polygon connecting its coordinates across the six axes. All metrics are uniformly scaled from 0 at the center to 1 at the outer edge using min–max normalization. Thus, the better the model performance across all metrics, the closer the data points lie to the outer boundary of the radar chart, which means a larger chart area indicates superior model performance.
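
The min–max scaling behind the radar chart can be sketched as below. This is an illustrative sketch under stated assumptions, not the authors' implementation: the names `min_max_scores` and `radar_area` are hypothetical, the model names and metric values are invented for the example, and error-type metrics are assumed to be inverted so that 1 always marks the best model on an axis.

```python
import math


def min_max_scores(values, higher_is_better=True):
    """Scale a {model: metric} mapping to [0, 1], with 1 as the best model."""
    low, high = min(values.values()), max(values.values())
    if high == low:
        # All models tie on this metric; giving each the full score is an arbitrary choice.
        return {name: 1.0 for name in values}
    if higher_is_better:
        return {name: (v - low) / (high - low) for name, v in values.items()}
    # Assumed inversion for metrics where smaller is better (e.g., RMSE, MAE).
    return {name: (high - v) / (high - low) for name, v in values.items()}


def radar_area(radii):
    """Area of the polygon traced on a radar chart with equally spaced axes."""
    n = len(radii)
    if n < 3:
        raise ValueError("a radar polygon needs at least 3 axes")
    wedge = 0.5 * math.sin(2 * math.pi / n)   # triangle factor between adjacent axes
    return wedge * sum(radii[i] * radii[(i + 1) % n] for i in range(n))


# Hypothetical metric values for three unnamed models, for illustration only.
r2 = {"model_a": 0.60, "model_b": 0.50, "model_c": 0.40}
rmse = {"model_a": 0.90, "model_b": 1.00, "model_c": 1.20}
print(min_max_scores(r2))
print(min_max_scores(rmse, higher_is_better=False))
```

Note that the polygon area depends on the order in which the axes are arranged, so area-based rankings are best read alongside the individual axis values.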

Figure 8. Annual average spatial patterns of MODIS LAI, MERRA-2 LAI, and GBTR LAI values from 2001 to 2021 (unit: m²/m²). (a) MODIS LAI; (b) MERRA-2 LAI; (c) GBTR LAI.

Table 1. LAI data collected and analyzed in this study. The information in the table includes the project name, location, vegetation type, year, official website, and relevant references.

Project	Location	Vegetation	Year	Website	Reference
BigFoot	North America	Forest, cropland, tallgrass prairie, desert	1999–2003	https://daac.ornl.gov/ (accessed on 27 April 2026)	[33]
VALERI	Globe	Forest, grassland, crops	2000–2008	http://w3.avignon.inra.fr/valeri/ (accessed on 27 April 2026)	[34]
Harvard Forest	MA, USA	Forest	2014–2018	https://harvardforest.fas.harvard.edu/ (accessed on 27 April 2026)	[35]
GBOV	North America, Europe	Forest, grasslands, shrubs	2013–2022	https://gbov.acri.fr/ (accessed on 27 April 2026)	[36]
IMAGINES	Globe	Forest, crops, grassland	2013–2016	https://fp7-imagines.eu/ (accessed on 27 April 2026)	[37]

Table 2. Method Classification.

Category	Method Name
-	PLSR
-	SVR
-	GPR
Neural Networks	ANN
Neural Networks	RBFN
Neural Networks	GRNN
Decision Tree	RF
Decision Tree	GBTR
Decision Tree	CART
AdaBoost	SVR-Adaboost
AdaBoost	GPR-Adaboost
AdaBoost	ANN-Adaboost
AdaBoost	RF-Adaboost

Table 3. Number of training and testing samples for the eight different schemes.

Sample Name	Set 1	Set 2	Set 3	Set 4	Set 5	Set 6	Set 7	Set 8
Training Datasets	24	39	54	69	84	99	114	129
Testing Datasets	30 (all sets)
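
The eight training sizes in the table follow a simple arithmetic progression (24 to 129 in steps of 15), which can be sketched as below. This is a hedged illustration only: the function name `training_schemes` is hypothetical, and drawing nested subsets from one shuffled pool with a fixed seed is an assumption, since the table does not state how samples were assigned to each set.

```python
import random


def training_schemes(n_total=129, start=24, step=15, seed=0):
    """Return {"Set k": sorted sample indices} for growing training subsets."""
    sizes = list(range(start, n_total + 1, step))   # 24, 39, ..., 129
    rng = random.Random(seed)                       # fixed seed for repeatability
    order = list(range(n_total))
    rng.shuffle(order)
    # Assumption: each larger set contains every sample of the smaller sets.
    return {f"Set {k}": sorted(order[:size]) for k, size in enumerate(sizes, start=1)}


schemes = training_schemes()
for name, indices in schemes.items():
    print(name, len(indices))
```

Keeping the subsets nested means that any change in accuracy between consecutive sets reflects only the 15 added samples, not a reshuffling of the whole training pool.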

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

