1. Introduction
Reliable estimation of above-ground pasture biomass (PB) is essential for effective farm management, feed budgeting, and decision-making in pasture-based dairy systems. As global demand for agricultural efficiency grows, accurate and scalable PB monitoring tools become increasingly important. Pasture biomass estimation supports livestock system productivity, maintains environmental sustainability through optimised resource use, and improves economic viability across diverse farming operations [1]. However, conventional data collection methods such as the rising plate meter (RPM), a tool that relates the compressed height of the sward to the amount of biomass present, provide relatively high-quality data but are labour-intensive, time-consuming and impractical for daily monitoring. This can create operational bottlenecks for farmers seeking to balance productivity with operational sustainability [2,3].
In contrast, satellite remote sensing provides broad-scale, repeatable and comparatively low-cost coverage of vegetation condition [1,4]. Yet its most widely used proxy, the Normalised Difference Vegetation Index (NDVI), saturates under dense canopies, reducing sensitivity to PB and flattening response curves, which compromises predictive accuracy [5,6]. Alternative indices, including the Enhanced Vegetation Index (EVI), Soil-Adjusted Vegetation Index (SAVI) and Normalised Difference Red-Edge Index (NDRE), are derived from different combinations of spectral bands but offer only marginal improvements and cannot accurately predict PB on their own [7,8]. Because PB variability is driven by multiple factors such as soil moisture, paddock management, short-term weather extremes and region-specific sward composition, indices alone cannot explain PB variation; these additional drivers must be represented explicitly in modelling frameworks [9,10].
Sentinel-2 provides free multispectral imagery with high temporal and spatial resolution that supports large-scale, frequent PB measurements, making it a cost-effective alternative to options such as UAV-mounted sensors, field spectrometers, and ground-based cameras [10,11,12,13]. Its diverse spectral bands allow researchers to monitor vegetation health and growth consistently and affordably [14,15].
However, the use of satellite data has inherent limitations [16]. Cloud cover, atmospheric interference and inclement weather can produce unreliable satellite-derived products or no data at all, which limits the PB estimates available for decision-making and reduces the reliability of insights into pasture management and utilisation [4,8]. On days when satellite data are completely unavailable, for example on cloudy or snowy days, ground-based RPM measurements are usually used to fill the gap and maintain a consistent flow of data [9,10,17,18].
Integrating diverse data sources offers a potential solution to these challenges. Recent research shows that combining raw satellite imagery with ground-truth measurements, weather variables and paddock-specific information can significantly enhance PB prediction accuracy [10,18]. Machine learning (ML) algorithms such as Random Forests (RF), Support Vector Machines (SVM) and Artificial Neural Networks (ANN) have been used in these integrations and outperform traditional regression methods and linear programming. They identify nonlinear relationships among the factors driving pasture growth, handle high-dimensional data, and adapt to temporal and spatial variability, thereby offering a more comprehensive understanding of pasture utilisation and management [19,20,21,22].
As an alternative or complement to ML approaches, physiological growth models provide mechanistic insights into pasture dynamics by simulating processes such as photosynthesis, respiration, and nutrient uptake based on environmental drivers. Models such as DairyMod, APSIM-Pasture, and ModVege have been successfully applied to predict pasture growth in temperate grazing systems, offering interpretable predictions grounded in plant ecophysiology [23,24,25,26]. While these process-based models excel at capturing temporal growth patterns under varying environmental conditions, they typically require extensive parameterisation and may lack the flexibility to incorporate high-dimensional remote sensing data directly [27]. Hybrid approaches that combine the mechanistic foundation of physiological models with the pattern-recognition capabilities of ML algorithms represent a promising research direction, potentially leveraging the strengths of both methodologies to improve predictive accuracy and model interpretability [28,29].
However, significant research gaps persist, particularly regarding temporal and spatial misalignments between satellite data and ground measurements, which create inconsistencies because satellite observations often do not coincide with the timing or location of ground data collection. Furthermore, missing data due to cloud cover or other environmental obstructions disrupt the continuity of satellite-derived datasets [4,30]. Addressing these issues requires data manipulation strategies such as interpolation to fill gaps, synthetic data generation to expand datasets and progressive training to capture temporal patterns across different scenarios [31]. The generalisability of models across regions and farming systems is also a concern: a model trained on data from one region may perform well within that context, but its applicability to locations with different weather, soil, and management conditions is not guaranteed. Therefore, evaluating reliability and flexibility using external validation datasets is essential to ensure that the predictive framework is not overly tailored to a specific environment but remains versatile enough for broader adoption.
Building on the challenges highlighted above, this research poses three key questions to address the limitations of current pasture estimation methods. First, can the integration of raw satellite imagery, RPM measurements, weather data, and paddock-specific characteristics significantly enhance PB prediction accuracy, in particular by overcoming the known saturation of conventional derived vegetation indices such as NDVI? Second, how effectively can the integration of diverse data sources address the challenges of missing or unreliable satellite readings caused by cloud cover, thereby ensuring data continuity for robust model performance? Third, can advanced ML models compensate for temporal and spatial mismatches and remain valid and reliable when applied to external datasets across varied weather, soil, and management contexts, thereby offering a robust and generalisable solution beyond existing site-specific approaches?
To address these questions, the objectives of this study are threefold. First, to develop an integrated ML framework that leverages raw Sentinel-2 reflectance, RPM-derived PB, weather, and paddock management characteristics to accurately estimate PB. Second, to evaluate the reliability and transferability of the framework across diverse temporal and spatial conditions, including unseen farms and seasons, by leveraging data augmentation and progressive training strategies. Third, to benchmark the developed framework against a commercial platform, demonstrating its practical application as a transparent, cost-effective and scalable solution for sustainable near real-time, farm-specific pasture utilisation and management in dairy systems.
2. Materials and Methods
2.1. Study Area and Data Sources
This study was conducted across 16 commercial dairy farms located in three coastal districts of New South Wales (NSW), Australia, between November 2021 and July 2024 (Figure 1). The farms were distributed across the mid-coast district (n = 7 farms, latitude range: −34.03° to −31.72°S, longitude range: 150.65° to 152.68°E), the south coast (n = 5 farms, latitude range: −36.82° to −36.64°S, longitude range: 149.60° to 149.90°E), and the north coast (n = 4 farms, latitude range: −28.90° to −28.68°S, longitude range: 152.91° to 153.13°E), with specific farm coordinates withheld to maintain commercial confidentiality. These farms collectively managed 2436 hectares of grazing land, with herd sizes varying from 105 to 580 milking cows, and individual farms containing between 22 and 83 paddocks, with utilisable grazing areas ranging from 64.5 ha to 313 ha. The study regions presented diverse weather conditions, with long-term mean annual rainfall averaging approximately 780 mm in the south coast, 1284 mm in the mid-coast, and 1073 mm in the north coast. Average daily air temperatures ranged from approximately 5 to 20 °C in winter and 20 to 35 °C in summer. All farms had pastures based on kikuyu (Cenchrus clandestinus, previously Pennisetum clandestinum), which produces biomass from late spring (November) through summer and autumn, oversown every year (March to April) with annual ryegrass (Lolium multiflorum L.), which produces biomass during autumn, winter, and early spring.
Remote sensing data were acquired from Copernicus Sentinel-2 surface reflectance imagery, which offers 10 m spatial resolution and a nominal five-day revisit cycle, for each paddock on each farm during clear-sky passes. The spectral bands retained for analysis included blue, green, red, near-infrared, red-edge 1 to 3, and short-wave infrared 2 and 3. Standard vegetation indices (NDVI, EVI, SAVI, and NDRE) were also calculated per pixel. To ensure data quality, pixels were masked using the Function of mask (Fmask) layer, which classifies each pixel as valid, water, cloud, shadow, or snow.
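As an illustration of the per-pixel index calculation, the following minimal sketch uses the standard published formulations of the four indices. The EVI coefficients (2.5, 6, 7.5, 1), the SAVI soil factor of 0.5, the choice of red-edge band for NDRE and the example reflectance values are assumptions made for illustration; the study does not state which variants were used.

```python
def vegetation_indices(blue, red, red_edge, nir, soil_factor=0.5):
    """Return NDVI, EVI, SAVI and NDRE for one pixel.

    Inputs are surface reflectance on a 0-1 scale. The coefficients are the
    commonly published defaults, assumed here rather than taken from the study.
    """
    ndvi = (nir - red) / (nir + red)
    evi = 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)
    savi = (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)
    ndre = (nir - red_edge) / (nir + red_edge)
    return {"NDVI": ndvi, "EVI": evi, "SAVI": savi, "NDRE": ndre}


# Hypothetical reflectance values for a dense, green pasture pixel.
pixel = vegetation_indices(blue=0.03, red=0.04, red_edge=0.12, nir=0.40)
for name, value in pixel.items():
    print(f"{name}: {value:.3f}")
```

All four indices are ratios of a small number of bands, which is one reason they can discard information that the full set of raw bands retains.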
Ground-truth PB measurements were conducted using an RPM. A Jenquip EC20 Electronic Pasture Meter (Feilding, New Zealand) was used to obtain Compressed Sward Height (CSH) readings. Five primary paddocks on each farm were selected for continuous monitoring over the study period (November 2021 to July 2024), along with one additional reserve paddock (the sixth paddock), which served as a spatially independent validation site. The five primary paddocks were selected on the following criteria: (i) representativeness of the predominant pasture species composition on each farm (kikuyu oversown with annual ryegrass), (ii) accessibility for consistent fortnightly/weekly sampling throughout the study period, (iii) diversity in paddock size and topographic characteristics to capture farm-level variability, and (iv) active incorporation in the farm’s regular grazing rotation to ensure realistic management conditions. The sixth paddock was intentionally withheld from model training to provide an independent spatial validation dataset, representing genuine out-of-sample conditions for assessing model generalisation to unseen paddock locations within the same farm.
RPM calibration was an integral part of the data collection, performed monthly on two designated fixed paddocks per farm [10]. The calibration process involved collecting nine 0.1 m² quadrat cuts at a 5 cm stubble height, stratified across the paddock to capture high, medium, and low biomass zones (three cuts per zone). Samples were dried for 48 hours at 65 °C, weighed, and subsequently regressed against the corresponding CSH measurements to derive farm-specific and seasonally adjusted conversion equations. This approach ensured that CSH-to-PB conversions accounted for seasonal changes in pasture density and species composition. The monthly calibration frequency was designed to capture phenological changes in sward structure that could affect the CSH-PB relationship, particularly during transitions between kikuyu and ryegrass dominance. This process allowed for the accurate conversion of CSH readings into PB, expressed as kilograms of dry matter per hectare (kg DM ha−1). For each paddock, a minimum of 70 plate measurements were recorded and then converted to PB in kg DM ha−1 [10,18]. The spatial distribution of measurements within each paddock ensured that areas with varying slope, drainage, and proximity to high-traffic zones (gates, water points, shade structures) were adequately represented, thereby minimising bias from localised management effects or microtopographic variation.
Additionally, daily weather variables for each farm were obtained from the SILO Long Paddock platform (https://www.longpaddock.qld.gov.au). The variables used were maximum air temperature (°C), minimum air temperature (°C), rainfall (mm), vapour pressure (kPa), maximum and minimum relative humidity (%), incoming solar radiation (MJ m−2) and evapotranspiration (mm). These variables were included to represent weather influences on PB.
2.2. Data Preprocessing
Sentinel-2 images were corrected to surface reflectance, resampled to a common 10 m grid and clipped to paddock boundaries obtained from the georeferenced GIS dataset for each farm. The pixel-quality layer Fmask [32] classified each pixel as valid, water, cloud, shadow or snow, and only pixels flagged valid were retained for analysis. For every monitored paddock on each image date, the mean reflectance of blue, green, red, near-infrared, red-edge 1-3 and short-wave-infrared 2-3 was calculated, and the vegetation indices NDVI, EVI, SAVI and NDRE were derived. Spectral outliers lying more than 1.5 interquartile ranges below the first quartile or above the third quartile were removed before averaging.
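The outlier rule and paddock averaging can be sketched as follows. This is a minimal illustration on hypothetical near-infrared values for one paddock and one image date; the quantile method (Python's default "exclusive" method) is an assumption, since the text does not say how the quartiles were computed.

```python
from statistics import mean, quantiles


def paddock_mean_reflectance(values, k=1.5):
    """Mean of one band over a paddock after removing IQR outliers.

    Values more than k interquartile ranges below Q1 or above Q3 are dropped.
    Returns the mean of the retained pixels and the number removed.
    """
    q1, _, q3 = quantiles(values, n=4)  # quartile method is an assumption
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    kept = [v for v in values if lower <= v <= upper]
    return mean(kept), len(values) - len(kept)


# Hypothetical valid-pixel reflectance; 0.90 mimics a residual cloud edge.
nir_pixels = [0.30, 0.31, 0.32, 0.33, 0.34, 0.35, 0.90]
paddock_nir, n_removed = paddock_mean_reflectance(nir_pixels)
print(f"mean NIR = {paddock_nir:.3f}, pixels removed = {n_removed}")
```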
Rising plate-meter records were filtered to keep PB between 1000 kg DM ha−1 and 4000 kg DM ha−1. Approximately 70 individual RPM readings collected within each paddock on a monitoring day were averaged to one PB value per paddock. Within each farm-paddock-date group, categorical fields such as paddock name were reduced to their modal value and numeric fields were averaged. Daily SILO weather data were converted to numeric, checked for implausible entries and merged directly by each farm and calendar date.
Date stamps in every dataset were converted to datetime and expanded to ISO week number, calendar year and austral season, defined as Spring (September to November), Summer (December to February), Autumn (March to May) and Winter (June to August). Satellite data, daily weather observations and PB estimates were merged on farm code, paddock code and date. When Sentinel-2 and plate-meter observations were not recorded on the same day, PB rows were retained only if a Sentinel-2 acquisition occurred within ±3 days. Rows with missing predictor values were removed on a complete-case basis, and continuous variables were centred and scaled to unit variance for subsequent modelling, yielding a curated dataset of 3161 records from 80 paddocks across the 16 farms.
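The ±3-day matching rule can be sketched for a single paddock as below. The function name, the example dates and the tie-breaking behaviour (the first listed date wins when two acquisitions are equally close) are assumptions for illustration, not details taken from the study.

```python
from datetime import date


def nearest_acquisition(pb_date, sentinel_dates, max_gap_days=3):
    """Return the Sentinel-2 date closest to pb_date, or None.

    None means no acquisition falls within max_gap_days, in which case the
    PB row would be dropped. Ties go to the first date in the list.
    """
    if not sentinel_dates:
        return None
    best = min(sentinel_dates, key=lambda d: abs((d - pb_date).days))
    return best if abs((best - pb_date).days) <= max_gap_days else None


# Hypothetical clear-sky acquisitions for one paddock.
sentinel = [date(2023, 3, 1), date(2023, 3, 6), date(2023, 3, 16)]

matched = nearest_acquisition(date(2023, 3, 4), sentinel)    # 2 days from 6 March
unmatched = nearest_acquisition(date(2023, 3, 11), sentinel)  # 5 days from either side
print(matched, unmatched)
```

For tabular data, `pandas.merge_asof` with `direction="nearest"`, a `tolerance` of three days and `by` set to the farm and paddock codes offers an equivalent grouped join.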
2.3. Interpolation Methods for Data Augmentation
Following the merging of the RPM measurements, collected weekly in year 1 and fortnightly in year 2, with Sentinel-2 passes that recur on an approximately five-day revisit cycle, small temporal gaps remained in the PB series because Sentinel-2 reflectance does not itself provide a direct measurement of PB. To expand the dataset, these gaps were filled within each paddock on each farm using stochastic interpolation routines implemented in Python (v3.11.4).
Four interpolation techniques were assessed to augment PB observations for the merged dataset (RPM and Sentinel-2). The set comprised a second-order polynomial in time as the baseline curve, a Gaussian radial basis function, a multiquadric radial basis function [33,34] and a minimum-curvature exact spline. Radial basis surfaces offered flexible, data-driven alternatives that perform well for smooth yet non-linear environmental series [35,36,37], and the minimum-curvature spline minimises the surface Laplacian, a property valued in geophysical analysis [38]. All four algorithms were executed independently three times for each paddock on each farm: once on year 1 data only (dates earlier than 1 April 2023), once on year 2 data only (dates on or after 1 April 2023) and once on the combined record to exploit longer-term temporal structure where available.
Interpolation was performed only when at least three actual PB observations were available for a paddock and was strictly confined to the temporal span defined by those observations, thereby preventing any extrapolation. Analysis of observation intervals revealed that time gaps between consecutive pasture meter measurements in the final dataset typically ranged from 7 to 14 days (median = 7 days, mean = 8.4 days). The majority of intervals (84.9%) were ≤14 days, with 94.3% ≤30 days and only 1.6% exceeding 60 days. The multiquadric radial basis function is appropriate for capturing smooth temporal trajectories across such intervals, as pasture growth follows gradual, predictable patterns between discrete grazing events rather than exhibiting abrupt discontinuities. For the small proportion of longer intervals (>60 days), interpolation remained constrained to the temporal span of observed measurements, with no extrapolation beyond the empirical record. All time stamps were converted to Unix epoch seconds to provide a uniform, monotonic reference axis for curve fitting. Prior to modelling, any PB measurement falling outside the biologically credible interval of 1000 to 4000 kg DM ha−1 was discarded. After the gap-filling procedure, each record was annotated to indicate whether its Sentinel-2 reflectance originated from a nearest-date substitution and whether its PB value was interpolated, enabling downstream analyses to distinguish directly measured data from synthetically generated values. Rows with null values after processing were discarded. The combination of biological filtering, nearest-date filling and interpolation produced a temporally gap-free dataset containing 9816 daily records from 80 paddocks across the 16 farms.
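To make the gap-filling rule concrete, the sketch below hand-rolls an exact multiquadric radial basis interpolator in NumPy for one paddock. It is an illustration under stated assumptions, not the routine used in the study: time is expressed in days rather than epoch seconds to keep the linear system well scaled, the shape parameter `epsilon` of 7 days is arbitrary, and the observations are hypothetical.

```python
import numpy as np


def multiquadric_interpolate(t_obs, y_obs, t_new, epsilon=7.0):
    """Exact multiquadric RBF interpolation of a PB time series.

    Basis: phi(r) = sqrt(r**2 + epsilon**2), with r the time difference.
    Enforces the two rules in the text: at least three observations and
    no extrapolation beyond the observed time span.
    """
    t_obs = np.asarray(t_obs, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    t_new = np.asarray(t_new, dtype=float)
    if t_obs.size < 3:
        raise ValueError("at least three observations are required")
    if t_new.min() < t_obs.min() or t_new.max() > t_obs.max():
        raise ValueError("query times must lie within the observed span")

    def phi(r):
        return np.sqrt(r ** 2 + epsilon ** 2)

    # Solve for weights so the curve passes through every observation.
    weights = np.linalg.solve(phi(t_obs[:, None] - t_obs[None, :]), y_obs)
    return phi(t_new[:, None] - t_obs[None, :]) @ weights


# Hypothetical plate-meter record: days since first reading, PB in kg DM/ha.
days = [0, 7, 14, 28]
pb = [1500.0, 1900.0, 2400.0, 1800.0]
daily = np.arange(0, 29)  # daily grid inside the observed span
filled = multiquadric_interpolate(days, pb, daily)
print(np.round(filled[[0, 7, 14, 28]]))
```

Because the fit is exact, the interpolated series reproduces each measured value on its own date; a flag column marking interpolated rows, as described above, would then separate measured from synthetic values.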
2.4. Predictive Modelling for Pasture Biomass
2.4.1. Model Training and Optimisation
Univariate analysis was employed to quantify linear relationships between each predictor and the response variable PB (kg DM ha−1) using a Pearson correlation matrix visualised as a heat map. Numerical predictors comprised daily maximum temperature (°C), daily minimum temperature (°C), evapotranspiration (mm d−1), incoming solar radiation (MJ m−2 d−1), vapour pressure (kPa), rainfall (mm d−1), maximum and minimum relative humidity (%), ten (10) Sentinel-2 spectral bands (blue, green, red, near infra-red, red edge 1 to 3, short-wave infra-red 2 and 3) and four derived vegetation indices (NDVI, SAVI, EVI, NDRE). While standard greenness indices (NDVI, EVI, SAVI, NDRE) were calculated, SWIR-based indices such as the Cellulose Absorption Index (CAI) were not explicitly derived, as the tree-based ML algorithms were expected to capture relevant SWIR band information through non-linear feature interactions. Categorical predictors were season, coastal region and grazing information; these were examined with box plots and frequency tables. Four feature sets were defined: (i) all bands with indices, (ii) all bands with weather data and indices, (iii) bands only and (iv) bands with weather data without indices. These configurations were assessed to determine which combination of variables most effectively models PB and to quantify the contribution of weather and vegetation indices to its variability. Predictors showing negligible correlation or high collinearity were removed.
All analyses in this study were carried out in Python (v3.11.4). Categorical variables were one-hot encoded and numerical features were centred and scaled with StandardScaler. Data were split randomly into training and test partitions at an 80:20 ratio, after which multiple random seeds were evaluated through a five-fold, three-repeat Repeated K-Fold cross-validation loop to stabilise estimates. Eight regression algorithms, namely linear regression (LR), least absolute shrinkage and selection operator (LASSO), decision tree (DT), support vector regression (SVR), k-nearest neighbours (KNN), random forest (RF), gradient boosting machine (GBM) and extreme gradient boosting (XGBoost), were wrapped in scikit-learn pipelines. Hyper-parameters for every algorithm were declared in a single Python dictionary; tree ensembles varied the number of decision trees (n_estimators) from 50 to 450 and maximum depth from 3 to 10, while SVR varied the cost regularisation parameter C from 0.1 to 10 and the kernel type. A grid search was used to test all hyper-parameter combinations systematically, selecting the configuration that minimised mean absolute error (MAE, kg DM ha−1), which is equivalent to maximising the negative-MAE score, and the model with the lowest cross-validated error was retained.
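The tuning loop described above might look like the following minimal sketch for a single algorithm. The synthetic data, the reduced grid (a small subset of the stated ranges, chosen to keep the run short) and the step names `scale` and `model` are assumptions for illustration; the study's actual dictionary of grids is not reproduced here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, RepeatedKFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for band and weather predictors and for PB (kg DM/ha).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(120, 5))
y = 1000.0 + 2000.0 * X[:, 0] * X[:, 1] + rng.normal(0.0, 50.0, size=120)

# Random 80:20 split into training and test partitions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

pipeline = Pipeline(
    [("scale", StandardScaler()), ("model", RandomForestRegressor(random_state=0))]
)
param_grid = {"model__n_estimators": [50, 100], "model__max_depth": [3, 10]}

# Five-fold cross-validation repeated three times.
cv = RepeatedKFold(n_splits=5, n_repeats=3, random_state=0)

# scikit-learn maximises scores, so MAE is supplied in negated form.
search = GridSearchCV(pipeline, param_grid, scoring="neg_mean_absolute_error", cv=cv)
search.fit(X_train, y_train)

best_cv_mae = -search.best_score_
print(search.best_params_, f"cross-validated MAE = {best_cv_mae:.1f}")
```

Placing the scaler inside the pipeline means it is refitted on each training fold, so no information from the held-out fold leaks into the scaling.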
2.4.2. Progressive Training for Temporal Consistency
To examine how sequential retraining influences model performance while preserving the chronological order of new observations, a progressive training strategy developed by Correa-Luna, et al. [10] was adopted. Every record carried calendar year, month, ISO week, and week of month (WOM), enabling four nested training subsets that represented 25%, 50%, 75% and 100% of each monthly cycle. The subsets were defined as follows: 1 W used WOM = 1; 2 W used WOM = 1 or 2; 3 W used WOM = 1, 2 or 3; and 4 W used WOM = 1, 2, 3 or 4. After each increment, the best model was refitted on the enlarged subset and evaluated on the unchanged test split, with MAE retained as the optimisation metric. The protocol was run separately on the non-interpolated dataset and on the interpolated dataset created in Section 2.3, ensuring that temporal consistency was assessed under both raw and augmented conditions without introducing look-ahead bias. To assess inter-annual generalisation, an additional experiment trained the model on Year1Set (data collected before 1 April 2023) and evaluated it on Year2Set (data collected on or after that date), using three dataset variants: the non-interpolated data, a Year 1 multiquadric-interpolated dataset and a full-period multiquadric-interpolated dataset.
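The nested subsets can be sketched as below. The text does not define how WOM is derived from the date, so this example assumes seven-day blocks from the first of the month, with days 22 onward assigned to week 4 so that the 4 W subset covers the whole month; the record structure is hypothetical.

```python
from datetime import date


def week_of_month(d):
    """Assumed WOM rule: days 1-7 -> 1, 8-14 -> 2, 15-21 -> 3, 22 onward -> 4."""
    return min((d.day - 1) // 7 + 1, 4)


def progressive_subsets(records):
    """Build the nested '1 W' to '4 W' training subsets from dated records."""
    return {
        f"{k} W": [r for r in records if week_of_month(r["date"]) <= k]
        for k in range(1, 5)
    }


# Hypothetical training records for one paddock in May 2023.
records = [
    {"date": date(2023, 5, 3), "pb": 1800},
    {"date": date(2023, 5, 10), "pb": 2100},
    {"date": date(2023, 5, 17), "pb": 2450},
    {"date": date(2023, 5, 24), "pb": 1600},
    {"date": date(2023, 5, 31), "pb": 1900},
]

subsets = progressive_subsets(records)
for name, rows in subsets.items():
    print(name, len(rows))
```

Each subset contains the previous one, so refitting on successive subsets mimics data arriving week by week within a month.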
2.4.3. Pasture Biomass Model Validation
For the validation, two independent hold-out samples were excluded from all training steps, including non-interpolated, interpolated, and progressive training workflows, and were kept free of any interpolation or gap-filling procedures to represent truly unseen data. The first independent validation sample comprised 41 records representing all available and valid paired RPM PB and Sentinel-2 imagery observations collected from the five primary monitored paddocks across nine of the study farms between 1 and 30 November 2024. The specific count of 41 records was the outcome of applying the data-quality filters (i.e., PB values between 1000 and 4000 kg DM ha−1) to all available measurements during this validation period. For these records, PB measured with RPM was merged with same-day Sentinel-2 imagery, and cloud-affected scenes were handled by replacing them with the corresponding weekly mean reflectance. The selection of these nine farms was based on the availability of complete, high-quality independent data that met the validation criteria during this specific November 2024 period, ensuring they provided fresh, untainted observations.
The second independent validation sample consisted of 63 records reflecting all valid observations gathered specifically from the sixth monitored paddock on each participating farm for the whole period. This “sixth paddock” was intentionally designated as an additional, spatially distinct unseen geographic validation set. The 63 records represent the total number of valid observations obtained after applying the identical merging procedure and the same 1000–4000 kg DM ha−1 PB filtering as the training data, ensuring comparable data quality while maintaining their independence. The predictive accuracy of the models on these truly independent datasets was quantified using standard performance metrics: root mean squared error (RMSE), mean absolute error (MAE), mean squared error (MSE), and the coefficient of determination (R2). These metrics were calculated separately for each independent hold-out sample, thereby enabling a robust assessment of the ability of the model to generalise to completely unseen, non-interpolated data across different temporal and spatial contexts.
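For reference, the four metrics can be computed as in the minimal sketch below, using their standard definitions. The observed and predicted values are hypothetical and are not results from this study.

```python
from math import sqrt


def regression_metrics(observed, predicted):
    """Return RMSE, MAE, MSE and R2 for paired observed and predicted PB."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - (mse * n) / ss_tot  # 1 - SS_res / SS_tot
    return {"RMSE": sqrt(mse), "MAE": mae, "MSE": mse, "R2": r2}


# Hypothetical hold-out sample (kg DM/ha).
observed = [1500.0, 2000.0, 2500.0, 3000.0]
predicted = [1600.0, 1900.0, 2600.0, 2800.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

RMSE and MAE are in kg DM ha−1, MSE is in squared units, and R2 can be negative on a hold-out sample when predictions are worse than the sample mean.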
4. Discussion
4.1. Exploratory Analysis and Vegetation Index Limitations
Descriptive analysis revealed that PB is governed by seasonality and regional setting, with composite spectral indices explaining daily variation more effectively than instantaneous weather readings. The Pearson correlation heat map (Figure 3) partitioned predictors into three main clusters that informed the modelling strategy. The high correlation among the ten Sentinel-2 bands (r > 0.70) justified the use of tree-based feature selection methods rather than manual dimensionality reduction. The greenness-biomass cluster showed that while NDRE (r = 0.49), EVI (0.45), NDVI (0.44) and SAVI (0.44) all correlated positively with PB, none achieved correlation coefficients above 0.50, indicating substantial unexplained variance. The weather cluster revealed tight coupling among maximum temperature, evapotranspiration and solar radiation (r ≈ 0.75) but only weak instantaneous ties to PB (r ≤ 0.18), emphasising the lagged influence of weather on pasture growth rather than same-day effects. These insights guided the development of multi-feature ML pipelines that merged meteorological variables, full-band Sentinel-2 reflectance, red-edge-enhanced indices and categorical predictors to capture the residual variability in PB. The univariate regression analysis (Figure 2) demonstrated that increased sampling does not alleviate the saturation problem inherent in vegetation indices. Saturation occurs when the relationship between a vegetation index and biomass becomes non-linear and eventually plateaus, meaning that increases in biomass no longer result in proportional increases in the index value. The fan-shaped residual pattern confirms that NDVI plateaus near 0.80 while PB continues to rise beyond 3000 kg DM ha−1. Even red-edge information (NDRE), which emerged as the sole significant positive coefficient (+3701 kg DM ha−1 unit−1, p < 0.001), cannot fully capture PB variability once canopies exceed approximately 2800 kg DM ha−1. This saturation fundamentally limits the utility of univariate spectral regressions at high canopy density, where accurate PB estimation is most critical for practical farm management.
An important consideration is that PB variability reflects not only environmental conditions but also management practices, particularly grazing pressure, which can rapidly alter available biomass independent of weather or spectral signals [39,40]. In the absence of explicit management variables, environmental predictors may partly function as categorical site indicators that implicitly capture farm-specific utilisation patterns. This limitation reinforces the necessity of frequent ground-truth calibration through RPM measurements, which inherently integrate both environmental and management effects at each sampling occasion [9,10]. The fortnightly to weekly sampling regime employed here was specifically designed to capture these combined dynamics, while meteorological variables enhanced temporal interpolation between observations [41].
4.2. Overcoming Vegetation Index Saturation Through Multi-Spectral Integration
The primary challenge addressed in this study was the saturation of traditional vegetation indices like NDVI under dense canopies, which typically occurs when PB exceeds approximately 3 tonnes DM ha−1 [42]. This saturation fundamentally limits the utility of conventional remote sensing approaches for practical farm management, where accurate PB estimation is most critical at higher PB levels. The univariate regression analysis (Figure 2) clearly illustrated this limitation, with NDVI plateauing near 0.80 whilst PB continued to rise beyond 3000 kg DM ha−1, resulting in the characteristic fan-shaped residual pattern that confirms the inadequacy of NDVI alone for explaining PB variability.
The integrated approach of combining raw Sentinel-2 reflectance with weather variables demonstrates a clear pathway beyond these limitations. The progression from NDVI-only (R2 = 0.28, MAE = 359 kg DM ha−1) to NDVI plus weather factors (R2 = 0.50, MAE = 284 kg DM ha−1) to full spectral bands with weather data (R2 = 0.63, MAE = 243 kg DM ha−1) shows that whilst weather data provide valuable orthogonal information, the complete Sentinel-2 spectral suite captures essential biophysical information that traditional indices cannot adequately represent. This finding aligns with recent research by Jennewein, et al. [43], who achieved R2 = 0.70 for crop biomass/PB estimation using multi-sensor proximal remote sensing combining multiple satellite platforms. However, our current study demonstrates that comparable performance can be achieved using only freely available Sentinel-2 data, highlighting the cost-effectiveness and accessibility of this approach for widespread adoption across diverse farming systems. Collectively, these findings demonstrate that PB is governed by a variety of factors such as seasonality and regional setting, that composite spectral indices explain daily variation more effectively than instantaneous weather readings, and that greenness saturation limits the usefulness of univariate spectral regressions at high canopy density.
Guerini Filho, et al. [44] showed that Sentinel-2 imagery combined with vegetation indices could predict natural grassland PB with R2 values ranging from 0.51 to 0.65. However, our current study demonstrates that robust predictive performance can be achieved even in the absence of explicit vegetation indices, indicating that raw reflectance bands, especially in the red-edge and short-wave infrared regions, inherently contain the necessary biophysical information. This finding is further supported by Gargiulo, et al. [18], who reported R2 = 0.72 and RMSE = 255 kg DM ha−1 when combining Sentinel-2 with Planet CubeSats data, but whose approach required multiple commercial satellite sources. The decision to exclude vegetation indices from the final model aligns with findings from Morse-McNabb, et al. [19], who demonstrated that including SWIR bands substantially enhanced yield prediction accuracy with Sentinel-2 when predicting PB above 3000 kg DM ha−1, improving R2 from 0.79 to 0.90 and reducing RMSE by nearly 200 kg DM ha−1. Ogungbuyi, et al. [45] similarly highlighted the limitations of index-only approaches, noting that despite a moderate correlation with total PB (R2 = 0.43), the associated MAE of 871.83 kg DM ha−1 was too large for practical application. In contrast, the inclusion of SWIR bands, as demonstrated in the current study, not only minimised preprocessing requirements but also improved generalisation on external datasets. The validation results (Figure 5) confirmed that raw red-edge and short-wave infrared bands already carry the information encapsulated by the indices, with explicit index calculation adding noise without improving generalisation.
Although weather showed weak instantaneous correlations with PB (Section 3.1), concurrent weather still provides orthogonal information that improves prediction. Removing meteorological inputs lowered test-set R2 by 5-6 points and raised MAE by roughly 30 kg DM ha−1, confirming that weather variables capture aspects of pasture condition not reflected in same-day spectral measurements. The ranking of model quality remained unchanged across validation sets, with XGBoost ahead of gradient boosting and random forest, followed by linear methods, showing that ensemble tree algorithms capture non-linear interactions between weather and reflectance that translate beyond the calibration domain. Stand-alone decision trees performed worst, highlighting the stabilising value of ensemble averaging in a feature space dominated by collinear reflectance bands.
Consistent with our findings, Chen, et al. [46] reported optimal PB prediction when all Sentinel-2 bands, NDVI, and weather variables were included, yielding R² of approximately 0.60 and MAE of approximately 262 kg DM ha⁻¹. This study matched these benchmarks without requiring explicit inclusion of vegetation indices, reinforcing the argument that full-spectrum inputs provide a richer predictive foundation than derived indices alone. In a broader modelling context, Netsianda and Mhangara [4] combined Sentinel-2 bands, NDVI, and elevation data to estimate PB using RF and GBM algorithms, achieving an R² of 0.73. Whilst their approach involved multiple data modalities, our findings demonstrate comparable or superior results using only satellite and weather data, without requiring additional topographic inputs.
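For readers comparing the R² and MAE figures quoted across these studies, the sketch below shows the usual definitions of the two metrics. It is illustrative only: the function names and the sample values are assumptions, and some of the cited studies may report R² as a squared correlation rather than the coefficient of determination used here.

```python
def mae(observed, predicted):
    """Mean absolute error, in the same units as the inputs (e.g. kg DM/ha)."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical RPM observations and model predictions (kg DM/ha).
observed = [1500, 2000, 2500, 3000]
predicted = [1600, 1900, 2600, 2900]
print(mae(observed, predicted), r_squared(observed, predicted))
```

Because MAE keeps the units of the target, it is the easier of the two to compare against a feed-budgeting tolerance, whereas R² depends on the spread of biomass in each dataset and is less comparable across studies.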
While this study demonstrated that XGBoost effectively captured information from raw SWIR2 and SWIR3 bands through non-linear feature interactions, future research could explore whether explicit SWIR-based indices such as the Cellulose Absorption Index (CAI) or Normalised Difference Lignin Index (NDLI) provide additional interpretability or improve performance in simpler, more interpretable models (e.g., linear regression or decision trees) that may be preferred in some operational contexts [47]. Recent studies have demonstrated that SWIR bands are particularly effective for predicting high pasture biomass (>4000 kg DM ha⁻¹) where chlorophyll-based indices like NDVI saturate [19, 48], and that explicit SWIR-enhanced indices can provide clearer mechanistic insights into canopy structural properties [49, 50]. However, our results confirm that ensemble tree-based models can effectively learn these relationships directly from raw spectral bands without requiring pre-calculated indices.
4.3. Data Augmentation Through Multiquadric Interpolation
Cloud cover and satellite revisit cycles create inevitable temporal gaps in optical remote sensing data, a fundamental limitation recognised across the remote sensing literature [4, 16]. The multiquadric radial basis interpolation approach employed in this study addresses this limitation by filling gaps in the RPM time series, generating interpolated observations anchored to actual measurements that improve model training without introducing unrealistic temporal assumptions. The interpolation accounts for pasture growth between measurements by fitting a smooth surface through the observed RPM points, with the multiquadric function providing a mathematically principled way to estimate intermediate values that reflect the gradual accumulation and depletion of biomass between field measurements. By confining interpolation strictly to the temporal span defined by observed measurements and requiring at least three actual observations per paddock, the method avoids extrapolation whilst capturing the underlying growth trajectory.
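The gap-filling step described above can be sketched as follows for a single paddock. This is a minimal illustration under stated assumptions, not the study's implementation: the function name, the shape parameter `epsilon` (in days), the mean-centring of the biomass values, and the sample data are all hypothetical. It uses the multiquadric kernel φ(r) = sqrt(r² + ε²) and enforces the two constraints mentioned in the text (at least three observations, no extrapolation).

```python
import numpy as np


def multiquadric_fill(days, biomass, query_days, epsilon=7.0, min_obs=3):
    """Interpolate paddock biomass at query_days with a multiquadric RBF.

    days, biomass : observed RPM dates (as day numbers) and values.
    query_days    : days on which an estimate is wanted.
    epsilon       : assumed shape parameter of the kernel, in days.
    Returns the retained query days and their interpolated values.
    """
    days = np.asarray(days, dtype=float)
    biomass = np.asarray(biomass, dtype=float)
    if days.size < min_obs:
        raise ValueError("at least %d observations are required" % min_obs)

    # Keep only queries inside the observed span, so nothing is extrapolated.
    query = np.asarray(query_days, dtype=float)
    query = query[(query >= days.min()) & (query <= days.max())]

    # Solve A w = (y - mean), with A_ij = sqrt((t_i - t_j)^2 + epsilon^2).
    mean = biomass.mean()
    A = np.sqrt((days[:, None] - days[None, :]) ** 2 + epsilon ** 2)
    weights = np.linalg.solve(A, biomass - mean)

    # Evaluate the fitted curve at the retained query days.
    B = np.sqrt((query[:, None] - days[None, :]) ** 2 + epsilon ** 2)
    return query, B @ weights + mean


# Hypothetical fortnightly RPM readings (kg DM/ha) for one paddock.
q, est = multiquadric_fill([0, 14, 28, 42], [1800, 2400, 2900, 2100],
                           query_days=[-5, 0, 7, 14, 50])
print(q, est)
```

The fitted curve passes exactly through the observed readings, and the queries on days -5 and 50 are dropped because they fall outside the observed span. The choice of `epsilon` controls how smooth the curve is between readings and would need tuning against held-out measurements.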
Whilst radial basis interpolation has found application in environmental and medical sciences, and some studies have explored interpolation methods for satellite data in agricultural applications, the specific application of multiquadric interpolation to bridge temporal gaps between ground-truth RPM measurements and cloud-affected satellite imagery in pasture biomass estimation represents a methodological contribution to this field. The observed improvement aligns with recent advances in agricultural data augmentation reported by Gracia Moisés, et al. [51], who demonstrated substantial error reductions using similar techniques in optical spectroscopy applications, although their setting differs from the temporal gap-filling of satellite-ground data addressed here.
The multiquadric surface employed in this study demonstrated a favourable balance of speed, stability, and predictive effectiveness compared to alternative interpolation methods (Figure 6). This contrasts with methods such as Gaussian process or Kriging augmentation, which, despite improving spatial homogeneity, are computationally intensive for national-scale datasets [52]. Recent mathematical advancements that generalise the multiquadric kernel for quasi-interpolation [53, 54, 55] suggest that even higher fidelity could be achieved as these theoretical formulations become operational in geospatial libraries. The trend-assisted multiquadric gridding approach showed efficiency gains consistent with those noted in Earth observation applications [31], supporting its adoption for operational pasture monitoring systems.
4.4. Progressive Training and Temporal Generalisation
The integration of interpolated rows within the progressive training regime demonstrated the practical benefit of continually updating the model. This trajectory corroborates the findings of Correa-Luna, et al. [10] that even a relatively small proportion of fortnightly ground observations, roughly 10% of paddock days, is sufficient to stabilise pasture predictions, whilst emphasising the necessity of retraining the model on a comparable rolling schedule.
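One way to picture a rolling retraining schedule of this kind is an expanding-window split, in which the model is refitted at each cutoff date on everything observed so far and evaluated on the following period. The sketch below is an assumption about how such a schedule could be organised, not the study's procedure; the function name, the cutoff dates, and the sample observation dates are hypothetical.

```python
from datetime import date


def expanding_window_splits(dates, cutoffs):
    """Yield (cutoff, train_idx, test_idx) for an expanding training window.

    For each consecutive pair of cutoffs (start, end), the training set is
    every row dated before `start` and the test set is every row dated in
    [start, end). Splits with an empty side are skipped.
    """
    for start, end in zip(cutoffs, cutoffs[1:]):
        train = [i for i, d in enumerate(dates) if d < start]
        test = [i for i, d in enumerate(dates) if start <= d < end]
        if train and test:
            yield start, train, test


# Hypothetical observation dates for one paddock, with monthly retraining.
obs_dates = [date(2023, 1, 5), date(2023, 1, 20), date(2023, 2, 3),
             date(2023, 2, 18), date(2023, 3, 4)]
monthly = [date(2023, 2, 1), date(2023, 3, 1), date(2023, 4, 1)]
for cutoff, train_idx, test_idx in expanding_window_splits(obs_dates, monthly):
    print(cutoff, train_idx, test_idx)
```

Each split only ever tests on dates later than everything in its training set, which mirrors how a retrained model would be used operationally.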
However, the temporal validation experiments (Figure 8) revealed important limitations of interpolation when applied to strict temporal extrapolation scenarios. Training exclusively on Year1Set and testing on Year2Set demonstrated that interpolation actually reduced performance compared to the raw data baseline (R² slipped from 0.24 to 0.14, and MAE rose from 349 to 368 kg DM ha⁻¹). This apparent contradiction with the overall benefits of interpolation highlights the context-dependent nature of data augmentation techniques, and aligns with domain adaptation challenges widely reported in the ML literature (e.g., Meyer, et al. [56]). To understand this performance decline, we decomposed the prediction error into bias and variance components. The interpolated Year1Set model exhibited higher bias (systematic underestimation of Year 2 biomass by approximately 15%) and similar variance compared to the raw model, suggesting that the interpolation surface captured Year 1-specific patterns that did not transfer to Year 2 conditions.
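The text does not state how the error was decomposed. One common choice, sketched below as an assumption, splits the mean squared error of the residuals into a squared bias term (the mean residual) and a variance term (the spread of residuals around that mean), so that MSE = bias² + variance. The function name and the numbers are hypothetical and are not the study's data.

```python
from statistics import fmean


def bias_variance(observed, predicted):
    """Return (bias, variance, mse) of the residuals predicted - observed.

    bias     : mean residual (negative values indicate underestimation).
    variance : population variance of the residuals around the bias.
    mse      : mean squared error, equal to bias**2 + variance.
    """
    errors = [p - o for o, p in zip(observed, predicted)]
    bias = fmean(errors)
    variance = fmean([(e - bias) ** 2 for e in errors])
    mse = fmean([e * e for e in errors])
    return bias, variance, mse


# Hypothetical Year 2 observations and predictions (kg DM/ha).
observed = [2000, 2500, 3000]
predicted = [1700, 2200, 2400]
bias, variance, mse = bias_variance(observed, predicted)
print(bias, variance, mse)
```

In this toy case the model underestimates every observation, so most of the error sits in the bias term rather than the variance term, which is the pattern the paragraph above describes for the interpolated Year1Set model.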
The observations generated through interpolation introduced bias when the training and testing data came from fundamentally different temporal distributions, as the interpolation surface fitted to Year 1 data encoded seasonal patterns specific to that year. However, when training and testing data came from similar temporal distributions (the 80:20 split from the combined dataset), interpolation significantly improved performance by providing a denser, more representative sample of the underlying growth patterns. Taken together, these findings indicate that the combined application of interpolation and progressive training effectively bridges the observational gaps inherent in optical remote sensing and establishes a data-efficient method for maintaining accuracy as seasonal conditions change. This combined strategy directly addresses model validity and reliability across varied climatic, soil, and management contexts, thereby offering a robust and generalisable solution.
4.5. Model Performance and Validation Across Multiple Scales
The robustness of the final multiquadric-augmented model was reflected across external validation sets (Figure 7). The model achieved consistent performance on the November 2024 and sixth-paddock validation sets, outperforming the non-interpolated baseline, whilst other interpolation methods failed to surpass an R² of 0.37. Spatial leave-farm-out experiments confirmed that a PB model trained on one set of paddocks inevitably experiences a reduction in accuracy when applied to entirely excluded fields. In our case, the non-interpolated version surrendered roughly one quarter of its explanatory power under this scenario, a decline consistent with the cross-location deterioration reported by Smith, et al. [57]. However, the introduction of multiquadric interpolation significantly mitigated this loss, helping the model retain a more balanced representation of the growth envelope and keeping the MAE on the held-out farms below 300 kg DM ha⁻¹, while explaining nearly half of the PB variance and surpassing the more complex ensembles assessed by Smith, et al. [57] at comparable extrapolation distances.
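The leave-farm-out protocol referred to above can be sketched as a grouped split in which every row from one farm is withheld in turn, so that no paddock from the test farm leaks into training. This is a minimal illustration; the function name and the farm labels are hypothetical, and the study's exact fold construction is not described here.

```python
def leave_farm_out(farm_ids):
    """Yield (held_out_farm, train_idx, test_idx), one fold per farm.

    farm_ids holds one farm label per observation row. Every row of the
    held-out farm goes to the test set; all other rows go to training.
    """
    for held_out in sorted(set(farm_ids)):
        train = [i for i, f in enumerate(farm_ids) if f != held_out]
        test = [i for i, f in enumerate(farm_ids) if f == held_out]
        yield held_out, train, test


# Hypothetical farm label for each of five observation rows.
rows = ["A", "A", "B", "C", "B"]
for farm, train_idx, test_idx in leave_farm_out(rows):
    print(farm, train_idx, test_idx)
```

If interpolated rows are used, they would need to carry the farm label of the paddock they were derived from, so that interpolated and observed rows from the same farm always fall on the same side of the split.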
The disparity between robust performance on the internal 20% test split and comparatively weaker scores on independent validation sets illustrates the inherent challenges of model transferability, as widely reported in remote sensing applications where models are tested beyond their calibration domain. Validation errors exceeded test errors by about 10%, reflecting the additional temporal and geographic distance embodied in the hold-out samples. The fact that model accuracy declined on paddocks and seasons withheld from calibration, even with interpolation, underscores the need for regular retraining. Overall, the results demonstrate that temporal generalisation hinges on two complementary strategies: retaining the full historical record so that the model learns from the complete seasonal cycle, and supplementing that record with cautiously smoothed interpolations that densify the signal without inflating noise. When applied together, these measures raise R² by about 30 percentage points and cut MAE by roughly 85 kg DM ha⁻¹ compared with a model trained on a single season, and they maintain consistent accuracy when entire farms are withheld during testing. The consistency of these results across scenarios shows that the model, built with the full predictor set, extends reliably to farms unseen during calibration, reinforcing the temporal gains reported above.
4.6. Comparison with Commercial Decision-Support Systems
The comparison with PIO, a commercial PB estimation platform, provides crucial context for the application of research-based approaches. The strong agreement between the developed open-source model and the proprietary platform (R² = 0.66, MAE = 240 kg DM ha⁻¹) demonstrates that transparent, academic approaches can achieve comparable performance to commercial solutions across matched observations spanning multiple farms and production years [10]. An openly documented and fully reproducible approach can therefore achieve relatively high performance whilst retaining adaptability to integrate new sensors and management variables as they become available. Commercial platforms typically have access to multiple satellite sources, sophisticated atmospheric correction algorithms, and proprietary data fusion techniques, making this performance parity particularly significant. The comparison validates methodological choices whilst highlighting the potential for democratising precision agriculture technologies. Many existing commercial solutions require significant capital investment or ongoing subscription costs that may be prohibitive for smaller farming operations, particularly in developing regions where cost-effective monitoring solutions are most needed.
4.7. Limitations and Future Research Directions
Several factors constrain the current implementation and highlight avenues for future research. The reliance on multiquadric interpolation presumes gradual temporal evolution of PB, a common approach for filling cloudy gaps in vegetation monitoring but one that risks obscuring sharp declines associated with grazing or drought. Combining optically derived indices with sensors capable of detecting rapid canopy structural shifts, such as C-band radar backscatter, could offset this limitation [58]. Pasture greenness measures displayed the well-recognised saturation effect once biomass exceeded roughly 3000 kg DM ha⁻¹, thereby limiting model sensitivity in the critical range where tactical management adjustments are most often needed [46, 59]. Acquiring richer spectral information, such as narrow-band hyperspectral or chlorophyll fluorescence signals, may offer enhanced sensitivity under high biomass and address this ceiling effect [13, 60, 61].
Additionally, weather predictors were restricted to daily aggregates, limiting the capacity to reflect sub-daily factors such as vapour pressure deficit, refined soil moisture estimates, or high-resolution temperature extremes that could capture short-lived stress events influencing pasture growth between satellite overpasses. The study spanned three districts within a single coastal climatic zone, and explicit testing of site-specific differences in management, such as fertiliser timing or detailed pasture quality, was not conducted. As noted by Holzworth, et al. [62], expansion of ground data across diverse weather and management systems is essential to evaluate broad-scale model transferability. The Year 1 to Year 2 transfer experiment reinforces this point: despite a threefold increase in sample size through interpolation, a model trained exclusively on the first season could not accommodate the phenological shifts of the following year, and acceptable performance was only restored after expanding the calibration base and retraining on multi-year data [63].
The demonstrated performance achievements using freely available satellite data have important implications for democratising precision agriculture technologies. The open-source approach provides full methodological transparency allowing for local adaptation and improvement, eliminates ongoing data acquisition costs through use of freely available Sentinel-2 data, and enables integration of additional sensor types or management variables as they become available. However, the current implementation may require some technical expertise, making it more immediately accessible to researchers, government agents, or technical staff rather than directly to individual farmers. Strategically, the progressive training paradigm establishes a pathway for continuously improving models that remain reliable as satellite technology, weather conditions, and management practices evolve.
Methodologically, this study advances pasture remote sensing by demonstrating that integrating raw spectral, meteorological, and management data within a unified machine learning framework provides superior performance compared to models driven solely by vegetation indices. From a practical standpoint, the interpolation-enhanced model delivers near real-time PB estimates that align well with commercial platforms across diverse environments, offering producers a transparent and customisable alternative to proprietary solutions. Future work should focus on exploring ensemble combinations of optical and radar imagery, automating paddock-level recalibration using farmer-supplied RPM measurements, and integrating the model into grazing allocation tools. Together, these enhancements would establish a dynamic, self-improving decision-support system that translates predictive accuracy into improved pasture utilisation and increased farm profitability.
5. Conclusions
This study demonstrates that accurate pasture biomass estimation can be achieved even where traditional vegetation indices saturate and Sentinel-2 imagery is intermittently obscured by cloud cover. By integrating raw multispectral reflectance, rising plate meter measurements, daily weather data, and paddock-level metadata within a machine learning framework, the model effectively bridged the gap between satellite observations and ground measurements. The unified strategy prioritising full-band reflectance over vegetation indices achieved robust baseline performance (R² = 0.63, MAE = 243 kg DM ha⁻¹), substantially outperforming NDVI-based approaches. Multiquadric radial basis interpolation of field measurements addressed temporal gaps from cloud obstruction, augmenting the dataset with approximately 30% interpolated observations and improving performance to R² = 0.70 and MAE = 216 kg DM ha⁻¹. These improvements were observed across independent validation sets, with the November 2024 validation set achieving R² = 0.44 and MAE = 267 kg DM ha⁻¹, and the sixth-paddock validation set achieving R² = 0.48 and MAE = 235 kg DM ha⁻¹.
The implementation of progressive training with seasonally aligned observations maintained model accuracy across temporal and spatial contexts. Leave-farm-out validation confirmed robust generalisation to unseen farms (R² = 0.46, MAE < 300 kg DM ha⁻¹), whilst comparison with the commercial Pasture.io platform demonstrated comparable performance using freely available data. Despite these advancements, several limitations warrant future research: multiquadric interpolation may smooth abrupt pasture biomass declines following intensive grazing, and daily weather aggregates may overlook short-lived environmental events. Future work should focus on integrating finer-resolution sensors, developing automated paddock-level recalibration systems, and incorporating the model into grazing allocation tools to establish a dynamic decision-support system that translates predictive accuracy into improved pasture utilisation and farm profitability.