Article

Scaling Biomass Estimation by Expanding Ground Truth with UAS-Derived Training Data

Department of Food, Agricultural, and Biological Engineering, The Ohio State University, Columbus, OH 43210, USA
*
Author to whom correspondence should be addressed.
Remote Sens. 2025, 17(18), 3163; https://doi.org/10.3390/rs17183163
Submission received: 1 August 2025 / Revised: 7 September 2025 / Accepted: 10 September 2025 / Published: 12 September 2025

Abstract

Accurate estimation of winter cover crop biomass at a landscape scale is key to assessing benefits and promoting widespread adoption. Satellite imagery offers broad coverage but is limited by coarse resolution and spatial mismatch with field measurements. This study introduces a hybrid framework to improve satellite-based estimation of cereal rye cover crop biomass by integrating UAS-derived data. Extreme gradient boosting (XGBoost) and random forest (RF) machine learning models were trained across three scenarios: (1) UAS-based models, using field-measured biomass alongside UAS-derived vegetation indices (VIs) and crop height; (2) satellite-based models, using field-measured biomass and Sentinel-2 satellite-derived VIs and grey level co-occurrence texture measures; and (3) UAS–satellite synergistic models, where UAS-estimated biomass served as surrogate ground truth for calibrating satellite-derived VIs and texture features. Our results show that the error increased by up to 49% for XGBoost and 31% for RF when field-measured cereal rye biomass from 0.5 × 0.5 m2 quadrats was used to directly train models on satellite-derived features at 10 m resolution (RMSE = 83.09 g m−2 for XGBoost and 80.46 g m−2 for RF), compared to models trained on UAS-derived features at 5 cm resolution (RMSE = 55.78 g m−2 for XGBoost and 61.63 g m−2 for RF). Notably, the UAS–satellite synergistic model demonstrated improved alignment, with RMSEs of 59.79 g m−2 for XGBoost and 61.45 g m−2 for RF, while potentially overcoming the limitations arising from the mismatch between satellite pixel size and field measurements. These findings underscore the potential of UAS-derived biomass estimates to improve the accuracy, scalability, and spatial fidelity of satellite-based cover crop biomass estimation.

1. Introduction

While modern agricultural practices have significantly boosted productivity and efficiency, they have also contributed to a widespread decline in soil health and water quality, particularly across the Midwestern United States. To mitigate these environmental concerns, conservation practices such as winter cover cropping are increasingly promoted through state- and federal-level incentive programs [1,2]. Among their many benefits, winter cover crops improve soil organic carbon, reduce soil compaction and erosion, suppress weeds, and enhance nutrient retention [3,4]. The magnitude of these benefits is closely tied to the amount of biomass produced by the cover crops. Thus, timely and spatially detailed information on cover crop biomass is critical for guiding field-level management decisions that promote both agronomic productivity and environmental sustainability.
Conventional methods for cover crop biomass estimation, which are based on manual field sampling, are labor- and resource-intensive, and therefore limited in spatial coverage [3,5]. Remote sensing (RS) technologies offer a promising alternative by enabling scalable, cost-effective biomass estimation from individual locations to fields to regional landscapes. In particular, unmanned aerial systems (UASs) and satellite imagery have been increasingly leveraged to estimate cover crop biomass and their nutrient uptake, often in conjunction with ground-truth data [6,7,8].
Satellite imagery, with its broad spatial coverage and frequent revisit times, is well suited for landscape-scale monitoring. However, its utility for precise biomass estimation is limited by relatively coarse spatial resolution (10–30 m), which often does not align with the small-scale field plots used for ground-based measurements [9,10]. For instance, while ground measurements represent data within just a few square meters, satellite pixels often span 10 m (Sentinel-2) to 30 m (Landsat-8/9) on a side, which oversimplifies the features within a pixel. This spatial mismatch can introduce significant biases in model calibration and validation [11,12,13,14]. Recent advancements in UAS platforms and sensors present a promising opportunity to bridge this scale gap. UASs can capture high-resolution imagery (down to the centimeter level), making them ideal for capturing fine-scale spatial variability that mirrors ground sampling plots. When equipped with advanced sensors, such as multispectral, hyperspectral, and LiDAR, UASs can offer rich spectral and structural information for precise estimation of cover crop biomass [8,15,16]. However, due to constraints such as limited battery life, regulatory restrictions, and operational cost, UASs are less practical for frequent and large-scale monitoring [17,18,19,20].
A promising approach to overcome these challenges is to use UAS-derived estimates as a high-resolution intermediary between field data and satellite observations. Prior studies have demonstrated that UAS-based estimates can effectively calibrate satellite imagery for large-scale crop monitoring [21,22]. Yet, this strategy remains underexplored in the context of cereal rye, a widely incentivized conservation practice in the Midwest. Existing efforts to estimate cereal rye biomass at larger scales often rely solely on satellite imagery and ground data, without adequately addressing the spatial resolution mismatch between small plot-based biomass samples (often <1 m2) and satellite pixels (10–30 m) [10,23,24]. For instance, Kharel et al. [23] evaluated the correlation between mixed-species cover crop biomass collected from a 1 m2 area and vegetation indices (VIs) derived from 3 m resolution PlanetScope imagery on research farms in Stoneville, Mississippi.
This study addresses this critical gap by integrating UASs and satellite-based remotely sensed imagery to enhance the accuracy and scalability of cover crop biomass estimation. Specifically, we aim to (i) estimate cereal rye biomass using machine learning models trained on UAS-based imagery and field-measured biomass, (ii) compare these estimates with models trained directly on satellite imagery and field data, and (iii) evaluate whether using UAS-derived biomass as an intermediate ground-truth can improve the accuracy of satellite-based biomass predictions. We hypothesize that (i) UAS-based models, due to their higher spatial fidelity, will outperform satellite-based models in estimating cereal rye biomass and (ii) integrating UAS-derived biomass estimates as calibration data will improve satellite-based model performance, offering a scalable pathway for accurate landscape-level cover crop monitoring.

2. Materials and Methods

2.1. Study Sites

This study was conducted on a total of 21 cereal rye (Secale cereale L.) fields located in Northwestern and Central Ohio, with 15 fields in 2021 and 6 fields in 2022 (Figure 1a). The majority of these fields (i.e., 17 of them) are situated in the Northwestern region, within the broader Western Lake Erie basin (WLEB), a region that continues to grapple with soil degradation and nutrient-driven water pollution due to long-standing intensive corn and soybean production [25,26]. In response to these environmental pressures, state and federal agencies have increasingly encouraged the use of winter cover crops, particularly cereal rye, as a strategy to improve soil health and reduce nutrient runoff [27].

2.2. Field Data Collection

Field data collection was conducted between March and May, prior to the termination of cereal rye. Composite biomass samples were collected using a 0.5 × 0.5 m2 quadrat at each of the sampling locations (6 in 2021 and 10 in 2022) (Figure 2a). These locations were selected to represent the range of soil types present in each field, as identified from the SSURGO soil database [28]. The biomass samples were processed in the lab: oven-dried at 55 °C and weighed to determine dry biomass weight. In addition, the average plant height at each sampling location was measured using a meter tape. To geolocate sampling points on the UAS imagery, different strategies were used across the two years. In 2021, visible ground markers were strategically placed in the field to enable visual identification in the UAS images [8] (Figure 2b). In 2022, precise coordinates of sampling locations were recorded using a Trimble R8s global navigation satellite system (GNSS) (Westminster, CO, USA) receiver with real-time kinematic (RTK) correction from the continuously operating reference station (CORS) network (Figure 2c).

2.3. Remote Sensing Data Collection

2.3.1. UAS Image Collection

In 2021, multispectral UAS imagery was captured for each field using a DJI Phantom 4 Multispectral (DJI Innovations, Shenzhen, China) UAS flown at an altitude of approximately 300 ft (~90 m) (Figure 2d). This sensor captures imagery in five spectral bands: Blue (450 ± 16 nm), Green (560 ± 16 nm), Red (650 ± 16 nm), Red-Edge (730 ± 16 nm), and Near-Infrared (840 ± 26 nm). In 2022, a similar process was followed using a WingtraOne Gen One (Wingtra AG, Zurich, Switzerland), a vertical take-off and landing (VTOL) UAS, flown at an altitude of 325 ft (99 m) (Figure 2e). This UAS was equipped with a six-band Micasense Altum sensor (EagleNXT, Wichita, KS, USA), capturing images in six spectral bands: Blue (475 ± 32 nm), Green (560 ± 27 nm), Red (668 ± 14 nm), Red-Edge (717 ± 12 nm), Near-Infrared (842 ± 57 nm), and Thermal (6 μm). All flights were conducted with at least 80% front and 70% side overlap to ensure image quality and spatial coverage. For accurate georeferencing, the DJI Phantom 4 was equipped with a real-time kinematic (RTK) correction module for centimeter-level positional accuracy. For the WingtraOne, images were post-processed in WingtraHub software (version 2.12.1) using the post-processed kinematic (PPK) correction approach, based on a fixed Trimble GNSS base station deployed during each flight.

2.3.2. Satellite Images

To align satellite observations with UAS and field data, we compiled Sentinel-2 satellite imagery acquired within ± six days of ground sampling (Table S1). Sentinel-2, a constellation of two satellites, provides high-resolution multispectral imagery globally at a five-day revisit interval [29]. These images were accessed via Google Earth Engine (GEE) (https://code.earthengine.google.com/), a cloud processing platform [30], using Python (version 3.8.18). From each image, five spectral bands were selected to match those available from the UAS-collected imagery: Blue (496.6 nm), Green (560 nm), Red (664.5 nm), Red-Edge (740.2 nm), and NIR (835.1 nm).
Due to cloud cover and satellite revisit limitations, Sentinel-2 imagery was not consistently available for all fields and data collection dates. Consequently, only 18 of the 21 fields had at least one cloud-free satellite observation aligned with field data collection. While the Blue, Green, Red, and NIR bands are available at 10 m resolution, the Red-Edge band is provided at 20 m. To ensure consistency across all inputs, the Red-Edge band was resampled to 10 m using bilinear interpolation. Further details on imagery selection and acquisition dates are provided in Table S1.
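For illustration, the snippet below sketches how such a Sentinel-2 query and the Red-Edge resampling could be scripted with the Earth Engine Python API. It is a minimal sketch, not the study's exact code: the field geometry, sampling date, and cloud-cover threshold are hypothetical placeholders.

```python
import ee

ee.Initialize()

field = ee.Geometry.Point([-83.5, 41.3]).buffer(500)  # hypothetical field location
sampling = ee.Date('2021-04-15')                      # hypothetical sampling date

# Sentinel-2 surface reflectance within +/- 6 days of ground sampling
collection = (
    ee.ImageCollection('COPERNICUS/S2_SR')
    .filterBounds(field)
    .filterDate(sampling.advance(-6, 'day'), sampling.advance(6, 'day'))
    .filter(ee.Filter.lt('CLOUDY_PIXEL_PERCENTAGE', 10))  # assumed threshold
)

# Bands matching the UAS sensors: Blue, Green, Red, Red-Edge (740 nm), NIR
image = collection.first().select(['B2', 'B3', 'B4', 'B6', 'B8'])

# Resample the 20 m Red-Edge band (B6) to 10 m with bilinear interpolation
red_edge_10m = (
    image.select('B6')
    .resample('bilinear')
    .reproject(crs=image.select('B2').projection(), scale=10)
)
```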

2.4. Data Processing

2.4.1. Point Cloud and Orthomosaic Generation

Multispectral UAS images were processed using the Pix4Dmapper software program (version 4.7.5) [31] to generate 3D point clouds and a multi-band orthomosaic for each field (Figure 3a,b). The 3D point clouds consist of data points in a three-dimensional coordinate system (i.e., X, Y, and Z), with X–Y defining horizontal position and Z indicating elevation. This processing workflow was applied to all 21 fields, including both single-date and multi-date image acquisitions. Figure 3c,d show the stitched orthomosaics, one representing a single band and the other three bands visualized in natural color.

2.4.2. Computation of Vegetation Indices

Using the UAS-derived orthomosaic images, average spectral values for each of the six spectral bands were extracted from the 0.5 × 0.5 m2 sampling areas across all 21 fields, based on their recorded positional information. Using these values, three VIs were calculated: blue–green ratio (BGratio), normalized green–red difference index (NGRDI), and normalized difference red-edge (NDRE) (Table 1). These indices were selected due to their low multi-collinearity, as identified in our previous work [8]. Additionally, all three are ratio-based indices, which helps minimize noise and calibration errors, an important consideration when using data from two different multispectral sensors (DJI Phantom 4 Multispectral and Altum multispectral) across two different years. To ensure consistency across platforms, the same set of VIs was also computed from the corresponding satellite imagery.
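As a concrete reference, the three indices reduce to simple band arithmetic; the sketch below shows one possible implementation over reflectance arrays, following the standard formulations cited in Table 1. The band variable names are generic, and the epsilon guard is our addition rather than part of the cited definitions.

```python
import numpy as np

def compute_vis(blue, green, red, red_edge, nir):
    """Compute the three ratio-based VIs from same-shape reflectance arrays."""
    eps = 1e-10  # guard against division by zero (our addition)
    bg_ratio = blue / (green + eps)                   # blue-green ratio
    ngrdi = (green - red) / (green + red + eps)       # normalized green-red difference index
    ndre = (nir - red_edge) / (nir + red_edge + eps)  # normalized difference red-edge
    return bg_ratio, ngrdi, ndre
```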

2.4.3. Computation of Crop Height

In addition to VIs, crop height was estimated for all fields and incorporated into the UAS-based biomass prediction models. Following image processing in Pix4Dmapper, the resulting point clouds were exported in LiDAR Aerial Survey (LAS) format and analyzed in RStudio (version 2025.05.1) using the 'lidR' package (version 4.0.2) [35]. Unlike prior studies that relied on UAS imagery at two time points—one with bare soil to create a digital terrain model (DTM) and another during crop growth for a digital surface model (DSM) [36,37]—our study lacked baseline (bare-soil) imagery. As such, crop height was estimated from a single UAS image acquisition.
Briefly, to approximate both ground (DTM) and canopy (DSM) elevations, a grid-based approach was used. For each field, two spatial grids were created: a 3 m grid for estimating ground elevation and a 1 m grid for estimating canopy height. The 3 m grid was overlaid on the point cloud data to extract the 1st percentile of elevation values, which was assumed to represent the ground surface. Concurrently, the 99th percentile of elevation values was extracted within the 1 m grid to represent the top-of-canopy points. The larger 3 m window was used specifically to improve the detection of ground points in vegetated areas. The 1st percentile elevations (DTM) were then interpolated to the 1 m grid using inverse distance weighting to align with the canopy grid (DSM). Crop height for each 1 m cell was calculated by subtracting the interpolated DTM from the DSM (Figure 4). We also tested and evaluated crop height estimation using a 1 m grid for both the DSM and DTM. Finally, these gridded crop height data, initially at 1 m spatial resolution, were resampled to 5 cm using bilinear resampling to match the spatial resolution of the original UAS orthomosaics.
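The sketch below illustrates the same percentile-grid logic in Python with numpy/scipy; the study itself used the lidR package in R, so this is an assumed re-implementation of the described workflow, not the original code.

```python
import numpy as np
from scipy.stats import binned_statistic_2d
from scipy.spatial import cKDTree

def grid_percentile(x, y, z, cell, pct, extent):
    """Bin points (x, y, z) into square cells and take a z-percentile per cell."""
    xmin, xmax, ymin, ymax = extent
    stat, _, _, _ = binned_statistic_2d(
        x, y, z,
        statistic=lambda v: np.percentile(v, pct),
        bins=[np.arange(xmin, xmax + cell, cell),
              np.arange(ymin, ymax + cell, cell)],
    )
    return stat  # cells with no points remain NaN

def idw(src_xy, src_z, dst_xy, k=4, power=2.0):
    """Inverse-distance-weighted interpolation of coarse DTM cells to a finer grid."""
    tree = cKDTree(src_xy)
    dist, idx = tree.query(dst_xy, k=k)
    w = 1.0 / np.maximum(dist, 1e-6) ** power
    return np.sum(w * src_z[idx], axis=1) / np.sum(w, axis=1)

# Workflow sketched above:
#   DTM = 1st percentile of Z within 3 m cells (ground surface);
#   DSM = 99th percentile of Z within 1 m cells (canopy top);
#   crop height per 1 m cell = DSM - IDW-interpolated DTM.
```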

2.4.4. Computation of Texture Features for Satellite Images

Since crop height cannot be derived from satellite imagery, we relied on texture features as an alternative source of structural crop information. Texture measures derived from the spatial arrangement of pixel values capture canopy variability and organization, which are closely linked to crop growth and biomass [38]. We used grey level co-occurrence matrix (GLCM) texture measures to extract eight texture features from the NIR band: mean, variance, homogeneity, contrast, entropy, dissimilarity, second moment, and correlation [39]. GLCM-based texture features have been used in prior studies to estimate crop biomass [40,41]. The computation was conducted in RStudio (version 2025.05.1) using the 'glcm' package (version 1.6.5) with a window size of 3 × 3. This window size was selected considering the coarse resolution of the satellite images (i.e., 10 m) and the balance between capturing local texture and avoiding over-smoothing. Figure 5 illustrates the eight second-order GLCM texture measures derived from a Sentinel-2 NIR band for one of the fields.
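For readers working in Python, an equivalent moving-window GLCM computation can be sketched with scikit-image as below; the study used the R 'glcm' package, so the quantization to 32 grey levels and the single 0° offset here are illustrative assumptions. The explicit per-pixel loop is slow but makes the windowed logic transparent.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features_3x3(nir, levels=32):
    """Eight GLCM texture features on a 3x3 moving window over a NIR band."""
    # Quantize reflectance to integer grey levels required by graycomatrix
    q = np.digitize(nir, np.linspace(nir.min(), nir.max(), levels)) - 1
    q = np.clip(q, 0, levels - 1).astype(np.uint8)
    h, w = q.shape
    names = ('mean', 'variance', 'homogeneity', 'contrast',
             'entropy', 'dissimilarity', 'second_moment', 'correlation')
    feats = {n: np.full((h, w), np.nan) for n in names}
    lv = np.arange(levels)
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            win = q[i - 1:i + 2, j - 1:j + 2]
            glcm = graycomatrix(win, distances=[1], angles=[0],
                                levels=levels, symmetric=True, normed=True)
            p = glcm[:, :, 0, 0]
            mu = np.sum(lv[:, None] * p)
            feats['mean'][i, j] = mu
            feats['variance'][i, j] = np.sum(((lv[:, None] - mu) ** 2) * p)
            feats['entropy'][i, j] = -np.sum(p[p > 0] * np.log(p[p > 0]))
            feats['homogeneity'][i, j] = graycoprops(glcm, 'homogeneity')[0, 0]
            feats['contrast'][i, j] = graycoprops(glcm, 'contrast')[0, 0]
            feats['dissimilarity'][i, j] = graycoprops(glcm, 'dissimilarity')[0, 0]
            feats['second_moment'][i, j] = graycoprops(glcm, 'ASM')[0, 0]
            feats['correlation'][i, j] = graycoprops(glcm, 'correlation')[0, 0]
    return feats
```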

2.5. Machine Learning Models to Estimate Cereal Rye Biomass

To estimate cereal rye biomass, we used two regression models—random forest (RF) and extreme gradient boosting (XGBoost)—both of which demonstrated superior performance in our previous work [8]. First, the UAS- and satellite-based datasets were split into training and testing sets using a 70/30 ratio, with grouped sampling applied to ensure that samples from the same field were not shared across both sets (Figures S1–S3).
Model implementation was carried out in the Python-based scikit-learn platform (version 1.3.2) using 'RandomForestRegressor' and 'XGBRegressor'. Hyperparameter tuning was performed using Bayesian optimization via the 'hyperopt' library (version 0.2.7) [42], in combination with nested cross-validation to ensure robust and unbiased performance evaluation (Table S2). Visualizations of the parameter space used for optimization are provided in the Supplementary Materials (Figures S5 and S6). The nested cross-validation framework consisted of a 3-fold inner loop for hyperparameter tuning and a 5-fold outer loop for model performance assessment [43]. Grouped cross-validation was used in both loops to prevent data from the same field from appearing in both training and validation subsets, thereby avoiding spatial data leakage.
Bayesian optimization was selected over traditional grid and random search methods for its efficiency in exploring the hyperparameter space by leveraging prior evaluations to accelerate convergence toward optimal configurations. Once the models were tuned and trained, they were evaluated on the test dataset using three metrics: root mean squared error (RMSE), mean absolute error (MAE), and coefficient of determination (R2).
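A minimal sketch of this training setup is shown below, combining the grouped 70/30 split, a grouped inner cross-validation loop, and hyperopt-based Bayesian tuning. The search space, the inline fold count, and max_evals are illustrative assumptions, not the study's exact configuration (see Tables S2 and S3), and the 5-fold outer loop is omitted for brevity.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit, GroupKFold, cross_val_score
from xgboost import XGBRegressor
from hyperopt import fmin, tpe, hp, Trials

# X, y, groups: numpy arrays of predictors, biomass, and field IDs
def tune_xgb(X, y, groups):
    # 70/30 split that keeps all samples from a field on the same side
    tr, _ = next(GroupShuffleSplit(n_splits=1, test_size=0.3,
                                   random_state=42).split(X, y, groups))

    space = {  # illustrative search space, not the study's exact ranges
        'n_estimators': hp.uniformint('n_estimators', 100, 1000),
        'max_depth': hp.uniformint('max_depth', 2, 10),
        'learning_rate': hp.loguniform('learning_rate', np.log(0.01), np.log(0.3)),
    }

    def objective(params):
        model = XGBRegressor(n_estimators=int(params['n_estimators']),
                             max_depth=int(params['max_depth']),
                             learning_rate=params['learning_rate'])
        # grouped 3-fold inner CV for tuning, preventing field-level leakage
        scores = cross_val_score(model, X[tr], y[tr], groups=groups[tr],
                                 cv=GroupKFold(n_splits=3),
                                 scoring='neg_root_mean_squared_error')
        return -scores.mean()

    return fmin(objective, space, algo=tpe.suggest, max_evals=50, trials=Trials())
```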

2.5.1. Model Calibration

To predict cereal rye biomass, the models were calibrated using three distinct approaches, based on data source and spatial scale of the response and independent variables: (i) UAS-based model, (ii) satellite-based model, and (iii) UAS–satellite synergistic model (Figure 6).
UAS-Based Model
The first approach (Model 1 in Figure 6) utilized UAS data, aligning spatially with the 0.5 × 0.5 m2 field sampling plots. Field-measured biomass served as the response variable, while predictor variables included three VIs derived from UAS imagery and UAS-estimated crop height. Incorporating crop height into the UAS-based model was expected to improve model accuracy by capturing structural characteristics of the cereal rye canopy. Following training, tuning, and testing of the models, the best-performing model configuration was selected to generate high-resolution (5 cm) cereal rye biomass maps across all UAS-surveyed fields. We emphasize that this model (Model 1a) included all fields where UAS images were available (i.e., all 21 fields), and all spatial estimations were based on it. We also re-trained and tested the UAS-based model on the subset of fields where satellite images were available to enable a direct comparison (Model 1b).
Figure 6. Overview of the modeling workflow, including calibration steps based on UAS and satellite imagery (first panel), and the process of developing the UAS–satellite synergistic model (second panel), where high-resolution UAS-derived cereal rye biomass estimates were aggregated to 10 m to match the spatial resolution of satellite imagery and used as surrogates for ground-truth data.
Satellite-Based Model
The second approach (Model 2 in Figure 6) involved training models using VIs and GLCM texture features derived from Sentinel-2 imagery at 10 m resolution. These were paired with field-measured biomass from the 0.5 × 0.5 m2 sampling areas. However, due to cloud cover and limited satellite availability, the fields included in this model were not fully consistent with those used in the UAS-based analysis. Nonetheless, we included the same group of test fields that were considered in the UAS-based model. For a fair comparison, we evaluated the satellite-based model against the UAS-based model trained on the same set of fields where satellite images were available (i.e., Model 1b).
UAS–Satellite Synergistic Model
The third approach involved a hybrid modeling strategy that leveraged UAS-derived biomass estimates (generated with Model 1a) to train models using satellite-based predictors (Model 3 in Figure 6). This method was designed to address the spatial mismatch between fine-scale field measurements and the coarser resolution of Sentinel-2 imagery: a 0.5 × 0.5 m2 sampling area contains approximately 100 UAS pixels yet covers only a small fraction of a single 10 × 10 m satellite pixel (Figure S4). To resolve this, high-resolution UAS-based biomass predictions were aggregated to 10 m resolution at the field sampling locations, aligning them spatially with satellite-derived VIs. These aggregated biomass values were then used as a surrogate for field-measured observations, enabling model calibration at a consistent spatial scale. By integrating the spatial detail of UAS data with the broader coverage of satellite imagery, this synergistic approach aimed to enhance the accuracy and scalability of landscape-level biomass estimation. The final tuned hyperparameter values for all three modeling scenarios are provided in Table S3, while the cross-validation results from training are provided in Figure S7. To understand the contribution of each feature to the predictions, relative importance was assessed using SHapley Additive exPlanations (SHAP) values [44].
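As a sketch of the aggregation step, a UAS biomass raster can be block-averaged to the 10 m Sentinel-2 grid; the example below uses rasterio's decimated read with average resampling. The library choice and file name are our assumptions, since the text does not specify the tooling used for this step.

```python
import rasterio
from rasterio.enums import Resampling

# 'uas_biomass_5cm.tif' is a hypothetical file name for a Model 1a biomass map
with rasterio.open('uas_biomass_5cm.tif') as src:
    factor = src.res[0] / 10.0  # e.g., 0.05 m cells -> 10 m cells
    biomass_10m = src.read(
        1,
        out_shape=(int(src.height * factor), int(src.width * factor)),
        resampling=Resampling.average,  # mean biomass within each 10 m cell
    )
```

Each aggregated cell co-located with a field sampling point then supplies the surrogate response for the corresponding satellite pixel.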

3. Results

3.1. UAS-Based Crop Height

Figure 7 presents a comparative analysis of crop height measurements derived from the UAS-based 3D point clouds, processed using two different ground-point filtering methods: a 1 m grid and a 3 m grid. The results from the 1 m grid approach exhibit less reliable crop height estimates, as indicated by the large scatter of data points and weaker correlation (r = 0.23), alongside higher error values (MAE = 17.55 cm and RMSE = 29.69 cm) (Figure 7a). In contrast, filtering ground points with a 3 m grid results in more reliable estimation, showing a moderate correlation (r = 0.68) and reduced error metrics (MAE = 13.08 cm and RMSE = 20.21 cm) (Figure 7b). Further analysis reveals that the 3 m grid approach yields lower average bias and RMSE across the fields (Figure 7c,d), highlighting the advantage of a larger grid size for accurately filtering ground points within a crop field and, in turn, improving the accuracy and consistency of UAS-derived crop height estimates. Thus, the UAS-based model relied on crop height estimates from the 3 m grid.

3.2. Performance of UAS-Based Models

When VIs and crop height derived from UAS-based multispectral images were used to train models with field-measured biomass (Model 1a), both XGBoost and RF demonstrated comparable performance (Figure 8). The scatter plots illustrate estimated versus field-measured biomass values for each model across multiple field sites, with data points colored by field and the red dashed line representing the 1:1 reference line. On the independent test dataset, the XGBoost model achieved an R2 of 0.67 compared to 0.63 for RF, with an MAE of 47.57 g m−2 and RMSE of 66.89 g m−2 for XGBoost versus an MAE of 47.07 g m−2 and RMSE of 70.69 g m−2 for RF. While MAE values were nearly identical, the improvement in R2 and the reduction in RMSE suggest that XGBoost had a modest advantage in overall prediction accuracy and error reduction. The clustering of data points near the 1:1 line for both models indicates generally good agreement with field-measured values. However, RF estimates showed wider scatter, particularly for biomass below 200 g m−2, suggesting greater variability and reduced precision compared to XGBoost. Based on the SHAP analysis, the most impactful features were, in order, crop height > NDRE > BGratio > NGRDI for XGBoost and crop height > NGRDI > BGratio > NDRE for RF (Figure 9). While crop height contributed consistently across both models, NDRE followed by BGratio contributed more than NGRDI in the XGBoost model. In contrast, all three VIs showed similar levels of contribution in the RF model. This indicates a greater ability of XGBoost to exploit the information contained within the spectral VIs. Based on its superior performance, the XGBoost model was selected to generate spatially explicit maps of cereal rye biomass across all UAS-surveyed fields. A second version of the UAS-based model (Model 1b), trained on the same group of fields for which satellite images were available, resulted in an MAE of 33.21 g m−2, RMSE of 55.78 g m−2, and R2 of 0.83 for XGBoost and an MAE of 35.55 g m−2, RMSE of 61.63 g m−2, and R2 of 0.79 for RF (Figure S8).
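For reference, feature attributions of the kind reported in Figure 9 can be reproduced with the shap library's TreeExplainer. The sketch below assumes a trained XGBoost model and a held-out predictor matrix; the column names are illustrative, not taken from the study's code.

```python
import shap

# model: trained XGBRegressor; X_test: DataFrame with columns such as
# ['crop_height', 'NDRE', 'BGratio', 'NGRDI'] (names assumed for illustration)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)
shap.summary_plot(shap_values, X_test)  # beeswarm of per-feature impact on predictions
```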

3.3. Performance of Satellite-Based Models

Both the XGBoost and RF models demonstrated comparable predictive accuracy under the satellite-based approach. The XGBoost model had an MAE of 46.24 g m−2, RMSE of 83.09 g m−2, and R2 of 0.61; the RF model had an MAE of 49.91 g m−2, RMSE of 80.46 g m−2, and R2 of 0.64 (Figure 10). The scatter plots indicate that both models exhibit considerable dispersion, particularly at lower biomass values, resulting in limited overall prediction accuracy and large errors. This is likely due to the inherent discrepancy between localized field measurements and the broader footprint of satellite pixels; the differing spatial scales make the datasets difficult to reconcile, leading to inaccuracies in the predictions. When compared to the UAS-based model (Model 1b), trained and tested on the same set of fields, the satellite-based models showed a substantial increase in error: RMSE increased by 49% for XGBoost and 31% for RF. This highlights the greater level of detail and spatial resolution available from UAS data, which enhances model accuracy relative to satellite-based observations. For both satellite-based XGBoost and RF models, the most important variables were NGRDI, followed by BGratio, NDRE, and the texture features (Figures S9 and S10).

3.4. UAS–Satellite Synergistic Models

The models trained using UAS-derived biomass as a surrogate for ground-truth data demonstrated a substantial improvement in agreement between reference and estimated biomass for both XGBoost and RF (Figure 11). The XGBoost model achieved an R2 of 0.67, MAE of 30.68 g m−2, and RMSE of 59.79 g m−2. The RF model also showed improved performance, with an R2 of 0.65, MAE of 32.04 g m−2, and RMSE of 61.45 g m−2. Although direct one-to-one comparisons were not possible due to differences in reference data, the results indicate a marked improvement in alignment between the UAS-based reference estimates and the satellite-based predictions. This suggests that high-resolution cereal rye biomass estimates from UAS-based machine learning models can serve as effective proxies for calibrating satellite data. This approach not only enhances model performance but also holds strong potential for scaling biomass estimation across broader landscapes. The SHAP analysis revealed feature importance in the order NGRDI > BGratio > NDRE > texture measures for XGBoost and NGRDI > BGratio > GLCM_mean > NDRE > remaining texture measures for RF (Figure 12). NGRDI and BGratio accounted for a larger share of the contribution to model predictions than the other features.

3.5. Spatial Variability in Estimated Biomass Maps

The models showing superior performance in each of the three modeling approaches were selected to generate spatially explicit maps of cereal rye biomass (i.e., XGBoost in UAS-based and UAS–satellite synergistic models and RF in satellite-based models). The spatially explicit biomass estimates from the UAS-based model captured fine-grained variation that was not discernible in either the satellite-based or the UAS–satellite synergistic model. This demonstrates the key advantage of UAS imagery, which enables a more detailed and granular assessment of biomass distribution across agricultural fields that coarser resolution satellite imagery tends to overlook.
Figure 13 shows biomass predictions from the three models across two contrasting fields: F21-4, characterized by high biomass, and F21-12, characterized by low biomass. In field F21-4 (top row), the UAS-based model revealed substantial within-field variation, clearly delineating areas of high and moderate biomass. In comparison, the satellite-based model (Model 2) consistently underestimated biomass in the densest areas, likely due to spatial averaging within the 10 m resolution pixels. The UAS–satellite synergistic model (Model 3) showed improved alignment with the UAS-based reference, both in spatial pattern and magnitude, suggesting that integrating UAS-derived biomass helped compensate for the underestimation inherent in satellite-only models. In the case of field F21-12 (bottom row), which had generally low biomass, the satellite-based model also tended to underestimate biomass in regions with sparse vegetation cover, likely due to background reflectance effects and the coarser spatial resolution blending non-vegetated areas with patches of cover crops. The synergistic model again mitigated this bias, providing a more accurate spatial representation that better resembled the fine-scale detail captured by the UAS-based model.
These findings underscore the limitations of satellite imagery in accurately capturing the full extent of within-field biomass variability, largely due to the mismatch between satellite pixel resolution and the fine spatial heterogeneity present in the field. However, by integrating UAS-derived estimates as a spatially aligned reference, the synergistic model bridges this gap, demonstrating the value of combining high-resolution UAS data with the broader coverage and repeated frequency of satellite imagery.

4. Discussion

4.1. Performance of UAS- and Satellite-Based Models

In this study, we evaluated the performance of biomass estimation models using two distinct RS approaches: UAS-based and satellite-based models, each trained with their respective imagery and field-measured biomass samples. UAS-based models used crop height as well as VIs (NGRDI, BGratio, and NDRE) as predictors, while satellite-based models used the three VIs alongside the GLCM-based texture features to compensate for the lack of crop height. Our results demonstrated better agreement between estimated and field-measured biomass for UAS-based models than for satellite-based models, which aligns with prior research comparing UAS data with coarser resolution satellite imagery for biomass estimation [21,45,46]. For instance, Doughty et al. [21] reported that NDVI derived from UAS imagery showed a stronger correlation with above-ground biomass in coastal wetland vegetation in southern California (R2 = 0.40 and RMSE = 534.6 g m−2) than NDVI from Landsat 8 satellite imagery (R2 = 0.26 and RMSE = 596.8 g m−2). Similarly, Alvarez-Mendoza et al. [46] found that UAS-based models estimated above-ground biomass in Brachiaria pasture in Colombia more accurately (R2 = 0.70–0.76) than models based on Sentinel-2 imagery (R2 = 0.45–0.60).
The improved performance of UAS-based models can be largely attributed to the much finer spatial resolution of UAS imagery, which allows for more accurate representation of spatial variability within agricultural fields and better alignment with field-based sampling areas. Furthermore, the ability to derive high-resolution crop height significantly enhanced predictability, as shown by its top ranking in feature importance. In contrast, the coarser spatial resolution of satellite data tends to oversimplify field-level heterogeneity and introduces considerable uncertainty due to the mismatch in scale between imagery and ground-sampling units [47]. Our findings show up to a 49% increase in error for satellite observations compared to UAS-derived observations, even when training and testing were conducted on the same sets of fields. It should be noted that these differences could also reflect sensor capabilities as well as the added value of crop height information. Nonetheless, the issue of spatial mismatch has been well documented. For instance, Wessels et al. [48] observed a weak NDVI–biomass relationship (R2 = 0.36), which was largely attributed to the discrepancy between small sampling plots (50 × 60 m, 3000 m2) and large AVHRR satellite pixels (1000 × 1000 m, 1,000,000 m2), a scale mismatch of roughly 333:1. Similarly, in our study, biomass samples were collected from 0.5 × 0.5 m quadrats (0.25 m2), whereas satellite-derived predictors were based on 10 × 10 m pixels (100 m2), resulting in a 400:1 scale difference. Such disparity leads to signal dilution and increased error when attempting to correlate field-measured biomass with satellite-derived spectral information.
In contrast, the UAS imagery used in our study had a spatial resolution of 5 cm per pixel, enabling detailed capture of field variability and strong correspondence with the ground-truth data. This high-resolution data greatly reduced spatial uncertainty and improved model accuracy by aligning more closely with the scale of biomass sampling. Our findings underscore the value of UAS-based data in precision agriculture applications, especially where accurate, fine-scale biomass estimation is critical.

4.2. Leveraging High-Resolution UAS Data to Improve Satellite-Based Biomass Predictions

Building on the evaluation of UAS-based and satellite-based models, we further explored a synergistic approach by using high-resolution UAS-derived biomass estimates, aggregated to 10 m, as surrogate ground-truth data for calibrating models based on satellite-derived features. Although the satellite-only model was trained using field-measured biomass and the synergistic model was trained using UAS-estimated biomass, the UAS–satellite approach demonstrated improved alignment with reference data and higher predictive accuracy. This improvement is likely attributable to the consistent spatial scale between the response and predictor variables, which reduces the spatial mismatch commonly observed when pairing fine-scale field measurements with coarser satellite imagery. The error decomposition highlights how scale mismatch contributes to prediction error: for the best model, the direct field-to-satellite approach resulted in an RMSE of ~83 g m−2, whereas the field-to-UAS model at finer resolution reduced the error to ~56 g m−2. When UAS estimates were aggregated to the satellite pixel scale, the synergistic model achieved an RMSE of ~60 g m−2, suggesting that much of the improvement stems from reducing the variance introduced by the scale difference between field plots and satellite pixels.
Similar findings have been reported in previous research that employed UAS-derived data to enhance satellite-based biomass estimation across diverse ecosystems. For instance, Mao et al. [22] used RGB imagery from UASs, combined with structure-from-motion (SfM) techniques, to estimate above-ground biomass at 2 cm resolution in desert shrub communities in Inner Mongolia, China. These estimates were then aggregated to match the resolution of various satellite platforms (PlanetScope at 3 m, Sentinel-2 at 10 m, and Landsat 8 at 30 m) and subsequently used to calibrate models. The resulting satellite-based models showed significantly improved performance (R2 = 0.62–0.93) compared to earlier efforts that relied solely on satellite imagery and ground survey data (e.g., R2 = 0.58 from [49]; R2 = 0.45–0.50 from [50]).
Similarly, Liu et al. [51] demonstrated the value of using UAS-based shrub biomass estimates at 6 cm resolution to calibrate satellite observations (Landsat 8/9, Sentinel-2, and Sentinel-1 at 10 to 30 m) in the Helan Mountain region of China. Their calibrated satellite-based models achieved an R2 of 0.62, outperforming traditional models trained solely on satellite imagery with field-sampled biomass (R2 = 0.43–0.58) [49]. Beyond shrub biomass estimation, Huang et al. [52] showed that aggregated leaf area index (LAI) estimates from medium-resolution Landsat ETM+ imagery (30 m spatial resolution), used as reference LAI, worked well (R2 = 0.61) for calibrating coarse-resolution MODIS imagery (1 km) to generate LAI maps over a larger study area.
These results collectively highlight the potential of UAS-derived data to serve as an effective intermediary between fine-scale field observations and broader-scale satellite imagery. From a practical standpoint, the implications of this integrated modeling approach are significant. By leveraging the spatial detail of UAS imagery and the large-area coverage and frequency of satellite platforms, agricultural stakeholders can gain timely, cost-effective insights into cover crop performance and field variability at scale. This enables more informed decision-making around resource allocation, cover crop management, and conservation planning, ultimately supporting sustainable and economically viable agricultural practices across broad regions.

4.3. Opportunities, Limitations, and Future Works

A key advantage of the UAS–satellite synergistic approach as used in this study is its potential to dramatically expand training datasets beyond field-measured ground-truth data. Our UAS–satellite synergistic model utilized only 169 data samples from 18 fields (101 for training, 68 for testing), consistent with the corresponding field measurements. While machine learning models benefit from larger and more diverse datasets, field data collection across broad agricultural landscapes remains expensive and logistically challenging. By using UAS-derived biomass estimates as surrogate ground truth in the training data for satellite-based models, we can generate substantially more reference points than would be feasible through traditional ground sampling alone. Our study demonstrates that high-resolution UAS imagery not only improves estimation accuracy but also serves as a scalable alternative to traditional field measurements, supporting robust model development. As access to satellite imagery from multiple platforms continues to grow, such as PlanetScope (3 m, 1–2-day revisit), Sentinel-2 (10 m, 5-day revisit), and Landsat 8/9 (30 m, 8-day revisit), the ability to pair these data with UAS-derived biomass opens new avenues for high-frequency, large-scale mapping of crop growth and biophysical properties.
Despite these promising outcomes, several limitations should be acknowledged. One major challenge was the temporal misalignment between satellite image acquisition and field data collection. We did not fully consider the satellite overpass schedules during the planning phase for data collection, and persistent cloud cover further restricted data availability. As a result, suitable satellite imagery was unavailable for 27 out of 54 sampling events. In many cases, we relied on imagery acquired up to six days before UAS flights or field sampling, potentially introducing temporal mismatches that could affect model accuracy. To mitigate this issue, future research should incorporate synchronized data collection protocols that align field sampling, UAS flights, and satellite overpasses.
In this study, we focused on a limited set of VIs—NGRDI, BGratio, and NDRE—across both UAS and satellite models to establish a baseline framework for biomass estimation. While these indices capture key aspects of canopy greenness and chlorophyll content, future studies can incorporate additional spectral features and VIs. Despite the limited number of VIs used, the findings support that UAS-derived data can enhance satellite-based biomass predictions through a synergistic approach. The focus of this study was on understanding how scale differences between fine-resolution field measurements, UAS observations, and satellite pixels influence model performance, rather than on optimizing the number or combination of spectral features. The baseline set of indices demonstrates that aggregating UAS-derived biomass to satellite resolution reduces scale-induced errors and improves predictive alignment. We also did not include synthetic aperture radar (SAR) data (e.g., Sentinel-1) because acquisitions matching the dates of the Sentinel-2 imagery could not be obtained consistently. Given the number of fields in our study, we aimed to minimize temporal mismatches and the potential bias they could introduce. Future studies can incorporate additional spectral and textural features, as well as backscatter information from temporally aligned SAR observations, to further enhance model performance.
Although the synergistic satellite–UAS model achieved enhanced alignment between reference and predicted data, this result should be interpreted with caution due to methodological limitations. The improved accuracy likely arises from the spatial alignment between the UAS-based biomass used as training data and the satellite observations, creating an implicit relationship that can inflate agreement metrics. In practice, the satellite model is evaluated against the UAS-derived data rather than against fully independent field biomass measurements, owing to the lack of sampling areas consistent with satellite image pixels. This means the reported accuracy metrics represent the best-case alignment between predictors and response and may overstate true predictive accuracy. In our results, we tried to compensate for this by showing how error increases when satellite-based models are trained and tested relative to UAS-based models on the same group of fields. To establish robustness, future validation should rely on independent, pixel-scale ground biomass data that match the resolution of satellite imagery, thereby providing a more reliable benchmark. Similarly, it should be noted that explicit year-wise or sensor-wise generalization—such as training the model on data from one year or sensor and testing on another—was not evaluated in the present study. Our analysis focused on leveraging the full diversity of available data across years and sensors to assess overall predictive performance. Evaluating model transferability across years and sensor types represents an important avenue for future research, which could provide further insight into the robustness and generalizability of the models under varying temporal and sensor conditions. While our analyses focused on widely reported summary metrics (R2, MAE, RMSE) to benchmark model performance, we recognize that incorporating confidence intervals and statistical tests would provide additional rigor in assessing the significance of differences between models. Future studies could adopt resampling-based approaches (e.g., bootstrapping or extended cross-validation) to generate uncertainty bounds for evaluation metrics and apply paired statistical tests to compare prediction errors across models. Such evaluations would help formally quantify performance differences and strengthen the robustness of comparative assessments.
Despite these limitations, the synergistic framework remains valuable as a proof of concept for bridging the persistent scale gap between plot-level biomass measurements and satellite observations. By leveraging UAS-derived estimates as an intermediate step, the approach aligns training targets with the spatial resolution of satellite imagery, thereby reducing scale-induced noise and enabling satellites to capture field-level variability more consistently. The approach is not limited to cereal rye and could be adapted for other crops, including cash crops, legumes, or other cover crop species. Applying it across diverse crop types could support regional monitoring, improve yield or biomass mapping, and inform management decisions, provided that future studies incorporate crop-specific calibration.

5. Conclusions

This study focused on improving large-scale cover crop biomass estimation by combining UAS and satellite RS, specifically addressing the spatial mismatch between satellite image pixels and field measurements. Our methodology comprised three distinct steps: (1) generating high-resolution cereal rye biomass maps using machine learning models trained on VIs and crop height data from UASs, (2) developing machine learning models trained on VIs and texture features extracted from coarse-resolution satellite imagery, and (3) calibrating satellite-based models using UAS-derived biomass estimates as surrogate ground-truth data.
Our findings demonstrated the superior performance of UAS-based models (RMSE of 55.78 g m−2 for XGBoost and 61.63 g m−2 for RF) compared to satellite-only models (RMSE of 83.09 g m−2 for XGBoost and 80.46 g m−2 for RF), owing to the fine spatial resolution and strong alignment with the scale of field sampling. This corresponded to a 49% and 31% increase in error for XGBoost and RF, respectively. Importantly, by integrating UAS-derived biomass into the satellite-based model, the UAS–satellite synergistic approach achieved improved alignment, with RMSEs of 59.79 g m−2 for XGBoost and 61.45 g m−2 for RF. Furthermore, the within-field variability of cereal rye biomass captured by the UAS-based model was closely reflected in the synergistic model, highlighting its effectiveness in preserving within-field variability that was largely absent in satellite-only predictions. Collectively, these findings underscore the value of UAS imagery as a scalable and cost-effective intermediary that bridges the resolution gap between ground-based sampling and satellite observations. By enabling more accurate calibration of satellite models, UAS data extend the utility of satellite RS for regional-scale monitoring of cover crop biomass.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/rs17183163/s1, Figure S1: Data distribution for training and testing sets resulting from grouped sampling based on independent fields for the UAS-based model; Figure S2: Data distribution for training and testing sets resulting from grouped sampling based on independent fields for the satellite-based model; Figure S3: Data distribution for training and testing sets resulting from grouped sampling based on independent fields for the UAS–satellite synergistic model; Figure S4: Number of pixels in UAS imagery vs. Sentinel-2 satellite imagery relative to the size of the sampling location; Figure S5: Visualization of the hyperparameter search space using Hyperopt for the XGBoost model; Figure S6: Visualization of the hyperparameter search space using Hyperopt for the RF model; Figure S7: Box plots showing the distribution of root mean square error during 5-fold cross-validation at the training stage for (a) UAS-based models, (b) satellite-based models, and (c) UAS–satellite synergistic models; Figure S8: Scatterplots showing UAS-estimated cereal rye biomass versus field-measured biomass considering only fields where satellite images were available; Figure S9: Impact of features on output based on SHapley Additive exPlanations (SHAP) analysis for the satellite-based RF model; Figure S10: Impact of features on output based on SHapley Additive exPlanations (SHAP) analysis for the UAS–satellite synergistic XGBoost model; Table S1: Temporal alignment between field data collection and satellite image acquisition; Table S2: The range of hyperparameters used in hyperparameter tuning of the XGBoost and random forest models. Hyperparameter tuning was conducted using the Bayesian optimization technique with the hyperopt library in Python following 10-fold cross-validation; Table S3: Values of tuned hyperparameters for the three model scenarios for both XGBoost and random forest.

Author Contributions

K.K.: Conceptualization, Data collection, Formal analysis, Methodology, Software, Validation, Visualization, Writing—original draft, Writing—review and editing. S.K.: Conceptualization, Funding acquisition, Investigation, Methodology, Project administration, Supervision, Writing—review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by funds from OSU L&L Grant # PG107271, SI Grant # PG107338, IGP Grant # PG 2022017, Ohio Soybean Council Grant # GR123740, USDA-AFRI Grant # GR130726, and Hatch Project # NC1195.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Acknowledgments

We want to thank Matthew Romanko, Mark Bolin, Boden Fisher, Brigitte Moneymaker, Gaoshoutong Si, Abha Bhattarai, and Neha Joshi for their support during field data collection.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Baylis, K.; Coppess, J.; Gramig, B.M.; Sachdeva, P. Agri-Environmental Programs in the United States and Canada. Rev. Environ. Econ. Policy 2022, 16, 83–104. [Google Scholar] [CrossRef]
  2. Burnett, E.; Wilson, R.S.; Heeren, A.; Martin, J. Farmer Adoption of Cover Crops in the Western Lake Erie Basin. J. Soil Water Conserv. 2018, 73, 143–155. [Google Scholar] [CrossRef]
  3. Finney, D.M.; White, C.M.; Kaye, J.P. Biomass Production and Carbon/Nitrogen Ratio Influence Ecosystem Services from Cover Crop Mixtures. Agron. J. 2016, 108, 39–52. [Google Scholar] [CrossRef]
  4. Daryanto, S.; Fu, B.; Wang, L.; Jacinthe, P.-A.; Zhao, W. Quantitative Synthesis on the Ecosystem Services of Cover Crops. Earth-Sci. Rev. 2018, 185, 357–373. [Google Scholar] [CrossRef]
  5. McClelland, S.C.; Paustian, K.; Williams, S.; Schipanski, M.E. Modeling Cover Crop Biomass Production and Related Emissions to Improve Farm-Scale Decision-Support Tools. Agric. Syst. 2021, 191, 103151. [Google Scholar] [CrossRef]
  6. Yuan, M.; Burjel, J.C.; Isermann, J.; Goeser, N.J.; Pittelkow, C.M. Unmanned Aerial Vehicle-Based Assessment of Cover Crop Biomass and Nitrogen Uptake Variability. J. Soil Water Conserv. 2019, 74, 350–359. [Google Scholar] [CrossRef]
  7. Prabhakara, K.; Dean Hively, W.; McCarty, G.W. Evaluating the Relationship between Biomass, Percent Groundcover and Remote Sensing Indices across Six Winter Cover Crop Fields in Maryland, United States. Int. J. Appl. Earth Obs. Geoinf. 2015, 39, 88–102. [Google Scholar] [CrossRef]
  8. KC, K.; Romanko, M.; Perrault, A.; Khanal, S. On-Farm Cereal Rye Biomass Estimation Using Machine Learning on Images from an Unmanned Aerial System. Precis. Agric. 2024, 25, 2198–2225. [Google Scholar] [CrossRef]
  9. Hively, W.D.; Lang, M.; McCarty, G.W.; Keppler, J.; Sadeghi, A.; McConnell, L.L. Using Satellite Remote Sensing to Estimate Winter Cover Crop Nutrient Uptake Efficiency. J. Soil Water Conserv. 2009, 64, 303–313. [Google Scholar] [CrossRef]
  10. Jennewein, J.S.; Lamb, B.T.; Hively, W.D.; Thieme, A.; Thapa, R.; Goldsmith, A.; Mirsky, S.B. Integration of Satellite-Based Optical and Synthetic Aperture Radar Imagery to Estimate Winter Cover Crop Performance in Cereal Grasses. Remote Sens. 2022, 14, 2077. [Google Scholar] [CrossRef]
  11. Liang, S.; Fang, H.; Chen, M.; Shuey, C.J.; Walthall, C.; Daughtry, C.; Morisette, J.; Schaaf, C.; Strahler, A. Validating MODIS Land Surface Reflectance and Albedo Products: Methods and Preliminary Results. Remote Sens. Environ. 2002, 83, 149–162. [Google Scholar] [CrossRef]
  12. Wu, X.; Xiao, Q.; Wen, J.; You, D.; Hueni, A. Advances in Quantitative Remote Sensing Product Validation: Overview and Current Status. Earth-Sci. Rev. 2019, 196, 102875. [Google Scholar] [CrossRef]
  13. Hufkens, K.; Bogaert, J.; Dong, Q.H.; Lu, L.; Huang, C.L.; Ma, M.G.; Che, T.; Li, X.; Veroustraete, F.; Ceulemans, R. Impacts and Uncertainties of Upscaling of Remote-Sensing Data Validation for a Semi-Arid Woodland. J. Arid Environ. 2008, 72, 1490–1505. [Google Scholar] [CrossRef]
14. Lu, D.; Chen, Q.; Wang, G.; Liu, L.; Li, G.; Moran, E. A Survey of Remote Sensing-Based Aboveground Biomass Estimation Methods in Forest Ecosystems. Int. J. Digit. Earth 2016, 9, 63–105.
15. Sharma, P.; Leigh, L.; Chang, J.; Maimaitijiang, M. Above-Ground Biomass Estimation in Oats Using UAV Remote Sensing and Machine Learning. Sensors 2022, 22, 601.
16. Wang, F.; Yang, M.; Ma, L.; Zhang, T.; Qin, W.; Li, W.; Zhang, Y.; Sun, Z.; Wang, Z.; Li, F.; et al. Estimation of Above-Ground Biomass of Winter Wheat Based on Consumer-Grade Multi-Spectral UAV. Remote Sens. 2022, 14, 1251.
17. Hodgson, M.E.; Sella-Villa, D. State-Level Statutes Governing Unmanned Aerial Vehicle Use in Academic Research in the United States. Int. J. Remote Sens. 2021, 42, 5366–5395.
18. Cracknell, A.P. UAVs: Regulations and Law Enforcement. Int. J. Remote Sens. 2017, 38, 3054–3067.
19. Li, N.; Liu, X.; Yu, B.; Li, L.; Xu, J.; Tan, Q. Study on the Environmental Adaptability of Lithium-Ion Battery Powered UAV under Extreme Temperature Conditions. Energy 2021, 219, 119481.
20. Xiao, C.; Wang, B.; Zhao, D.; Wang, C. Comprehensive Investigation on Lithium Batteries for Electric and Hybrid-Electric Unmanned Aerial Vehicle Applications. Therm. Sci. Eng. Prog. 2023, 38, 101677.
21. Doughty, C.L.; Ambrose, R.F.; Okin, G.S.; Cavanaugh, K.C. Characterizing Spatial Variability in Coastal Wetland Biomass across Multiple Scales Using UAV and Satellite Imagery. Remote Sens. Ecol. Conserv. 2021, 7, 411–429.
22. Mao, P.; Ding, J.; Jiang, B.; Qin, L.; Qiu, G.Y. How Can UAV Bridge the Gap between Ground and Satellite Observations for Quantifying the Biomass of Desert Shrub Community? ISPRS J. Photogramm. Remote Sens. 2022, 192, 361–376.
23. Kharel, T.P.; Bhandari, A.B.; Mubvumba, P.; Tyler, H.L.; Fletcher, R.S.; Reddy, K.N. Mixed-Species Cover Crop Biomass Estimation Using Planet Imagery. Sensors 2023, 23, 1541.
24. Xia, Y.; Guan, K.; Copenhaver, K.; Wander, M. Estimating Cover Crop Biomass Nitrogen Credits with Sentinel-2 Imagery and Sites Covariates. Agron. J. 2021, 113, 1084–1101.
25. Michalak, A.M.; Anderson, E.J.; Beletsky, D.; Boland, S.; Bosch, N.S.; Bridgeman, T.B.; Chaffin, J.D.; Cho, K.; Confesor, R.; Daloğlu, I.; et al. Record-Setting Algal Bloom in Lake Erie Caused by Agricultural and Meteorological Trends Consistent with Expected Future Conditions. Proc. Natl. Acad. Sci. USA 2013, 110, 6448–6452.
26. Berry, M.A.; Davis, T.W.; Cory, R.M.; Duhaime, M.B.; Johengen, T.H.; Kling, G.W.; Marino, J.A.; Den Uyl, P.A.; Gossiaux, D.; Dick, G.J.; et al. Cyanobacterial Harmful Algal Blooms Are a Biological Disturbance to Western Lake Erie Bacterial Communities. Environ. Microbiol. 2017, 19, 1149–1162.
27. Ruffatti, M.D.; Roth, R.T.; Lacey, C.G.; Armstrong, S.D. Impacts of Nitrogen Application Timing and Cover Crop Inclusion on Subsurface Drainage Water Quality. Agric. Water Manag. 2019, 211, 81–88.
28. USDA NRCS. Soil Survey Geographic (SSURGO) Database for Ohio. Available online: https://websoilsurvey.sc.egov.usda.gov/App/WebSoilSurvey.aspx (accessed on 15 January 2021).
29. ESA. Sentinel-2. Available online: https://www.esa.int/Applications/Observing_the_Earth/Copernicus/Sentinel-2 (accessed on 7 May 2023).
30. Gorelick, N.; Hancher, M.; Dixon, M.; Ilyushchenko, S.; Thau, D.; Moore, R. Google Earth Engine: Planetary-Scale Geospatial Analysis for Everyone. Remote Sens. Environ. 2017, 202, 18–27.
31. Pix4D. Pix4Dmapper. Available online: https://www.pix4d.com/product/pix4dmapper-photogrammetry-software (accessed on 5 April 2021).
32. Sellaro, R.; Crepy, M.; Trupkin, S.A.; Karayekov, E.; Buchovsky, A.S.; Rossi, C.; Casal, J.J. Cryptochrome as a Sensor of the Blue/Green Ratio of Natural Radiation in Arabidopsis. Plant Physiol. 2010, 154, 401–409.
33. Tucker, C.J.; Sellers, P.J. Satellite Remote Sensing of Primary Production. Int. J. Remote Sens. 1986, 7, 1395–1416.
34. Gitelson, A.; Merzlyak, M.N. Quantitative Estimation of Chlorophyll-a Using Reflectance Spectra: Experiments with Autumn Chestnut and Maple Leaves. J. Photochem. Photobiol. B Biol. 1994, 22, 247–252.
35. RStudio Team. RStudio: Integrated Development for R; RStudio: Boston, MA, USA, 2020.
36. Chu, T.; Starek, M.J.; Brewer, M.J.; Murray, S.C.; Pruter, L.S. Characterizing Canopy Height with UAS Structure-from-Motion Photogrammetry—Results Analysis of a Maize Field Trial with Respect to Multiple Factors. Remote Sens. Lett. 2018, 9, 753–762.
37. Tunca, E.; Köksal, E.S.; Taner, S.Ç.; Akay, H. Crop Height Estimation of Sorghum from High Resolution Multispectral Images Using the Structure from Motion (SfM) Algorithm. Int. J. Environ. Sci. Technol. 2024, 21, 1981–1992.
38. Armi, L.; Fekri-Ershad, S. Texture Image Analysis and Texture Classification Methods—A Review. arXiv 2019, arXiv:1904.06554.
39. Haralick, R.M.; Shanmugam, K.; Dinstein, I. Textural Features for Image Classification. IEEE Trans. Syst. Man Cybern. 1973, SMC-3, 610–621.
40. Liu, Y.; Feng, H.; Yue, J.; Jin, X.; Li, Z.; Yang, G. Estimation of Potato Above-Ground Biomass Based on Unmanned Aerial Vehicle Red-Green-Blue Images with Different Texture Features and Crop Height. Front. Plant Sci. 2022, 13, 938216.
41. Mohammadpour, P.; Viegas, D.X.; Viegas, C. Vegetation Mapping with Random Forest Using Sentinel 2 and GLCM Texture Feature—A Case Study for Lousã Region, Portugal. Remote Sens. 2022, 14, 4585.
42. Bergstra, J.; Yamins, D.; Cox, D. Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures. In Proceedings of the 30th International Conference on Machine Learning (ICML), Atlanta, GA, USA, 16–21 June 2013.
43. Bradshaw, T.J.; Huemann, Z.; Hu, J.; Rahmim, A. A Guide to Cross-Validation for Artificial Intelligence in Medical Imaging. Radiol. Artif. Intell. 2023, 5, e220232.
44. Lundberg, S.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017.
45. Li, M.; Shamshiri, R.R.; Weltzien, C.; Schirrmann, M. Crop Monitoring Using Sentinel-2 and UAV Multispectral Imagery: A Comparison Case Study in Northeastern Germany. Remote Sens. 2022, 14, 4426.
46. Alvarez-Mendoza, C.I.; Guzman, D.; Casas, J.; Bastidas, M.; Polanco, J.; Valencia-Ortiz, M.; Montenegro, F.; Arango, J.; Ishitani, M.; Selvaraj, M.G. Predictive Modeling of Above-Ground Biomass in Brachiaria Pastures from Satellite and UAV Imagery Using Machine Learning Approaches. Remote Sens. 2022, 14, 5870.
47. Wang, G.; Gertner, G.Z.; Anderson, A.B. Spatial-Variability-Based Algorithms for Scaling-up Spatial Data and Uncertainties. IEEE Trans. Geosci. Remote Sens. 2004, 42, 2004–2015.
48. Wessels, K.J.; Prince, S.D.; Zambatis, N.; MacFadyen, S.; Frost, P.E.; Van Zyl, D. Relationship between Herbaceous Biomass and 1-km² Advanced Very High Resolution Radiometer (AVHRR) NDVI in Kruger National Park, South Africa. Int. J. Remote Sens. 2006, 27, 951–973.
49. Liang, T.; Yang, S.; Feng, Q.; Liu, B.; Zhang, R.; Huang, X.; Xie, H. Multi-Factor Modeling of Above-Ground Biomass in Alpine Grassland: A Case Study in the Three-River Headwaters Region, China. Remote Sens. Environ. 2016, 186, 164–172.
50. Jin, Y.; Yang, X.; Qiu, J.; Li, J.; Gao, T.; Wu, Q.; Zhao, F.; Ma, H.; Yu, H.; Xu, B. Remote Sensing-Based Biomass Estimation and Its Spatio-Temporal Variations in Temperate Grassland, Northern China. Remote Sens. 2014, 6, 1496–1513.
51. Liu, W.; Wang, J.; Hu, Y.; Ma, T.; Otgonbayar, M.; Li, C.; Li, Y.; Yang, J. Mapping Shrub Biomass at 10 m Resolution by Integrating Field Measurements, Unmanned Aerial Vehicles, and Multi-Source Satellite Observations. Remote Sens. 2024, 16, 3095.
52. Huang, D.; Yang, W.; Tan, B.; Rautiainen, M.; Zhang, P.; Hu, J.; Shabanov, N.V.; Linder, S.; Knyazikhin, Y.; Myneni, R.B. The Importance of Measurement Errors for Deriving Accurate Reference Leaf Area Index Maps for Validation of Moderate-Resolution Satellite LAI Products. IEEE Trans. Geosci. Remote Sens. 2006, 44, 1866–1871.
Figure 1. (a) Location of farmers' fields where data were collected in 2021 and 2022. (b) Zoomed-in view of a field with the layout of sampling locations. Note: Numbers above the dots indicate the number of closely located fields.
Figure 2. (a) Collection of cereal rye biomass using an electric cutter from a 0.5 × 0.5 m² quadrat. (b) Marking of sampling locations with cones in 2021. (c) Recording of precise geographic coordinates for each sampling location using a Trimble R8s GNSS receiver in 2022. Two UAS platforms were used for data acquisition: (d) DJI Phantom 4 Multispectral (Shenzhen, China) and (e) WingtraOne (Zurich, Switzerland).
Figure 3. (a) Original multispectral images captured by the UAS; (b) 3D point cloud generated in Pix4D from the UAS-captured multispectral images; (c) single-band orthomosaic generated from Pix4D; and (d) orthomosaic visualized in a natural color combination.
Figure 4. Workflow for estimating cereal rye crop height using 3D point clouds generated from Pix4Dmapper software (version 2.12.1).
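To make the Figure 4 workflow concrete, here is a minimal Python sketch of one way to derive crop height from an SfM point cloud: grid the cloud, take a low elevation percentile per cell as the ground surface (the 1st percentile, as in Figure 7), take a high percentile as the canopy surface, and difference the two. The 99th canopy percentile and all function and variable names are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np

def crop_height_per_cell(points, grid=3.0, ground_pct=1, canopy_pct=99):
    """points: (N, 3) array of x, y, z from the SfM point cloud."""
    # Assign each point to a grid cell (grid size in the same units as x, y).
    ix = np.floor((points[:, 0] - points[:, 0].min()) / grid).astype(int)
    iy = np.floor((points[:, 1] - points[:, 1].min()) / grid).astype(int)
    heights = {}
    for cx, cy in set(zip(ix.tolist(), iy.tolist())):
        z = points[(ix == cx) & (iy == cy), 2]
        ground = np.percentile(z, ground_pct)  # 1st percentile = ground (Figure 7)
        canopy = np.percentile(z, canopy_pct)  # high percentile = canopy top (assumed)
        heights[(cx, cy)] = canopy - ground    # crop height per cell
    return heights
```

Calling this with grid=1.0 instead of the default gives the 1 m grid variant that Figure 7 compares against the 3 m grid.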
Figure 5. Grey level co-occurrence texture measures derived from the NIR band of Sentinel-2 satellite imagery using a 3 × 3 pixel window. These include mean, variance, homogeneity, contrast, dissimilarity, entropy, second moment, and correlation.
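Since the satellite processing relied on Google Earth Engine [30], texture bands like those in Figure 5 can be approximated with ee.Image.glcmTexture, which requires an integer-valued input image. The sketch below uses the Earth Engine Python API; the asset ID, date window, and 8-bit rescaling are our assumptions for illustration, not the paper's exact processing chain.

```python
import ee

ee.Initialize()

# Grab one Sentinel-2 surface-reflectance scene (asset and dates are placeholders).
s2 = (ee.ImageCollection("COPERNICUS/S2_SR_HARMONIZED")
      .filterDate("2022-04-01", "2022-04-30")
      .first())

# glcmTexture expects an integer image, so rescale NIR (B8) reflectance to 8 bits.
nir_int = s2.select("B8").unitScale(0, 10000).multiply(255).toByte()

# size=1 gives a 3 x 3 neighborhood, matching the window in Figure 5.
glcm = nir_int.glcmTexture(size=1)

# Earth Engine suffixes mapped to the eight Figure 5 measures: savg ~ mean,
# var = variance, idm = homogeneity, contrast, diss = dissimilarity,
# ent = entropy, asm = second moment, corr = correlation.
texture = glcm.select(["B8_savg", "B8_var", "B8_idm", "B8_contrast",
                       "B8_diss", "B8_ent", "B8_asm", "B8_corr"])
```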
Figure 7. Comparison of estimated crop height showing (a) a scatter plot between field-measured values and UAS-derived crop height using ground points (i.e., 1st percentile) from a 3 m grid, (b) a scatter plot between field-measured values and UAS-derived crop height using ground points from a 1 m grid, (c) average bias per field, and (d) average RMSE per field for the two methods.
Figure 8. Comparison of UAS-estimated versus field-measured cover crop biomass using XGBoost (left) and random forest (RF; right) models based on VIs and crop height derived from UAS-based multispectral imagery. Each point represents a test sample, color-coded by field for seven fields.
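A minimal sketch of how such a comparison can be produced is shown below, pairing an XGBoost regressor with a random forest on a feature matrix of the Table 1 indices plus crop height. The synthetic data, train/test split, and hyperparameter values are placeholders, not the tuned settings behind Figure 8.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor

rng = np.random.default_rng(0)
X = rng.random((200, 4))  # placeholder for [BGratio, NGRDI, NDRE, crop height]
y = 60 + 350 * X[:, 3] + 15 * rng.standard_normal(200)  # synthetic biomass, g m^-2

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
models = {
    "XGBoost": XGBRegressor(n_estimators=300, max_depth=4, learning_rate=0.05),
    "RF": RandomForestRegressor(n_estimators=500, random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    rmse = mean_squared_error(y_te, model.predict(X_te)) ** 0.5
    print(f"{name}: RMSE = {rmse:.1f} g m^-2")
```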
Figure 9. SHAP-based interpretation of the UAS-based XGBoost (a,b) and random forest (RF) (c,d) models for predicting cereal rye biomass. The SHAP summary plots (a,c) illustrate the distribution and directionality of SHAP values for individual observations, where each point represents a sample and is colored by the corresponding feature value (red = high, blue = low). The bar plots (b,d) show the mean absolute SHAP value for each feature, i.e., its average contribution to the model output across all samples.
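Plots of this kind come directly from the shap library [44]. A hedged sketch follows, reusing the fitted models and held-out features from the previous example; these object names are ours, not the paper's.

```python
import shap  # SHapley Additive exPlanations, ref. [44]

# TreeExplainer handles both XGBoost and random forest regressors.
explainer = shap.TreeExplainer(models["XGBoost"])
shap_values = explainer.shap_values(X_te)

names = ["BGratio", "NGRDI", "NDRE", "CropHeight"]
shap.summary_plot(shap_values, X_te, feature_names=names)                   # like panels (a,c)
shap.summary_plot(shap_values, X_te, feature_names=names, plot_type="bar")  # like panels (b,d)
```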
Figure 10. Comparison of satellite-estimated versus field-measured cover crop biomass using XGBoost and random forest (RF) models. Each point represents a test sample, color-coded by field. The inset figures show a zoomed-in view of points at lower biomass levels.
Figure 11. Comparison of satellite-estimated versus UAS-estimated cover crop biomass using the UAS–satellite synergistic XGBoost (left) and random forest (RF; right) models. Each point represents a test sample, color-coded by field. The inset figures show a zoomed-in view of points at lower biomass levels.
Figure 12. SHAP-based interpretation of the UAS–satellite synergistic XGBoost (a,b) and RF (c,d) models for predicting cereal rye biomass.
Figure 13. Spatially explicit cereal rye biomass estimated by the three models—(Model 1) UAS-based, (Model 2) satellite-based, and (Model 3) UAS–satellite synergistic—for (a) field F21-4 (high biomass) and (b) field F21-12 (low biomass).
Table 1. Vegetation indices computed from the original multispectral images captured by the UASs and the Sentinel-2 satellite.

Index | Computing Formula | Source
BGratio | B/G | [32]
NGRDI | (G − R)/(G + R) | [33]
NDRE | (NIR − RE)/(NIR + RE) | [34]

Note: B: Blue band, G: Green band, R: Red band, RE: Red-edge band, and NIR: Near-infrared band reflectance.
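As a worked complement to Table 1, the sketch below computes the three indices from per-band reflectance arrays. The small epsilon guard against division by zero over dark or bare pixels is our addition, not part of the published formulas.

```python
import numpy as np

def vegetation_indices(b, g, r, re, nir):
    """Compute the Table 1 indices from reflectance arrays of any shape."""
    eps = 1e-10  # guards against zero denominators (our addition)
    bgratio = b / (g + eps)                    # blue/green ratio [32]
    ngrdi = (g - r) / (g + r + eps)            # normalized green-red difference [33]
    ndre = (nir - re) / (nir + re + eps)       # normalized difference red edge [34]
    return bgratio, ngrdi, ndre
```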