Article

High-Resolution Aboveground Biomass Mapping: The Benefits of Biome-Specific Deep Learning Models

by Martí Perpinyà-Vallès 1,2,*,†, Daniel Cendagorta-Galarza 1,†, Aitor Ameztegui 2,3, Claudia Huertas 1, Maria José Escorihuela 4 and Laia Romero 1

1 Lobelia Earth S.L., Carrer del Doctor Trueta, 113, 08005 Barcelona, Spain
2 Department of Agricultural and Forest Sciences and Engineering, Universitat de Lleida, 25198 Lleida, Spain
3 Joint Research Unit CTFC-Agrotecnio-CERCA, Ctra. Sant Llorenç, Km 2, 25280 Solsona, Spain
4 isardSAT S.L., Carrer del Doctor Trueta, 113, 08005 Barcelona, Spain
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Remote Sens. 2025, 17(7), 1268; https://doi.org/10.3390/rs17071268
Submission received: 4 March 2025 / Revised: 26 March 2025 / Accepted: 1 April 2025 / Published: 2 April 2025
(This article belongs to the Section Remote Sensing for Geospatial Science)

Abstract

Regional mapping of Above Ground Biomass Density (AGBD) using Remote Sensing data has shown high accuracy but lacks replicability at a global scale. In contrast, global models capture AGBD variability across biomes but struggle with biome-specific accuracy. To address this gap, we develop and assess the performance of a Deep Learning model for mapping AGBD at 10-m resolution using multi-source satellite data (Sentinel-1, Sentinel-2, ALOS PALSAR-2, and GEDI) across four biomes: Mediterranean, taiga (boreal forests), tropical rainforests, and semi-arid savannas. The model is trained and validated separately for each biome, yielding four regional models with normalized RMSEs of 0.43–0.67 and correlation coefficients (r) of 0.61–0.77 against forest inventories. We compare predictions from these models to a benchmark dataset and to a model trained on all four biomes combined. The regional models consistently outperform both, achieving better metrics than the benchmark. Additionally, an analysis of prediction drivers reveals biome-specific differences, reinforcing the importance of per-biome mapping approaches. This study highlights the advantages and limitations of regional against global modeling, creating the basis for biome-specific, replicable, scalable and multi-temporal AGBD mapping.

1. Introduction

Accurate estimates of above-ground biomass density (AGBD) are essential for understanding carbon dynamics, supporting ecosystem management, and addressing global climate change. Mapping AGBD at global scale enables researchers and policymakers to monitor forest carbon stocks, assess the impacts of land-use change, and track progress toward climate mitigation goals [1,2]. Large-scale biomass mapping also aids biodiversity conservation [3] by identifying habitats in critical condition and at risk of deforestation [4]. Furthermore, these estimates provide valuable data for improving climate models and informing sustainable land management practices, making them a crucial tool in global environmental monitoring efforts [5,6]. In recent years, mapping AGBD or above-ground carbon stocks from remote sensing has received ever-increasing attention [7,8]. Approaches range from combining remote sensing imagery with in-situ measurements of trees (translated into AGB via allometric equations) [9,10] to the use of Airborne Laser Scanning (ALS) data [11,12,13]. Until recently, attempts to map AGBD globally have relied on globally relevant datasets of in-situ tree measurements and ALS flights to calibrate models that produce maps covering the entire planet. This is the case for a few datasets that have become critical benchmarks [14,15,16]. However, these global models usually lack the spatial resolution to resolve fine-grained patterns at the local level, and their accuracy can be constrained by ecological diversity and insufficient representation of regional variability [17,18].
A critical turning point in the advancement of the state of the art in AGBD mapping has been the deployment of the Global Ecosystem Dynamics Investigation (GEDI) mission aboard the International Space Station. The GEDI mission provides 25 m footprint-level AGBD data obtained by applying well-calibrated allometric equations to the height measurements from its LiDAR system [19,20]. Common approaches combine GEDI data with one or more of the following remote sensing technologies: Synthetic Aperture Radar (SAR) (such as Sentinel-1) [21], which captures vegetation structural information; multi-spectral sensors (such as Sentinel-2) [22], which provide insights into vegetation composition and health; and topographic or environmental data as auxiliary variables [23]. This combination of data from different sources has facilitated the development of machine learning and deep learning models that provide highly detailed biomass maps, often outperforming traditional approaches based on empirical relationships or physical models [21,22,23,24,25,26]. Despite the potential of these approaches, which often achieve superior accuracy by tailoring inputs and parameters to local ecosystem characteristics, their scalability to the global scale has seldom been explored [27,28]. It is therefore critical to explore strategies that combine the accuracy of regional models with the scalability and consistency of global approaches, leveraging publicly available, high-resolution datasets with near-global coverage such as Sentinel-1, Sentinel-2, and GEDI.
This study aims to address this challenge by applying a modified UNet architecture to predict AGBD across forests located in four different biomes: Mediterranean, taiga (boreal forests), tropical rainforests, and semi-arid savannas. The ranges of AGBD values, the species present in each ecosystem, the varying vegetation densities and the availability of data in these biomes were the main drivers for the selection of the study areas. The model uses harmonized satellite data for training and prediction and is validated with inventory datasets in each biome. We evaluate the model’s performance and compare it to a state-of-the-art, open-source, widely used global biomass dataset, the European Space Agency’s Climate Change Initiative (ESA CCI) AGB product [14]. We seek to determine whether regionally tailored models can achieve the expected superior accuracy while maintaining global applicability. We then compare the performance of our regional models against a single model trained on data from all four biomes simultaneously, an approach that more closely resembles global modeling while preserving the same methodology and resolution. By integrating an explainable artificial intelligence (XAI) approach, we assess the contribution of individual variables, offering insights into the drivers of biomass prediction in each biome and the potential for improving large-scale AGBD estimation. We expect multi-spectral and C-band SAR data to be useful in lower-AGBD biomes, since saturation is less of an issue there than in higher-AGBD areas like tropical rainforests. Moreover, we expect L-band SAR to contribute meaningfully in biomes with higher vegetation density, together with infrared bands from multi-spectral data, which capture canopy moisture and can serve as a proxy for vegetation density.

2. Materials and Methods

2.1. Study Areas

This study focuses on four distinct regions to evaluate model performance across diverse ecological contexts. The diversity in species, AGBD ranges and data availability for model validation were the main drivers for the selection of the four regions. Figure 1 shows the distribution of plot locations across biomes and sub-biomes from [29]. Each biome is characterized by distinct climate conditions.
The study encompasses forests distributed across four distinct regions, each characterized by unique climatic conditions and vegetation types. In the boreal forests of Quebec, Canada, mean annual temperatures range from 0 °C to −2.5 °C, with annual precipitation between 700 mm and 1100 mm. These cold environments support coniferous-dominated forests adapted to long, harsh winters. In contrast, the Mediterranean forests and shrublands of Catalonia, northeastern Spain, experience a typical Mediterranean to subMediterranean climate with hot, dry summers and mild, humid winters. Here, average annual temperatures range from 10 °C to 17 °C, and precipitation varies between 350 mm and 1000 mm, fostering sclerophyllous vegetation adapted to seasonal drought. The tropical rainforests of the Brazilian Amazon present a stark contrast, with consistently warm and humid conditions, where average temperatures range from 25 °C to 28 °C and annual rainfall is abundant, between 2000 mm and 3000 mm, supporting a dense, highly diverse forest structure. Finally, the tropical shrublands and woodlands of Burkina Faso and Niger are characterized by high temperatures, averaging between 28 °C and 32 °C annually, and low, variable precipitation ranging from 300 mm to 600 mm per year. These arid conditions result in sparse tree cover and scattered drought-adapted shrubs, constituting the typical savanna landscape.

2.2. Satellite Data

The training data to build AGBD models were derived from the GEDI Level 4A (L4A) AGBD dataset, which provides globally distributed, LiDAR-derived biomass estimates. This dataset was downloaded and subjected to extensive preprocessing and filtering to retain only the highest-quality data points and to align with the temporal coverage of the input data. Data spanning the years 2019 through 2021 were downloaded to ensure sufficient spatial and temporal coverage. GEDI L4A files containing full orbital paths that intersected each area of interest were downloaded, clipped, and subsetted to extract the relevant data variables. Data points were retained only if the l2_quality_flag, l4_quality_flag, and algorithm_run_flag were all set to 1, indicating successful processing and reliable output. Measurements collected during leaf-off conditions, as identified by a leaf_off_flag value of 1, were excluded due to their potential to impact biomass estimates. To reduce noise from sunlight, only observations with a solar_elevation below 0 were included. Additionally, data points were filtered to ensure a sensitivity value of at least 0.95, guaranteeing high confidence in the LiDAR-derived biomass estimates [20]. Lastly, an upper limit of 800 Mg ha⁻¹ was applied to the AGBD values to exclude outliers and maintain a realistic range of biomass predictions, based on reviews of the global distribution of AGBD values across biomes [30,31]. This upper limit excluded only around 0.01% and 0.1% of the data points in the Mediterranean and tropical biomes, respectively, while no points reached such high values in the other two biomes. Around 90% of the total GEDI data points were discarded because they failed at least one of the previous conditions. The GEDI L4A dataset serves as the primary reference for model training, providing a high-quality and globally consistent baseline for biomass predictions.
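The filtering rules above can be written as a single predicate. The following is a minimal sketch, not the authors' code: it assumes each footprint has already been read into a plain dictionary keyed by the flag names quoted in the text, and the key `agbd` is a hypothetical name for the biomass value.

```python
# Thresholds taken from the text; the record layout is an assumption.
MAX_AGBD = 800.0         # Mg/ha upper limit
MIN_SENSITIVITY = 0.95   # minimum beam sensitivity


def keep_footprint(fp):
    """Return True if a GEDI footprint passes every filter described in the text."""
    return (
        fp["l2_quality_flag"] == 1
        and fp["l4_quality_flag"] == 1
        and fp["algorithm_run_flag"] == 1
        and fp["leaf_off_flag"] != 1              # drop leaf-off acquisitions
        and fp["solar_elevation"] < 0             # keep night-time shots only
        and fp["sensitivity"] >= MIN_SENSITIVITY
        and fp["agbd"] <= MAX_AGBD                # "agbd" is a hypothetical key
    )


def filter_footprints(footprints):
    """Keep only the footprints that satisfy all conditions."""
    return [fp for fp in footprints if keep_footprint(fp)]
```

Because the conditions are combined with `and`, a footprint is discarded as soon as it fails any one of them, which is consistent with most footprints being rejected.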
The satellite data used as potential AGBD predictors for this study consist of: (i) Sentinel-1 Synthetic Aperture Radar (SAR) Radiometrically Terrain Corrected (RTC) with dual-band cross-polarization data, (ii) Sentinel-2 multi-spectral Level-2A imagery, and (iii) the Copernicus 30-m Digital Elevation Model (DEM), all of which were accessed and processed through Microsoft’s Planetary Computer platform. These datasets provide high-resolution, globally consistent inputs for biomass modeling. For Sentinel-1, we first separated the images into ascending and descending orbits, to account for the different information provided by opposing viewing angles from the satellite. Each image was acquired and processed individually, filtering out values lower than −30 dB and applying a Lee filter with a window size of 3 pixels for speckle reduction. Then, a temporal average was calculated from all images taken during the growing season for each site. For Sentinel-2, we selected all images with a cloud cover lower than 30% and then applied the Scene Classification Layer (SCL) produced by the Sen2Cor L2A processor [32,33] to filter out pixels classified as clouds, cloud shadows, snow or ice, saturated or defective pixels, and topographic cast shadows, ensuring cleaner data for biomass analysis. Additionally, we corrected for the offset added by the processing baseline change of January 2022 and computed the yearly median to reduce residual cloud cover, retaining the most representative image for each year. In addition to these primary datasets, PALSAR (Phased Array type L-band Synthetic Aperture Radar) data were incorporated for the tropical rainforests and the semi-arid savannas of Burkina Faso and Niger, where the data were openly available, as opposed to the Northern Hemisphere locations.
The inclusion of PALSAR data allowed us to assess the added value of longer-wavelength SAR data, and its ability to penetrate deeper into the canopy, which can enhance biomass predictions in specific ecosystems. These PALSAR datasets were obtained and processed using Google Earth Engine, where, as with Sentinel-1, the temporal average was calculated from all images taken during the growing season for each site. Together, these datasets provided a comprehensive and robust foundation for the biomass prediction model, leveraging multi-source remote sensing data to account for various environmental and temporal factors. To ensure uniformity in spatial resolution, all satellite-derived variables were resampled to a common grid at 10-m resolution.
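To illustrate the speckle-reduction step, here is a minimal pure-Python sketch of a basic Lee filter with a 3-pixel window. It is only an illustration under stated assumptions: the image is a nested list of backscatter values, windows are truncated at the image edges, and the noise variance is estimated as the mean of the local variances, which is one common simplification and not necessarily the one used in the study.

```python
def lee_filter(image, size=3):
    """Basic Lee speckle filter on a 2D nested list (edge windows are truncated)."""
    rows, cols = len(image), len(image[0])
    half = size // 2
    local_mean = [[0.0] * cols for _ in range(rows)]
    local_var = [[0.0] * cols for _ in range(rows)]

    # Local mean and variance inside the moving window.
    for r in range(rows):
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            mean = sum(window) / len(window)
            local_mean[r][c] = mean
            local_var[r][c] = sum((v - mean) ** 2 for v in window) / len(window)

    # Assumption: noise variance approximated by the average local variance.
    noise_var = sum(sum(row) for row in local_var) / (rows * cols)

    filtered = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            denom = local_var[r][c] + noise_var
            # Weight near 1 keeps the pixel (textured area); near 0 uses the mean.
            weight = local_var[r][c] / denom if denom > 0 else 0.0
            filtered[r][c] = local_mean[r][c] + weight * (
                image[r][c] - local_mean[r][c]
            )
    return filtered
```

Homogeneous areas are pulled toward the local mean, while pixels in high-variance windows are left closer to their original value, which is what preserves edges.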

2.3. Forest Inventory Validation Data

We used different sources of field data to validate our results. For boreal and Mediterranean forests we used data from the Quebec permanent National forest inventory (NFI) [34] and the 4th Spanish National Forest Inventory [35], respectively. Forest inventory datasets from non-governmental surveys were used for the Brazilian Amazonian rainforests [36] and the semi-arid regions of Burkina Faso and Niger [37].
To align with our objective of developing region-specific models applicable to digitized, continuous forest inventories and similar applications, we selected study areas accordingly. Catalonia, as a whole, was chosen as a representative political region where AGBD mapping could support carbon stock monitoring. In contrast, Quebec, which spans a much larger area and includes both boreal and semi-boreal forests, required a more targeted selection. To maintain a comparable spatial extent and ensure consistency across regions, we restricted the Quebec dataset to plots within the boreal forest biome, resulting in an area similar in size to Catalonia. Regarding Burkina Faso and Brazil, since the availability of the data was scattered across the biome and the distances were large between the plots, only buffers of each area were taken, resulting in scattered data collection (5 areas in Brazil and 2 in Burkina Faso and Niger).
The characteristics and sizes of forest inventory plots also varied across the study regions, as seen in Figure 1. In Brazil, plot sizes differed by site, with most plots measuring 50 × 50 m, while others extended to 500 × 20 m [36]. In contrast, data collection in Burkina Faso and Niger used circular plots with a radius of 20 m (see [37] for a detailed description of the plot design). The Spanish National Forest Inventory consists of circular permanent plots distributed along a 1 × 1 km grid. Each plot consists of four concentric circular subplots in which trees are measured depending on their size, with a maximum plot radius of 25 m. For Quebec, data were collected within circular areas of 200 square meters. From all datasets, observations collected from 2015 onward were selected, ensuring temporal consistency between the validation data and the satellite imagery used to generate biomass predictions.
To ensure robust and consistent validation across the four distinct ecosystems, all datasets were subjected to a harmonization process. This process aligned metrics and data structures across biomes, enabling accurate comparisons and validation outcomes. Measurements from individual trees were converted into estimates of aboveground biomass using the most site-specific allometric equations available, rather than a general allometric equation, which can result in biased and inaccurate estimates for some biomes [38]. For Mediterranean forests, we used biomass estimates provided by the Spanish National Forest Inventory, based on province- and species-specific allometric equations [35], while in Burkina Faso and Niger, site-specific equations were applied as in Perpinyà-Vallès et al. (2024) [37]. For boreal forests, since no specific allometric equations were available from the dataset producers, we used the R package allodb [39], which is specifically designed for extra-tropical tree allometries, to select the most appropriate allometric equation based on species and geographic coordinates. The application of extra-tropical allometric equations to boreal forests is supported by their empirical calibration using datasets from temperate and boreal regions, ensuring their applicability to the specific structural and ecological characteristics of boreal tree species. Moreover, boreal forests share similar growth constraints with other extra-tropical ecosystems, such as temperature limitations and slow biomass accumulation rates, further justifying this approach. Finally, well-known tropical allometric equations from Chave et al. (2014) [38] were used for the tropical rainforests of Brazil. To harmonize the different sampling designs, individual AGB estimates for all trees within a plot were converted into aboveground biomass density values (Mg ha⁻¹) using the total number of measured trees and the specific plot area.
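The last harmonization step, going from per-tree biomass to a plot-level density, is simple arithmetic. The sketch below assumes per-tree AGB values in kilograms and a single fixed-area plot; the function names are hypothetical, and nested designs such as the concentric Spanish plots would likely need a separate expansion factor per subplot, which this sketch does not handle.

```python
import math


def circular_plot_area_m2(radius_m):
    """Area of a circular plot in square meters."""
    return math.pi * radius_m ** 2


def plot_agbd_mg_per_ha(tree_agb_kg, plot_area_m2):
    """Sum per-tree AGB (kg) and express it as a density in Mg/ha."""
    total_mg = sum(tree_agb_kg) / 1000.0   # kg -> Mg
    area_ha = plot_area_m2 / 10000.0       # m^2 -> ha
    return total_mg / area_ha


# Example: three trees in a 20 m radius circular plot.
example_agbd = plot_agbd_mg_per_ha(
    [120.0, 80.0, 300.0], circular_plot_area_m2(20.0)
)
```

Dividing by the plot area in hectares is what makes plots of very different sizes (for example 200 m² versus 50 × 50 m) comparable on the same Mg ha⁻¹ scale.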
The ranges of AGBD values in the inventories differed substantially across the four biomes, reaching around 350 Mg ha⁻¹ in Catalonia, 90 Mg ha⁻¹ in Burkina Faso and Niger, over 700 Mg ha⁻¹ in Brazil and over 300 Mg ha⁻¹ in Quebec. Based on previous studies, we discarded plots with AGBD field estimates over 500 Mg ha⁻¹ in Brazil [40] and 200 Mg ha⁻¹ in Quebec [41,42] to avoid giving excessive weight to outliers. These represented less than 10% and 5% of the plots for Brazil and Quebec, respectively. The main goal of this filtering was to obtain a dataset representative of the overall range of values without including many outliers that could come from measurement error or anomalously large trees. The latter was the main driver of the outliers detected, with almost all outlier plots containing at least one tree with a diameter larger than 80 cm. Indeed, allometric equations are typically calibrated for a limited range of tree diameters, and given their non-linear nature, they can lead to unrealistic AGB estimates when trees with very large diameters are considered [43].
The total number of data points available for validation varied across biomes: boreal forests (308), Mediterranean forests (2532), tropical rainforests (117), and semi-arid savannas (113). These datasets provide a representative and geographically diverse foundation for evaluating model performance in predicting above-ground biomass density.

2.4. Model Structure & Training

The Deep Learning model used in this study is built on a U-Net architecture [44], based on the work developed by Schwartz et al. (2023) [45], designed for image-to-image regression tasks. The model incorporates residual connections, enhancing gradient flow, and dropout regularization and L2 weight decay to mitigate overfitting. It follows a fully convolutional encoder-decoder structure, where the contracting path consists of four convolutional blocks, each followed by a 2 × 2 max pooling operation, progressively increasing the number of filters from 64 to 1024. The bottleneck layer captures high-level feature representations before the expanding path symmetrically reconstructs spatial information using bilinear upsampling and skip connections. The final output layer applies a 1 × 1 convolution with a linear activation to produce a single-channel continuous-valued prediction. The model’s loss function is specifically tailored to the sparse nature of the training data, similar to the approach by Schwartz et al. [45], where loss is computed only at pixels containing valid GEDI observations. A custom Root Mean Squared Error (RMSE) loss function, applied only to pixels with GEDI-derived AGBD values, ensures unbiased learning while disregarding areas with missing data. Additionally, land cover points from ESA WorldCover 2020 and 2021 [46,47], where AGBD is necessarily zero (e.g., urban areas, bare rock, ice/snow, water, and grassland), are randomly sampled and incorporated into the training dataset to help the model learn “hard zeroes”. The training process is further optimized using an Exponential Decay learning rate schedule, starting at 0.0005 and decreasing by a factor of 0.95 every 1000 steps, combined with the Adam optimizer for stable convergence. The model was trained for 50 epochs using a batch size of 16. 
To prevent overfitting, we employed an early stopping technique, using the validation loss as the monitoring metric, with a patience of six epochs before restoring the weights of the epoch with the best validation loss. Once trained, the model was applied to generate wall-to-wall biomass predictions at 10-m resolution across the study areas for all years where inventory validation data were available, ensuring spatially comprehensive estimates of AGBD.
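Two of the training ingredients described above are easy to show in isolation: a loss evaluated only where a GEDI value exists, and the exponential learning-rate decay. This is a framework-free sketch, not the authors' implementation. It assumes missing targets are marked with `None`, and the `staircase` switch is an assumption, since the text does not say whether the rate decays continuously or in discrete steps.

```python
import math


def masked_rmse(pred, target):
    """RMSE over pixels that have a reference value (target is not None)."""
    squared_errors = [
        (p - t) ** 2
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
        if t is not None
    ]
    if not squared_errors:
        raise ValueError("patch contains no valid reference pixels")
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def learning_rate(step, initial=0.0005, decay_rate=0.95, decay_steps=1000,
                  staircase=False):
    """Exponential decay: initial * decay_rate ** (step / decay_steps)."""
    exponent = step // decay_steps if staircase else step / decay_steps
    return initial * decay_rate ** exponent
```

Pixels without a GEDI footprint never enter the sum, so they contribute nothing to the loss; the requirement of at least five GEDI points per patch guarantees that the empty-patch error is not hit during training.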
In each study area, the model was trained using image patches of 256 × 256 pixels, ensuring that each patch contained at least five GEDI data points to avoid incorporating regions with insufficient information. Depending on the size of the study area, two different sampling strategies were applied. For larger areas, a total of 10,000 patches were randomly sampled across the entire extent, with each patch assigned an inverse weight based on the average GEDI-derived AGBD value to balance representation across biomass ranges. From these, a weighted random selection of 3000 patches was used for training. In smaller study areas where fewer patches were available, all patches meeting the GEDI density criterion were included in the training set, ensuring sufficient spatial coverage while maintaining data quality for model learning.
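The weighted patch selection for larger areas can be sketched as weighted sampling without replacement. The example below is one possible reading of the text, not the authors' procedure: it assumes the weight of a patch is the reciprocal of its mean GEDI AGBD (plus a small hypothetical offset `eps` to avoid division by zero) and uses the Efraimidis-Spirakis key method to draw the sample.

```python
import random


def inverse_agbd_weights(mean_agbd, eps=1.0):
    """Assumed weighting: patches with lower mean AGBD get larger weights."""
    return [1.0 / (value + eps) for value in mean_agbd]


def weighted_sample_without_replacement(items, weights, k, rng):
    """Efraimidis-Spirakis sampling: keep the k items with the largest u**(1/w)."""
    keyed = [
        (rng.random() ** (1.0 / weight), item)
        for item, weight in zip(items, weights)
    ]
    keyed.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in keyed[:k]]


# Usage with hypothetical numbers: pick 3 of 6 candidate patches.
rng = random.Random(0)
patch_means = [5.0, 50.0, 150.0, 300.0, 20.0, 80.0]
selected = weighted_sample_without_replacement(
    list(range(len(patch_means))), inverse_agbd_weights(patch_means), 3, rng
)
```

In the study this would correspond to drawing 3000 patches from the 10,000 candidates; an inverse-frequency weighting over AGBD bins would be an equally plausible reading of "inverse weight".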
Additionally, a single “global” model was trained on patches from all four biomes. The data extracted for each biome to train the respective regional models were sampled to obtain a final dataset of 8283 patches: 2500 were randomly selected from each of the larger areas (Quebec and Catalonia), and the remainder consisted of all the available patches from Brazil and Burkina Faso/Niger (approximately 1500 each). The input datasets had to be restricted to those available for all areas. This meant keeping only one of the Ascending and Descending passes from Sentinel-1 instead of using both when available: the Ascending track was kept for all biomes except the tropical rainforest, where Descending was used because it was the only one available. PALSAR data were also excluded, as they were not available over all the study areas. The remaining datasets were therefore Sentinel-2, Sentinel-1 (Ascending only, Descending in the case of Brazil) and the DEM.

2.5. Variable Importance Analysis

To assess the contribution of each explanatory variable to the predictions of the model, saliency maps were used as an XAI technique [48]. Saliency maps capture the sensitivity of the model output with respect to changes in the input features, allowing visualization and quantification of the importance of each spectral band in predicting AGBD.
For each input patch, the gradient of the model output with respect to each input band was computed using automatic differentiation. The absolute value of these gradients was then averaged across all spatial locations within the patch to obtain a measure of the importance of each band. This process was repeated in all samples, using all image patches from each biome for a single year. Since the gradients are not normalized, the resulting values reflect the actual magnitude of change in the model output in response to variations in each band. This means that differences in gradient values across biomes can provide insight not only into which bands are most important but also into the extent to which changes in a band influence the predicted AGBD in each biome. Higher gradient values indicate that small changes in a band result in larger changes in the model output, revealing bands with a stronger influence on the predictions.
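The band-importance computation can be illustrated on a toy model. The sketch below is only an approximation of the described procedure: central finite differences stand in for automatic differentiation, and `toy_model` is a hypothetical linear stand-in for the trained network that returns the sum of its per-pixel outputs.

```python
def band_importance(model, patch, eps=1e-4):
    """Mean absolute gradient of a scalar model output w.r.t. each input band.

    patch is indexed as patch[band][row][col]; gradients are approximated
    with central finite differences instead of automatic differentiation.
    """
    importance = []
    for band in patch:
        total, count = 0.0, 0
        for row in band:
            for c, value in enumerate(row):
                row[c] = value + eps
                up = model(patch)
                row[c] = value - eps
                down = model(patch)
                row[c] = value  # restore the original pixel value
                total += abs((up - down) / (2.0 * eps))
                count += 1
        importance.append(total / count)  # average over spatial locations
    return importance


def toy_model(patch):
    """Hypothetical model: output = sum over pixels of 2.0*band0 - 0.5*band1."""
    band0, band1 = patch
    return sum(
        2.0 * a - 0.5 * b
        for row0, row1 in zip(band0, band1)
        for a, b in zip(row0, row1)
    )
```

For the toy model the importances come out as the absolute coefficients of each band (about 2.0 and 0.5). Since the values are not normalized, they keep the units of output change per unit of input change, matching the interpretation given above.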
The analysis was conducted separately for each biome to account for the heterogeneity in ecological and environmental conditions that influence biomass dynamics. This biome-specific evaluation provides insights into the relevance of different spectral bands across diverse ecosystems, highlighting potential regional variations and limitations when applying a globally trained model.

2.6. Validation, Benchmark and Global Comparison

The performance of the models was determined using three complementary approaches: (1) comparing predictions against the GEDI L4A data used during training, (2) using each field inventory plot to validate the closest prediction to its center, and (3) comparing the AGBD values in each field inventory plot to the mean, minimum and maximum value of all pixel predictions intersecting the area of the inventory plot, in order to account for regional variations in plot definitions and their location uncertainty. The latter validation method has been identified as one of the key issues to solve in future studies of AGBD mapping [49].
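Approaches (2) and (3) can be sketched for a circular plot on a regular 10-m prediction grid. This is a simplified illustration under stated assumptions: coordinates are in meters relative to the grid origin, the y coordinate grows with the row index, and a pixel counts as intersecting the plot when any part of its square footprint lies within the plot radius.

```python
def pixels_intersecting_circle(cx, cy, radius, n_rows, n_cols, res=10.0):
    """Return (row, col) indices of pixels whose square touches the circle."""
    hits = []
    for row in range(n_rows):
        for col in range(n_cols):
            # Closest point of the pixel square to the plot center.
            nearest_x = min(max(cx, col * res), (col + 1) * res)
            nearest_y = min(max(cy, row * res), (row + 1) * res)
            if (nearest_x - cx) ** 2 + (nearest_y - cy) ** 2 <= radius ** 2:
                hits.append((row, col))
    return hits


def plot_statistics(pred, cx, cy, radius, res=10.0):
    """Closest-pixel value plus mean, min and max over intersecting pixels."""
    hits = pixels_intersecting_circle(cx, cy, radius, len(pred), len(pred[0]), res)
    values = [pred[r][c] for r, c in hits]
    return {
        "closest": pred[int(cy // res)][int(cx // res)],
        "mean": sum(values) / len(values),
        "min": min(values),
        "max": max(values),
    }
```

Reporting the minimum and maximum alongside the mean gives a range against which the field value can be compared when the exact plot location is uncertain.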
The predictive performance of the model was quantified using four statistical metrics, calculated with the UNet estimates as predicted values and the forest inventory datasets as observed values: Pearson correlation coefficient (r), coefficient of determination (R²), Mean Absolute Error (MAE) and Normalized Root Mean Square Error (nRMSE). The Pearson correlation coefficient (r) assesses the strength and direction of the linear relationship between predicted and observed values, with values closer to 1 indicating a strong positive correlation. The coefficient of determination (R²) represents the proportion of variance in the observed data explained by the model, with higher values suggesting better predictive performance and values close to 0 suggesting no predictive capability. Negative values can occur when the model is evaluated on data it was not fitted to, as is the case with independent forest inventories, and indicate that the predictions perform worse than simply using the mean of the observed values. MAE measures the average magnitude of errors between predicted and observed values, providing an indication of overall prediction accuracy. Finally, nRMSE is a normalized version of RMSE that expresses the prediction error relative to the mean observed value, facilitating comparisons across different study regions. It is calculated as:
$$\mathrm{nRMSE} = \frac{\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}}{\frac{1}{N}\sum_{i=1}^{N} y_i}$$
where $y_i$ are the observed values, $\hat{y}_i$ are the predicted values, and $N$ is the total number of observations. This formulation ensures that errors are assessed in proportion to the scale of the observed biomass values, enabling robust performance evaluation across biomes.
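The four metrics can be computed directly from paired lists of observed and predicted values. The sketch below follows the nRMSE definition given above (RMSE divided by the mean observed value) and assumes R² is defined as one minus the ratio of residual to total sum of squares, which is the definition that allows negative values.

```python
import math


def validation_metrics(observed, predicted):
    """Return Pearson r, R^2, MAE and nRMSE for paired observations."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n

    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))

    return {
        "r": cov / math.sqrt(ss_tot * ss_pred),
        "r2": 1.0 - ss_res / ss_tot,            # assumed definition of R^2
        "mae": sum(abs(o - p) for o, p in zip(observed, predicted)) / n,
        "nrmse": math.sqrt(ss_res / n) / mean_obs,
    }
```

Note that r and R² can disagree strongly: predictions that are perfectly correlated with the observations but systematically biased still give r = 1 while R² drops, possibly below zero.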
By combining multiple validation approaches with these statistical metrics, we ensured a thorough and reliable assessment of the model’s ability to estimate above-ground biomass density across different ecological regions.
Additionally, to evaluate the performance of the adapted UNet model against an established global biomass product, we used the ESA CCI AGB dataset [14] as a benchmark. This dataset offers global coverage of biomass estimates at a 100-m spatial resolution and provides data for multiple time points, making it the most comparable product to the approach presented in this study, despite its coarser resolution. The ESA CCI data were downloaded and processed to align with the temporal and spatial scales of our study regions. The comparison included both a qualitative and quantitative analysis. Continuous biomass maps from the ESA CCI dataset were compared to those generated by the adapted UNet model in representative areas of the four biomes. Furthermore, errors in the ESA CCI estimates were evaluated using the same inventory plots and validation metrics applied to the adapted UNet model. As the ESA CCI dataset has a coarser resolution than the UNet model estimations, the individual pixel containing the field inventory plot area was used for this comparison. This ensured a direct and consistent comparison of model performance. By assessing both spatial accuracy and quantitative agreement, this benchmark analysis provides insights into the relative strengths and limitations of the adapted UNet model in comparison to a widely used global dataset.
Moreover, we investigated the effects of resolution and methodology in a comparison between a regional and a global approach. Working under the assumption that regional mapping produces the best results for each biome, as it can optimize the extraction of information from the available sources depending on their characteristics, we compared the results of regional modeling with those of a single “global” model trained on patches from all four biomes. The trained global model was then used to predict AGBD for all biomes in the years in which in-situ data were available, and metrics were obtained and compared to those of the regional models. This comparison ensures a fair evaluation of regional versus global modeling approaches. Unlike the ESA CCI benchmark, which differs in methodology and resolution, this analysis directly compares models trained under consistent conditions, minimizing potential biases. The results are presented together in the benchmark comparison section.

3. Results

3.1. Variable Importance

Clear differences in both the relative and absolute importance of each variable were observed across biomes, as biome-specific models enhance feature extraction by leveraging the most relevant spectral and radar data for each ecosystem (Figure 2). When analyzing each biome in detail, distinct patterns emerged. In Quebec’s boreal forests, the Near Infrared (NIR) band showed the highest influence on predictions, reflecting its strong sensitivity to leaf biomass. Red-edge bands and the Sentinel-1 Vertical-Horizontal (VH) ascending pass contributed additional structural information, though to a lesser extent than NIR. This suggests that while multispectral data, particularly from the red-edge and NIR spectrum, play a dominant role, adding SAR data still provides useful information.
Mediterranean forests exhibited a unique bimodal distribution across all components, indicating that different patches prioritize different variables as key prediction drivers. Overall, SAR data in VH cross-polarization (especially from the descending pass), which is highly sensitive to volume scattering caused by the leaves and branches in a forest canopy, combined with NIR and red-edge bands, had the highest predictive impact on AGBD. With greater topographic and climatic gradients than in Quebec, Catalonia’s forests appear more sensitive to minor data variations, likely due to the region’s diverse ecosystem types, which comprise mostly Mediterranean forest and shrubland communities but also some temperate forests in the northernmost areas across the Pyrenees.
In Brazil’s tropical rainforest, Sentinel-2 data was the most influential predictor (Figure 2), with infrared bands, particularly red-edge and Short Wave Infrared (SWIR) playing a crucial role. SWIR, which reflects vegetation moisture content, had a greater impact here than in the previous biomes. SAR data is not best suited for AGBD estimation at high vegetation densities due to penetration capabilities of the higher frequency datasets used, such as Sentinel-1 C-band and PALSAR’s L-band. P-band, for instance, would be better suited for AGBD mapping, with penetration capabilities even in dense forests. Despite its limitations, PALSAR data, particularly in HV co-polarization, contributed more than Sentinel-1 due to its deeper canopy penetration, making it especially valuable for the model. The larger gradients in variable importance compared to Quebec are attributed to the broader range of AGBD values, requiring finer spectral distinctions to detect AGBD variations. Additionally, reliance on Sentinel-2 may explain the observed signal saturation at higher AGBD values.
Finally, in the semiarid savannas of Burkina Faso and Niger, multiple data sources hold similar importance. Notably, SWIR, NIR, and red bands dominate, suggesting that NDVI—composed of NIR and red bands for vegetation localization and vigor-, supplemented by SWIR for moisture drives predictions. Since canopy penetration is less critical in this biome, PALSAR adds little value compared to Sentinel-1. As in Quebec, the narrower range of AGBD values results in smaller gradients of variable importance, meaning the model detects AGBD variations primarily in response to more substantial spectral shifts.
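For readers less familiar with the index mentioned above, NDVI is the normalized difference between NIR and red reflectance, NDVI = (NIR − Red) / (NIR + Red). The following is a minimal sketch of that standard formula; the function name and the reflectance values are hypothetical and are not taken from the study, which feeds the individual bands to the model rather than a precomputed index.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    total = nir + red
    if total == 0:
        # The ratio is undefined here; returning 0.0 is a convention chosen
        # for this sketch only.
        return 0.0
    return (nir - red) / total


# Hypothetical surface reflectances (illustrative, not from the study)
dense_canopy = ndvi(nir=0.45, red=0.05)  # strong NIR reflectance, low red
bare_soil = ndvi(nir=0.25, red=0.20)     # similar NIR and red reflectance

print(f"dense canopy NDVI: {dense_canopy:.2f}")
print(f"bare soil NDVI:    {bare_soil:.2f}")
```

Sparse savanna vegetation tends to produce a strong contrast between vegetated and bare pixels in this index, which is consistent with the red and NIR bands ranking highly in that biome.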

3.2. GEDI Validation

The comparison of our predictions with GEDI data showed a high correlation between the distribution of data points over the study areas and the ability of the model to correctly predict AGBD throughout its range of values. Since the model is trained on patches, and each patch contains several data points that contribute to the loss driving model learning, the model rarely overfits individual points and instead tends to learn overall trends.
All biomes exhibited a similar pattern: a more accurate fit at low AGBD values, where GEDI data points are more abundant, and larger errors at higher AGBD values. In all biomes, the median errors differed by up to 4 orders of magnitude between the first bin and the last one (Figure 3). For reference, the 90th percentile of the GEDI AGBD data in each biome is 92.21 Mg ha−1 for Quebec, 125.91 Mg ha−1 for Catalonia, 284.81 Mg ha−1 for Brazil and 10.40 Mg ha−1 for Burkina Faso and Niger, as indicated by a vertical red dashed line in Figure 3. The best performance across the range of values was observed for Catalonia, where the drop in observations along the AGBD gradient, despite being steep, was well distributed across the territory and better captured by the model. Only the last three bins (245–350 Mg ha−1) show large errors with significant deviation. In terms of relative errors, the clearest decline in performance is seen in the semi-arid areas of Burkina Faso and Niger, where the number of data points close to 0 is orders of magnitude higher than elsewhere in the range. This provides a very good fit to the first bin (0–10 Mg ha−1, covering up to 90% of the available data points) but compromises the errors in the rest of the range of values. A similar pattern was observed for Quebec, where the number of GEDI data points is orders of magnitude higher at approximately 20–40 Mg ha−1 and then quickly drops. Only a few thousand points lie above 150 Mg ha−1, limiting the ability of the model to learn these patterns across the territory. Tropical rainforests, on the other hand, presented a more homogeneous distribution of values up to around 250 Mg ha−1, beyond which the number of data points declines, accompanied by a drop in performance. Above this threshold, errors reach median values over 200 Mg ha−1, suggesting areas where the signal is clearly saturated.
Combined with sensor saturation [50], this lack of data points at higher AGBD values is considered the main limitation, and it could also explain the drop in performance when comparing our predictions with in-situ data, as seen in Figure 4.
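The idea that several sparse reference points within a patch jointly drive the loss can be illustrated with a masked error computed only over labeled pixels. This is a minimal sketch under stated assumptions: the mean squared error, the function name `masked_mse`, the use of `None` for unlabeled pixels and the patch values are all hypothetical, since the loss actually used is not specified in this section.

```python
from typing import Optional, Sequence


def masked_mse(pred: Sequence[Sequence[float]],
               target: Sequence[Sequence[Optional[float]]]) -> float:
    """Mean squared error over pixels that carry a reference value.

    `target` holds None wherever no GEDI footprint falls on the pixel,
    so those pixels contribute nothing to the loss.
    """
    squared_errors = [
        (p - t) ** 2
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
        if t is not None
    ]
    if not squared_errors:
        raise ValueError("patch contains no labeled pixels")
    return sum(squared_errors) / len(squared_errors)


# Hypothetical 3x3 patch of predictions (Mg/ha) with three labeled pixels
pred = [[50.0, 60.0, 55.0],
        [40.0, 45.0, 70.0],
        [30.0, 35.0, 65.0]]
target = [[None, 62.0, None],
          [None, None, 64.0],
          [33.0, None, None]]

loss = masked_mse(pred, target)
print(f"masked MSE over 3 labeled pixels: {loss:.2f}")
```

Because the loss averages over all labeled pixels in a patch, no single footprint dominates the gradient, which matches the observation that the model tends towards overall trends rather than individual points.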
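The binned error analysis described in this section can be sketched as follows. This is an illustrative example only: the function name, the bin width and the sample values are assumptions, and the study's actual binning (which appears to differ per biome) is the one shown in Figure 3.

```python
from statistics import median
from typing import Dict, List, Sequence, Tuple


def median_error_by_bin(reference: Sequence[float],
                        predicted: Sequence[float],
                        bin_width: float) -> Dict[float, Tuple[int, float]]:
    """Group absolute errors by fixed-width bins of the reference AGBD.

    Returns {bin lower edge: (number of points, median absolute error)}.
    """
    errors_per_bin: Dict[float, List[float]] = {}
    for ref, pred in zip(reference, predicted):
        lower_edge = (ref // bin_width) * bin_width
        errors_per_bin.setdefault(lower_edge, []).append(abs(pred - ref))
    return {
        edge: (len(errs), median(errs))
        for edge, errs in sorted(errors_per_bin.items())
    }


# Hypothetical reference (GEDI-like) and predicted values in Mg/ha
reference = [2.0, 5.0, 8.0, 9.0, 12.0, 18.0, 25.0, 31.0]
predicted = [3.0, 5.0, 6.0, 9.0, 15.0, 12.0, 15.0, 14.0]

summary = median_error_by_bin(reference, predicted, bin_width=10.0)
for edge, (count, med) in summary.items():
    print(f"bin starting at {edge:5.1f}: n={count}, median |error|={med:.1f}")
```

Reporting the point count next to the median error makes the link between data scarcity and error growth at high AGBD visible in a single table.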

3.3. Forest Inventory Validation and Benchmark Comparison

Of the two approaches used to validate the UNet estimations against the forest inventory datasets, explained in Section 2.4, averaging across the pixels (method 3) showed better results than taking only the pixel at the central point of the plot (method 2), as it accounts for the full spatial variability within the plot area rather than relying on a single pixel. By integrating information from multiple pixels, this method reduced potential mismatches caused by geolocation uncertainties, edge effects, or heterogeneity within the plot. While this method did not improve model performance in Quebec and Catalonia, it improved nRMSE by 28% in Brazil and by 258% in Burkina Faso and Niger. Interestingly, in semi-arid areas, taking the maximum value among all pixels gave better results than the average value. This large improvement can be attributed to the dominance of isolated trees with high biomass, which exert a stronger influence on overall biomass estimation than smaller, scattered vegetation. By contrast, traditional averaging methods may underestimate AGBD in these environments due to the presence of large areas with little to no vegetation.
Scatter plots of observed (forest inventory plots) vs. predicted values (either ESA CCI or our regional UNet model) illustrate that the UNet estimations exhibit saturation at higher AGBD levels, limiting the model’s ability to predict accurate values beyond a certain threshold (Figure 4). However, predictions beyond GEDI’s 90th percentile for each biome (as indicated in Figure 3) are obtained across the territory, showing the model’s capability to generate predictions throughout the whole AGBD range. By contrast, ESA CCI reaches higher values and exhibits less saturation, potentially because it was calibrated with a wider range of AGBD values (lower panels in Figure 4). However, our model tends to fit the lower end of AGBD values more accurately, with ESA CCI tending to overestimate low AGBD in three out of four biomes. The exception was the semi-arid areas, where ESA CCI consistently predicted much lower values than observed in the inventories, proving incapable of detecting AGBD values over 3 Mg ha−1. Accordingly, R² values were higher for UNet than for ESA CCI in all the biomes.
Additionally, a qualitative comparison is shown in Figure 5: in all four biomes, our model is better able to distinguish heterogeneity of AGBD in forests, with clear changes in land cover, owing to its higher resolution. The granularity of our estimates at 10 m resolution allows us to detect details such as small clumps of sparse trees in semi-arid areas or larger vegetation patterns usually located along rivers. In the Mediterranean region, while ESA CCI captures the overall range of AGBD values, its predictions exhibit high variability, leading to a poorer fit in areas with mixed land cover, such as cropland, grasslands, urban zones, and forests. In boreal forests, the scatter plot (Figure 4) clearly shows that ESA CCI generally overpredicts AGBD and is less able to distinguish patterns near rivers and croplands, as in the Mediterranean region. Finally, ESA CCI has a more homogeneous distribution across the Amazon region and does not identify differences in vegetation density across the study areas. The overall results show a clear advantage of the higher resolution of our model.
Table 1 presents the metrics calculated for ESA CCI and for our estimations, from both the regional and the “global” model. Our models outperformed ESA CCI for all the biomes, whether using the regional or the global model, especially in those areas where ESA CCI had a very homogeneous or null signal, such as the tropical rainforest in Brazil and the semi-arid areas in Burkina Faso and Niger. Some biomes benefitted from global modeling to overcome the limitations in the range of values seen by the model, particularly Mediterranean forests, where the best metrics were always obtained by the global model, although by a slim margin when compared to the regional approach. Regional models in Brazil and Quebec were generally better, with R² clearly favoring the regional mapping approach (Table 1). Burkina Faso and Niger also clearly benefit from the model being tailored to the particularities of sparse tree cover rather than using a global model, and all the metrics of the UNet regional model improved on those of the global approach. Although the global model performs better than the benchmark in all cases, it has negative R² values for Tropical (Brazil) and Savanna (Burkina Faso and Niger) areas, suggesting no predictive capability of the model and reinforcing the need for regional models in these ecosystems.
A potential factor influencing the performance differences between regional and global models is the number of data points seen by the model for each biome. As explained in Section 2.6, the global model had greater exposure to training patches from Catalonia and Quebec, where its performance was slightly better. Another potential driver of the worse results in Brazil and in Burkina Faso and Niger is the distinctiveness of tropical rainforests and savannas, which differ significantly in structure and biomass distribution from the other biomes. Conversely, boreal forests and Mediterranean ecosystems share some similarities, particularly in terms of vegetation density and canopy height, which may explain the relatively better performance of the global model in those regions. Finally, the lack of L-band SAR data may also contribute to the decrease in performance of the global model in the savanna and tropical areas. Since PALSAR data was not available for all regions, the global model was trained only on Sentinel-1 (C-band SAR) and Sentinel-2 (multi-spectral) data.
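The difference between the plot-level aggregation strategies discussed above can be sketched on a single plot. This is a hedged illustration: the function name, the pixel values and the choice of a 3x3 window are hypothetical, and the actual pixel selection rules are those described in Section 2.4.

```python
from statistics import mean
from typing import Dict, Sequence


def plot_estimates(pixels: Sequence[float], center_index: int) -> Dict[str, float]:
    """Summarize the predicted AGBD of the pixels covering one inventory plot."""
    return {
        "central": pixels[center_index],  # single pixel at the plot center
        "mean": mean(pixels),             # average over all covering pixels
        "max": max(pixels),               # largest value among covering pixels
    }


# Hypothetical 3x3 window (flattened, Mg/ha) over a savanna plot:
# mostly bare ground with one isolated tree in a corner pixel
savanna_plot = [0.0, 0.0, 1.5, 0.0, 0.5, 0.0, 12.0, 0.0, 0.0]

estimates = plot_estimates(savanna_plot, center_index=4)
for name, value in estimates.items():
    print(f"{name:>7}: {value:.2f} Mg/ha")
```

In this toy case the central pixel and the average both stay low because most pixels are bare, while the maximum picks up the isolated tree, which mirrors the reasoning given for semi-arid areas.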
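To make the meaning of a negative R² concrete, the sketch below computes R², Pearson's r and a normalized RMSE on a toy example. Two assumptions are made loudly here: R² is taken as the coefficient of determination (1 − SS_res / SS_tot), which is the definition under which negative values can occur, and RMSE is normalized by the mean of the observations. The section does not state which normalization the study uses, and the data are hypothetical.

```python
from math import sqrt
from statistics import mean
from typing import Sequence


def rmse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Root mean square error between observations and predictions."""
    return sqrt(mean([(p - o) ** 2 for o, p in zip(obs, pred)]))


def nrmse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """RMSE divided by the mean observation (assumed normalization)."""
    return rmse(obs, pred) / mean(obs)


def r_squared(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Coefficient of determination; negative when worse than the mean."""
    obs_mean = mean(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - obs_mean) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def pearson_r(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Pearson correlation coefficient."""
    mo, mp = mean(obs), mean(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov / sqrt(var_o * var_p)


# Hypothetical plots (Mg/ha): predictions follow the trend exactly
# but carry a constant +30 bias
observed = [10.0, 20.0, 30.0, 40.0]
biased = [40.0, 50.0, 60.0, 70.0]

r = pearson_r(observed, biased)
r2 = r_squared(observed, biased)
err = nrmse(observed, biased)
print(f"r = {r:.2f}, R2 = {r2:.2f}, nRMSE = {err:.2f}")
```

The example shows that a perfectly correlated but biased prediction can still yield a negative R², because its squared errors exceed those of simply predicting the mean observation.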

4. Discussion

A key hypothesis in biomass estimation is that regional mapping, even when applied with the same methodology and datasets, performs better only in specific biomes, with global models providing better results in biomes that are well represented in their training data. However, our findings suggest that this assumption does not hold universally. Instead, regional models appear to provide benefits across various biomes, demonstrating their broader applicability and a general improvement over the chosen benchmark, ESA CCI. Note that, as stated in Section 3.3, both our regional and global models outperform ESA CCI in areas with homogeneous or null signals. This is potentially due to ESA CCI’s lower resolution, which limits its ability to capture high AGBD values in highly heterogeneous areas, such as the semi-arid savannas of Burkina Faso and Niger, but also due to its signal saturation around 200–300 Mg ha−1, as observed in tropical rainforests [51]. This highlights that the capability of the UNet model to learn overall vegetation patterns from GEDI, even when using a general approach that combines data from the 4 biomes, adds value when mapping AGBD. Our results also suggest that localized mapping approaches may be beneficial regardless of ecosystem type, emphasizing their potential for widespread use. Beyond accuracy, regional models offer flexibility, as they can be tailored to specific conditions by incorporating additional local datasets. It is important to note that in some cases, such as Catalonia, our study does suggest that global approaches can provide results that are as good as those of a regional model, which shows potential for global models in some areas.
Generally, the proposed modified UNet model is able to capture spatial hierarchies and contextual information, which is essential for accurate pixel-wise predictions but also for spatial coherence in complex forest biomass mapping scenarios, as seen in Figure 5. The UNet’s encoder-decoder structure facilitates the integration of multi-scale features, enhancing its ability to delineate intricate patterns within the biomass data. This reliance on contextual information can, however, lead to smoothed outputs that are potentially less precise at the single-pixel scale but more precise at larger scales. This may be one of the reasons for the relatively low explanatory power (R² = 0.119–0.396) observed across the four study regions. However, these values warrant a more nuanced interpretation. In this study, the R² values were derived from validations using completely independent ground-truth datasets, a methodology that often results in lower R² values compared to studies where models are both trained and tested on the same type of data [52]. This approach provides a more stringent assessment of model performance, as it evaluates the model’s predictive capability on truly unseen data, thereby offering a realistic measure of its generalizability. Conversely, some studies employ airborne laser scanning (ALS) data for validation [45], which involves comparing continuous maps to continuous maps rather than individual data points. This method typically yields higher R² values because it emphasizes overall spatial patterns, potentially overlooking discrepancies at finer scales. In our study, the focus was on the accuracy of individual pixels, a more granular approach that inherently leads to lower R² values but ensures a rigorous evaluation of model precision.
Results from the variable importance analysis show that the weight of each input dataset differs from biome to biome, particularly when there is less variability within them. Until now, the superior quality of regional mapping approaches compared to global models [53] has been largely due to their ability to integrate high-resolution inputs, local field measurements, and tailored parameterizations [54,55,56]. Such approaches offer high accuracy but lack reproducibility and comparability, while our method brings new insights into the predictability of each biome separately and the replicability of the methodology. In comparison, global models have excelled in capturing broad spatial patterns, but their training data often underrepresent certain biomes, limiting their predictive accuracy in regions with distinct vegetation structures.
There are still key limitations to the regional approach, such as capturing canopy heterogeneity or the saturation of signal at high AGBD levels [50]. Although the saturation can be attributed to constraints in the satellite sensors, our results indicate that the scarcity of high AGBD values in the GEDI dataset can result in insufficient training data for the model to learn and generalize to these higher biomass levels (Figure 3). Additionally, GEDI data itself has large uncertainties that need to be taken into account [57], such as geolocation, allometric equation model selection or high slope-driven errors. As seen in the validation of UNet predictions against GEDI data, our model is constrained by the input data points from LiDAR when providing estimates of AGBD at global scale. This implies that if only 5–10% of the data points are over a certain threshold (dependent on the biome), the model is less likely to estimate AGBD over that threshold, contributing to signal saturation at larger AGBD values. Another important aspect of the input data used to train models is its variability. One critical improvement for future implementations is shifting from politically defined regional models to biome-based models. This study attempted to bridge the gap between regional and global applicability of AGBD models mainly on the basis of biomes. However, since political regions are of high interest for mapping AGBD for carbon stock accounting, Catalonia was chosen as one of the regions even though it spans a small overlap of two different biomes. In this case, both Mediterranean and temperate forests coexist, and a single model may not optimally capture biomass variability across distinct vegetation zones. A more effective strategy would involve biome-specific model partitions, where models are trained separately for each ecological region within the administrative region, and then combined using ensemble techniques or hierarchical modeling. This would allow for more precise AGBD estimations that would better reflect ecological rather than administrative boundaries.
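One generic way to counter the scarcity of high-AGBD reference points discussed above is to weight each training point inversely to how common its AGBD range is. The sketch below illustrates that idea only; it is not something the study reports using, and the function name, bin width and sample values are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def inverse_frequency_weights(values: Sequence[float],
                              bin_width: float) -> List[float]:
    """Weight each sample by the inverse frequency of its AGBD bin.

    Weights are scaled so that they sum to the number of samples, which
    keeps the overall loss magnitude comparable to the unweighted case.
    """
    bins = [int(v // bin_width) for v in values]
    counts = Counter(bins)
    n_bins = len(counts)
    total = len(values)
    return [total / (n_bins * counts[b]) for b in bins]


# Hypothetical reference AGBD values (Mg/ha): eight low values, two high ones
agbd = [2.0, 4.0, 6.0, 8.0, 3.0, 7.0, 1.0, 9.0, 120.0, 250.0]

weights = inverse_frequency_weights(agbd, bin_width=100.0)
for value, weight in zip(agbd, weights):
    print(f"AGBD {value:6.1f} -> weight {weight:.3f}")
```

With this weighting each occupied bin contributes equally to the total, so the two high-biomass points are no longer outvoted by the eight low-biomass ones; whether this helps in practice would need to be tested against the GEDI uncertainties noted above.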
One important aspect of the validation carried out in this study is the representativity of the 4 different biomes and their harmonization. The applicability of AGBD maps such as the one we produced depends as much on the thoroughness of their validation as on their accuracy. That is why this study uses in-situ measurements as ground truth, even if AGBD measurements themselves have large uncertainties arising from potential human error, allometric equations and geolocation accuracy [58]. Overall, the efforts taken in the validation step, using not only the closest pixel but all the pixels falling in the potential area of coverage of the plot, align with well-established forestry practices [59]. This does imply, however, that the validation resolution of this model is lower than its actual pixel size and depends on the biome and its plot size. The information is nonetheless provided in more detail at 10-m pixels, offering insights into heterogeneity and land cover changes. Another way to improve our models and their effective resolution would be to account for potential geolocation errors of the GEDI footprints, which have been shown to greatly impact the predictions [58].
Considering the advantages but also the limitations of current regional approaches, our study introduces a standardized methodology for biomass mapping at a global scale while accounting for biome-specific variations. By structuring our approach to consider these nuances, we provide a scalable solution that bridges the gap between regional accuracy and global applicability. A key challenge in global biomass estimation is ecological diversity, with each vegetation type exhibiting distinct spectral signatures. This diversity, combined with different terrain conditions, affects the model’s ability to generalize and produce accurate AGBD estimates. As observed in the global model, this variability can affect performance in diverse landscapes, even when applying a regional approach, if the studied area contains a high diversity of vegetation types and terrain conditions. This challenge calls for adaptive strategies to improve the accuracy of the estimates. Additionally, our model framework allows for independent deployment in each region, enabling iterative improvements over time. This flexibility means that new datasets or missions, such as BIOMASS [60], NISAR [61], and additional GEDI data, can be integrated as they become available, supporting continuous model refinement. With research on GEDI data gaining traction and the mission continuing for the next few years, further improvement in GEDI data filtering is expected. This would further improve results, as GEDI constitutes the basis for this model and for most large-scale applications of global AGBD mapping. Furthermore, innovative model architectures, such as multi-output models, can enhance AGBD estimation by simultaneously predicting related variables like maximum height and canopy cover. Since both of these metrics, derived from GEDI data, serve as proxies for AGBD, their inclusion can improve the model’s overall accuracy. This approach ensures that biomass estimation remains dynamic and capable of adapting to future advancements in remote sensing and ecological monitoring.

5. Conclusions

This study aims to bridge the gap between regional AGBD mapping, which often relies on localized, non-replicable data such as ALS or proprietary datasets, and global AGBD datasets, which capture biome diversity but struggle to account for biome-specific variations. By using only open-source datasets, this approach offers a cost-effective, scalable solution for mapping AGBD across Earth’s land masses, with potential improvements as new missions and datasets enhance accuracy in forest structure and vegetation density mapping. Building on extensive research in recent years, we developed a model capable of global AGBD mapping while preserving the specificity of regional modeling. Our results highlight the advantages of regional mapping, demonstrating superior accuracy compared to the ESA CCI benchmark dataset and outperforming a globally trained version of our model. This advantage likely stems from models being able to learn biome-specific patterns more effectively and from tailoring training datasets to each biome’s characteristics. Further exploration of open-source datasets could help reduce uncertainties and errors, enhancing future models.
The primary limitations of our study stem from the quality and sufficiency of open-source datasets in accurately mapping AGBD across its full range of values within each biome. One notable challenge is signal saturation, which could be mitigated by upcoming missions specifically targeting vegetation. Since GEDI data forms the foundation for global AGBD training, careful curation is essential to ensure the highest quality dataset. Improvements in filtering and balancing GEDI data are necessary to ensure that underrepresented values are mapped with equal accuracy. Our findings reinforce the advantages of regional over global AGBD mapping, demonstrating that biome-specific modeling allows for better utilization of relevant information and enables models to adjust variable weights to suit each biome’s needs. Additionally, they point to potential adjustments to global modeling approaches that would better represent within-biome variability while simultaneously capturing biome diversity. With the insights derived from this study, AGBD mapping approaches can be further refined at both the global and regional levels.

Author Contributions

Conceptualization, M.P.-V., A.A., M.J.E. and L.R.; methodology, M.P.-V. and D.C.-G.; software, M.P.-V. and D.C.-G.; validation, M.P.-V., D.C.-G. and C.H.; formal analysis, M.P.-V. and D.C.-G.; investigation, M.P.-V. and D.C.-G.; resources, A.A., M.J.E. and L.R.; data curation, M.P.-V., D.C.-G. and C.H.; writing—original draft preparation, M.P.-V. and D.C.-G.; writing—review and editing, A.A., M.J.E., C.H. and L.R.; visualization, M.P.-V. and D.C.-G.; supervision, A.A. and M.J.E.; project administration, A.A., M.J.E. and L.R.; funding acquisition, M.P.-V., A.A., M.J.E. and L.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Industrial PhD grants AGAUR (2021 DI 121) and DIN2020-010982 financed by MCIN AEI 10.13039/501100011033 and by European Union “NextGenerationEU/PRTR”. Aitor Ameztegui is funded by a Serra-Húnter fellowship from Generalitat de Catalonia.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

Authors Martí Perpinyà-Vallès, Daniel Cendagorta-Galarza, Claudia Huertas and Laia Romero are employed by Lobelia Earth S.L. Author Maria José Escorihuela is employed by isardSAT S.L. All authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
AGBD: Above Ground Biomass Density
GEDI: Global Ecosystem Dynamics Investigation
SAR: Synthetic Aperture Radar
XAI: Explainable Artificial Intelligence
RMSE: Root Mean Square Error
ALS: Airborne Laser Scan
ESA CCI: European Space Agency’s Climate Change Initiative
(N)FI: (National) Forest Inventory
DEM: Digital Elevation Model
SCL: Scene Classification Layer
PALSAR: Phased Array type L-Band SAR
MAE: Mean Absolute Error

References

  1. Hunka, N.; Santoro, M.; Armston, J.; Dubayah, R.; McRoberts, R.E.; Næsset, E.; Quegan, S.; Urbazaev, M.; Pascual, A.; May, P.B. On the NASA GEDI and ESA CCI biomass maps: Aligning for uptake in the UNFCCC global stocktake. Environ. Res. Lett. 2023, 18, 124042.
  2. Feng, Y.; Ciais, P.; Wigneron, J.-P.; Xu, Y.; Ziegler, A.D.; van Wees, D.; Fendrich, A.N.; Spracklen, D.V.; Sitch, S.; Brandt, M. Global patterns and drivers of tropical aboveground carbon changes. Nat. Clim. Change 2024, 14, 1064–1070.
  3. Hagger, V.; Wilson, K.; England, J.R.; Dwyer, J.M. Water availability drives aboveground biomass and bird richness in forest restoration plantings to achieve carbon and biodiversity cobenefits. Ecol. Evol. 2019, 9, 14379–14393.
  4. Zeng, L.; Liu, X.; Li, W.; Ou, J.; Cai, Y.; Chen, G.; Li, M.; Li, G.; Zhang, H.; Xu, X. Global simulation of fine resolution land use/cover change and estimation of aboveground biomass carbon under the shared socioeconomic pathways. J. Environ. Manag. 2022, 312, 114943.
  5. Beaury, E.M.; Smith, J.; Levine, J.M. Global suitability and spatial overlap of land-based climate mitigation strategies. Glob. Change Biol. 2024, 30, 9.
  6. Rogelj, J.; Shindell, D.; Jiang, K.; Fifita, S.; Forster, P.; Ginzburg, V.; Handa, C.; Kheshgi, H.; Kobayashi, S.; Kriegler, E.; et al. Mitigation Pathways Compatible with 1.5 °C in the Context of Sustainable Development. In Global Warming of 1.5 °C; Cambridge University Press: Cambridge, UK, 2022; pp. 93–174.
  7. Tian, L.; Wu, X.; Tao, Y.; Li, M.; Qian, C.; Liao, L.; Fu, W. Review of Remote Sensing-Based Methods for Forest Aboveground Biomass Estimation: Progress, Challenges, and Prospects. Forests 2023, 14, 1086.
  8. Khan, M.N.; Tan, Y.; Gul, A.A.; Abbas, S.; Wang, J. Forest Aboveground Biomass Estimation and Inventory: Evaluating Remote Sensing-Based Approaches. Forests 2024, 15, 1055.
  9. Yang, Q.; Niu, C.; Liu, X.; Feng, Y.; Ma, Q.; Wang, X.; Tang, H.; Guo, Q. Mapping high-resolution forest aboveground biomass of China using multisource remote sensing data. GIScience Remote Sens. 2023, 60, 2203303.
  10. Su, Y.; Wu, Z.; Zheng, X.; Qiu, Y.; Ma, Z.; Ren, Y.; Bai, Y. Harmonizing remote sensing and ground data for forest aboveground biomass estimation. Ecol. Inform. 2025, 86, 103002.
  11. Rana, P.; Popescu, S.; Tolvanen, A.; Gautam, B.; Srinivasan, S.; Tokola, T. Estimation of tropical forest aboveground biomass in Nepal using multiple remotely sensed data and deep learning. Int. J. Remote Sens. 2023, 44, 5147–5171.
  12. Liu, S.; Brandt, M.; Nord-Larsen, T.; Chave, J.; Reiner, F.; Lang, N.; Tong, X.; Ciais, P.; Igel, C.; Pascual, A.; et al. The overlooked contribution of trees outside forests to tree cover and woody biomass across Europe. Sci. Adv. 2023, 9, eadh4097.
  13. Yang, Q.; Su, Y.; Hu, T.; Jin, S.; Liu, X.; Niu, C.; Liu, Z.; Kelly, M.; Wei, J.; Guo, Q. Allometry-based estimation of forest aboveground biomass combining LiDAR canopy height attributes and optical spectral indexes. For. Ecosyst. 2022, 9, 100059.
  14. Santoro, M.; Cartus, O. ESA Biomass Climate Change Initiative (Biomass_cci): Global Datasets of Forest Above-Ground Biomass for the Years 2010, 2015, 2016, 2017, 2018, 2019, 2020 and 2021, v5.01; NERC EDS Centre for Environmental Data Analysis: Chilton, UK, 2024.
  15. Xu, L.; Saatchi, S.S.; Yang, Y.; Yu, Y.; Pongratz, J.; Bloom, A.A.; Bowman, K.; Worden, J.; Liu, J.; Yin, Y.; et al. Changes in global terrestrial live biomass over the 21st century. Sci. Adv. 2021, 7, eabe9829.
  16. Harris, N.L.; Gibbs, D.A.; Baccini, A.; Birdsey, R.A.; de Bruin, S.; Farina, M.; Fatoyinbo, L.; Hansen, M.C.; Herold, M.; Houghton, R.A.; et al. Global maps of twenty-first century forest carbon fluxes. Nat. Clim. Change 2021, 11, 234–240.
  17. Zhang, Y.; Liang, S.; Yang, L. A Review of Regional and Global Gridded Forest Biomass Datasets. Remote Sens. 2019, 11, 2744.
  18. Koldasbayeva, D.; Tregubova, P.; Gasanov, M.; Zaytsev, A.; Petrovskaia, A.; Burnaev, E. Challenges in data-driven geospatial modeling for environmental research and practice. Nat. Commun. 2024, 15, 10700.
  19. Dubayah, R.; Armston, J.; Healey, S.P.; Bruening, J.M.; Patterson, P.L.; Kellner, J.R.; Duncanson, L.; Saarela, S.; Ståhl, G.; Yang, Z.; et al. GEDI launches a new era of biomass inference from space. Environ. Res. Lett. 2022, 17, 095001.
  20. Dubayah, R.O.; Armston, J.; Kellner, J.R.; Duncanson, L.; Healey, S.P.; Patterson, P.L.; Hancock, S.; Tang, H.; Bruening, J.; Hofton, M.A.; et al. GEDI L4A Footprint Level Aboveground Biomass Density, Version 2.1; ORNL DAAC: Oak Ridge, TN, USA, 2022.
  21. Musthafa, M.; Singh, G. Improving Forest Above-Ground Biomass Retrieval Using Multi-Sensor L- and C- Band SAR Data and Multi-Temporal Spaceborne LiDAR Data. Front. For. Glob. Change 2022, 5, 822704.
  22. Ghivarry, G.; Kutchartt, E.; Pirotti, F. Assessing the potential of polarimetric decomposition of Sentinel-1 SAR for the estimation of mangrove forest biomass. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2024, 48, 183–190.
  23. Chen, L.; Ren, C.; Bao, G.; Zhang, B.; Wang, Z.; Liu, M.; Man, W.; Liu, J. Improved Object-Based Estimation of Forest Aboveground Biomass by Integrating LiDAR Data from GEDI and ICESat-2 with Multi-Sensor Images in a Heterogeneous Mountainous Region. Remote Sens. 2022, 14, 2743.
  24. Sataudom, N.; Reangsang, S.; Navakam, S.; Manoonpong, P.; Aobpaet, A.; Sunthornhao, P.; Yooyen, N.; Pongsopon, M.; Chulinrak, N.; Soontaros, S.; et al. A Deep Learning Approach with Uncertainty Estimation to Assess Aboveground Biomass Mapping of Tropical Rainforest in Thailand. In Proceedings of the 2024 IEEE Mediterranean and Middle-East Geoscience and Remote Sensing Symposium (M2GARSS), Oran, Algeria, 15–17 April 2024; pp. 291–295.
  25. Li, H.; Hiroshima, T.; Li, X.; Hayashi, M.; Kato, T. High-resolution mapping of forest structure and carbon stock using multi-source remote sensing data in Japan. Remote Sens. Environ. 2024, 312, 114322.
  26. Zurqani, H.A. A multi-source approach combining GEDI LiDAR, satellite data, and machine learning algorithms for estimating forest aboveground biomass on Google Earth Engine platform. Ecol. Inform. 2025, 86, 103052.
  27. Sialelli, G.; Peters, T.; Wegner, J.D.; Schindler, K. AGBD: A Global-scale Biomass Dataset (Version 2). arXiv 2024, arXiv:2406.04928.
  28. Weber, M.; Beneke, C.; Wheeler, C. Unified Deep Learning Model for Global Prediction of Aboveground Biomass, Canopy Height and Cover from High-Resolution, Multi-Sensor Satellite Imagery (Version 2). arXiv 2024, arXiv:2408.11234.
  29. Loidi, J.; Navarro-Sánchez, G.; Vynokurov, D. A vector map of the world’s terrestrial biotic units: Subbiomes, biomes, ecozones and domains. Veg. Classif. Surv. 2023, 4, 59–61.
  30. Duncanson, L.; Kellner, J.R.; Armston, J.; Dubayah, R.; Minor, D.M.; Hancock, S.; Healey, S.P.; Patterson, P.L.; Saarela, S.; Marselis, S.; et al. Aboveground biomass density models for NASA’s Global Ecosystem Dynamics Investigation (GEDI) lidar mission. Remote Sens. Environ. 2022, 270, 112845.
  31. Rozendaal, D.M.A.; Requena Suarez, D.; De Sy, V.; Avitabile, V.; Carter, S.; Adou Yao, C.Y.; Alvarez-Davila, E.; Anderson-Teixeira, K.; Araujo-Murakami, A.; Arroyo, L.; et al. Aboveground forest biomass varies across continents, ecological zones and successional stages: Refined IPCC default values for tropical and subtropical forests. Environ. Res. Lett. 2022, 17, 014047.
  32. Louis, J.; L2A Team. Sen2Cor Algorithm Theoretical Basis Document (ATBD) Version 2.10.0 (S2-PDGS-MPC-L2A). European Space Agency. 2021. Available online: https://step.esa.int/thirdparties/sen2cor/2.10.0/docs/S2-PDGS-MPC-L2A-ATBD-V2.10.0.pdf (accessed on 20 November 2024).
  33. Louis, J. Sen2Cor Product Definition Document (PDD) Version 14.9-v4.9 (S2-PDGS-MPC-L2A). European Space Agency. 2021. Available online: https://step.esa.int/thirdparties/sen2cor/2.10.0/docs/S2-PDGS-MPC-L2A-PDD-V14.9-v4.9.pdf (accessed on 20 November 2024).
  34. Ministère des Ressources Naturelles et des Forêts. Placette-Échantillon Permanente, dans Données Québec, 2017, Mis à Jour le 8 Octobre 2024. Available online: https://www.donneesquebec.ca/recherche/dataset/placettes-echantillons-permanentes-1970-a-aujourd-hui (accessed on 15 November 2024).
  35. Magrama, M.A. Ministerio de Agricultura y Pesca, Alimentación y Medio Ambiente; Cuarto Inventario Forestal Nacional: Madrid, Spain, 2017; 78p.
  36. Dos-Santos, M.N.; Keller, M.M.; Pinage, E.R.; Morton, D.C. Forest Inventory and Biophysical Measurements, Brazilian Amazon, 2009–2018; ORNL DAAC: Oak Ridge, TN, USA, 2022.
  37. Perpinyà-Vallès, M.; Machefer, M.; Ameztegui, A.; Escorihuela, M.J.; Brandt, M.; Romero, L. Quantification of Carbon Stocks at the Individual Tree Level in Semiarid Regions in Africa. J. Remote Sens. 2024, 4, 0359.
  38. Chave, J.; Réjou-Méchain, M.; Búrquez, A.; Chidumayo, E.; Colgan, M.S.; Delitti, W.B.C.; Duque, A.; Eid, T.; Fearnside, P.M.; Goodman, R.C.; et al. Improved allometric models to estimate the aboveground biomass of tropical trees. Glob. Change Biol. 2014, 20, 3177–3190.
  39. Gonzalez-Akre, E.; Piponiot, C.; Lepore, M.; Herrmann, V.; Lutz, J.A.; Baltzer, J.L.; Dick, C.W.; Gilbert, G.S.; He, F.; Heym, M.; et al. allodb: An R package for biomass estimation at globally distributed extratropical forest plots. Methods Ecol. Evol. 2022, 13, 330–338.
  40. Longo, M.; Keller, M.; dos-Santos, M.N.; Leitold, V.; Pinagé, E.R.; Baccini, A.; Saatchi, S.; Nogueira, E.M.; Batistella, M.; Morton, D.C. Aboveground biomass variability across intact and degraded forests in the Brazilian Amazon. Glob. Biogeochem. Cycles 2016, 30, 1639–1660.
  41. Stelmaszczuk-Górska, M.A.; Urbazaev, M.; Schmullius, C.; Thiel, C. Estimation of Above-Ground Biomass over Boreal Forests in Siberia Using Updated In Situ, ALOS-2 PALSAR-2, and RADARSAT-2 Data. Remote Sens. 2018, 10, 1550.
  42. Banfield, G.E.; Bhatti, J.S.; Jiang, H.; Apps, M.J. Variability in regional scale estimates of carbon stocks in boreal forest ecosystems: Results from West-Central Alberta. For. Ecol. Manag. 2002, 169, 15–27.
  43. Ameztegui, A.; Rodrigues, M.; Granda, V. Uncertainty of biomass stocks in Spanish forests: A comprehensive comparison of allometric equations. Eur. J. Forest Res. 2022, 141, 395–407.
  44. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Medical Image Computing and Computer-Assisted Intervention, Proceedings of the MICCAI 2015, Munich, Germany, 5–9 October 2015; Navab, N., Hornegger, J., Wells, W., Frangi, A., Eds.; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2015; Volume 9351.
  45. Schwartz, M.; Ciais, P.; De Truchis, A.; Chave, J.; Ottlé, C.; Vega, C.; Wigneron, J.-P.; Nicolas, M.; Jouaber, S.; Liu, S.; et al. FORMS: Forest Multiple Source height, wood volume, and biomass maps in France at 10 to 30m resolution based on Sentinel-1, Sentinel-2, and Global Ecosystem Dynamics Investigation (GEDI) data with a deep learning approach. Earth Syst. Sci. Data 2023, 15, 4927–4945.
  46. Zanaga, D.; Van De Kerchove, R.; De Keersmaecker, W.; Souverijns, N.; Brockmann, C.; Quast, R.; Wevers, J.; Grosu, A.; Paccini, A.; Vergnaud, S.; et al. ESA WorldCover 10 m 2020 v100. 2021. Available online: https://zenodo.org/records/5571936 (accessed on 10 February 2025).
  47. Zanaga, D.; Van De Kerchove, R.; Daems, D.; De Keersmaecker, W.; Brockmann, C.; Kirches, G.; Wevers, J.; Cartus, O.; Santoro, M.; Fritz, S.; et al. ESA WorldCover 10 m 2021 v200. 2022. Available online: https://zenodo.org/records/7254221 (accessed on 10 February 2025).
  48. Simonyan, K.; Vedaldi, A.; Zisserman, A. Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv 2013, arXiv:1312.6034.
  49. Ma, T.; Zhang, C.; Ji, L.; Zuo, Z.; Beckline, M.; Hu, Y.; Li, X.; Xiao, X. Development of forest aboveground biomass estimation, its problems and future solutions: A review. Ecol. Indic. 2024, 159, 111653.
  50. Mutanga, O.; Masenyama, A.; Sibanda, M. Spectral Saturation in the Remote Sensing of High-Density Vegetation Traits: A Systematic Review of Progress, Challenges, and Prospects. ISPRS J. Photogramm. Remote Sens. 2023, 198, 297–309.
  51. Herold, M.; Araza, A. Product Validation Plan (PVP) v4. ESA Climate Office. 2023. Available online: https://climate.esa.int/media/documents/Product_Validation_Plan_PVP_v4.pdf (accessed on 10 February 2025).
  52. Gao, J. R-Squared (R2) – How much variation is explained? In Research Methods in Medicine. Health Sci. 2023, 5, 104–109.
  53. Cooley, S.S.; Pinto, N.; Becerra, M.; Alvarado, J.W.V.; Fahlen, J.C.; Rivera, O.; Fricker, G.A.; Dantas, A.R.D.L.R.; Aguilar-Amuchastegui, N.; Reygadas, Y.; et al. Combining spaceborne lidar from the Global Ecosystem Dynamics Investigation with local knowledge for monitoring fragmented tropical landscapes: A case study in the forest–agriculture interface of Ucayali, Peru. Ecol. Evol. 2024, 14, e70116.
  54. Qasim, M.; Csaplovics, E. AGB estimation using Sentinel-2 and Sentinel-1 datasets. Environ. Monit. Assess. 2024, 196, 299.
  55. Kanmegne Tamga, D.; Latifi, H.; Ullmann, T.; Baumhauer, R.; Bayala, J.; Thiel, M. Estimation of Aboveground Biomass in Agroforestry Systems over Three Climatic Regions in West Africa Using Sentinel-1, Sentinel-2, ALOS, and GEDI Data. Sensors 2022, 23, 349.
  56. David, R.M.; Rosser, N.J.; Donoghue, D.N.M. Improving above ground biomass estimates of Southern Africa dryland forests by combining Sentinel-1 SAR and Sentinel-2 multispectral imagery. Remote Sens. Environ. 2022, 282, 113232.
  57. Dorado-Roda, I.; Pascual, A.; Godinho, S.; Silva, C.; Botequim, B.; Rodríguez-Gonzálvez, P.; González-Ferreiro, E.; Guerra-Hernández, J. Assessing the Accuracy of GEDI Data for Canopy Height and Aboveground Biomass Estimates in Mediterranean Forests. Remote Sens. 2021, 13, 2279.
  58. Sun, M.; Cui, L.; Park, J.; García, M.; Zhou, Y.; Silva, C.A.; He, L.; Zhang, H.; Zhao, K. Evaluation of NASA’s GEDI Lidar Observations for Estimating Biomass in Temperate and Tropical Forests. Forests 2022, 13, 1686.
  59. Duncanson, L.; Hunka, N.; Jucker, T.; Armston, J.; Harris, N.; Fatoyinbo, L.; Williams, C.A.; Atkins, J.W.; Raczka, B.; Serbin, S.; et al. Spatial resolution for forest carbon maps. Science 2025, 387, 370–371.
  60. Quegan, S.; Le Toan, T.; Chave, J.; Dall, J.; Exbrayat, J.-F.; Minh, D.H.T.; Lomas, M.; D’Alessandro, M.M.; Paillou, P.; Papathanassiou, K.; et al. The European Space Agency BIOMASS mission: Measuring forest above-ground biomass from space. Remote Sens. Environ. 2019, 227, 44–60.
  61. Kellogg, K.; Hoffman, P.; Standley, S.; Shaffer, S.; Rosen, P.; Edelstein, W.; Dunn, C.; Baker, C.; Barela, P.; Shen, Y.; et al. NASA-ISRO Synthetic Aperture Radar (NISAR) Mission. In Proceedings of the 2020 IEEE Aerospace Conference, Big Sky, MT, USA, 7–14 March 2020; pp. 1–21.
Figure 1. Study areas in the four biomes, shown in the center of the figure according to the biome/sub-biome classification from [29]. For Quebec (a) and Catalonia (b), the distribution of all plots is shown. For Brazil (c) and Burkina Faso and Niger (d), a zoomed-in subset of the plots is shown.
Figure 2. Violin plots of variable importance for each biome, derived from the distributions of average gradients. Whiskers represent Q1 − 1.5 IQR and Q3 + 1.5 IQR.
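The whisker limits named in the Figure 2 caption can be sketched as follows. This is a minimal illustration, not the authors' plotting code: the function name `tukey_fences` and the toy input are hypothetical, and the quartile interpolation method ("inclusive") is an assumption, since the caption does not state which one was used.

```python
from statistics import quantiles


def tukey_fences(values):
    """Return (Q1 - 1.5 * IQR, Q3 + 1.5 * IQR) for a sequence of numbers."""
    # Assumption: "inclusive" quartiles (linear interpolation between data
    # points); other quartile conventions give slightly different limits.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


# Toy values standing in for one variable's average gradients (hypothetical).
lower, upper = tukey_fences([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(lower, upper)  # prints -3.0 13.0
```

Many plotting libraries draw whiskers to the most extreme data point inside these limits rather than to the limits themselves, so the drawn whisker length can be shorter than the values computed here.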
Figure 3. Boxplots of errors across GEDI value ranges for each biome, with the 90th percentile indicated by a red dashed line, and a histogram of the data distribution on a logarithmic scale.
Figure 4. Scatter plots of AGBD calculated from forest inventory data against UNet AGBD estimations (top) and ESA CCI estimations (bottom). Data points are color-coded by point density. The 1:1 line (red dashed) and the R² of each model are included.
Figure 5. Qualitative assessment of ESA CCI and UNet estimations.
Table 1. Performance metrics for ESA CCI and UNet estimations across different regions. Best performances are highlighted in bold.
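Region | ESA CCI: MAE | nRMSE | r | R² | UNet Regional: MAE | nRMSE | r | R² | UNet Global: MAE | nRMSE | r | R²
Brazil | 132.79 | 0.54 | 0.15 | −0.310 | 98.94 | 0.43 | 0.73 | 0.119 | 116.95 | 0.49 | 0.65 | −0.171
B. Faso & Niger | 29.21 | 1.25 | 0.36 | −1.688 | 12.44 | 0.59 | 0.77 | 0.396 | 17.62 | 0.76 | 0.46 | −0.018
Quebec | 46.64 | 0.87 | 0.59 | −0.172 | 31.87 | 0.68 | 0.73 | 0.296 | 33.45 | 0.67 | 0.68 | 0.198
Catalonia | 43.55 | 0.71 | 0.45 | 0.085 | 35.85 | 0.61 | 0.61 | 0.328 | 35.66 | 0.61 | 0.61 | 0.336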
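Table 1 pairs positive correlation coefficients with negative R² values, which is possible when R² is computed as a coefficient of determination (1 − SS_res/SS_tot) rather than as the square of r. The sketch below shows how the four metrics could be computed under that reading. It is an illustration only: the function name `agbd_metrics`, the toy numbers, and in particular the normalization of RMSE by the mean of the reference values are assumptions, not definitions taken from the article.

```python
import math


def agbd_metrics(observed, predicted):
    """Return MAE, nRMSE, Pearson r and R2 for paired reference/predicted values."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")

    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    residuals = [p - o for o, p in zip(observed, predicted)]

    mae = sum(abs(e) for e in residuals) / n
    rmse = math.sqrt(sum(e * e for e in residuals) / n)
    # Assumption: RMSE is normalized by the mean of the reference values.
    nrmse = rmse / mean_obs

    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    r = cov / math.sqrt(ss_obs * ss_pred)

    # Coefficient of determination: negative whenever the predictions fit
    # worse than simply predicting the mean of the reference values.
    ss_res = sum(e * e for e in residuals)
    r2 = 1.0 - ss_res / ss_obs

    return {"MAE": mae, "nRMSE": nrmse, "r": r, "R2": r2}


# Toy data (hypothetical): predictions are perfectly correlated but biased by +1.
metrics = agbd_metrics([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
print(metrics)
```

In the toy case the constant bias leaves r at 1 while R² drops to about 0.2, the same kind of gap between r and R² that appears in several rows of Table 1.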
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Perpinyà-Vallès, M.; Cendagorta-Galarza, D.; Ameztegui, A.; Huertas, C.; Escorihuela, M.J.; Romero, L. High-Resolution Aboveground Biomass Mapping: The Benefits of Biome-Specific Deep Learning Models. Remote Sens. 2025, 17, 1268. https://doi.org/10.3390/rs17071268


Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
