Differentiable Physical Modeling for Forest Above-Ground Biomass Retrieval by Unifying a Water Cloud Model and Deep Learning

Zhao, Cui; Shi, Rui; Ji, Yongjie; Zhang, Wei; Zhang, Wangfei; He, Xiahong; Zhao, Han

doi:10.3390/rs18060912

Open Access Article


by Cui Zhao ^1,2, Rui Shi ^1,2, Yongjie Ji ^3, Wei Zhang ^1, Wangfei Zhang ^1,2,*, Xiahong He ^1,2 and Han Zhao ^1,2

^1 Key Laboratory for Conservation and Utilization of In-Forest Resource of Yunnan, College of Forestry, Southwest Forestry University, Kunming 650224, China

^2 Key Laboratory for Forest Resources Conservation and Utilization in the Southwest Mountains of China, Ministry of Education, Southwest Forestry University, Kunming 650224, China

^3 College of Soil and Water Conservation, Southwest Forestry University, Kunming 650224, China

^* Author to whom correspondence should be addressed.

Remote Sens. 2026, 18(6), 912; https://doi.org/10.3390/rs18060912

Submission received: 13 February 2026 / Revised: 9 March 2026 / Accepted: 13 March 2026 / Published: 17 March 2026

(This article belongs to the Special Issue Advances in Estimating Aboveground Biomass Based on Multi-source Remote Sensing Data)


Highlights

What are the main findings?

The proposed DPM framework, which integrates the WCM with deep neural networks, achieved superior forest AGB retrieval performance in both subtropical and temperate forest regions, with R² values of 0.60 and 0.48, outperforming traditional physical models and purely data-driven approaches, and demonstrating strong generalization across northern and southern forests.
By incorporating automatic differentiation, DPM enables joint optimization of the WCM’s physical constraints and neural network parameters, preserving the physical plausibility of the model while flexibly capturing complex nonlinear relationships, thereby significantly reducing overfitting under limited training data.

What are the implications of the main findings?

DPM offers a novel framework for remote sensing-based forest AGB estimation, fully leveraging neural networks’ capacity to model complex nonlinear relationships while maintaining the interpretability of physical models, resulting in improved retrieval accuracy and stability.
This study was conducted using C-band SAR data. Due to its relatively short wavelength, C-band has limited penetration ability within dense vegetation canopies, and its backscattering signal is prone to saturation in areas with high biomass, which restricts the model’s inversion performance in complex forest environments. Future work could enhance this framework in two ways: first, by integrating longer wavelengths, such as L-band and P-band SAR data; second, by adopting physical models that better describe multiple scattering mechanisms, like the MIMICS model. This study offers a methodological reference for combining deep learning with physical scattering models, and these improvements are expected to further overcome the limitations of C-band in dense vegetation conditions.

Abstract

To address the limitations of traditional forest above-ground biomass (AGB) retrieval methods, namely the restricted accuracy of physical models and the limited generalization ability of purely data-driven models, this study proposes a differentiable physical modeling (DPM) approach for forest AGB estimation. The method adopts the water cloud model (WCM) as a physics-based framework, grounded in radiative transfer theory, and integrates C-band synthetic aperture radar (SAR) data with multispectral imagery. Within the PyTorch tensor computation framework, automatic differentiation (AD) is employed to couple the WCM with a deep fully connected neural network (DFCNN), enabling a differentiable implementation of the WCM. Using mean squared error (MSE) as the loss function, the neural network parameters are optimized through backpropagation and gradient descent, thereby constructing an end-to-end trainable DPM model that retrieves forest AGB while preserving physical interpretability and generalization capability. To validate the proposed method, two representative test sites were selected: Simao in Pu’er, Yunnan Province, and Genhe in Inner Mongolia. GF-3 PolSAR and RADARSAT-2 data were used to extract backscattering coefficients and compute the radar vegetation index (RVI), while Landsat 8 OLI imagery was employed to calculate the normalized difference vegetation index (NDVI), difference vegetation index (DVI), and soil-adjusted vegetation index (SAVI). These datasets, together with ASTER GDEM, field-measured biomass, and other relevant datasets, were integrated to construct a multisource dataset combining remote sensing and ground observations. The performance of the DPM model was then compared with the traditional WCM and several data-driven models, including the fully connected neural network (FNN), generalized regression neural network (GRNN), random forest (RF), and Adaptive Boosting (AdaBoost).
The results indicate that the DPM model achieved R² = 0.60, RMSE = 24.23 Mg/ha, Bias = 0.4 Mg/ha, and ubRMSE = 22.43 Mg/ha in Simao, and R² = 0.48, RMSE = 33.29 Mg/ha, Bias = 0.87 Mg/ha, and ubRMSE = 33.28 Mg/ha in Genhe, demonstrating consistently better performance than both the WCM and all tested data-driven models. The DPM model demonstrated consistent performance across ecologically contrasting forest regions. It alleviated the systematic overestimation bias of purely data-driven models and overcame the limitations in predictive accuracy resulting from the simplified structure of the WCM. The differentiability of the WCM enables the loss function errors to be backpropagated through the neural network, thereby allowing the optimization of the physical model parameters. Overall, the DPM framework integrates the advantages of both physical models and data-driven approaches, providing an estimation method with acceptable accuracy for forest AGB retrieval. It also offers theoretical and practical insights for the integration of deep learning and physical knowledge in other research fields.

Keywords:

AGB; end-to-end differentiable physical modeling; water cloud model; deep learning; physics–machine fusion

1. Introduction

Forests are among the most vital global carbon reservoirs, playing a key role in regulating climate change, maintaining ecosystem stability, and achieving the “dual-carbon” targets [1,2]. Forest AGB is a crucial indicator of carbon sequestration capacity and an essential parameter for assessing forest carbon budgets; accurate estimation of forest AGB is therefore critical for evaluating forest carbon sinks and achieving carbon neutrality [3]. Traditional biomass measurement methods mainly rely on field surveys, which are limited by labor, resources, and spatial coverage, making it hard to meet the demands of large-scale, high-precision dynamic monitoring. In recent years, remote sensing technologies have become a key tool for regional and global forest AGB estimation due to their broad coverage and short revisit cycles [4,5]. In particular, combining SAR with optical imagery offers an effective way to estimate forest AGB [6].

With the continuous progress of SAR technology, the ability to acquire remote sensing data has dramatically improved in terms of spatial resolution, polarization diversity, and observation geometry, thereby increasing the demands on modeling methods. As a result, machine learning (ML) techniques, known for their strong nonlinear modeling abilities and data-driven nature, have been increasingly used for estimating forest AGB. ML algorithms can automatically identify intrinsic patterns in datasets with complex relationships and multi-scale features, enabling highly accurate predictions even when the underlying mechanisms are not fully understood [7,8]. For example, Nandy et al. [9] combined data from ICESat-2, Sentinel-1, and Sentinel-2 to estimate forest AGB in the Himalayan region of India using a random forest (RF) model, achieving an R² of 0.83 and an RMSE of 4.64 Mg/ha, demonstrating the robustness of the RF algorithm for in situ forest structure retrieval. Zhang et al. [10] systematically compared eight ML regression algorithms—including gradient boosting decision tree (CatBoost), RF, and extremely randomized trees (ERT)—for forest AGB estimation using various remote sensing data. Their findings showed that ensemble learning algorithms generally performed better than non-ensemble methods, with the CatBoost model providing the highest accuracy across multiple experiments (R² = 0.71, RMSE = 46.7 Mg/ha). Similarly, Zhang et al. [11] built multiple ML models based on Sentinel-2 imagery. They used the Boruta algorithm to select key variables and compared the performance of conventional RF, regularized random forest (RRF), and quantile random forest (QRF) models. The results revealed that the optimal ensemble QRF model delivered the best estimation performance (R² = 0.88, RMSE = 29.56 Mg/ha), effectively reducing biases like overestimation and underestimation across different biomass ranges. 
Although ML can enhance local estimation accuracy by capturing complex nonlinear relationships, its generalization ability remains limited due to the lack of physical constraints [12]. Additionally, ML models heavily depend on large, high-quality training samples, making them prone to overfitting and producing seemingly high evaluation metrics [13]. Moreover, forest AGB surveys face challenges related to terrain, climate, and labor conditions, which hinder the collection of sufficient ground truth data over large areas and complicate adaptation to highly heterogeneous forest environments [14].

Unlike purely data-driven models, physical models are based on radiative transfer theory and directly simulate the interactions between microwaves and forest vegetation or the underlying surface, including volume scattering, ground scattering, and multiple scattering processes [15,16]. Numerous studies have shown that physical models can effectively capture the relationships between forest physical parameters and radar signals, leading to good performance in forest AGB retrieval. For example, Kumar et al. [17] extended the traditional WCM using fully polarimetric L-band ALOS PALSAR data, developing an extended water cloud model (EWCM) for forest AGB retrieval. After applying polarization orientation angle (POA) compensation, the accuracy of the EWCM improved significantly, with R² rising from 0.36 to 0.78 and RMSE dropping to 59.77 Mg/ha. This method greatly reduced reliance on field samples and provided a new pathway for high-resolution, physically interpretable forest AGB estimation. Santoro et al. [18] addressed systematic errors in the traditional WCM caused by simplifying assumptions by proposing a LiDAR-assisted WCM framework with allometric parameterization. The study established the tree height–volume allometric parameters needed for the WCM using vegetation height and canopy density data from ICESat/GLAS LiDAR, combined with ground measurements from the national forest inventory (NFI). The results showed that the improved WCM accurately predicted SAR backscatter values; although pixel-level estimates showed some dispersion, the mean estimated volume over areas larger than 0.3 ha was nearly unbiased and closely matched NFI data (R² = 0.67, rRMSE = 21.4%, Bias = 0.9 m³/ha). However, the accuracy of physical models is constrained by simplifying assumptions, parameter sensitivity, input data uncertainty, and the complexity of parameter calibration, which hinders the full utilization of large-scale SAR data. As a result, their retrieval accuracy generally falls short of that achieved by data-driven algorithms [19,20].

Several studies have attempted to integrate the two approaches to fully leverage the respective strengths of physical models and data-driven algorithms, aiming to improve the retrieval accuracy of Earth-related scientific parameters. Among such hybrid modeling approaches, the “differentiable modeling (DM)” method proposed by Shen et al. [21] has been widely applied in soil moisture retrieval and hydrological modeling research in recent years. This method emphasizes the unification of physical models and data-driven models through differentiable programming, providing a coherent theoretical framework for their integration. For instance, El Hajj et al. [22] acquired a series of SAR (TerraSAR-X and COSMO-SkyMed) and optical (SPOT 4/5 and LANDSAT 7/8) images over irrigated grasslands in southeastern France. They employed a method combining a multilayer perceptron (MLP) neural network with the WCM to retrieve soil moisture. The results indicated that the HH polarization combined with the NDVI achieved the best performance, with rRMSE controlled at 3.6 Vol% and soil moisture retrieval realized at 8 m spatial resolution. Li et al. [23] acquired Sentinel data for four study areas, including the Luanhe and Shandianhe River basins, Maqu, and Lake Tahoe. They constructed a DM model by integrating the WCM with deep learning, achieving soil moisture retrieval at 10 m resolution. The results showed that the DM model outperformed purely physical and machine learning models across the four regions. Furthermore, in evaluating model extrapolation capability, the DM model exhibited strong generalization and superior retrieval performance compared to other retrieval approaches. Abbes et al. [24] explored the advantages of combining physical modeling and machine learning techniques for soil moisture estimation.
The study demonstrated that integrating the two approaches significantly enhances the accuracy and efficiency of soil moisture estimation, particularly in regions with uncertain input data. This hybrid model exhibits better generalization, handles complex relationships, integrates multiple data sources, reduces model uncertainty, and ultimately improves the reliability and accuracy of soil moisture estimation. Feng et al. [25] reconstructed the traditional physical hydrological model HBV into a differentiable and learnable process model, namely the δ-model, and used embedded neural networks to parameterize or enhance the physical modules within the model. Using 671 watersheds in the United States as the study area, the results showed that the δ-model achieved a Nash–Sutcliffe efficiency (NSE) coefficient of 0.732 for streamflow prediction accuracy. In contrast to physics-informed neural networks (PINNs), which embed physical equations into the loss function as regularization terms to impose soft constraints on model outputs toward physical consistency, the proposed DPM framework adopts a distinct physics–data integration strategy. It reconstructs the WCM as an integral part of the differentiable computational graph, positioning the physical model as an inherent module along the forward propagation path. This enables architectural-level hard coupling between the physical model and the neural network. This conceptual difference yields two key advantages: First, the physical constraint is transformed from an implicit regularization term into an explicit computational structure, ensuring that predictions are necessarily processed through the physical model. Second, intermediate physical variables become directly observable outputs rather than being implicitly embedded in the loss function, thereby enhancing both physical consistency and model interpretability.

However, existing strategies that combine the WCM with machine learning methods still exhibit certain limitations. Most approaches adopt a two-stage sequential architecture, in which the physical model and the machine learning module are structurally separated, and the information flow is unidirectional. In addition, once the WCM parameters are calibrated, they are typically fixed, limiting the model’s adaptability to different forest environments. The current integration is also constrained by the relatively simplified physical structure of the WCM, which does not fully account for multiple scattering and absorption mechanisms of electromagnetic waves within the vegetation canopy, thereby restricting its applicability in regions with complex forest structures. The differentiable physical modeling framework reconstructs the physical model through differentiable programming and integrates it into a unified computational graph with deep neural networks, forming an end-to-end collaborative optimization framework. Its successful applications in hydrological models highlight its potential for enhancing model scalability. Unlike most hydrological models characterized by temporal discretization [26,27], the WCM employed for forest AGB retrieval is based on the radiative transfer equation. Whether the scientific assumptions underlying the WCM for forest microwave scattering scenarios are compatible with a combined physical–neural network structure, and whether this approach is effective and generalizable for forest AGB retrieval, have not yet been systematically investigated. To address this question, this study applies the concept of DPM to forest AGB remote sensing retrieval for the first time, constructing a differentiable inversion framework by coupling the WCM with deep neural networks. 
The feasibility of the proposed DPM framework is explored in representative subtropical and cold-temperate forest types in China using C-band SAR data from GF-3 and RADARSAT-2 together with optical remote sensing data.

2. Materials and Methods

2.1. Overview of the Study Areas

2.1.1. Simao Study Area

The Simao study area is located in Pu’er City, Yunnan Province (22°27′–24°05′N, 98°19′–102°27′E), within the transitional zone from tropical to subtropical climates. The total area of the study region is approximately 24,960 m². Forests cover about 74.3% of the area, and the mean annual temperature is around 18 °C, with annual precipitation ranging from 1500 to 2000 mm. Forests are relatively continuous, but some portions of the landscape are interspersed with agricultural land and settlements, resulting in high spatial heterogeneity. The forest management regime includes both natural forests and mixed-use areas; the overall ecosystem structure is largely intact and human disturbance is limited. The terrain is characterized by significant relief, with elevations ranging from 500 to 2000 m. The landscape is dominated by low to middle mountains and hills, exhibiting a high degree of geomorphological complexity. The predominant forest type is subtropical evergreen broadleaf–conifer mixed forest. Dominant species include Pinus kesiya, Castanopsis fargesii, and Schima superba. The forest structure is complex and vertically stratified, typically comprising tree, sub-tree, and shrub layers. Canopy height generally ranges from 15 to 25 m, and stand density is moderate. The dominant stands are mainly middle-aged and mature. Forest productivity is relatively high and ecological conditions are favorable, so the overall forest AGB is comparatively high. Therefore, the Simao area serves as a representative subtropical forest site for AGB retrieval in southern China.

2.1.2. Genhe Study Area

The Genhe Ecological Station is located in Genhe City, Hulunbuir, Inner Mongolia Autonomous Region (50°20′–52°30′N, 120°12′–122°55′E), in the northern part of the Greater Khingan Mountains. The region belongs to a typical cold-temperate continental monsoon climate zone. The total area of the study region is approximately 23,300 m². Forest coverage exceeds 80%, and forest stands are relatively continuous, with large-scale coniferous forests dominating the landscape. The degree of landscape fragmentation is low. Forest management primarily consists of nature reserves and low-intensity utilization areas, with limited human disturbance and well-preserved ecosystem conditions. The terrain is mainly composed of mountains and hills, with elevations ranging from approximately 600 to 1500 m. The overall relief is moderate, and geomorphological complexity is lower than that of southern mountainous regions. Although local slope variations exist, the general terrain undulation is relatively uniform. The dominant forest type is cold-temperate light coniferous forest. The primary species is Larix gmelinii, accompanied by Pinus sylvestris var. mongolica and Betula platyphylla. Forest structure is relatively simple, with limited vertical stratification. Canopy height generally ranges from 7 to 17 m, and stand density is moderate. The dominant stands are mainly mature and overmature forests. Forest productivity is lower than that of subtropical forests in southern China, and the overall forest AGB is at a moderate level. Therefore, the Genhe study area serves as a representative region for forest AGB retrieval in cold temperate forests in northern China. The locations of sample plots in the two study areas are shown in Figure 1a,b.

2.2. Remote Sensing Data

2.2.1. GF-3 PolSAR Data

GF-3 is China’s first C-band fully polarimetric SAR satellite, launched in August 2016. It is equipped with an advanced SAR system that supports full polarization, multiple imaging modes, and high spatial resolution, providing robust data for applications such as land cover monitoring, resource investigation, and disaster assessment [28]. In this study, one scene of single-look complex (SLC) GF-3 PolSAR data acquired on 1 December 2020 was used, with detailed information provided in Table 1. Data preprocessing included radiometric calibration, multilooking, speckle filtering, and geocoding. The original GF-3 data were first imported into the IDL environment, where the raw data format was converted into SLC format and radiometrically calibrated. Multilooking was then performed using three looks in both the range and azimuth directions. Speckle noise was suppressed using the Lee filter. Finally, geocoding was conducted using ASTER DEM data. The preprocessing flowchart is shown in Figure 2. The resulting Pauli RGB composite is shown in Figure 3a. The main symbols and their units used in this study are listed in Appendix A.

2.2.2. RADARSAT-2 Data

RADARSAT-2 is a high-resolution spaceborne SAR commercial satellite equipped with a C-band sensor, jointly developed by the Canadian Space Agency (Ottawa, Canada) and Macdonald, Dettwiler and Associates Ltd. (MDA, Richmond, Canada). The satellite provides all-weather, day-and-night imaging capability, independent of cloud cover and solar illumination. It supports 18 imaging modes, including a fully polarimetric mode, delivers spatial resolution up to the meter level, and has a revisit period of 2–3 days [29,30]. RADARSAT-2 data are widely used in studies such as surface deformation monitoring, vegetation structure analysis, forest biomass inversion, and resource and environmental surveys. For this study, one scene of fully polarimetric stripmap-mode SLC data covering the Genhe experimental area was acquired through a commercial agency. Detailed information is listed in Table 1. Preprocessing of the RADARSAT-2 data included data import, radiometric calibration, multilook processing, Boxcar filtering, and geocoding. The raw data were first imported into PolSARpro 5.3 software to convert the original format into SLC format. The RADARSAT-2 data had been radiometrically calibrated prior to download. Multilook processing was applied with three looks in both range and azimuth directions. To reduce speckle noise, a Boxcar filter was used for smoothing. Finally, geocoding was performed using a coordinate lookup table derived from the radar image and a digital elevation model (DEM). The preprocessing flowchart is shown in Figure 2. The resulting Pauli RGB composite is shown in Figure 3b.

2.2.3. Landsat-8 OLI Data

The Landsat-8 satellite marks a significant advancement in Earth observation remote sensing technology and is jointly operated by the National Aeronautics and Space Administration (NASA) and the United States Geological Survey (USGS) [31]. It is equipped with two primary sensors: the operational land imager (OLI) and the thermal infrared sensor (TIRS), providing a total of 11 spectral bands. Specifically, the OLI offers 9 multispectral bands, while the TIRS contains 2 independent thermal infrared bands [32]. The Landsat-8 data used in this study were sourced from the Geospatial Data Cloud platform of the Computer Network Information Center, Chinese Academy of Sciences (https://www.gscloud.cn). Multispectral imagery acquired by the OLI sensor was selected for analysis. The preprocessing of the Landsat-8 OLI imagery included radiometric correction, FLAASH atmospheric correction, and image mosaicking and cropping. All preprocessing steps were performed in ENVI version 5.6. The preprocessing flowchart is shown in Figure 2.

Image quality is crucial for accurate vegetation index calculations. In the Pu’er study area, the warmest months are May and June, when light and thermal conditions are ideal, and accumulated temperature is high, creating optimal conditions for tropical and subtropical vegetation growth; this period marks the peak growing season. Therefore, the Landsat-8 OLI image with relatively low cloud cover was chosen for this study, as it offers clear canopy spectral information and helps minimize potential errors. The Genhe study area is situated in a cold-temperate zone, where the forest growing season mainly spans from May to September. Four Landsat-8 OLI scenes were chosen, capturing the early and late phases of the growing season. All these images have less than 10% cloud cover and superior image quality compared to August scenes. By mosaicking these images, the entire study region can be fully covered. All selected images were taken in the same year as the field survey, allowing them to accurately represent the forest canopy vegetation features of that year, which helps ensure precise vegetation index calculations and reliable biomass estimates. The detailed information of the Landsat-8 OLI images for the two study areas is provided in Table 2.

2.3. Ground Survey Data

2.3.1. Pu’er Ground Data

The ground survey in the Pu’er region was conducted over a 20-day period in December 2020, during which 52 standard sample plots, each measuring 20 m × 20 m, were established. The four corners and the center of each plot were positioned using differential GPS, with positioning accuracy within 20 cm. After plot establishment, a tree-by-tree survey was carried out within each plot using the following protocols. Diameter at breast height (DBH) was measured at 1.3 m above ground on the uphill side of the trunk using a diameter tape, and all trees with a DBH greater than 5 cm were recorded. Tree height was measured using a Blume-Leiss hypsometer (BLH; Leiss-Berlin, Berlin, Germany): an observation point was selected at a horizontal distance of 20 m from the base of the target tree, the hypsometer scale was set to that distance, and, with the position unchanged, the tree base and the treetop were sighted in turn; the instrument then gave the total tree height based on the two sighting angles and the horizontal distance. On sloping terrain or where the line of sight was obstructed, observation points with a clear view and relatively gentle topography were selected to ensure measurement accuracy. Crown width was measured directly with a tape as the widest spread of the crown along the north–south and east–west directions, and the average of the two was taken. Canopy density was estimated visually from the projected crown area. Forest type and plot elevation were also recorded. Based on the dominant tree species, the plots were classified into pure Pinus kesiya var. langbianensis forest plots, broad-leaved forest plots, and mixed coniferous–broadleaf forest plots. The AGB of individual trees of each species was calculated using the allometric equations listed in Table 3. The AGB values of all individual trees within each plot were then summed to obtain the total AGB per plot, which was divided by the plot area (0.04 ha) to derive the AGB per unit area for that plot.

2.3.2. Genhe Ground Data

The ground survey in the Genhe study area was conducted in August 2013, covering 52 fixed square plots, each measuring 20 m × 20 m. Plot locations were determined using differential GPS to record the coordinates of all four corners, with a positioning accuracy within 1 m. Within each plot, all trees were measured individually, recording DBH, total tree height, height to the lowest branch, crown width, canopy cover, and relative tree coordinates. The tree height and height to the lowest branch were measured using a laser hypsometer (TruPulse 360B; Laser Technology Inc., Centennial, CO, USA). Stand-level attributes, such as ground cover, shrub and herb species composition, and vegetation height, were also documented. Forest AGB for the plots was calculated using the same allometric approach as described for the Pu’er study area. The dominant species in the Genhe plots were Larix gmelinii and Betula platyphylla, and their forest AGB was estimated using the allometric equations provided in Equations (1) and (2).

M_{\text{Larix gmelinii}} = 0.0277 \, \mathrm{DBH}^{2.79326}

(1)

M_{\text{Betula platyphylla}} = 0.01905 \, \mathrm{DBH}^{2.24322}

(2)
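As a rough illustration of the plot-level calculation, the sketch below applies Equations (1) and (2) tree by tree, sums the results, and divides by the plot area. It assumes that DBH is given in cm and that the equations return biomass in kg per tree (the units are not stated alongside the equations); the function names and the sample DBH values are hypothetical.

```python
# Coefficients (a, b) of M = a * DBH**b, taken from Equations (1) and (2).
ALLOMETRY = {
    "Larix gmelinii": (0.0277, 2.79326),
    "Betula platyphylla": (0.01905, 2.24322),
}

PLOT_AREA_HA = 0.04  # a 20 m x 20 m plot


def tree_biomass(species, dbh):
    """Biomass of one tree. Assumption: DBH in cm, result in kg."""
    a, b = ALLOMETRY[species]
    return a * dbh ** b


def plot_agb_mg_per_ha(trees, plot_area_ha=PLOT_AREA_HA):
    """Sum tree biomass over a plot and convert kg per plot to Mg/ha."""
    total_kg = sum(tree_biomass(species, dbh) for species, dbh in trees)
    return total_kg / 1000.0 / plot_area_ha


# Hypothetical tally for a single plot: (species, DBH).
example_plot = [
    ("Larix gmelinii", 18.0),
    ("Larix gmelinii", 22.5),
    ("Betula platyphylla", 12.0),
]
print(f"{plot_agb_mg_per_ha(example_plot):.2f} Mg/ha")
```

A real plot would contain many more stems; the three-tree tally only shows the data flow from individual measurements to a per-hectare value.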

Figure 4 illustrates the distribution of forest AGB across the sample plots in the Genhe and Pu’er study areas. The forest AGB in Genhe ranged from 13.8 to 204.12 Mg/ha, while in Pu’er, it varied from 70.31 to 192.11 Mg/ha. The overall biomass in Genhe is relatively low, mainly due to its climate zone and forest type: Genhe is located in the northern cold temperate coniferous forest zone, characterized by a short growing season, a high proportion of younger forests, and a relatively simple forest structure. In contrast, Pu’er is situated in the southern subtropical evergreen broadleaf forest and mixed plantation zone, where forests are more mature, leading to faster biomass accumulation. The figure also indicates the maximum and minimum forest AGB values for each study area, providing a clear representation of the variation in forest AGB.

2.4. Methods

2.4.1. Water Cloud Model

In the microwave scattering process within forested environments, the dominant scatterers vary significantly depending on the wavelength. In this study, the SAR data used primarily have wavelengths ranging from 3 cm to 30 cm. The main scatterers at these wavelengths are leaves, small branches, and larger twigs, whose sizes are comparable to the incident wavelengths, resulting in Mie scattering. For C-band SAR, the scattering is dominated by canopy leaves, and resonance effects within Mie scattering are pronounced [33,34]. In this study, we selected the semi-empirical WCM as the physical model. This model assumes that the scattering mechanisms in forested areas include two components: (1) canopy vegetation scattering attenuated by extinction and (2) ground surface scattering also attenuated by the forest canopy [35,36]. The dielectric properties that govern scattering in the forest canopy derive primarily from the water content of the leaves and branches. The canopy is modeled as a collection of uniformly distributed suspended water droplets, and the forest canopy biomass to be estimated is introduced [37,38]. A schematic of the forest scattering mechanism described in this study is shown in Figure 5, and Equation (3) provides the mathematical formulation of the WCM. Substituting the definitions in Equations (4)–(6) into Equation (3) gives the compact form in Equation (7), and solving Equation (7) for biomass yields the retrieval model in Equation (8), in which forest AGB is the dependent variable.

σ = σ_{forest} [1 - e^{-β W_{fobs}}] + σ_{ground} e^{-β W_{fobs}}

(3)

β_{0} = σ_{forest}

(4)

β_{1} = σ_{ground} - σ_{forest}

(5)

β_{2} = -β = -2k \sec θ

(6)

σ = β_{0} + β_{1} e^{β_{2} W_{fobs}}

(7)

W_{fpre} = \frac{1}{β_{2}} \ln (\frac{σ - β_{0}}{β_{1}})

(8)

In Equation (3), σ denotes the total forest backscattering coefficient. In this study, the total forest backscattering coefficient in the WCM was substituted with the backscattering coefficients derived from GF-3 PolSAR and RADARSAT-2 data (σ_{HH}, σ_{HV}, σ_{VH}, σ_{VV}). σ_{forest} represents the theoretical volume backscattering component from the forest canopy, σ_{ground} represents the theoretical ground backscattering component, β is an empirical parameter, W_{fobs} denotes the observed AGB, and W_{fpre} denotes the predicted AGB. In Equation (6), k indicates the extinction coefficient, and θ represents the radar incidence angle.
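The algebra linking Equations (3) and (8) can be checked numerically. The following is a minimal sketch, not the authors' implementation: the parameter values are hypothetical, the function names `wcm_forward` and `wcm_invert` are introduced here for illustration, and backscatter is assumed to be expressed in linear units rather than dB.

```python
import math

def wcm_forward(agb, sigma_forest, sigma_ground, beta):
    """Equation (3): total backscatter for a given AGB."""
    attenuation = math.exp(-beta * agb)
    return sigma_forest * (1.0 - attenuation) + sigma_ground * attenuation

def wcm_invert(sigma, beta0, beta1, beta2):
    """Equation (8): AGB recovered from backscatter and the regrouped parameters."""
    ratio = (sigma - beta0) / beta1
    if ratio <= 0.0:
        raise ValueError("(sigma - beta0) / beta1 must be positive for the logarithm")
    return math.log(ratio) / beta2

# Hypothetical values chosen only for illustration (not taken from the study).
sigma_forest, sigma_ground, beta = 0.08, 0.02, 0.015
agb_true = 120.0  # Mg/ha

sigma = wcm_forward(agb_true, sigma_forest, sigma_ground, beta)
beta0 = sigma_forest                 # Equation (4)
beta1 = sigma_ground - sigma_forest  # Equation (5)
beta2 = -beta                        # Equation (6)
agb_back = wcm_invert(sigma, beta0, beta1, beta2)
print(agb_back)
```

Under these assumptions the inversion returns the biomass that was fed into the forward model, up to floating-point rounding. The guard in `wcm_invert` reflects the fact that Equation (8) is only defined when the argument of the logarithm is positive.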

2.4.2. Deep Fully Connected Neural Network

Within the DPM framework, the DFCNN serves as the machine learning parameterization module that learns the data-driven relationships between the input features and the key physical parameters (β₀, β₁, and β₂) of the WCM. Traditional parameter estimation methods, such as empirical formulas or nonlinear least-squares techniques, often struggle to accurately characterize the complex nonlinear relationships between model parameters and remote sensing features in heterogeneous forest environments [39,40]. In this study, the normalized input features are fed into the DFCNN for forward propagation to generate these physical parameters, which are subsequently passed to the WCM to compute the AGB predictions. The entire workflow is implemented with the PyTorch tensor framework in Python 3.10, enabling end-to-end automatic differentiation and joint optimization of the network and the physical model. To ensure physical plausibility and numerical stability, nonlinear activation functions, including tanh and Softplus, are applied in the output layer: tanh constrains the range of β₀, and Softplus guarantees that β₁ and β₂ remain positive, thereby preventing nondifferentiable points and numerical instability during physical model computation. As a parameterization component of the differentiable physical modeling framework, the DFCNN is jointly optimized with the WCM, achieving an integrated balance between physical consistency and nonlinear modeling capability. The network architecture and training configuration are shown in Table 4.
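The effect of the output-layer activations can be illustrated in isolation. The sketch below uses only the Python standard library rather than PyTorch, and the helper name `constrain_outputs` and the raw input values are hypothetical; it shows only how three unconstrained network outputs would be mapped to the ranges described above.

```python
import math

def softplus(x):
    """Numerically stable softplus, log(1 + exp(x)); positive for moderate x."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def constrain_outputs(raw0, raw1, raw2):
    """Map three unconstrained outputs to (beta0, beta1, beta2)."""
    beta0 = math.tanh(raw0)  # bounded between -1 and 1
    beta1 = softplus(raw1)   # kept positive
    beta2 = softplus(raw2)   # kept positive
    return beta0, beta1, beta2

# Hypothetical raw outputs of the last linear layer.
b0, b1, b2 = constrain_outputs(-3.0, -2.0, 4.0)
print(b0, b1, b2)
```

Both activations are smooth, so gradients can pass through them during backpropagation. A hard clipping of the parameters would instead have zero gradient outside the allowed range.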

2.4.3. Principle and Method of DPM

Fundamental Principle of DPM

The DPM model can be regarded either as a machine learning model guided by structural priors and optimized within a relatively small search space, or as a process-based model supported by learnable components with a larger search space. Figure 6 illustrates the fundamental principles of DPM, which organically integrates process (a) and process (b), representing a methodological fusion between pure data-driven learning and traditional physical modeling. In the DPM model, the introduction of prior knowledge of physical structure effectively constrains the search space of the neural network, thereby enhancing the model’s physical consistency and interpretability (process a). Meanwhile, by leveraging the expressive power of neural networks, the model extends the limited search scope of conventional physical models, achieving an optimal balance between accuracy and generalization capability (process b). In Figure 6, the background represents the constructed cost function landscape (CFL), which illustrates the search capabilities of the three types of models in the function space. The background color indicates the theoretically achievable minimum cost value under conditions of infinite data; darker colors correspond to lower cost values and stronger model performance. The red pentagram marks the theoretical global optimum. Therefore, compared with traditional methods, the DPM approach combines both structural priors and flexibility—effectively narrowing the search space while maintaining strong expressiveness—thus making it more capable of approaching the optimal solution in complex inference problems.

Data for DPM Training Process

In this study, the required datasets are categorized into three types according to the training process of the DPM model, which are listed in Table 5. The Landsat-8 OLI data, ASTER GDEM data, and SMCI1.0 auxiliary climate data were all resampled to the same spatial resolution as the SAR data using bilinear interpolation before data extraction, ensuring that all input features were precisely aligned in the spatial dimension.

The datasets were categorized as follows: (1) Satellite imagery, including vegetation indices from Landsat-8 OLI, backscattering coefficients, incidence angles, and radar vegetation indices from GF-3 and RADARSAT-2, as well as Advanced Spaceborne Thermal Emission and Reflection Radiometer Global Digital Elevation Model (ASTER GDEM) data; (2) in situ measurements, primarily comprising field-measured forest AGB, vegetation types such as evergreen broadleaf forest, Simao pine pure forest, conifer–broadleaf mixed forest, and broadleaf pure forest, and topographic factors including elevation, latitude, and longitude; and (3) auxiliary data, including climate factors such as temperature and precipitation, extracted from the 1 km daily soil moisture and meteorological dataset of China for one scene in December 2020 for the Pu’er study area and one scene in August 2013 for the Genhe study area, provided by the National Tibetan Plateau Data Center.

Model Construction Process

Differentiable physical modeling is an active research area that seeks a balance between purely data-driven and physics-based methods. The core idea of this approach is to be physics-guided and data-assisted, that is, to achieve a deep integration of physical knowledge and data-driven methodologies [41,42].

The DPM model in this study adopts the WCM as the structural backbone, while the empirical parameters within the WCM are predicted by a neural network [43]. The schematic diagram illustrating the internal construction differences among the three model types from X to y is shown in Figure 7. A purely data-driven machine learning model can directly establish the mapping relationship from X to y, even without understanding the underlying physical retrieval mechanism. In contrast, physical models are typically composed of a series of continuous physical processes. For instance, the semi-empirical WCM used in this study describes the mechanism of radiation transmission between the canopy and the ground surface. Since each physical subprocess is differentiable, differentiable physical modeling allows the entire model to be decomposed into several differentiable components g_{n} (where some submodules, such as g_{1} and g_{3}, represent unknown processes). For these unknown or empirical processes, neural networks can be used for parametric modeling (e.g., the NN_{1} module parameterizes the physical process g_{1} in the diagram), thereby enhancing the model’s adaptability and data-learning capability while maintaining physical consistency. The overall architecture of the DPM model constructed in this study is illustrated in Figure 8. Because both the neural network model and the WCM are differentiable, the modeling process can be expressed as a differentiable composite function, which is formulated as Equation (9).

y = f (g_{1}, g_{2})

(9)

where y denotes the target variable, namely the forest AGB to be estimated; g_{1} denotes the Machine Learning Component (Parameter Regionalization), a neural network-based parameterization module that trains on the input variables and outputs the physical parameters (β_{0}, β_{1}, β_{2}) required by the physical model; and g_{2} denotes the Differentiable Process-based Model-WCM Component, a partially differentiable physical submodule that takes the physical parameters and σ_{HH}, σ_{HV}, σ_{VH}, σ_{VV}, θ, k as inputs to perform the physical process mapping and derives the target variable y through the physical Equation (8).

The calculation of the target variable depends on observational data and certain parameters that are difficult to measure directly. For parameters that are hard to observe or model, this study uses a DFCNN to predict the key parameters and combines it with the traditional WCM to develop an end-to-end differentiable physical modeling framework. During the model’s forward pass, the neural network’s predicted parameters are fed into the differentiable physical model. The entire computational process is represented by Equation (10).

y = f(NN^{w}(g_{A}(X)), WCM(σ, k, θ, β))

(10)

where w denotes the weights of the neural network, X represents the attributes associated with the WCM parameters β during the forward modeling process, and g_{A} denotes the process of parameterizing the WCM using the DFCNN. Since the outputs of the neural network constitute part of the physical model parameters, the entire differentiable physical modeling framework for forest AGB retrieval can be expressed as Equation (11).

y = f(WCM(σ, k, θ, NN^{w}(g_{A}(X))))

(11)

The DFCNN does not directly predict forest AGB; instead, it acts as a supplementary module for the physical model parameters within the retrieval process. This approach not only preserves the theoretical constraints of the physical model, enhancing its physical consistency and interpretability, but also leverages the strong nonlinear mapping ability of neural networks to effectively compensate for errors caused by parameter uncertainty in the physical model [44,45]. The implementation of the function f in Equation (11) relies on AD, which connects symbolic and numerical differentiation. AD decomposes complex functions into a sequence or computational graph of basic operations (e.g., addition, multiplication, exponentiation, logarithm). The derivatives of these fundamental operations are then combined through the chain rule to compute the derivative of the entire function [46,47]. In this study, the WCM computational graph within the function f consists of continuously differentiable operations such as subtraction, division, logarithm, and multiplication. The entire computational process from input variables to the target variable is differentiable. Moreover, the WCM is differentiable with respect to its physical parameters, as the functions \frac{\partial AGB}{\partial β_{0}}, \frac{\partial AGB}{\partial β_{1}}, and \frac{\partial AGB}{\partial β_{2}} exist and are continuous [48]. Both the physical model and the neural network are constructed in differentiable forms, ensuring good overall differentiability throughout the retrieval process. The chain rule is applied as shown in Equation (12), and the MSE is used as the loss function, as formulated in Equation (13).

\frac{\partial Loss}{\partial w} = \frac{\partial Loss}{\partial AGB} \cdot \frac{\partial AGB}{\partial β} \cdot \frac{\partial β}{\partial w}

(12)

Loss(w) = \frac{1}{4N} \sum_{i = 1}^{N} \sum_{j = 1}^{4} {(W_{f, j}^{pre, i} - W_{f, j}^{obs, i})}^{2}

(13)

where N denotes the number of training samples, W_{f, j}^{pre, i} represents the predicted biomass for the j-th polarization channel of the i-th sample, and W_{f, j}^{obs, i} is the measured biomass for the j-th polarization channel of the i-th sample. w denotes the set of parameters in the neural network, including the weight matrices and bias terms of each layer. Equation (12) expands the gradient of the loss function with respect to w into three chain-rule components: the partial derivative of the loss function with respect to the estimated forest AGB, the partial derivative of AGB with respect to the key parameters of the physical model, and the partial derivative of the physical parameters with respect to w. During the training phase, the gradient is backpropagated from the output layer to the input layer according to the chain rule, sequentially passing from the loss function to the key parameters in the physical model, and then further to the neural network, enabling iterative updates of w.
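The middle factor of Equation (12), the derivative of AGB with respect to the WCM parameters, follows directly from Equation (8). The sketch below writes out these partial derivatives by hand and compares them with central finite differences. It is an illustration only: all numerical values are hypothetical, the function names are introduced here, and a single sample with a squared-error loss stands in for Equation (13).

```python
import math

def agb_from_wcm(sigma, b0, b1, b2):
    """Equation (8)."""
    return math.log((sigma - b0) / b1) / b2

def analytic_partials(sigma, b0, b1, b2):
    """Hand-derived dAGB/dbeta0, dAGB/dbeta1, dAGB/dbeta2 of Equation (8)."""
    agb = agb_from_wcm(sigma, b0, b1, b2)
    return (
        -1.0 / (b2 * (sigma - b0)),
        -1.0 / (b2 * b1),
        -agb / b2,
    )

def numeric_partials(sigma, b0, b1, b2, rel_step=1e-5):
    """Central finite differences, used only to check the analytic result."""
    params = [b0, b1, b2]
    grads = []
    for i, p in enumerate(params):
        h = rel_step * abs(p)
        up, down = list(params), list(params)
        up[i] += h
        down[i] -= h
        diff = agb_from_wcm(sigma, *up) - agb_from_wcm(sigma, *down)
        grads.append(diff / (2.0 * h))
    return tuple(grads)

# Hypothetical values for one sample and one polarization channel.
sigma, b0, b1, b2 = 0.5, 0.1, 0.2, 0.01
agb_obs = 60.0

agb_pred = agb_from_wcm(sigma, b0, b1, b2)
d_loss_d_agb = 2.0 * (agb_pred - agb_obs)  # first factor of Equation (12)
d_loss_d_beta = [d_loss_d_agb * g for g in analytic_partials(sigma, b0, b1, b2)]
print(d_loss_d_beta)
```

An AD framework performs the same bookkeeping automatically and then multiplies by the third factor, the derivative of each parameter with respect to the network weights, which depends on the network architecture and is omitted here.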

To sum up, the core of the DPM lies in the deep integration of physical models and deep neural networks through differentiable programming, with its backpropagation mechanism being essential to achieving this integration. The forward propagation of the model generates physical parameters of the WCM based on multi-source remote sensing features, which are then used to compute biomass predictions. The prediction error is obtained through a loss function. The error gradient is propagated backwards through the WCM to the DFCNN, updating its weights and enabling the DFCNN to progressively learn to output physical parameters that conform to the laws of microwave radiative transfer. This physics-guided mechanism endows the DPM model with both the interpretability of physical models and the nonlinear modeling capability of deep learning, ensuring that the AGB retrieval results satisfy both data-driven statistical patterns and physical consistency.
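The forward and backward passes summarized above can be sketched with a deliberately small toy problem. This is not the DFCNN of this study: a single linear unit `raw = w * x + b` stands in for the network, only β₂ is learned, the logarithmic term of Equation (8) is held at a fixed hypothetical value, the data are synthetic, and biomass is in arbitrary rescaled units. The gradients are written out by hand following Equation (12) instead of relying on an AD library.

```python
import math

def softplus(x):
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train(features, log_ratio, agb_obs, lr=0.1, steps=200):
    """Fit raw = w * x + b so that log_ratio / softplus(raw) matches agb_obs."""
    w, b = 0.0, 0.0
    n = len(features)
    history = []
    for _ in range(steps):
        loss, grad_w, grad_b = 0.0, 0.0, 0.0
        for x, obs in zip(features, agb_obs):
            raw = w * x + b
            beta2 = softplus(raw)          # toy "network" output, kept positive
            agb = log_ratio / beta2        # Equation (8) with the log term fixed
            loss += (agb - obs) ** 2 / n   # mean squared error
            d_loss_d_agb = 2.0 * (agb - obs) / n
            d_agb_d_beta2 = -agb / beta2
            d_beta2_d_raw = sigmoid(raw)   # derivative of softplus
            chain = d_loss_d_agb * d_agb_d_beta2 * d_beta2_d_raw
            grad_w += chain * x            # d raw / d w = x
            grad_b += chain                # d raw / d b = 1
        history.append(loss)
        w -= lr * grad_w                   # plain gradient descent step
        b -= lr * grad_b
    return w, b, history

# Synthetic data generated from known weights (hypothetical values).
features = [-1.0, 0.0, 1.0, 2.0]
log_ratio = 1.0
true_w, true_b = 0.5, 0.0
agb_obs = [log_ratio / softplus(true_w * x + true_b) for x in features]

w_fit, b_fit, history = train(features, log_ratio, agb_obs)
print(history[0], history[-1], w_fit, b_fit)
```

The point of the sketch is that the weights are never compared with target parameter values. They are updated only through the biomass error, which reaches them by passing backwards through the physical equation.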

2.4.4. Model Comparison and Evaluation

To evaluate the performance of the DPM framework in forest AGB retrieval, the results were compared with those from other retrieval models. To assess the physical model’s ability to conduct forest AGB retrieval without machine learning, this study directly applies the WCM for this purpose. Serving as a baseline for the physical model, the parameters in the WCM are obtained through nonlinear least squares. To analyze the performance of purely data-driven approaches that lack physical constraints regarding retrieval accuracy and generalization ability, two representative deep learning models were used: the FNN and the GRNN. The rationale for choosing these two models is as follows: FNN, a classic deep learning architecture, offers strong nonlinear approximation ability and scalability [49]; GRNN, a statistical non-parametric method, demonstrates good generalization performance with moderate sample sizes [50]. Their differences in modeling mechanisms and data dependency help reveal the performance of purely data-driven methods in terms of accuracy and generalization. Additionally, to expand the comparison framework and further validate the modeling capabilities of different machine learning methods, two widely used ensemble learning algorithms were included: RF and AdaBoost. RF performs retrieval through multiple independent regression trees, while AdaBoost iteratively adjusts the weights of weak learners to improve predictive performance [51].

The five methods described above represent different modeling paradigms, including physical modeling, deep learning, and classical machine learning. Their retrieval performance is systematically compared with the end-to-end DPM approach proposed in this study, which integrates physical mechanisms with differentiable programming. To ensure that each model achieves optimal performance under reasonable conditions, the other models adopt preprocessing strategies consistent with their own characteristics and are evaluated using the same metrics. This allows for a comprehensive assessment of the applicability, accuracy, and physical interpretability of each method in forest biomass retrieval. To ensure experimental comparability, the semi-empirical WCM, machine learning models, and deep learning models are all trained using the same feature variables and training dataset as the DPM model. Furthermore, all models, including the semi-empirical model, undergo necessary parameter tuning and performance optimization.

In addition, to comprehensively evaluate model generalization, this study uses random sampling validation to compare the six models. The quantitative analysis relies on four statistical indicators: the coefficient of determination (R²), bias, root mean square error (RMSE), and unbiased root mean square error (ubRMSE). R² is at most 1 (it can fall below 0 when the predictions fit worse than the mean of the observations); values near 1 indicate higher retrieval accuracy. Bias is the average difference between predicted and observed values and shows whether the model generally overestimates or underestimates; values closer to zero indicate less systematic error. RMSE measures the deviation between predictions and observations; a lower RMSE means better accuracy. ubRMSE shows the spread of the prediction errors after removing the bias; smaller values indicate that predictions are tightly clustered around the observations. These metrics are calculated as shown in Equations (14)–(17).

R^{2} = 1 - \frac{\sum_{i = 1}^{n} {(Y_{i} - y_{i})}^{2}}{\sum_{i = 1}^{n} {(Y_{i} - \bar{y})}^{2}}

(14)

Bias = \frac{1}{n} \sum_{i = 1}^{n} (y_{i} - Y_{i})

(15)

RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - Y_{i})}^{2}}

(16)

ubRMSE = \sqrt{RMSE^{2} - Bias^{2}}

(17)

where n is the number of validation sample points, Y_{i} refers to the biomass observed at the i-th sample point, y_{i} refers to the biomass predicted at the i-th sample point, and \bar{y} refers to the mean observed biomass across all sample points.
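Equations (14)–(17) translate directly into a few lines of code. The sketch below is one possible implementation, with a hypothetical function name `evaluate` and made-up sample values; the `max(..., 0.0)` guard is an addition that protects the square root in Equation (17) against tiny negative values caused by floating-point rounding.

```python
import math

def evaluate(observed, predicted):
    """Return R2, Bias, RMSE and ubRMSE following Equations (14)-(17)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    bias = sum(p - o for o, p in zip(observed, predicted)) / n
    rmse = math.sqrt(ss_res / n)
    ubrmse = math.sqrt(max(rmse ** 2 - bias ** 2, 0.0))
    return {"R2": r2, "Bias": bias, "RMSE": rmse, "ubRMSE": ubrmse}

# Made-up values in Mg/ha: every prediction is 10 Mg/ha too high.
observed = [100.0, 120.0, 140.0, 160.0]
predicted = [110.0, 130.0, 150.0, 170.0]
metrics = evaluate(observed, predicted)
print(metrics)
```

In this constructed case the error is purely systematic, so Bias and RMSE are both 10 Mg/ha and ubRMSE is 0, while R² is 0.8. The example shows why the four metrics are reported together: RMSE alone cannot separate a constant offset from random scatter.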

3. Results

3.1. Analysis of the Forest AGB Loss Function

Figure 9a,b illustrate the training and validation loss curves of the DPM model in the Pu’er and Genhe study areas, respectively, showing how the errors evolve with the number of iterations.

Figure 9a shows that in the Pu’er study area, the initial loss is relatively high, but it decreases rapidly within the first 200 iterations, indicating that the model can significantly reduce prediction errors with fewer iterations. As the iterations progress, the rate of loss reduction slows down and stabilizes at a relatively steady level. The training and validation loss curves follow very similar trends, with minimal difference between them. Figure 9b shows that in the Genhe study area, the initial loss is comparatively lower. In the early stages, the loss decreases gradually with each iteration, and a more noticeable decline occurs in the middle and later stages (around the 1300th iteration), after which it stabilizes. Both the training and validation loss remain highly consistent throughout the entire training process, without significant fluctuations.

In summary, despite the significant differences in climate conditions, vegetation types, and biomass distribution between the southern and northern study areas of China, the DPM model exhibits a consistent loss trend in both regions. The close alignment between the training and validation errors indicates that the model is able to converge steadily across these different areas, demonstrating consistent performance across ecologically contrasting regions under independent training.

3.2. Visualization of Intermediate Variable Updates via the Backpropagation Mechanism

The DPM model effectively leverages the advantages of both physical and machine learning models, combining the generalization capability of data-driven approaches with the interpretability of physical models, specifically through the output of intermediate variables—the parameters of the physical model. Unlike purely data-driven models, DPM not only predicts the target retrieval variable but also provides access to intermediate variables that enhance the model’s interpretability.

Although the model is trained in an integrated end-to-end manner, it not only outputs the retrieval results of the target variable, forest AGB, but also produces the intermediate variables of the WCM. As shown in Figure 10a,b, the DPM framework demonstrates how the DFCNN updates the weights and biases of each hidden layer during training using gradient descent and the chain rule of backpropagation based on the evolving loss values. Overall, Figure 10a,b show that these updates gradually decay and stabilize as the number of training epochs increases, reflecting the progressive stabilization of the parameters and the corresponding updates of the intermediate variables.

These results not only validate the numerical convergence and stability of the training process but also indirectly confirm the effectiveness of the backpropagation mechanism in updating the neural network parameters. In the context of the DPM model, this mechanism further ensures reliable estimation of intermediate variables, providing a solid optimization basis for building physically constrained remote sensing retrieval models.

3.3. Random Sampling Validation

The model performance was evaluated using a random sampling validation approach, which offers high computational efficiency and faster training. To assess the effectiveness of the models in both the southern and northern study areas of China, the datasets were randomly divided, with 60% of the samples used for training and 40% for validation. After training, key evaluation metrics (R², RMSE, Bias, and ubRMSE) were calculated on the validation set to quantify the models’ predictive accuracy and stability. The results of the random sampling validation for the Simao and Genhe study areas are presented in Table 6 and Table 7. The relationships between the predicted and observed AGB values for each model in both study areas are shown in Figure 11 and Figure 12.
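A random 60/40 split of this kind can be written in a few lines. The sketch below is illustrative only: the function name `random_split`, the fixed seed, and the placeholder plot identifiers are assumptions, and the study's actual sampling procedure is not reproduced here.

```python
import random

def random_split(samples, train_fraction=0.6, seed=0):
    """Shuffle sample indices once and cut them into training and validation sets."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = round(train_fraction * len(samples))
    train = [samples[i] for i in indices[:n_train]]
    valid = [samples[i] for i in indices[n_train:]]
    return train, valid

# Placeholder plot identifiers, not the study's field plots.
plots = [f"plot_{i:02d}" for i in range(50)]
train_plots, valid_plots = random_split(plots)
print(len(train_plots), len(valid_plots))
```

Fixing the seed makes the partition repeatable, so that all compared models can be trained and validated on the same plots.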

3.4. Statistical Significance Testing

To assess whether the performance advantages of the DPM model over the comparison models were statistically significant, paired-sample t-tests were conducted. The tests used the absolute prediction errors of the validation sets from the two study areas. Each study area included 21 validation samples. To control for multiple comparisons, a Bonferroni correction was applied with α = 0.01. The statistical results are summarized in Table 8 and Table 9. In these tables, MD denotes the mean of the paired differences, SD denotes the standard deviation of the differences, t is the t-test statistic, df is the degrees of freedom, and p is the significance level; * p < 0.05 indicates a significant difference, ** p < 0.01 indicates a highly significant difference, and *** p < 0.001 indicates an extremely significant difference.
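The quantities reported for a paired-sample t-test (MD, SD, t, and df) can be computed as follows. This is a minimal sketch with made-up absolute errors and a hypothetical function name; it stops at the test statistic and does not compute the p-value.

```python
import math

def paired_t_statistic(errors_a, errors_b):
    """Return (MD, SD, t, df) for paired absolute errors of two models."""
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    md = sum(diffs) / n                                          # mean of paired differences
    sd = math.sqrt(sum((d - md) ** 2 for d in diffs) / (n - 1))  # sample standard deviation
    t = md / (sd / math.sqrt(n))
    return md, sd, t, n - 1

# Made-up absolute errors (Mg/ha) for a comparison model and for DPM.
errors_other = [3.0, 5.0, 7.0]
errors_dpm = [2.0, 3.0, 4.0]
md, sd, t, df = paired_t_statistic(errors_other, errors_dpm)
print(md, sd, t, df)
```

With 21 validation samples per study area, df would be 20. The p-value follows from the t distribution with df degrees of freedom, which SciPy's `scipy.stats.ttest_rel` can evaluate. A Bonferroni adjustment divides the family-wise α by the number of comparisons; because the text does not say whether α = 0.01 is the level before or after that division, the sketch leaves the threshold to the reader.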

The paired-sample t-test results show that in the Simao study area, the absolute errors of DPM were significantly lower than those of all comparison models. Specifically, the differences with the FNN and GRNN reached an extremely significant level (p < 0.001), while the differences with RF, AdaBoost, and the WCM reached a highly significant level (p < 0.01). These findings are consistent with the performance metrics in Table 6, indicating that DPM’s advantages in subtropical complex forest scenarios are statistically reliable. In the Genhe study area, the absolute errors of DPM were also significantly lower than those of all comparison models. The differences with the GRNN and WCM reached an extremely significant level (p < 0.001), those with the FNN and AdaBoost reached a highly significant level (p < 0.01), and the difference with RF reached a significant level (p < 0.05). Despite the higher overall retrieval difficulty in Genhe, DPM maintained a clear statistical advantage, further demonstrating its potential applicability in cold-temperate forest scenarios.

4. Discussion

(1): This study independently validated the DPM model in two regions with markedly different ecological conditions: the Simao District in Pu’er, Yunnan, and the Genhe District in Inner Mongolia. The results indicate that DPM achieved the best retrieval performance in both study areas. In the Simao study area, DPM attained an R² of 0.60, higher than those of RF (0.41), AdaBoost (0.38), FNN (0.31), WCM (0.29), and GRNN (0.26). Its RMSE and ubRMSE were both 24.23 Mg/ha, the lowest among all models, indicating that the predicted AGB values were highly consistent with the observed values in terms of both overall trend and variability. The DPM bias was 0.40 Mg/ha, close to zero, indicating virtually no systematic overestimation or underestimation, whereas the biases of RF, AdaBoost, GRNN, and FNN approached or exceeded 10 Mg/ha, reflecting the significant systematic errors in purely data-driven models due to the lack of physical constraints. In the Genhe study area, overall retrieval proved to be more challenging than in the Simao area, and the accuracy of all models decreased. Nevertheless, the DPM model still achieved the best performance. It reached an R² of 0.48, higher than those of RF (0.34), FNN (0.26), AdaBoost (0.24), GRNN (0.18), and WCM (0.08). Its RMSE of 33.29 Mg/ha was approximately 11% lower than that of RF (37.5 Mg/ha) and 25% lower than that of the WCM (44.50 Mg/ha). The bias of DPM was 0.87 Mg/ha, remaining close to zero, whereas the biases of RF (−7.35 Mg/ha), FNN (5.65 Mg/ha), and GRNN (19.95 Mg/ha) indicated notable systematic deviations.
(2): As typical ensemble learning algorithms, RF and AdaBoost generally outperform the GRNN and FNN in forest AGB retrieval, as neural networks tend to underperform when trained on small datasets. Although the FNN achieved a slightly higher R² than AdaBoost in the Genhe study area (0.26 versus 0.24), the difference was minimal. The DPM model integrates the WCM into the neural network, applying physical constraints to ensure prediction plausibility while leveraging deep learning to capture complex nonlinear relationships. This integration of physical modeling and data-driven learning allows the DPM model to deliver forest AGB retrieval results with high accuracy, strong stability, and minimal systematic bias.
(3): The advantages of the DPM model lie not only in its accuracy in forest AGB prediction but also in its joint optimization of physical constraints and data-driven learning. The differentiable WCM equations enable prediction errors to be effectively backpropagated to the neural network, allowing the network parameters to be optimized while maintaining physical consistency. Meanwhile, the outputs of intermediate variables are a key factor enabling the DPM model to achieve acceptable accuracy, relatively low systematic bias, and strong interpretability.
(4): C-band radar signals exhibit limited performance in densely vegetated areas and are less sensitive for forest AGB retrievals [52]. In such environments, the radar signal may not penetrate the canopy effectively to reach the ground, reducing the accuracy of AGB estimation [53,54]. In this study, the saturation threshold of C-band SAR was quantitatively assessed by analyzing the relationship between predicted AGB and measured AGB. The results indicate that in the Simao study area, when measured AGB exceeds approximately 120 Mg/ha, the DPM model predictions start to systematically deviate below the 1:1 line, showing an underestimation trend; when measured AGB exceeds 150 Mg/ha, predictions stabilize and mostly fall within 120–140 Mg/ha, suggesting the signal has entered saturation. Therefore, the saturation threshold in the Simao area is roughly 120–150 Mg/ha. In the Genhe study area, this occurs at a lower level, as predictions begin to deviate when measured AGB exceeds around 100 Mg/ha. For high-biomass ranges (>150 Mg/ha), predicted values are confined to 100–120 Mg/ha, indicating a saturation threshold near 100 Mg/ha. Previous studies have shown that the saturation threshold of L-band SAR can reach 150–200 Mg/ha [55], while the theoretical saturation threshold of P-band SAR is even higher, approximately 300–400 Mg/ha [56]. By comparing our results with these findings, future research may consider integrating L-band or P-band data to enhance canopy penetration and further mitigate saturation effects in high-biomass forests. Further analysis shows that the differences in saturation thresholds between the two study areas are closely related to forest structural complexity. The Simao study area consists of multi-layer mixed forests with distinct vertical canopy stratification and a relatively high proportion of volume scattering, which may, to some extent, delay the saturation process of C-band backscatter signals. 
In contrast, the Genhe study area is dominated by single-layer coniferous forests with relatively simple structures and scattering mechanisms, exhibiting a faster saturation trend under the conditions of this study. These findings indicate that the saturation characteristics of C-band SAR-based biomass retrieval are not only associated with total biomass but may also be influenced by forest structural attributes. Forest structural complexity may affect the manifestation of saturation thresholds by altering the relative contributions of different scattering mechanisms.
(5): In this study, the acquisition times of the field survey data aligned with those of the SAR imagery. The field survey in the Simao study area was carried out in December 2020, coinciding with the collection of the GF-3 imagery, while the field survey in the Genhe study area took place in August 2013, corresponding to the acquisition of the RADARSAT-2 imagery. The year of acquisition for the Landsat-8 optical imagery also matched these datasets, ensuring temporal consistency among the main data sources. However, potential spatiotemporal mismatches still exist in some auxiliary datasets. From a temporal perspective, the SMCI1.0 climate data are daily-scale products. Although their acquisition year matches the SAR and field data, they cannot exactly represent the meteorological conditions at the specific time of the SAR overpass. In the Genhe study area, SAR data were acquired in August, coinciding with a period of heavy rainfall. If rainfall occurred on the day of the satellite overpass, rapid changes in temperature and soil moisture might affect the radar backscatter coefficients, adding uncertainties to the AGB inversion results. From a spatial perspective, the original spatial resolutions of the datasets differ significantly, with SAR data at meter-level resolution, optical imagery and DEM at 30 m, and climate data at 1 km. Although all datasets were resampled to a common spatial resolution during preprocessing, differences in their original scales may still cause spatial biases due to forest stand heterogeneity.
(6): Although the DPM retrieval model uses a data-driven training approach similar to deep neural networks, it stands apart by explicitly incorporating physical knowledge constraints into its structure. This effectively narrows the learning space and allows the DPM to capture data features while adhering to forest scattering mechanisms and radiative transfer principles. As a result, the model achieves robust performance even with small training datasets. To sum up, compared with traditional data-driven models and conventional physical radiative transfer models, the DPM framework connects physical priors to neural networks, advancing the field of physics-informed machine learning.

5. Conclusions

This study is the first to introduce the DPM approach into the field of forest AGB remote sensing retrieval. It establishes a synergistic optimization framework that integrates the WCM with a deep neural network to achieve forest AGB retrieval. This approach fully leverages the neural network’s capability to model complex nonlinear relationships. At the same time, it maintains physical interpretability, effectively improving the accuracy of biomass retrieval and enhancing the model’s generalization ability. It thus provides a new paradigm for forest AGB remote sensing retrieval by combining physical constraints with learning capability. The main conclusions of this study are as follows:

(1): The traditional WCM was re-coded on a deep learning platform and made differentiable using PyTorch’s AD capabilities, which allowed the WCM to be unified with a neural network. Although the DPM model consists of both a machine learning module and a differentiable physics-based model module, the training and retrieval processes have been fully integrated, forming an end-to-end trainable joint model. Within this framework, the neural network dynamically parameterizes key parameters of the physical model and outputs intermediate variables. The network weights and biases are optimized through backpropagation and gradient descent, thereby achieving a deep integration of physics-driven and data-driven methods.
(2): The DPM model was preliminarily validated using the WCM as its physical model framework, which improved the retrieval accuracy to a certain extent. However, the WCM has a relatively simplified structure and provides an incomplete representation of the multiple scattering and absorption mechanisms of electromagnetic waves within the vegetation canopy, which limits its applicability in regions with complex forest structures. To further enhance the model's physical consistency and generalization capability, future research could replace the WCM with a more complete radiative transfer model, such as MIMICS, and explore the retrieval accuracy and generalization performance achievable through joint optimization with neural networks under a differentiable framework.
(3): This study focused on two small-scale, representative forest areas in southern and northern China—the Simao District in Pu’er and the Genhe District in Inner Mongolia—to develop and validate a forest AGB retrieval approach integrating DPM across contrasting ecological regions. The two study areas exhibit marked differences in climate, forest type, stand structure, and biomass levels, making them highly suitable for evaluating model performance. Experimental results show that the DPM model consistently outperforms the traditional WCM, FNN, GRNN, RF, and AdaBoost methods in estimating forest AGB across both subtropical multi-species forests and northern temperate forests. These results demonstrate the robustness and adaptability of the proposed method to forests with diverse ecological conditions. With further refinement, this approach has the potential for application at larger spatial scales, particularly in regions with pronounced spatial heterogeneity, offering a promising pathway for large-scale forest biomass estimation using remote sensing.
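The joint optimization summarized in conclusion (1) depends on derivatives propagating through the physical model. The following is a minimal sketch of that idea, not the authors' PyTorch implementation: it uses forward-mode dual numbers to differentiate a commonly used forest form of the WCM, $σ_{forest} = σ_{veg}(1 - e^{-βW}) + σ_{ground} e^{-βW}$, with respect to the attenuation coefficient $β$. The WCM form, the function names, and the numeric values are assumptions chosen for illustration, and backscatter is taken in linear power units rather than dB.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value together with its derivative w.r.t. one chosen input."""
    val: float
    dot: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __sub__(self, other):
        other = _lift(other)
        return Dual(self.val - other.val, self.dot - other.dot)

    def __rsub__(self, other):
        return _lift(other) - self

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__

    def __neg__(self):
        return Dual(-self.val, -self.dot)


def _lift(x):
    """Promote plain numbers to constants (derivative 0)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def exp(x):
    """Exponential with chain rule: d/dx exp(u) = exp(u) * u'."""
    x = _lift(x)
    e = math.exp(x.val)
    return Dual(e, e * x.dot)


def wcm_forward(sigma_veg, sigma_ground, beta, agb):
    """Assumed forest WCM (linear units): canopy term plus attenuated ground term."""
    attenuation = exp(-(beta * agb))
    return sigma_veg * (1.0 - attenuation) + sigma_ground * attenuation


def backscatter_and_dbeta(sigma_veg, sigma_ground, beta, agb):
    """Return (sigma_forest, d sigma_forest / d beta) by seeding beta with dot = 1."""
    out = wcm_forward(sigma_veg, sigma_ground, Dual(beta, 1.0), agb)
    return out.val, out.dot


# Hypothetical inputs: sigma_veg = 0.08, sigma_ground = 0.02, beta = 0.01 ha/Mg, W = 100 Mg/ha
sigma, dsigma_dbeta = backscatter_and_dbeta(0.08, 0.02, 0.01, 100.0)
```

Frameworks such as PyTorch use reverse-mode AD instead, which yields the derivatives with respect to all network weights in a single backward pass; forward mode is used here only because it fits in a few lines and makes the chain rule through the WCM explicit.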

Author Contributions

Conceptualization, C.Z.; methodology, C.Z. and W.Z. (Wangfei Zhang); investigation, Y.J. and W.Z. (Wei Zhang); software, C.Z. and H.Z.; validation, C.Z. and W.Z. (Wei Zhang); formal analysis, C.Z. and W.Z. (Wangfei Zhang); data curation, W.Z. (Wangfei Zhang), Y.J. and X.H.; writing (original draft preparation), C.Z.; writing (review and editing), R.S., Y.J., W.Z. (Wangfei Zhang) and X.H.; visualization, C.Z. and W.Z. (Wei Zhang); resources, R.S., Y.J., W.Z. (Wei Zhang), W.Z. (Wangfei Zhang) and X.H.; supervision, R.S., Y.J., W.Z. (Wangfei Zhang) and X.H.; project administration, R.S., W.Z. (Wangfei Zhang) and X.H.; funding acquisition, W.Z. (Wangfei Zhang), R.S. and X.H. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the National Natural Science Foundation of China (No. 32371869, No. 32160365, No. 32471865, No. 42161059, No. 32260720), the Yunnan Fundamental Research Projects (No. 202301BD070001-058, No. 202401BB070001-021, No. 202502AE090072), the Yunnan International Joint Innovation Platform for South Asia and Southeast Asia Science and Technology Innovation Center (No. 202403AP140042), the National Key R&D Program (No. 2025YFE0210400), and the China Agriculture Research System (CARS-21).

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

We thank the research team for their assistance with data acquisition.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. List of main symbols and units.

Symbol	Unit
$σ_{HH}$, $σ_{HV}$, $σ_{VH}$, $σ_{VV}$	dB
$σ_{forest}$	dB
$σ_{ground}$	dB
$W_{fobs}$	Mg/ha
$W_{fpre}$	Mg/ha
RMSE	Mg/ha
Bias	Mg/ha
ubRMSE	Mg/ha
Forest AGB	Mg/ha
MD	Mg/ha
SD	Mg/ha
Slope	°
Aspect	°
Angle ( $θ$ )	°
Longitude	°
Latitude	°
Temperature	°C
Precipitation	mm
Altitude	m
Range	m
Azimuth	m
Wavelength	cm
DBH	cm
H	m


Figure 1. Geographic locations and sample plot distributions of the two study areas: (a) Simao study area; (b) Genhe study area. Numbers 1, 2, and 3 mark the locations of the sample plots in the two study areas.

Figure 2. Flowchart of the data preprocessing procedure.

Figure 4. Distribution of forest AGB in sample plots in the Genhe and Pu’er study areas (Mg/ha).

Figure 5. Schematic diagram of forest backscattering in the WCM.

Figure 6. Cost function landscape illustrating the search ability of different models in function space: (a) ML models guided into smaller searchable spaces (ovals) by structural priors; (b) process-based models whose search space is expanded by learnable units.

Figure 7. Schematic illustration of the differences between machine learning models, physical models, and DPM in constructing the mapping from X to y.

Figure 8. Schematic of the DPM architecture with the WCM as the backbone. Black solid lines represent the forward propagation path used for forest AGB prediction, while green dashed lines represent the backpropagation path used during neural network training for gradient computation and parameter updates.

Figure 9. (a) Loss curves of forest AGB in Simao study area. (b) Loss curves of forest AGB in Genhe study area.

Figure 10. (a) Trend of the mean parameter gradients of each DFCNN layer with training epochs for the Simao study area. (b) Trend of the mean parameter gradients of each DFCNN layer with training epochs for the Genhe study area.

Figure 11. Scatter plots comparing observed and predicted AGB for the DPM, GRNN, FNN, AdaBoost, RF, and WCM models in the Simao study area. The red solid line represents the fitted linear regression line, and the black dashed line represents the 1:1 ideal reference line.

Figure 12. Scatter plots comparing observed and predicted AGB for the DPM, GRNN, FNN, AdaBoost, RF, and WCM models in the Genhe study area. The red solid line represents the fitted linear regression line, and the black dashed line represents the 1:1 ideal reference line.

Table 1. The information of the acquired SAR data.

Parameters	GF-3	RADARSAT-2
Band	C	C
Imaging mode	QPSI	FQP
Polarization	HH, HV, VH, VV	HH, HV, VH, VV
Incidence angle (°)	23.35	37.4
Range (m)	2.25	4.96
Azimuth (m)	4.68	4.73
Wavelength (cm)	5.55
Orbit direction	Ascending

Table 2. Detailed parameter information of Landsat-8 OLI images.

Study Site	Acquisition Date	Path/Row	Cloud Cover	Data Product	Image Level
Pu’er	16 May 2020	130-044	2.01%	LC08_L1TP	Level-1
Genhe	5 May 2013	122-024	4.20%
	28 October 2013	122-025	8.05%
	19 October 2013	123-024	0.01%
	22 December 2013	123-025	2.11%

Table 3. The forest AGB models for tree species in the Simao study area.

Vegetation Type	Biomass Model
Simao Pine (Pinus kesiya)	$M = 0.0582\,\mathrm{DBH}^{2.1203} H^{0.4668}$
Various Betula species in Southwest China	$M = 0.08907\,\mathrm{DBH}^{1.89807} H^{0.52019}$
Michelia species	$M = 0.12045\,\mathrm{DBH}^{2.06446} H^{0.382653}$
Quercus species (oak)	$M = 0.1355 (\mathrm{DBH}^{2} H)^{0.817} + 0.0275 (\mathrm{DBH}^{2} H)^{0.8165}$
Eucalyptus species	$\lg M = 0.814 \lg(\mathrm{DBH}^{2} H) - 0.9816$
Quercus species	$M = 0.22999\,\mathrm{DBH}^{1.39183} H^{0.57393}$
Cupressus species (cypress)	$M = 0.010158\,\mathrm{DBH}^{2.94424} H^{0.41591}$
Cunninghamia species (Chinese fir)	$M = 0.10301 (\mathrm{DBH}^{2} H)^{0.7773}$
Hard broadleaf species	$M = 0.3507 (\mathrm{DBH} - 1.1948)^{2} + (0.0301\,\mathrm{DBH}^{2.3643} + 0.051) + (0.0181\,\mathrm{DBH}^{2} - 0.2477)$
Soft broadleaf species	$M = 0.02739 (\mathrm{DBH}^{2} H)^{0.8988} + 0.01497 (\mathrm{DBH}^{2} H)^{0.8756} + 0.01059 (\mathrm{DBH}^{2} H)^{0.8132} + 0.0121 (\mathrm{DBH}^{2} H)^{0.854295}$

Note: DBH is the diameter at breast height, H is the height of the tree, and M is the AGB of each tree.
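As a worked illustration of how a species model in Table 3 turns field measurements into plot-level AGB, the sketch below evaluates the Simao pine row, read as M = 0.0582 · DBH^2.1203 · H^0.4668, for a few trees and scales the total to Mg/ha. Several points are assumptions, not statements from the source: that DBH is in cm and H in m (as in Appendix A), that M is in kg per tree, and that plot AGB is the summed tree biomass divided by plot area. The tree list and the plot area are hypothetical.

```python
def simao_pine_agb_kg(dbh_cm, height_m):
    """Single-tree AGB from the Simao pine row of Table 3 (kg assumed)."""
    return 0.0582 * dbh_cm ** 2.1203 * height_m ** 0.4668


def plot_agb_mg_per_ha(trees, plot_area_ha):
    """Sum tree-level AGB (kg), convert to Mg, and normalize by plot area (ha)."""
    total_kg = sum(simao_pine_agb_kg(dbh, h) for dbh, h in trees)
    return total_kg / 1000.0 / plot_area_ha


# Hypothetical plot: (DBH in cm, H in m) for each measured tree, on 0.06 ha
trees = [(20.0, 15.0), (24.5, 17.0), (18.2, 13.5), (30.1, 19.0)]
plot_agb = plot_agb_mg_per_ha(trees, plot_area_ha=0.06)
```

For mixed-species plots, the same pattern would apply with one function per row of Table 3, selected by the recorded vegetation type.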

Table 4. The DFCNN architectural hyperparameter setting.

Parameters	Setting
layers	4
units	128 → 64 → 32 → 3
dropout	0.3
initial learning rate	0.001
epochs	2100
random seeds	42
parameter counts	12355
loss function	MSE
batch size	Full-batch
activation function	LeakyReLU
normalization	Min-Max
optimizer	Adam
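The layer widths and the parameter count in Table 4 can be cross-checked with a short calculation. The sketch below assumes plain fully connected layers with one bias per unit and no other trainable parameters (for example, no learnable normalization layers); under that assumption it searches for the input width that reproduces the reported count. The resulting width is an inference from the table, not a value stated in the source.

```python
def dense_param_count(n_inputs, units):
    """Weights plus biases for a stack of fully connected layers."""
    total = 0
    fan_in = n_inputs
    for fan_out in units:
        total += fan_in * fan_out + fan_out  # weight matrix + bias vector
        fan_in = fan_out
    return total


def matching_input_widths(units, reported_count, max_width=256):
    """Input widths whose parameter count equals the reported one."""
    return [n for n in range(1, max_width + 1)
            if dense_param_count(n, units) == reported_count]


UNITS = (128, 64, 32, 3)      # from Table 4
REPORTED_PARAMS = 12355       # from Table 4

widths = matching_input_widths(UNITS, REPORTED_PARAMS)
```

Under these assumptions the only match is 14 input features, since 14·128 + 128 + 8256 + 2080 + 99 = 12355. The three output units are consistent with the network supplying a small set of WCM parameters rather than AGB directly.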

Table 5. Three types of datasets required for DPM training.

Category	Dataset	Variable	Spatial Resolution
Satellite Data	Landsat-8 OLI	NDVI, DVI, SAVI	30 m
	ASTER GDEM	Slope, Aspect	30 m
	GF-3	Angle, RVI, $σ_{HH}$, $σ_{HV}$, $σ_{VH}$, $σ_{VV}$	8 m
	RADARSAT-2	Angle, RVI, $σ_{HH}$, $σ_{HV}$, $σ_{VH}$, $σ_{VV}$	10 m
Ground-based data	In situ	Forest AGB, Vegetation type, Altitude, Latitude, Longitude	Point
Auxiliary data	SMCI1.0	Temperature	1 km
Auxiliary data	SMCI1.0	Precipitation	1 km

Table 6. Statistical metrics of six AGB retrieval models in Simao study area.

Model	R²	RMSE (Mg/ha)	Bias (Mg/ha)	ubRMSE (Mg/ha)
DPM	0.60	24.23	0.4	24.23
RF	0.41	29.29	10.47	27.36
AdaBoost	0.38	29.92	10.89	27.87
WCM	0.31	31.57	−0.19	31.57
FNN	0.29	33.09	9.92	31.56
GRNN	0.26	41.87	19.95	36.81

Table 7. Statistical metrics of six AGB retrieval models in Genhe study area.

Model	R²	RMSE (Mg/ha)	Bias (Mg/ha)	ubRMSE (Mg/ha)
DPM	0.48	33.29	0.87	33.28
RF	0.34	37.53	−7.35	36.8
FNN	0.26	39.78	5.65	39.38
AdaBoost	0.24	40.30	0.56	40.30
GRNN	0.18	41.91	19.95	36.81
WCM	0.08	44.5	0.49	44.5
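The three error metrics in Tables 6 and 7 are linked by the standard identity RMSE² = Bias² + ubRMSE², which offers a quick consistency check on the tabulated values. The sketch below applies it to the numbers copied from both tables; the 0.02 Mg/ha tolerance is an assumption meant to absorb two-decimal rounding, and the helper names are illustrative.

```python
import math

# (RMSE, Bias, ubRMSE) in Mg/ha, copied from Table 6 (Simao) and Table 7 (Genhe)
SIMAO = {
    "DPM": (24.23, 0.40, 24.23),
    "RF": (29.29, 10.47, 27.36),
    "AdaBoost": (29.92, 10.89, 27.87),
    "WCM": (31.57, -0.19, 31.57),
    "FNN": (33.09, 9.92, 31.56),
    "GRNN": (41.87, 19.95, 36.81),
}
GENHE = {
    "DPM": (33.29, 0.87, 33.28),
    "RF": (37.53, -7.35, 36.80),
    "FNN": (39.78, 5.65, 39.38),
    "AdaBoost": (40.30, 0.56, 40.30),
    "GRNN": (41.91, 19.95, 36.81),
    "WCM": (44.50, 0.49, 44.50),
}


def rmse_gap(rmse, bias, ubrmse):
    """Absolute difference between reported RMSE and sqrt(Bias^2 + ubRMSE^2)."""
    return abs(rmse - math.hypot(bias, ubrmse))


def rows_outside_tolerance(table, tol=0.02):
    """Model names whose reported metrics do not satisfy the identity within tol."""
    return sorted(name for name, row in table.items() if rmse_gap(*row) > tol)


simao_flags = rows_outside_tolerance(SIMAO)
genhe_flags = rows_outside_tolerance(GENHE)
```

With the tabulated values, every Simao row satisfies the identity within the tolerance. In the Genhe table only the GRNN row shows a larger gap, of about 0.04 Mg/ha; its Bias and ubRMSE entries coincide with the Simao GRNN row, so this row may be worth verifying against the original results.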

Table 8. Results of the paired-sample t-test comparing the DPM model with other AGB retrieval models in the Simao study area.

Models	MD (Mg/ha)	SD (Mg/ha)	t	df	p	Significance
DPM & RF	−4.87	7.45	−2.99	20	<0.01	**
DPM & AdaBoost	−5.52	8.12	−3.11	20	<0.01	**
DPM & FNN	−9.23	9.88	−4.28	20	<0.001	***
DPM & GRNN	−17.85	18.30	−4.47	20	<0.001	***
DPM & WCM	−7.68	8.76	−4.01	20	<0.01	**

Table 9. Results of the paired-sample t-test comparing the DPM model with other AGB retrieval models in the Genhe study area.

Models	MD (Mg/ha)	SD (Mg/ha)	t	df	p	Significance
DPM & RF	−4.12	8.45	−2.23	20	<0.05	*
DPM & FNN	−6.53	9.28	−3.22	20	<0.01	**
DPM & AdaBoost	−6.87	9.51	−3.31	20	<0.01	**
DPM & GRNN	−9.18	10.62	−3.96	20	<0.001	***
DPM & WCM	−11.43	11.87	−4.41	20	<0.001	***
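The t statistics in Tables 8 and 9 can be reproduced from the other columns with the paired-sample formula t = MD / (SD / √n), where n = df + 1. The sketch below assumes that MD is the mean of the paired differences and SD is the standard deviation of those differences, which the tables do not state explicitly; the function names are illustrative.

```python
import math

# (MD, SD, reported t); df = 20 in every row of Tables 8 and 9
SIMAO_TESTS = {
    "DPM & RF": (-4.87, 7.45, -2.99),
    "DPM & AdaBoost": (-5.52, 8.12, -3.11),
    "DPM & FNN": (-9.23, 9.88, -4.28),
    "DPM & GRNN": (-17.85, 18.30, -4.47),
    "DPM & WCM": (-7.68, 8.76, -4.01),
}
GENHE_TESTS = {
    "DPM & RF": (-4.12, 8.45, -2.23),
    "DPM & FNN": (-6.53, 9.28, -3.22),
    "DPM & AdaBoost": (-6.87, 9.51, -3.31),
    "DPM & GRNN": (-9.18, 10.62, -3.96),
    "DPM & WCM": (-11.43, 11.87, -4.41),
}


def paired_t(md, sd, df=20):
    """Paired-sample t statistic from mean and SD of the differences."""
    n = df + 1
    return md / (sd / math.sqrt(n))


def max_abs_gap(table):
    """Largest difference between recomputed and reported t in a table."""
    return max(abs(paired_t(md, sd) - t) for md, sd, t in table.values())


simao_gap = max_abs_gap(SIMAO_TESTS)
genhe_gap = max_abs_gap(GENHE_TESTS)
```

Under these assumptions the recomputed statistics agree with the reported ones to within about 0.01 in both study areas, which is consistent with 21 paired samples per comparison.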

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
