Article

Biomass Estimation and Saturation Value Determination Based on Multi-Source Remote Sensing Data

1
Key Laboratory of Sustainable Forest Ecosystem Management (Ministry of Education), School of Forestry, Northeast Forestry University, Harbin 150040, China
2
Head of Department of Forest Management, GIS of Bauman Moscow State Technical University, 2-YaBaumanskaya Ulitsa, 5c1, Moscow 105005, Russia
*
Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(12), 2250; https://doi.org/10.3390/rs16122250
Submission received: 29 April 2024 / Revised: 16 June 2024 / Accepted: 19 June 2024 / Published: 20 June 2024

Abstract

Forest biomass estimation is one of the most pressing research subjects at present. Combining multi-source remote sensing information exploits the complementary strengths of different remote sensing technologies and provides richer information for aboveground biomass (AGB) estimation. Based on Landsat 8, Sentinel-2A, and ALOS2 PALSAR data, this paper takes the artificial coniferous forests in the Saihanba Forest of Hebei Province as the object of study. It derives remote sensing factors related to forest structure, using spectral signals to detect the horizontal structure and multi-dimensional synthetic aperture radar (SAR) data to detect the vertical structure, and combines them with environmental factors for multivariate synergistic estimation of AGB. Three variable selection methods (Pearson correlation coefficient, random forest importance, and the least absolute shrinkage and selection operator (LASSO)) are used to establish the variable sets, which are combined with three typical non-parametric models, namely random forest (RF), support vector regression (SVR), and artificial neural network (ANN), to analyze the effect of forest structure on biomass estimation, identify machine learning models suited to estimating the AGB of artificial coniferous forests, and develop a method for quantifying the saturation value of combined variables. The results show that horizontal structure explains AGB better than vertical structure information, and that combining multi-structure information substantially improves both the model results and the saturation value. Different variable sets produce the best results in different models. The variable set selected using LASSO gives the best results in the SVR model, with $R^2$ values of 0.9998 and 0.8792 for the training and test sets, respectively, and the highest saturation value obtained is 185.73 t/ha, which is beyond the range of the measured data. The problem of saturation in biomass estimation in boreal medium- and high-density forests was overcome to a certain extent, and the AGB of the Saihanba area was better estimated.

1. Introduction

Forests are the most structurally complex and functionally rich ecosystems in terrestrial ecosystems and one of the richest natural resources in the world, playing an important role in measuring the global carbon cycle and climate change [1,2]. AGB is the mass of organic matter growing above ground level per unit area, which is one of the key parameters for assessing aboveground carbon stocks and emissions [3,4]. Accurate estimation of forest structural parameters and forest AGB on a global scale contributes to a better understanding of ecosystems and carbon cycle processes [5]. Boreal coniferous forest is the largest biome in the world and has rich carbon accumulation capacity, so the estimation of its AGB is particularly noteworthy [3].
Traditional AGB estimation relies mainly on destructive sampling to obtain data such as diameter at breast height (DBH) and tree height (H), which are combined with allometric growth equations to calculate the biomass of different tree species; the total biomass at the regional scale is obtained by summing over the tree species. However, this approach is time-consuming and its results are slow to update. Remote sensing technology is multi-temporal, multi-scale, multi-sensor, and non-destructive, which makes remote sensing inversion combined with measured data an important approach to forest AGB estimation. Remote sensing methods can provide continuous forest monitoring at periodic temporal resolution for large and difficult-to-reach areas [6]. Currently, AGB estimation uses data from optical, microwave radar, LiDAR, and aerial imagery, with optical and radar being the most widely used data sources. However, signal saturation is one of the biggest obstacles to using optical and radar data for AGB estimation: once biomass reaches a certain threshold, the electromagnetic signal received by the sensor no longer reflects changes in biomass, and the remote sensing model cannot accurately estimate AGB in high-biomass areas.
Passive optical sensors are comparatively insensitive to changes in vertical forest structure, have trouble penetrating the canopy, and can only gather horizontal information on the surface canopy [7]. Previous studies have demonstrated that utilizing optical data to estimate high biomass areas experiences saturation problems [8]. Gitelson reduced saturation by adding a weighting factor to the near-infrared reflectance in the normalized vegetation index (NDVI) to adjust the relative contribution of near-infrared and red reflectance to the vegetation index [9]. However, the existing vegetation index mostly adopts two or three bands, ignoring the important information that may be contained in other spectral bands and the different responses of different research regions and data to different spectral bands [10]. Zhaohua Liu proved that texture features can effectively alleviate the underestimation caused by spectral saturation [11]. Bastin demonstrated that texture and biomass have a strong correlation, and that texture is still very sensitive in places with high biomass (>600 Mg/ha) [12]. The texture ratios have been shown to correlate well with biomass, delaying the saturation point to a large extent [13]. However, the existing studies do not consider the information contained in different texture features and the response relationship between them, ignoring their relative contribution to the texture index, and do not make better use of the performance of texture and ratio techniques.
SAR measurements have a higher sensitivity to the vertical alignment of forest elements because of their ability to penetrate the vegetation layer and interact with scatterers at different heights [14]. High-frequency bands (X and C bands) only act on branches and leaves, which cannot penetrate the lower part of the canopy [15]. Longer wavelengths (L, P band) can interact with branches and trunks to provide information about the vertical structure of the forest [16]. Xiaodong Huang used a water cloud model (WCM) to analyze the saturation problem in the L and C bands and showed that because the C band has a higher extinction coefficient, the L band is more suitable for estimating high biomass [17]. Kasischke used AIRSAR data and L-band data from SIR-C to estimate pine forest AGB in the southeastern U.S. The results showed that the L-band of SIR-C was able to efficiently estimate the biomass of structurally complex forests with a maximum value of 100 Mg/ha and the saturation value of approximately 250 Mg/ha for stands with simple structure [18]. Michael Schlund combined P- and L-band radar backscattering and tree height information to overcome the restriction of biomass saturation by using a logarithmic model, and the results showed that introducing tree height information into the model could overcome the limitations of AGB estimation to a certain extent [19]. Tree height has been shown to be an important variable in estimating AGB, which increases the saturation of AGB, and tree height can be converted to forest AGB through the allometric growth equation and other models [20,21]. Existing studies linking radar metrics to forest parameters are based on the vertical component of the data, but in structurally complex forests, signal penetration may be limited, which can constrain the accuracy of the estimation results.
AGB is affected by the complexity of forest structure in both the horizontal and vertical dimensions, so any measurement of structure at a given time should take into account the spatial variability that occurs in both the horizontal and vertical dimensions of the forest, which has the potential to enhance or even break the saturation point bottleneck problem [22,23]. Therefore, it is necessary to combine optical and radar remote sensing to benefit from their complementary responses to forest features to characterize complex forest structures [24]. If the optical and radar data can be properly integrated into a new dataset, the combined dataset can contain new and more information about the structural characteristics of forests, which can play an important role in local-scale forest management as well as climate modeling at regional and global scales [25,26].
Biomass estimation based on multi-source remote sensing data mainly relies on empirical models, including parametric and non-parametric algorithms. Parametric models are easy to operate and express a clear relationship, so they are in demand across a wide range of ecological applications. However, they rely on a predefined model structure and require the assumption of normality; their accuracy depends largely on statistical robustness, and they are weak at describing the complex relationship between AGB and satellite data [27,28]. Besides parametric methods such as linear and nonlinear models, non-parametric machine learning (ML) algorithms are commonly used to overcome saturation limitations [29]. In contrast to parametric techniques, non-parametric algorithms do not explicitly predefine the model structure and are not subject to assumptions about the probability distribution and correlation of the input data; instead, they determine the model structure in a data-driven manner [30]. Because non-parametric algorithms are flexible and can integrate multiple factors, they are better at building complex nonlinear biomass models and give better estimation results [25,31]. ML algorithms widely used in AGB estimation include support vector regression (SVR) [32,33], random forests (RF) [16,34], stochastic gradient boosting (SGB) [35], and stacked ensemble algorithms [36,37].
At present, non-parametric and parametric methods are mainly used to determine saturation levels for various images and forest types. For non-parametric methods, saturation levels were obtained indirectly based on scatter plots or fitted curves, visually interpreting the extremes of a linear model or the trend of a nonlinear model between the selected variables and the AGB [38]. The semi-exponential model derived from the WCM is the traditional parametric approach for estimating saturation levels using SAR images, and saturation levels are among the model parameters that are directly derived by solving the model [39]. Zhou Lu et al. used two methods of binomial function and power function to fit the functional relationship between biomass and band reflectance and determined the light saturation point of AGB of Pinus kesiya var. langbianensis forest in Puer City to be 106.3 t/ha by calculating the corresponding inflection point value of the function [40]. Zhao et al. determined the light saturation point of pine forests to be 159 t/ha by using the spherical model [41]. Tingchen Zhang used ZY-3 stereo images and multi-spectral data to estimate the stock volume of the artificial larch forest and calculated the saturation point via spherical model to reach 192 Mg/ha [42]. In fact, non-parametric and parametric techniques can both quantitatively assess the level of saturation for individual variables by using extreme values or model solving. However, evaluating the degree of saturation of combinatorial variables in ML algorithms remains a complex problem.
Therefore, this study combines optical and SAR techniques to characterize forest structure. The ratio technique was combined with spectral bands and textures to establish the horizontal structure indices, and the PolInSAR technique was improved to invert the forest vertical structure indices. We used the variables extracted directly from the remote sensing data (spectral bands, biophysical parameters, backscattering coefficients, polarization decomposition variables, polarization decomposition parameters, SAR indices) and the newly constructed structural indices (horizontal and vertical structural indices) to analyze the horizontal and vertical structural variations of the coniferous forests as well as the compensatory effects between the different remote sensing data. We also introduced environmental factors to reduce the effect of topography. Since introducing too many variables directly into the model can lead to overfitting, three methods were employed to select variables: the Pearson correlation coefficient, RF importance, and LASSO. We then estimated AGB using three typical algorithms (RF, SVR, and ANN) to explore the effects of different algorithms on the performance of biomass estimation, and quantified the saturation level of the combined variables through the spherical model to address the saturation limit in forest AGB estimation. The paper is structured as follows: Section 2 describes the basics of the measured data, the processing of the satellite data, and the extraction of the variables. The methodology and theory used in this paper are explained in Section 3. Section 4 explains the results in detail, Section 5 discusses and analyzes the model errors, and Section 6 presents the conclusion.

2. Study Area and Data Processing

2.1. Overview of the Study Area

The study area is located in Saihanba Forest in Weichang Manchu-Mongolian Autonomous County, Chengde City, Hebei Province (42°02′N-42°36′N, 116°51′E-117°39′E). As the largest artificial forest farm in the north, it is rich in forest resources and has abundant vegetation growth, with more than 80% of the area covered by coniferous forests. The main tree species are Larix principis-rupprechtii Mayr, Pinus sylvestris var. mongholica Litv, and so on. Accurate estimation of the biomass in the Saihanba forest can satisfy the need for estimation of the current status and transformation of forest carbon sinks, which is important for the maintenance and improvement of the benefits of forest products, ecological benefits, and social benefits.
In this study, 65 temporary sample plots with an area of 0.06 ha were established (Figure 1), and all of them were artificial coniferous forests, with Larix gmelinii (Rupr.) Kuzen and Pinus sylvestris var. mongholica Litv. as the main species. For the sample plots to better represent the forest area as a whole, the samples were collected in a relatively central location in the forest, avoiding forest edges and large open windows wherever possible. The field survey included species composition, DBH, canopy closure (CC), and H (Table 1). CC was determined using the head-up observation method with systematic sample points, and the number of plants per hectare was used as forest stand density (S). Basal area (BA) is the trunk cross-sectional area measured at 1.3 m above ground level and estimated according to Equation (1). The AGB of each tree was calculated using the allometric growth equation based on different tree species summarized by Haikui Li [43] and Haijun Wang [44] (Table 2), which was combined to obtain the AGB of the sample plot, and the AGB per unit area (t/ha) was calculated.
$$BA = \frac{\pi}{4} DBH^{2}, \qquad (1)$$
Table 1 shows details of the measurements in the sample plots, including the maximum (Max), minimum (Min), mean (Mean), and standard deviation (STD) of CC, H, DBH, BA, S, and AGB. The scatterplot distribution between the sample data is shown in Figure 2. There was a strong negative association between DBH and S, with S decreasing exponentially with increasing DBH; S varied most widely at a DBH of about 10 cm, and AGB increased as DBH increased and S decreased. BA and CC were only weakly positively associated, but they nevertheless tended to rise together, and AGB rose together with CC. There was a strong positive association between DBH and H, and AGB increased with H.
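As a minimal sketch of the plot-level arithmetic described above, the following Python snippet evaluates Equation (1) for a single stem and scales per-tree values to a per-hectare figure for a 0.06 ha plot. The function names, the conversion of DBH from cm to m, and the assumption that per-tree AGB is given in kg are illustrative choices, not details taken from the study.

```python
import math

PLOT_AREA_HA = 0.06  # plot size stated in the text


def basal_area_m2(dbh_cm):
    """Cross-sectional area of one stem (m^2) from DBH in cm, BA = pi/4 * DBH^2."""
    dbh_m = dbh_cm / 100.0  # assumed unit conversion: cm -> m
    return math.pi / 4.0 * dbh_m ** 2


def plot_agb_t_per_ha(tree_agb_kg, plot_area_ha=PLOT_AREA_HA):
    """Sum per-tree AGB (assumed to be in kg) and express it in t/ha."""
    total_t = sum(tree_agb_kg) / 1000.0  # kg -> t
    return total_t / plot_area_ha


def stand_density_per_ha(n_trees, plot_area_ha=PLOT_AREA_HA):
    """Number of stems per hectare, used as stand density S."""
    return n_trees / plot_area_ha
```

For example, ten trees of 60 kg each on one plot sum to 0.6 t, which corresponds to about 10 t/ha.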

2.2. Remote Sensing Data and Extraction Variables

2.2.1. Optical Data

The free availability, global coverage, frequent updates, and longer time scales of Sentinel-2 and Landsat 8 make them valuable data sources for persistent monitoring of forest carbon dynamics [45]. The Landsat 8 (L8) OLI sensor provides multi-spectral data with a spatial resolution of 30 m. The Sentinel-2 data have three bands in the red-edge spectral range, which provide more detail on forest spectral features. In this study, Sentinel-2A (S2A) imagery was obtained from the Copernicus Open Access Centre. The downloaded Level-2A products had already been orthorectified and geometrically corrected with sub-pixel accuracy, so radiometric calibration, atmospheric correction, and topographic correction were performed only on the L8 data. The nearest-neighbor interpolation method was used to resample the S2A bands with different spatial resolutions to 30 m, so that the pixel size of the S2A image matches that of the L8 image. Table 3 shows the selected optical data acquisition times and their resolutions.
This study extracted bands (Table S1), texture (Table S2) [46], and biophysical parameters (Table S3) [47] for S2A and L8 images. Because horizontal structure can be defined as the arrangement of canopy closure or expressed through the distribution of trees, CC, S, and BA were used as the forest horizontal structure based on the measured data. The new horizontal structure indices were created based on the bands and textures extracted from S2A and L8 to help extend the linear relationship between remote sensing variables and vegetation parameters [48]. In this study, new ratio vegetation indices (RVI) (Equation (2)) (Table S4) were created by adding a weight ($m$) to the ratio approach based on the response differences between spectral bands. In addition, considering the differences between textures in each band, a weight ($m$) was added to the texture and combined with the ratio technique to create new ratio texture indices (RTI) (Equation (3)) based on three different textures: the same texture (corresponding texture index (CTI), Table S5); the average texture in each direction (mean texture index (MTI), Table S6); and the principal component texture (principal component texture index (PTI), Table S7) [48]. Extracted variables are summarized in Table 4 (all variable extraction methods and detailed explanations are described in Supplementary S1).
$$mRVI_{ij} = \frac{m \cdot Band_{i}}{Band_{j}}, \qquad (2)$$
$$mRTI_{ij} = \frac{m \cdot T_{A_{i}}}{T_{A_{j}}}, \qquad (3)$$
where $m$ is a weight value set from 0.1 to 1 with an interval of 0.1, $i$ is each band of the sensor, and $j$ is a band other than $i$. $A_{i}$ and $A_{j}$ denote the same GLCM texture extracted from different bands ($i$ and $j$) in the same window with the same direction (texture feature $A$).
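The weighted ratio construction can be sketched in a few lines of Python. The snippet below reads Equation (2) as the weight $m$ multiplying the numerator band, which is an interpretation of the formula rather than a confirmed detail, and enumerates every ordered band pair for $m$ = 0.1, 0.2, ..., 1.0. The function name, the dictionary layout, and the use of NaN where the denominator is zero are assumptions made for illustration; the same function applies to texture layers for Equation (3).

```python
import itertools

import numpy as np

# Weights 0.1, 0.2, ..., 1.0 as stated in the text.
WEIGHTS = [k / 10 for k in range(1, 11)]


def weighted_ratio_indices(layers, weights=WEIGHTS):
    """Return {(m, i, j): m * layer_i / layer_j} for all ordered pairs i != j.

    `layers` maps a band (or texture) name to a NumPy array of pixel values.
    Pixels where the denominator is zero are set to NaN (an assumed convention).
    """
    indices = {}
    for i, j in itertools.permutations(layers, 2):
        numerator = np.asarray(layers[i], dtype=float)
        denominator = np.asarray(layers[j], dtype=float)
        for m in weights:
            ratio = np.full(numerator.shape, np.nan)
            np.divide(m * numerator, denominator, out=ratio, where=denominator != 0)
            indices[(m, i, j)] = ratio
    return indices
```

With $n$ input layers this produces $10 \cdot n(n-1)$ candidate indices, which is one reason a variable selection step is needed before modeling.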

2.2.2. SAR Data

The ALOS2 PALSAR data are L-band, high-resolution spaceborne synthetic aperture radar data; the ultra-fine, fully polarimetric Level 1.1 single-look complex products (Table 5) describe the entire scattering process through the full scattering matrix and can be used to retrieve some vertical forest structures [49]. In this study, five fully polarimetric scenes acquired from July to September 2020 were used. The images were preprocessed via radiometric calibration, multilooking, filtering, and terrain correction with GAMMA (http://www.gamma-rs.ch) and PolSARPro v6.0.2 software, and geocoding was performed using 30 m resolution SRTM DEM data (Table 5) [50].
From each image, we extracted the backscattering coefficients; the polarization decomposition variables, namely odd scattering (S), double scattering (D), and volume scattering (V) from the Freeman–Durden three-component decomposition [51,52], the Yamaguchi three-component decomposition [53], and the Van Zyl decomposition [54], together with two new ratios, M1 (Equation (4)) and M2 (Equation (5)), constructed from different combinations of the three components (Table S8); the polarization decomposition parameters [55]; and the SAR indices (Table S9) [56,57]. Six interferometric pairs with different spatial and temporal baselines were established (Table S10), with a range resolution of 5.66 m and an azimuth resolution of 2.86 m. Building on the RVoG semi-empirical inversion model (Equation (6)), a variable extinction coefficient (Equation (7)) and polarization decomposition techniques (Equation (8)) are introduced in this study to remove temporal decoherence and eliminate the errors caused by non-homogeneous vertical structure in the forest volume layer [58]. The forest heights derived from the above model were used as the forest vertical structure indices (H) to characterize the forest vertical structure. Extracted variables are summarized in Table 6 (all variable extraction methods and detailed explanations are described in Supplementary S2).
$$M_{1} = \frac{S}{V + D}, \qquad (4)$$
$$M_{2} = \frac{S + D}{V}, \qquad (5)$$
where $S$ represents odd scattering, $D$ represents double scattering, and $V$ represents volume scattering.
$$\gamma_{V} = \gamma_{e} \frac{2\sigma}{\cos\theta \left( e^{\frac{2\sigma}{\cos\theta} h_{v}} - 1 \right)} \int_{0}^{h_{v}} e^{\frac{2\sigma}{\cos\theta} z} \, e^{j(\varepsilon k_{z} z + \varphi_{e})} \, dz,$$
$$\gamma_{V} = \tilde{\gamma}_{e} \frac{2\sigma}{\cos\theta \left( e^{\frac{2\sigma}{\cos\theta} h_{v}} - 1 \right)} \int_{0}^{h_{v}} e^{\frac{2\sigma}{\cos\theta} z} \, e^{j(\varepsilon \cdot k_{z}) z} \, dz, \qquad \tilde{\gamma}_{e} = \gamma_{e} \cdot e^{j\varphi_{e}}, \qquad (6)$$
$$\sigma = a \cdot \exp\left[ -\left( \frac{z - u}{v} \right)^{2} \right], \qquad 0 \le z \le h_{v}, \quad a > 0, \qquad (7)$$
$$m = \frac{\omega^{T} T_{s} \, \omega}{\omega^{T} (T_{Vol} + T_{d}) \, \omega} = \frac{\omega^{T} T_{G} \, \omega}{\omega^{T} T_{V} \, \omega}, \qquad (8)$$
where $\gamma_{V}$ is the volume scattering complex coherence, which is a function of the vegetation layer thickness $h_{v}$ and the average extinction coefficient $\sigma$; $k_{z}$ is the vertical wave number; $\varphi$ is the ground phase; and $\theta$ is the radar wave incidence angle. $\varepsilon$, $\varphi_{e}$, and $\gamma_{e}$ are three correction terms introduced on the basis of the RVoG model. $\gamma_{e}$ is a correction for temporal decoherence on the overall coherence, and $\varepsilon$ and $\varphi_{e}$ are utilized to implement the correction for the phase effect of temporal decoherence on $k_{z}$. $a$ is the influence factor of $\sigma$, $u$ is the position of maximum extinction in the forest canopy, $v$ is the standard deviation represented by the canopy shape, and $u$ and $v$ reflect the vertical heterogeneity of the forest. $m$ represents the ground-volume scattering ratio, and $T_{s}$, $T_{d}$, and $T_{Vol}$ represent the contribution of the odd, double, and volume scattering components, respectively. $T_{d}$ and $T_{Vol}$ are used as the volume layer scattering $T_{V}$, and $T_{s}$ is used as the ground layer scattering $T_{G}$.
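To make the volume coherence integral more concrete, the sketch below evaluates it numerically for a constant extinction coefficient and compares the result with the well-known closed form of the uncorrected RVoG volume coherence. It assumes the normalisation $2\sigma / [\cos\theta \, (e^{2\sigma h_v / \cos\theta} - 1)]$ in front of the integral, extinction in Np/m, and a simple trapezoidal rule; the correction terms (here `eps`, `phi_e`, `gamma_e`) default to values that switch them off. The function names and default values are illustrative, and the height-dependent extinction profile of Equation (7) is not modeled here.

```python
import numpy as np


def volume_coherence_numeric(h_v, sigma, theta, k_z,
                             eps=1.0, phi_e=0.0, gamma_e=1.0, n=2001):
    """Trapezoidal evaluation of the volume coherence for constant extinction.

    h_v: volume height (m); sigma: extinction (Np/m, assumed unit);
    theta: incidence angle (rad); k_z: vertical wavenumber (rad/m).
    eps, phi_e, gamma_e: temporal-decorrelation corrections (1, 0, 1 = off).
    """
    z = np.linspace(0.0, h_v, n)
    p = 2.0 * sigma / np.cos(theta)
    integrand = np.exp(p * z) * np.exp(1j * (eps * k_z * z + phi_e))
    dz = z[1] - z[0]
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1])) * dz
    normalisation = p / (np.exp(p * h_v) - 1.0)
    return gamma_e * normalisation * integral


def volume_coherence_closed_form(h_v, sigma, theta, k_z):
    """Closed form of the same integral without the correction terms."""
    p1 = 2.0 * sigma / np.cos(theta)
    p2 = p1 + 1j * k_z
    return p1 * (np.exp(p2 * h_v) - 1.0) / (p2 * (np.exp(p1 * h_v) - 1.0))
```

Agreement between the two functions for uncorrected inputs is a useful sanity check before adding a height-dependent extinction, at which point only the numerical route remains available.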

2.2.3. Topographic Factors

In this study, DEM data were used to extract slope, aspect, and elevation data for combined analysis.

3. Modeling of AGB Estimation

A total of 210 variables were used for the estimation: 16 spectral bands, 8 biophysical parameters, 12 horizontal structure indices, 20 backscattering coefficients, 75 polarization decomposition variables, 45 polarization decomposition parameters, 25 SAR indices, 6 vertical structure indices, and 3 topographic variables from the L8, S2A, and ALOS2 PALSAR data. Variables were selected based on the Pearson correlation coefficient, RF importance, and the LASSO method. The horizontal structure indices (V1), vertical structure indices (V2), horizontal + vertical structure indices (V3), horizontal + vertical structure indices + topographical variables (V4), Pearson correlation coefficient selected variables (V5), RF importance selected variables (V6), and LASSO selected variables (V7) were entered into the three ML models (RF, SVR, and ANN) for the AGB estimation. The data were first split once so that all modeling methods used the same samples. Of the total sample (n = 65), 75% was used as the training data (n = 49) and the remaining 25% as the test data (n = 16); the training data were used to train the predictive model, and the test data were used to assess the fit of the model.
Figure 3 shows the flow chart of the proposed method.
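A fixed split of this kind can be sketched as follows. The snippet draws one random permutation of the 65 plot indices and reuses it for every model, so all variable sets and algorithms see the same 49 training and 16 test plots. The seed and the function name are arbitrary illustrative choices; the split actually used in the study is not reproduced here.

```python
import numpy as np


def split_indices(n_samples=65, n_train=49, seed=0):
    """Return (train_idx, test_idx) from a single seeded permutation.

    75% of 65 is 48.75, so the training size is passed explicitly as 49
    to match the sample counts stated in the text.
    """
    rng = np.random.default_rng(seed)  # seed is an arbitrary choice
    order = rng.permutation(n_samples)
    return np.sort(order[:n_train]), np.sort(order[n_train:])


train_idx, test_idx = split_indices()
```

Passing the sizes explicitly avoids the rounding ambiguity that arises when only a percentage is given.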

3.1. Variable Selection Methods

Selecting the appropriate variable selection method can significantly improve the estimation accuracy of AGB since different feature variable combinations result in varying modeling accuracy. Therefore, three methods were used to make a selection of 210 variables for data dimensionality reduction to find variables that are highly correlated with AGB for sufficient prediction.
The first method uses the Pearson correlation coefficient to select variables that are significantly correlated with AGB at the 0.05 level. The Pearson correlation coefficient is a measure of linear correlation between two variables and is a filter for indirectly assessing regression problems [59]. The closer its value is to 1 (−1), the stronger the positive (negative) correlation between the two variables [60].
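A correlation filter of this type might look like the sketch below. For each candidate variable it computes the Pearson coefficient with AGB and keeps the variable when the two-sided p-value is below 0.05. To stay dependency-light, the p-value is approximated with the Fisher z-transformation and the normal distribution rather than the exact t-test, which is an assumption of this sketch and not necessarily the test used in the study; the function name is also hypothetical.

```python
import math

import numpy as np


def pearson_select(X, y, names, alpha=0.05):
    """Keep variables whose correlation with y is significant at `alpha`.

    Returns a list of (name, r, approximate_p). The p-value uses the
    Fisher z-transformation (normal approximation), not the exact t-test.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    selected = []
    for j, name in enumerate(names):
        r = float(np.corrcoef(X[:, j], y)[0, 1])
        r_clipped = min(max(r, -0.999999), 0.999999)  # keep atanh finite
        z = math.atanh(r_clipped) * math.sqrt(n - 3)
        p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
        if p < alpha:
            selected.append((name, r, p))
    return selected
```

Because the filter only measures linear association, a variable with a strong but symmetric nonlinear relationship to AGB can be discarded, which is part of the motivation for also using model-based selection.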
The second method uses the RF model to select variables with an importance greater than 0.1. The importance assessment of RF measures how much each feature variable contributes to the prediction accuracy of the model, which is effective for determining the contribution of individual variables and for assessing the nonlinear relationship between the feature variables and AGB [61].
The third method is to select variables using the LASSO method. LASSO is a regression method that performs both regularization and variable selection to improve the prediction accuracy and enhance the interpretability of the model. It removes the coefficients of some ineffective variables from the model by making them smaller or even compressing them to 0 to deal with multi-collinearity and retains only the most useful features [62].
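The shrink-to-zero behaviour described above can be illustrated with a bare-bones cyclic coordinate descent solver. The sketch minimises $\frac{1}{2n}\lVert y - X\beta \rVert^2 + \lambda \lVert \beta \rVert_1$ on standardised predictors using the soft-thresholding update; it is a teaching example with hypothetical function names and a fixed iteration count, not the solver or the penalty value used in the study.

```python
import numpy as np


def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    return float(np.sign(rho) * max(abs(rho) - lam, 0.0))


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for LASSO on standardised columns.

    Minimises (1 / (2n)) * ||y - X b||^2 + lam * ||b||_1 after centring y
    and scaling each column of X to zero mean and unit variance.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # Residual with the contribution of variable j added back.
            partial_residual = yc - Xs @ beta + Xs[:, j] * beta[j]
            rho = Xs[:, j] @ partial_residual / n
            beta[j] = soft_threshold(rho, lam)  # column variance is 1
    return beta


def selected_variables(beta):
    """Indices of variables with non-zero coefficients."""
    return [j for j, b in enumerate(beta) if b != 0.0]
```

Variables whose coefficients are driven exactly to zero are dropped, so the size of the selected set is controlled by the penalty $\lambda$, which is usually chosen by cross-validation.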

3.2. Non-Parametric Model

In recent years, many forest AGB studies have used non-parametric estimation methods and achieved better results [63,64,65]. Since there are numerous ML algorithms, typical algorithms in the three main branches of decision tree, kernel-based regression, and artificial neural network are selected for estimation to explore the impact of different algorithms on the performance of biomass estimation. Three typical techniques are used in this study for AGB estimation: RF, SVR, and ANN. In ML algorithms, parameter tuning plays a decisive role in producing highly accurate results, so optimal hyperparameter tuning is required to achieve low bias and variance. For every machine learning method, there are different tuning stages and parameters. Each ML model is tuned through a series of performance tests, and the optimal parameters are chosen based on the highest overall accuracy [66].

3.2.1. Random Forest (RF)

RF is a machine learning algorithm, developed by Breiman, that is rooted in classification and regression trees (CART). It is a series of decisions based on binary rules that automatically assess and measure the importance of variables and accurately describe complex relationships between independent and dependent variables [31]. During the construction of the RF model, there are two important hyperparameters to be set: “ntree” and “mtry”. “ntree” represents the number of decision trees in the RF, and “mtry” represents the number of variables randomly selected from the tree nodes, which is the number of variables to be considered in each segmentation [67]. Generally speaking, the overall error rate tends to stabilize when the ntree is above 500, but it still depends on the actual data. To guarantee the reliability of the prediction results without affecting the efficiency of the calculation, this study uses MSE to determine the number of leaves and decision trees. Most studies set the default value of mtry to one-third (rounded) of the total number of independent variables, but due to the different specific data, taking the default value of mtry does not necessarily result in obtaining the optimal model, and tuning of mtry is still required. In this study, optimal mtry values were determined for one-sixth, one-third, one-half, two-thirds, and all of the number of variables introduced.
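The candidate mtry values described above follow directly from the number of input variables. The short sketch below lists them for a given variable count; the function name, the use of rounding to the nearest integer, and the lower bound of one variable are assumptions for illustration.

```python
from fractions import Fraction

# One-sixth, one-third, one-half, two-thirds, and all of the variables.
MTRY_FRACTIONS = [Fraction(1, 6), Fraction(1, 3), Fraction(1, 2),
                  Fraction(2, 3), Fraction(1, 1)]


def mtry_candidates(n_variables):
    """Sorted, de-duplicated mtry values to test for `n_variables` inputs."""
    values = {max(1, round(n_variables * f)) for f in MTRY_FRACTIONS}
    return sorted(values)
```

For a variable set with 12 inputs this gives 2, 4, 6, 8, and 12; each candidate would then be evaluated by its prediction error, and the value with the lowest error retained.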

3.2.2. Support Vector Regression (SVR)

The support vector machine is a kernel-based statistical learning method that solves multidimensional prediction problems by converting nonlinear regression problems into linear regression problems in a high-dimensional feature space [68,69]. The SVR model has three hyperparameters to set: “kernel”, “C”, and “epsilon”. “kernel” denotes the kernel function, and the choice of kernel function is a central issue in SVR research. There is no general way to construct an appropriate kernel function for a given problem, but choosing a suitable kernel function usually provides better results in regression analysis. In this study, the radial basis function (RBF) kernel is used, which is currently the most widely used kernel function and usually provides better performance than linear and polynomial kernels [65,70]. “C” denotes the error penalty factor, which is used to calculate the penalty loss when a training error occurs. It sets the trade-off between the flatness of the SVR function and the tolerated training error, and can be adjusted to avoid overfitting [71]. The fitting of the model to the training data is influenced by the loss function parameter epsilon ($\varepsilon$). The higher $\varepsilon$ is, the larger the error-insensitive region and the fewer the support vectors [72]. The distribution of the data after mapping to the new feature space is determined by the kernel parameter gamma, which also controls the regression error of the model [73,74]. To maximize the model performance and construct the SVR model, a large number of models are trained for various combinations of $\varepsilon$ and C using the grid search approach. Ten-fold cross-validation is then used to determine the best C value together with the gamma value. C values from $2^{2}$ to $2^{9}$ were set in the modeling; $\varepsilon$ was set from 0 to 1 with an interval of 0.1.
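The search space described above is small enough to enumerate explicitly. The following sketch builds the grid of C and $\varepsilon$ values, generates ten cross-validation folds, and returns the combination with the lowest mean fold error. The model fitting itself is delegated to a caller-supplied function, here called `cv_error`, which is a hypothetical placeholder; the interleaved fold assignment and the omission of gamma from the grid are simplifying assumptions.

```python
import itertools

C_VALUES = [2 ** k for k in range(2, 10)]       # 4, 8, ..., 512
EPSILON_VALUES = [k / 10 for k in range(0, 11)]  # 0.0, 0.1, ..., 1.0


def kfold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) for k interleaved folds."""
    folds = [list(range(start, n_samples, k)) for start in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


def grid_search(cv_error, n_samples=49):
    """Return (mean_error, C, epsilon) minimising the mean fold error.

    `cv_error(C, epsilon, train_idx, test_idx)` is a hypothetical callback
    that fits an SVR on the training fold and returns its test-fold error.
    """
    best = None
    for C, eps in itertools.product(C_VALUES, EPSILON_VALUES):
        errors = [cv_error(C, eps, tr, te) for tr, te in kfold_indices(n_samples)]
        mean_error = sum(errors) / len(errors)
        if best is None or mean_error < best[0]:
            best = (mean_error, C, eps)
    return best
```

With 8 values of C and 11 values of $\varepsilon$, the grid contains 88 combinations, each fitted ten times under ten-fold cross-validation.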

3.2.3. Artificial Neural Networks (ANN)

An ANN is a parallel distributed information processing system that mimics the work of neurons in the human brain; it has a strong ability to model complex nonlinear systems and can describe high-dimensional mappings that are difficult to express analytically. A typical ANN structure contains an input layer, an output layer, and one or more hidden layers between them. Each node is a neuron, and each neuron accepts values from the previous layer of neurons with different weights, which are processed by the activation function within the neuron and then propagated to the next layer [75,76,77]. Two parameters need to be optimally selected for ANN modeling: “size” and “decay”. The number of hidden nodes (size) is an integer greater than or equal to 0. The general method of determination is $size = \sqrt{P + O} + m$, in which $P$ is the number of independent variables in the input layer, $O$ denotes the number of dependent variables in the output layer, and $m$ is an integer between 0 and 10. Weight decay (decay) is a real number from 0 to 0.1 and is a penalty method for regularization of the model [67]. The fitted model becomes smoother and less likely to overfit the training set when the decay value is increased, but the model runs the risk of underfitting when the decay is too large. In this study, four possible decay parameters (0.0001, 0.001, 0.01, 0.1) were set, a large number of models were trained for different combinations of size and decay using the grid search method, and the optimal parameter values were obtained using ten-fold cross-validation to build the ANN models.

3.3. Determination of Biomass Saturation Value Based on Spherical Modeling

It has been shown that the saturation behavior of a variable of interest resembles the spatial autocorrelation of a variable in geostatistics [42]. Therefore, the relationship between AGB and the remote sensing variables is modeled with the semi-variance function that is used to describe spatial autocorrelation in geostatistics. In this analogy, the measured AGB plays the role of the spatial distance, and the saturation value is extracted directly from the extremum of the spherical model. In this study, the spherical model (Equation (9)) is used to solve the semi-variance function to determine the saturation point of each index and model, and the ability of the new indices and models to improve structural information and AGB estimation is analyzed [68].
$$
y = \begin{cases} c_0 + c\left(\dfrac{3x}{2BS} - \dfrac{x^3}{2BS^3}\right), & 0 \le x \le BS \\ c_0 + c, & x > BS \end{cases} \qquad (9)
$$
where $y$ is the selected variable; $x$ is the AGB; $BS$ is the range, that is, the saturation value of the AGB; $c_0$ is the nugget constant, which indicates the value of spectral reflectance at $x = 0$; $c$ is the partial sill (arch height), which is the total change in the selected variable as the AGB ($x$) increases from 0 to $BS$; and $c_0 + c$ is the sill, which is the maximum or minimum reflectance reached when the AGB attains the saturation value $BS$.
Assuming that $b_0 = c_0$, $b_1 = \dfrac{3c}{2BS}$, $b_2 = -\dfrac{c}{2BS^3}$, $x_1 = x$, and $x_2 = x^3$, the spherical model is linearized and its linear regression coefficients can be obtained using least squares regression:
$$
y = b_0 + b_1 x_1 + b_2 x_2 .
$$
However, the spherical model makes it difficult to evaluate the saturation level of the variable set, so this study used the Pearson correlation coefficient to find the original extracted variable most relevant to AGB in different remote sensing data, applied it to evaluate the multi-variable AGB saturation level, and determined the saturation value by using all sample site data [42]. Therefore, the SWIR1 band in the S2A image is used as the response variable for spatial autocorrelation, the estimated AGB of the combined variable is chosen to solve the spherical model, and the saturation of the combined variable is illustrated by the relationship between the SWIR1 band and the estimated AGB.
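The linearized fit described in this subsection can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes all samples lie on the branch $0 \le x \le BS$, it rescales $x$ before solving the normal equations for numerical stability, and it recovers the saturation value from $b_1 / b_2 = -3BS^2$. The function names and the synthetic reflectance values are hypothetical.

```python
import math


def solve_3x3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_spherical(agb, response):
    """Least-squares fit of y = b0 + b1*x + b2*x^3, then recover c0, c and BS."""
    scale = max(agb)  # rescale x to (0, 1] to keep the normal equations well conditioned
    rows = [(1.0, x / scale, (x / scale) ** 3) for x in agb]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, response)) for i in range(3)]
    b0, b1_scaled, b2_scaled = solve_3x3(xtx, xty)
    b1 = b1_scaled / scale
    b2 = b2_scaled / scale ** 3
    if b1 * b2 >= 0:
        raise ValueError("b1 and b2 must have opposite signs for a spherical model")
    bs = math.sqrt(-b1 / (3.0 * b2))  # from b1 / b2 = -3 * BS^2
    c = 2.0 * b1 * bs / 3.0           # from b1 = 3c / (2 * BS)
    return {"c0": b0, "c": c, "BS": bs, "sill": b0 + c}


def spherical(x, c0, c, bs):
    """Spherical model of Equation (9)."""
    if x > bs:
        return c0 + c
    return c0 + c * (3 * x / (2 * bs) - x ** 3 / (2 * bs ** 3))


# Synthetic example (not data from the study): reflectance falls from 0.30
# and levels off at an AGB of 160 t/ha.
agb = [10.0 * i for i in range(1, 17)]
reflectance = [spherical(x, 0.30, -0.18, 160.0) for x in agb]
params = fit_spherical(agb, reflectance)
```

With noise-free synthetic input, `params["BS"]` recovers the saturation value used to generate the data. With field data, samples beyond the saturation point lie on the flat branch of Equation (9), so including them in the cubic fit is an approximation.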

4. Results

4.1. Variable Selection

Table 7 (Supplementary S3) shows a summary of the variables selected by the three methods. The results show that, across the three variable selection methods, more horizontal and vertical structural indices were selected than original extracted variables. The green (S2A: B3), red-edge (S2A: B6), and SWIR (S2A: B11) bands were selected by all three methods. The green band is sensitive to canopy reflectance, and the red-edge band, which is distinctive to Sentinel-2, is sensitive to chlorophyll as well as leaf structure, providing additional information for vegetation characterization [78,79]. The SWIR band is not easily affected by atmospheric absorption, scattering, and other interference, and has better resistance to atmospheric disturbance than the visible and near-infrared bands [80]. The canopy water content extracted from L8 was introduced into the model, which may be because vegetation containing more photosynthetic material produces more water vapor, which is related to its biomass content [81]. The retained SAR variables are correlated with double-bounce and volume scattering, indicating that both are the main polarization features in the AGB estimation. The SAR indices introduced by the model represent canopy vegetation characteristics to some extent. CSI is a measure of the relative importance of the vertical and horizontal structures of the vegetation, demonstrating the dominance of the vertical structure of the sample trees as compared to the horizontal structure. VSI shows the dominance of cross-polarization backscattering compared to co-polarization [82].
The differences in variable selection are mainly due to the different principles underlying the three methods. The Pearson correlation coefficient method focuses on the linear correlation between the variables and AGB, the RF method ranks the variables by their importance, and LASSO uses regularization to deal with multi-collinearity.
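To make the first of these principles concrete, the following minimal sketch ranks candidate variables by the absolute value of their Pearson correlation with AGB. It illustrates only the Pearson step; the RF importance and LASSO steps are not shown. The variable names and values are hypothetical and are not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rank_by_correlation(variables, agb):
    """Sort variables by |r| with AGB, strongest linear relationship first."""
    scores = {name: pearson_r(values, agb) for name, values in variables.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical toy data (t/ha and unitless reflectance).
agb = [40.0, 80.0, 120.0, 160.0, 200.0]
variables = {
    "band_a": [0.30, 0.25, 0.20, 0.15, 0.10],  # decreases linearly with AGB
    "band_b": [0.10, 0.30, 0.12, 0.28, 0.11],  # no clear linear trend
}
ranking = rank_by_correlation(variables, agb)
```

Because the ranking depends only on linear association, a variable with a strong but nonlinear relationship to AGB can be ranked low, which is the limitation of this method noted in the Discussion.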

4.2. Optimization of Model Parameters

Figure 4 shows the RF parameters: when the number of leaves is 5 and the number of decision trees is around 500, the model obtains the smallest MSE and tends to stabilize. Figure 5 and Table 8 indicate that, for all models other than the V2 variable set, the MSE is lowest when one-sixth of the total number of variables is chosen as the value of mtry. For the V2 model, the lowest MSE is obtained by selecting two-thirds of the total number of variables. As can be seen from Figure 5, for the SVR and ANN models, the darker the region in the image, the better the model fit, because the MSE is closer to zero in the dark regions. The optimal parameters found for each model were then used to produce the AGB estimates (Table 8).

4.3. Estimation of AGB Based on ML Algorithms

The results show (Table 9 and Figure 6) that under the three ML models, the combination of multiple structural indices (V3) and the introduction of topographic factors (V4) can improve the model results to a certain extent compared to the individual structural indices (V1 and V2). Poor results were obtained using the Pearson correlation coefficient (V5) and RF importance to select variables (V6), while the optimal results obtained using the LASSO method to select variables (V7) were significantly better than the other sets of variables.
As can be seen in Figure 7, the underestimation of high values and overestimation of low values were markedly reduced when the SVR model was used for validation of each variable set, compared to RF and ANN. This is because SVR is well suited to small-sample data and has strong inherent generalization ability [83,84]. Since a regression model that achieves satisfactory performance on the training dataset may not predict unseen data well, a model that performs well on both the training and the unseen test datasets is considered to have excellent generalization ability [85]. Compared to the optimal results obtained by RF and ANN for each variable set, SVR obtained better results for both the training and test sets for the V7 variable set ($R^2$ of 0.9998 and 0.8792, respectively). Therefore, AGB map prediction for the study area was carried out using this model (Figure 8).
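For reference, the accuracy measures used to compare training and test performance can be computed as in the sketch below. It assumes the usual definitions, $R^2 = 1 - SS_{res}/SS_{tot}$ and the root of the mean squared residual for RMSE; the observed and predicted values are hypothetical toy numbers, not results from the study.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean squared error, in the units of the observations."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical AGB values in t/ha.
observed = [50.0, 100.0, 150.0, 200.0]
predicted = [60.0, 100.0, 150.0, 190.0]

r2 = r_squared(observed, predicted)
error = rmse(observed, predicted)
```

Computing both measures separately on the training and test sets shows the generalization gap directly: a model with a training $R^2$ near 1 but a much lower test $R^2$ is fitting the training samples more closely than it can predict unseen ones.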

4.4. Determination of the Saturation Value

In this study, the spherical model was used to solve the semi-variance function of AGB and the new structure indices, and the curves obtained are shown in Figure 9. Table 10 shows that the S indices and vertical structure indices had lower $R^2$ values with AGB. PTI_BA produced the best result ($R^2$ = 0.51). PTI_BA and 0905-0919H have larger saturation values than the other variables. However, compared to the results of the BA indices, 0905-0919H did not fit well with the AGB ($R^2$ was only 0.2). Overall, the structure indices (PTI_BA and 0905-0919H) can partially avoid the saturation value limitation in the middle and high biomass areas in the boreal forest. Based on the experimental results, while it is difficult to completely eliminate the effect of data saturation, the new indices substantially reduce it, thus improving the reliability of remote sensing data for AGB prediction.
Scatter plots of the AGB estimated from the different variable sets against the reflectance of the S2_B11 band (Figure 9) all showed reflectance decreasing with increasing biomass until it stabilized. The spectral saturation value of the AGB is the AGB value at which the band reflectance stabilizes. Table 11 lists the ranges of saturation values obtained for the different variable sets and ML models. Overall, the saturation values range from 109 to 186 t/ha: 135–148 t/ha for the RF model, 151–186 t/ha for the SVR model, and 109–165 t/ha for the ANN model. Overall, the SVR model obtained higher saturation values for all sets of variables and the RF model obtained lower saturation values. The highest saturation value of 185.73 t/ha was obtained for V7, while the lowest saturation value of 109.21 t/ha was obtained for V2. The results show that, in comparison to the individual structural indices models (V1 and V2), the optimal saturation values obtained by the combined multiple structural indices models (V3 and V4) were significantly higher. Additionally, the optimal saturation value of V4 exceeded that of the two variable selection methods (V5 and V6) (Table 11). This shows that the combination of multi-structural indices can be regarded as an effective means of delaying the saturation problem, and the introduction of topographic factors can contribute to raising the saturation value. V7 obtained the highest saturation value, indicating that its choice of variables is more sensitive to changes in AGB. In this study, the measured AGB value of only one sample exceeded 200 t/ha, and the remaining samples were all below 180 t/ha. Therefore, the saturation point range obtained is adequate for estimating AGB in this study area.

5. Discussion

5.1. Variable Selection

According to the results of the three variable selection methods (Supplementary S3), the horizontal and vertical structural indices carry significantly more valid information than the original variables. The horizontal structural indices have a higher correlation with AGB and are more important to the model than the vertical structural indices. This is because all the subjects in this study are artificial coniferous forests, whose canopy structure is complicated and has numerous branches. The horizontal structure index can capture biochemical information in the upper canopy and provide more two-dimensional distribution information in the horizontal range of the canopy [86]. In addition, the horizontal structural indices combine ratio techniques with bands and textures, eliminating the effects of topography and sensors and better expressing vegetation parameter information [87,88]. Compared to spectral bands, texture features can better distinguish vegetation structural details and can detect different forest canopy structural features [89]. The S2A bands have a stronger relationship with AGB and are more important in the model than the L8 bands [90,91,92]. The red-edge band of S2A is more sensitive to vegetation growth changes and has a strong correlation with AGB, which makes it more important to the model [93]. The polarization decomposition variables and parameters introduced by the three variable selection methods are all related to double-bounce and volume scattering, suggesting that they are the main polarization features in the AGB estimation. This is because the subjects of this study are artificial coniferous forests, which have complex canopy structures and more branches inside the canopy, resulting in stronger double-bounce and volume scattering [86].
Variable selection is one of the most important processes in the modeling process; it can affect the performance of ML algorithms, reduce the data dimension and data storage space, accelerate the estimation process, and improve the interpretability and performance of the model [94]. However, the variable selection methods used in this study did not improve the estimation performance of the model very much, probably because these methods ignored the combinatorial effects between the variables and their autocorrelation, and only selected the variables with good linear correlation or significance, abandoning the remaining variables that may have high saturation levels and contain useful information. Therefore, there is a need to develop variable selection methods that can meet the needs of AGB estimation under different forest conditions in response to the inconsistent applicability of estimated model parameters to different forest types and spatial distributions of biomass [95].

5.2. Analysis of the Model Results

The results show that different ML methods produce better results when using different sets of variables for AGB estimation, with RF, SVR, and ANN each having their advantages. In this study, the SVR model is best suited to AGB estimation with the vertical structure indices or the LASSO-selected variables. In these cases, the $R^2$ obtained by SVR is significantly better than that of the other two models. This is because SVR can handle the strong nonlinear relationship between target parameters and satellite variables and achieve better estimation performance, even with limited sample data [96,97]. RF is a good choice when using horizontal structural indices or combining multiple structural indices as variables. This may be due to its ability to handle noisy and large datasets, as it is insensitive to noisy data in the dataset [98]. RF has several advantages over other ML techniques, including the ability to handle high-dimensional data, the lack of need for repeated parameter tuning, and the ability to retain the accuracy of the final results even in the presence of missing feature variables. The anti-collinearity and robustness to outliers of the RF algorithm make it feasible and highly generalizable when using remote sensing variables to predict forest parameters [99,100]. The ANN model performs best when the variables selected by RF importance and the Pearson correlation coefficient are employed for AGB estimation. In the same variable set, except V2 and V5, the RMSE obtained by ANN is significantly higher than that of the other two models. This is because, while an ANN is capable of learning intricate patterns and generalizing in noisy settings, it also requires representative training samples and suitable values for the network parameters. Compared with the RF and SVR models, the ANN results in most variable sets are not optimal, which may be related to the fixed-group data used for estimation in this study. The learning ability of ANN on the training set is too strong, so the obtained model overfits and cannot reflect the underlying patterns of the samples, which weakens its prediction ability [101,102].
The results of this study show that although the accuracy obtained using only the vertical structural indices is relatively low compared to the horizontal structural indices, the $R^2$ for the optimal training set and test set reached 0.7620 and 0.6843, respectively. The results obtained by combining the horizontal and vertical structure information to estimate AGB are significantly improved. Moreover, all three variable selection methods include horizontal and vertical structural indices, indicating that both horizontal and vertical structures play an indispensable role in the AGB estimation process and that AGB is affected by both [22,23]. Although the Pearson correlation coefficient method can simply and quickly select the characteristic variables that are linearly correlated with AGB, it is based on linear correlation and cannot fully and accurately describe the real relationship between biomass and remote sensing variables in complex forest environments. The RF feature selection method screens relatively key feature factors based on specific evaluation criteria, without considering the combined effects between feature factors, which can lead to the selection of variables that do not best reflect the relationship with AGB [96]. LASSO applies a constraint to the regression coefficients to identify a smaller subset of predictor variables with lower estimation variance, giving it good variable selection and regularization capabilities [103]. The results show that optimal results were obtained for the LASSO variable set, which indicates that LASSO is a more efficient variable selection method [62,104].
The flexible selection of different types of variables and the absence of data distribution requirements in ML methods provide better options for AGB modeling. While ML models have the advantages of simplicity and fast operation, they also suffer from obvious portability problems due to factors such as the quality of measurement data and modeling algorithms [31,99,105]. All three modeling algorithms used in this study have limited ability to estimate extreme levels of AGB because of saturation effects, limited training data, algorithm parameters, specific vegetation types, and environmental conditions [31].

5.3. Saturation Value Analysis

The results showed that the optimal saturation value of the new horizontal and vertical structural indices can reach more than 260 t/ha, which is significantly higher than the saturation value of artificial coniferous forests obtained from the existing studies [42]. However, the fitting accuracy of these indices is low, making them unsuitable for use in accurately estimating AGB. The results of this study show that the saturation values obtained using only SAR data may be smaller than those obtained from optical images, which is consistent with the results of previous studies [106]. This may be due mainly to the uncertainty caused by the complexity of processing SAR images and the lack of feature variables that accurately capture the saturation characteristics of the data. In fact, different remote sensing feature variables have different saturation and sensitivity for AGB estimation, and it is difficult to select feature variables that saturate late enough to capture high forest AGB values. The results show that all three variable selection methods introduce variables with high saturation characteristics, and the optimal saturation points obtained are significantly higher than those obtained from single structural information or the combination of multiple structural information. Moreover, the saturation point obtained by introducing the terrain variable is relatively optimal, which indicates that the terrain variable is a feature variable that contributes significantly to the improvement of AGB estimation. The saturation values of the combined variables obtained in this study were smaller than those of previous studies of deciduous forests [42], which may be because this study used a fixed group validation method and ignored data fusion [39].
The kriging model [107], quantile generalized additive model [108], and quadratic model [109] are currently the most popular methods for determining the saturation value. However, each of these models only calculates the saturation value for a single variable. For most combined variable sets, the Pearson correlation coefficient is used to select the variable most relevant to the measured AGB, and this variable is then used to determine the saturation value of the model. The results show that this method may not be suitable for this study. Since the combined variables contain the newly created structure indices, the texture and vertical structure information in the structure indices cannot be completely represented by the SWIR1 band, and the saturation values obtained by the variable sets are lower than the optimal saturation value obtained by the individual structural indices. The level of saturation for each combination of variables is determined via the spherical model, and the determined level of saturation has two components, which arise from the strategy used to assess the saturation level. The first component is mainly caused by the variable selection method and the combination of variables, and the second is caused by the estimation model used. Thus, for the same combination of variables, the saturation values vary with the method of variable selection and the model [42]. Different remote sensing feature variables have different saturation and sensitivity to AGB estimation, and accurately capturing the saturation level can help to improve the accuracy of AGB estimation. Therefore, future studies should focus on improving the selection of feature variables to improve the data saturation level [106].

6. Conclusions

In this study, we explored the ability of horizontal and vertical structures to estimate AGB by combining the newly established indices and the characteristic factors derived from remote sensing data, and we constructed RF, SVR, and ANN models for forest biomass estimation to analyze the effects of different structural information and satellite variables on the performance of AGB estimation. The results show that the combination of multi-structural information can improve the estimation results of AGB, while the terrain factor provides only a limited improvement in model accuracy. In this study, the variable set selected via the LASSO method obtained the best results in the SVR model (the $R^2$ values of the training and the test sets were 0.9998 and 0.8792, respectively), and the saturation point obtained by this model was the highest (185.73 t/ha). In practice, different ML methods perform best with different input datasets. The measured AGB data in this study are concentrated below 180 t/ha; the saturation value of the optimal combination of variables obtained using the spherical model and SWIR bands is beyond the measured range and allows for a better AGB estimation. However, for dense tropical forests, the saturation value obtained using this method may be relatively low, and the AGB value cannot be well estimated, which is a limitation of the method. Through comprehensive analysis of multi-source remote sensing data, derived information on forest structure, variable selection methods, and modeling algorithms, this study provides a new idea for the remote-sensing-based AGB modeling of boreal artificial coniferous forests, which overcomes the problem of saturation of biomass estimation in boreal medium- and high-density forests to a certain extent. The subjects of this study are artificial coniferous forests, and the main tree species are pine and larch. Since most stand features are spatially and temporally dependent, the empirical relationship will vary depending on location, time, and tree species type. The results of this study are not directly applicable to large-scale operations, and the estimation of AGB in other heterogeneous forests needs further research and analysis.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/rs16122250/s1. S1 shows the optical data extraction variables: see Table S1 for spectral bands, Table S2 for texture metrics, Table S3 for biophysical parameter information, Table S4 for the new optimal RVIs summary table, Table S5 for the new optimal CTIs summary table, Table S6 for the new optimal MTIs summary table, and Table S7 for the new optimal PTIs summary table. S2 shows the SAR data extraction variables: see Table S8 for polarization decomposition variables, Table S9 for the SAR indices, and Table S10 for the PolInSAR data information table. S3 shows the results of the three variable selection methods: Table S11 is the summary of variables selected via the Pearson correlation coefficient, Figure S1 presents the importance of RF variables, and Figure S2 presents the LASSO selection variables and their coefficients.

Author Contributions

Conceptualization, W.F. and R.S.; methodology, R.S.; software, Y.N. and R.S.; validation, R.S.; formal analysis, R.S.; investigation, R.S.; resources, W.F.; data curation, R.S.; writing—original draft preparation, R.S.; writing—review and editing, R.S. and S.C.; visualization, R.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (contract no. 31971654) and the Civil Aerospace Technology Advance Research Project (contract no. D040114).

Data Availability Statement

The datasets presented in this article are not readily available because the data are part of an ongoing study. Requests to access the datasets should be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Oliveira, C.P.d.; Ferreira, R.L.C.; da Silva, J.A.A.; Lima, R.B.d.; Silva, E.A.; Silva, A.F.d.; Lucena, J.D.S.d.; dos Santos, N.A.T.; Lopes, I.J.C.; Pessoa, M.M.d.L.; et al. Modeling and Spatialization of Biomass and Carbon Stock Using LiDAR Metrics in Tropical Dry Forest, Brazil. Forests 2021, 12, 473.
  2. Chen, W.; Zheng, Q.; Xiang, H.; Chen, X.; Sakai, T. Forest Canopy Height Estimation Using Polarimetric Interferometric Synthetic Aperture Radar (PolInSAR) Technology Based on Full-Polarized ALOS/PALSAR Data. Remote Sens. 2021, 13, 174.
  3. Stelmaszczuk-Górska, M.A.; Rodriguez-Veiga, P.; Ackermann, N.; Thiel, C.; Balzter, H.; Schmullius, C. Non-Parametric Retrieval of Aboveground Biomass in Siberian Boreal Forests with ALOS PALSAR Interferometric Coherence and Backscatter Intensity. J. Imaging 2016, 2, 1.
  4. Urbazaev, M.; Thiel, C.; Migliavacca, M.; Reichstein, M.; Rodriguez-Veiga, P.; Schmullius, C. Improved Multi-Sensor Satellite-Based Aboveground Biomass Estimation by Selecting Temporally Stable Forest Inventory Plots Using NDVI Time Series. Forests 2016, 7, 169.
  5. Ahmed, R.; Siqueira, P.; Hensley, S. Analyzing the Uncertainty of Biomass Estimates from L-Band Radar Backscatter over the Harvard and Howland Forests. IEEE Trans. Geosci. Remote Sens. 2014, 52, 3568–3586.
  6. Lee, Y.-S.; Lee, S.; Baek, W.-K.; Jung, H.-S.; Park, S.-H.; Lee, M.-J. Mapping Forest Vertical Structure in Jeju Island from Optical and Radar Satellite Images Using Artificial Neural Network. Remote Sens. 2020, 12, 797.
  7. Ren, Z.; Zheng, H.; He, X.; Zhang, D.; Yu, X.; Shen, G. Spatial estimation of urban forest structures with Landsat TM data and field measurements. Urban For. Urban Green. 2015, 14, 336–344.
  8. Dube, T.; Mutanga, O. Investigating the robustness of the new Landsat-8 Operational Land Imager derived texture metrics in estimating plantation forest aboveground biomass in resource constrained areas. ISPRS J. Photogramm. Remote Sens. 2015, 108, 12–32.
  9. Zhen, Z.; Chen, S.; Yin, T.; Chavanon, E.; Lauret, N.; Guilleux, J.; Henke, M.; Qin, W.; Cao, L.; Li, J.; et al. Using the Negative Soil Adjustment Factor of Soil Adjusted Vegetation Index (SAVI) to Resist Saturation Effects and Estimate Leaf Area Index (LAI) in Dense Vegetation Areas. Sensors 2021, 21, 2115.
  10. Stratoulias, D.; Nuthammachot, N.; Suepa, T.; Phoungthong, K. Assessing the Spectral Information of Sentinel-1 and Sentinel-2 Satellites for Above-Ground Biomass Retrieval of a Tropical Forest. ISPRS Int. J. Geo-Inf. 2022, 11, 199.
  11. Liu, Z.; Long, J.; Lin, H.; Xu, X.; Liu, H.; Zhang, T.; Ye, Z.; Yang, P. Combination Strategies of Variables with Various Spatial Resolutions Derived from GF-2 Images for Mapping Forest Stock Volume. Forests 2023, 14, 1175.
  12. Bastin, J.-F.; Barbier, N.; Couteron, P.; Adams, B.; Shapiro, A.; Bogaert, J.; Cannière, C.D. Aboveground biomass mapping of African forest mosaics using canopy texture analysis: Toward a regional approach. Ecol. Appl. A Publ. Ecol. Soc. Am. 2016, 24, 1984–2001.
  13. Hlatshwayo, S.T.; Mutanga, O.; Lottering, R.T.; Kiala, Z.; Ismail, R. Mapping forest aboveground biomass in the reforested Buffelsdraai landfill site using texture combinations computed from SPOT-6 pan-sharpened imagery. Int. J. Appl. Earth Obs. Geoinf. 2019, 74, 65–77.
  14. Cazcarra-Bes, V.; Tello-Alonso, M.; Fischer, R.; Heym, M.; Papathanassiou, K. Monitoring of Forest Structure Dynamics by Means of L-Band SAR Tomography. Remote Sens. 2017, 9, 1229.
  15. Santoro, M.; Cartus, O.; Carvalhais, N.; Rozendaal, D.M.A.; Avitabile, V.; Araza, A.; de Bruin, S.; Herold, M.; Quegan, S.; Rodríguez-Veiga, P.; et al. The global forest above-ground biomass pool for 2010 estimated from high-resolution satellite observations. Earth Syst. Sci. Data 2021, 13, 3927–3950.
  16. Bispo, P.d.C.; Rodríguez-Veiga, P.; Zimbres, B.; do Couto de Miranda, S.; Henrique Giusti Cezare, C.; Fleming, S.; Baldacchino, F.; Louis, V.; Rains, D.; Garcia, M.; et al. Woody Aboveground Biomass Mapping of the Brazilian Savanna with a Multi-Sensor and Machine Learning Approach. Remote Sens. 2020, 12, 2685.
  17. Huang, X.; Ziniti, B.; Torbick, N.; Ducey, M.J. Assessment of Forest above Ground Biomass Estimation Using Multi-Temporal C-band Sentinel-1 and Polarimetric L-band PALSAR-2 Data. Remote Sens. 2018, 10, 1424.
  18. Kasischke, E.S.; Melack, J.M.; Craig Dobson, M. The use of imaging radars for ecological applications—A review. Remote Sens. Environ. 1997, 59, 141–156.
  19. Schlund, M.; Davidson, M. Aboveground Forest Biomass Estimation Combining L- and P-Band SAR Acquisitions. Remote Sens. 2018, 10, 1151.
  20. Hansen, E.; Gobakken, T.; Solberg, S.; Kangas, A.; Ene, L.; Mauya, E.; Næsset, E. Relative Efficiency of ALS and InSAR for Biomass Estimation in a Tanzanian Rainforest. Remote Sens. 2015, 7, 9865–9885.
  21. Wang, Y.; Zhang, X.; Guo, Z. Estimation of tree height and aboveground biomass of coniferous forests in North China using stereo ZY-3, multispectral Sentinel-2, and DEM data. Ecol. Indic. 2021, 126, 107645.
  22. Alonso, M.T.; Pardini, M.; Papathanassiou, K. Towards Forest Structure Characteristics Retrieval from SAR Tomographic Profiles. In Proceedings of the European Conference on Synthetic Aperture Radar (EUSAR), Berlin, Germany, 3–5 June 2014.
  23. Du, C.; Fan, W.; Ma, Y.; Jin, H.I.; Zhen, Z. The Effect of Synergistic Approaches of Features and Ensemble Learning Algorith on Aboveground Biomass Estimation of Natural Secondary Forests Based on ALS and Landsat 8. Sensors 2021, 21, 5974.
  24. Fatehi, P.; Damm, A.; Schaepman, M.E.; Kneubühler, M. Estimation of Alpine Forest Structural Variables from Imaging Spectrometer Data. Remote Sens. 2015, 7, 16315–16338.
  25. Lu, D.; Chen, Q.; Wang, G.; Liu, L.; Li, G.; Moran, E. A survey of remote sensing-based aboveground biomass estimation methods in forest ecosystems. Int. J. Digit. Earth 2014, 9, 63–105.
  26. Minh, N.P.; Ngoc, T.N.; Nguyen, A.H. An improved adaptive decomposition method for forest parameters estimation using polarimetric SAR interferometry image. Eur. J. Remote Sens. 2019, 52, 359–373.
  27. Ronoud, G.; Darvish Sefat, A.A.; Fatehi, P. Beech Tree Density Estimation Using Sentinel-2 Data (Case Study: Khyroud Forest). Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2019, XLII-4/W18, 891–895.
  28. Mutti, P.R.; da Silva, L.L.; Medeiros, S.D.S.; Dubreuil, V.; Mendes, K.R.; Marques, T.V.; Lucio, P.S.; Santos e Silva, C.M.; Bezerra, B.G. Basin scale rainfall-evapotranspiration dynamics in a tropical semiarid environment during dry and wet years. Int. J. Appl. Earth Obs. Geoinf. 2019, 75, 29–43.
  29. Akhtar, A.M.; Qazi, W.A.; Ahmad, S.R.; Gilani, H.; Mahmood, S.A.; Rasool, A. Integration of high-resolution optical and SAR satellite remote sensing datasets for aboveground biomass estimation in subtropical pine forest, Pakistan. Environ. Monit. Assess. 2020, 192, 584.
  30. López-Serrano, P.M.; Cárdenas Domínguez, J.L.; Corral-Rivas, J.J.; Jiménez, E.; López-Sánchez, C.A.; Vega-Nieva, D.J. Modeling of Aboveground Biomass with Landsat 8 OLI and Machine Learning in Temperate Forests. Forests 2019, 11, 11.
  31. Han, H.; Wan, R.; Li, B. Estimating Forest Aboveground Biomass Using Gaofen-1 Images, Sentinel-1 Images, and Machine Learning Algorithms: A Case Study of the Dabie Mountain Region, China. Remote Sens. 2021, 14, 176.
  32. Hu, Y.; Xu, X.; Wu, F.; Sun, Z.; Xia, H.; Meng, Q.; Huang, W.; Zhou, H.; Gao, J.; Li, W.; et al. Estimating Forest Stock Volume in Hunan Province, China, by Integrating In Situ Plot Data, Sentinel-2 Images, and Linear and Machine Learning Regression Models. Remote Sens. 2020, 12, 186.
  33. Souza, G.S.A.d.; Soares, V.P.; Leite, H.G.; Gleriani, J.M.; do Amaral, C.H.; Ferraz, A.S.; Silveira, M.V.d.F.; Santos, J.F.C.d.; Velloso, S.G.S.; Domingues, G.F.; et al. Multi-sensor prediction of Eucalyptus stand volume: A support vector approach. ISPRS J. Photogramm. Remote Sens. 2019, 156, 135–146.
  34. Benmokhtar, S.; Robin, M.; Maanan, M.; Bazairi, H. Mapping and Quantification of the Dwarf Eelgrass Zostera noltei Using a Random Forest Algorithm on a SPOT 7 Satellite Image. ISPRS Int. J. Geo-Inf. 2021, 10, 313.
  35. Ghosh, S.M.; Behera, M.D. Aboveground biomass estimation using multi-sensor data synergy and machine learning algorithms in a dense tropical forest. Appl. Geogr. 2018, 96, 29–40.
  36. Singh, C.; Karan, S.K.; Sardar, P.; Samadder, S.R. Remote sensing-based biomass estimation of dry deciduous tropical forest using machine learning and ensemble analysis. J. Environ. Manag. 2022, 308, 114639.
  37. Li, X.; Long, J.; Zhang, M.; Liu, Z.; Lin, H. Coniferous Plantations Growing Stock Volume Estimation Using Advanced Remote Sensing Algorithms and Various Fused Data. Remote Sens. 2021, 13, 3468.
  38. Englhart, S.; Keuck, V.; Siegert, F. Aboveground biomass retrieval in tropical forests—The potential of combined X- and L-band SAR data use. Remote Sens. Environ. 2011, 115, 1260–1271.
  39. Long, J.; Lin, H.; Wang, G.; Sun, H.; Yan, E. Mapping Growing Stem Volume of Chinese Fir Plantation Using a Saturation-based Multivariate Method and Quad-polarimetric SAR Images. Remote Sens. 2019, 11, 1872. [Google Scholar] [CrossRef]
  40. Lu, Z.; Guanglong, O.; Junfeng, W.; Hui, X. Light Saturation Point Determination and Biomass Remote Sensing Estimation of Pinus kesiya var. langbianensis forest based on spatial regression models. Sci. Silvae Sin. 2020, 56, 38–47. [Google Scholar]
  41. Zhao, P.; Lu, D.; Wang, G.; Wu, C.; Huang, Y.; Yu, S. Examining Spectral Reflectance Saturation in Landsat Imagery and Corresponding Solutions to Improve Forest Aboveground Biomass Estimation. Remote Sens. 2016, 8, 469. [Google Scholar] [CrossRef]
  42. Zhang, T.; Lin, H.; Long, J.; Zhang, M.; Liu, Z. Analyzing the Saturation of Growing Stem Volume Based on ZY-3 Stereo and Multispectral Images in Planted Coniferous Forest. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 50–61. [Google Scholar] [CrossRef]
  43. Li, H.; Lei, Y. Assessment of Forest Vegetation Biomass and Carbon Stocks in China; China Forestry Publishing House: Beijing, China, 2010. [Google Scholar]
  44. Haijun, W.; Feng, L.; Nan, X. Allometric Equation for Biomass of the Main Carbon Sinks Species in Heilongjiang Province. Prot. For. Sci. Technol. 2016, 21–22+53. [Google Scholar] [CrossRef]
  45. Puliti, S.; Breidenbach, J.; Schumacher, J.; Hauglin, M.; Klingenberg, T.F.; Astrup, R. Above-ground biomass change estimation using national forest inventory data with Sentinel-2 and Landsat. Remote Sens. Environ. 2021, 265, 112644. [Google Scholar] [CrossRef]
  46. Haralick, R.M.; Shanmugam, K.; Dinstein, I. Textural Features for Image Classification. IEEE Trans. Syst. Man Cybern. 1973, SMC-3, 610–621. [Google Scholar] [CrossRef]
  47. Chen, L.; Wang, Y.; Ren, C.; Zhang, B.; Wang, Z. Optimal Combination of Predictors and Algorithms for Forest Above-Ground Biomass Mapping from Sentinel and SRTM Data. Remote Sens. 2019, 11, 414. [Google Scholar] [CrossRef]
  48. Sa, R.; Fan, W. Estimation of Forest Parameters in Boreal Artificial Coniferous Forests Using Landsat 8 and Sentinel-2A. Remote Sens. 2023, 15, 3605. [Google Scholar] [CrossRef]
  49. Tanase, M.; Panciera, R.; Lowell, K.; Tian, S.; Hacker, J.; Walker, J. Airborne Multi Temporal L-band Polarimetric SAR Data for Biomass Estimation in Semi-Arid Forests. Remote Sens. Environ. 2014, 145, 93–104. [Google Scholar] [CrossRef]
  50. Chen, A.; Zebker, H. Reducing Ionospheric Effects in InSAR Data Using Accurate Coregistration. IEEE Trans. Geosci. Remote Sens. 2014, 52, 60–70. [Google Scholar] [CrossRef]
  51. Latrache, H.; Ouarzeddine, M.; Souissi, B. Improved model-based polarimetric decomposition using the PolInSAR similarity parameter. ISPRS—Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2016, XLI-B7, 847–850. [Google Scholar] [CrossRef]
  52. Xie, Q.; Wang, J.; Liao, C.; Shang, J.; Lopez-Sanchez, J.; Fu, H.; Liu, X. On the Use of Neumann Decomposition for Crop Classification Using Multi-Temporal RADARSAT-2 Polarimetric SAR Data. Remote Sens. 2019, 11, 776. [Google Scholar] [CrossRef]
  53. Cui, Y.; Yamaguchi, Y.; Yang, J.; Park, S.-E.; Kobayashi, H.; Singh, G. Three-Component Power Decomposition for Polarimetric SAR Data Based on Adaptive Volume Scatter Modeling. Remote Sens. 2012, 4, 1559–1572. [Google Scholar] [CrossRef]
  54. Varghese, A.O.; Suryavanshi, A.; Joshi, A.K. Analysis of different polarimetric target decomposition methods in forest density classification using C band SAR data. Int. J. Remote Sens. 2016, 37, 694–709. [Google Scholar] [CrossRef]
  55. Huynen, J. Phenomenological Theory of Radar Targets. Ph.D. Thesis, TU Delft, Delft, The Netherlands, 1970. [Google Scholar]
  56. Pope, K.O.; Rey-Benayas, J.M.; Paris, J.F. Radar remote sensing of forest and wetland ecosystems in the Central American tropics. Remote Sens. Environ. 1994, 48, 205–219. [Google Scholar] [CrossRef]
  57. Mitchard, E.; Saatchi, S.; White, L.; Abernethy, K.; Jeffery, K.; Lewis, S.; Collins, M.; Lefsky, M.A.; Leal, M.; Woodhouse, I.; et al. Mapping tropical forest biomass with radar and spaceborne LiDAR: Overcoming problems of high biomass and persistent cloud. Biogeosci. Discuss. 2011, 8, 8781–8815. [Google Scholar] [CrossRef]
  58. Sa, R.; Nie, Y.; Fan, W. Combining Multi-Dimensional SAR Parameters to Improve RVoG Model for Coniferous Forest Height Inversion Using ALOS-2 Data. Remote Sens. 2023, 15, 1272. [Google Scholar] [CrossRef]
  59. Li, Z.; Chen, Z.; Cheng, Q.; Duan, F.; Sui, R.; Huang, X.; Xu, H. UAV-Based Hyperspectral and Ensemble Machine Learning for Predicting Yield in Winter Wheat. Agronomy 2022, 12, 202. [Google Scholar] [CrossRef]
  60. Wang, P.; Tan, S.; Zhang, G.; Wang, S.; Wu, X. Remote Sensing Estimation of Forest Aboveground Biomass Based on Lasso-SVR. Forests 2022, 13, 1597. [Google Scholar] [CrossRef]
  61. Jiang, F.; Kutia, M.; Sarkissian, A.J.; Lin, H.; Long, J.; Sun, H.; Wang, G. Estimating the Growing Stem Volume of Coniferous Plantations Based on Random Forest Using an Optimized Variable Selection Method. Sensors 2020, 20, 7248. [Google Scholar] [CrossRef] [PubMed]
  62. Lin, W.; Lu, Y.; Li, G.; Jiang, X.; Lu, D. A comparative analysis of modeling approaches and canopy height-based data sources for mapping forest growing stock volume in a northern subtropical ecosystem of China. GIScience Remote Sens. 2022, 59, 568–589. [Google Scholar] [CrossRef]
  63. Wu, C.; Shen, H.; Wang, K.; Shen, A.; Deng, J.; Gan, M. Landsat Imagery-Based above Ground Biomass Estimation and Change Investigation Related to Human Activities. Sustainability 2016, 8, 159. [Google Scholar] [CrossRef]
  64. Su, H.; Shen, W.; Wang, J.; Ali, A.; Li, M. Machine learning and geostatistical approaches for estimating aboveground biomass in Chinese subtropical forests. For. Ecosyst. 2020, 7, 64. [Google Scholar] [CrossRef]
  65. Liu, K.; Wang, J.; Zeng, W.; Song, J. Comparison and Evaluation of Three Methods for Estimating Forest above Ground Biomass Using TM and GLAS Data. Remote Sens. 2017, 9, 341. [Google Scholar] [CrossRef]
  66. Nasiri, V.; Darvishsefat, A.A.; Arefi, H.; Griess, V.C.; Sadeghi, S.M.; Borz, S.A. Modeling Forest Canopy Cover: A Synergistic Use of Sentinel-2, Aerial Photogrammetry Data, and Machine Learning. Remote Sens. 2022, 14, 1453. [Google Scholar] [CrossRef]
  67. Chen, Y.; Ma, L.; Yu, D.; Feng, K.; Wang, X.; Song, J. Improving Leaf Area Index Retrieval Using Multi-Sensor Images and Stacking Learning in Subtropical Forests of China. Remote Sens. 2022, 14, 148. [Google Scholar] [CrossRef]
  68. Halme, E.; Pellikka, P.; Mõttus, M. Utility of hyperspectral compared to multispectral remote sensing data in estimating forest biomass and structure variables in Finnish boreal forest. Int. J. Appl. Earth Obs. Geoinf. 2019, 83, 101942. [Google Scholar] [CrossRef]
  69. Tuong, T.; Tani, H.; Wang, X.; Thang, N.; Bui, H. Combination of SAR Polarimetric Parameters for Estimating Tropical Forest Aboveground Biomass. Pol. J. Environ. Stud. 2020, 29, 3353–3365. [Google Scholar] [CrossRef]
  70. Peng, X.; Zhao, A.; Chen, Y.; Chen, Q.; Liu, H.; Wang, J.; Li, H. Comparison of Modeling Algorithms for Forest Canopy Structures Based on UAV-LiDAR: A Case Study in Tropical China. Forests 2020, 11, 1324. [Google Scholar] [CrossRef]
  71. Meng, S.; Pang, Y.; Zhang, Z.; Jia, W.; Li, Z. Mapping Aboveground Biomass using Texture Indices from Aerial Photos in a Temperate Forest of Northeastern China. Remote Sens. 2016, 8, 230. [Google Scholar] [CrossRef]
  72. Iizuka, K.; Hayakawa, Y.S.; Ogura, T.; Nakata, Y.; Kosugi, Y.; Yonehara, T. Integration of Multi-Sensor Data to Estimate Plot-Level Stem Volume Using Machine Learning Algorithms–Case Study of Evergreen Conifer Planted Forests in Japan. Remote Sens. 2020, 12, 1649. [Google Scholar] [CrossRef]
  73. Wang, Z.; Zhang, M. Evaluation and Comparison of Different Machine Learning Models for NSAT Retrieval from Various Multispectral Satellite Images. Atmosphere 2022, 13, 1429. [Google Scholar] [CrossRef]
  74. Ta, N.; Chang, Q.; Zhang, Y. Estimation of Apple Tree Leaf Chlorophyll Content Based on Machine Learning Methods. Remote Sens. 2021, 13, 3902. [Google Scholar] [CrossRef]
  75. Gao, Y.; Lu, D.; Li, G.; Wang, G.; Chen, Q.; Liu, L.; Li, D. Comparative Analysis of Modeling Algorithms for Forest Aboveground Biomass Estimation in a Subtropical Region. Remote Sens. 2018, 10, 627. [Google Scholar] [CrossRef]
  76. dos Reis, A.A.; Carvalho, M.C.; de Mello, J.M.; Gomide, L.R.; Ferraz Filho, A.C.; Acerbi Junior, F.W. Spatial prediction of basal area and volume in Eucalyptus stands using Landsat TM data: An assessment of prediction methods. N. Z. J. For. Sci. 2018, 48, 1. [Google Scholar] [CrossRef]
  77. Dong, L.; Tang, S.; Min, M.; Veroustraete, F.; Cheng, J. Aboveground forest biomass based on OLSR and an ANN model integrating LiDAR and optical data in a mountainous region of China. Int. J. Remote Sens. 2019, 40, 6059–6083. [Google Scholar] [CrossRef]
  78. Sibanda, M.; Mutanga, O.; Rouget, M.; Kumar, L. Estimating Biomass of Native Grass Grown under Complex Management Treatments Using WorldView-3 Spectral Derivatives. Remote Sens. 2017, 9, 55. [Google Scholar] [CrossRef]
  79. Jiang, F.; Sun, H.; Chen, E.; Wang, T.; Cao, Y.; Liu, Q. Above-Ground Biomass Estimation for Coniferous Forests in Northern China Using Regression Kriging and Landsat 9 Images. Remote Sens. 2022, 14, 5734. [Google Scholar] [CrossRef]
  80. Lin, C.; Daoli, P.; Xuejun, W.; Xinyun, C. Estimation of Forest Stock Volume With Spectral and Textural Information from the Sentinel-2A. J. Northeast For. Univ. 2018, 46, 54–58. [Google Scholar] [CrossRef]
  81. Bolívar-Santamaría, S.; Reu, B. Detection and characterization of agroforestry systems in the Colombian Andes using sentinel-2 imagery. Agrofor. Syst. 2021, 95, 499–514. [Google Scholar] [CrossRef]
  82. Waqar, M.M.; Sukmawati, R.; Ji, Y.; Sri Sumantyo, J.T. Tropical PeatLand Forest Biomass Estimation Using Polarimetric Parameters Extracted from RadarSAT-2 Images. Land 2020, 9, 193. [Google Scholar] [CrossRef]
  83. Liang, Y.; Kou, W.; Lai, H.; Wang, J.; Wang, Q.; Xu, W.; Wang, H.; Lu, N. Improved estimation of aboveground biomass in rubber plantations by fusing spectral and textural information from UAV-based RGB imagery. Ecol. Indic. 2022, 142, 109286. [Google Scholar] [CrossRef]
  84. Ge, J.; Hou, M.; Liang, T.; Feng, Q.; Meng, X.; Liu, J.; Bao, X.; Gao, H. Spatiotemporal dynamics of grassland aboveground biomass and its driving factors in North China over the past 20 years. Sci. Total Environ. 2022, 826, 154226. [Google Scholar] [CrossRef] [PubMed]
  85. Mao, H.; Meng, J.; Ji, F.; Zhang, Q.; Fang, H. Comparison of Machine Learning Regression Algorithms for Cotton Leaf Area Index Retrieval Using Sentinel-2 Spectral Bands. Appl. Sci. 2019, 9, 1459. [Google Scholar] [CrossRef]
  86. Sa, R.; Fan, W. Forest Structure Mapping of Boreal Coniferous Forests Using Multi-Source Remote Sensing Data. Remote Sens. 2024, 16, 1844. [Google Scholar] [CrossRef]
  87. Thapa, R.; Watanabe, M.; Motohka, T.; Shimada, M. Potential of high-resolution ALOS–PALSAR mosaic texture for aboveground forest carbon tracking in tropical region. Remote Sens. Environ. 2015, 160, 122–133. [Google Scholar] [CrossRef]
  88. Sarker, M.; Nichol, J.; Iz, H.; Ahmad, B.B.; Rahman, A. Forest Biomass Estimation Using Texture Measurements of High-Resolution Dual-Polarization C-Band SAR Data. IEEE Trans. Geosci. Remote Sens. 2013, 51, 3371–3384. [Google Scholar] [CrossRef]
  89. Zhang, C.; Huang, C.; Li, H.; Liu, Q.; Li, J.; Bridhikitti, A.; Liu, G. Effect of Textural Features in Remote Sensed Data on Rubber Plantation Extraction at Different Levels of Spatial Resolution. Forests 2020, 11, 399. [Google Scholar] [CrossRef]
  90. Ahmadi, K.; Kalantar, B.; Saeidi, V.; Harandi, E.K.G.; Janizadeh, S.; Ueda, N. Comparison of Machine Learning Methods for Mapping the Stand Characteristics of Temperate Forests Using Multi-Spectral Sentinel-2 Data. Remote Sens. 2020, 12, 3019. [Google Scholar] [CrossRef]
  91. Sibanda, M.; Mutanga, O.; Rouget, M. Examining the potential of Sentinel-2 MSI spectral resolution in quantifying above ground biomass across different fertilizer treatments. ISPRS J. Photogramm. Remote Sens. 2015, 110, 55–65. [Google Scholar] [CrossRef]
  92. Astola, H.; Häme, T.; Sirro, L.; Molinier, M.; Kilpi, J. Comparison of Sentinel-2 and Landsat 8 imagery for forest variable prediction in boreal region. Remote Sens. Environ. 2019, 223, 257–273. [Google Scholar] [CrossRef]
  93. Naik, P.; Dalponte, M.; Bruzzone, L. Prediction of Forest Aboveground Biomass Using Multitemporal Multispectral Remote Sensing Data. Remote Sens. 2021, 13, 1282. [Google Scholar] [CrossRef]
  94. Li, Y.; Li, C.; Li, M.; Liu, Z. Influence of Variable Selection and Forest Type on Forest Aboveground Biomass Estimation Using Machine Learning Algorithms. Forests 2019, 10, 1073. [Google Scholar] [CrossRef]
  95. Ou, G.; Li, C.; Lv, Y.; Wei, A.; Xiong, H.; Xu, H.; Wang, G. Improving Aboveground Biomass Estimation of Pinus densata Forests in Yunnan Using Landsat 8 Imagery by Incorporating Age Dummy Variable and Method Comparison. Remote Sens. 2019, 11, 738. [Google Scholar] [CrossRef]
  96. Li, X.; Ye, Z.; Long, J.; Zheng, H.; Lin, H. Inversion of Coniferous Forest Stock Volume Based on Backscatter and InSAR Coherence Factors of Sentinel-1 Hyper-Temporal Images and Spectral Variables of Landsat 8 OLI. Remote Sens. 2022, 14, 2754. [Google Scholar] [CrossRef]
  97. Liang, L.; Di, L.; Huang, T.; Wang, J.; Lin, L.; Wang, L.; Yang, M. Estimation of Leaf Nitrogen Content in Wheat Using New Hyperspectral Indices and a Random Forest Regression Algorithm. Remote Sens. 2018, 10, 1940. [Google Scholar] [CrossRef]
  98. Chen, Y.; Li, L.; Lu, D.; Li, D. Exploring Bamboo Forest Aboveground Biomass Estimation Using Sentinel-2 Data. Remote Sens. 2018, 11, 7. [Google Scholar] [CrossRef]
  99. Hu, T.; Sun, Y.; Jia, W.; Li, D.; Zou, M.; Zhang, M. Study on the Estimation of Forest Volume Based on Multi-Source Data. Sensors 2021, 21, 7796. [Google Scholar] [CrossRef]
  100. Hu, Y.; Sun, Z. Assessing the Capacities of Different Remote Sensors in Estimating Forest Stock Volume Based on High Precision Sample Plot Positioning and Random Forest Method. Nat. Environ. Pollut. Technol. 2022, 21, 1113–1123. [Google Scholar] [CrossRef]
  101. Wang, L.a.; Zhou, X.; Zhu, X.; Dong, Z.; Guo, W. Estimation of biomass in wheat using random forest regression algorithm and remote sensing data. Crop J. 2016, 4, 212–219. [Google Scholar] [CrossRef]
  102. Zhang, C.; Denka, S.; Cooper, H.; Mishra, D.R. Quantification of sawgrass marsh aboveground biomass in the coastal Everglades using object-based ensemble analysis and Landsat data. Remote Sens. Environ. 2017, 204, 366–379. [Google Scholar] [CrossRef]
  103. Ku, N.-W.; Popescu, S. A comparison of multiple methods for mapping local-scale mesquite tree aboveground biomass with remotely sensed data. Biomass Bioenergy 2019, 122, 270–279. [Google Scholar] [CrossRef]
  104. Zandler, H.; Brenning, A.; Samimi, C. Quantifying dwarf shrub biomass in an arid environment: Comparing empirical methods in a high dimensional setting. Remote Sens. Environ. Interdiscip. J. 2015, 158, 140–155. [Google Scholar] [CrossRef]
  105. Feng, Y.; Lu, D.; Chen, Q.; Keller, M.; Moran, E.; dos-Santos, M.N.; Bolfe, E.L.; Batistella, M. Examining effective use of data sources and modeling algorithms for improving biomass estimation in a moist tropical forest of the Brazilian Amazon. Int. J. Digit. Earth 2017, 10, 996–1016. [Google Scholar] [CrossRef]
  106. Li, X.; Liu, Z.; Lin, H.; Wang, G.; Sun, H.; Long, J.; Zhang, M. Estimating the Growing Stem Volume of Chinese Pine and Larch Plantations based on Fused Optical Data Using an Improved Variable Screening Method and Stacking Algorithm. Remote Sens. 2020, 12, 871. [Google Scholar] [CrossRef]
  107. Zhao, P. Aboveground Forest Biomass Estimation Based on Landsat TM and ALOS PALSAR Data. Master’s Thesis, Zhejiang Agriculture Forestry University, Hangzhou, China, 2017. [Google Scholar]
  108. Chunyu, D. Estimation of Forest Aboveground Biomass and Determination of its Saturation Values Based on Passive and Active Data. Ph.D. Thesis, Northeast Forestry University, Harbin, China, 2023. [Google Scholar]
  109. Lin, H.; Zhao, W.; Long, J.; Liu, Z.; Yang, P.; Zhang, T.; Ye, Z.; Wang, Q.; Matinfar, H.R. Mapping Forest Growing Stem Volume Using Novel Feature Evaluation Criteria Based on Spectral Saturation in Planted Chinese Fir Forest. Remote Sens. 2023, 15, 402. [Google Scholar] [CrossRef]
Figure 1. Location map of the study area: (a) the location map of the study area; (b) the HV polarization data of the study area; (c) the true color image of Sentinel-2, with the actual sample locations indicated by green dots.
Figure 2. Relationships between forest structural parameters at the sample site level: (a) mean DBH vs. S; (b) CC vs. BA; (c) mean DBH vs. mean forest height, where the size and color shade of the dots vary with biomass.
Figure 3. Flowchart of the methodology.
Figure 4. Determination of the number of model leaves and decision tree.
Figure 5. Parameter optimization diagrams of the three models. From left to right are the results of the RF, SVR, and ANN models. From top to bottom are the results obtained in each model for the horizontal structure indices (V1), vertical structure indices (V2), horizontal + vertical structure indices (V3), horizontal + vertical structure indices + topographical variables (V4), variables selected by the Pearson correlation coefficient (V5), variables selected by RF importance (V6), and variables selected by the LASSO (V7).
Figure 6. Summary graphs of the results of the training and test sets of the three models for estimating AGB. The first and second rows of each model are the training set results and test set results, respectively. The left side is $R^2$, and the right side is RMSE.
Figure 7. Summary plot of model results. From left to right are the AGB estimates obtained by introducing the horizontal structure indices (V1), vertical structure indices (V2), horizontal + vertical structure indices (V3), horizontal + vertical structure indices + topographical variables (V4), variables selected by the Pearson correlation coefficient (V5), variables selected by RF importance (V6), and variables selected by the LASSO (V7) into the three models. From top to bottom are the results of the RF, SVR, and ANN models; the first and second rows of each model are the training set results and test set results, respectively. The horizontal axis is the measured AGB, the vertical axis is the predicted AGB, the blue line is the 1:1 line, and the green line is the fitted line.
Figure 8. AGB map of the study area estimated by the LASSO-based SVR model: (a) AGB map of the study area; (b) histogram of AGB distribution.
Figure 9. The spherical model curves for each structure index and different variable sets under different ML models. The left side shows the horizontal structure indices figures (RVI, CTI, MTI, PTI) in order, and the individual index figures of CC, S, and BA are shown on the right. Right side: vertical structure indices figures and a summary plot of the spherical model curves for the different sets of variables under the RF, SVR, and ANN models.
Table 1. Summary of sample plot data.

| Parameter | Max | Min | Mean | STD |
|---|---|---|---|---|
| Canopy closure (%) | 1 | 0.15 | 0.7 | 0.196 |
| Mean forest height (m) | 27.8 | 5.81 | 17.3 | 4.88 |
| Mean DBH (cm) | 33.1 | 7.2 | 22.74 | 7.3 |
| Basal area (m²/ha) | 48 | 1.2 | 24.7 | 9.7 |
| Forest stand density (stems/ha) | 5100 | 133 | 960 | 1083 |
| AGB (t/ha) | 228.64 | 12.27 | 90.6 | 35.75 |
Table 2. Summary of allometric growth equations for tree species in the sample site. The table shows the tree species recorded during the field work and the tree species allometric equations used to estimate biomass.

| Tree Species | Academic Name | Allometric Equation |
|---|---|---|
| Larch | Larix gmelinii (Rupr.) Kuzen. | $AGB = 0.046 \times (DBH^2 \times H)^{0.905}$ |
| Sphagnum pine | Pinus sylvestris var. mongholica Litv. | $AGB = 0.058 \times (DBH^2 \times H)^{0.930}$ |
| Spruce | Picea asperata Mast. | $AGB = 0.068 \times (DBH^2 \times H)^{0.866}$ |
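The power-law form in Table 2 can be evaluated tree by tree and summed per plot. The following is a minimal sketch under stated assumptions, not the authors' processing chain: the function names `tree_agb` and `plot_agb` and the example tally are hypothetical, and the units (DBH in cm, H in m) are assumed because the table does not state them.

```python
# Coefficients (a, b) of AGB = a * (DBH^2 * H)^b, copied from Table 2.
ALLOMETRIC_COEFFS = {
    "Larix gmelinii": (0.046, 0.905),
    "Pinus sylvestris var. mongholica": (0.058, 0.930),
    "Picea asperata": (0.068, 0.866),
}


def tree_agb(species: str, dbh: float, height: float) -> float:
    """Single-tree AGB from the Table 2 power-law form.

    Assumed units: dbh in cm, height in m (not stated in the table).
    """
    if dbh <= 0 or height <= 0:
        raise ValueError("dbh and height must be positive")
    a, b = ALLOMETRIC_COEFFS[species]
    return a * (dbh ** 2 * height) ** b


def plot_agb(trees) -> float:
    """Sum tree-level AGB over an iterable of (species, dbh, height) tuples."""
    return sum(tree_agb(sp, d, h) for sp, d, h in trees)


# Hypothetical tally for one sample plot (illustrative values, not field data).
example_trees = [
    ("Larix gmelinii", 22.0, 17.0),
    ("Pinus sylvestris var. mongholica", 25.0, 18.0),
    ("Picea asperata", 15.0, 10.0),
]
print(round(plot_agb(example_trees), 2))
```

Converting the plot total to a per-hectare value would additionally require the plot area, which is not given in this section.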
Table 3. Table of optical data acquisition information.

| Sensor | Time | Spatial Resolution (m) |
|---|---|---|
| Landsat 8 | 16 June 2020 | 30 |
| Sentinel-2A | 21 June 2020 | 10, 20 |
Table 4. List of features derived from S2A and L8 sensors. All variable extraction methods and detailed explanations are described in Supplementary S1.

| Variables | Name |
|---|---|
| Band | S2A: B2, B3, B4, B5, B6, B7, B8, B8A, B11, B12; L8: B2, B3, B4, B5, B6, B7 |
| Texture | contrast, dissimilarity, homogeneity, second moment, entropy, mean, variance, correlation |
| Biophysical parameter | fractional vegetation cover (FVC), leaf area index (LAI), chlorophyll content in the leaf (Cab), canopy water content (Cwc) |
| Horizontal structure index | ratio vegetation index (RVI): RVI_CC, RVI_S, RVI_BA; corresponding texture index (CTI): CTI_CC, CTI_S, CTI_BA; mean texture index (MTI): MTI_CC, MTI_S, MTI_BA; principal component texture index (PTI): PTI_CC, PTI_S, PTI_BA |
Table 5. Table of ALOS data acquisition information.

| Parameter | Value |
|---|---|
| Data Level | HBQR1.1 |
| Imaging Date (MMDD) | 0711, 0725, 0808, 0905, 0919 |
| Polarization Channel | Full polarization (HH, HV, VH, VV) |
Table 6. Variables extracted from the five-scene ALOS-2 PALSAR data. All variable extraction methods and detailed explanations are described in Supplementary S2.

| Variables | Name |
|---|---|
| Backscattering coefficient | HH, HV, VH, VV |
| Polarization decomposition variable | odd scattering (S), double scattering (D), volume scattering (V), M1, M2 |
| Polarization decomposition parameter | nine parameters of the coherence matrix $T_3$, where the real and imaginary parts were extracted separately |
| SAR index | canopy structure index (CSI), biomass index (BMI), volume scattering index (VSI), radar forest degradation index (RFDI), radar vegetation index (RVI) |
| Vertical structure index | 0711-0725H, 0725-0808H, 0905-0919H, 0725-0905H, 0808-0919H, 0711-0919H |
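The SAR indices listed in Table 6 are defined in Supplementary S2, which is not reproduced here. As a rough illustration only, the sketch below uses the forms commonly cited in the literature for these index names; treat every formula as an assumption rather than the definition used in this study. The function names `db_to_linear` and `sar_indices` are hypothetical, and the inputs are assumed to be backscatter in linear power units.

```python
def db_to_linear(sigma0_db: float) -> float:
    """Convert a backscattering coefficient from dB to linear power."""
    return 10.0 ** (sigma0_db / 10.0)


def sar_indices(hh: float, hv: float, vh: float, vv: float) -> dict:
    """Commonly cited forms of the five indices (assumed, not taken from this paper).

    All inputs are linear-power backscatter values, not dB.
    """
    cross = (hv + vh) / 2.0  # mean cross-polarized backscatter
    bmi = (hh + vv) / 2.0    # biomass index
    return {
        "CSI": vv / (vv + hh),                    # canopy structure index
        "BMI": bmi,
        "VSI": cross / (cross + bmi),             # volume scattering index
        "RFDI": (hh - hv) / (hh + hv),            # radar forest degradation index
        "RVI": 8.0 * hv / (hh + vv + 2.0 * hv),   # radar vegetation index
    }


# Illustrative backscatter values in dB (hypothetical, not from the study area).
pixel = {k: db_to_linear(v) for k, v in {"hh": -7.0, "hv": -13.0, "vh": -13.0, "vv": -9.0}.items()}
print(sar_indices(**pixel))
```

Ratios such as these must be computed on linear values; applying them directly to dB values gives meaningless results.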
Table 7. Summary of the three methods of variable selection. The Pearson correlation coefficient was used to select variables whose correlation with AGB was significant at the 0.05 level. RF importance was used to select variables with importance greater than 0.1.

| Variables | Pearson Correlation Coefficient | Random Forest Importance | LASSO |
|---|---|---|---|
| Band | S2A: B2, B3, B4, B5, B6, B7, B8, B8A, B11, B12; L8: B5, B6, B7 | S2A: B2, B3, B5, B11, B12; L8: B5 | S2A: B3, B6, B11 |
| Biophysical parameter | L8: Cwc | / | S2: FVC |
| Horizontal structure index | RVI_BA, CTI_CC, CTI_S, CTI_BA, MTI_CC, MTI_S, MTI_BA, PTI_CC, PTI_BA | RVI_BA, CTI_CC, CTI_BA, MTI_BA, PTI_CC, PTI_BA | MTI_BA, PTI_CC, PTI_BA |
| Polarization decomposition variable | / | 0919: VM2 | 0711: FD; 0905: FD |
| Polarization decomposition parameter | 0725: T23_imag; 0808: T13_imag | 0808: T12_imag | 0905: T11, 0905: T23_imag |
| SAR index | / | 0808: CSI | 0725: CSI; 0808: VSI; 0919: VSI; 0919: RVI |
| Vertical structure index | 0711-0725H, 0725-0808H, 0905-0919H, 0725-0905H, 0808-0919H, 0711-0919H | 0711-0725H, 0725-0808H, 0905-0919H, 0725-0905H, 0711-0919H | 0725-0808H; 0905-0919H |
Table 8. Summary of optimal parameters for the three machine learning models.

| Variables | RF (nTree = 500; nLeaf = 5) | SVR (Kernel = Radial) | ANN |
|---|---|---|---|
| V1 | mtry = 2 | C = 4; epsilon = 0.5; gamma = 0.08 | size = 4; decay = 0.01 |
| V2 | mtry = 4 | C = 4; epsilon = 0; gamma = 0.17 | size = 3; decay = 0.01 |
| V3 | mtry = 3 | C = 4; epsilon = 0; gamma = 0.06 | size = 6; decay = 0.01 |
| V4 | mtry = 4 | C = 4; epsilon = 0; gamma = 0.05 | size = 5; decay = 0.01 |
| V5 | mtry = 5 | C = 4; epsilon = 0; gamma = 0.03 | size = 6; decay = 0.01 |
| V6 | mtry = 3 | C = 4; epsilon = 0; gamma = 0.05 | size = 5; decay = 0.01 |
| V7 | mtry = 3 | C = 4; epsilon = 0; gamma = 0.05 | size = 14; decay = 0.01 |
Table 9. Summary of the results for the training and test sets of the three models estimating AGB.

| Variables | ML | Training $R^2$ | Training RMSE | Test $R^2$ | Test RMSE |
|---|---|---|---|---|---|
| V1 | RF | 0.8017 | 14.9343 | 0.8332 | 14.3347 |
| V1 | SVR | 0.7630 | 15.7717 | 0.7811 | 13.2919 |
| V1 | ANN | 0.6799 | 19.8974 | 0.7925 | 19.9652 |
| V2 | RF | 0.6439 | 13.3642 | 0.4903 | 15.5579 |
| V2 | SVR | 0.7620 | 16.2804 | 0.6843 | 19.3009 |
| V2 | ANN | 0.2669 | 15.9620 | 0.4222 | 17.9815 |
| V3 | RF | 0.8674 | 12.1673 | 0.8645 | 14.2202 |
| V3 | SVR | 0.9585 | 8.9679 | 0.6025 | 21.8577 |
| V3 | ANN | 0.7948 | 16.9127 | 0.7527 | 24.4403 |
| V4 | RF | 0.8867 | 11.2780 | 0.8799 | 13.4761 |
| V4 | SVR | 0.9946 | 3.4059 | 0.8430 | 19.7249 |
| V4 | ANN | 0.8353 | 15.4998 | 0.8113 | 24.2102 |
| V5 | RF | 0.8706 | 12.6997 | 0.7043 | 18.8707 |
| V5 | SVR | 0.9816 | 6.1878 | 0.7003 | 25.5541 |
| V5 | ANN | 0.8366 | 15.6544 | 0.7781 | 23.8184 |
| V6 | RF | 0.8566 | 13.3741 | 0.7069 | 19.2561 |
| V6 | SVR | 0.9871 | 4.9311 | 0.7045 | 19.4594 |
| V6 | ANN | 0.8331 | 15.8408 | 0.7537 | 23.9855 |
| V7 | RF | 0.8826 | 11.9030 | 0.8029 | 15.0062 |
| V7 | SVR | 0.9998 | 0.7376 | 0.8792 | 12.5204 |
| V7 | ANN | 0.8917 | 13.1424 | 0.7065 | 34.5550 |
Note: Bold denotes the optimal result.
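The two accuracy measures reported in Table 9 can be computed from paired measured and predicted AGB values. The sketch below assumes the usual definitions (coefficient of determination as one minus the ratio of residual to total sum of squares, and root mean square error); this section does not spell out the exact formulas, and the function names and sample numbers are hypothetical.

```python
import math


def rmse(measured, predicted) -> float:
    """Root mean square error between measured and predicted values."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(measured, predicted) -> float:
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    if ss_tot == 0:
        raise ValueError("measured values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical plot-level AGB values in t/ha (illustrative only).
measured_agb = [60.0, 85.0, 110.0, 140.0]
predicted_agb = [66.0, 80.0, 118.0, 131.0]
print(r_squared(measured_agb, predicted_agb), rmse(measured_agb, predicted_agb))
```

A large gap between training and test values of these measures, as for several SVR rows in Table 9, is the usual sign of overfitting.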
Table 10. Summary of the results of the fit of the new indices to the AGB and the saturation value. Fitting parameters ($c_0$, $c$) and AGB saturation values (BS) are given for the spherical models with different structure indices applied; $R^2$ is the test set result.

| Variables | $c_0$ | $c$ | BS (t/ha) | $R^2$ |
|---|---|---|---|---|
| RVI_CC | 0.47 | 0.06 | 188.03 | 0.05 |
| RVI_S | 0.04 | 0.00 | 140.63 | 0.20 |
| RVI_BA | 0.34 | 0.10 | 212.03 | 0.34 |
| CTI_CC | 0.45 | 0.16 | 171.30 | 0.13 |
| CTI_S | 0.76 | 0.16 | 181.91 | 0.09 |
| CTI_BA | 0.65 | 0.35 | 210.27 | 0.46 |
| MTI_CC | 0.19 | 0.07 | 178.29 | 0.16 |
| MTI_S | 0.13 | 0.14 | 185.43 | 0.09 |
| MTI_BA | 0.33 | 0.21 | 211.58 | 0.49 |
| PTI_CC | 21.05 | 22.16 | 200.00 | 0.32 |
| PTI_S | 0.10 | 0.17 | 171.99 | 0.02 |
| PTI_BA | 3.42 | 5.27 | 263.12 | 0.51 |
| 0711-0725H | 5.88 | 12.16 | 164.79 | 0.14 |
| 0711-0919H | 6.91 | 11.38 | 173.00 | 0.15 |
| 0725-0808H | 8.98 | 13.34 | 200.30 | 0.25 |
| 0725-0905H | 8.71 | 9.76 | 193.59 | 0.19 |
| 0808-0919H | 6.05 | 8.32 | 195.36 | 0.10 |
| 0905-0919H | 10.37 | 14.32 | 278.03 | 0.18 |
| S2_B11 | 0.19 | 0.09 | 186.47 | 0.35 |
Note: Bold denotes the optimal result.
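To make the roles of $c_0$, $c$, and BS concrete, the sketch below evaluates a spherical model in its standard geostatistical form, with the range parameter taken as the AGB saturation value. This form is an assumption: the exact equation used in the study is not given in this section, and the function name `spherical_model` is hypothetical. The parameter values are the RVI_BA entries as they appear in Table 10, used purely for illustration.

```python
def spherical_model(agb: float, c0: float, c: float, saturation: float) -> float:
    """Standard spherical model (assumed form).

    The response rises from c0 at AGB = 0 and levels off at c0 + c once AGB
    reaches `saturation`, which plays the role of the range parameter.
    """
    if saturation <= 0:
        raise ValueError("saturation must be positive")
    if agb < 0:
        raise ValueError("agb must be non-negative")
    if agb >= saturation:
        return c0 + c  # plateau: the variable no longer responds to AGB
    ratio = agb / saturation
    return c0 + c * (1.5 * ratio - 0.5 * ratio ** 3)


# Illustrative parameters: the RVI_BA entries as listed in Table 10.
c0, c, bs = 0.34, 0.10, 212.03
for agb in (0.0, 50.0, 100.0, 212.03, 250.0):
    print(agb, round(spherical_model(agb, c0, c, bs), 4))
```

Under this form, a larger BS means the variable keeps responding to AGB over a wider range before the curve flattens, which is why BS is read as the saturation value.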
Table 11. Summary table of saturation value obtained by each model. Fitting parameters ($c_0$, $c$) and saturation values (BS) were obtained by substituting the optimal results obtained by the three machine learning methods into the spherical model.

| Variables | ML | $c_0$ | $c$ | BS (t/ha) |
|---|---|---|---|---|
| V1 | RF | 0.25 | 0.14 | 148.16 |
| V1 | SVR | 0.28 | 0.17 | 151.51 |
| V1 | ANN | 0.21 | 0.10 | 156.93 |
| V2 | RF | 0.27 | 0.15 | 135.71 |
| V2 | SVR | 0.22 | 0.10 | 157.04 |
| V2 | ANN | 0.33 | 0.21 | 109.21 |
| V3 | RF | 0.27 | 0.16 | 142.77 |
| V3 | SVR | 0.22 | 0.11 | 158.58 |
| V3 | ANN | 0.20 | 0.09 | 159.94 |
| V4 | RF | 0.27 | 0.16 | 143.33 |
| V4 | SVR | 0.21 | 0.10 | 174.11 |
| V4 | ANN | 0.21 | 0.10 | 162.74 |
| V5 | RF | 0.27 | 0.16 | 146.08 |
| V5 | SVR | 0.22 | 0.12 | 169.41 |
| V5 | ANN | 0.21 | 0.11 | 162.49 |
| V6 | RF | 0.27 | 0.16 | 143.88 |
| V6 | SVR | 0.22 | 0.12 | 168.35 |
| V6 | ANN | 0.22 | 0.11 | 156.66 |
| V7 | RF | 0.27 | 0.17 | 146.17 |
| V7 | SVR | 0.21 | 0.10 | 185.73 |
| V7 | ANN | 0.20 | 0.09 | 165.03 |
Note: Bold denotes the optimal result.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Sa, R.; Nie, Y.; Chumachenko, S.; Fan, W. Biomass Estimation and Saturation Value Determination Based on Multi-Source Remote Sensing Data. Remote Sens. 2024, 16, 2250. https://doi.org/10.3390/rs16122250