Improving Accuracy Estimation of Forest Aboveground Biomass Based on Incorporation of ALOS-2 PALSAR-2 and Sentinel-2A Imagery and Machine Learning: A Case Study of the Hyrcanian Forest Area (Iran)

The main objective of this research is to investigate the potential of combining Sentinel-2A and ALOS-2 PALSAR-2 (Advanced Land Observing Satellite-2 Phased Array type L-band Synthetic Aperture Radar-2) imagery to improve the accuracy of Aboveground Biomass (AGB) estimation. According to the current literature, this kind of investigation has rarely been conducted. The Hyrcanian forest area (Iran) was selected as the case study. For this purpose, a total of 149 sample plots in the study area were documented through fieldwork. Using the imagery, three datasets were generated: the Sentinel-2A dataset, the ALOS-2 PALSAR-2 dataset, and the combination of the two (Sentinel-ALOS). Because the accuracy of AGB estimation depends on the method used, four machine learning techniques were selected and compared, namely Random Forests (RF), Support Vector Regression (SVR), Multi-Layer Perceptron Neural Networks (MLP Neural Nets), and Gaussian Processes (GP). The performance of these AGB models was assessed using the coefficient of determination (R²), the root-mean-square error (RMSE), and the mean absolute error (MAE). The results showed that the AGB models derived from the combination of the Sentinel-2A and the ALOS-2 PALSAR-2 data had the highest accuracy, followed by models using the Sentinel-2A dataset and the ALOS-2 PALSAR-2 dataset. Among the four machine learning models, the SVR model (R² = 0.73, RMSE = 38.68, and MAE = 32.28) had the highest prediction accuracy, followed by the GP model (R² = 0.69, RMSE = 40.11, and MAE = 33.69), the RF model (R² = 0.62, RMSE = 43.13, and MAE = 35.83), and the MLP Neural Nets model (R² = 0.44, RMSE = 64.33, and MAE = 53.74). Overall, the Sentinel-2A imagery provides a reasonable result, while the ALOS-2 PALSAR-2 imagery provides a poor result for the forest AGB estimation. The combination of the Sentinel-2A imagery and the ALOS-2 PALSAR-2 imagery improved the estimation accuracy of AGB compared to that of the Sentinel-2A imagery only.


Introduction
Forests play an important role in the global carbon cycle by reducing carbon dioxide concentrations, which further mitigates the impact of global warming and climate change [1][2][3]. Unfortunately, forests have been destroyed worldwide, with a loss of around 2.3 million km² from 2000 to 2012, which resulted in carbon stock losses [4]. As such, accurate measurement of forest carbon stocks and aboveground biomass (AGB) is considered key for understanding the global carbon cycle, for evaluating emissions from deforestation, and for regional, sustainable land-use planning [5,6].
Numerous studies have been conducted to estimate forest AGB. These studies can be grouped into destructive methods and non-destructive methods. Although destructive approaches such as field measurements are considered to be the most accurate method for estimating AGB, they are time-consuming and costly. In addition, they may not be feasible for a large-scale analysis [7], particularly in regions of dense and mixed forests (e.g., in tropical and subtropical mountainous areas). Thus, the development of accurate and low-cost models to estimate forest biomass is still greatly needed to support global climate change mitigation programs such as the United Nations' Reducing Emissions from Deforestation and Forest Degradation in Developing Countries (REDD+) scheme [8,9].
For forest AGB modeling, remotely sensed data has been widely used. Accordingly, the forest AGB is estimated through regression models, which are constructed based on direct relationships between AGB and spectral reflectance values derived from remotely sensed data. However, achieving acceptable accuracy when estimating the forest AGB is still an obstacle in many areas [6,10]. In terms of optical data, the main drawback for AGB estimation is due to the characteristics of forests such as closed canopy, high diversity, and other complex structures [11,12]. Consequently, the optical data may become saturated and may be less sensitive in high biomass regions, resulting in challenges for accurately estimating AGB [13,14]. Compared to the optical data, SAR (Synthetic Aperture Radar) data has proven to be more effective in monitoring forest disturbance dynamics because SAR signals can penetrate the clouds that are common in tropical areas [15,16].
Among SAR image data, L-band is the most widely used for estimating forest AGB due to its sensitivity to woody components [17,18]. However, the L-band is only efficient for a certain range of AGB, and outside that range, the signal is saturated [19]. To overcome this limitation, integrating multi-source data such as optical images and SAR has been proposed to enhance the accuracy of the forest AGB estimation [20,21]. These combinations can be carried out in two ways: (i) data fusion of both SAR and optical images into a new dataset (e.g., by principal component analysis (PCA) or wavelet transforms) [22]; and (ii) incorporation of all images [20]. However, there is still a debate on the accuracy of AGB estimation using SAR combined with optical data. Cutler et al. [23] pointed out that the combination of multispectral Landsat TM (Thematic Mapper) and JERS (Japanese Earth Resources Satellite) SAR data may produce more precise results. Conversely, Hame et al. [24] concluded that the combination of optical ALOS AVNIR (Advanced Land Observing Satellite Advanced Visible and Near Infrared Radiometer) data and ALOS PALSAR (Phased Array type L-band Synthetic Aperture Radar) data did not improve the accuracy of the biomass estimation. Therefore, further investigations on the combination of optical data and SAR data for AGB estimation should be carried out to gather more evidence and reach reasonable conclusions.
The recent launches of the new generation ALOS-2 PALSAR-2 (24 May 2014) by the Japan Aerospace Exploration Agency (JAXA) and the Sentinel-2A satellite (23 June 2015) by the European Space Agency (ESA) have provided new opportunities for forest AGB studies. The ALOS-2 PALSAR-2 offers L-band radar imagery in dual polarization (HH+HV) or full polarization (HH+HV+VH+VV) [25], whereas the Sentinel-2A provides 13 multispectral bands [26]. The radar imagery has encouraged new approaches for developing more accurate models for estimating AGB. Investigation of the ALOS-2 PALSAR-2 for the forest AGB estimation is still rare. However, a recent study on the use of the Sentinel-2A MSI (Multi Spectral Instrument) data for AGB estimation shows high accuracy [27]. In addition, the Sentinel-2A MSI sensor yields better results than Landsat 8 OLI for measuring forest canopy cover and leaf area index (LAI) [28] and for the AGB estimation of grasses [27].
Thus, compared to the Landsat series, the Sentinel-2 sensor provides more spectral bands, including visible and near infrared (VNIR) bands, and finer spatial resolutions (10-20 m) than those of the Landsat data. More recently, combining Sentinel-2A imagery and Sentinel-1 (SAR) imagery for AGB estimation was carried out, and it was concluded that this combination could be used for estimating AGB [29]. Nevertheless, the integration of the ALOS-2 PALSAR-2 and the Sentinel-2A data for estimating forest AGB has rarely been studied.
We address this gap in the literature by investigating the integration of the ALOS-2 PALSAR-2 and the Sentinel-2A MSI data for estimating the forest AGB, with a case study of the Hyrcanian forest of Iran. The combination of the ALOS-2 PALSAR-2 and the Sentinel-2A MSI data was used because the ALOS-2 PALSAR-2 sensor has long wavelengths that penetrate into forest canopies, whereas the Sentinel-2A sensor has multispectral bands representing the canopy cover reflectance of different species. Moreover, no single method is best for estimating AGB in all areas. Therefore, in this work, four state-of-the-art machine learning methods, namely Random Forest (RF), Support Vector Regression (SVR), Multi-Layer Perceptron Neural Network (MLP Neural Net), and Gaussian Processes (GP), were used for building AGB models. These machine learning methods were selected because they have proven to be effective for forest AGB estimation in various investigations [30][31][32]. The four machine learning algorithms are available in the Weka open source software [33]. In addition, a Python application developed by the authors was used to transfer the predicted AGB results to the GIS (Geographic Information System) environment.

Description of the Study Area
We conducted this study in the Hyrcanian forests of the Gilan province in northern Iran (Figure 1). This area has a subtropical climate where the monthly average temperature is 14 °C and the annual precipitation is 1100 mm. The topography of Gilan is characterized by high mountainous terrain in the west and the south, while the terrain is flat in the north. According to our survey, the forest in Gilan covers an area of 564,712 ha with closed canopy and is characterized by broadleaf and other vegetation types.
In this project, we selected an area of about 1100 ha of the Hyrcanian forest for the research. This area is located between the longitudes 48°48′45″ and 48°50′35″ E, and between the latitudes 37°37′55″ and 37°41′13″ N. The altitude of the study area ranges from 110 m to 200 m above sea level. The forest of the study area is a suitable ecosystem for the broadleaf Hyrcanian species, especially Fagus orientalis. The forested area is natural and mature, with uneven-aged and dense to semi-dense stands. The area is comprised of mixed hardwood types such as Fagus dominant, mixed Fagus, and Carpinus-Fagus. The most important tree species are Acer cappadocicum, Fagus orientalis, Carpinus betulus, Tilia platyphyllos, and Alnus glutinosa [34].

Satellite Data Collection and Processing
Table 1 lists the full-polarimetric L-band ALOS-2 PALSAR-2 data and the Sentinel-2A MSI data used for estimating the forest AGB in this study. The ALOS-2 PALSAR-2 data (HH, HV, VV, and VH) at level 1.1 was acquired in the dry season, with almost no rainfall on the date of the acquisition (1 June 2015). The off-nadir angle (θ) is 35.3° and the range resolution is 6 m (http://www.eorc.jaxa.jp/ALOS-2/en/obs/pal2_w-cycle.htm). The data was acquired in the single-look complex (SLC) format. For this research, multi-looking was applied to de-speckle the full-polarization ALOS-2 PALSAR-2 scene, to enhance the radiometric resolution, and to square the pixels in ground range geometry at a spatial resolution (10 m) similar to that of the Sentinel-2A imagery [35]. Because terrain can significantly impact the result of forest AGB estimation [12], the ALOS-2 PALSAR-2 data were radiometrically terrain-corrected and geocoded using the Shuttle Radar Topography Mission Digital Elevation Model (DEM) of NASA (National Aeronautics and Space Administration) [36]. This is suggested by JAXA [25] to correct inherent SAR geometry effects. The geocoded images were re-sampled to a regular grid with 10 m × 10 m pixel size. The intensity values were converted to the normalized radar backscattering coefficients (σ⁰) using Equation (1).
$$ \sigma^{0} = 10 \log_{10}\left(I^{2} + Q^{2}\right) + CF \qquad (1) $$
where σ⁰ is the backscattering coefficient (in dB); I and Q are the real and imaginary parts of the complex SAR image pixel values [25]; and CF (calibration factor) = −83 dB [37].
Then, the backscattering coefficients were converted into the corrected backscatter in gamma-naught (γ⁰ values) to reduce the effect of the incidence angle on the radar backscatter. In addition, the ALOS-2 PALSAR-2 data were co-registered with the Sentinel-2A MSI scene of the study area (Table 1). In this research, we calculated additional polarimetric parameters such as the alpha angle (α), entropy (H), and anisotropy (A) for the modeling.
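The two radiometric conversions described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Equation (1) is taken in its commonly used form for SLC data, σ⁰ = 10·log10(I² + Q²) + CF, and the gamma-naught normalization is assumed to be the simple cosine correction γ⁰ = σ⁰ / cos(θ) in linear power, which the text does not spell out. The function names are hypothetical and do not reflect the software used by the authors.

```python
import math

CF_DB = -83.0  # calibration factor quoted in the text, in dB


def sigma0_db(i_part, q_part, cf_db=CF_DB):
    """Backscattering coefficient (dB) from the real and imaginary SLC parts."""
    power = i_part * i_part + q_part * q_part
    if power <= 0.0:
        raise ValueError("pixel has zero power; sigma0 is undefined")
    return 10.0 * math.log10(power) + cf_db


def gamma0_db(sigma0, incidence_deg):
    """Assumed cosine normalization: gamma0 = sigma0 / cos(theta), done in dB."""
    return sigma0 - 10.0 * math.log10(math.cos(math.radians(incidence_deg)))


if __name__ == "__main__":
    s0 = sigma0_db(1000.0, 0.0)
    print(round(s0, 2), round(gamma0_db(s0, 35.3), 2))
```

In a terrain-corrected workflow the angle passed to the second function would normally be the local incidence angle per pixel rather than a single scene-wide value.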
The Sentinel-2A MSI Level 1C data was acquired on 20 June 2016 (Table 1). Four multispectral bands with a spatial resolution of 10 m were used: Blue (490 nm), Green (560 nm), Red (665 nm), and Near-Infrared (NIR, 842 nm). The Sentinel-2A Level 1C top-of-atmosphere (TOA) reflectance data was then processed to Level-2A using ESA's Sen2Cor algorithm (http://step.esa.int/main/third-party-plugins-2/sen2cor/) to obtain bottom-of-atmosphere (BOA) reflectance images. ENVI 5.2 software was employed for SAR imagery processing, while the SNAP toolbox (http://step.esa.int/main/toolboxes/snap/) was used to process the Sentinel-2A imagery.

Image Transformation of the Sentinel-2A MSI Data
The vegetation index is the most common image transformation for multispectral data, since each index is sensitive to different biophysical parameters such as canopy cover, biomass, and forest volume. In this study, six vegetation indices, namely SVI (Simple Vegetation Index), RVI (Ratio Vegetation Index), NDVI (Normalized Difference Vegetation Index), EVI-2 (Enhanced Vegetation Index-2), SAVI (Soil Adjusted Vegetation Index), and PVI-2 (Perpendicular Vegetation Index-2) (see Table 2), were selected for the AGB estimation as suggested in References [31,32,38]. Additionally, Principal Component Analysis (PCA) was computed from the four multispectral bands of the Sentinel-2A and used for the forest AGB modeling as suggested in previous studies [39,40].
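As a brief illustration of how such indices are derived from the Red and NIR reflectance bands, the sketch below implements four of them using their widely published textbook definitions. These formulas are an assumption here, since Table 2 is the authoritative source for the exact forms used in this study; SVI and PVI-2 are omitted because their definitions vary between authors. The soil adjustment factor of 0.5 for SAVI is likewise an assumed default.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)


def rvi(nir, red):
    """Ratio Vegetation Index."""
    return nir / red


def savi(nir, red, soil_factor=0.5):
    """Soil Adjusted Vegetation Index with an assumed soil factor of 0.5."""
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def evi2(nir, red):
    """Two-band Enhanced Vegetation Index."""
    return 2.5 * (nir - red) / (nir + 2.4 * red + 1.0)


if __name__ == "__main__":
    # Hypothetical BOA reflectances for a single vegetated pixel.
    nir, red = 0.40, 0.10
    for name, func in [("NDVI", ndvi), ("RVI", rvi), ("SAVI", savi), ("EVI-2", evi2)]:
        print(name, round(func(nir, red), 3))
```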

Field Plot Measurement and Field Biomass Estimation
We conducted field survey measurements during the autumn of 2015 with permission from local authorities. The fieldwork did not cause any damage to the forest ecosystem. We selected the plots using stratified random sampling, in which each plot was determined based on a 200 m × 300 m grid overlaid on the entire territory to cover the range of biomass values. A total of 149 sampling plots with a plot size of 30 m × 30 m were measured for specific biophysical parameters such as the diameter at breast height (DBH) and tree height. All trees with DBH greater than 7.5 cm were measured. The DBH and the tree height were measured with pulse-laser technology. The center location of each plot was recorded using the Trimble R3 differential global positioning system (DGPS). The four corners of each 30 m × 30 m sampling plot were established in the field using the DGPS method. This plot size accommodates both the ALOS-2 PALSAR-2 pixels (6 m × 6 m) and the Sentinel-2A pixels (10 m × 10 m). To minimize the positional inaccuracies of DGPS and geo-coding, we calculated the mean pixel value for each sampling plot with a moving window size of 3 × 3 pixels for the Sentinel-2A and a window size of 3 × 3 pixels for the ALOS-2 PALSAR-2 data.
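The 3 × 3 window averaging used to extract one predictor value per plot can be sketched as follows. This is a minimal pure-Python illustration, assuming the raster is held as a list of rows and that the plot center has already been converted to a row and column index; the function name and the edge handling (averaging only the pixels that fall inside the raster) are assumptions rather than the authors' implementation.

```python
def window_mean(raster, row, col, size=3):
    """Mean of the size x size window centered on (row, col).

    Pixels falling outside the raster are skipped, so windows at the
    image edge are averaged over fewer values.
    """
    half = size // 2
    values = []
    for r in range(row - half, row + half + 1):
        for c in range(col - half, col + half + 1):
            if 0 <= r < len(raster) and 0 <= c < len(raster[0]):
                values.append(raster[r][c])
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical 3 x 3 band excerpt around a plot center.
    band = [
        [1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0],
        [7.0, 8.0, 9.0],
    ]
    print(window_mean(band, 1, 1))
```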
Because there is no specific allometric equation for the Hyrcanian forest types, we calculated the AGB of each tree using the specific gravity method for the different forest species of the study area. For this purpose, we first calculated the tree volume based on the standard volume table, and then converted it to biomass using the specific gravity of each species. Finally, the AGB values of all trees in each plot were summed and converted to Mg·ha⁻¹.
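The arithmetic behind the specific gravity method can be illustrated with a short sketch. It assumes that tree volume is given in m³ and that the specific gravity is numerically equal to wood density in Mg·m⁻³ (water has a density of 1 Mg·m⁻³), so their product is biomass in Mg; the plot total is then divided by the plot area of 0.09 ha (30 m × 30 m). The tree values below are hypothetical and are not taken from the field data.

```python
PLOT_AREA_HA = 30.0 * 30.0 / 10_000.0  # a 30 m x 30 m plot is 0.09 ha


def tree_biomass_mg(volume_m3, specific_gravity):
    """Tree biomass in Mg, treating specific gravity as density in Mg per m3."""
    return volume_m3 * specific_gravity


def plot_agb_mg_per_ha(trees, plot_area_ha=PLOT_AREA_HA):
    """Sum tree biomass over a plot and scale it to Mg per hectare.

    trees is an iterable of (volume_m3, specific_gravity) pairs.
    """
    total_mg = sum(tree_biomass_mg(volume, gravity) for volume, gravity in trees)
    return total_mg / plot_area_ha


if __name__ == "__main__":
    # Hypothetical plot with two trees of the same species.
    example_trees = [(10.0, 0.6), (20.0, 0.6)]
    print(round(plot_agb_mg_per_ha(example_trees), 2))
```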
Table 3 shows the field data of measured AGB in the sample plots in the study area. The forest AGB in the study area ranged between 45.67 and 436.17 Mg·ha⁻¹ and had an average of 206.54 Mg·ha⁻¹. We randomly split the 149 sample plots into two subsets, with the first one (120 sample plots or 80%) used for training models and the second one (29 sample plots or 20%) used for model validation and confirming the prediction accuracy. The forest AGBs from the sample plots were used as the dependent variable, whereas the independent variables were derived from the Sentinel-2A data and the ALOS-2 PALSAR-2 data. In the next step, three datasets were generated: (1) the Sentinel-2A dataset; (2) the ALOS-2 PALSAR-2 dataset; and (3) the combination of the Sentinel-2A dataset and the ALOS-2 PALSAR-2 dataset (Sentinel-ALOS). A summary of these datasets is shown in Table 4. All the explanatory variables were normalized for use in the machine learning models.
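A random 120/29 split and a normalization step of this kind might look like the sketch below. The text does not state which normalization was applied, so min-max scaling to [0, 1] is an assumption, as is fitting the scaling on the training subset only and reusing it for the validation subset. The seed and function names are hypothetical.

```python
import random


def split_plots(plot_ids, n_train=120, seed=42):
    """Randomly split plot identifiers into training and validation subsets."""
    shuffled = list(plot_ids)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def min_max_scale(train_values, other_values):
    """Scale both lists with the minimum and maximum of the training values."""
    low, high = min(train_values), max(train_values)
    span = high - low
    if span == 0:
        raise ValueError("training values are constant; cannot scale")
    scale = lambda value: (value - low) / span
    return [scale(v) for v in train_values], [scale(v) for v in other_values]


if __name__ == "__main__":
    train_ids, validation_ids = split_plots(range(149))
    print(len(train_ids), len(validation_ids))
    print(min_max_scale([2.0, 4.0, 6.0], [3.0]))
```

Validation values can fall slightly outside [0, 1] under this scheme, which is expected when the scaling parameters come from the training subset alone.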

Random Forest
Random Forest (RF) is an efficient machine learning method proposed by Breiman [47] that can be used for both classification and regression purposes. RF has proven to yield high accuracy, to be robust to outliers and noise, to compute quickly, and to show the relative importance of input variables [48,49].
In RF, the bagging (bootstrap aggregating) algorithm [50] is used to generate n sub-datasets from the training dataset. These sub-datasets are called bootstrap datasets. Each bootstrap dataset is used to construct a base-decision tree using the Classification And Regression Tree (CART) algorithm [51]. Finally, these base-decision trees are grouped to form a forest called the RF model.
On average, about two-thirds of the samples from the training dataset are included in each bootstrap dataset and are called 'in bag' data. The remaining data is called 'out-of-bag' (OOB) data, and is used to evaluate the RF model [52].
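The in-bag and out-of-bag proportions follow directly from sampling with replacement, which can be checked with a small simulation. The sketch below is illustrative only and uses hypothetical function names; it draws bootstrap samples of the same size as the training set (120 plots, 500 trees, as in this study) and reports the mean fraction of distinct plots that end up in bag, which is expected to be close to 1 − (1 − 1/n)ⁿ, roughly 63% for n = 120.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw one bootstrap sample and return (in_bag indices, out_of_bag indices)."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag


def mean_unique_in_bag_fraction(n_samples=120, n_trees=500, seed=0):
    """Average share of distinct training samples per bootstrap dataset."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trees):
        in_bag, _unused = bootstrap_split(n_samples, rng)
        total += len(set(in_bag)) / n_samples
    return total / n_trees


if __name__ == "__main__":
    print(round(mean_unique_in_bag_fraction(), 3))
```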
Because the performance of the RF model is dependent on the number of base-decision trees used, this parameter should be carefully selected. For this research, 500 base-decision trees were selected to ensure the diversity of the random forest model as suggested in [53,54].

Support Vector Regression
Support Vector Regression (SVR) is one of the most efficient machine learning techniques and was developed from statistical learning theory [55,56]. SVR has proven to outperform conventional methods in environmental modeling [57][58][59][60][61], land-use and land-cover classification [62], and estimating forest biomass [32,63]. The main advantage of SVR is that it predicts accurately even with small numbers of training samples [64].
For the forest AGB estimation, the training process aims to build an SVR function as follows:
$$ f(x) = \sum_{i=1}^{n} \alpha_i \, k(x_i, x) + b \qquad (2) $$
where $k(x_i, x)$ represents the kernel function; $x_i$ is the i-th training vector; $\alpha_i$ denotes the corresponding Lagrange multiplier; n is the number of training vectors; and b is the bias term in the regression. The quality of the forest AGB estimation is measured by the ε-insensitive loss function proposed by Vapnik [65]. In addition, the performance of the SVR model is significantly influenced by the selection of the kernel function. Therefore, in this research, we selected the Radial Basis Function (RBF) kernel because it is the most widely used for determining forest biomass in previous studies [31,32,63]. Consequently, the training of the SVR model required finding the best values for the two meta-parameters, the regularization parameter (C) and the kernel width (γ). For this task, the grid search method was used, as in References [66,67]. Accordingly, the best values were C = 18 and γ = 0.102 for the SVR model with the Sentinel-2A dataset, C = 17 and γ = 0.102 for the ALOS-2 PALSAR-2 dataset, and C = 6 and γ = 0.102 for the Sentinel-ALOS dataset.
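To make the role of the RBF kernel and the kernel width concrete, the sketch below evaluates a trained SVR function of the assumed form f(x) = Σ αᵢ k(xᵢ, x) + b, with k(xᵢ, x) = exp(−γ‖xᵢ − x‖²). It does not train a model; the support vectors, multipliers, and bias in the usage example are hypothetical placeholders, and only γ = 0.102 is taken from the text.

```python
import math


def rbf_kernel(a, b, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-gamma * squared_distance)


def svr_predict(x, support_vectors, alphas, bias, gamma=0.102):
    """Evaluate the SVR function for one input vector x."""
    weighted = sum(
        alpha * rbf_kernel(vector, x, gamma)
        for vector, alpha in zip(support_vectors, alphas)
    )
    return weighted + bias


if __name__ == "__main__":
    # Hypothetical two-feature model with two support vectors.
    vectors = [[0.2, 0.8], [0.6, 0.4]]
    multipliers = [15.0, -4.0]
    print(round(svr_predict([0.5, 0.5], vectors, multipliers, bias=200.0), 3))
```

A larger γ makes each support vector influence only nearby inputs, while a smaller γ yields a smoother function, which is why γ is tuned together with C in the grid search.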

Multi-Layer Perceptron Neural Network
An artificial neural network consists of a large number of highly interconnected nodes using mathematical algorithms to model non-linear complex problems such as forest biomass modeling. Although various neural network algorithms have been developed, multi-layer perceptron neural networks (MLP Neural Nets) are the most widely used for environmental modeling [68,69], forest monitoring and mapping [70], and forest biomass estimation [71]. The structure of a typical MLP Neural Nets model consists of three layers, namely the input, hidden, and output layers, where each layer is composed of several nodes or neurons. In the input layer, the number of neurons represents the number of input explanatory variables, while the number of neurons in the hidden layer must be determined beforehand depending on the data of the study area. The output layer contains one neuron, which indicates the value of the forest AGB in this study.
The performance of the MLP Neural Nets model is significantly affected by the connection weights between the input and hidden layers and between the hidden and output layers. These weights are adjusted in the training phase by a back-propagation algorithm [72] that minimizes the difference between the AGB value generated by the MLP Neural Nets model and the forest AGB inventories. The process is repeated until a predefined accuracy level or the maximum number of repetitions is reached.
To construct the MLP Neural Nets model for this study, the number of hidden neurons, which has a significant impact on the forest AGB estimation [71,73], was determined using the test suggested in Reference [58]. Thus, by varying the number of neurons versus the root-mean-square error (RMSE) using the training dataset, the best MLP Neural Nets models were determined for the three datasets in this study. These best models are characterized by the highest R², the lowest RMSE, and the lowest MAE (mean absolute error). Accordingly, the best MLP Neural Nets model had two hidden neurons for the Sentinel-2A dataset, three for the ALOS-2 PALSAR-2 dataset, and four for the combined Sentinel-ALOS dataset. The logistic sigmoid was used as the activation function. The learning rate, momentum, and number of training iterations were set to 0.3, 0.2, and 500, respectively, as suggested in References [74,75].
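The forward pass of such a three-layer network is compact enough to write out. The sketch below assumes logistic sigmoid hidden units, as stated above, and a linear output neuron, which is a common choice for regression but is an assumption here. The weights in the usage example are hypothetical; in practice they would come from back-propagation training.

```python
import math


def sigmoid(z):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def mlp_forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """One forward pass: inputs -> sigmoid hidden layer -> linear output neuron."""
    hidden = [
        sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


if __name__ == "__main__":
    # Hypothetical network with three inputs and two hidden neurons.
    prediction = mlp_forward(
        x=[0.4, 0.7, 0.1],
        hidden_weights=[[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]],
        hidden_biases=[0.0, 0.1],
        output_weights=[1.2, -0.7],
        output_bias=0.3,
    )
    print(round(prediction, 4))
```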

Gaussian Processes
Gaussian process (GP) is considered a powerful regression method that has successfully been used in various real world problems [76,77], including calculating aboveground forest biomass [78].GP is a stochastic process that regresses probability distribution functions over the forest AGB data under the assumption that these functions have a Gaussian distribution.
In the GP regression model, the relation between the input variables $x = [x_1, \ldots, x_B] \in \mathbb{R}^{B}$ and the output AGB $y \in \mathbb{R}$ can be expressed by the following equation:
$$ \hat{y} = f(x) = \sum_{i=1}^{N} \alpha_i \, K(x_i, x) \qquad (3) $$
where $x_i$ (i = 1, ..., N) are the explanatory-variable vectors used in the training phase; $\alpha_i$ is the weight assigned to each one of them; and K is a kernel function [79,80].
A GP assumes that $p(f(x_1), \ldots, f(x_N))$ is jointly Gaussian, with some mean $\mu(x)$ and covariance $\Sigma(x)$ given by $\Sigma_{ij} = k(x_i, x_j)$, with k representing a kernel function.
A scaled Gaussian kernel function was employed, using the following equation:
$$ K(x_i, x_j) = \nu \exp\left( -\sum_{b=1}^{B} \frac{\left(x_i^{(b)} - x_j^{(b)}\right)^{2}}{2\sigma_b^{2}} \right) \qquad (4) $$
where $\nu$ is a scaling factor, B is the number of input explanatory variables, and $\sigma_b$ is a dedicated parameter controlling the spread of the relations for each particular input variable b. Model parameters ($\nu$, $\sigma_b$) and model weights $\alpha_i$ can be automatically optimized by maximizing the marginal likelihood in the training set [76,81].
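A small sketch may help show how the per-variable spread parameters act in the kernel. It assumes the scaled Gaussian kernel has the form K(xᵢ, xⱼ) = ν·exp(−Σ_b (xᵢ⁽ᵇ⁾ − xⱼ⁽ᵇ⁾)² / (2σ_b²)) and that predictions are the weighted kernel sum over the training vectors. The weights and hyperparameters in the usage example are hypothetical; in a fitted GP they would be obtained by maximizing the marginal likelihood.

```python
import math


def scaled_gaussian_kernel(xi, xj, nu, sigmas):
    """Scaled Gaussian kernel with one spread parameter per input variable."""
    exponent = sum(
        (a - b) ** 2 / (2.0 * sigma ** 2)
        for a, b, sigma in zip(xi, xj, sigmas)
    )
    return nu * math.exp(-exponent)


def gp_predict(x, train_vectors, alphas, nu, sigmas):
    """Predicted value as the weighted sum of kernels to the training vectors."""
    return sum(
        alpha * scaled_gaussian_kernel(vector, x, nu, sigmas)
        for vector, alpha in zip(train_vectors, alphas)
    )


if __name__ == "__main__":
    # Hypothetical two-variable example; a large sigma makes a variable matter less.
    training = [[0.2, 0.9], [0.7, 0.1]]
    weights = [120.0, 95.0]
    print(round(gp_predict([0.5, 0.5], training, weights, nu=1.5, sigmas=[0.3, 2.0]), 3))
```

Because a large σ_b flattens the kernel along variable b, the fitted σ_b values give a rough indication of how relevant each predictor is.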

Model Assessment
Models generated from the machine learning techniques were validated using the field survey data. A total of 29 sample plots (20% of plots) were randomly selected to assess the performance of these models (as described in Section 2.2.4), of which the forest AGB ranged from 89.74 to 382.15 Mg·ha⁻¹.
We used the root-mean-square error (RMSE), the mean absolute error (MAE), and the coefficient of determination (R²) to compare the performance of the selected machine learning techniques for the forest AGB estimation. These statistical criteria are widely employed in modeling forest biomass to assess the difference between the observed data and the predicted forest AGB data [82,83]. RMSE (Equation (5)) is considered a standard metric for measuring errors of regression models. However, it is significantly influenced by large values and outliers [84]. Therefore, MAE (Equation (6)) is used together with RMSE for evaluating the variation of the model errors [58]. Lower RMSE and MAE values indicate a better regression model. Furthermore, a smaller difference between RMSE and MAE reflects a smaller variance among the errors. R² is estimated using Equation (7); higher R² values also indicate a better model [68,71,85].
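$$ \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^{2}} \qquad (5) $$
$$ \mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right| \qquad (6) $$
$$ R^{2} = 1 - \frac{\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^{2}}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^{2}} \qquad (7) $$
where $\hat{y}_i$ and $y_i$ are the predicted and observed biomass for the ith plot, respectively; n is the total number of validation plots; and $\bar{y}$ is the mean of the observed biomass values.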
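The three criteria can be computed with a few lines of Python. The sketch assumes the usual definitions: RMSE as the square root of the mean squared error, MAE as the mean absolute error, and R² as one minus the ratio of the residual sum of squares to the total sum of squares about the observed mean. The AGB values in the usage example are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error."""
    n = len(observed)
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / n


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_observed = sum(observed) / len(observed)
    residual_ss = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    total_ss = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - residual_ss / total_ss


if __name__ == "__main__":
    # Hypothetical observed and predicted AGB values in Mg per hectare.
    observed_agb = [100.0, 200.0, 300.0]
    predicted_agb = [110.0, 190.0, 310.0]
    print(rmse(observed_agb, predicted_agb))
    print(mae(observed_agb, predicted_agb))
    print(round(r_squared(observed_agb, predicted_agb), 3))
```

When RMSE and MAE are equal, as in this toy case, all errors have the same magnitude; a growing gap between the two signals a few large errors.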

Model Training and Validation
The results of the forest AGB estimation using the RF, SVR, MLP Neural Nets, and GP models for the Sentinel-2A dataset are shown in Table 5. The results show that all four models had satisfactory performance on the training data. The highest goodness of fit was found for the RF model (R² = 0.95, RMSE = 20.35, MAE = 3 [...]
Analysis of the R² values for the AGB estimation models using the ALOS-2 PALSAR-2 data in Table 6 indicates that most values ranged from 0.37 to 0.55 for the training data and from 0.13 to 0.23 for the validation data, apart from the RF model for the training data. Overall, the ALOS-2 PALSAR-2 imagery had a very weak correlation with the AGB estimation. Considering the existing reports on the saturation of SAR backscatter data at high levels of AGB, one could expect such weak relationships, especially for high biomass tropical forests. The ALOS-2 PALSAR-2 data alone led to very poor modeling of the forest AGB in the study area.
It is clearly seen from Table 7 that, among the four machine learning models, the SVR model had the highest performance with the Sentinel-ALOS dataset (on both the training and the validation datasets). This shows a strong relationship between the observed and predicted AGB values, where R [...]
Compared to the results from the Sentinel-2A MSI and the ALOS-2 PALSAR-2 images, their combination significantly improved the performance of the SVR model, but the data integration did not improve the model performances in terms of R², RMSE, and MAE values for the other machine learning techniques. Overall, the MLP Neural Nets and the GP models had lower R² values and higher RMSE and MAE values. Therefore, we conclude that the SVR model achieved the best performance for estimating the forest AGB in this study.
As can be seen from Table 7, the SVR model achieved the highest R² value and the lowest RMSE and MAE values among the four machine learning models. The GP model had a lower error rate than the RF and MLP Neural Nets models. This is likely due to the saturation levels at 274.42 Mg·ha⁻¹. The comparison of the behavior of the four machine learning models used in biomass estimation is shown in Figure 3. The results revealed that the AGB predicted by the SVR model is consistent with the observed AGB. However, biomass values below 237.40 Mg·ha⁻¹ are likely overestimated, whereas above 248.0 Mg·ha⁻¹ all models underestimate the AGB and reach their saturation level. The biomass of plots containing dense stands of large trees with high DBH was likely underestimated due to the saturation levels of the ALOS-2 PALSAR-2 and the Sentinel-2A sensors for estimating forest AGB.

Accuracy Assessment of Aboveground Biomass
As can be seen from Table 7, the SVR model achieved the highest R² value and the lowest RMSE and MAE values among the four machine learning models. The GP model had a lower error rate than the RF and MLP Neural Nets models. This is likely due to the saturation levels at 274.42 Mg·ha⁻¹. The comparison of the behavior of the four machine learning models used in biomass estimation is shown in Figure 3. The results revealed that the AGB predicted by the SVR model is consistent with the observed AGB. However, biomass values below 237.40 Mg·ha⁻¹ are likely overestimated, whereas above 248.0 Mg·ha⁻¹ all models underestimate the AGB and reach their saturation level. The biomass of plots containing dense stands of large trees with high DBH was likely underestimated due to the saturation levels of the ALOS-2 PALSAR-2 and the Sentinel-2A sensors for estimating forest AGB.
The relationships between the plot-measured AGB and the spectral bands and vegetation indices derived from the Sentinel-2A data were further analyzed and are presented as scatterplots in Figure 5. RVI had the highest coefficient of determination (R² = 0.519) with the plot-measured AGB, followed by NDVI (R² = 0.507), PCA1 (R² = 0.502), SVI (R² = 0.500), PVI-2 (R² = 0.462), NIR (R² = 0.396), Red band (R² = 0.367), EVI (R² = 0.354), PCA3 (R² = 0.266), SAVI (R² = 0.232), PCA2 (R² = 0.207), Green band (R² = 0.183), and Blue band (R² = 0.153).
[...] (R² = 0.018), VH (R² = 0.002), Alpha (R² = 0.002), HH (R² = 0.001), and Anisotropy (R² = 0.001). These values indicate weak correlations between these variables and the plot-measured AGB. Therefore, using the variables derived from the ALOS-2 PALSAR-2 data alone led to very poor forest AGB estimates in the study area.
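The per-variable R² values reported above can be reproduced in principle with a simple computation. The sketch below assumes that each value is the squared Pearson correlation between one predictor and the plot-measured AGB (equivalently, the R² of a simple linear fit); the function name `r_squared` and the sample values are hypothetical and are not taken from the study.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("R² is undefined for a constant variable")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical plot-level values (illustration only, not study data)
ndvi = [0.61, 0.66, 0.70, 0.74, 0.79, 0.83]
agb = [95.0, 150.0, 120.0, 210.0, 190.0, 260.0]  # Mg·ha⁻¹
print(round(r_squared(ndvi, agb), 3))
```

Applied to each predictor in turn, such a function would yield a ranking comparable in form to the one listed above, although the exact values depend on the plot data.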

The Role of the Predictive Variables
The role of the predictive variables in the AGB estimation was assessed using RF with the wrapper evaluation method [86,87], which is available in the Weka software. The correlation coefficient was used to calculate the merit of the variables. The results are shown in Table 8.

It can be seen that SVI, NDVI, and RVI, derived from the multispectral bands of the Sentinel-2A data, were the most important variables for predicting the forest AGB in the study area. This is likely due to the strong correlations between the forest AGB and SVI, NDVI, and RVI, which is consistent with the scatterplots (Figure 5). This result is in line with recent results reported by Shen et al. [88] and Pargal et al. [89]. Our results suggest that SVI, NDVI, RVI, and PCA1 generated from the Sentinel-2A data play a more important role in the forest AGB estimation than the other vegetation indices in the study area. These results are similar to the findings of Reference [90] and Patel and Majumdar [39]. Table 8 also shows that PCA1 and PCA3, derived from the multispectral bands, were important variables for predicting the forest AGB. A similar finding was reported by Lu et al. [91] and Patel and Majumdar [39]. In contrast, Entropy and Alpha derived from the ALOS-2 PALSAR-2 dataset contributed little to the estimation of the forest AGB in this study.

Discussion
The use of satellite remote sensing data for estimating the forest AGB is still challenging in semitropical and mountainous areas because numerous factors influence the relationship between AGB and remote sensing variables. These factors include topography, soil conditions, and forest structure [20]. Therefore, investigating new remotely sensed data, including their fusions and combinations, for forest AGB estimation is necessary. This study addresses this need by investigating the combination of remote sensing data from the Sentinel-2A and the ALOS-2 PALSAR-2 for estimating the forest AGB. Combining optical images (e.g., Landsat) with SAR images (e.g., JERS-1 SAR) for forest AGB estimation is not new [21]. However, to the best of our knowledge, the combination of the Sentinel-2A and the ALOS-2 PALSAR-2 has seldom been investigated for the estimation of forest AGB. This may be due to the recent launches of the Sentinel-2A (23 June 2015) and the ALOS-2 PALSAR-2 (24 May 2014). In addition, four current state-of-the-art machine learning methods (RF, SVR, MLP Neural Net, and GP) were considered in this research in order to compare them and select the best model. The selection of the prediction method is key and has a significant effect on the estimation accuracy [78]. Furthermore, R², RMSE, and MAE were used to assess the quality of the forest AGB models.
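For reference, the three accuracy measures can be computed as in the following minimal sketch. It assumes that R² is defined as 1 − SS_res/SS_tot; the study may instead have reported the squared correlation between predicted and observed AGB, which can give a different value. The function name `regression_metrics` and the sample values are hypothetical.

```python
import math


def regression_metrics(observed, predicted):
    """Return (R², RMSE, MAE) for paired observed and predicted values."""
    n = len(observed)
    if n != len(predicted) or n == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    residuals = [o - p for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    ss_res = sum(r * r for r in residuals)                 # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)    # total sum of squares
    if ss_tot == 0:
        raise ValueError("R² is undefined when all observations are equal")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae


# Hypothetical AGB values in Mg·ha⁻¹ (illustration only, not study data)
observed = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 310.0]
r2, rmse, mae = regression_metrics(observed, predicted)
print(r2, rmse, mae)
```

Because RMSE squares the residuals before averaging, MAE can never exceed RMSE on the same data; a large gap between the two points to a few large individual errors.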
The results of this study indicate that the combination of the Sentinel-2A data and the ALOS-2 PALSAR-2 data significantly improved the estimation accuracy of the forest AGB. This finding is consistent with the conclusions of previous studies, which pointed out that combining optical images (Landsat TM, SPOT) and SAR images (JERS-1 and ERS-1/2) can improve the accuracy of AGB estimation [23].
Among the four machine learning models, the SVR model provided the highest estimation accuracy, with the highest R², the lowest RMSE, and the lowest MAE. It was followed by the GP model, the RF model, and the MLP Neural Net model. These results are reasonable because the SVR model has proven robust and efficient with small datasets [92,93]. This result is in agreement with the findings reported in References [94][95][96][97], which concluded that SVR consistently outperformed other machine learning methods. The performance of the SVR model with the combined Sentinel-2A and ALOS-2 PALSAR-2 data (R² of 0.73 and RMSE of 38.68 Mg·ha⁻¹, Table 7) is satisfactory compared to previous studies on the forest AGB, such as Reference [98] (R² = 0.46), Reference [21] (R² = 0.28-0.44), and Reference [99] (R² = 0.46).
When only the Sentinel-2A dataset is used, all of the models show moderate prediction performance (R² = 0.58-0.70), with the SVR model having the highest prediction accuracy (R² = 0.68, RMSE = 42.04 Mg·ha⁻¹, and MAE = 36.53 Mg·ha⁻¹). This is due to the high spatial resolution of the Sentinel-2A images (10 m) and to the significant correlation of the Sentinel-2A data with the AGB of the study area. In contrast, all four models showed very low performance (R² = 0.12-0.23) with the ALOS-2 PALSAR-2 data, which indicates that the ALOS-2 PALSAR-2 data have only a weak correlation with the AGB of the study area. This finding differs from those reported by Ghasemi, Sahebi and Mohammadzadeh [14] (R² = 0.61) and Attarchi and Gloaguen [12] (R² = 0.45). However, they used ALOS PALSAR data. The difference may be due to the wavelength transformations used for preprocessing the ALOS PALSAR images in Reference [90] and to the ALOS PALSAR textures used in Reference [12], both of which improved the performance of their models.
We found that SVI, NDVI, and RVI generated from the Sentinel-2A data played an important role in the AGB estimation of the study area. In addition, PCA1 was more important in the AGB estimation than PVI-2. A further in-depth study is highly recommended to improve the understanding of the significance of image transformations, such as vegetation indices derived from other spectral bands of the Sentinel-2A data, in AGB estimation.
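As an illustration of how two of these indices are obtained from band reflectances, the sketch below assumes the common definitions NDVI = (NIR − Red)/(NIR + Red) and RVI = NIR/Red, with NIR and Red taken from Sentinel-2 bands B8 and B4; the definitions actually used in this study are those listed in Table 2 and may differ. The function names and reflectance values are hypothetical.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    if nir + red == 0:
        raise ValueError("NDVI is undefined when NIR + Red is zero")
    return (nir - red) / (nir + red)


def rvi(nir, red):
    """Ratio vegetation index (simple NIR/Red ratio)."""
    if red == 0:
        raise ValueError("RVI is undefined when Red is zero")
    return nir / red


# Hypothetical surface reflectances for one forest pixel (illustration only)
nir_b8 = 0.40
red_b4 = 0.10
print(ndvi(nir_b8, red_b4), rvi(nir_b8, red_b4))
```

Under these definitions the two indices are monotonically related, RVI = (1 + NDVI)/(1 − NDVI), which may partly explain why they show similar R² values with the plot-measured AGB.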
One of the critical problems in estimating the forest AGB is data saturation, which often causes low accuracy when estimating the AGB in areas with high biomass or high canopy density. It should be noted that a short wavelength such as the C-band (6 cm) often saturates at 10 kg/m², while long wavelengths such as the L-band often saturate at 100 Mg·ha⁻¹ in complex and mixed forest structures. This saturation level rises to roughly 250 Mg·ha⁻¹ for simple forests with a few dominant species [100]. The saturation level of the ALOS-2 PALSAR-2 sensor for AGB estimation, reported in our previous studies, is around 100-150 Mg·ha⁻¹ [71]. The data saturation problem can be partly solved through the use of multi-source data or data fusion [101,102]. The results of this study indicate that the integration of the Sentinel-2A data and the ALOS-2 PALSAR-2 data reduced the saturation problem. Consequently, the SVR model developed in this study is able to estimate AGB exceeding 237.40 Mg·ha⁻¹.
The main limitation of the current work is the lack of standard allometric equations for estimating the forest AGB. In fact, a large variation in stand-level AGB estimation has been observed [103]. Thus, in this work, the amount of AGB in the sample plots was calculated using only the reported specific gravity for each species. In addition, only four machine learning methods, namely RF, SVR, MLP Neural Net, and GP, were investigated. Therefore, the accuracy of the forest AGB estimation may be enhanced if other, newer machine learning algorithms are used. Furthermore, for the SVR model, the best regularization parameter (C) and kernel width (γ) were determined by the grid search method, which does not guarantee that the optimal values were found. Thus, new meta-heuristic optimization techniques should be considered in future work to improve the performance of the forest AGB model.
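The limitation of grid search noted above can be illustrated with a minimal sketch. It performs an exhaustive search over candidate (C, γ) pairs and returns the pair with the lowest score. The error function `toy_validation_error` is a hypothetical stand-in for the cross-validated error of an SVR model and was chosen so that its true minimum (C = 30, γ = 0.03) lies between grid points; the grids themselves are also assumptions, not the values used in this study.

```python
import itertools
import math


def grid_search(score, c_values, gamma_values):
    """Evaluate score(C, gamma) on every grid point and return the best one.

    Returns a tuple (best_score, best_c, best_gamma), where lower is better.
    """
    best = None
    for c, gamma in itertools.product(c_values, gamma_values):
        s = score(c, gamma)
        if best is None or s < best[0]:
            best = (s, c, gamma)
    return best


def toy_validation_error(c, gamma):
    # Hypothetical smooth error surface in log10 space (not a real SVR error);
    # its minimum, with value 0, is at C = 30 and gamma = 0.03.
    dc = math.log10(c) - math.log10(30.0)
    dg = math.log10(gamma) - math.log10(0.03)
    return dc * dc + dg * dg


c_grid = [1.0, 10.0, 100.0, 1000.0]
gamma_grid = [0.001, 0.01, 0.1, 1.0]
best_error, best_c, best_gamma = grid_search(toy_validation_error, c_grid, gamma_grid)
print(best_c, best_gamma, best_error)
```

The search can only return values that are on the grid, so the selected pair is the best candidate rather than the true optimum. A finer grid or a continuous optimizer narrows this gap at a higher computational cost.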
Despite the limitations, the results in this study indicate that combining the Sentinel-2A data, the ALOS-2 PALSAR-2 data, and machine learning is effective for forest biomass modeling that supports the REDD+ mechanism in developing countries.

Conclusions
This study investigated the potential of integrating the Sentinel-2A and the ALOS-2 PALSAR-2 data for estimating the forest AGB of the Hyrcanian forest (Iran). In addition, four state-of-the-art machine learning methods, namely RF, SVR, MLP Neural Net, and GP, were investigated and compared. Based on the findings of this research, the following conclusions are drawn:
- The integration of the ALOS-2 PALSAR-2 and the Sentinel-2A data can improve the estimation accuracy of the forest AGB.
- SVR is capable of delivering the highest prediction accuracy of the forest AGB compared to RF, MLP Neural Net, and GP.
- The Sentinel-2A data could be used to estimate the forest AGB with moderate accuracy, while the ALOS-2 PALSAR-2 data alone are not sufficient for estimating the forest AGB.
- The results of the current work may support provincial decision-making on sustainable forest monitoring and management.

Figure 1. Location of the study area.

For the SVR model, R², RMSE, and MAE were 0.86, 38.66 Mg·ha⁻¹, and 20.19 Mg·ha⁻¹ on the training dataset and 0.73, 38.68 Mg·ha⁻¹, and 32.28 Mg·ha⁻¹ on the validation dataset, respectively. In addition, the MAE values were clearly lower than the RMSE values, which indicates that the model has a low variance of the individual errors in the two datasets. In contrast to the SVR model, the MLP Neural Nets model had the lowest performance: R², RMSE, and MAE were 0.87, 34.98 Mg·ha⁻¹, and 25.71 Mg·ha⁻¹ on the training dataset and 0.44, 64.33 Mg·ha⁻¹, and 53.74 Mg·ha⁻¹ on the validation dataset, respectively.

Figure 2 shows scatterplots of predicted versus observed AGB, which illustrate the accuracy of the forest AGB estimated by the different machine learning techniques at the study site. As can be seen, the highest saturation level was observed for the SVR model, exceeding 237.40 Mg·ha⁻¹. Although the MLP Neural Nets model showed a higher value, it should be noted that this model behaves randomly in the biomass estimation.

Figure 2. Comparison between the measured AGB in the Sentinel-ALOS validation dataset and the predicted forest AGB using: (a) the RF model; (b) the SVR model; (c) the MLP Neural Nets model; and (d) the GP model.

Figure 3. Comparison of the predicted AGB and the measured AGB values.

Figure 4. The AGB generated from the SVR model in the study area.

Table 1. Acquired satellite remote sensing data.

Table 2. List of vegetation indices used.

Table 3. Characteristics of the forest aboveground biomass (AGB) in the study area.

Table 4. Description of the three datasets used in the forest AGB estimation in the study area.

Table 5. Training and validation results of the AGB model using the Sentinel-2A dataset.

Table 6. Training and validation results of the AGB model using the ALOS-2 PALSAR-2 dataset.

Table 7. Training and validation results of the AGB model using the combination of the Sentinel-2A dataset and the ALOS-2 PALSAR-2 dataset (Sentinel-ALOS).

Table 8. The importance of the variables for the AGB estimation in this study measured using RF with the wrapper evaluation method.