Modeling of Aboveground Biomass with Landsat 8 OLI and Machine Learning in Temperate Forests

: An accurate estimation of forests’ aboveground biomass (AGB) is required because of its relevance to the carbon cycle, and because of its economic and ecological importance. The selection of appropriate variables from satellite information and physical variables is important for precise AGB prediction mapping. Because of the complex relationships for AGB prediction, non-parametric machine-learning techniques represent potentially useful techniques for AGB estimation, but their use and comparison in forest remote-sensing applications is still relatively limited. The objective of the present study was to evaluate the performance of automatic learning techniques, support vector regression (SVR) and random forest (RF), to predict the observed AGB (from 318 permanent sampling plots) from the Landsat 8 Landsat 8 Operational Land Imager (OLI) sensor, spectral indexes, texture indexes and physical variables the Sierra Madre Occidental in Mexico. The result showed that the best SVR model explained 80% of the total variance (root mean square error (RMSE) = 8.20 Mg ha − 1 ). The variables that best predicted AGB, in order of importance, were the bands that belong to the region of red and near and middle infrared, and the average temperature. The results show that the SVR technique has a good potential for the estimation of the AGB and that the selection of the model hyperparameters has important implications for optimizing the goodness of ﬁt.


Introduction
Forest ecosystems, which cover 30% of the land surface, play a key role in the global carbon cycle, by mitigating anthropogenic emissions [1].A more accurate estimation of the regional to global distribution of forest aboveground biomass is required to provide the baseline of forest carbon stocks, and to quantify the anthropogenic emissions caused by deforestation and forest degradation [2,3].In addition, the quantification of forest biomass has large economic implications for the supply of goods such as wood, timber, food, fiber and energy [4,5].Forest biomass has also important ecological implications in ecosystem sustainability, including soil and water management [6].Moreover, forest biomass and its change also influence other ecosystem services, such as biodiversity [7].
Traditionally, forest biomass has been assessed using field-based inventory plots and destructive biomass sampling to establish biomass stocks at the tree and plot level and extrapolate estimates to regions with similar characteristics to the plots evaluated [8,9].Whereas this approach is valuable and can be precise at a local scale, it is also expensive and time consuming, and inherently limited in geographic representativeness [10,11].The natural variation in forest structure and biomass, combined with the enormous geographic extent and rate of forest loss and disturbance, makes field plots difficult to use (alone) for assessing forest biomass and carbon densities [12], particularly over large areas [13][14][15] such as Mexican forests [16].The use of remote-sensing techniques, calibrated and validated with representative field sites of forest biomass monitoring, allows spatially representative maps of forest ecosystems' structure and productivity to be derived, over larger areas, and at lower costs [17,18].
In particular, the availability of a large archive of free available medium-resolution satellite imagery, such as Landsat, has allowed their increased use in many applications, including the forests aboveground biomass (AGB) estimation at local and regional scales [19][20][21].
It is important to effectively employ suitable techniques to extract spectral variables for biomass estimation modeling [22].The potential variables from Landsat Thematic Mapper (TM) images include individual spectral bands, vegetation indices, transformed images, textural images, and fractional images [23].Although many vegetation indices have been proposed [24], depending on the complexity of forest stand structure, the spectral bands, vegetation indices and texture variables can vary in their relationships with biomass [22].
In addition, environmental conditions can have a large role in modulating spatial and temporal patterns of forest AGB [19,25,26].Consequently, several studies have demonstrated the utility of considering climatic variables, such as temperature or precipitation, in improving the predictions of aboveground biomass from satellite images, particularly over large areas, where climate can largely influence forest species distribution, structure and growth [1,19,27].
Furthermore, topographic variables represent a local heterogeneous factor that influence on forest diversity, distribution and development, directly influencing forest productivity and structure [28].The combination of spectral and topographic variables, such as elevation [12,29], slope and aspect [22,30], or topographic wetness indices [23] is, therefore, potentially useful for improving the mapping of forest productivity at a landscape scale.
Given the diversity of environmental, topographical and biophysical conditions in forest ecosystems in different locations, there is no universal, transferable technique for estimating biomass from remote sensors [6,19,31].The best models to map biomass are reliant on the characteristics of the particular forest or landscape [32], resulting in a need to identify the most useful variables for biomass mapping at each area of study.
In addition to the selection of suitable spectral and environmental variables, the use of proper algorithms for establishing biomass estimation models is also critical [22].Techniques for biomass estimation can be grouped into two broad categories: parametric and non-parametric algorithms [26].Parametric algorithms assume that the relationships between dependent (i.e., biomass) and independent (spectral and environmental) variables have explicit model structures that can be specified a priori by parameters [22].Examples are simple or multiple linear regression models [32,33].Linear regression analysis is probably the most frequently used approach for predicting forest biomass and other forest attributes [34] as shown in the reviews by Mohd et al. [26], Brosofske et al. [32] and Zhang and Ni-Meister [33].
Frequently, the relationships between biomass and remote sensing variables are often too complex to be captured by parametric algorithms.Non-parametric algorithms do not explicitly predefine the model structure, and instead, determine the model structure in a data-driven manner [22].One of the greatest advantages of non-parametric algorithms is that they are free from assumptions regarding the probability distribution and correlation of the input data [22,33].Unlike both simple linear models and multiple regression models, this approach can handle a large number of variables from satellite and ancillary data and effectively solves complex non-linear relationships between the response and predictor variables [33].As complex ecological systems like forests frequently show non-linear relationships, autocorrelation and variable interaction across temporal and spatial scales, non-parametric algorithms often outperform parametric ones [19,35].
The most utilized machine-learning non-parametric methods include nearest neighbor approaches (NN) [36,37], support vector machine (SVM) [38,39], among others.Whereas most of these techniques have shown improvements compared to parametric methods, one disadvantage of some of these methods (such as neuronal networks), is that they are black boxes algorithms that do not offer information of the model selected variables and structure [32].In contrast, methods such as random forest or SVM allow an understanding and interpretation of the selected variables and the structure of the models developed [40][41][42].
SVM techniques [43] are increasingly being applied in remote sensing applications, including biomass prediction [44].In SVM, the support vector regression (SVR) transforms the input data into a high-dimensional feature space using a nonlinear kernel function to minimize training error and the complexity of the model [45].SVM employs the principle of structural risk minimization to simultaneously optimize performance and generalization to effectively alleviate the overfitting problem [46].The method has shown a good intrinsic generalization ability and robustness to noise using limited training sample data to produce relatively higher classification or estimation accuracy than other approaches [22].The key to this approach is identifying suitable parameters: the precision of an SVM model depends on the defined tolerance margin and on a cost function, which denotes the balance between the resulting model and the tolerated deviations [47].In spite of the importance of model optimization for machine-learning techniques, studies that have considered parameter optimization for biomass prediction are relatively scarce.
Another non-parametric technique, RF, randomly selects independent data samples to establish relationship in an advanced hyperplane by constructing numerous small regression trees [41].Only a small number of randomly selected predictor variables are used to find the best split at each node, decreasing the correlation between trees and reducing bias, which brings a higher stability to the final model [32,41,42].This method can deal with large datasets with a large number of variables, it can readily accommodate nonlinear responses, variable interactions, both continuous and categorical explanatory variables, and is relatively unaffected by outliers or multicollinearity [22].
Studies comparing the performance of various machine-learning techniques are relatively scarce [19,44,48].From these studies, we know there is little consensus on which statistical method or set of predictor variables is the most robust across a range of forest conditions [22].Depending on the study, the best results have been obtained either with neuronal networks [49,50], RF [19,43,48] or SVM [48,51].There is a need to better understand how the choice of model type affects predictions of biomass [19].Furthermore, the selection of the best prediction technique might be site specific depending on the characteristics of the stands analyzed and the goals of the study [32].
The forests of the Sierra Madre Occidental (SMO), NW Mexico, provide a good opportunity to compare methodologies for predicting and mapping biomass from spectral, textural, topographic and climatic variables, because of their variation in complex and diverse forest structure and their diverse physiographic and climatic conditions [16,21,23,35].Furthermore, the SMO is also important because of the presence of some of the most important commercial species of pine and oak in Mexican ecosystems [23].In particular, the state of Durango generates between 25% and 30% of national timber production, producing a total of 1.5 million m 3 of roundwood per year, and boasts forest reserves that are important sources of environmental services [16].
The current study aimed at evaluating the performance of machine-learning techniques RF and SVM to predict and map the observed AGB from 318 permanent sampling plots for the SMO forests in the state of Durango, from Landsat 8 Landsat 8 Operational Land Imager bands, vegetation indices, texture indices, topographic and climatic variables.

Study Area
The study area was located on the temperate forests of the Durango State on the western Sierra Madre, NW Mexico (Figure 1).These forests have rich biodiversity and include at least 27 coniferous tree species (of which 20 are Pinus species) and 43 species of Quercus; the predominant forest stands comprise pines and oaks, often mixed with Arbutus and Juniperus, among other tree species.The forest structure is irregular, both referring to the spatial arrangement of trees (vertical and horizontal irregularity) and to the variation in the age structure of trees and stands [52].Elevation in the SMO ranges from 1364 to 3020 m above sea level.The annual average precipitation ranges between 443 and 1452 mm and the annual average temperature between 8 and 26

Study Area
The study area was located on the temperate forests of the Durango State on the western Sierra Madre, NW Mexico (Figure 1).These forests have rich biodiversity and include at least 27 coniferous tree species (of which 20 are Pinus species) and 43 species of Quercus; the predominant forest stands comprise pines and oaks, often mixed with Arbutus and Juniperus, among other tree species.The forest structure is irregular, both referring to the spatial arrangement of trees (vertical and horizontal irregularity) and to the variation in the age structure of trees and stands [52].Elevation in the SMO ranges from 1364 to 3020 m above sea level.The annual average precipitation ranges between 443 and 1452 mm and the annual average temperature between 8 and 26 °C.

Field Data
A total of 318 permanent forest inventory plots were sampled during winter of 2017.Each permanent sampling plot, of 50 × 50 m, was established following the methodology described by Corral-Rivas et al. [53].At every permanent forest inventory plot, Diameter at breast height (dbh) (cm) and total height (h) (m) was measured for every tree of >7 cm dbh.Tree volume was obtained from the equations of Corral et al. [54].Aboveground biomass was estimated utilizing the species specific allometric models of Vargas-Larreta et al. [55].The goodness of fit of those allometric models ranged from 0.87-0.99(R 2 ), and 22.8-95.2kg (root mean square error, RMSE).The descriptive statistics of the main forest inventory variables of the sites of study are summarized in Table 1.The plots covered a gradient from very open stands in the more arid areas in the Eastern part of the area of study to relatively denser stands in the colder and wetter Western part of the SMO.

Field Data
A total of 318 permanent forest inventory plots were sampled during winter of 2017.Each permanent sampling plot, of 50 × 50 m, was established following the methodology described by Corral-Rivas et al. [53].At every permanent forest inventory plot, Diameter at breast height (dbh) (cm) and total height (h) (m) was measured for every tree of >7 cm dbh.Tree volume was obtained from the equations of Corral et al. [54].Aboveground biomass was estimated utilizing the species specific allometric models of Vargas-Larreta et al. [55].The goodness of fit of those allometric models ranged from 0.87-0.99(R 2 ), and 22.8-95.2kg (root mean square error, RMSE).The descriptive statistics of the main forest inventory variables of the sites of study are summarized in Table 1.The plots covered a gradient from very open stands in the more arid areas in the Eastern part of the area of study to relatively denser stands in the colder and wetter Western part of the SMO.A radiometric correction was performed to the images, with the goal of minimizing the effect of dispersion caused by the atmosphere on the radiances perceived by the satellite.The correction consists of converting the digital levels (DLs) to radiance values and then to apparent reflectance values.The word "apparent" refers to the fact that the reflectance has not been corrected for atmospheric effects and represents an initial normalization of the image [56].The radiometric correction was carried out with the "apparent reflectance" method implemented in the "Landsat" package in the R software [57].

Vegetation Indices
After performing the radiometric correction, vegetation indices were generated as potential predictors for the AGB models, given their capacity of discriminating vegetation and soil.The normalized difference vegetation index (NDVI) and the soil-adjusted vegetation index (SAVI) were calculated following Equations ( 1) and (2).Both indices (widely utilized for prediction of forest biomass and other dasometric variables) utilize the maximum absorption of the red (R) and near infrared (NIR) wavelengths, which allow to evaluate the chlorophyll and leaf structure [58].SAVI index includes also a correction factor for soil brightness to the near infrared band (L) [59]. where: NIR: Near Infrared.R: Red.L: Is the soil brightness correction factor and its value is 0.5.

Texture Indices
The texture variable extraction was performed with the grey level co-occurrence matrix (GLCM), which is a method that has showed good results in the estimation of forest variables from spectral data [60].The metrics of second order (co-occurrence) GLCM (Table 3), associated to three different kernels (3 × 3, 5 × 5, 7 × 7), were calculated from NDVI with the software PCI Geomatics [61].

Texture Variables Formula
Mean ME = Ng i, j=1 i * P(i, j) (i − u) 2 P(i, j) Correlation CC = i, j (ij)P(i,j)−µ x µ y σ x σ y where: P (i,j): (i,j) the entry in a normalized gray-tone spatial-dependence matrix; Ng: Number of distinct gray levels in the quantized.

Topographic and Climatic Variables
The digital elevation model from INEGI (Instituto Nacional de Estadística y Geografía) [62], with a resolution of 15 m, was utilized.
A low pass filter in kernels of 5 × 5 was applied to remove the high frequency noise that would result in irregular artifacts in the topographic variables and indices [63,64].
Climatic variables were also considered because of their potential influence together with topography on the development and growth of forest biomass.The National Automated Agroclimatic Stations Network [65] was utilized as sources of climatic data.The climatic data were modeled utilizing kriging.Variables considered for analysis where those where the kriging modeling had fits with more than 90% statistical goodness of fit.The topographic and climatic variables considered for prediction of aboveground biomass are shown in Table 4.

Physical Variable Equation Reference
Topographic where: p and q, are the components of the gradient vector of slope; Z, elevation; R, point radio altitude units; A s , drainage area specified; tan B, local slope angle; D, F, G and H were derived according to the equation of [69]; θ, Aspect in degrees east of north; G, growt response; b o , constant term or sum of the predictor effects in the regression and B s and b 3 are coefficients according to the equation of [72]; a, azimuth in degress from north; s, slope in percent/100.

Support Vector Regression (SVR)
SVR is a machine-learning algorithm developed by Vapnik-Chervonenkis [44].It is a robust machine-learning technique, mainly because of its high learning speed and the use of kernels that project data to a high-dimension new space termed the hyperplane [73].The SVR regression model maps training data in a hyperplane regulated by a flexible boundary line that adjusts the prediction function.Under this principle it penalizes the vectors outside the margin of the boundary line (epsilon) [74].
The SVR model was implemented through the package e1071, in the statistical software R [75].An initial evaluation was performed with a model with the default parameters from SVR.A sensitivity analysis of the cost and epsilon metaparameters was performed to optimize the model performance.A total of 88 models were evaluated in this optimization.
The independent variables considered to predict the inventory-derived AGB were the spectral bands (Table 2), the vegetation and texture indices (Table 3) and the topographic and environmental variables listed in Table 4.

Random Forest (RF)
The package "randomForest" from the statistical software R [76] was utilized to construct the RF model.Candidate predictive variables were the same that considered for SVR.
The training of the RF model consisted on optimizing the parameters ntree (number of trees grown) and mtry (number of predictors sampled for spliting at each node, to minimize the influence of a very strong predictor against the other variables) [77].The parameters mtry and ntree were optimized by evaluating their effect on the mean quadratic error.Evaluated values for Ntree were 500, 1000, 1500 and 2000, and mtry values tested were m = 3, m = 7 (m = √ P), m = 15 (m = P/3) y m = 44 (m = P).Where: P = number of independent variables.SVR and RF models were evaluated through the following goodness of fit coefficients: coefficient of determination (R 2 ) and RMSE.

Results
For SVM models, the highest model accuracy was obtained with values of cost = 4 and epsilon = 0.5, resulting in R 2 value of 0.80 and an RMSE of 8.2 Mg ha −1 .In contrast, for the default model with values of cost = 1 and epsilon = 0.1, the RMSE and R 2 were approximately 10 Mg ha −1 and 0.68, respectively.The optimum number of support vectors were obtained from the optimized parameterization of the SVR model.The number of support vectors decreased from 273 to 170 from the model with default parameters to the optimized model.
The residuals of the default parameter and optimized parameter model are shown in Figure 2. It can be seen that none of the models had considerable limitations due to heteroscedasticity; nevertheless, the variance was more constant against the variable for the optimum parameterization model.In addition, the presence of outliers was higher in the default parameter model, with a lower prediction errors for the optimal parameterization model (Figure 2).
The evaluation of the default and optimized parameterization model allowed the predictive variables with the highest contribution to the SVR model to be identified.The variables with the highest importance both models are shown in Figure 3. Bands 7 and 4 (SWIR and NIR) were the most important variables.Average temperature (Tmed) and elevation (ELV) were selected both in the default and optimized parameterization models.
For the RF model, the optimum model accuracy was found with the following parameters: ntree= 500; mtry = 7.The default parameter model considering the 44 predictive variables, with ntree = 500 and mtry = 7, resulted in RMSE and R 2 of 13.62 Mg ha −1 and 0.34, respectively.Figure 4 shows the variables with highest importance.The most important spectral variables were bands 7, 6 and 2; the physical variables with highest importance were maximum temperature (TM) and mean temperature (Tmed).In the process of RF optimization, a cross validation of 10 repetitions was conducted on the highest importance predictors, with the mtree and ntree obtained for model 2. The resulting reduced model with the highest importance variables shown in Figure 4 resulted in slightly improved RMSE (13.05 Mg ha −1 ) and R 2 (0.40).The evaluation of the default and optimized parameterization model allowed the predictive variables with the highest contribution to the SVR model to be identified.The variables with the highest importance both models are shown in Figure 3. Bands 7 and 4 (SWIR and NIR) were the most important variables.Average temperature (Tmed) and elevation (ELV) were selected both in the default and optimized parameterization models.For the RF model, the optimum model accuracy was found with the following parameters: ntree= 500; mtry = 7.The default parameter model considering the 44 predictive variables, with ntree = 500 and mtry = 7, resulted in RMSE and R 2 of 13.62 Mg ha −1 and 0.34, respectively.Figure 4 shows the variables with highest importance.The most important spectral variables were bands 7, 6 and 2; the physical variables with highest importance were maximum temperature (TM) and mean temperature (Tmed).In the process of RF optimization, a cross validation of 10 repetitions was conducted on the  The analysis of residuals and predicted values against observed values is shown in Figure 5 for both the default parameter and optimized parameterization models.No important problem of variance heterogeneity could be observed for any of the models.The residual histogram was mainly symmetrical, with no problems regarding residuals normality, with the exception of a slight tendency to subestimation of the highest values.The analysis of residuals and predicted values against observed values is shown in Figure 5 for both the default parameter and optimized parameterization models.No important problem of variance heterogeneity could be observed for any of the models.The residual histogram was mainly symmetrical, with no problems regarding residuals normality, with the exception of a slight tendency to subestimation of the highest values.The analysis of residuals and predicted values against observed values is shown in Figure 5 for both the default parameter and optimized parameterization models.No important problem of variance heterogeneity could be observed for any of the models.The residual histogram was mainly symmetrical, with no problems regarding residuals normality, with the exception of a slight tendency to subestimation of the highest values.The AGB map for the state of Durango with the model of the best fit, SVR, is shown in Figure 6.The area with the highest biomass concentration (reaching values of 55 to 168 Mg ha −1 ) was located in the southeast of the area of study.This more dense forests corresponds to the areas with higher elevation, lower temperatures and higher precipitation.Lower biomass are observed in the more open forests at the northeast of the SMO, located in a more arid climate in transition to shrubby desert vegetation in the east of the state.

Residuals (Mg ha
The AGB map for the state of Durango with the model of the best fit, SVR, is shown in Figure 6.The area with the highest biomass concentration (reaching values of 55 to 168 Mg ha −1 ) was located in the southeast of the area of study.This more dense forests corresponds to the areas with higher elevation, lower temperatures and higher precipitation.Lower biomass are observed in the more open forests at the northeast of the SMO, located in a more arid climate in transition to shrubby desert vegetation in the east of the state.

Discussion
The study corroborates the potential of machine-learning methods for AGB prediction and mapping as successfully utilized in recent studies (e.g., [49,51,[77][78][79][80][81][82][83]).In our study, the performance of SVR was better than RF.The results of the optimized SVR model in the current study (R 2 = 0.80, RMSE = 8.20 Mg ha −1 ), are higher than previous results with in the SMO with non-parametric algorithms such as regression trees [18], MP5 model trees [84] and MARS techniques [23], with R 2 in the range 0.5-0.7 and RMSE values of 20-38 Mg ha −1 (20%-40% of the average value).They also improve the previous results with SVR in the area of study with a smaller dataset of 99 plots [21],

Discussion
The study corroborates the potential of machine-learning methods for AGB prediction and mapping as successfully utilized in recent studies (e.g., [49,51,[78][79][80][81][82][83][84]).In our study, the performance of SVR was better than RF.The results of the optimized SVR model in the current study (R 2 = 0.80, RMSE = 8.20 Mg ha −1 ), are higher than previous results with in the SMO with non-parametric algorithms such as regression trees [16], MP5 model trees [15] and MARS techniques [23], with R 2 in the range 0.5-0.7 and RMSE values of 20-38 Mg ha −1 (20-40% of the average value).They also improve the previous results with SVR in the area of study with a smaller dataset of 99 plots [21], with R 2 of 0.62 and RMSE of 27 Mg ha −1 , obtaining similar results for the case of RF (R 2 of 0.48 and RMSE of 31 Mgha −1 ).These goodness of fit results are also similar to those reported in the literature for AGB estimation, such as the average RMSE of 41.2 Mg ha −1 (SD = 29.3Mg ha −1 ) reported in the review by [25], or the values of RMSE of 30-60% of the average value reported in the machine-learning studies of [19,49,82], or [83].
The better performance of SVR observed in the current study confirms the results of the previous study by [21] of a machine-learning techniques comparison in the area of study, where SVR outperformed RF and an optimal SVR parameterization was applied.It also agrees with the observations of other studies, such as the study of [82] where SVR outperformed RF, the study of [83], where SVM was the best of seven techniques including random forest and boosted regression trees, or the results from [51] or [85], where SVR was the best of 17 and 19 machine-learning techniques compared for AGB prediction, respectively.On the other hand, other studies have found RF to provide the best predictions, such as the studies by [80,81] or [86], where RF exhibited the best performance compared to neuronal networks and SVR, among other techniques.These results seems to confirm that the selection of the best method seems to be specific to the characteristics of the dataset of the area of study.
Regarding the variables' selected in the models, the selection of band 7 (SWIR-2) as the most important variable to predict AGB from both the default parameter and optimal parameterized models, both using RF and SVR, agrees with results from previous studies where this variable has also been selected as the most important in predicting AGB [21,23] and total volume [16] in the SMO.Given the complexity of forest structure in the SMO, its inclusion seems to support the observations from several studies that vegetation indices including shortwave infrared wavelength have stronger relationships with biomass for forest sites with complex stand structures [22,87,88].
Additional bands selected by the non-parametric models, in the red and NIR domain (bands 4 and 5, respectively) and their normalized difference (NDVI) have also been found in previous studies to influence AGB [21][22][23] and several dasometric variables, including tree density and crown diameter [16] in the SMO and elsewhere [89] because of the well documented high NIR and low red reflectance in green vegetation [24,33].
In spite of the inclusion in the predictive models of the NIR band, which has been found to have a higher penetrating capacity through forested canopies [90], some subestimation of the highest AGB was observed, particularly for the RF models.The lower accuracies observed for the denser, higher biomass stands agrees with studies that have found higher accuracies in more open stands compared to denser canopy conditions (e.g., [19]).This saturation has been commonly observed in studies with Landsat imagery in areas of high biomass, generally at AGB higher than 100 Mg ha −1 [6,22].Further studies with hyperspectral sensors or in combination with light detection and ranging (LIDAR) might help improve the predictions in the areas of highest biomass.
The inclusion of textural variables both in the SVR and RF models may partially compensate some of the band saturation problems [91].The selection of textural variables by both machine-learning techniques agrees with previous studies where textural variables have contributed to improve AGB predictions in Mexico [21,23] and elsewhere [31,[91][92][93].The texture values in an image allow the spatial heterogeneity in the different types of vegetation to be evaluated, which show distinct phenological patterns related to carbon exchange, vegetative greenness, structure and development stage [91,92].This allows to evaluate the heterogeneity between stands through statistical models that detect a similarity within a group of contiguous pixels [93].Because of the relative complexity in the structure of the SMO forests [52], the role of textural variables in improving AGB estimations, agrees with studies that suggest a greater role of textural variables in ecosystems with a more complex structure [22].
Elevation and temperature variables were also selected among the variables of highest importance for predicting AGB.This result agrees with previous studies that have found a role of elevation [1,12,19,29] and temperature [1,19,84] in improving remote sensing AGB predictions.For example, Yan and Wang [94] found that temperature and precipitation were the most important natural climate factors affecting the growth of vegetation in semiarid and arid regions.Propastin [95] found that the relationship between AGB and the vegetation indices significantly improved by considering elevation as a spatial weight in GWR.Similarly, Baccini [96] in their global study combining MODIS with climatic and topographic data to predict AGB, found that in areas characterized by a wide range of elevation and climate zones, climate and topography exert important control on the spatial distribution of AGB.The observed gradient from the more open, lower biomass stands in the drier eastern part of the area of study to the higher biomass in the denser stands in the colder, wetter western region of the SMO, which was captured by the models containing topographic and climatic variables, agrees with studies that have documented increasing biomass in conifer-dominated forests with increasing elevation (e.g., [19]).
Terrain features are potentially related to key features for forest stand development, such as overall climate characteristics, insolation, evapotranspiration, run-off, infiltration, wind exposure and site productivity [16,21].Climatic and topographic variables can also affect aboveground biomass through their effects on species composition distribution and the interaction of forest species growth with different site conditions [28,30,72].For example, Gallaun et al. [84] found in Europe that conifers had higher stocks at high altitude while the contrary occurred with broadleaves.Similarly, Powell et al. [19] found that the higher biomass ponderosa pine forests occurred at higher elevations with lower temperatures and higher precipitation.This could be consistent with our region of study, mainly dominated by pine species similar to Pinus ponderosa (e.g., Pinus duranguensis).Nevertheless, given the complexity of the species structure present in the SMO, future studies should explore the consideration of tree species abundance maps that have shown potential to improve AGB prediction in the SMO [15] and elsewhere [97].

Conclusions
The results of the current study demonstrated the usefulness and potential of the combination of data derived from permanent sampling plots with Landsat 8 OLI, terrain and climatic variables, to model and map AGB.For the dataset analyzed, SVM was the best method to predict AGB.With the optimization of hyperparameters (cost and epsilon) and the choice of the best kernel through cross validation, this allowed prediction error to be reduced, providing satisfactory results when dealing with large samples.The predictive variables with greater importance for the modeling of the AGB in both models were Bands 7, 4 and 2, in addition to climatic variables (mean temperature), texture variables (NDVIME3 × 3 and NDVIME7 × 7) and topographic (elevation).Future studies should consider additional vegetation and climatic variables, together with exploring the utilization of hyperspectral sensors and airborne or satellite LIDAR data, for improving AGB estimations in the area of study.Funding: The current study was funded by the Forestry National Commission (CONAFOR in Spanish) project for forest inventory of permanent monitoring sites in productive forest areas of Mexico "Acuerdo específico para remedición de sitios permanentes de monitoreo forestal en paisajes productivos forestales en Mexico".

Figure 1 .
Figure 1.Location of the area of study and of the permanent forestry and soil research sites (SPIFyS in Spanish).Temperate forest distribution source is Instituto Nacional de Estadística y Geografía (INEGI) Land-Use Map Series V.

Figure 1 .
Figure 1.Location of the area of study and of the permanent forestry and soil research sites (SPIFyS in Spanish).Temperate forest distribution source is Instituto Nacional de Estadística y Geografía (INEGI) Land-Use Map Series V.

R 2
observed AGB.ŷi = predicted AGB.y i = average AGB.n = number of observations p = number of model parameters.

Figure 2 .
Figure 2. Predicted against observed aboveground biomass (Mg ha −1 ) and distribution of residuals against predicted biomass of the support vector regression (SVR) model for the default parameterization (left) and for the optimized parameterization (right).

− 1 )Figure 2 .
Figure 2. Predicted against observed aboveground biomass (Mg ha −1 ) and distribution of residuals against predicted biomass of the support vector regression (SVR) model for the default parameterization (left) and for the optimized parameterization (right).

Forests 2020 , 19 .Figure 3 .
Figure 3. Variable importance for the for the default parameterization (left) and for the optimized parameterization (right) SVR models.Grey bars represent variable importance and black bars represent the correlation between the response variable and predictive variables.

Figure 3 .
Figure 3. Variable importance for the for the default parameterization (left) and for the optimized parameterization (right) SVR models.Grey bars represent variable importance and black bars represent the correlation between the response variable and predictive variables.

Figure 4 .
Figure 4. Relative importance of predictive variables in the random forest (RF) model.Grey bars represent variable importance and black bars represent the correlation between the response variable and predictive variables.

Figure 5 .
Figure 5. Prediction of the RF model and distribution of residuals against observed AGB (Mg ha −1 ) for the default parameterization (left) and for the optimized parameterization (right).

Figure 4 .
Figure 4. Relative importance of predictive variables in the random forest (RF) model.Grey bars represent variable importance and black bars represent the correlation between the response variable and predictive variables.

Figure 4 .
Figure 4. Relative importance of predictive variables in the random forest (RF) model.Grey bars represent variable importance and black bars represent the correlation between the response variable and predictive variables.

Figure 5 .
Figure 5. Prediction of the RF model and distribution of residuals against observed AGB (Mg ha −1 ) for the default parameterization (left) and for the optimized parameterization (right).

Figure 5 .
Figure 5. Prediction of the RF model and distribution of residuals against observed AGB (Mg ha −1 ) for the default parameterization (left) and for the optimized parameterization (right).

Figure 6 .
Figure 6.Predicted of aboveground biomass (Mg ha −1 ) in the Durango Sierra Madre Occidental (SMO) generated from the best fit SVR model.

Figure 6 .
Figure 6.Predicted of aboveground biomass (Mg ha −1 ) in the Durango Sierra Madre Occidental (SMO) generated from the best fit SVR model.

Author
Contributions: P.M.L.-S.and J.L.C.D. performed the statistical analysis and J.J.C.-R., D.J.V.-N., E.J., P.M.L.-S.and C.A.L.-S., wrote and reviewed the manuscript.All authors have read and agreed to the published version of the manuscript.

Table 1 .
Descriptive statistics of the main forest inventory variables of the sites of study.Spectral information from the satellite Landsat 8 (OLI) of the United States Geological Service (USGS), was retrieved for the forest area in the Durango State.A total of 7 scenes, with a low cloud percentage, were utilized (path/row: 30/44, 31/42, 31/43, 31/44, 32/41, 32/42 and 32/43).The satellite images corresponded to the months of April and May 2017.The wavelength and spectral resolution of the bands utilized are summarized in Table 2.

Table 2 .
Characteristics of Landsat 8 Operational Land Imager bands.