Prediction of Forest Aboveground Biomass Using Multitemporal Multispectral Remote Sensing Data

: Forest aboveground biomass (AGB) is a prime forest parameter that requires global level estimates to study the global carbon cycle. Light detection and ranging (LiDAR) is the state-of-the-art technology for AGB prediction but it is expensive, and its coverage is restricted to small areas. On the contrary, spaceborne Earth observation data are effective and economical information sources to estimate and monitor AGB at a large scale. In this paper, we present a study on the use of different spaceborne multispectral remote sensing data for the prediction of forest AGB. The objective is to evaluate the effects of temporal, spectral, and spatial capacities of multispectral satellite data for AGB prediction. The study was performed on multispectral data acquired by Sentinel-2, RapidEye, and Dove satellites which are characterized by different spatial resolutions, temporal availability, and number of spectral bands. A systematic process of least absolute shrinkage and selection operator (lasso) variable selection generalized linear modeling, leave-one-out cross-validation, and analysis was accomplished on each satellite dataset for AGB prediction. Results point out that the multitemporal data based AGB models were more effective in prediction than the single-time models. In addition, red-edge and short wave infrared (SWIR) channel dependent variables showed signiﬁcant improvement in the modeling results and contributed to more than 50% of the selected variables. Results also suggest that high spatial resolution plays a smaller role than spectral and temporal information in the prediction of AGB. The overall analysis emphasizes a good potential of spaceborne multispectral data for developing sophisticated methods for AGB prediction especially with speciﬁc spectral channels and temporal information.


Introduction
Forests cover 31% of Earth's surface and play a crucial role in understanding global carbon dynamics that affect Earth's climate. Estimation of the aboveground biomass (AGB) is important as it is directly related to the carbon-dioxide sequestration from the atmosphere. The most typical methods to estimate forest AGB are in-situ measurements made by regional or national forest authorities [1,2]. However, such measurements are used at the local scale, and they are usually costly and inefficient over large forest areas. Ideally, an optimal approach to forest parameter estimation at the regional/global scale is to use Earth observation (EO) data acquired by remote sensing technology. Among remote sensing technologies, airborne LiDAR (light detection and ranging) is the state-of-the-art for modeling AGB [3][4][5][6], and multiple studies have demonstrated its very high accuracy in the prediction of forest parameters [7][8][9][10][11]. However, LiDAR data are expensive, and their full coverage of large forest areas is restricted to only certain countries. The modeling and mapping of AGB at a regional/global scale requires data that are economically viable with huge coverage and availability. In recent years, the amount of spaceborne remote sensing data has massively increased, both in terms of data sources and temporal availability [12]. Consequently, there has been an increased interest in using spaceborne multisensor and multitemporal data to estimate and map forest attributes, as well as to detect changes in forest covers [13].
Among the longest operational missions for satellite imaging of the Earth, the Landsat program has captured millions of images since 1972. Many studies have utilized the Landsat archive as a time-series resource for forest monitoring [14,15]. The more recent European Union's Copernicus program developed Sentinel missions for global environmental monitoring, providing images with increased spatial and temporal resolution. The Sentinel missions encompass Sentinel-1 C-band synthetic aperture radar (SAR) and Sentinel-2 multispectral images. Multiple studies based on SAR data have utilized different wavelengths (P, L, S, C) for the prediction of forest AGB. However, many studies found that long wavelength SAR data (P, L) are comparatively more sensitive to AGB than short wavelength SAR data [16][17][18][19][20]. This led to the development of global biomass products using these studies based on SAR data [17]. Similarly, multispectral data have also been frequently used for mapping forest species, forest fires, carbon stock estimations, etc. [21][22][23][24]. Beyond these large international missions, many other EO satellites have been operated with many different types of sensors, each one characterized by different specifications to capture EO data. In particular, recently, the space industry has been revolutionized by an increased number of small satellite missions (e.g., RapidEye, Dove, Iceye, etc.). The use of small satellite technology can be effective in developing economical solutions for ecological problems. They have onboard sensors with higher spatial resolution and frequent temporal coverage [25,26] compared to conventional satellites, such as Landsat and Sentinel. Recently, some studies have been performed with attempts to develop methods to resolve issues in using the small satellite data [27,28]. The latest findings suggest that small satellite data can be very cost-effective and can assist in developing efficient tools to quantify forest AGB [29,30].
A surge in the availability of multispectral data from these different EO satellites has driven the community to produce studies that compare the performance of multispectral data with other sensors in the domain of forest and ecology [31,32]. Currently, the focus of these studies has shifted from expensive and globally limited datasets (e.g., LiDAR) to cheap and extensively available spaceborne multispectral data for developing methods for reliable prediction of AGB [32][33][34]. The prediction approaches vary with the biome (e.g., tropical, temperate, boreal), the type of species (deciduous, evergreen), or the state of the forest (degraded, non-degraded) [35][36][37][38]. Furthermore, a common limitation in most scenarios while using multispectral data is the saturation problem for the prediction of high AGB values. Most of the existing studies indicate a spectral saturation of multispectral sensors over a certain value of AGB that depends on the sensor specifications [39,40]. It is crucial to overcome these limitations not only to benefit from the fact that spaceborne data are much cheaper but also because they allow the prediction of AGB at a national, continental, and even global scale [41], which is currently impossible with airborne LiDAR acquisitions [42,43]. Moreover, they allow a constant update of maps that cannot be easily achieved with airborne LiDAR data in an economical manner.
When modeling forest biophysical parameters with spaceborne multispectral data, the effect of spectral and temporal information at different spatial resolutions should be carefully evaluated. The diverse spectral channels can be correlated with a range of AGB values. Although, the data saturation problems were widely studied for satellites such as Landsat-8 and MODIS, they have been less studied for data acquired by more recent satellites such as Sentinel-2,3 and PlanetScope small and nano-satellites. A higher spectral information content may help in improving the model inaccuracies due to the spectral saturation [44,45]. An increased number of bands provides better details of the spectral signatures of the forest and also allows the computation of a higher number of specific vegetation indices that can be used to characterize the forest. Therefore, it is important to investigate the capabilities of the bands' reflectances and related vegetation indices Remote Sens. 2021, 13, 1282 3 of 21 of different satellite sensors for an accurate prediction of AGB. The spatial resolution of the multispectral data influences the amount of spatial detail that can be accounted for in forest analysis. In very high spatial resolution data, the image pixels cover small areas with large details and can provide the spectral information for single trees, while medium resolution data provides spectral information for a group of trees. Multiple studies have been carried out to analyze the role of the spatial resolution at various scales on AGB mapping accuracy [46][47][48][49][50]. The temporal resolution depends on the revisit time of a satellite: a high temporal resolution is important to capture cloud-free seasonal variations of the vegetation. In recent years, public and private space agencies have invested heavily in increasing the capacity of platforms in all these aspects. Sentinel-2,3 missions provide multispectral imagery from visible to infrared bands in 13-21 spectral bands, and Planet Labs provide products from RapidEye and Dove missions with a daily revisit capacity [51]. As spectral indices used for AGB modeling vary with time, multitemporal data can be used as an additional source of information. The temporal information can potentially also reduce the problem of data saturation that restricts the prediction of large AGB values [52][53][54][55]. Mapping AGB from space is a difficult task that requires robust prediction models [56]. In the past, many studies have been carried out using both linear and non-linear modeling approaches [57][58][59]. In many cases, it was found that when the prediction models become too complex, they operate with statistically insignificant predictor variables [60][61][62]. In such cases, selecting suitable variables is of prime importance to reduce the number of model parameters and thus to increase the generalization of a model. In the literature, there are multiple methods that can be implemented for modeling AGB [63,64], but a simple and generalized modeling approach is preferable for a justified comparison of different multispectral data.
The objective of this study is to explore the effects of the temporal, spectral, and spatial capacities of multispectral satellite data for the prediction of forest AGB. To reach such objective, we considered multitemporal datasets from Sentinel-2, RapidEye, and Dove satellites acquired with sensors characterized by different spatial resolution (i.e., 10, 5, and 3 m) and different numbers of spectral channels (i.e., 12, 5, and 4 bands) for two study areas with different composition and density of the forest. To evaluate the effect of the temporal information, we modeled AGB using both single-time images acquired at different times (one image per season) and multitemporal data (all season data combined) using the common spectral channels (Red-Green-Blue (RGB) and Near Infra-Red (NIR)) of all satellite data. The spectral information was evaluated by a systematic variable selection mechanism applied to all band reflectances and related vegetation indices as variables using all available spectral channels for each satellite's data. Lastly, the effect of spatial resolution was evaluated by comparing the relationship of key selected variables with field estimated AGB.

Study Area Description
The study area includes two sites, Lavarone and Pellizzano, both located in the Province of Trento, Italy, in the southern Alps. The forest in Lavarone is an uneven aged forest located at an altitude ranging from 1200 to 1600 m with species such as Norway spruce (Picea abies (L.) Karst.) and silver fir (Abies alba Mill.) accounting for 80% of the tree population and European beech (Fagus sylvatica L.), European larch (Larix decidua Mill.), and Scots pine (Pinus sylvestris L.) accounting for the remaining 20% of the population. The forest of Pellizzano is spread over an altitude from 900 to 2200 m and has a dominant presence of Norway spruce and subdominant presence of some coniferous species (e.g., Abies alba Mill., Larix decidua Mill., Pinus cembra L., Pinus sylvestris L., and Pinus nigra J., Arnold) and broadleaves species (Populus tremula L., Betula spp.). The forest is multilayered, more mixed, and structurally complex at lower altitudes and sparse at higher altitudes. The study was conducted in two areas to consider different species and the density of the Arnold) and broadleaves species (Populus tremula L., Betula spp.). The forest is multilayered, more mixed, and structurally complex at lower altitudes and sparse at higher altitudes. The study was conducted in two areas to consider different species and the density of the forest. The geographical locations of both the study areas are shown using detailed maps in Figure 1.

Field Data
The field data for the Lavarone study area were collected on multiple days of summer of 2016 for 41 circular plots of 30 m diameter, and for the Pellizzano study area, were collected on multiple days of summer 2014 for 47 circular plots of 30 m diameter distributed over the entire study area. The plot coordinates were recorded using a survey grade Global Positioning System (GPS) unit. The field plots were distributed randomly across the forest area to be statistically significant. Inside each plot position, the diameter at breast height (DBH) and the height of each tree were measured, along with the species. The DBH threshold was kept at 3 cm (i.e., all trees above 3 cm were measured), and dead trees were included, but their presence was minimal. The AGB of each tree was computed using the allometric equations stated in [65] (also presented in Appendix A) and then aggregated for each plot to obtain the plot's AGB. The AGB of the Lavarone field plots used for the study ranged from 69.42 to 412.34 Mg ha -1 , and the Pellizzano field plot AGB ranged from 0.9 to 545.3 Mg ha -1 . The mean tree height, mean tree DBH, and number of trees per plot for both the study areas are stated in Table 1.

Field Data
The field data for the Lavarone study area were collected on multiple days of summer of 2016 for 41 circular plots of 30 m diameter, and for the Pellizzano study area, were collected on multiple days of summer 2014 for 47 circular plots of 30 m diameter distributed over the entire study area. The plot coordinates were recorded using a survey grade Global Positioning System (GPS) unit. The field plots were distributed randomly across the forest area to be statistically significant. Inside each plot position, the diameter at breast height (DBH) and the height of each tree were measured, along with the species. The DBH threshold was kept at 3 cm (i.e., all trees above 3 cm were measured), and dead trees were included, but their presence was minimal. The AGB of each tree was computed using the allometric equations stated in [65] (also presented in Appendix A) and then aggregated for each plot to obtain the plot's AGB. The AGB of the Lavarone field plots used for the study ranged from 69.42 to 412.34 Mg ha −1 , and the Pellizzano field plot AGB ranged from 0.9 to 545.3 Mg ha −1 . The mean tree height, mean tree DBH, and number of trees per plot for both the study areas are stated in Table 1.

Remote Sensing Datasets
The Sentinel-2 data are acquired by twin satellites under ESA's Copernicus program with a revisit capacity of 5 days. Sentinel-2 images are characterized by 13 spectral bands with four bands at 10 m, six bands at 20 m, and three bands at 60 m spatial resolution. In this study, we used 4 spectral bands at 10 m and 6 spectral bands at 20 m resampled to 10 m in the Blue, Green, Red, Red-edge, NIR, and Short wave infrared (SWIR) region.
The RapidEye data are acquired by a constellation of five satellites, operated by the Planet Labs Inc., with a multispectral push-broom imager capturing five spectral bands: Red, Green, Blue, NIR, and Red-Edge. The RapidEye satellites are identically calibrated so that the images acquired by the five satellites are indistinguishable from each other. Most of the acquired data have a view angle smaller than 10 degrees, and for all images, it is always less than 20 degrees. The ground sampling distance (nadir) is 6.5 m, and the orthorectified pixel size is of 5 m. The constellation of satellites acquires data with a daily revisit capacity (off-nadir) and 5.5 days at nadir.
The Dove data are acquired by a constellation of 120 Nano-satellites operated by Planet Labs Inc., with a daily revisit capacity. The satellites are deployed at the International space station (ISS) orbit at an inclination of 52 degrees and Sun synchronous (SS) orbit with an inclination of 98 degrees. The multispectral data are acquired using a Bayer-masked Charge-Coupled Device (CCD) camera with ground sampling distance (nadir) of 2.7-3.2 m in the ISS orbit and 3.7-4.9 m in the SS orbit. The orthorectified pixel size of the acquired images is 3 m with 4 spectral bands: Red, Green, Blue, and NIR.
The dates of multitemporal acquisition of multispectral data are stated in Table 2. There was a time gap between the field sampling and the used remotely sensed data to keep the temporal acquisitions of different satellite data consistent considering their availability. The field estimated annual increment in the diameter at breast height (DBH) was 2.8% [66]. Therefore, the increment of AGB sampled forest plots during the time gap can be considered as negligible and hence, admissible for the proposed experiments. Table 2. Acquisition dates (YYYY-MM-DD) of the multispectral data used in the study.

Methodology
The methodological flowchart of the study is shown in Figure 2. In the following sections, each step is described in detail.

Pre-Processing
Level-1C (Top-of-Atmosphere reflectances in cartographic geometry) data for Sentinel-2 data were gathered from the Copernicus open hub platform of European Space Agency (ESA) and converted to Level-2A (Bottom-of-atmosphere), which were atmospheric and terrain corrected using Sen2cor plugin. The bands at 20 m spatial resolution were resampled at 10 m using nearest neighbor sampling to adjust the pixel size to have a uniform band stack and simplified computation. Level-3A (orthorectified, geometrically corrected, and radiometrically calibrated) RapidEye and Dove images were gathered from the data access portal of Planet Labs (https://www.planet.com/explorer/) (accessed on 26 March 2021). The individual data bands for each dataset were mosaicked and stacked together to get a single data frame for both satellite images for all seasons.

Variable Selection, Modeling, and Validation
Variable selection is a way to reduce the number of predictor variables to reduce the model complexity to mitigate the overfitting problem. In this study, we considered a total

Variable Extraction
Two sets of variables were considered for each multispectral dataset: the original band reflectances and the vegetation indices. The band reflectances of Sentinel-2 from B2 to B8 represent Blue, Green, Red, Vegetation red edge-1, 2, 3, and NIR, respectively; B8A represents narrow NIR, and B11 to B12 represent SWIR-1, 2 channels, respectively. The spectral reflectances of RapidEye from B1 to B5 represent Blue, Green, Red, Red edge, and NIR channels, respectively. The spectral reflectances of Dove from B1 to B4 represent Blue, Green, Red, and NIR channels, respectively. Numerous vegetation indices have been defined for specific applications in the literature [67][68][69]. To limit the number of predictor variables from the large number of existing vegetation indices, we selected 25 indices that were found to be robust in previous studies [70][71][72]. These 25 indices are stated in Table 3 with their respective equations.
All these variables were extracted for each circular ground plot of 30 m diameter, and to match the ground sample plots with remote sensing data at different spatial resolutions, we aggregated the pixels (at 3, 5, and 10 m according to the satellite considered) inside each plot by making the average of the spectral values in each band/index. In particular, we used the "gBuffer" function of the "rgeos" library in R. A round cap style with 15 m radial size was used for the inclusion of pixels within the area of the field plots and only the pixels that overlap with the size of the plot on the ground were considered. Table 3. Vegetation indices and their equations used as variables for modeling.

Variable Selection, Modeling, and Validation
Variable selection is a way to reduce the number of predictor variables to reduce the model complexity to mitigate the overfitting problem. In this study, we considered a total of 25 vegetation indices and all available band reflectances as predictor variables for modeling AGB. A robust adaptive lasso (least absolute shrinkage and selection operator) technique that uses L1 penalization method was implemented to perform automated variable selection. The adaptive lasso estimatorβ LS [73] is stated in Equation (1) Here, λ > 0 is the tuning parameter (penalty parameter) and ω j is the given adaptive weighted vector that needs to be tuned to achieve accurate estimates with a small error. The term Y i − X T i β represents a random sample having identical independent distribution associated with regression coefficients β .
Initially, we used ridge regression for computing the best ridge coefficients at a minimum value of λ parameter. We then calculated the penalty factor using the best ridge coefficients. Next, we performed adaptive lasso with the calculated penalty factor using lasso penalty. The penalty factor was used to tune the λ to allow a differential shrinkage. The penalty factor can be '0' for some predictor variables with the lasso penalty that implies no shrinkage and inclusion of that particular predictor variable in the model. Thus, adaptive lasso uses lasso penalty to push the coefficient of predictor variables to absolute '0', which improves the model by selecting the most consistent variables and automates the process of variable selection.
A generalized linear modeling [74] approach via penalized maximum likelihood was used for associating selected variables with the target AGB values. The modeling algorithm uses cyclical coordinate descent to optimize the objective function over each parameter successively for convergence. To achieve the objective of the study, we computed 6 metrics for assessing the precision, the agreement, and the overfitting of models. The cross validation metrics, their equations, and the related assessment aspect is stated in Table  4. The entire process starting from computing ridge coefficients to computing assessment metrics, was performed by using leave-one-out cross validation. A mean squared error measure was used to calculate the loss of cross validation. We used R-packages-'tidyverse', 'magrittr', 'glmnet', 'pROC', 'mgcv', and 'ggplot2' to conduct the entire process of variable selection, modeling, validation, and graphing.

Design of Experiments
The designed experiments were aimed at analyzing the effect of temporal, spectral, and spatial capacities of multispectral satellite data for the prediction of AGB.
The temporal analysis was carried out using the generalized linear modeling on common spectral channels, i.e., by only using RGB and NIR bands of each satellite data. This was done to remove the effects of extra channels present in Sentinel-2 and RapidEye data as compared to Dove data. The modeling was done for each single-time image, thus obtaining one model for each season. In addition, we also performed modeling on multitemporal data (i.e., considering all images combined) to observe the effect of temporal information on the AGB prediction. The effects of temporal information were assessed based on the metrics defined in Table 2. The spectral analysis was carried out using the generalized linear modeling on data from all available spectral channels of each satellite sensor. The effect of spectral channels was assessed using variables selected by the adaptive lasso algorithm and the agreement of selected variables with AGB using the coefficient of determination. Variables were divided into three groups according to the part of the spectrum they belong to-the common (RGB and NIR) channels, the Red-edge channels, and the SWIR channels.
The spatial analysis was carried out by analyzing scatter plots of the most frequently selected key variables computed at 3, 5, and 10 m spatial resolution with AGB. A Pearson correlation coefficient (R) was computed to observe the change in the agreement between the selected key variables at different spatial scales with the AGB.

Temporal Analysis
The effect of temporal information was analyzed using box plots depicting the variation of precision, agreement, and overfitting computed with generalized linear modeling on data from common spectral channels of the three satellites (RGB and NIR) using six metrics stated in Table 2. The precision box plots (Mean Absolute Difference % (MAD%), Root Mean Squared Difference % (RMSD%)) shown in Figure 3 depict that all single-time models had lower prediction precision as compared to the multitemporal models. The analysis indicates that all satellite data had improved prediction precision due to the inclusion of multitemporal information in the model for both the study areas. The addition of multitemporal information showed more significant improvement for Dove (in Lavarone) and RapidEye (in Pellizzano) data that are characterized by higher spatial resolution. All single-time models showed high variance in the prediction precision. This indicates a significant influence of specific seasons of data acquisition for AGB modeling.     Figure 4 shows the agreement box plots (R2fit, R2cv), pointing out that all multitemporal models performed better than single-time models for both the study areas. Although the temporal information improved performance of different datasets at different magnitudes, a significant improvement was observed for Dove and RapidEye data in the Pellizzano area, where the R2cv doubled the average R2cv of single-time models. The variance in the agreement of all single-time models provides a clear advantage for the use of multitemporal multispectral data for AGB prediction.    Regarding the overfitting box plots shown in Figure 5, the desirable values of Rsquared Ratio (R2R) and Sum of Squares Ratio (SSR) should be less than 1.1, wherein the real (cross-validated) precision exceeded 10% of the model residual variance. For most cases, the multitemporal models achieved less overfitting as compared to singletime models. In addition, the overfitting box plots depict considerable uncertainty for single-time models and certain dependence on the spatial location that indirectly influence the distribution of field plot AGB values. This effect can be observed clearly for the two different study areas where Lavarone had a high variance in overfitting indicators for all data, whereas comparatively low variance was observed for Pellizzano. However, all multitemporal models resulted in an acceptable level of overfitting, proving the effectiveness of temporal information.

Spectral Analysis
The spectral analysis was carried out for multitemporal models using variable selection applied to all available spectral channels with the adaptive LASSO algorithm. Figure 6 represents the variables selected for all the datasets for both the study areas, and values in the round bracket show their frequency of occurrence. The values of the multi-time selected variables maybe different depending upon the season, and their selection was purely based on the adaptive LASSO algorithm applied to the entire set of variables under consideration. For clear visualization of spectrum specific selection, the selected variables have been divided into three spectral ranges associated with RGB and NIR channels, Rededge channels, and the SWIR channels. Figures 7 and 8 show the regression plots (Field estimated AGB vs. Predicted AGB) with a cross validated coefficient of determination for Lavarone and Pellizzano area, respectively. tion applied to all available spectral channels with the adaptive LASSO algorithm. Figure  6 represents the variables selected for all the datasets for both the study areas, and values in the round bracket show their frequency of occurrence. The values of the multi-time selected variables maybe different depending upon the season, and their selection was purely based on the adaptive LASSO algorithm applied to the entire set of variables under consideration. For clear visualization of spectrum specific selection, the selected variables have been divided into three spectral ranges associated with RGB and NIR channels, Rededge channels, and the SWIR channels. Figures 7 and 8 show the regression plots (Field estimated AGB vs. Predicted AGB) with a cross validated coefficient of determination for Lavarone and Pellizzano area, respectively.  Table 3 for definition of all acronyms).
For the Dove data, four variables were primarily selected by the adaptive LASSO algorithm for multitemporal models of both study areas. The Green Leaf Index (GLI) variable formulated using the RGB channels was the most frequently (four times) selected among all variables. The other two variables more frequently selected were VIgreen (two times) and RI (two times). It is interesting to note that these three frequently selected variables for the Dove data were formulated without the NIR channel. The agreement achieved on the scale of coefficient of determination were 0.37 and 0.33 for Lavarone and Pellizzano, respectively. For the RapidEye data, eight variables were selected in total for both multitemporal models. There were three variables selected twice, namely GLI, Red Edge NDVI (NDVIre), and Canopy Chlorophyll Content Index (CCCI), two out of which (NDVIre and CCCI) were formulated using the red-edge channels. The red-edge channels contributed to the significant improvement in the agreement of RapidEye models over Dove models. The agreement achieved for RapidEye based multitemporal models was 0.51 and 0.44 for Lavarone and Pellizzano, respectively. For the Sentinel-2 data, the  Table 3 for definition of all acronyms).
For the Dove data, four variables were primarily selected by the adaptive LASSO algorithm for multitemporal models of both study areas. The Green Leaf Index (GLI) variable formulated using the RGB channels was the most frequently (four times) selected among all variables. The other two variables more frequently selected were VIgreen (two times) and RI (two times). It is interesting to note that these three frequently selected variables for the Dove data were formulated without the NIR channel. The agreement achieved on the scale of coefficient of determination were 0.37 and 0.33 for Lavarone and Pellizzano, respectively. For the RapidEye data, eight variables were selected in total for both multitemporal models. There were three variables selected twice, namely GLI, Red Edge NDVI (NDVIre), and Canopy Chlorophyll Content Index (CCCI), two out of which (NDVIre and CCCI) were formulated using the red-edge channels. The red-edge channels contributed to the significant improvement in the agreement of RapidEye models over Dove models. The agreement achieved for RapidEye based multitemporal models was 0.51 and 0.44 for Lavarone and Pellizzano, respectively. For the Sentinel-2 data, the NDVIre formulated using the red-edge channels was the most frequently selected variable. The other most frequently selected variables GLI, CCCI, Chlorophyll Index Red Edge (CIRE), and Normalized Burn Ratio (NBR) were all selected twice for the two multitemporal models. CCCI and CIRE were formulated using the red-edge channels, whereas NBR was formulated using the SWIR channel. The red-edge and the SWIR channels contributed to more than 50% of the total selected variables. This involved a more significant improvement in the agreement of models as compared to both RapidEye and Dove. The coefficient of determination was computed for multitemporal models as 0.53 and 0.51 for Lavarone and Pellizzano, respectively.
(CIRE), and Normalized Burn Ratio (NBR) were all selected twice for the two multitemporal models. CCCI and CIRE were formulated using the red-edge channels, whereas NBR was formulated using the SWIR channel. The red-edge and the SWIR channels contributed to more than 50% of the total selected variables. This involved a more significant improvement in the agreement of models as compared to both RapidEye and Dove. The coefficient of determination was computed for multitemporal models as 0.53 and 0.51 for Lavarone and Pellizzano, respectively.  To sum up, GLI was the most consistently selected variable for all three satellite data. Sentinel-2 data achieved the highest agreement in modeling as compared to RapidEye and Dove. Moreover, Red-edge and SWIR channels significantly improved the modeling accuracy for the prediction of AGB.

Spatial Analysis
The analysis of the effects of spatial resolution on AGB estimation was carried out by observing the change in the correlation of key variables determined by Pearson's correlation coefficient (R) at different spatial scales. Figure 9 shows the scatter plots of GLI (autumn) vs. AGB at 3, 5, and 10 m for the Lavarone area, and Figure 10 shows the scatter plots of Visible Index Green (VIgreen) (summer) vs. AGB at 3, 5, and 10 m for the Pellizzano area.
The analysis of scatter plots in Figures 9 and 10 collectively suggest a higher correlation of the considered key variables at 10 m, i.e., at a lower spatial scale. A correlation of To sum up, GLI was the most consistently selected variable for all three satellite data. Sentinel-2 data achieved the highest agreement in modeling as compared to RapidEye and Dove. Moreover, Red-edge and SWIR channels significantly improved the modeling accuracy for the prediction of AGB.

Spatial Analysis
The analysis of the effects of spatial resolution on AGB estimation was carried out by observing the change in the correlation of key variables determined by Pearson's correlation coefficient (R) at different spatial scales. Figure 9 shows the scatter plots of GLI (autumn) vs. AGB at 3, 5, and 10 m for the Lavarone area, and Figure 10 shows the scatter plots of Visible Index Green (VIgreen) (summer) vs. AGB at 3, 5, and 10 m for the Pellizzano area.     The analysis of scatter plots in Figures 9 and 10 collectively suggest a higher correlation of the considered key variables at 10 m, i.e., at a lower spatial scale. A correlation of 0.44 and −0.43 was achieved for GLI (autumn) and VIgreen (summer) variables, respectively, at 10 m which was higher in comparison to those achieved at 5 and 3 m. Figures 11 and 12 present the scatter plots for key variables in the red-edge spectral range with the field estimated AGB. Figure 11 shows the scatterplot of CCCI (spring) variable, and Figure 12 shows the scatter plot of NDVIre (spring) at 5 and 10 m for Lavarone and Pellizzano, respectively. A similar trend was observed for key variables in the red-edge spectrum, with the highest correlation observed at 10 m spatial resolution. The highest correlation observed for CCCI (spring) was 0.58 and for NDVIre (spring) was −0.48 at 10 m. Though there was no significant difference for CCCI (spring) at 5 m, a significant difference in correlation was observed for NDVIre (spring).

AGB Maps
To visualize the spatial distribution of the predicted AGB over the selected study areas, we used the multitemporal data based models developed using Sentinel-2 images with the best modeling accuracy for both study areas. The AGB maps for Lavarone and Pellizzano are shown in Figure 13 and Figure 14, respectively. In the maps, the non-forest areas were masked.

AGB Maps
To visualize the spatial distribution of the predicted AGB over the selected study areas, we used the multitemporal data based models developed using Sentinel-2 images with the best modeling accuracy for both study areas. The AGB maps for Lavarone and Pellizzano are shown in Figure 13 and Figure 14, respectively. In the maps, the non-forest areas were masked.

AGB Maps
To visualize the spatial distribution of the predicted AGB over the selected study areas, we used the multitemporal data based models developed using Sentinel-2 images with the best modeling accuracy for both study areas. The AGB maps for Lavarone and Pellizzano are shown in Figures 13 and 14, respectively. In the maps, the non-forest areas were masked.

AGB Maps
To visualize the spatial distribution of the predicted AGB over the selected study areas, we used the multitemporal data based models developed using Sentinel-2 images with the best modeling accuracy for both study areas. The AGB maps for Lavarone and Pellizzano are shown in Figure 13 and Figure 14, respectively. In the maps, the non-forest areas were masked. Figure 13. Spatial AGB prediction map using Sentinel-2 data-Lavarone area. Figure 13. Spatial AGB prediction map using Sentinel-2 data-Lavarone area.
As one can observe from the AGB maps, there were areas having higher AGB, corresponding with more mature and dense forest, and areas with lower AGB corresponding with sparser and young forest. In particular, in the Lavarone forest, the Northern part of the map was at a higher altitude, and the forest there was sparser and less productive. Similarly, in the Pellizzano forest, the Southern part of the map was the one at the highest altitude, characterized by a sparse forest of Larch and Stone pine.

Discussion
In this study, we modeled AGB using spectral features extracted from multispectral satellite data at various spatial scales from medium to high spatial resolution. We analyzed the prediction results with respect to temporal, spectral, and spatial properties of different multispectral data and assessed their potential in the prediction of AGB. The results obtained in this study could be used as guidelines for the use of multispectral data in the prediction of AGB and support some of the results obtained in previous studies.
For example, in [75], the single-time spectral bands and spectral indices from the Sentinel-2 data were independently used to predict forest stock volume. A high range of prediction error was observed for all values greater than 300 m 3 ha -1 for most of the singletime models in the temporal analysis for the prediction of AGB. Our study shows the limitation of high variance in precision, agreement, and overfitting with inferior modeling results of single-time models and provides a solution in the form of multitemporal models. Multitemporal data provides additional information to the model on seasonal varia- Figure 14. Spatial AGB prediction map using Sentinel-2 data-Pellizzano area.
As one can observe from the AGB maps, there were areas having higher AGB, corresponding with more mature and dense forest, and areas with lower AGB corresponding with sparser and young forest. In particular, in the Lavarone forest, the Northern part of the map was at a higher altitude, and the forest there was sparser and less productive. Similarly, in the Pellizzano forest, the Southern part of the map was the one at the highest altitude, characterized by a sparse forest of Larch and Stone pine.

Discussion
In this study, we modeled AGB using spectral features extracted from multispectral satellite data at various spatial scales from medium to high spatial resolution. We analyzed the prediction results with respect to temporal, spectral, and spatial properties of different multispectral data and assessed their potential in the prediction of AGB. The results obtained in this study could be used as guidelines for the use of multispectral data in the prediction of AGB and support some of the results obtained in previous studies.
For example, in [75], the single-time spectral bands and spectral indices from the Sentinel-2 data were independently used to predict forest stock volume. A high range of prediction error was observed for all values greater than 300 m 3 ha −1 for most of the single-time models in the temporal analysis for the prediction of AGB. Our study shows the limitation of high variance in precision, agreement, and overfitting with inferior modeling results of single-time models and provides a solution in the form of multitemporal models. Multitemporal data provides additional information to the model on seasonal variations of the same spectral features. Multitemporal models for all datasets (Dove, RapidEye, and Sentinel-2) achieved better modeling accuracy than their respective single-time models. Therefore, adding the temporal information complemented the spectral features at all spatial scales for both the considered study areas characterized by different forest compositions. Despite the heterogeneous and complex forest ecosystem, the multitemporal models performed better as compared to single-time models. Moreover, the annual growth of the stem diameter, and hence, the AGB, is dependent on factors such as rainfall, temperature, humidity, day length, species, etc. [76], which directly affect the seasonal patterns of vegetation. Therefore, it can be stated that the seasonal variation in spectral reflectance of the forest plots captured using spectral features of multitemporal multispectral data is expected to have a relationship with the structural measures of the forest, and hence, the AGB. Based on similar assumptions, the studies [15,52,77] were previously performed on Landsat time-series data for the prediction of AGB. Our study adopted certain improvements in methodology and analytical approach to quantify the key aspects (i.e., precision, agreement, and overfitting observed in time) for improving the prediction of AGB. In addition, the temporal results were analyzed by developing experiments performed at various spatial scales (3, 5, and 10 m) of multispectral data.
The process of variable selection is crucial for modeling AGB with spectral features of multispectral data. A few studies [34,75,[78][79][80] bypass the process of variable selection and produce models with large overfitting and high margin of errors. For example, in [34], to predict stand biomass using Landsat data, the models were developed separately for spectral bands and spectral indices without the use of any variable selection method. In [78], multiple variables were added to a basic model one at a time to determine an optimum combination model to predict forest stand volume. However, the most reliable way to add or remove a variable is to use a robust variable selection method. In this study, we systematically added variables to the models using an adaptive lasso variable selection method. Only specific robust variables driving significant change in the model were added. In [75], the forest stock volume was estimated from multispectral data with two different spatial resolutions. The two different models at two different spatial resolutions were developed based on single spectral variables having the highest correlation. Differently, in this study, we developed models based on the combination of different spectral variables at three different spatial resolutions using a robust variable selection method. Performing the spectral analysis experiment for two different study areas, the important role of some specific spectral bands in modeling AGB was iteratively determined.
A few other studies (e.g., [75,78,79,81]) were conducted on Sentinel-2 data using different modeling approaches for the prediction of forest biophysical parameters. In [81], Sentinel-2 data were used at 20, 40, and 60 m for the prediction of canopy cover using the same spectral indices. These indices achieved similar results at all spatial scales with identical scatterplots and equivalent RMSE values. In our study, we analyzed the correlation of a few selected key variables from different spectral ranges at different spatial resolutions. The observed significant variation in a correlation of the same key variables at different spatial resolutions pointed out an instrumental role of spatial resolution in the prediction of AGB. The key variables in all cases were more correlated at a lower spatial resolution, and thus, the higher spatial resolution of the data is less relevant when the prediction method is based on spectral features from multispectral data. Our study provides a solution to make AGB modeling at a higher spatial resolution more viable by including relevant spectral information. This can be observed from the modeling results presented in the spectral spatial analysis section of the study.
This study can assist geo-spatial startups, such as Planet Labs, and other space agencies in planning future missions with respect to onboard sensor equipment and upgrade for forest applications. The primary focus of this study was to explore the role of different capacities of multispectral data in modeling AGB, and therefore, the approach was limited to serve the objectives. In addition, this study can prove relevant and useful for the development of more efficient tools and techniques based on automatic feature extraction and selection with hierarchical modeling methods using multispectral data. There is a possibility of developing alternate tree-crown segmentation and texture variance based methods using high resolution multispectral images, which may prove effective even with limited spectral channels. Moreover, with respect to alternate spaceborne data for the prediction of AGB, long wavelength and cross-polarization in SAR may be comparable in terms of uncertainty, availability, and time of processing with state-of-the-art multispectral data. However, multispectral satellites provide better regional and global data consistency for complex canopies and forest cover on mountainous terrain. As shown in [19], SAR data used for four different study areas with three different polarizations produce uncertainty and variance in modeling accuracy for AGB prediction. In our study, a similar variance in modeling accuracy was observed at different temporal instances, and a small uncertainty was observed depending upon variables selected for multitemporal multispectral data. It is worth noting that different modeling approaches could have led to slightly different results. However, this would be the case for every chosen approach or the other different extracted features. Furthermore, the limited field plot samples could restrict model performance to some extent, as observed in this study. As the study was purely based on spectral data, a forest-type specific trained model would be required to make predictions at the global level. Despite these limitations, this study presented a systematic approach to variable extraction and automatic variable selection based on a robust algorithm and a consistent set of experiments to achieve the desired objectives. The results point out the potential of spectral data for AGB prediction. In the future, using this study as a basis, more sophisticated disentangled feature generation and integrated modeling approaches using deep learning can be developed.

Conclusions
In this study, different multitemporal multispectral remote sensing data were used to characterize the effect of temporal and spectral information as well as spatial resolution for predicting forest AGB. The results clearly showed that each characteristic of multispectral data affects the prediction of AGB. Multispectral data characterized by the spectral information from red-edge and SWIR channels in addition to the visible spectrum diminish the effect of data saturation and formulate effective predictors of AGB. The correlation of key variables at different spatial scales iteratively suggests better estimates of AGB at low spatial resolution ruling out a greater role of high spatial resolution in spectral variable based modeling. In addition, single-time based multispectral data are insufficient for accurately modeling AGB, and it requires multitemporal data for better modeling accuracy. The multitemporal models present a significant improvement over all single-time models confirming multispectral data as a fine and potent resource for AGB prediction.  Data Availability Statement: Restrictions apply to the availability of these data.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Appendix A
The allometric equations used for the estimation of the AGB of each tree in the field (AGB TREE ) were based on the stem volume equations published in [82] multiplied by the wood density (WD) of each species (IPCC 2003). The stem volume equations of [82] were developed for the region where our study areas are located. The equation for estimating AGB TREE is in the following form: where DBH is the diameter in centimeters, H the height in meters, and AGB TREE is the estimated aboveground biomass in kilograms. The coefficients used for the different species are in Table 1.