Comparison of Statistical Modelling Approaches for Estimating Tropical Forest Aboveground Biomass Stock and Reporting their Changes in Low-intensity Logging Areas using Multi-temporal LiDAR Data

Accurately quantifying forest aboveground biomass (AGB) is one of the most significant challenges in remote sensing, and is critical for understanding global carbon sequestration. Here, we evaluate the effectiveness of airborne LiDAR (Light Detection and Ranging) for monitoring AGB stocks and change ( Δ AGB) in a selectively logged tropical forest in eastern Amazonia. Specifically, we compare results from a suite of different modelling methods with extensive field data. The calibration AGB values were derived from 85 50 × 50m field plots established in 2014 and which were estimated using airborne LiDAR data acquired in 2012, 2014, and 2017. LiDAR-derived metrics were selected based upon Principal Component Analysis (PCA) and used to estimate AGB stock and change. The statistical approaches were: ordinary least squares regression (OLS), and nine machine learning approaches: random forest (RF), several variations of k -nearest neighbour ( k -NN), support vector machine (SVM), and artificial neural networks (ANN). Leave-one-out cross-Remote validation (LOOCV) was used to compare performance based upon root mean square error (RMSE) and mean difference (MD). The results show that OLS had the best performance with an RMSE of 46.94 Mg/ha (19.7%) and R² = 0.70. RF, SVM, and ANN were adequate, and all approaches showed RMSE ≤ 54.48 Mg/ha (22.89%). Models derived from k- NN variations all showed RMSE ≥ 64.61 Mg/ha (27.09%). The OLS model was thus selected to map AGB across the time-series. The mean (± sd - standard deviation) predicted AGB stock at the landscape level was 229.10 (± 232.13) Mg/ha in 2012, 258.18 (±106.53) in 2014, and 240.34 (sd±177.00) Mg/ha in 2017, showing the effect of forest growth in the first period and logging in the second period. In most cases, unlogged areas showed higher AGB stocks than logged areas. Our methods showed an increase in AGB in unlogged areas and detected small changes from reduced-impact logging (RIL) activities occurring after 2012. We also detected that the AGB increase in areas logged before 2012 was higher than in unlogged areas. Based on our findings, we expect our study could serve as a basis for programs such as REDD+ and assist in detecting and understanding AGB changes caused by selective logging activities in tropical forests.


Introduction
Tropical forests have been receiving increasing attention from scientists in the past couple of decades due to their significant contribution to the global carbon cycle. Forests by sequestering and storing great quantities of carbon act as natural 'brakes' on global climate change [1,2]. The Amazon rainforest is notable as the largest continuous area of tropical forest covering approximately 400 million hectares. Given its size, the volumes of carbon dioxide that it can emit and sequester are significant; it stores one-fifth of the total carbon in global terrestrial vegetation and is the largest carbon reservoir in the form of biomass [3,4].
Forest stand development and mortality are subject to natural and anthropogenic disturbances that alter carbon fluxes over time [5,6]. Consequently, economic incentives such as REDD+ exist to alter fluxes in favour of sequestration in forests; and depend on reliable monitoring, reporting, and verification (MRV) protocols [7]. Owing to the potential of tropical forests for sequestration, especially in comparison to other terrestrial ecosystems, an accurate estimate of the forest structure and biomass is necessary to better understand the global carbon cycle [8,9]. However, monitoring in tropical regions is a resource-intensive challenge resulting in infrequent and limited field surveys [10]. Thus, there is a need for reliable LiDAR-based AGB models, an area that is still developing.
Selective logging has been an important activity affecting land use [11] and modifying carbon fluxes within the Amazon, and can degrade the forest environment if logging exceeds the sustainable forest yield [12]. Consequently, reduced impact logging (RIL) techniques are being introduced to permit sustainable resource use of the Amazonian forest. RIL involves intensive planning and monitoring techniques, such as mapping and tree inventories, to minimize negative environmental impacts. It has been shown that well-planned logging can allow close to full recovery of carbon stocks [13,14].
Forest management, REDD+ MRV, and carbon cycle modelling all rely upon accurate estimates of forest aboveground biomass (AGB) stocks and their changes over time [15]. Previous studies have aimed at improving the accuracy of AGB estimation and forest inventory from LiDAR in regional and national level MRV systems such as those for REDD+ programs. [16]. Such studies seek to adhere to the common interpretation of the IPCC guidelines stating that the uncertainty of the AGB should not be greater than 20% of the mean [17]. Airborne LiDAR data collection has become recognised in several forest ecosystems as the most reliable technique for estimating AGB [17][18][19]. LiDAR can be used to observe and facilitate the study of biomass carbon change at multiple scales [20], and to observe the impact of activities such as selective logging [15,21]. Research on the application of modelling methods for identifying these low-intensity logging practices using LiDAR is currently at an infant stage, though it is gaining momentum [22]. AGB can be estimated from LiDAR-derived attributes using a variety of statistical modeling approaches ranging from linear regression techniques to the state-of-art non-parametric methods such as Random Forest (RF), k-Nearest neighbour (k-NN), and Support Vector Machine (SVM), each depending on the underlying assumptions and complexities [16,[23][24][25]. Additionally, a recent study by Shao et al [26] on temperate hardwood forests highlighted the applicability of employing multiplicative nonlinear regression models for estimating AGB. In this case, the authors were able to leverage information on soil-based site productivity classes along with LiDAR-derived metrics to build an optimized model that could account for the variations in site productivity; including an index of site productivity which enhanced their model's ability to explain the overall variability by 14%.
In recent years, there have been a number of studies focused on comparing the accuracy and precision of multiple machine learning approaches estimating biomass. For instance, Domingo et al. [27] performed a comparison of multiple linear regression model (MLR) with four non-parametric models-namely SVM, RF, locally weighted linear regression (LWLR), and a linear model with a minimum length principle (MDL)-to estimate total biomass (tree and shrub biomass fractions) in Pinus halepensis Miller forest stands using low-density LiDAR and field data. MLR was found to outperform other nonparametric methods in terms of RMSE (15.14 tons/ha) and bias (0.01) values, though no statistically significant differences existed between the methods considered. Similarly, Domingo et al. [28], compared the performance of nine regression models in quantifying biomass losses and CO2 emissions due to combustion in an Aleppo pine forest using LiDAR data. Here too, the best model for pre-fire AGB estimation was found to be MLR, and no significant statistical differences were observed among the high performing models. Latifi et al. [29] on the other hand, made use of a wide range of forest variables extracted from multiple remotely sensed data, such as orthorectified colour infrared (CIR) images, medium-resolution Thematic Mapper (TM) imagery, and high-density normalized LiDAR point clouds, for estimating the total volume and biomass in a mixed temperate forest landscape. When comparing the performance of various plot-level nonparametric predictions, which comprised of three distance measures of Euclidean, Mahalanobis, and Most Similar Neighbour, as well as RF, and multiple remotely sensed datasets, the authors showed the superior predictive capability of LiDAR-based metrics and RF combination. Application of evolutionary genetic algorithms was also tested to prune the original high dimensional dataset and improve the performance of modeling techniques; however, intercorrelation related issues proved to be a major hurdle causing unstable results during multiple runs. Meanwhile, Gagliasso et al. [30], on examination of the predictive performance of linear regression, geographic weighted regression (GWR), gradient nearest neighbor (GNN), most similar neighbor (MSN), random forest imputation, and k-nearest neighbor (k-nn), observed that the k-nn (k = 5) had the lowest RMSE and least amount of bias while predicting biomass across 19,000 acres on the Malheur National Forest. Notwithstanding the ever-increasing interest in modeling paradigms, comparative modeling studies for AGB change prediction in selectively logged tropical forests remains nominal.
Even though airborne LiDAR can facilitate spatially explicit and timely estimates of tropical forest structure, trade-offs still exist between modeling techniques and AGB stocks, and AGB change estimations. For instance, it is unclear how much the models can be simplified and still maintain an adequate level of accuracy for AGB stocks estimation, and through the differences between estimates, report its AGB change in tropical forests. Thus, in this study, we aimed to estimate AGB stock and report the changes at the plot and landscape levels using multi-temporal LiDAR data for a selectively logged tropical forest in Amazonia, Brazil. Specifically, we compared nine machine learning approaches to traditional linear regression with the following objectives included in the scope of this study: (i) Evaluate the performance of ordinary least squares (OLS) regression modelling and nine machine learning algorithms: random forest (RF), several variations of k-nearest neighbour (k-NN), support vector machine (SVM), and artificial neural networks (ANN) (ii) Estimate AGB stocks and report AGB change at the landscape level using the best model from the previous step and multi-temporal LiDAR datasets.

Study Area
This study was conducted at Fazenda Cauaxi in Pará state, Brazil ( Figure 1). The state of Pará is in the eastern Amazon, where logging and forest clearing for the purposes of land-use conversion and the gathering of fuelwood have been essential to the local economy of the study area for decades. The local climate is tropical humid, and the average total annual precipitation is approximately 2200 mm [31]. The predominant vegetation is Ombrophilous Dense Forest, also called terra firme (upland), with a mean upper canopy height of 30-40 m and scattered emergent trees up to 50 m tall [32]. Terra Firme literally means "firm earth" and refers to a rainforest that is not inundated by flooded rivers. This forest is noticeably taller and very diverse (> 400 species/hectare in some areas). According to the Brazilian system [33], the soils are mainly classified as dystrophic yellow latosols. The soils have low fertility due to the low reserve of nutrients such as calcium, magnesium, potassium, phosphorus, and nitrogen, in addition to high saturation by aluminum. The topography is mainly flat to mildly undulating [34] with the height above sea level in the study area ranging from 74 to 150 m [35]. across areas logged (c1-c3) by RIL or unlogged (d1-d3) in 2012 (c1 and d1), 2014 (c2 and d2), and 2017 (c3 and d3) corresponding to the zoomed areas denoted in 1b and sharing the same colour ramp. Grid size in c1-c3 and d1-d3 is 10m. The coordinate reference system for the study area is EPSG:4674.

Field Data
In 2014, the field dataset was collected across 85 plots of 50 × 50 m (0.25 ha) spaced at 100 m intervals along transects distributed across areas with logging within the study area ( Figure 1b). Plot corners were registered using differential GNSS (GeoXH6000, Trimble Navigation, Ltd.; Dayton, OH, USA) [15]. Within each plot, sub-plots demarcated along one side of the plot had dimensions of 5 m × 50 m (250 m²). Only the trees within each plot with a diameter at 1.3 m breast height (dbh; cm) exceeding or equal to 35 cm were measured, whereas within the subplot's trees with a dbh within the range 35 cm and 10 cm inclusive were measured. The AGB of each individual tree (agb; kg) was calculated using an allometric equation developed by Chave et al. [36], Equation (1): where is the wood density (g/cm 3 ) per species, which is derived from a published database [36]; E is a compounded measure of environmental stresses-such as variability in temperature and precipitation-which improves estimation when field measurements of tree height are not available. For this, we retrieved a value of E = −0.104 for the location of this study area. Total live AGB (Mg/ha) was calculated by aggregation of individual tree biomass values and using plot appropriate hectare expansion factors. Table 1 presents a summary of the input data (dbh and AGB) in 2014, which were used for statistical analysis.  Figure 2 illustrates the research process that has been designed for estimating AGB stocks and AGB change using LiDAR data and statistical modeling approaches.

Lidar Data and Processing
LiDAR datasets were collected in 2012, 2014, and 2017 as a part of 'Sustainable Landscapes Brazil', a joint venture of the Brazilian Corporation of Agricultural Research (EMBRAPA) and the United States Forest Service (USFS). The attributes of the lidar sensor and flight parameters are displayed in Table 2. LiDAR data processing was carried out using FUSION/LDV version 3.60 software [37] and Lastools [38]. First, LiDAR pulse density was standardized across all years to a common density of 12 pulses/m² using the algorithm implemented in the ThinData utility of the FUSION toolkit [37]. Then lasground was used to classify ground returns (step: 10 m, bulge: 0.5 m, spike: 1 m, offset: 0.05 m). From classified ground returns, DTMs (resolution: 1 m) were created via blast2dem. The calculation of heights above ground (also called normalisation) was performed by the lasheight tool, the PolyClipdata tool clipped the LiDAR data to within the plot boundaries. Lasground, Blast2dem, lasheight, and PolyClipdata are available in the LAStools software. The CloudMetrics tool was then used to derive a suite of plot-level LiDAR canopy metrics ( Table 3); for that, we used the FUSION toolkit. A complete description of the LiDAR-derived canopy height metrics can be found in Mcgaughey [37].

Model Development and Assessment
Modelling variable selection was conducted through principal component analysis (PCA) [39] according to Silva et al. [40]. Eigenvectors were identified and inspected for each principal component, using the function "prcomp" in R [41]; a detailed description of the PCA algorithm can be found in Fernandez et al. [42]. After variable selection, we tested nine statistical model approaches for estimating and mapping AGB stock and changes as follows: i) Ordinary Least Squares (OLS) regression. This is a common method for modelling and predicting AGB from LiDAR metrics. The OLS model was implemented in R with the "lm" function.
ii) Random Forest (RF). The RF algorithm was implemented in R using the randomForest package [43]. In RF, ntree was set to 1000, and the other parameters (e.g. mtry) were left in RF default mode.
iii) k-Nearest Neighbour (k-NN) imputation. This is a non-parametric method used for regression and classification [44]. In this study, we conducted k-NN using the package yaImpute in R [45]. For each imputation, we set k = 1 neighbour to preserve the variance of the data [46]. Neighbour weighting methods used were the Euclidean (k-NN-EU), Mahalanobis (k-NN-MA), Most Similar Neighbour (k-NN-MSN), Independent Component Analysis (k-NN-ICA), Random Forest (k-NN-RF), and raw (unweighted) data (k-NN-RAW). iv) Support Vector Machine (SVM). This is a non-parametric statistical method. The SVM algorithm was performed using the R package e1071 via an epsilon-regression with the default epsilon value of 0.1 [47].
v) Artificial neural network (ANN). Here, a simulation of a biological neural network system using mathematical modelling is performed [48]. Normally, three layers of neurons make up a neural network: an input layer, a hidden layer, and an output layer. The nnt package in R was used for the ANN [49]. The hidden layer neurons parameter was set to 40, and the input and hidden nodes were set to compute the logistic function, while the output node was set to compute a linear function. Before running ANN, the dataset was standardized.
Although machine learning methods do not require that assumptions such as normality and homogeneity of variance be met, this is not true of linear regression [50,51]. Herein, we used Shapiro-Wilk [52] and Breusch-Pagan [53] tests to evaluate the normality and heteroscedasticity of the OLS model residuals. We where AGBt is the AGB stock for year t and ΔAGBt1-t2 is the AGB change between years t1 and t2, both expressed in Mg/ha. Leave-one-out cross-validation (LOOCV) was employed to assess accuracy. This is done by iteratively removing a single plot i from the total number of plots n, then using the remaining plots to fit a separate model and predict a value for the removed plot ( ), the prediction is then compared to the observed value ( ). We calculated, in Mg/ha, absolute root mean squared error (RMSE), mean difference (MD), and the coefficient of determination (R 2 ), in order to respectively evaluate model precision, accuracy, and agreement between the predicted and observed estimates (Equations 8-10).
Moreover, the relative RMSE and MD (in %) were calculated as a percentage of the observed mean AGB ( ).
The mean estimator for the AGB stocks in 2012 ( ), 2014( ) and 2017 ( ) and changes at the landscape level within unlogged and logged by RIL units were calculated (Equations 11-13) following McRoberts et al. [54]: where N is the population size (number of pixels in the area); n is the sample size; , and are the estimated AGB stocks in 2012, 2014, and 2017 at the pixel i, respectively; is the model prediction at different times, and represents the reference biomass at each time. The second element of Equation (14) represents the bias of the initial biomass change estimator.
An uncertainty analysis of , and was conducted at the landscape level by integrating errors at the pixel level and compensating for spatial autocorrelation of errors as follows [55,56] (Equation 15): where σ² is the variance of the estimator of the mean AGB stock; ρ(d) is the spatial autocorrelation function of distance d, based on an exponential semivariogram model; and σj is the estimated standard error of AGB stock values at the j-th pixel.
The variance of the estimator for ∆ ( ) was computed as (Equation 16): where is model prediction error and ̅ is mean model error.

Principal Component Analysis (PCA) and Variable Selection
The PCA indicated that 97% of the variance in the suite of LiDAR metrics is accounted for by the first six principal components (Figure 3). For each of these principal components, one metric was selected on the basis of the highest contained value of eigenvectors. The selected metrics were: PC1: Mean height (HMEAN); PC2: Height coefficient of variation (HCV); PC3: Height kurtosis (HKUR); PC4: Canopy cover (COV); PC5: Modal height (HMODE); PC6: Height skewness (HSKEW). Therefore, we selected only these metrics for modelling AGB, thus reducing dataset size and redundancy. The contribution of each PC to the total variance and the projection of each metric are shown in Figure 3. Table 4 represents the eigenvalues and eigenvectors of each principal component.

Model Performance
The best modelling approach for estimating AGB according to the LOOCV procedure was the OLS, which produced a relative RMSE less than 20% (Table 5). Shapiro -Wilk, and Breusch-Pagan showed that the assumptions of the linear regression were not violated (p-value > 0.05).
RF, SVM, and ANN showed slightly lower MD than OLS; however, the accuracy was inferior in terms of R² and RMSE. The k-NN algorithm had six derivations tested, with R² values varying from 0.35 to 0.53. Relative RMSE values ranged from 27.10% to 31.87%, while relative MD values varied from -1.96 to -0.65%, respectively. Among the k-NN variants, the k-NN-MSN model had the best performance, with the greatest values of R² and the lowest value of RMSE and MD. The worst performance was found for k-NN-RAW, which presented the lowest values for R² and the greatest value of RMSE and MD (Table 5).

Aboveground Biomass Change Mapping and Uncertainty
Estimation of AGB stocks and AGB change at the landscape level was performed using the OLS model (Table 6). Figure 4 shows AGB stocks, while Figure 5 reports the AGB change derived from AGB stock maps and highlights these estimations at both unlogged and logged areas. The AGB estimates generated by the best model (OLS) allowed us to observe that there were increases in AGB stocks at the landscape level. In 2012, the mean stock was 229.10 Mg/ha (sd ± 232.12), while in 2014 and 2017, it was 258.18 (sd ± 106.53) and 240.34 (sd ± 177) Mg/ha, respectively. We also observed the same pattern for the unlogged area (Figures 4 a1, b1, and c1), where there were increases in the order of 10 Mg/ha between the period of 2012 to 2017. However, we noticed that more recent logging areas (2010 and 2012) showed losses in the biomass stocks, while those that suffered older logging managed to recover the biomass stocks and show increases in the stocks, highlighting the area that was

Discussion
Tracking change in AGB is vital for monitoring, reporting, and verification protocols (MRV) in support of REDD+. For accurate and satisfactory estimations, proper modelling techniques and data acquisition procedures are necessary. In our study, we developed maps of AGB stocks using multitemporal LiDAR data and advanced modelling techniques that show the variation of AGB stocks over the years for logged forests in eastern Amazonia. Owing to the subtle and short-term changes occurring, logged forests are one of the hardest in which to detect changes. Results from our study highlight the robustness of our framework, the potential of multi-temporal LiDAR, and the importance of appropriate modelling techniques in support of climate change mitigation initiatives. By comparing AGB change between logged and intact forests, we gained insight into tropical forest resilience to disturbance. Specifically, our findings indicate that tropical forests have great potential for AGB recovery even after disturbances such as selective logging.
To predict biophysically important forest attributes such as basal area, mean stem diameter, and AGB, LiDAR measurements derived from point clouds can be used in empirical models [57][58][59][60]. In our study, the metrics selected to compose the models are corroborated by previous studies [59,61]. For instance, numerous AGB estimation studies [62][63][64] had indicated the metric 'mean canopy height' to be one of the most significant attributes, and this is reflected by our PCA results. Likewise, metrics such as Standard Deviation and Coefficient of Variation of Height were found to provide information on the vertical complexity and heterogeneity of canopy components [65]. In addition, our results support the findings of some previous studies that assessed the capacity of LiDAR data point-based metrics to describe forest biophysical parameters, using results obtained from point density [61] and the Canopy Cover metric [66,67]. Nonetheless, it might be possible to estimate AGB with the help of several other ALS-derived metrics [68,69] as well as with more simplified model structures, while being able to attain similar levels of accuracy as reported in our study. This would be an interesting area to explore in the near future.
In studies aimed to estimate AGB stock and AGB change, the selection of the appropriate modelling approach is one of the most critical steps [59]. We found, through the use of LOOCV, that OLS performed better than non-parametric approaches; a finding which has been reported in other studies comparing modelling methods in predicting various forest attributes [70][71][72]. By comparing the performance of OLS with other methods, we further evaluated how much variation is happening with AGB estimation varies and what trade-offs may be associated with different methods while working with logged forests. Additionally, we demonstrate that methods such as RF and SVM that performed close to the OLS can be used to estimate and make inferences when necessary; that is, in situations where there exist non-linear or diverse relationships between dependent and independent variables [73]. The performances of RF and SVM, as the best among non-parametric approaches, may have been affected not only by the number of field plots but also by other factors such as bootstrapping of data to avoid overfitting R² values [74].
In the case of k-NN based methods, we noticed comparatively less satisfactory results, even after feature scaling. Hudak et al. [46] compared different k-NN imputations to simultaneously impute the basal area and plot density per species from topographic variables and LiDAR-derived canopy structure. They concluded that k-NN was inferior to RF, reflecting our results. This can be tied to the fact that this algorithm uses the training data for classification rather than for learning and improving the model, and is very sensitive to noisy data, missing values, outliers, and dimensionality; additional difficulty rests in determining the value of parameter K on a case-by-case basis. We also found the computational cost to be quite high here as we had to calculate the distance of each instance under consideration to all the training samples. In general, apart from the dilemma with attribute selection-which might have contributed to the poor performance in our case-we had selected the same metrics for all the models built; another critical issue while employing k-NN is the uncertainty in choosing the appropriate kind of distance-based learning. In our study, we did include six different types of distances and were able to compare their performances. Given the low performances and minimal variations, in terms of RMSE and MD, between different distance-based learning methods, further research is encouraged before considering k-NN based techniques for similar studies. As previous studies have recommended, k-NN-based approaches can be more reliable when a designbased framework of forest inventory with non-parametric based estimators is involved, because this method accounts for dependence and heteroscedasticity in the data [73,75].
Asner et al. [76] compared the accuracies of non-parametric AGB models integrating LiDAR and optical data in a forest in northwestern China and found that among non-parametric approaches, RF performed best, followed by Back Propagation Neural Networks and SVR. The results of Asner et al. [76] show improvements in AGB estimates by integrating LiDAR and optical data and present a pattern similar to that of this study. Görgens et al. [77] conducted a study with very similar results to our own, in which the authors found superior RF performance as compared with other machine learning approaches such as ANN and SVR. On the other hand, there have been studies conducted based on non-linear regression models as well. For instance, Shao et al [26], for estimating AGB, employed a multiplicative non-linear regression model that took into consideration both lidar-derived metrics as well soil-based site productivity class data. Herein, the authors were able to address a few critical issues associated with mixed forests, such as the overlooked differences in height-diameter relationships with respect to sites and species found within, resulting from varied site productivities and the similarity of the vertical height profiles with varied tree volume/density arising from the deliquescent growth form of hardwood trees; not to mention, these concerns are ubiquitous and extremely challenging in the realm of tropical forests. The authors reported the relationship between AGB and LiDAR-based metrics to be nonlinear in case of low productivity sites and predominantly linear on high productivity sites.
When making comparisons to other studies, we concluded that the results of our research, in terms of R² for AGB estimates, fall within the bounds of that which has been found in tropical forest areas [18,76]. Asner et al. [76] in four tropical regions located in Madagascar, Peru, Panama, and Hawaii, reported R² varying between 0.68 to 0.85. A study in selectively logged tropical forest by d'Oliveira et al. [78] found values of R² ranging from 0.63-0.72 for linear regression models, as the authors expected, owing to their restriction to a single allometric AGB equation exclusively based upon the diameter for all species, akin to our study. Englhart et al. [16] emphasized multi-temporal LiDAR's power in accurately quantifying tree height change and associated AGB, necessary for REDD+, even for very small areas/plots.
Regarding the mean AGB densities in Mg/ha, the values we found agree with findings from other studies conducted in tropical forests. Authors such as d'Oliveira et al. [78] used airborne LiDAR data to estimate AGB and to identify regions impacted by selective logging across tropical forests in the western Brazilian Amazon, and the mean AGB they found was of 231.6 Mg/ha. Andersen et al. [25] estimated the AGB for two years, 2010 and 2011, and obtained the mean values of 232.1 and 223.0 Mg/ha, respectively, and AGB change of -9.1 Mg/ha for the period evaluated. The mean change in AGB stocks observed in Andersen et al. [25] is similar to the values found in this study. It should be noted that the higher estimated values in this study are justified owing to the analysis intervals also being larger. Furthermore, in our results, we noticed the largest decrease (-30.24 Mg/ha) in AGB stocks between 2014 and 2017 in a logged area (-10.0 Mg/ha per year); however, when analyzing the entire period (2012 to 2017), it shows a gain of approximately 8 Mg/ha. In other words, over the entire evaluated period, it was possible to verify the increase in biomass stocks and not the decrease as seen first, perhaps reflecting the balance between increased growth and increased mortality in the explored locations [35]. For this reason, we believe that studies in this scope need longer assessment times as they may otherwise result in hasty conclusions. Rangel Pinagé et al. [35] cited a series of studies on mortality after logging and also commented on the need for investigations with larger time spans and different logging intensities to determine the persistence of logging impacts on the canopy. In addition, we estimated slightly higher gains in forests logged before the first LiDAR acquisition (2012) than in intact forests (mean biomass gains of 20.0 Mg/ha in logged areas and for unlogged forests mean biomass gains of 7.0 Mg/ha, both for the period 2012-2014). Moreover, the logged areas and the unlogged areas do not differ widely in terms of values and appear to be gaining biomass at similar rates.
Based on our results, there are four areas where future studies could focus: scaling approaches, quantifying impacts of other phases of logging on AGB, and exploring other concomitant factors that affect carbon release, and the influence of site productivity variations and multiple tree species presence on AGB change. For instance, scaling up could be done through the use of full-waveform LiDAR, which can cover larger areas. Recent studies have shown that the results from discrete return and full-waveform LiDAR are of comparable accuracy [79]. Since LiDAR surveys are expensive, another option is to scale up regional estimations of AGB and AGB change with satellite imagery [16,80]. Through merging data sources, it is possible to perform a classification approach versus regression approach as well-for example, by making use of modelling techniques such as random forest-for identifying and isolating logged areas and then for comparing their AGB estimation capabilities. It should be noted that the primary changes in AGB in this study were caused by felling trees; however, future studies should also focus on quantifying the impacts on AGB caused by other phases of selective logging; such as the construction of roads and log landings. Additionally, it could be intriguing to explore how other concomitant factors such as increased forest fires due to logging, damages from machinery related to logging, and forest degradation affect carbon. Lastly, we recommend more research to investigate the influence of different tree species being present and forest types in AGB change, as previous studies [16] have reported that logged forests experience higher growth rates and accumulate more AGB than unaffected primary forests. On a similar theme, if we could stretch the data collection paradigms whenever possible to include direct or indirect measures of site productivity details, that would allow us to substantially improve the predictive capability of AGB models, as issues associated with lidar-height-based metrics can be kept minimal as discussed in Shao et al. [26].

Conclusion
In our study, we modelled AGB in a selectively logged Amazonian tropical forest using field data and LiDAR-derived metrics. In addition, we tested different approaches based on nonparametric methods to estimate AGB at the plot level and compared it with parametric OLS. Our comparison of the methods for estimations has shown that OLS is the most suitable method for AGB estimation using airborne LiDAR, as indicated by R², RMSE, and MD. We also demonstrated that other methods like RF, SVM, and ANN have the potential for predicting AGB when a non-parametric method is required. Our results show that OLS can be used to estimate AGB with satisfactory accuracy and that repeated LiDAR measurements over time are capable of estimating AGB change at landscape levels. While field data for 2012 and 2017 would have allowed us to evaluate if the same parameters and models work best for all the cases, these results remain robust. LiDAR-based approaches allow us to address questions of changing ABG with increasing speed, scale, and reliability. Our study effectively demonstrated this capacity by finding that LiDAR has the capability of tracking even the small-scale differences happening within logged forests with satisfiable accuracies.