Estimating Forest Biomass Dynamics by Integrating Multi-Temporal Landsat Satellite Images with Ground and Airborne LiDAR Data in the Coal Valley Mine , Alberta , Canada

Assessing biomass dynamics is highly critical for monitoring ecosystem balance and its response to climate change and anthropogenic activities. In this study, we introduced a direct link between Landsat vegetation spectral indices and ground/airborne LiDAR data; this integration was established to estimate the biomass dynamics over various years using multi-temporal Landsat satellite images. Our case study is located in an area highly affected by coal mining activity. The normalized difference vegetation index (NDVI), enhanced vegetation index (EVI and EVI2), chlorophyll vegetation index (CVI), and tasseled cap transformations were used as vegetation spectral indices to estimate canopy height. In turn, canopy height was used to predict a coniferous forest’s biomass using Jenkins allometric and Lambert and Ung allometric equations. The biophysical properties of 700 individual trees at eight different scan stations in the study area were obtained using high-resolution ground LiDAR. Nine models (Hi) were established to discover the best relationship between the canopy height model (CHM) from the airborne LiDAR and the vegetation spectral indices (VSIs) from Landsat images for the year 2005, and HB9 (Jenkins allometric equation) and HY9 (Lambert and Ung allometric equation) proved to be the best models (r2 = 0.78; root mean square error (RMSE) = 44 Mg/H, r2 = 0.67; RMSE = 58.01 Mg/H, respectively; p < 0.001) for estimating the canopy height and the biomass. This model accurately captured the most affected areas (deforested) and the reclaimed areas (forested) in the study area. Five years were chosen for studying the biomass change: 1988, 1990, 2001, 2005, and 2011. OPEN ACCESS Remote Sens. 2015, 7 2833 Additionally, four pixel-based image comparisons were analyzed (i.e., 1988–1990, 1990–2005, 2005–2009, and 2009–2011), and Mann-Kendall statistics for the subsets of years were obtained. The detected change showed that, in general, the environment in the study area was recovering and regaining its initial biomass after the dramatic decrease that occurred in 2005 as a result of intensive mining activities and disturbance.


Introduction
Monitoring forest biomass dynamics is known as one of the important factors for environmental modeling [1], and obtaining information on forest biomass dynamics has become a necessity for forest management [2].Additionally, improving the monitoring techniques to assess forest biomass change is required to understand anthropogenic effects on ecosystem balance [3].
In the last decade, the use of passive and active remote sensing sensors to monitor forest biomass has received attention due to their ability to model the carbon cycle, deforestation and forest degradation [4][5][6].The recent advances in ground and airborne light detection and ranging (LiDAR) remote sensing enable more accurate data acquisition for the vertical and horizontal vegetation structure and better estimation of forest biomass [7,8].Many investigations have used LiDAR data to estimate biomass by considering the statistical relationship between field measurements and airborne LiDAR [7,9,10].Data from high-resolution ground LiDAR will maximize accuracy and minimize uncertainty in the findings [11][12][13].The main purpose of using LiDAR data in our investigation is to measure the biophysical parameters of trees, such as the diameter at breast height (DBH), tree height and canopy width.Multi-spectral optical remote sensing, such as Landsat Thematic Mapper (TM) satellite images, is a valuable data source in ecological research for estimating multi-temporal biomass dynamics and has been widely used for monitoring vegetation cover [14][15][16][17].Also, Pflugmacher et al. [3] discussed the limitations of estimating biomass using multi-spectral sensors, namely, the misrepresentation of the actual spatial distribution of the biomass.Additionally, the difficulty of evaluating the sampling errors because of the spatial resolution heterogeneity between remote sensing data, field surveys, and biomass maps has been discussed by [18,19].Many scientists succeeded in enhancing the relationship between Landsat spectral indices and forest biomass by incorporating the spectral trajectory of forest disturbance and regrowth, which decreased the root mean square error (RMSE) from 57% to 41% [20].
The allometric equations are useful for estimating biomass in North America, and were collected from different research papers for various tree species [21][22][23].Equations that are suitable for a large number of species have been developed [24,25] to correct the allometric equations previously published, such as in [26][27][28][29].Additionally, many scientists have adopted various allometric equations in their case studies because they can be used to address the spatial variability [28,30,31].
Our case study involves a mining area in the western central area of the province of Alberta in Canada, where huge areas were affected by deforestation as a result of coal mining activities.Such anthropological activity introduce abiotic chemical and physical disturbance such as pH (soil acidity and alkalinity), soil salinity, soil compaction, and low hydraulic conductivity, and these factors harm the soil fertility and the ecosystem balance and cause degradations over time [32][33][34].
The main objective of this investigation is to obtain the multi-spatiotemporal biomass distribution from optical remote sensing data for a coniferous forest on a coal mining site.To achieve this goal, we first addressed several research challenges:

Case Study Description
The case study was conducted in the Coal Valley Mine area, which is located approximately 80 km south of Edson in west central Alberta, Canada, as shown in Figure 1.This study site is approximately 7000 Ha.The study area has been used for coal mining since 1978.The natural vegetation in the study area is dominated by coniferous forests composed of Pinus contorta Loudon (lodgepole pine) with ericaceous shrubs and feather mosses.Populus tremuloides Michx.(Aspen) and mixed Pinus and Populus stands with herbaceous and deciduous shrub understories are present on the south aspects and the crests of knolls [35,36].Long-term continuous climatic data are not available for our particular study site, but the average climatic data of the precipitation, temperature and snowfall were obtained from the nearest NOAA station (Id: CN71881030622440) in Edson (53°35'N, 116°28'W) for the period of 1961-2010 [37].The average annual total precipitation is approximately 570 mm, and 305 mm occurs June-August.The coldest period is December-January, with a mean temperature of −18 °C, and the warmest period is July-August, with a mean temperature of 20 °C.The highest snowfall is 35 cm in January, and the lowest snowfall is approximately 6-8 cm in May and September, according to the preliminary field surveys that were conducted on 13 June 2013 and 28 April, 20 May, 11 July, 5 August 2014.Reclamation activities included the placement of cover soil, fertilization, and seeding with a mixture of legume and graminoid species; in the future, shrubs and tree seedlings may be planted to reforest the former mining sites.

Materials and Methods
This investigation is based on various remote sensing data that were acquired and integrated via advanced analytical techniques, as shown in Figure 2. The spatial analyses were conducted using several datasets (i.e., multi-temporal Landsat satellite images, high-resolution ground LiDAR, and airborne LiDAR).The integration system analyses were conducted after the pre-processing data quality check to remove the noise and unify the data geo-references.Software packages, such as ArcGIS 10.1, Cyclone 8.11, and ENVI 5.0, were used to achieve the objectives.

Airborne LiDAR and Canopy Height Extraction
The discrete airborne LiDAR data for the case study were acquired on 22 November 2005 and 26 March 2006.The horizontal root-mean-square error (RMS) is 45 cm, and the vertical RMS is 30 cm.The point density of the acquired LiDAR data for all tiles is greater than 1.3 points per square meter.Environmental System Research Institute (ESRI) ArcGIS 10.1 was used to process the LAS v1.0 point cloud, which contained all return point clouds.Digital elevation models (DEM) and digital surface models (DSM) were produced after removing the noise from the data acquisition.A canopy height model (CHM) with one meter spatial resolution was calculated by subtracting the DEM from DSM; the CHM represents a 3D surface of the vegetation cover.

Ground LiDAR and Field Measurements
The forest biophysical parameters were collected on 20 May and 11 July 2014, using Leica ScanStation C10.An exploratory field survey was conducted on 28 April 2014, to locate the scan stations that represent the overall ecosystem of the case study.Eight scan stations were placed at three locations, where the range of the trees metrics was 300 m and the scan rate was 50,000 points/s.The scan stations are fully operational in conditions ranging from bright sunlight to complete darkness, and the accuracy of single tree's (object) measurements was 4-6 mm.Cyclone 8.11 software was used to calculate the following characteristics of individual trees: diameter at breast height (DBH), height (H), and canopy width.One of the major advantages of using ground LiDAR is that the obtained data do not require extra field surveys for validation [38,39].

Landsat Data
Landsat 5 Thematic Mapper TM satellite images in cloud-free conditions were obtained from the US Geological Survey for spring and summer days in five years, i.e., 2 September 1988, 15 September 1990, 28 May 2005, 27 August 2009, and 9 September 2011.The Landsat data were re-projected into a Universal Transverse Mercator (UTM), zone 11 north NAD 1983 horizontal coordinate system.ENVI version 5.0 was used to apply the radiometric correction analysis to all multi-temporal Landsat data using the Landsat calibration utility for atmospheric correction.The calibrated images provide better reflectance for minimizing the climate effects that could potentially lead to miscalculations or additional noise.The small temporal heterogeneity (less than three months) between the Landsat dataset will not affect the change detection analysis because of the slow rate of growth in coniferous species [40,41].

Calculating the Spectral Indices
The normalized difference vegetation index (NDVI), enhanced vegetation index (EVI and EVI2), chlorophyll vegetation index (CVI), and tasseled cap transformation were considered as vegetation spectral indices (VSIs).These indices were estimated from the multi-temporal Landsat satellite data (Table 1), which were integrated with the ground LiDAR measurements to obtain the biophysical parameters of the studied forest; the parameters were used to estimate the biomass.The VSIs were statistically related to the biomass status and spatial distribution and were resistant to atmospheric influences and the forest structure [16,17,[42][43][44].
Tasseled cap transformations are considered to be robust indicators of forest dynamics [50].A linear combination of sensor-specific weightings for each band (Table 2) was applied to the multi-temporal Landsat 5 satellite images; these transformations are the three orthogonal indices called brightness, greenness and wetness (TCB, TCG and TCW, respectively) [20,51].

Allometric Equations and Ground LiDAR Metrics from Individual Trees and Point Clouds
The Jenkins allometric equations model are suitable for predicting the biomass based on DBH for different coniferous species [26][27][28], Jenkins et al. [22] estimated the biomass as follow: where is the biomass (kg), is the diameter in meters at breast height (1.3 m), is the natural log base, and β , β are the coefficients.Also, Lambert et al. [52] and Ung et al. [53] provided allometric equations to estimate biomass were derived from thousands of samples (trees) across Canada [54].The allometric equation form is shown as follows: where is the biomass (kg), β , β , β are the coefficients.Ni-Meister et al. [8] Found that LiDAR metrics can be used as predictors for biomass.In this study, we used a new high-resolution ground LiDAR to obtain the relationship between the DBH and the tree height (H) in tree-level metric bases (r 2 = 0.74; p < 0.001).Approximately 700 individual trees in the coniferous forest were obtained in the study site (Figure 3), and the empirical relationships from the ground LiDAR field measurements were developed as follows: where DBH is the diameter in meters at breast height (at 1.3 m) and H is the tree height in meters.

Statistical and Temporal Trend Analysis
The stepwise multiple linear regression method was used to select the most significant variables; multiple linear regressions were conducted for airborne LiDAR and derived VSI from Landsat 5 data for 2005.We determined the most suitable model, which is based on the linear regression scatter plot with the ground LiDAR results that identified the biophysical metrics for the study.This model was used to predict H for the years of 1988,1990,2009, and 2011 and to map the biomass in the study years.The conventional multivariate regression model can be expressed as follows: where is CHM in meters and is considered to be the dependent parameter to be predicted, is the intercept, are the regression coefficients and the values of the independent variables, and is the number of independent variables.
To determine the model that best predicted the biomass spatial distribution, 409 points were used to validate the prediction models; we used the root mean square error (RMSE), which indicates how far the predicted values deviated from the measured data (ground LiDAR) [55].
where is the predicted values, is the measured values from the ground LiDAR, and is the number of the observations and grid points.The Akaike's Information Criterion (AIC) can be defined as the objective way to decide between models that provide best fits, and recently used as an assessment tool to select the best model that predict/or estimate phenomena [56].
where is the maximized log-likelihood of the model and is the number of parameters estimated [57].
Also, AIC differences can be used to identify the best model that has∆ ≡ 0, where is the Akaike of model and is the lowest Akaike among the tested models .A pixel-based image comparison was used to estimate the amount of change over the study years.Four temporal subsets were analyzed: 1988-1990, 1990-2005, 2005-2009, and 2009-2011.The comparisons were based on the difference between the same spatial pixel positions for each pair of multi-year periods.The difference between each pair provides useful data on the most degraded or reclaimed sites in the study area [58,59].A temporal trend analysis was provided by [17].The Mann-Kendall trend test analysis is a nonparametric test for linear regression with zero slopes.Each pixel location was calculated in a time series; the S statistics increase when the value of the later data is higher than the earlier data, and the cumulative score of all pairs is used to calculate S [60].Gilbert [61] proposed the following procedures to calculate the Mann-Kendall trend statistics: (1) The biomass data should be listed in temporal order: , , … , where is the biomass estimated maps at time (2) The sign is the indicator function, where the value of all 1 /2 must be determined for all differences of where . sign (3) Computing the Mann-Kendall Statistics as follow:

Regression Models and Biomass Estimation
Stepwise multiple regression analysis was used to estimate the canopy height from the Landsat spectral indices (NDVI, EVI2, TCW, TCG, and TCB), which are independent variables.Nine models were established to identify the best relationship between the airborne LiDAR and Landsat data for 2005 from 4771 points that were extracted from all variables at the same location.H8 and H9 were the best models for estimating the canopy height (r 2 = 0.73, and 0.74; p < 0.001, respectively.∆ 0 and 176.4,respectively), as shown in Table 3.
The final equations that were used to predict the biomass directly from canopy height are shown as follow: Figure 4 shows the comparisons between the measured biomass from the high-resolution ground LiDAR and the estimated biomass derived from the Landsat spectral indices models.The models with higher adjusted r 2 and lower RMSE are indicative of better biomass estimations from the spectral indices variables; the models (HB7 and HB9) with higher r 2 values had lower RMSE values.TCG and TCW increased r 2 + 0.1, and the RMSE decreased (−16.87Mg/H).Also, the Lambert and Ung allometric equation provided the same superior models (HY7 and HY9), see Figure 5.   3), and based on Lambert and Ung allometric equation .

Estimation Uncertainties
Understanding, identifying, and reducing the sources of uncertainties are critical in improving forest biomass estimation performances.In this investigation, based on the ground LiDAR data survey, a group of statistical models was developed.The models were based on the best relationships between the tree height data, DBH and spectral indices to estimate biomass from the Landsat images.However, our models involve uncertainty for the following reasons.First, a scale difference exists between remotely sensed data and ground LiDAR survey data.The spatial resolution of Landsat 5 data is 30 m. Considering the high-resolution ground LiDAR scans, the average values of multiple samples were used to reduce the estimation error caused by the scale difference.Second, the satellite image data used in this study originate from Landsat 5 and require an atmospheric correction to obtain the best spectral reflectance for surface greenness and distribution.Third, there are uncertainties concerning the field survey for biomass change detection because simultaneous field data and satellite images are difficult to obtain.To address the temporal variance between the field survey and the remote sensing data acquisition, we focused on the relationships between the biophysical parameters in the study sites rather than the temporal data acquisition; the ultimate goal of our research is to provide an optimal biomass prediction model.

Biomass Spatial Distribution and Trajectory of Change
The estimated biomass results had a wide ranging spatial distribution as a result of mining activities that led to deforestation, as shown in Figure 6.During 1988-1990, the total area with a biomass <100 Mg/H decreased approximately 55.3%, and the biomass in the southeastern area increased approximately 36.8%.The northwestern and eastern areas lost approximately 2.2% of the >600 Mg/H biomass area, which shows the capability of the estimation model that was used to map the biomass spatial distribution.The study site was massively deforested, except for the northeastern side.After 2005, land reclamation occurred.During 2005-2009, the vegetation cover area increased 26.6% for the trees that had biomass >600 Mg/H, 25.4% for biomass of 100-250 Mg/H, and 23% for biomass <100 Mg/H.The biomass of trees <100 Mg/H increased 57.8% as a result of continuous land reclamation and regrowth.
As shown in Figure 7, a trend analysis for 1988-2011 was conducted using the estimated biomass data; the analysis showed a significant decrease in the biomass over this 23-year period.Applying the biomass prediction model to the S statistics provided clear information on how massively the mining activities deforested huge areas for this period.Changing the time window provided a different view of the activities at the same study sites.During 2009-2011, the trend analysis showed an obvious increase in the biomass, which is clear evidence of reclamation activity and regrowth.

Major Findings
The integration of data from optical remote sensing and airborne and ground LiDAR provided the ultimate data source for estimating and mapping the spatial and temporal biomass distribution for coniferous forests.Many scientists have demonstrated that Landsat and LiDAR data can be used to derive highly reliable maps of biomass distributions based on field survey measurements [3,16,50].In this study, we developed an approach that further integrates airborne LiDAR data with the high-resolution ground LiDAR to specifically determine the relationships between the biophysical characteristics of a forest and spectral indices from Landsat data.The relationships were used to directly estimate the biomass distribution at the study site from the spectral indices that were calculated from multi-temporal Landsat satellite images.Several key elements were considered to estimate and map the biomass temporal dynamics and spatial distribution using Landsat data.First, information and statistical relationships derived from both the airborne and high-resolution ground LiDAR were used to improve the biomass modeling substantially and provide accurate datasets.The advantage of using ground LiDAR for biomass modeling is the capability of accurately measuring all the biophysical details at the study site and of linking the ground LIDAR data to the airborne LiDAR data.This connection revealed information undetectable from airborne LiDAR; thus, this information is essential to provide the best model for estimating biomass directly from optical remote sensing data, such as Landsat satellite data.Second, spectral indices were carefully chosen to provide more information on the vegetation cover, as shown in Tables 1 and 2. In this research, VSIs were derived from calibrated multi-temporal Landsat images to detect and map the location and magnitude of biomass dynamics.Several studies demonstrated the benefit of using Landsat images for change detection in coniferous forest environments [62].We provide new information on the potential use and capability of such methods in estimating and mapping biomass dynamics.Third, the modeling approach has a high degree of certainty, based on the direct link between spectral indices and biophysical parameters captured by the airborne and ground LiDAR.TCB, TCG, TCW, EVI2, and NDVI were highly suitable variables for estimating the biomass spatial distribution for our case study, as shown in Table 3. Similarly to [17], a strong, negative relationship was found between brightness reflectance and vegetation coverage (r = 0.81); this relationship represents the bare soil spots that result from the mining activities in the study site.The TCW is a highly weighted (positively) spectral variable with CHM (r =0.78); this relationship supported the modeling approach to estimate the biomass according to the canopy heights [13,27].Fourth, Zhao et al. [28] stated that the multiple linear models with height-related metrics estimate the biomass variation better than the profile-related metrics because the biomass is directly related to the height and/or DBH, whereas the canopy-profile-based metrics provide very limited information.Essentially, the Jenkins allometric equations used in this research employs DBH as the input, but it is still possible to use an additional variable (i.e., height), as suggested by [27,28].
Models with higher adjusted r 2 and low RMSE likely have a greater ability to predict biomass using LiDAR-derived variables.HB9 and HY9 were the best models regarding biomass estimations using allometric equations.The final regression model HB9 was further used with the multi-temporal Landsat images to estimate the biomass spatial distribution; this model is useful for cases that have intense coniferous forests.Additionally, these results showed the advantages of using Landsat satellite images to map and assess biomass change over time and to detect deforestation and reclamation effects (see Figure 6).Finally, change detection provides more information on the history of the study site in terms of the trajectory of change.Massive anthropogenic activity appeared in 2005 throughout the study site.The northeastern side of the study site, which has a high elevation, showed more vegetation cover stability, which is a morphological characteristic that protects trees from the mining activities.In 2004, mining activities increased to nearly 4 million tons/year, and the mining reached its full extent in the summer of 2006, in which thermal coal production doubled [35].During 1990-2005, the areas that had a biomass <100 Mg/H decreased to 90.3%, and the areas that had a biomass >600 Mg/H decreased by 7.2%.

Suggestions for Future Research
This research successfully used multi-temporal Landsat data to develop and introduce a new methodology for assessing biomass change in coniferous forests in human-affected areas; nevertheless, some aspects need to be considered in future research.Landsat satellite data provide the most continuous time series for assessing biomass change.However, it is highly preferable to use continuous multi-temporal satellite data in bi-weekly temporal intervals to avoid possible disturbances/or natural hazards (wildfires, floods, etc.) that could affect the findings.Integrating different sensors could cause errors due to spatial resolution heterogeneity, and therefore, any integration approaches in environmental research must be taken gradually by developing the integration analysis in a stepwise manner and assessing the accuracy of each step.We believe that integrating ground LiDAR directly with Landsat data will result in more errors rather than greater accuracy.A field survey is essential to verify the findings and to confirm whether the predictions correspond with reality.The use of high-resolution optical satellite sensors (Worldview, Quick Bird, and Spot) is a significant advantage of this type of research, but these data are costly compared with the free Landsat archive.

Conclusions
In this study, we estimated the multi-temporal and spatial distribution of biomass for a coniferous forest at a coal mining site to assess the biomass changes as a result of land degradation and reclamation.Various datasets and techniques were used and integrated to provide the best predictors for estimating biomass directly from Landsat data.Ground LiDAR metrics were essential to understanding and accurately finding the best relationship between different biophysical parameters in our case study.A strong relationship (r 2 = 0.74; p < 0.001) was found between tree height and DBH; this statistical relationship was based on the metrics of 700 individual trees.The ground-LiDAR-based regression model was used to replace the DBH in Jenkins allometric equation with the tree height, which was obtained from airborne LiDAR.This step helped estimate the biomass from the airborne LiDAR for 2005 directly from the tree height that was extracted from the difference between the DEM and DSM.Nine spectral based models were analyzed with the airborne LiDAR metrics to select the best biomass predictor; HB9 proved to be the best predictor (r 2 = 0.78; RMSE = 44 Mg/H; p < 0.001).This model was used to estimate the biomass spatial distributions for 1988, 1990, 2001, 2005, and 2011.A major loss of biomass was found in 2005, when the study site lost approximately 50% of its forested area.This major deforestation was the result of extreme excavation.The land morphology in the northeastern side of the site played a vital role in protecting the natural habitats from the excavation activity.The areas with biomass <100 Mg/H were indicators for land reclamation and conservation, both of which increased dramatically during 2005-2011.Predicting biomass from Landsat data requires different resolution levels of data records that can be integrated for understanding the disturbance activity that occurred.With the continuity of the Landsat data, developing new algorithms becomes highly essential for studying historical changes and understanding the connection between past and current environmental states [20].
(a) Using high-resolution ground LiDAR to find the relationship between different biophysical parameters, including tree height, DBH and canopy width, (b) Applying a ground-LiDAR-based regression model to two allometric equations to estimate the biomass values from the CHM of the airborne LiDAR, (c) Finding the best spectral-based regression model to predict the coniferous forest biomass from multi-temporal Landsat data, and (d) Assessing the change in the biomass in 1988, 1990, 2001, 2005, and 2011.

Figure 1 .
Figure 1.The geographical location of the case study, aerial imagery with 0.5 m spatial resolution.

Figure 2 .
Figure 2. Flow chart describing the integration and the analyses steps of Landsat and LiDAR data.

Figure 3 .
Figure 3.A 3D visualization of the data obtained from the high resolution ground LiDAR.

Figure 4 .
Figure 4.The stepwise multiple linear regression results for the measured and estimated biomass analysis of nine different models (see, Table3), and based on Jenkins allometric equation .

Figure 5 .
Figure 5.The stepwise multiple linear regression results for the measured and estimated biomass analysis of nine different models (see, Table3), and based on Lambert and Ung allometric equation .

Figure 6 .
Figure 6.The spatial distribution of the estimated biomass over the case study over the years 1988, 1990, 2005, 2009, and 2011; the spatial resolution is 30 m.

Figure 7 .
Figure 7.A zoomed view of two sites in the case study showing the decrease and increase of the biomass for long-term (1988-2011) and short-term (2009-2011).

Table 1 .
The derived VSIs equations from Landsat 5 TM, is the near infrared band, is the red band, is the blue band, and is the green band.G, C1, C2, and L are 2.5, 6, 7.5, and 1 respectively.

Table 3 .
−1based on the Jenkins allometric equation, is the biomass in Mg•H −1 based on the Lambert and Ung allometric equation, and is the canopy height in meters.A summary of multiple regression analysis results and regression models.