Forest Potential Productivity Mapping by Linking Remote-Sensing-Derived Metrics to Site Variables

: A ﬁne-resolution region-wide map of forest site productivity is an essential need for e ﬀ ective large-scale forestry planning and management. In this study, we incorporated Sentinel-2 satellite data into an increment-based measure of forest productivity (biomass growth index (BGI)) derived from climate, lithology, soils, and topographic metrics to map improved BGI (iBGI) in parts of North American Acadian regions. Initially, several Sentinel-2 variables including nine single spectral bands and 12 spectral vegetation indices (SVIs) were used in combination with forest management variables to predict tree volume / ha and height using Random Forest. The results showed a 10–12 % increase in out of bag (OOB) r 2 when Sentinel-2 variables were included in the prediction of both volume and height together with BGI. Later, selected Sentinel-2 variables were used for biomass growth prediction in Maine, USA and New Brunswick, Canada using data from 7738 provincial permanent sample plots. The Sentinel-2 red-edge position (S2REP) index was identiﬁed as the most important variable over others to have known inﬂuence on site productivity. While a slight improvement in the iBGI accuracy occurred compared to the base BGI model (~2%), substantial changes to coe ﬃ cients of other variables were evident and some site variables became less important when S2REP was included.


Introduction
Forest productivity is an important measurement for sustainable forest planning and management and carbon sequestration studies and can be estimated from gross primary productivity (GPP) or potential site quality. While GPP is the ecological measure of productivity that estimates the total amount of organic matter fixed through photosynthesis, site productivity estimates the potential of a particular forest stand to produce aboveground wood and is usually quantified as a site index (SI). Methods based on SI are the most common and widely used measures of forest productivity, but these methods are often site-specific and species-dependent, leading to bias in the comparison of productivity in different regions [1]. In addition, SI estimation in regions dominated by multi-cohort and mixed-species can be challenging. In these regions, a variety of other approaches can be used to determine and quantify potential site productivity [2].
Fine-resolution, region-wide mapping of forest productivity is desired for improved comparison of productivity among and within regions as well as large-scale planning and management, particularly with respect to evaluating the effect of climate change on ecosystems. However, these maps are generally not available for many regions yet. Several attempts have been made to map potential forest productivity at regional scale [3][4][5]. However, updated regional productivity maps across different climatic regions will be more desired for a wide range of disciplines. The production of these This research aimed at employing several remotely sensed variables obtained from Sentinel-2 satellite data to produce an improved BGI model. The specific objectives of this research were: (1) assessing the potential capabilities of several Sentinel-2-derived SVIs and single spectral bands for forest productivity measurement through evaluating the Sentinel-2 variables to predict tree volume and height; (2) using the best Sentinel-2 variables in combination with site variables for modeling and mapping forest potential productivity across New Brunswick, Canada and Maine, USA at 20 m spatial resolution; (3) comparing the improved BGI using remote sensing variables to the original BGI to better understand regional differences and areas of future improvement.

Materials and Methods
The overall methodology of this study is shown in Figure 1. All components of the workflow are explained in the following sections.
(NDVI) and EVI, have been suggested to estimate forest biophysical variables such as LAI and productivity [24]. Landsat spectral bands have been used for the estimation of forest structural attributes such as height, volume, and biomass [20,21]. Sentinel-2 imagery has spectral bands in the red-edge region that were not available in multi-spectral satellites like Landsat. These spectral bands are more efficient for the quantification of forest biophysical attributes such as leaf chlorophyll content, LAI and fractional vegetation cover [25][26][27], which could be potentially helpful to improve potential forest productivity estimation and have yet to be fully investigated to our knowledge.
This research aimed at employing several remotely sensed variables obtained from Sentinel-2 satellite data to produce an improved BGI model. The specific objectives of this research were: 1) assessing the potential capabilities of several Sentinel-2-derived SVIs and single spectral bands for forest productivity measurement through evaluating the Sentinel-2 variables to predict tree volume and height; 2) using the best Sentinel-2 variables in combination with site variables for modeling and mapping forest potential productivity across New Brunswick, Canada and Maine, USA at 20 m spatial resolution; 3) comparing the improved BGI using remote sensing variables to the original BGI to better understand regional differences and areas of future improvement.

Materials and Methods
The overall methodology of this study is shown in Figure 1. All components of the workflow are explained in the following sections.

Study Area
The study area covers parts of the Acadian forest region of the United States and Canada. The Acadian forest region is a transitional zone between two larger forest ecosystems, the northern broadleaf forest to the south and the boreal forest to the north, and encompasses three Canadian Maritime Provinces (New Brunswick, Nova Scotia, and Prince Edward Island), parts of southern Quebec, as well as northern New England states including Maine. Two major predominant forest types are spruce-fir forests comprising in particular red spruce (Picea rubens Sarg.) and balsam fir

Study Area
The study area covers parts of the Acadian forest region of the United States and Canada. The Acadian forest region is a transitional zone between two larger forest ecosystems, the northern broadleaf forest to the south and the boreal forest to the north, and encompasses three Canadian Maritime Provinces (New Brunswick, Nova Scotia, and Prince Edward Island), parts of southern Quebec, as well as northern New England states including Maine. Two major predominant forest types are spruce-fir forests comprising in particular red spruce (Picea rubens Sarg.) and balsam fir (Abies balsamea (L.) Mill.), and the northern hardwood forest, where sugar maple (Acer saccharum Marsh.), American beech (Fagus grandifolia Ehrh.), and yellow birch (Betula alleghaniensis Britton) are dominant. Other common tree species are red maple (Acer rubrum L.), eastern hemlock (Tsuga canadensis (L.) Carr.), eastern white pine (Pinus strobus L.), and northern white-cedar (Thuja occidentalis L.) [28]. The Acadian forest has been greatly altered by a long history of human-induced changes such as land clearing, intensive timber harvesting, and silvicultural practices, as well as natural disturbances, windstorms and insect outbreak in particular. Therefore, the majority of the Acadian forest is dominated by naturally-regenerated, mixed-species stands having either even-or uneven-aged structures. Two study area extents were used in this study. Figure 2A shows the extent of the study area in New Brunswick, Remote Sens. 2020, 12, 2056 4 of 17 Canada where several Sentinel-2 variables were evaluated and selected for forest productivity modeling, and Figure 2B shows the extent of the study area where a potential forest productivity map was produced using the suggested model.
dominant. Other common tree species are red maple (Acer rubrum L.), eastern hemlock (Tsuga canadensis (L.) Carr.), eastern white pine (Pinus strobus L.), and northern white-cedar (Thuja occidentalis L.) [28]. The Acadian forest has been greatly altered by a long history of human-induced changes such as land clearing, intensive timber harvesting, and silvicultural practices, as well as natural disturbances, windstorms and insect outbreak in particular. Therefore, the majority of the Acadian forest is dominated by naturally-regenerated, mixed-species stands having either even-or unevenaged structures. Two study area extents were used in this study. Figure 2A shows the extent of the study area in New Brunswick, Canada where several Sentinel-2 variables were evaluated and selected for forest productivity modeling, and Figure 2B shows the extent of the study area where a potential forest productivity map was produced using the suggested model.

Field Data
For the study area in New Brunswick (Figure 2A), different sets of field data were collected. In particular, New Brunswick Crown Lands' forest management polygon data [29], which contained photo-interpreted species composition, treatment history, and year of treatment, were used. This information allowed us to determine species percentages, stand age and management type (Mgmt) (i.e., planted, pre-commercial thinning (PCT), clearcut regeneration). BGI data [2] and Light Detection and Ranging (LiDAR)-derived forest inventory predictions, both having 20 m spatial resolution, were also collected [30].
To map forest potential productivity for Maine and New Brunswick ( Figure 2B), a total of 7033 existing forest inventory plots, measured by government and industrial forestry organizations across Maine and New Brunswick, were used in this study. These included: 1) New Brunswick Department of Natural Resources and Energy Development (NB-DNRED) provincial permanent sample plots for tree growth research; 2) Acadian Timber Inc. permanent sample plots in

Field Data
For the study area in New Brunswick (Figure 2A), different sets of field data were collected. In particular, New Brunswick Crown Lands' forest management polygon data [29], which contained photo-interpreted species composition, treatment history, and year of treatment, were used. This information allowed us to determine species percentages, stand age and management type (Mgmt) (i.e., planted, pre-commercial thinning (PCT), clearcut regeneration). BGI data [2] and Light Detection and Ranging (LiDAR)-derived forest inventory predictions, both having 20 m spatial resolution, were also collected [30].
To map forest potential productivity for Maine and New Brunswick ( Figure 2B), a total of 7033 existing forest inventory plots, measured by government and industrial forestry organizations across Maine and New Brunswick, were used in this study. These included: (1) New Brunswick Department of Natural Resources and Energy Development (NB-DNRED) provincial permanent sample plots for tree growth research; (2) Acadian Timber Inc. permanent sample plots in northwestern New Brunswick; (3) Maine state continuous forest inventory plots from the Forest Inventory and Analysis Unit, USDA Forest Service.
Biomass growth of surviving trees (BG; kg ha −1 year −1 ) for all the plots were provided through [2]. BG pm was calculated using aboveground dry biomass of all trees in a plot (p) that are alive at the start of a plot measurement period (m) between two measurement periods as explained in [2]. Plot measurements prior to 1990 were excluded from the analysis to avoid the possible effects of the 1970-1980s spruce budworm (Choristoneura fumiferana Clem.) outbreak on growth estimates. Pre-commercially thinned and planted stands also were excluded and tree ages between 20 and 100 years were used [2].

Remote Sensing Data Collection and Processing
Sentinel-2 A and B Level 1C Top-Of-Atmosphere (L1C-TOA) reflectance products during summer time were collected for this study. For the first part of the study, two cloud-free Sentinel-2 A L1C-TOA images, dated July 31 and September 14, 2017, were collected for our study site, located in the southern parts of New Brunswick, from https://earthexplorer.usgs.gov/ (Figure 2A). No cloud-free imagery was available for the month of August. The Sentinel-2 SNAP toolbox and the Sen2Cor processor were used for atmospheric-, terrain and cirrus cloud corrections of L1C-TOA products. The L1C-TOA imagery were converted to L2A surface reflectance products. Then, several SVIs were computed from Sentinel-2 spectral bands using L2A products at 20 m resolution. The list of SVIs and their equations along with single spectral bands (total of 21 Sentinel-2 satellite derived variables) used in this study is presented in Table 1. Table 1. Sentinel-2 variables used in the study (all at 20 m spatial resolution).

No
Band/Index * Band info/ /Formulation Reference Green (560 nm) 3 b4 Red (665 nm b12 Shortwave Infrared (2190 nm) 10 CLre The initial selection of SVIs and band combinations was based on their performance in previous studies to measure vegetation biochemical and biophysical properties such as chlorophyll content, leaf area index (LAI), biomass and volume using Sentinel-2 or other multi-or hyperspectral sensors [27,32,[41][42][43]. The red-edge region (spectral range between 690 and 730 nm) is suggested to be more effective for the measurement of vegetation biophysical and biochemical parameters and, in contrast to red (centered at 670 nm) and green (centered at 550 nm) reflectance, has been found to be more sensitive to a wide range of chlorophyll content. For this reason, SVIs derived from red-edge bands are expected to be a better measure for chlorophyll content, LAI, biomass, and tree volume compared with traditional SVIs [26,44,45]. Therefore, in this study several red-edge SVIs were evaluated (Table 1). We used Sentinel-2-derived variables for the prediction of the total bole inside-bark volume (TV) and height of the thickest 100 trees per hectare (HT; i.e. top height) as a proxy of forest productivity [22].
In the second part of the study, which mapped forest potential productivity over the entire study region ( Figure 2B), 58 Sentinel-2A and B L1C-TOA images dated between July 20 and September 14 in both 2017 and 2018 were collected from https://earthexplorer.usgs.gov/ to create a seamless mosaic of the best Sentinel-2-derived variables selected from the first part of the study for Maine and New Brunswick. Similarly, as described above, the Sentinel-2 SNAP toolbox was used to convert the LIC-TOA imagery to L2A surface reflectance products.

Prediction of Total Bole Inside-bark Volume (TV) and Height of Thickest 100 Trees (HT)
LiDAR-derived predictions of TV and HT on a 20 × 20 m point feature grid, acquired from the NB-DNRED, were intersected with the 21 Sentinel-2 variables (nine spectral bands and 12 vegetation indices) from two different dates (July 31 and September 14, 2017). The resulting 20×20 m point layer was intersected again with the New Brunswick crown forest management polygon layer, which contained photo-interpreted species composition, treatment history, as well as year of treatment, and allowed us to determine species percentages, stand age and management type (i.e., planted, PCT and clearcut regeneration) for each point. Sentinel-2-derived variables, in combination with other variables such as photo-interpreted species composition of stands, stand age and management history, and BGI, were used to predict LiDAR-derived TV and HT in our study site in New Brunswick (Figure 2A). In this case, we used TV and HT as if they were measured directly, rather than modeled responses. In New Brunswick, TV and especially HT were predicted at high accuracy from LiDAR point clouds with r 2 of 0.7-0.8 and 0.8-0.9, respectively [30]. TV and HT were modeled using Random Forests (RF) [46] using stand species composition, stand age, stand management type, BGI, and Sentinel-2 spectral bands and indices. Only stands that were > 1 acre were used in RF models (7400 stands total) [46]. With LiDAR HT and stand age, site index can be calculated. Here we were interested in identifying whether Sentinel-2 variables explain additional variability in point and stand-level LiDAR-TV and -HT beyond what could already be predicted by age, species, management type, and BGI. Bands that showed good predictive power and low auto-correlation could be carried forward to improve the BGI model.

Site Factors
Site factors (variables) used in this research are the same variables used in [2] to model the base BGI, including climate, lithology, soils, and topography variables. Monthly climate normals  at 800 m resolution were acquired from [47]. Lithology and soil data were obtained from various sources ( Table 2) and resampled to 50 m. A variety a topographic variables were estimated based on existing digital elevation models (DEMs) and resampled to 20 m. Topographical variables consisted of elevation, depth to water (DTW) [48], slope ( • ), and System for Automated Geoscientific Analyses (SAGA) terrain indices [49] such as topographic wetness, roughness, positive openness, and slope position class. 2.6. Using Sentinel-2 Variables to Improve BGI in Maine and New Brunswick BG responses to site factors and the best Sentinel-2-derived variables selected from the first part of the study were explored using the nonparametric RF regression tree algorithm [46]. Before using Sentinel-2-derived variables for modeling forest potential productivity, the following procedure was applied to the imagery to normalize and smooth the data.
We noted significant observable differences between Sentinel-2 values captured in July compared to late August and early September. While preference was given to late summer captures, large cloud-free captures in Maine and New Brunswick were unavailable in late summer in certain areas, and for those areas, normalized July images were used. Normalization was performed by linearly regressing late summer imagery against July imagery where spatially coincident for undisturbed forested areas, and then predicting late summer values from the July values where needed.
We used an adaptation of a geographic weighted regression (GWR) technique to develop smoothed Sentinel-2 best variable mosaics for New Brunswick and Maine separately as a function of elevation, slope, and log depth to water + 0.0001 on a 20 m grid by: (1) sampling Sentinel-2 variables, elevation, slope, and depth to water attributes on a 250 m grid; (2) omitting sample points that were: (a) non-forest or wetland, recently harvested (since 1990) based on Landsat-based disturbance history data (up to 2010 for Maine using Forest Disturbance History from Landsat, 1986-2010 [50] and up to 2015 for New Brunswick using Landsat harvest and fire change history products from 1985-2015 [51], (b) recent disturbances identified with Sentinel-2 b3 > 0.06, and (c) areas missing Sentinel-2 coverage (no cloud free days~5% of total area); (3) using least squares to solve approximately 1.5 million liner models, one for each grid-point (966,728 for Maine, 687,876 for New Brunswick), using the closest 30 grid-point attributes as local samples and stepwise backward variable selection using a significance-level < 0.05 as a threshold for variable inclusion, and where no variables were significant, the average observed Sentinel-2 variable value was used for that model; (4) predicting local distance-weighted average Sentinel-2 variables using the closest 30 linear models for each point location on a 20 m grid for the entire New Brunswick and Maine areas. We trained this model by varying the number of point observations  used to build each model, and the number of models  used to predict the Sentinel-2 optimal variables at each grid point, and chose numbers that, on average, minimized predicted Sentinel-2 variable RMSE using ten-fold cross validation. Smoothed Sentinel-2 variables slightly outperformed unsmoothed variables in terms of Random Forest variable importance (always ranked higher), therefore smoothed data were used exclusively for all further analyses. Smoothed variables were added to the original BGI model as an additional asymptotic term and model coefficients were then re-solved for New Brunswick and Maine plots, together.

Adding Sentinel-2 Terms to Biomass Growth Non-Linear Equation
Smoothed Sentinel-2 variables found useful for predicting biomass growth were introduced as additive asymptotic terms in the original Chapman-Richards biomass growth equation (BG) [2] with the base and new models were refit using only Maine and New Brunswick (Nova Scotia and Prince Edward Island excluded). The revised BG equation (iBG) and refit coefficients were used to map the new Sentinel-2-informed BGI, herein called 'iBGI', throughout New Brunswick and Maine. The two equations were then compared and areas of interest were determined for further exploration.

Prediction of TV and HT using Field and Sentinel-2 Variables
A 10-12 % increase in out of bag (OOB) r 2 was observed when Sentinel-2 variables were included in the prediction of TV (Table 3). The prediction of stand-level TV based on age, species composition, Mgmt, and BGI yielded an OOB r 2 of 68.8%, whereas the addition of the July and September Sentinel-2 data increased the OOB r 2 to 80.5%. Additionally, dropping species composition as a predictor variable Remote Sens. 2020, 12, 2056 8 of 17 did not significantly affect the OOB r 2 (79.9% vs. 78.7%). After reviewing the construction of the correlation matrix of the bands and indices, we dropped all bands and indices with the exception of bands b3, b8a, and b11 and indices S2REP and NDVI45, yielding no change in model results (OOB r 2 : 78.7% to 78.5%). Partial dependence plots showed a strong relationship between volume and b3, b8a, and S2REP, but a relatively poor relationship with band b11 and NDVI45. Random Forest models using age, Mgmt, BGI, and only b3, b8a, and S2REP resulted in a slight reduction in OOB r 2 (78.5% vs. 77.2%). Dropping age and Mgmt and running the model on only BGI, b3, b8a, and S2REP yielded an OOB r 2 of 61.2% and, finally, using only Sentinel-2 best variables (b3, b8a, S2REP) resulted in an OOB r 2 of around 58%. Results for TV prediction using September Sentinel-2 data were similar to July results where adding September variables in combination with age, species composition, Mgmt, and BGI increased the OOB r 2 from 68.8% to 79.9% (Table 3). Table 3. Results of total volume (m 3 /ha) (TV) prediction by Random Forest using species composition, age, management type, BGI and Sentinel-2 spectral bands and indices. Results for HT prediction incorporating Sentinal-2 data were similar to those obtained for TV and are presented in Table 4. Incorporating Sentinel-2 variables increased predication accuracy by 10% (OOB r 2 from 59.7% to 70.3%). Based on the performance of Sentinel-2 variables to predict TV and HT, two single spectral bands (b3 and b8a) and one SVI (S2REP) were selected to be used for modeling forest potential productivity at landscape scale.

Improving Biomass Growth Index Model for Maine and New Brunswick
Before incorporating two Sentinel-2 single spectral bands and SVIs to model forest potential productivity, the correlations between the extracted variables were evaluated using 7038 forest inventory plots in Maine and New Brunswick. The results showed that NIR (b8a) reflectance and S2REP were not strongly correlated (r~0.2). However, green reflectance (b3) was correlated with S2REP and b8a (r~0.5). Moreover, b8a and S2REP more strongly correlated with plot biomass growth rate (r~0.25-0.30), but green band were not correlated with plot biomass growth rate. Due to the low performance of b3 to predict plot biomass growth rate and total biomass, only NIR (b8a) and S2REP were used for the addition to the model. The list of the variables used for model selection and the final model is presented in Table 5. The performances of the three variable combinations are presented in Table 6. When S2REP was added to site variables to predict BG, it resulted in an r 2 of 47.9%. Adding NIR (b8a) and % hardwood composition by basal area (HW) variables slightly increased the performance of the model to 48.7% and 49.3%, respectively. S2REP was highly significant as an asymptote term in the BGI model and outperformed all site variables when it was added to the model. While only a slight improvement in accuracy occurred, substantial changes to coefficients of other variables were evident; i.e., some variables became less important when S2REP was included. When NIR and S2REP were added to the model, r 2 increased slightly but NIR showed a high correlation with hardwood content (HW) ( Table 6) and therefore was removed from the final model.
Finally, only S2REP was selected for the inclusion in the model used for mapping the improved forest potential productivity index (improved BGI; iBGI). The base model, the new model and their components are presented in Equations (1)- (7).
where the asymptote of original BGI and new BGI (iBGI) were defined by a combination of site factors (CLIMATE, SOIL, TOPOGRAPHY; see Table 2), SPECIES (percent basal area of PO and Pi) and SENTINEL-2 (S2REP_GWR) factors with parameters a 0 , . . . ,a 9 , s 0 , and s 1 ; the growth rate parameter b 0 was influenced by plot aboveground dry biomass (BM) at time 0 surviving to time 1; and the shape parameters c 0 and c 1 were influenced by the quadratic mean diameter (QMD) at time 0 (Table 7). All site variable values were bilinearly interpolated for each plot location (p), while stand variables, including BG, were calculated for each plot measurement period (m) [2]. To address heteroscedacity, a variance power function was included in the final model.  The map of base BGI for Maine and New Brunswick is presented in Figure 3. This map was produced based on Maine and New Brunswick permanent sample plots using coefficients presented in Table 7 and therefore can be slightly different from the BGI map presented in [2]. The asymptote of the iBG pm model with species set at zero (iBGI) was also mapped at 20 m resolution for all of Maine and New Brunswick and is presented in Figure 4. The iBGI map was generally similar to the base BGI map (e.g., highest rates of productivity can be observed in the southern and central parts of Maine attributed to warmer growing conditions, deep soils, and fine sedimentary bedrocks). Lower productivity can be observed in highlands and coastal areas as well as poorly drained local topography, or areas with poor soils or bedrock. Remote Sens. 2020, 12, x FOR PEER REVIEW 13 of 19     The observed discrepancies between the iBGI and base BGI are related to two different factors: 1) on average, the BG model under-predicted the productivity in New Brunswick by about 235 kg ha −1 year −1 but over-predicted it in Maine by about 150 kg ha −1 year −1 , and this difference was not much improved with the addition of S2REP to the iBG model. These differences can be attributed to variations related to the soil and lithology mapping procedures between regions (Maine and New Brunswick) that could not be fully standardized, and/or differences in regional species composition that could not be factored out of the BG model when mapping BGI and iBGI; 2) in general, low productivity (poor) sites, especially those with stocking limitations, in both Maine and New Brunswick are more detectable using Sentinel-2 data compared to high productivity sites; therefore, poor sites experienced the largest index change (reduction) when adding S2REP to the BG model for both regions. Therefore, the new iBGI map addressed the issues caused by rock, semi-barren land, or poor drainage. As an example, Figure 5 shows the new map clearly predicting poorer site productivity in areas of exposed rock in the New Brunswick highlands, when compared to the base BGI map, indicating an improvement.
Remote Sens. 2020, 12, x FOR PEER REVIEW 14 of 19 The observed discrepancies between the iBGI and base BGI are related to two different factors: 1) on average, the BG model under-predicted the productivity in New Brunswick by about 235 kg ha −1 year −1 but over-predicted it in Maine by about 150 kg ha −1 year −1 , and this difference was not much improved with the addition of S2REP to the iBG model. These differences can be attributed to variations related to the soil and lithology mapping procedures between regions (Maine and New Brunswick) that could not be fully standardized, and/or differences in regional species composition that could not be factored out of the BG model when mapping BGI and iBGI; 2) in general, low productivity (poor) sites, especially those with stocking limitations, in both Maine and New Brunswick are more detectable using Sentinel-2 data compared to high productivity sites; therefore, poor sites experienced the largest index change (reduction) when adding S2REP to the BG model for both regions. Therefore, the new iBGI map addressed the issues caused by rock, semi-barren land, or poor drainage. As an example, Figure 5 shows the new map clearly predicting poorer site productivity in areas of exposed rock in the New Brunswick highlands, when compared to the base BGI map, indicating an improvement.

Discussion
Sentinel-2-derived variables were used to predict LiDAR-estimated TV and HT, which were found to be quite effective either by themselves or used in combination with other ancillary data. Optical remote sensing data such as Landsat have been long used by numerous researchers to measure both forest biophysical and structural properties. While the application of SVIs is common for vegetation biophysical properties estimation [24], single spectral bands have commonly been used to measure structural properties [20,21] and productivity [3]. The application of Sentinel-2 data for estimating forest structural information and productivity is relatively new because they are recentlylaunched satellites. However, research has already shown the superiority of red-edge SVIs that can be calculated from Sentinel-2 data to measure vegetation biophysical variables [25,26,32,45,52].

Discussion
Sentinel-2-derived variables were used to predict LiDAR-estimated TV and HT, which were found to be quite effective either by themselves or used in combination with other ancillary data. Optical remote sensing data such as Landsat have been long used by numerous researchers to measure both forest biophysical and structural properties. While the application of SVIs is common for vegetation biophysical properties estimation [24], single spectral bands have commonly been used to measure structural properties [20,21] and productivity [3]. The application of Sentinel-2 data for estimating forest structural information and productivity is relatively new because they are recently-launched satellites. However, research has already shown the superiority of red-edge SVIs that can be calculated from Sentinel-2 data to measure vegetation biophysical variables [25,26,32,45,52].
In our research, two red-edge SVIs, namely S2REP and NDVI45, were identified as the best SVIs for TV and HT prediction among the 12 SVIs evaluated. The prediction of stand-level volume based on only Sentinel-2 variables was relatively promising (all variables OOB r 2 : 66%; only b3, b8a, S2REP variables OOB r 2 : 58%) and should be further tested across a broader range of conditions when more data becomes available. In our analysis, incorporating only Sentinel-2 variables with forest management data yielded an OOB r 2 > 80%. The prediction of HT based on only Sentinel-2 variables was also promising but the results were slightly lower than those for TV (all variables OOB r 2 : 56%; only b3, b8a, S2REP variables OOB r 2 : 45%). In contrast to TV, incorporating Sentinel-2 variables with forest management data yielded an OOB r 2 > 70%. Generally, optical remote sensing data provide more extensive information on forest structural attributes in the horizontal plane but are less effective to measure forest vertical structure, such as canopy height. Therefore, tree biophysical variables, such as LAI, crown size and tree diameter, have shown stronger correlation with optical variables compared to height [19,53].
S2REP was also strongly correlated with plot BG and in the new iBGI model, it was a highly significant variable and outperformed all other site variables when added to the BG model. However, NDVI45 did not have a strong relationship with plot BG and, therefore, was removed from further analysis. S2REP's strong performance in productivity estimation is in agreement with previous studies that suggested this index as a robust index for LAI and chlorophyll content estimation [26]. To our knowledge, currently there is no literature available on S2REP performance for TV, HT and productivity estimation, but our findings imply a solid and rather encouraging relationship between this index and those parameters, which should be further assessed in future analyses.
Among single spectral bands, NIR (b8a) was identified as a significant variable to be used to estimate structural and biophysical parameters as well as productivity in previous research [3,20], however, as NIR showed high correlation with hardwood content, it was removed from our final model. Sentinel-2 b3 had a good performance to predict TV and HT but exhibited low performance to predict plot BG. This can be explained by the difference in forest maturity between modelled datasets in each analysis. The dataset used for the first part of the study to predict TV and HT was from younger stands but that used to model BGI/iBGI included mostly mature to old trees. However, b3 also was observed to be useful for identifying areas that were recently disturbed. For example, areas that were heavily disturbed (clearcut and roads) in the past 5-10 years nearly always had b3 reflectance values > 0.06. In addition, S2REP values commonly hovered around 716 to 726 for undisturbed productive forest land, but dropped sharply for areas recently disturbed, roads and agriculture. S2REP values were observed to remain lower than surrounding forest for 20-30 years post-harvest based on visual comparisons with Landsat-derived disturbance data from 1984-2015 in New Brunswick. It was visually apparent that, on average, S2REP values declined in areas with poor drainage and on semi-barren sites, and therefore seemed to be correlated to site productivity; however, within a given topographical position and local site, S2REP values were highly variable and probably influenced by other local factors than site, such as canopy gaps, canopy structure and species variation.
Although a slight improvement in the accuracy of the iBGI can be observed (around 2%), changes to coefficients of other site variables were substantial. S2REP was identified as the most important variable and was superior to climate, lithology, soils, and topographic site variables used to predict site productivity. Some soil and topography variables (e.g. RGS and DWT) became less important in the new model. Therefore, while climate variables [1], soil data [6] or a combination of site and climatic variables [2] are commonly used to map potential productivity across different regions, incorporating additional remote sensing data, particularly red-edge SVIs, seems to add an important new source of information for region-wide site productivity mapping. Although a model based on only remote sensing variables cannot explain all the variations in BG, these variables seem to have strong contributions and can potentially replace some of the site variables in the model. The productivity mapping incorporating remote sensing and site variables is promising but further improvements are still needed as our model performance and resulting maps highlight.
Several factors may have introduced errors or affected the performance of the new iBGI model and consequently, addressing these issues will potentially improve future model performance: (1) to produce a seamless mosaic of Sentinel-2 imagery, early summer imagery was employed for some tiles where mid-summer imagery was not available due to severe cloud cover; if those images are replaced by mid-summer images, the performance of the model will likely improve; (2) although several pre-processing steps were applied to produce cloud-and haze-free data for each tile in the mosaic, possible haze issues in 5 out of 26 tiles in each mosaic may still exist; a multi-date approach to produce composite images may address this issue; (3) the base BGI model was developed for Maine, New Brunswick, Nova Scotia, and Prince Edward Island and removing plots from Nova Scotia and Prince Edward Island resulted in less overall accuracy; (4) finally, obtaining higher positional accuracy of provincial permanent sample plot coordinates can improve the model's overall performance.

Conclusions
The iBGI meets the criteria of a good site productivity measure as it incorporates both physical and biophysical characteristics of forest stands and it is species-independent, quantitative and comparable region wide. Seninel-2 satellite data provide useful spectral information and variables for estimating vegetation properties at fine spatial resolution and are a suitable source of information to measure tree structural components and forest site productivity. Among all Sentinel-2 variables, b3, b8a and S2REP were identified as the best variables for predicting TV and HT for young stands, while S2REP was identified as the best Sentinel-2 variable to predict biomass growth rate over all forest conditions. These findings should help guide future efforts using Sentinel-2 data to predict forest structural attributes. Of particular interest in this region is using Sentinel-2 data to better map species composition and fine-scale, mixed-severity disturbances, which could also likely help to further improve BGI estimates. This, combined with the greater availability of regional LiDAR-derived forest metrics, should also provide better data needed to train and refine future maps of iBGI. Overall, the developed framework of combining remote sensing and site variables in this analysis showed promise and can be further applied in regions dominated by multi-cohort, mixed species forests.