Estimating Stand Age from Airborne Laser Scanning Data to Improve Models of Black Spruce Wood Density in the Boreal Forest of Ontario

: Spatial models that provide estimates of wood quality enable value chain optimization approaches that consider the market potential of trees prior to harvest. Ecological land classiﬁcation units (e.g., ecosite) and structural metrics derived from Airborne Laser Scanning (ALS) data have been shown to be useful predictors of wood quality attributes in black spruce stands of the boreal forest of Ontario, Canada. However, age drives much of the variation in wood quality among trees, and has not been included as a predictor in previous models because it is poorly represented in inventory systems. The objectives of this study were (i) to develop a predictive model of mean stem age of black spruce-dominated stands, and (ii) reﬁne models of black spruce wood density by including age as a predictor variable. A non-parametric model of stand age that used a k nearest neighbor (kNN) classiﬁcation based on a random forests (rf) distance metric performed well, producing a root mean square di ﬀ erence (RMSD) of 15 years and explaining 62% of the variance. The subsequent random forests model of black spruce wood density generated from age and ecosite predictors was useful, with a root mean square error (RMSE) of 59.1 kg · m − 3 . These models bring large-scale wood quality prediction closer to becoming operational by including age and site e ﬀ ects that can be derived from inventory data.


Introduction
Airborne Laser Scanning (ALS) technology has become widely accepted as an important tool for enhancing forest resource inventory (FRI) systems by increasing the accuracy of vertical structural measurements [1]. Point clouds derived from ALS data have also been shown to support a variety of ecological modeling initiatives that use implicit relationships to forest structures to make predictions of variables that are traditionally measured or estimated in the FRI such as stand age [2], as well as novel variables that could be used to optimize the value chain such as tree-level estimates of wood quality attributes [3,4]. Adding general indicators of wood quality to FRI polygons would supply basic information about where certain attributes are concentrated on the landscape. This information could support harvesting decisions to reflect market trends, and may also be useful in facilitating broader management goals such as climate change planning [5].
Wood quality attributes are strongly linked to environmental driving factors [6]. Variation in environmental conditions among stands, and the resulting impact of these factors on wood quality, is often detectible through analysis of ALS data. This suggests that some wood properties could be

Study Area
The study was conducted in the Hearst Forest (HF) management unit (83°40'W, 49°40'N), one million hectares of productive forest land [18] within northeastern Ontario, Canada ( Figure 1). The HF falls within the northern claybelt of the boreal forest region, which is typified by low-lying glaciolacustrine till, known for its poor drainage. The predominant topography is flat ground at low elevation (e.g., ≤86 m a.s.l.); however, there are some upland glaciofluvial landforms that reach elevations of up to 400 m [18,19]. The southern portion of the HF has sandy Humero-Ferric Podzols, with some bedrock outcrops. The northern portion has poor drainage and the soils are mostly Mesisols (24%) and Gleysols (18%) [20]. Black spruce (Picea mariana Mill. B.S.P.) is the most common tree species found in the HF, occurring across a broad range of conditions from dry uplands to wet lowlands [21]. Meteorological data from the weather station in Kapuskasing, ON, (49.3803°N, 82.4001°W) indicate that mean monthly air temperatures fall between −18.6 °C (January) and 17.2 °C (July). The yearly mean temperature is 1.3 °C and the average yearly precipitation is 838.8 mm [22]. Growth rates in the HF are limited by landscape level moisture and nutrient gradients, as well as a short number of frost-free days (92) during the summer months [18]

Sampling Design
In 2007, ALS data were acquired for the entire HF (1,231,707 ha). Northwest Geomatics LTD acquired the data during the leaf-on period between 4 July 2007 and 4 September 2007. The ALS sensor was a Leica ALS50 (Leica Geosystems, Heerbrugg, Switzerland) mounted on a Cessna 310 aircraft, with acquisition parameters that allowed for a 30 • field of view with a 20% overlap [21]. The 11900 Hz pulse rate and 32 Hz scan rate produced an average point density of 0.82 points per square meter [23]. The raw point cloud data was summarized into structural predictor variables (Table 1) at the plot-level and over a 20 × 20 m raster of the HF, as described in [23]. In addition to these structural variables, a 20 × 20 m topographic wetness index (TWI) from ALS-derived elevation data was calculated following Bevin [24]. In 2010, a network of 446 circular 400 m 2 (11.28 m radius) Temporary Sample Plots (TSPs) were established across the HF (Figure 2) in order to support a series of collaborative research projects [23]. meter [23]. The raw point cloud data was summarized into structural predictor variables (Table 1) at the plot-level and over a 20 × 20 m raster of the HF, as described in [23]. In addition to these structural variables, a 20 × 20 m topographic wetness index (TWI) from ALS-derived elevation data was calculated following Bevin [24]. In 2010, a network of 446 circular 400 m 2 (11.28 m radius) Temporary Sample Plots (TSPs) were established across the HF ( Figure 2) in order to support a series of collaborative research projects [23]. The TSP locations were chosen using a stratification that represented the full geographical range of the HF as well as the entire range of stand development stages present in the forest. A comprehensive breakdown of this sampling design including descriptions of how all ALS variables were calculated can be found in Penner et al. [21]. In each plot, forest mensuration variables such as height (m) and diameter at breast height (DBH) (cm) were collected from all living trees ≥ 10 cm The TSP locations were chosen using a stratification that represented the full geographical range of the HF as well as the entire range of stand development stages present in the forest. A comprehensive breakdown of this sampling design including descriptions of how all ALS variables were calculated can be found in Penner et al. [21]. In each plot, forest mensuration variables such as height (m) and diameter at breast height (DBH) (cm) were collected from all living trees ≥ 10 cm DBH [23]. An ecosite and substrate classification was designated for each plot based on information collected from a soil auger profile [23]. Classification of substrate and overstorey species cover was used to identify an ecosite classification according to the Ontario Ecological Land Classification Manual [25]. For this study, ecosites were grouped according to soil moisture regimes and textures. There were nine groups in total, describing site conditions from dry to hydric. Ecosite group 2 were dry sandy sites, ecosite group 3 were fresh, sandy or dry to fresh coarse loamy sites, ecosite group 4 were moist sandy to coarse loamy sites, ecosite group 5 were fresh clayey sites, ecosite group 6 were fresh silty to fine loamy sites, and ecosite group 7 were moist, silty to fine loamy to clayey sites. Ecosite group 8 (hydric or permanently flooded ecosites) was divided into three subcategories representing conifer swamps that were rich 8r, intermediate 8i, and poor 8p in nutrients.

Age Modeling
A subset of 134 TSPs was selected from the network to represent black spruce-dominated plots for the development of an age prediction model. Plots with ≥50% black spruce cover (basal area) were identified as spruce-dominated. A field campaign was initiated from December 2016 to August 2017 for the collection of a sample of approximately 10 increment cores from black spruce trees (DBH ≥10 cm) in each plot, which was expected to sufficiently capture the variation in age. This was similar to the sample size used by Racine et al. [2] to model stand-age in the boreal forest of Quebec. Increment core samples were mounted and sanded so that all annual rings were discernible using standard dendrochronological techniques [26,27]. Dating and counting of annual rings was done under an Olympus SZ61 bifocal microscope (Olympus Corporation, Tokyo, Japan) from 6.7-45× magnification. For increment cores that did not strike the pith, an estimate of the number of missing rings was made based on ring curvature following the method of Applequist [28], and ages were adjusted accordingly. For consistency, all individual tree ages were determined to 2007, when the ALS was flown.
Non-parametric approaches such as imputation methods using k nearest neighbor (kNN) classification derived from distances in multivariate space, are well suited to large ALS datasets of forest attributes [29], and have been successfully employed in age prediction for regular single cohort boreal forest stands [2]. A nonparametric, imputation approach allows a prediction to be made for a target observation based on the most similar reference observation within the data set [29]. Various methods of determining nearest neighbor reference observations can be employed. Random forests (rf) is a classification method [30] that can be used to generate a distance metric to apply kNN and impute predictions of a response variable across a landscape. Random forests distance is defined by Crookston et al. 2019 as (1-P), where P = the proportion of random forest trees where a target observation is in the same terminal node as a reference observation [31]. We conducted kNN with k = 1 and based on the random forests distance (hereafter referred to as rf-kNN) using the "YaImpute" package version 1.0-26 [31] in the R statistical computing environment version 3.4 [32]. YaImpute was created to deal with large numbers of predictor variables, and can be used to impute predictions as a raster output. For model variable selection, an automated recursive process was followed using the varselection function within YaImpute, which allowed predictors to be chosen that reduced the model's error while still providing unique information. The variables chosen using this method were TWI, Elevation, cc16, P50, cc4, COVAR, cc10 and Max height. Model performance was assessed based on the root mean square difference (RMSD), which is a measure of the model error based on the differences between target and reference values, for all observations that have both [31] Several subsets of the age dataset were examined to determine how model performance varied based on stand characteristics [33]. In mixed stands with weak dominance of black spruce the model produced poor age predictions. Thus, the final dataset was trimmed to include only plots that had greater than 70% black spruce (% basal area) and contained no hardwood or eastern white cedar (n = 116). The model also produced poor predictions in plots with a mean age greater than 120 years. Based on this trend and an analysis of the pattern of height changes as a function of age, an old growth age class was created for plots with mean ages ≥ 120 years to address this issue. A map layer depicting the area of the HF representing the stands described by the restrictions of the final age modeling dataset was generated from the FRI in ArcMap (version 10.1, ESRI, Redlands, CA, USA), and this map was linked to the raster of ALS-derived predictor variables for the same area. Stand age was imputed across the forest at the raster level (20 × 20 m pixel) using the YaImpute and the Asciigridimpute functions.

Wood Density Modeling
A total of 80 plots were selected from the 446 TSPs originally established in 2010 for collection of wood quality data. Large increment core samples (12 mm diameter) were collected from these plots over two field campaigns, conducted in 2011 and 2014. The approach taken with the wood density model was to predict an average tree level trait expected to occur within a given spatial area (in this case a 20 × 20 m raster pixel), which would allow users to identify portions of the landscape expected to contain trees with specific wood traits [4,9,34]. To facilitate this approach, it was important that the fitting dataset was composed of trees that were representative of the stands within which they occurred. The following criteria were used to identify cores that represented the average or typical tree for a given stand; cores were extracted only from co-dominant or dominant individuals, within ±10 centimeters of the mean diameter at breast height for the plot, and within 30 years of the mean plot age. Some of the plots in this study were within multiple cohort stands where a large range of ages and wood qualities were present. A mean wood density value from a single tree would not have adequately represented the wood density measurements across multiple cohorts at the plot level and therefore these plots were removed from the study. To use the imputed age from the rf-kNN analysis in the wood quality model, the dataset was further subset to match the conditions of that model, which meant removing plots that had less than 70% black spruce mixed with a component of hardwood or eastern white cedar. The final dataset was made up of 109 increment core samples from 80 plots that represented a range of ecosite groups. The majority of plots were classified as ecosite group 8i and 8p, comprising 70% of plots represented in this model. Increment cores were sent to FPInnovations (Vancouver), where they were analyzed using SilviScan technology to obtain ring-level wood quality measurements [35,36]. Nominal density was measured at the ring-level using an X-ray densitometer at 25 µm resolution, 0.5 mm below the core surface. From the nominal density measurement, true density was calculated with a moisture content of 8% from volume and mass measurements that were calculated with micrometry.
Prior to fitting classification models, a preliminary list of potential predictor variables was developed based on the results of a stepwise multiple linear regression to prune the variable list using the "car" package [37] in R. To ensure the model was not over fit, variables were removed until all remaining had variance inflation factors less than 10. Regression tree analysis is a method for predicting values of a single response variable using a hierarchical classification tree to split the data into progressively homogenous groups. While regression trees are specific to the fitting data set, they can provide insights into the ecological drivers of the response variable [38]. We developed a regression tree of wood density from the list of predictors derived from the multiple linear regression (MLR), using the "rPart" Package [39] in R. The regression tree was pruned to cp ≤ 0.015. Random forests [40], a consensus-based modeling approach that can be used to provide a general simulation of hierarchical classification, was used to predict wood density over the range of conditions represented by the fitting data set. The final model was run with nine predictor variables, six derived from the ALS point cloud (cc4, cc6, cc12, VCI, D2, and P50) one derived from the forest resources inventory (ecosite group), one modeled from ALS data (QMDBH), and the final variable, age, derived from the rf-kNN model. A regression type model was conducted, with 8 variables tried at each split, a maximum of 4 nodes permitted and a total of 5000 trees created. To predict wood density across the entire set of spruce-dominated polygons in the HF, the subset of polygons that represented the same conditions used in the rf-kNN age model was identified, and the random forests simulation was computed across the entire area using the Modelmap package in R [41].

Age Modeling
The sampling design captured a wide variation in age structure across the forest. The youngest plots were 8 years old at breast height, and the oldest were 160 years ( Figure 3). The height structure of the plots varied with mean plot age, showing a clear decrease in height at the top (P90) and middle (P50) of the canopy at approximately 80 years (Figure 4). At 120 years there was a conspicuous increase in height of P90 and P50 ( Figure 4). This notable increase in age complexity suggested a threshold beyond which the mean stem age could not be predicted. For this reason, a single age class was created for stands older than 120 years, which are hereafter referred to as 121+.     The predictor variables describing site conditions like elevation and TWI were the most important explanatory variables in the model ( Figure 5). Variables that represent the complexity of the canopy at various heights (cc16, cc4, and cc10) and COVAR, which describes complexity as SD/mean, were also somewhat important for explaining age. This was expected since age is highly related to structural changes over time (Figure 4). The kNN age model using random forests as the special distance metric explained 62.87% of the variation of age in the sample population ( Figure 6) and had a root mean square difference (RMSD) of 15 years. These are good results considering that many stands were from naturally regenerating origin and represented a large amount of structural and site condition variation across the landscape.
Remote Sens. 2019, 11, x FOR PEER REVIEW 9 of 21 The predictor variables describing site conditions like elevation and TWI were the most important explanatory variables in the model ( Figure 5). Variables that represent the complexity of the canopy at various heights (cc16, cc4, and cc10) and COVAR, which describes complexity as SD/mean, were also somewhat important for explaining age. This was expected since age is highly related to structural changes over time (Figure 4). The kNN age model using random forests as the special distance metric explained 62.87% of the variation of age in the sample population ( Figure 6) and had a root mean square difference (RMSD) of 15 years. These are good results considering that many stands were from naturally regenerating origin and represented a large amount of structural and site condition variation across the landscape.   The predictor variables describing site conditions like elevation and TWI were the most important explanatory variables in the model ( Figure 5). Variables that represent the complexity of the canopy at various heights (cc16, cc4, and cc10) and COVAR, which describes complexity as SD/mean, were also somewhat important for explaining age. This was expected since age is highly related to structural changes over time (Figure 4). The kNN age model using random forests as the special distance metric explained 62.87% of the variation of age in the sample population ( Figure 6) and had a root mean square difference (RMSD) of 15 years. These are good results considering that many stands were from naturally regenerating origin and represented a large amount of structural and site condition variation across the landscape.   To demonstrate the application of this model at an operational scale, a map of predicted mean tree age was produced from the imputed age for all polygons in the FRI that met the rf-kNN variable criteria (Figure 7). Age variation can be very high, even within one FRI polygon. The map shows pockets of older stands in the northeastern portion of the HF where less road access is possible. The map insert, shows that areas that were recently harvested appear as young stands with a single cohort of trees displayed as purple patches on the map. It also shows that for the most part when older raster cells (121+ years) are present they are normally interspersed with medium aged raster cells (~80 years), which conveys the multicohort nature of these stands. To demonstrate the application of this model at an operational scale, a map of predicted mean tree age was produced from the imputed age for all polygons in the FRI that met the rf-kNN variable criteria (Figure 7). Age variation can be very high, even within one FRI polygon. The map shows pockets of older stands in the northeastern portion of the HF where less road access is possible. The map insert, shows that areas that were recently harvested appear as young stands with a single cohort of trees displayed as purple patches on the map. It also shows that for the most part when older raster cells (121+ years) are present they are normally interspersed with medium aged raster cells (~80 years), which conveys the multicohort nature of these stands.

Wood Density Modeling
The dry sandy ecosite groups (EG2-EG4) had mean top heights greater than 19 m and were generally younger (64-80 years) ( Table 2). The hydric sites had shorter, older trees with mean heights between 14-18 m and mean ages of 100-120 years ( Table 2). Plots in ecosite group 8p, poor conifer swamps typified by poorly drained organic soil, had the highest mean density of 536 kg·m −3 . In contrast, the mean density for samples in ecosite group 4, dry sandy to coarse loamy sites, were the lowest at 413.2 kg·m −3 ( Table 2). The general trend was that ecosite groups that provided poorer growing conditions tended to have the highest wood density, whereas the faster growing, drier sites had the lowest wood density ( Table 2). There was a clear separation of density values across the nodes produced in the regression tree analysis (Figure 8). The regression tree model of wood density had a RMSE of 63.87 kg·m −3 and a coefficient of determination of 0.52. Age and ecosite were both important variables in the classification of wood density. The first split in the regression was based on ecosite groups, with EG 3-7 separating into a node where mean densities were typically low (~400-500 kg·m −3 ) (Figure 8). Mean density within the node containing the wet ecosites (EG-8i and 8p) was higher, with predictions ranging from 450-650 kg·m −3 (Figure 8). Other than the main division represented by ecosite differences, various other predictors formed nodes on the tree within these two branches. On wet ecosites, higher densities were predicted for sites with smaller canopy cover at 12 m and smaller QMDBH, and for sites with higher VCI. On dry ecosites, higher densities were predicted for sites with higher canopy closure at 6 m and higher ages. The lowest predicted density occurred for sites on EG 3-7, with canopy cover at 6 m less than 58%. The highest densities were predicted for sites on EG 8i and 8p, with canopy cover at 12 m less than 15% and QMDBH less than 11.91 cm. Although age was not directly selected in the nodes of the tree that included the wet sites, there was a strong correlation between age and VCI and QMDBH, which were selected variables on those nodes.
The most important variable extracted in the random forests simulation was ecosite group, which reduced mean squared error (MSE) by close to 50% when it was included in model fitting ( Figure 9). Other important variables that reduced MSE substantially (5-35%) were the same variables that were elucidated in the regression tree analysis (e.g., cc6, cc12, age, QMDBH, and VCI) (Figure 9). The variables D2, P50, and cc4 entered the random forests model; however these all had minimal impacts on error when excluded from the model (Figure 9).    Overall the final random forests simulation for all 109 black spruce plots described 13.54% of the variation in the sample population with a RMSE of 59.31 kg·m −3 ( Figure 10). Generally, the model tended to over-predict the plots with low wood density (400 kg·m −3 ) and under-predict plots with higher wood density (650 kg·m −3 ) ( Figure 10).
The mean wood density map shows an example of how wood qualities could be mapped across the landscape based on a projection of the random forests simulations employed in this study ( Figure 11). The insert shows a close-up view of a small section of forest that has recently experienced forest depletion through harvesting. There are clearly visible pockets of high (613 kg·m −3 ) and low density, which could be used for operational planning. The map indicates that clearly defined patches of wood with a relatively homogenous density can be found, as well as some places where there is a mixture of densities within the same area. This level of detail could also easily be averaged by polygon to show less variation as a more simplified tool for planning. Distinct areas of dense wood can be observed even at this low resolution. At the forest management unit level, patterns of wood density are visible such as the section of high density wood predicted in the northwest corner of the forest unit and patches of low density wood in the northeast section.  The mean wood density map shows an example of how wood qualities could be mapped across the landscape based on a projection of the random forests simulations employed in this study ( Figure 11). The insert shows a close-up view of a small section of forest that has recently experienced forest depletion through harvesting. There are clearly visible pockets of high (613 kg·m −3 ) and low density, which could be used for operational planning. The map indicates that clearly defined patches of wood with a relatively homogenous density can be found, as well as some places where there is a mixture of densities within the same area. This level of detail could also easily be averaged by polygon to show less variation as a more simplified tool for planning. Distinct areas of dense wood can be observed even at this low resolution. At the forest management unit level, patterns of wood density are visible such as the section of high density wood predicted in the northwest corner of the forest unit and patches of low density wood in the northeast section. Remote Sens. 2019, 11, x FOR PEER REVIEW 14 of 21

Age Prediction
The final model we produced using the rf-kNN approach was useful for efficiently and accurately predicting the mean age of black spruce-dominated stands in boreal Ontario within 15

Age Prediction
The final model we produced using the rf-kNN approach was useful for efficiently and accurately predicting the mean age of black spruce-dominated stands in boreal Ontario within 15 years, explaining 62.87% of the sample population variance. The random forests distance produced the best results, compared to two alternative Euclidian distance metrics also used in forest mensuration that were tried, most similar neighbor (MSN) and mahalanobis (MAL). These distance metrics produced poorer root mean square differences (RMSD) than random forests, at 23.9 years for MSN and 25.5 years for MAL. Our rf-kNN model results are comparable to those obtained by Racine et al. [2], who successfully applied the kNN approach with ALS-derived variables (e.g., Weibull curves, P95, first return penetration) as indicators of vertical forest structure and site characteristics to predict broad patterns of age in the boreal region of Québec, reporting slightly lower error (10 years), and higher explained variance (r 2 = 0.84). The discrepancy in performance between these models was driven by the greater range of age and canopy structure in our fitting data from the HF. The mean age of plots in the Quebec study was 50 years, whereas it approached 90 years in the HF. Age is harder to predict in older stands, since patterns of stand dynamics have a greater chance of being influenced by small scale localized disturbances like wind-throw. Additionally, as multiple cohorts of different aged trees replace the initial cohort that emerged after a stand-replacing disturbance, the mean age can be influenced by individuals at either end of the distribution, making representative sampling difficult. It would therefore be expected that model performance would decline in a forest with an increasing proportion of old, multi-cohort stands. A study in Finland that used time series photographs to determine age since last stand replacing disturbance had an overall accuracy of classifying stand age of 78.9% [14]. However, their methodology limited estimations of age to areas with suitable aerial photograph coverage. Their results showed that older photographs had poorer prediction accuracies (before 1991) which is beyond the mean stand age (90+ years) in the HF. Maltamo et al. [42] used the k most similar neighbor (k-MSN) approach to predict common FRI attributes including mean tree age, and were able to predict age with similar error to our study, with an RMSE of 16 years for Norway spruce. The consistent problem in all of these studies has been in predicting the age of old growth stands where patterns of development are non-linear [14].
Examination of the pattern of P90 and P50 of ALS data in relation to mean tree age illustrated the changes in canopy complexity expected as stands develop under single species stand dynamics [12]. The stem re-initiation phase began at 80 years as indicated by a decrease in top height and mid-canopy height. At 120 years there was another clear increase in height for both P90 and P50, suggesting the beginning of the transition old growth phase. Multiple cohorts of trees in these transition old growth stands make it very difficult to predict a meaningful age, since the mean age of trees no longer describes the majority of the stand, but rather represents a mid-point of a broad range of ages. For the HF, this occurred in stands with a mean age of approximately 120 years, and therefore an age class was created that indicates these stands are generally multi-cohort, and should be considered as transitional or true old growth.
Age prediction accuracy in black spruce-dominated stands that had even a small proportion of eastern white cedar and birch or trembling aspen was poor. The rf-kNN method uses ecological neighbor stands with known ages to impute predictions. The model requires stand structures in the fitting data set that are similar to those being imputed in the main data set, in order to find appropriate neighbor matches. Since the canopy structure of eastern white cedar or hardwood trees is different than black spruce, and the number of plots with examples of these eastern white cedar/hardwood structures was low, there were not enough neighbors to appropriately match stands and make good predictions. Specifically, eastern white cedar has an ellipsoidal crown structure, whereas spruce are conical. Most of the foliage on hardwoods is in the upper canopy and the shape of their crowns is almost circular. The point cloud derived variables were created from low-density ALS data, therefore, the various crown structures could not be clearly differentiated. Instead, measurements of canopy height distribution (e.g., cc16, P50) provided the best options for differentiating stand structures as a function of age. For this reason, eastern white cedar or hardwood trees might have caused confusion in the model by increasing canopy closure metrics that are similar to a different age in spruce-dominated stands.
Topographic wetness index (TWI) was the most important predictor of mean stand age. The TWI describes the pooling and flow of water on the landscape [24], and was also found to be an important predictor for age in boreal stands of Quebec [2]. It is derived from the digital terrain model and depicts hydrological movement, however it is not a direct descriptor of soil moisture availability, since it does not account for variation in soil texture. Thus, TWI can only be used to generally indicate areas that may be affected by flooding (accumulation) or drought (dissipation). In areas where trees are stressed by flooding or drought conditions, growth rate is slow. Elevation was another important predictor of stand age. The terrain in the HF is mainly flat, and the substrates consist largely of heavier organic/clayey soils on the low, flat areas, with dry sandy soils restricted to the higher elevations. Although the differences in elevation among sites across the HF can be small, the effects of elevation change can be clearly observed in the variation of the ratio of height and age. Ultimately, the effects of TWI and elevation manifest as increased nutrient availability, better drainage and a slight increase in soil temperature that collectively increase growth rate in upland stands compared to the slow growth of stands in low-lying areas. Most of the black spruce-dominated stands in the HF were classified as slow-growing lowland conifer swamps. Previous studies have demonstrated that poorer sites have higher tree longevity and longer successional stages than more productive sites [43][44][45]. This is due in part to the rate at which trees reach maximum growth and senescence [44]. Additionally, since growth rate is linked to site conditions, it may also have driven some of the harvest patterns in the past, where the largest and easiest to access trees were harvested first [45], creating a pattern on the landscape such that slower-growing stands on wet sites are older.
Studies have shown that stand height is typically an important predictor of stand age [2,12,14,46,47]; however, it was not a key predictor of age in our study. The main reason for this apparent exception is the range of ages and stand structures we considered for this study. The sample population of 116 plots had a mean age of 90 years, and the scatter plot of canopy returns by age in these stands suggested that gap dynamics start to affect the age structure of plots around 80 years. Thus, a large proportion of plots in our study were likely influenced by multi-cohort stand dynamics. If we had restricted the model to apply to only the development of single cohorts, which our data suggested occurs up to approximately 80 years, then it is likely that height would have been more useful for predicting age. The changes in canopy structure that accompany this successional process can be characterized by examining the vertical distribution of foliage over multiple layers or height bins, and quantities that describe this change are often important predictors of age because they are linked to stand dynamics [12]. The vertical complexity index (VCI), applies Shannon's equation to binned point cloud returns to capture the variation in vertical distribution of points in the point cloud [17]. Although VCI was entered as a predictor for our model, it had relatively low importance to model performance. Nevertheless, VCI was highly correlated to many of the variables that were important, including the canopy closure at various heights (cc16 and cc4) and P50 height. Thus, variables that describe vertical canopy structure were indeed useful predictors of stand age. Ultimately, our model allowed for the prediction of a mean stand age with an error of approximately 15 years. It was necessary to truncate the predictions at 120 years, given the difficulties that arise when dealing with multi-cohort stands; however, this is justified because the complexity of old growth stands is effectively ignored when they are assigned a mean age. Perhaps a better approach in such stands would be to attempt to predict the ages of individual trees and then examine various properties of the distribution in addition to the mean. One way that the stand age map created in this study could be useful is by identifying stands that have been classified as old growth, facilitating the use of the age structure of the forest for natural disturbance emulation [48]. Imputing ages on a 20 × 20 m raster pixel scale creates the opportunity for forest managers to account for age variation within FRI polygons, and support better management of forest age structure. Ultimately, a higher density ALS point cloud would improve the estimation of the key forest structural variables. The low-resolution of the point cloud utilized for this study restricted age predictions to a plot level mean. Since there were only age measurements for~10 trees per plot, error in the models was caused by matching point cloud statistics summarized for many trees at the plot level (including trees without age data) and the sample trees. Higher resolution point clouds and age information from all of the trees in a plot could make single tree age prediction models possible. A source of potential error in the rf-kNN model is a bias towards rare samples, since the dataset was found to be imbalanced in regards to plot age. In future studies, special attention should be paid to attaining a balanced sample distribution to avoid this bias.

Wood Density Modeling
The wood density models produced in this study have broadened the range of conditions under which wood density predictions can be made in the boreal forest of Ontario, by adding age as a potential predictor variable without substantially increasing error. The RMSE of our model that applies to stands up to 120 years in age was 59.1 kg·m −3 , and compares well to previous studies in which the range of ages was restricted to the first 50 years of growth and RMSE values of 40.4 kg·m −3 [9] and 38.8 kg·m −3 [4] were reported. Furthermore, including age effects achieves an operational implementation goal to apply the model to black spruce-dominated polygons of all ages. There are still gains that could be made in accuracy that might be achieved with higher density ALS data. Blanchette et al. [49] developed a model from terrestrial laser scanning (TLS) data, which has a much higher point density (30 pulses/meter), and were able predict wood density more accurately (RMSE 23.87 kg·m −3 ) at a finer scale. The use of TLS, however, limits the generalizability of this type of predictive modeling, since it is currently only suitable for use over a small area (plot level), whereas ALS allows for less accurate but broader scale predictions to be made, which could be used in enhanced FRI's.
Stand age was an important indicator of average wood density in both the regression tree and random forests analyses. Young trees have a large proportion of corewood typified by low density and short fibers [50,51]. In trees that have reached maturity, the density of wood stabilizes and therefore, older trees have a greater proportion of outerwood, which is of higher density. Understanding which logs have a higher proportion of high density wood, is useful for determining which trees are better suited for structural timber products like saw logs [10]. Additionally, there is an important interaction between stand age and site quality that clearly affects wood attributes such as density. In Ontario, ecosites are defined by substrate characteristics like soil type and texture, and moisture regime, which describe the availability of water across a landscape. In hydric sites like ecosite group 8i or 8p, pith to bark profiles reach a stable state with generally high density wood after the first few years of growth. This arises from a physiological response to consistently flooded conditions on these sites, and is caused by selective pressure for thicker cell walls to protect cells from embolism. Cell diameter is also reduced in the xylem of trees exposed to saturated conditions, which limits the risk of embolism when turgor pressure is low [52]. On the mesic sites represented by ecosite groups 3-7, pith to bark profiles tend to have greater variability of density in the first fifty years of growth [6,12]. Generally high density wood is found directly adjacent to the pith, then there is a steep decrease in density, followed by a gradual increase to a stable state within the first 30 to 50 years [6,12]. Drier sites tend to have less dense wood and more variability across the pith to bark profile than wetter sites. These common trends explain some of the descriptive power of ecosite group in the regression tree and random forests models. The interaction of age and ecosite that is captured in our model could have important consequences for operational application. For example, a mean density estimated for a typical tree on a dry to fresh site where the stable density is reached around 50 years, will become increasingly influenced by high density outerwood as the tree ages beyond 50. The effect would not be as important in trees on wet sites, which tend to have higher and more uniform density from pith to bark. For this reason, age may be a more important predictor on the dry or mesic sites, since young trees express this site-related variation in the pith-to-bark more strongly than older trees.
Forest structure also influences intrinsic wood qualities [7]. The ALS variables related to canopy closure and quadratic mean diameter were all found to be important in predicting wood density in our study. These metrics could be indicators of competition and light availability at the site level, which have been shown to have a negative relationship to wood density in shade tolerant species [53]. One reason for this negative correlation of wood density to variables that indicate a more open stand structure could be the role of high density wood with long fibers and thick cell walls as an anatomical response to wind, especially when there is low stocking density and trees are unprotected [12]. These metrics may also be indicators of average crown size, which would be related to stem structural strength required to support the higher load of a large crown, and would have a positive correlation to wood density [6,7,53]. Variation in density could also be affected by genetic variation within a species, which could account for some of the error in the model [54].
Predictive mapping is a useful way to link structurally explicit ALS data and site characteristics to show where specific wood qualities can be found in the forest. The wood density map created for this project for example, could be used to display areas where there are pockets of wood with predicted density greater than or less than a specific value. These pockets would be described by combinations of age and ecosite; therefore, the performance of the model could be constrained by the reliability of interpretations of ecosites in the FRI. Nevertheless, this approach could be useful for supporting decisions about products with specification thresholds for raw materials (e.g., density ≥ 500 kg·m −3 ). Enhanced forest inventories with wood quality variables that are not normally included could allow for the industry to be responsive to market demands that are constantly changing and support value chain optimization [55].

Conclusions
The non-parametric rf-kNN model of forest stand age developed for black spruce stands in the boreal forest of Ontario from ALS structural information was useful for imputing accurate age estimates, which could be subsequently included as a derived variable in a given FRI system. This type of indirect predictive modeling of age creates estimates that can be obtained efficiently and consistently, given that the technique applies the same rules to the entire land base. An age map derived from the model could be used on an operational scale for a number of applications, including forest management planning or other predictive modeling projects that require mean forest age. Predictive modeling of wood quality attributes, using the derived age imputed from the rf-kNN approach, allowed for the expanded application of a model of mean wood density using a random forests approach. Descriptive wood quality maps can be developed from random forests simulations and used as planning tools to optimize the value chain of forest timber harvesting by selecting specific wood qualities to target prior to harvest. The efficiency with which wood density can be calculated from ALS data creates the opportunity for enhanced forest inventories to be more reactive to economic demands that are difficult to react to when they are based on more traditional FRI methods.