Machine Learning Approaches for Estimating Forest Stand Height Using Plot-Based Observations and Airborne LiDAR Data

Effective sustainable forest management for broad areas needs consistent country-wide forest inventory data. A stand-level inventory is appropriate as a minimum unit for local and regional forest management. South Korea currently produces a forest type map that contains only four categorical parameters. Stand height is a crucial forest attribute for understanding forest ecosystems that is currently missing and should be included in future forest type maps. Estimation of forest stand height is challenging in South Korea because stands exist in small and irregular patches on highly rugged terrain. In this study, we proposed stand height estimation models suitable for rugged terrain with highly mixed tree species. An arithmetic mean height was used as a target variable. Plot-level height estimation models were first developed using 20 descriptive statistics from airborne Light Detection and Ranging (LiDAR) data and three machine learning approaches—support vector regression (SVR), modified regression trees (RT) and random forest (RF). Two schemes (i.e., central plot-based (Scheme 1) and stand-based (Scheme 2)) for expanding from the plot level to the stand level were then investigated. The results showed varied performance metrics (i.e., coefficient of determination, root mean square error, and mean bias) by model for forest height estimation at the plot level. There was no statistically significant difference among the three mean plot height models (i.e., SVR, RT and RF) in terms of estimated heights and bias (p-values > 0.05). The stand-level validation based on all tree measurements for three selected stands produced varied results by scheme and machine learning used. It implies that additional reference data should be used for a more thorough stand-level validation to identify statistically robust approaches in the future. Nonetheless, the research findings from this study can be used as a guide for estimating stand heights for forests in rugged terrain and with complex composition of tree species.


Introduction
Sustainable forest management has been identified as a key part of one of the 17 Sustainable Development Goals from the 2015 United Nations Sustainable Development Summit.Others have mentioned the importance of sustainable forest management in association with climate change mitigation, biodiversity, water resources, and the sustainable supply of forest products [1][2][3][4][5].At the government level, effective sustainable forest management for broad forest areas requires country-wide Forests 2018, 9, 268 2 of 16 forest inventory data.A stand-level inventory is appropriate as a minimum unit for local and regional forest management.In South Korea, stand-level forest type maps have been produced at a national scale since 1972 to characterize the status of forests across the country.These maps have been widely used to support various tasks such as generating forest statistics, forest-land management, fostering forest resources, and forest protection.Since 2008, the scale of Korean forest type maps has been enhanced from 1:25,000 to 1:5000.The attributes of the forest type maps consist of tree species, age, diameter-at-breast-height (DBH), and crown density as categorical variables.Other nations have also been producing forest type maps.For example, forest services in the United States and Canada generate Vegetation Resources Inventory (VRI) maps with attributes recorded based on the interpretation of aerial photos and ground samples.The thematic maps can contain various parametric information such as cover pattern, crown closure, tree layer, vertical complexity, species composition, age, height, basal area, density of living trees, dead tree density, confidence indices, and data source codes.The maps provide these forest parameters as geospatial layers to facilitate monitoring of overall forest condition.Finland produces forest thematic maps using multi-source data, which include forest inventory data, satellite images, land-use maps, and digital elevation models.The thematic maps contain information such as biomass, crown closure, stand-level mean DBH, stand height, basal area, and soil productivity.Unlike the forest maps produced by other countries that include many properties, the forest type maps in South Korea contain only four categorical parameters.Therefore, there is a strong need to supplement the forest type maps of South Korea with additional forest information to enhance monitoring and management of forest ecosystems.Among forest attributes, forest stand height is a representative index of vertical forest structure that informs how an ecosystem works, including cycles of carbon, water and nutrients [6].In addition, it is a useful index to estimate forest stand quality, site index, forest volume, and aboveground biomass.
In the context of remote sensing, Light Detection and Ranging (LiDAR), an active sensor that can be used to obtain three-dimensional information of geographical features, has been widely used for forest characterization.Advances in LiDAR technology make it possible to consistently and semi-automatically extract forest parameters when compared to traditional photogrammetry whose data quality depends on the scale and resolution of photographs, and the experience and skills of the photographic interpreter.In particular, airborne LiDAR has been used as a robust data source for obtaining forest inventory data [7][8][9][10].The definition of canopy or stand height is a representative height of the tallest tree stratum in a forest patch or stand [11].Different from gridded forest patches, a forest stand consists of adjacent trees with same tree species, age and DBH class distribution, site quality, spatial arrangements, structure and conditions that are distinguished from neighboring communities.Estimation of forest stand height using LiDAR data can be conducted using two types of approaches: individual tree-based and plot-based approaches [12].The individual tree-based approach calculates forest stand heights by averaging tree heights extracted based on individual tree crown delineation [13,14].This approach works well over relatively flat and homogeneous forests, but may not work well for rugged or heterogeneous areas [7].The plot-based approach estimates stand heights with various descriptive statistics at the plot level.Most previous studies modeled forest stand heights based on the plot-based approach [15,16].Some studies used linear regression to relate field-surveyed stand/plot heights and LiDAR-based statistical variables.Means et al. [17] estimated ground-based stand heights with airborne LiDAR-derived metrics using stepwise regression analysis.The average maximum height, maximum height, average mean height, height percentiles, and canopy cover percentiles were used as independent variables.The model accuracy for all 19 plots showed coefficient of determination (R 2 ) of 0.93 and root mean squared error (RMSE) of 3.4 m; model accuracy for 11 old-growth plots showed an R 2 of 0.98 and RMSE of 1.7 m.Similarly, other studies used multiple regression analysis between field measurements and airborne laser data on young forests, mature forests with poor site quality, and mature forests with good site quality [18][19][20].The height percentiles, maximum values, mean values, and variations were used as independent variables with R 2 of about 0.8-0.95.Other recent studies estimated forest stand heights using linear regression with LiDAR data and optical satellite images.Donoghue and Watt [21] compared linear regression-based stand height models with three different datasets: airborne LiDAR, Landsat ETM+ and IKONOS.Wulder and Seemann [22] used an empirical model to estimate airborne LiDAR-derived stand heights from Landsat5 TM data after delineating forest stands with an image segmentation.Mora et al. [23,24] estimated LiDAR-based stand heights with high resolution satellite images (i.e., QuickBird-2, Worldview-1) using diverse regression techniques including linear regression, k-nearest neighbor, regression trees, and random forest.However, prior studies were generally carried out on non-rugged terrain with relatively analogous tree species composition, which makes it difficult to apply to mountainous forests in South Korea.Vega et al. [25] also described difficulty in extracting tree biophysical parameters where complex forest terrains affect height normalization, which causes distortions to the canopy height model (CHM).Khosravipour et al. [26] showed that tree top detection on slope-distorted CHMs is strongly influenced by individual tree crown shape, which usually depends on tree species.Korean forests have rugged terrain with small patches of tree species.Therefore, there is a need for developing an improved model to estimate forest stand height under such complicated conditions.
The primary goal of this research was to develop novel forest stand height estimation models for rugged terrain with highly mixed tree species, considering their operational use and adding a new attribute (i.e., stand height) to the Korean forest type maps.We extracted area-based descriptive statistics of forest structure from airborne LiDAR data and then estimated forest canopy heights at the plot level using three machine learning approaches-support vector regression, modified regression trees, and random forest.We also proposed two approaches for expanding the plot level model to estimate stand level forest height, which was validated using field survey data.

Study Area
The study area is located on the southwestern region of Mt.Maehwa in Gangwon-do, South Korea, and covers 2500 ha (centered at 37 • 38 N, 127 • 50 E) (Figure 1).The mountainous forest has been designated as a government-owned sustainable forest management model since 2008.The forest management model consists of four forest function types: timber production forest, forest for conservation of water, living environment conservation forest, and disaster prevention forest.The ratio of natural forest to planted forest is 54%.The dominant tree species in the study area are Pinus koraiensis Siebold & Zucc., Larix kaempferi (Lamb.)Carr., Betula platyphylla var.japonica, Pinus densiflora Siebold & Zucc., and varied broad-leaf species.The dominant low vegetation species are saplings of Quercus mongolica Fisch.ex Ledeb., Rhododendron mucronulatum Turcz., Corydalis remota Fisch.Ex Maxim., and Lindera obtusiloba Blume.
The study area is in the temperate mid-latitude zone, which is hot and humid in the summer influenced by the North Pacific air mass, and cold and dry in the winter influenced by the Siberian air mass.Compared to the east coast, the difference between summer and winter temperature at this inland mountainous area is large.The annual mean temperature is 10.1 • C, January mean temperature is −5.6 • C, and August mean temperature is 24 • C. The rainfall pattern shows topographic variation and the annual mean precipitation is 1291.3mm.

Data
Airborne LiDAR data were acquired during 4 to 17 September 2013 using a Leica ALS60 sensor (Leica Geosystems, Heerbrugg, Switzerland) (Table 1).The LiDAR data were collected with an average point density of 6.25 pts/m 2 and neighboring flight lines overlapped by about 70%.Digital surface models (DSM) and digital terrain models (DTM) with spatial resolution of 0.25 m grid cells were made using the triangulated surface models and raster conversion in TerraScan Software (version 013, Terrasolid, Helsinki, Finland).A canopy height model (CHM) was created as the difference between the DSM and DTM.We filtered the CHM with a 5 m threshold to distinguish dominant trees from other shrubs and grasses.

Data
Airborne LiDAR data were acquired during 4 to 17 September 2013 using a Leica ALS60 sensor (Leica Geosystems, Heerbrugg, Switzerland) (Table 1).The LiDAR data were collected with an average point density of 6.25 pts/m 2 and neighboring flight lines overlapped by about 70%.Digital surface models (DSM) and digital terrain models (DTM) with spatial resolution of 0.25 m grid cells were made using the triangulated surface models and raster conversion in TerraScan Software (version 013, Terrasolid, Helsinki, Finland).A canopy height model (CHM) was created as the difference between the DSM and DTM.We filtered the CHM with a 5 m threshold to distinguish dominant trees from other shrubs and grasses.Field data were measured at the plot level using circular plots with a radius of 11.3 m.Tree locations, tree heights, representative tree species, and DBH were measured for a total of 42 plots.The 42 plots were selected with stratified random sampling based on tree species DBH classes considering accessibility.For each of 42 plots, approximately 5610 elevation cells (standard deviation of 750 cells) with 0.25 m resolution were used to develop the plot height models after filtering shrubs and grasses (<5 m).There were 9 dominant tree species in the sample plots (in decreasing order): Larix kaempferi (32.40%),Pinus densiflora (29.70%),Pinus koraiensis (10.80%),Quercus mongolica (10.80%),Quercus variabilis Blume (8.10%), Pinus rigida Mill.(2.70%), Quercus dentata Thunb.(2.70%), Carpinus laxiflora (Siebold & Zucc.)Blume (1.40%), Platycarya strobilacea Siebold & Zucc.(1.40%).In addition, to validate the stand height estimation models, three small stands (Stands A, B and C) were selected considering the dominant tree species in the region and accessibility for sampling (Figure 1).We collected data for all dominant and co-dominant trees within the three stands.The location of individual trees was calculated using the azimuth and distance from the plot center to each tree.Tree heights were estimated using devices called Haglof Vertex and TruPulse.Tables 2 and 3 show statistics for the sample plots and stands including number of trees, mean DBH, mean tree height, and mean slope.The digital forest type map produced by the National Institute of Forest Science (NIFS) provides forest status information including land use, forest type, forest physiognomy, tree species, age class, DBH class, and crown cover.NIFS has produced 1:25,000-scale forest type maps since 1972, with the 5th edition of this map series produced from 2006 to 2010.In addition, NIFS produced the first 1:5000-scale forest type map from 2009 to 2013.The 1:5000-scale forest type map was used in this study, providing the forest type polygons shown in Figure 1.Among the attributes of the forest type map, forest physiognomy and tree species are distinguished by an image interpretation system combining digitized aerial photographs and field data.In the case of mixed forests, the mixed ratio is categorized into ten classes while undistinguishable forest physiognomy is differentiated through comparison with field data.The DBH is calculated using an empirical formula, which is based on the relationship between DBH and the diameter of a crown estimated from stereo images of aerial photographs.DBH values are then classified into four classes: under 6 cm, 6-16 cm, 18-28 cm, and over 30 cm.The age class is identified using the tree height matrix composed of age class and tree species that is optimized for the Korean forest environment.The age class for undesignated tree species in the tree height matrix is filled using similar tree species in the matrix.The crown cover is classified into three classes-under 50%, 50-70% and over 70%-through visual interpretation of aerial photographs.

Methods
Stand height is defined as the mean height or the top height of trees within a stand [12,27].There are several ways for stand height calculation, which are widely used for forest inventory.This study used the arithmetic mean height (Equation (1); preferable for forest inventory in South Korea) of dominant and co-dominant trees for calculating stand heights, where h i indicates the i-th dominant or co-dominant tree height, N is the number of dominant and co-dominant trees and H D is the arithmetic mean height.
The proposed forest stand height estimation models in this study consist of two parts: (1) the development of height estimation models at the plot level (Figure 2) and ( 2) expansion of the developed models from the plot level to the stand level (Figure 3).First, the forest plot height was empirically modeled using 20 input variables, which were extracted for the 42 field-surveyed plots using the CHM (Table 4).The variables have been commonly used for extraction of forest inventory data from LiDAR [10,12].As the target variable, arithmetic mean heights at the plot level were calculated from individual tree-based height data measured from field-surveyed plots using Equation (1).Machine learning approaches, which have been used for various purposes in remote sensing fields [28][29][30][31][32][33][34][35], were used to estimate the forest plot height from the 20 input variables.In forestry-focused remote sensing, machine learning techniques have been commonly applied to estimation of forest parameters such as biomass and leaf area index, tree species classification, and individual tree characterization [36][37][38].
In this study, among many kinds of machine learning algorithms available, support vector regression (SVR), modified regression trees (RT) and random forest (RF) were used for estimating forest height at the plot level.SVR is a regression version of widely used support vector machines (SVM), which is a supervised non-parametric statistical technique [39].SVR/SVM have been known to have powerful predictability, especially when training data are small [40].SVR tries to find a hyperplane to separate data to predict a target variable in the high-dimensional feature space transformed through a kernel function [41,42].The selection of a kernel function with appropriate parameters is crucial for successful results.In this study, SVR was performed by using the libsvm open source library (https://www.csie.ntu.edu.tw/~cjlin/libsvm/) in matlab [42,43].We tested several widely used kernel functions with parameter optimization using a grid search approach, and finally selected Radial Basis Function (RBF) for the kernel function of the SVR model with the optimum parameters (i.e., gamma of 0.001953125 and the penalty of 2048).
A typical regression tree partitions input data into relatively homogeneous groups at nodes through recursive binary splits to predict a target variable.In this study, a modified regression tree (RT) approach based on Cubist software (RuleQuest, Pomona, Australia) was used [44].Unlike typical regression trees having constant values at final nodes, RT produces rules and their associated multivariate regression equations to estimate the target variable.Rules generated from RT are not mutually exclusive, and results are averaged when multiple rules apply to unknown samples.RT provides information on the frequency of input variables as percentage used in the rules and multi-variate equations, which can be used as relative variable importance.The more a variable is used, the more important the variable.Because RT provides rules and multi-variate equations, it is much easier to understand the results when compared to the typical regression trees or RF [32].
RF is an ensemble approach based on classification and regression trees (CART) [45].RF adopts two randomization processes to produce numerous independent trees (i.e., CART) to overcome well-known limitations of CART (i.e., sensitivity of training data and overfitting).The two processes are based on bootstrapping sampling: using a random subset of training samples for each tree and a random subset of input variables for each node of a tree.The results from multiple trees (typically 500-1000) are then integrated to reach a solution through majority voting (or weighted voting) for Forests 2018, 9, 268 7 of 16 classification or averaging (or weighted averaging) for regression.Similar to RT, RF also provides relative variable importance information using out-of-bag (OOB) data, which are not used in training for each tree.When a variable is randomly perturbed from OOB data, the decrease of the accuracy is calculated.The larger the decrease of the accuracy (mean decrease accuracy in percentage is typically used) for a variable, the more important the variable.RF was carried out with the random forest package in R statistical software (http://www.r-project.org).The RF was trained with default settings (e.g., the number of variables per node = 6) except for the number of trees (i.e., 1000 trees).For more information about the three machine learning algorithms, please see [42,44,46].[10,47] 50th to 100th percentiles with the interval of 10 6 The machine learning models developed at the plot level were evaluated using the coefficient of determination (R 2 ) and the root mean square error (RMSE) through 10-fold cross-validation due to the small number of samples.In addition, cross-validation mean bias (MB) was calculated to evaluate the machine learning models.ANOVA tests were also conducted to see if there is any significant difference between the three machine learning models in terms of estimation performance.
Since the mean height models were developed at the plot level, there is a need to expand these machine learning-based plot height models into the stand-level.Naesset et al. [18][19][20] attempted to expand plot-level models (i.e., mean tree heights and dominant heights) by averaging mean plot heights in each stand.The mean plot heights were calculated using the plot-level model and aggregated to a pixel size similar to the field-survey plots.Previous expansion methods, such as that reported by Naesset et al. [18][19][20], were conducted on forest areas where the terrain is relatively flat and tree species are homogeneous within stands (i.e., Picea abies Karst.and Pinus sylvestris L.).However, such an approach would not work well to forests in South Korea because of the extreme variability of canopy heights due to the complex terrain and the highly mixed tree species in relatively small, patchy stands.In this study, we proposed two different schemes (Schemes 1 and 2) for expanding our plot-level empirical models into the stand-level to estimate stand heights (Figure 3).Scheme 1: A plot with a 11.3 m radius (same size as the surveyed plots) around the geometric center of a forest stand (i.e., a central plot) is generated.Then, input variables, listed in Table 4, are extracted from CHM within the central plot.Finally, the empirical plot-level models are applied to the extracted input variables to estimate forest stand heights.
Scheme 2: The 20 input variables (Table 4) are extracted from CHM within a forest stand.Unlike the height-based variables, crown geometric volume (CGV) variables should be calculated considering the different areal scales between the plot and stand levels.Thus, the ratio of area between a plot and a stand is multiplied to the stand-level CGV to match the spatial scales.The empirical plot-level models are then applied to estimate forest stand heights based on the extracted variables.
The estimated stand heights over the study site were compared by model and scheme using R 2 and root mean square difference (RMSD).ANOVA tests were also conducted to see if there is any significant difference between each pair of estimated stand heights.The models were finally validated against field-survey stand heights for three stands (A, B and C).[10,47] 50th to 100th percentiles with the interval of 10 6

Comparison of Plot-Level Height Models
The arithmetic mean height was modeled at the plot level using SVR, RT and RF (Figure 4).Three models were compared in terms of R 2 , calibration RMSE, cross-validation RMSE (CV RMSE), and cross-validation MB.The SVR model resulted in R 2 of 0.82, RMSE of 2.15 m, CV RMSE of 3.08 m, and CV MB of −0.24 m.The RT model yielded the lowest CV RMSE (2.43 m) compared to the other two models.On the other hand, the RF model showed the best calibration performance resulting in the highest R 2 value (0.93) and calibration RMSE (1.48 m) among the three machine learning algorithms.However, the CV RMSE was the largest for the RF model, which implies an overfitting of the model to the training data.RF is known not to be prone to overfitting especially when training sample size is large [10,45].The large difference between calibration and CV RMSEs by the RF model is possibly due to the small training sample size (42 plots).While the RT model showed similar results for calibration and cross-validation, it yielded the highest absolute value of CV MB, which indicates the underestimation of arithmetic mean heights at the plot level.The RF model produced the lowest absolute value of CV MB, which implies that the estimation was relatively less biased than the other two models.The ANOVA test results (all p values of the pairs of the models >0.05) showed that there was no significant difference between the three models in terms of the performance estimating arithmetic mean heights at the plot level.
Other researchers have attempted similar analysis under different forest conditions.Naesset [19] modeled dominant height and mean height at plot level dividing forests based on age (young vs.

Comparison of Plot-Level Height Models
The arithmetic mean height was modeled at the plot level using SVR, RT and RF (Figure 4).Three models were compared in terms of R 2 , calibration RMSE, cross-validation RMSE (CV RMSE), and cross-validation MB.The SVR model resulted in R 2 of 0.82, RMSE of 2.15 m, CV RMSE of 3.08 m, and CV MB of −0.24 m.The RT model yielded the lowest CV RMSE (2.43 m) compared to the other two models.On the other hand, the RF model showed the best calibration performance resulting in the highest R 2 value (0.93) and calibration RMSE (1.48 m) among the three machine learning algorithms.However, the CV RMSE was the largest for the RF model, which implies an overfitting of the model to the training data.RF is known not to be prone to overfitting especially when training sample size is large [10,45].The large difference between calibration and CV RMSEs by the RF model is possibly due to the small training sample size (42 plots).While the RT model showed similar results for calibration and cross-validation, it yielded the highest absolute value of CV MB, which indicates Forests 2018, 9, 268 9 of 16 the underestimation of arithmetic mean heights at the plot level.The RF model produced the lowest absolute value of CV MB, which implies that the estimation was relatively less biased than the other two models.The ANOVA test results (all p values of the pairs of the models >0.05) showed that there was no significant difference between the three models in terms of the performance estimating arithmetic mean heights at the plot level.
Other researchers have attempted similar analysis under different forest conditions.Naesset [19] modeled dominant height and mean height at plot level dividing forests based on age (young vs. mature) and site quality (poor vs. good).Naesset [19] found that the mean differences between the predicted and field-surveyed dominant height data ranged from −0.01 m to 0.03 m and their standard deviations were 1.27 m-1.54 m.Yu et al. [16] compared forest attributes with area-based and individual tree-based methods at plot level.Their mean height with area-based method showed relative RMSE as 6.42% and R 2 of 0.94.Although these results seem to be better than our results, the terrain in our study is much more rugged and the plots have more diverse tree species.For example, the elevation ranges are 70-120 m for [19] and 125-185 m for [16], while 145-751 m for our study site with the maximum slope of 88.2 • .In addition, even the relatively homogeneous coniferous plots in our study site include a few deciduous trees, which made it difficult to estimate the canopy height at the plot level.Donoghue and Watt [21] found that a mixture of coniferous and deciduous trees within plots or stands degraded the accuracy of forest canopy height estimation.
We also analyzed the input variable importance for RF and RT (Figure 5).The two machine learning algorithms used the same input variables for modeling.The RF model showed a similar pattern of variable importance for estimating arithmetic mean height.The 100th percentile, 90th percentile, 80th percentile, kurtosis and variance of height highly contributed to the RF models, but volume-related variables did not influence the models.The RT model used 10th percentile, 30th percentile, 70th percentile of height, and kurtosis for developing the arithmetic mean height model.Interestingly, for both RF and RT models, height-based variables contributed more than volume-based variables, and the variance of height was one of the most contributing variables.This could be explained through the work of Montealegre et al. [48], whose second figure shows the vertical distribution of LiDAR point data of Aleppo pines under three height groupings: short, medium and tall.They reported that the variance of the vertical distribution is larger for the taller Aleppo pines, which implies that the variance of height is closely related to canopy height at a plot or stand level.
Forests 2018, 9, x FOR PEER REVIEW 9 of 15 our study site include a few deciduous trees, which made it difficult to estimate the canopy height at the plot level.Donoghue and Watt [21] found that a mixture of coniferous and deciduous trees within plots or stands degraded the accuracy of forest canopy height estimation.We also analyzed the input variable importance for RF and RT (Figure 5).The two machine learning algorithms used the same input variables for modeling.The RF model showed a similar pattern of variable importance for estimating arithmetic mean height.The 100th percentile, 90th percentile, 80th percentile, kurtosis and variance of height highly contributed to the RF models, but volume-related variables did not influence the models.The RT model used 10th percentile, 30th percentile, 70th percentile of height, and kurtosis for developing the arithmetic mean height model.Interestingly, for both RF and RT models, height-based variables contributed more than volumebased variables, and the variance of height was one of the most contributing variables.This could be explained through the work of Montealegre et al. [48], whose second figure shows the vertical distribution of LiDAR point data of Aleppo pines under three height groupings: short, medium and tall.They reported that the variance of the vertical distribution is larger for the taller Aleppo pines, which implies that the variance of height is closely related to canopy height at a plot or stand level.

Inter-Comparison of Stand-Level Height Models
We evaluated the relative performance of our models in terms of the three machine learning algorithms and two expansion methods (i.e., Schemes 1 vs. 2) using 575 stands in the study area.Figure 6 shows the comparison between the machine learning approaches when the expansion method is fixed.Stand height models of arithmetic mean height (Figure 6) show an approximately linear relationship between SVR, RT and RF.The estimated stand heights with the SVR-based model were slightly higher than those with RT and RF (Figure 6a-d).The RT model produced relatively higher arithmetic mean heights than the RF model, and the correlation between the heights by the two models (Figure 6e,f) was not higher than those by the other comparisons by scheme (Figure 6a-d).RMSDs between the models for Scheme 1 were relatively smaller than those for Scheme 2, which implies that Scheme 1 might be more stable than Scheme 2. ANOVA test results (p-values > 0.05) show that the estimated arithmetic mean heights at the stand level between the three models by scheme are not significantly different except for the pair between the RT and RF models with Scheme 2 (p-value = 0.01) (Figure 6f).
Based on forest reports for the Maehwa mountain region and communications with forest managers, it is expected that the typical range of dominant tree heights is between 10 m and 35 m.Although it is not possible to quantitatively assess all of the stand-level heights estimated using the proposed models, the SVR-based models tend to over-estimate the stand heights (Figure 6).
Since we trained the machine learning algorithms with plot-level input variables, there is a need to compare whether the choice of expansion method impacts the estimation of stand heights.We explored the difference in the results of the two different expansion schemes when the same machine learning algorithm and stand height type were used.Most of the pairwise comparisons (Figure 7) show low R 2 values, which indicate that there are weak relationships between Schemes 1 and 2. In addition, the slopes of trend lines in Figure 7 are smaller than 1, suggesting that the estimated stand heights from Scheme 1 tend to be slightly higher than those from Scheme 2. ANOVA test results (p-values < 0.01) show that the estimated stand heights are significantly different between the two schemes regardless of the model used, which might indicate that one scheme is more reliable than the other.The comparison between the schemes is further examined using the comprehensive validation for the three selected stands shown in the next subsection.
scheme are not significantly different except for the pair between the RT and RF models with Scheme 2 (p-value = 0.01) (Figure 6f).
Based on forest reports for the Maehwa mountain region and communications with forest managers, it is expected that the typical range of dominant tree heights is between 10 m and 35 m.Although it is not possible to quantitatively assess all of the stand-level heights estimated using the proposed models, the SVR-based models tend to over-estimate the stand heights (Figure 6).Since we trained the machine learning algorithms with plot-level input variables, there is a need to compare whether the choice of expansion method impacts the estimation of stand heights.We explored the difference in the results of the two different expansion schemes when the same machine learning algorithm and stand height type were used.Most of the pairwise comparisons (Figure 7) show low R 2 values, which indicate that there are weak relationships between Schemes 1 and 2. In addition, the slopes of trend lines in Figure 7 are smaller than 1, suggesting that the estimated stand heights from scheme 1 tend to be slightly higher than those from Scheme 2. ANOVA test results (p-values < 0.01) show that the estimated stand heights are significantly different between the two schemes regardless of the model used, which might indicate that one scheme is more reliable than the other.The comparison between the schemes is further examined using the comprehensive validation for the three selected stands shown in the next subsection.Since we trained the machine learning algorithms with plot-level input variables, there is a need to compare whether the choice of expansion method impacts the estimation of stand heights.We explored the difference in the results of the two different expansion schemes when the same machine learning algorithm and stand height type were used.Most of the pairwise comparisons (Figure 7) show low R 2 values, which indicate that there are weak relationships between Schemes 1 and 2. In addition, the slopes of trend lines in Figure 7 are smaller than 1, suggesting that the estimated stand heights from scheme 1 tend to be slightly higher than those from Scheme 2. ANOVA test results (p-values < 0.01) show that the estimated stand heights are significantly different between the two schemes regardless of the model used, which might indicate that one scheme is more reliable than the other.The comparison between the schemes is further examined using the comprehensive validation for the three selected stands shown in the next subsection.

Validation of Forest Stand Height Models
Field surveys of forest stands are typically time-consuming and require huge efforts over large areas.We investigated three relatively small forest stands for exhaustive validation.Each stand height was calculated using individual tree heights.We additionally compared the proposed approaches to calculation of stand heights based on individual tree crown delineation.For delineating individual tree crowns, we used the algorithm proposed in Fang et al. [7], which was developed for the same study area.Table 5 summarizes the mean difference and the mean absolute

Validation of Forest Stand Height Models
Field surveys of forest stands are typically time-consuming and require huge efforts over large areas.We investigated three relatively small forest stands for exhaustive validation.Each stand height was calculated using individual tree heights.We additionally compared the proposed approaches to calculation of stand heights based on individual tree crown delineation.For delineating individual tree crowns, we used the algorithm proposed in Fang et al. [7], which was developed for the same study area.Table 5 summarizes the mean difference and the mean absolute difference between forest stand height models (i.e., SVR, RT and RF) and ground reference data by scheme.The stand height model with RT and Scheme 1 shows the lowest mean difference, and the model with SVR and Scheme 2 shows the lowest mean absolute difference among the models for arithmetic mean height.The models with the lowest mean difference and mean absolute difference of estimated stand heights are also slightly lower than individual tree-based stand heights.This result is somewhat consistent with Yu et al. [16] who showed relatively better accuracy for estimating mean height using an individual tree-based approach than the area-based approach with airborne LiDAR data with average LiDAR density of 2.6 pts/m 2 .However, Fang et al. [7] pointed out that the proposed crown delineation approach used in this study worked better for relatively flat (e.g., slope < 20 degrees) plots while having larger errors for rugged plots.Considering that the three stands used for validation have moderately rugged terrain (with mean slopes of 16-25 degrees; Table 3), it seems reasonable that both the individual tree-based method and some of the proposed approaches (e.g., Scheme 1 with RT) produced similar results for the three stands.The individual tree-based approach resulted in high errors in estimating canopy heights for 42 plots, especially those with rugged terrain often resulting in overestimation up to 8.73 m.However, since only three stands (i.e., small sample size) were used for stand-level validation of the schemes and models, it is not possible to draw a conclusion from this case study.More stand-level samples with rugged terrain should be used for a thorough validation.
It is challenging to directly compare the results to other studies presented in the literature not only because of the influence that different environmental characteristics have on the results, but also because of the differences in data being applied.Nonetheless, our results are similar to previous stand-level height estimation.For example, Ahmed et al. [49] modeled forest heights at the stand level with Landsat time series images and airborne LiDAR data using multiple regression and RF.Their RF models resulted in R 2 of 0.88, RMSE of 2.39 m for mature forests, R 2 of 0.79, RMSE of 3.52 m for young forests, and R 2 of 0.82, RMSE of 3.17 m for combined forests.
Table 5. Mean difference and mean absolute difference between stand height models and field-survey stand heights (SVR: support vector regression, RT: regression tree, RF: random forests, Individual tree-based: average of individual tree heights extracted based on the crowns delineated using the method in [7], Scheme 1: central plot-based expansion, Scheme 2: stand-based expansion).

Potentials and Limitations
The novelty of this study is that the stand height models were developed for areas with very rugged terrain composed of highly mixed tree species.In particular, the comparison of the proposed two expansion methods (Schemes 1 and 2) gives us an insight on how to conduct stand-level forest height estimation from plot-level data and models for an operational purpose.The stand height map for the whole study area is depicted in Figure 8a.Few prior studies verified forest stand heights with field-survey data [18][19][20] and the method used by those studies was to calculate field-survey stand heights using sample plots systematically distributed in each stand.By contrast, we examined all trees within three stands and used them for a stand-level validation of the proposed approaches.
However, this study has several limitations.First of all, due to small sample size in both plotand stand-level analyses (42 and 3, respectively), statistically robust approaches were not identified.A more thorough plot-and stand-level validation should be conducted with more samples to further improve the approaches.In addition, the generalization capability of the proposed approaches should be further investigated for different forested areas.Based on the testing performed, it is not possible to say that the proposed approaches would work for forests in the US or China.
A forest stand is defined as an adjoining community which has similar species composition, structure, age class, size class, and site qualities.However, since forests in South Korea are very heterogeneous, many deciduous trees are often found in coniferous stands and vice versa, which makes it difficult to accurately estimate stand heights.In addition, it should be noted that field-surveyed tree heights contain measurement errors especially for dense forests with rugged terrain.

Conclusions
This study used airborne LiDAR data to estimate stand heights for forests with rugged terrain and highly mixed tree species.To calculate stand heights, the arithmetic mean height was used.Three machine learning algorithms were used for empirical modeling based on 20 statistical variables at plot level.Then two expansion schemes (i.e., central plot-based expansion and stand-based expansion) from the plot level to the stand level were examined.Plot-level models using SVR, RT and RF algorithms were first developed with the results showing varied performance metrics (i.e., coefficient of determination, root mean square error, and mean bias) by model for forest height estimation at the plot level.There was no statistically significant difference in performance among the three machine learning models based on the ANOVA test results (p-values > 0.05).The standlevel validation using all tree measurements for the three selected stands showed varied results by scheme and model used, which implies that more rigorous validation is required with more samples.Thus, it is not possible to suggest a statistically robust method among the tested approaches that can be used to quantify stand heights in forests with rugged terrain in South Korea.The proposed approaches should be further investigated with more samples over other areas with complex forest and terrain conditions so that they can be considered for use to operationally update stand-level forest inventories.
Author Contributions: J.L. led manuscript writing and contributed to the data analysis and research design.

Conclusions
This study used airborne LiDAR data to estimate stand heights for forests with rugged terrain and highly mixed tree species.To calculate stand heights, the arithmetic mean height was used.Three machine learning algorithms were used for empirical modeling based on 20 statistical variables at plot level.Then two expansion schemes (i.e., central plot-based expansion and stand-based expansion) from the plot level to the stand level were examined.Plot-level models using SVR, RT and RF algorithms were first developed with the results showing varied performance metrics (i.e., coefficient of determination, root mean square error, and mean bias) by model for forest height estimation at the plot level.There was no statistically significant difference in performance among the three machine learning models based on the ANOVA test results (p-values > 0.05).The stand-level validation using all tree measurements for the three selected stands showed varied results by scheme and model used, which implies that more rigorous validation is required with more samples.Thus, it is not possible to suggest a statistically robust method among the tested approaches that can be used to quantify stand heights in forests with rugged terrain in South Korea.The proposed approaches should be further investigated with more samples over other areas with complex forest and terrain conditions so that they can be considered for use to operationally update stand-level forest inventories.

Figure 1 .
Figure 1.Study area located on the southwestern part of Mt.Maehwa, Gangwon-do, South Korea.The right-bottom figure shows the field-survey plots (cyan points) with a radius of 11.3 m and forest type map (black line and gray line) with the digital terrain model (DTM) as a background.In addition to 42 field-surveyed plots, the two right-top figures show three field-surveyed forest stands denoted as 'A', 'B' and 'C', where all individual canopy trees were measured within the stands.

Figure 1 .
Figure 1.Study area located on the southwestern part of Mt.Maehwa, Gangwon-do, South Korea.The right-bottom figure shows the field-survey plots (cyan points) with a radius of 11.3 m and forest type map (black line and gray line) with the digital terrain model (DTM) as a background.In addition to 42 field-surveyed plots, the two right-top figures show three field-surveyed forest stands denoted as 'A', 'B' and 'C', where all individual canopy trees were measured within the stands.

Figure 2 .
Figure 2. The flow diagram of the proposed arithmetic height estimation models at the plot level.Figure 2. The flow diagram of the proposed arithmetic height estimation models at the plot level.

Figure 2 . 15 Figure 3 .
Figure 2. The flow diagram of the proposed arithmetic height estimation models at the plot level.Figure 2. The flow diagram of the proposed arithmetic height estimation models at the plot level.Forests 2018, 9, x FOR PEER REVIEW 8 of 15

Figure 3 .
Figure 3.The flow diagram of the proposed two schemes to expand the plot-level height estimation models to the stand level.The 20 predictor variables, extracted for Schemes 1 and 2, were applied to the models developed based on 42 plots (Figure 2).Scheme 1 extracted 20 variables from the canopy height model (CHM) in a circular plot (radius = 11.3 m) with the geometric center of a forest stand.Scheme 2 extracted the variables from CHM within a forest stand.The 1:5000 scale digital map of forest stands was provided as polygon data by National Institute of Forest Service.

Figure 4 .
Figure 4. Plot-level empirical model calibration and cross-validation results using three machine learning algorithms.Low vegetation under 5 m were removed for calculating input variables.R 2 means the coefficient of determination, RMSE means the root mean square error, CV RMSE means the 10-fold cross-validation RMSE, and CV MB indicates the mean bias of 10-fold cross-validation.(a) Arithmetic mean height model with SVR (b) Arithmetic mean height model with RT (c) Arithmetic mean height model with RF.

Figure 4 .
Figure 4. Plot-level empirical model calibration and cross-validation results using three machine learning algorithms.Low vegetation under 5 m were removed for calculating input variables.R 2 means the coefficient of determination, RMSE means the root mean square error, CV RMSE means the 10-fold cross-validation RMSE, and CV MB indicates the mean bias of 10-fold cross-validation.(a) Arithmetic mean height model with SVR (b) Arithmetic mean height model with RT (c) Arithmetic mean height model with RF.

Figure 4 .
Figure 4. Plot-level empirical model calibration and cross-validation results using three machine learning algorithms.Low vegetation under 5 m were removed for calculating input variables.R 2 means the coefficient of determination, RMSE means the root mean square error, CV RMSE means the 10-fold cross-validation RMSE, and CV MB indicates the mean bias of 10-fold cross-validation.(a) Arithmetic mean height model with SVR (b) Arithmetic mean height model with RT (c) Arithmetic mean height model with RF.

Figure 5 .
Figure 5. Relative variable importance identified by the RT and RF models for estimating arithmetic mean height at the plot level.

Figure 5 .
Figure 5. Relative variable importance identified by the RT and RF models for estimating arithmetic mean height at the plot level.

Forests 2018, 9 , 15 Figure 6 .
Figure 6.Comparison of support vector regression (SVR), regression tree (RT) and random forest (RF) models for estimating: (a, c and e) arithmetic mean height with Scheme 1 and (b, d and f) arithmetic mean height with Scheme 2, respectively.

Figure 6 .
Figure 6.Comparison of support vector regression (SVR), regression tree (RT) and random forest (RF) models for estimating: (a,c,e) arithmetic mean height with Scheme 1 and (b,d,f) arithmetic mean height with Scheme 2, respectively.

Forests 2018, 9 , 15 Figure 6 .
Figure 6.Comparison of support vector regression (SVR), regression tree (RT) and random forest (RF) models for estimating: (a, c and e) arithmetic mean height with Scheme 1 and (b, d and f) arithmetic mean height with Scheme 2, respectively.

Figure 7 .
Figure 7.Comparison of Scheme 1 and Scheme 2 for computing (a) arithmetic mean height with SVR algorithm, (b) arithmetic mean height with RT, and (c) arithmetic mean height with RF.

Figure 7 .
Figure 7.Comparison of Scheme 1 and Scheme 2 for computing (a) arithmetic mean height with SVR algorithm, (b) arithmetic mean height with RT, and (c) arithmetic mean height with RF.

Figure 8 .
Figure 8. Maps for the study area: (a) arithmetic mean height with RT model + Scheme 1, (b) digital elevation model (DEM), and (c) canopy height model (CHM).

Table 1 .
Specifications and acquisition parameters of the airborne Light Detection and Ranging (LiDAR) sensor used in this research.

Table 1 .
Specifications and acquisition parameters of the airborne Light Detection and Ranging (LiDAR) sensor used in this research.

Table 2 .
Statistics from field-survey of 42 plots.

Table 3 .
Statistics from field-survey of three stands.Stand C has much lower canopy density than stand A although they have similar mean diameter at breast height (DBH) and stand size.It is because more than 50% of stand C consist of grasses.

Table 4 .
Input variables of machine learning techniques for estimating forest stand heights.

Table 4 .
Input variables of machine learning techniques for estimating forest stand heights.