Spatial Autoregressive Models for Stand Top and Stand Mean Height Relationship in Mixed Quercus mongolica Broadleaved Natural Stands of Northeast China

The relationship of stand top and stand mean height is important for forest growth and yield modeling, but it has not been explored for natural mixed forests. Observations of stand top and stand mean height can present spatial dependence or autocorrelation, which should be considered in modeling. Simultaneous autoregressive (SAR) models, including spatial lag model (SLM), spatial Durbin model (SDM) and spatial error model (SEM), within nine spatial weight matrices were utilized to model the stand top and stand mean height relationship in the mixed Quercus mongolica Fisch. ex Ledeb. broadleaved natural stands of Northeast China, using ordinary least squares (OLS) as a benchmark model. The results showed that there was a high linear relationship between stand top and stand mean height and that there was a positive spatial autocorrelation pattern in model residuals of OLS. Moreover, SEM and SDM performed better than OLS in terms of reducing the spatial dependence of model residuals and model fitting, regardless of which spatial weight matrix was used. SEM was better than SDM. SLM scarcely reduced the spatial autocorrelation of model residuals. Among nine spatial matrices in SEM, rook contiguous matrix performed best in model fitting, followed by inverse distances raised to the second power (1/d2) and local statistics model matrix (LSM).


Introduction
Forest site productivity remains an essential concern in forestry because of its significant role in forest resource availability [1]. Estimating site productivity is crucial to predict forest growth and yield, as well as to maintain sustainable management of forest resources [2,3]. Site index, which is commonly defined as the mean height of dominant and codominant trees of a stand (hereafter referred to as stand top height) at a pre-specified age [4][5][6], has been accepted and widely used as the most expedient indicator of forest site productivity in Germany and North America since around the end of the 19th century because of its simple calculation and efficacious prediction [3,6,7]. Additionally, stand top height is less sensitive to thinning and allows more stability of assessment (e.g., [6,8,9]), and invariance with stand density (e.g., [1,10,11]).

Data
Data were collected from a total of 300 permanent sample plots (PSPs) of natural stands located on the Tazigou state-owned forest farm of Wangqing Forest Bureau in northeast China (129.971˝-130.222˝E, 43.327˝-43.492˝N), which is part of Changbai Mountain flora and has a monsoon climate with dry, windy springs and warm, wet summers. The mean annual temperature is approximately 3.9˝C. Mean annual rainfall is 600-700 mm, occurring primarily in the summer. The soil is classified as dark brown forest soil that formed under the combined influences of heat and moisture in mixed forests, and its mean thickness is approximately 40 cm. One-hundred and ninety-nine PSPs were mixed Quercus mongolica broadleaved natural stands and selected for study data. Quercus mongolica in all selected PSPs had the largest basal area proportion; its mean proportion was 46.2%, and its range was from 21% to 64%. In each selected PSP, the major species were Quercus mongolica, Betula platyphylla Sukaczev, Pinus koraiensis Siebold et Zucc., Populus ussuriensis Kom., Acer pictum Thunb. ex Murray and Tilia amurensis Rupr., whereas the minor species included Larix gmelinii (Rupr.) Kuzen., Betula dahurica Pall., Fraxinus mandshurica Rupr., Phellodendron amurense Rupr., Abies nephrolepis (Trautv.) Maxim., Picea asperata Mast. and Juglans mandshurica Maxim. All of the PSPs were established within 12 one hectare large plots in 2013 by the Research Institute of Forest Resources Information Techniques, Chinese Academy of Forestry, and were square in shape (20ˆ20 m). All of the standing live trees in each PSP with a diameter at breast height (dbh) equal to or larger than 5.0 cm were marked, tagged, and recorded by species. The total height and dbh of all trees were measured using a Vertext IV electronic measuring device and diameter tape, respectively. Geographic coordinates at the center point of each PSP were acquired using a recreation-grade GPS receiver (Garmin International Inc. Olathe, KS, USA). Summary statistics of stand variables are presented in Table 1.

Stand Top and Stand Mean Height Relationship
The typical top height definition is represented by a specific number of the largest (by height or diameter) trees per unit area represent, e.g., the mean height of the 100 largest trees per hectare [51][52][53]. Regardless of the particular definition of top height employed, the modeling principles involved in projecting stand top height remain essentially the same [53]. Meanwhile, the mean height of the 100 tallest trees per hectare is widely used in China [54] and was applied in this study. Therefore, stand top height was calculated as the arithmetic mean of the four tallest trees according to the 100 tallest trees per hectare, and the selection of the tallest trees did not take into account the species to circumvent the limitation [55] that site index models are difficult to apply in mixed stands, where height estimates of dominant trees must be available for each species in a stand [56]. Stand mean height was determined using Lorey mean height [2,9]. Summary statistics of these two stand variables are presented in Table 1.
A scatterplot of stand top height against stand mean height in 199 selected PSPs is linear in shape (Figure 1), with a high correlation coefficient of 0.847, which was obtained with the function "cor" of

Spatial Weight Matrix
Nine spatial weight matrices, including two contiguous neighbors, three inverse distances raised to some power, three geostatistical matrices and LSM, were used in SLM, SEM and SDM.

Contiguous Neighbors
Contiguous neighbors include rook contiguity, queen contiguity, Delaunay triangulation neighbors, sphere of influence neighbors, Gabriel graph neighbors and relative graph neighbors. Rook and queen contiguities are commonly used for regular location, whereas the rest of the contiguous neighbors are for irregular location [61]. In this study, rook and queen contiguities were used for our regular study plot. Rook contiguity has four neighbors at each central location in the cardinal directions, and queen contiguity has eight neighbors at each central location in all directions ( Figure 2). In Figure 2

Spatial Weight Matrix
Nine spatial weight matrices, including two contiguous neighbors, three inverse distances raised to some power, three geostatistical matrices and LSM, were used in SLM, SEM and SDM.

Contiguous Neighbors
Contiguous neighbors include rook contiguity, queen contiguity, Delaunay triangulation neighbors, sphere of influence neighbors, Gabriel graph neighbors and relative graph neighbors. Rook and queen contiguities are commonly used for regular location, whereas the rest of the contiguous neighbors are for irregular location [61]. In this study, rook and queen contiguities were used for our regular study plot. Rook contiguity has four neighbors at each central location in the cardinal directions, and queen contiguity has eight neighbors at each central location in all directions ( Figure 2). In Figure 2, cell 0 denotes the central location, cell 1, cell 2, cell 3 and cell 4 denote neighbors of the Their formulas are as follows [35,62]: where Equation (5) is rook contiguity, Equation (6) is queen contiguity, and w ij is the spatial weight value of the central location i and its neighbor j. Rook and queen contiguities were constructed with the "spdep" package in R [60].

Inverse Distances
Three of the inverse distances were selected, namely, inverse distances raised to one power (1/d), inverse distances raised to the second power (1/d 2 ) and inverse distances raised to the fifth power (1/d 5 ). The specific equation is as follows [35,49]: where dij is the distance between the central location i and its neighbor j, n = 1 is for 1/d, n = 2 is for 1/d 2 , and n = 5 is for 1/d 5 . Three inverse distances were constructed with the "spdep" package in R [60].

Geostatistical Matrix
The autocorrelations of spherical variogram, Gaussian variogram and exponential variogram were defined as the geostatistical matrix under the presumption of intrinsic stationarity [49] and these three geostatistical matrices are currently most commonly used [44,63]. Equation (8) is for spherical variogram (Spher), Equation (9) is for Gaussian variogram (Gaus) and Equation (10) is for exponential variogram (Exp). 3 3 ij where r is the range, the remaining terms are as shown in the above equations. Spher, Gaus and Exp were constructed with the "nlme" package in R [64].

Local Statistics Model Matrix
Local statistics model matrix (LSM) was proposed by Getis and Aldstadt [49] based on the Gi * local statistic [65,66]. A positive Gi * indicates that there is high-value clustering around observation I, whereas a negative Gi * indicates low-value clustering. These Gi * values accumulate with increasing

Inverse Distances
Three of the inverse distances were selected, namely, inverse distances raised to one power (1/d), inverse distances raised to the second power (1/d 2 ) and inverse distances raised to the fifth power (1/d 5 ). The specific equation is as follows [35,49]: where d ij is the distance between the central location i and its neighbor j, n = 1 is for 1/d, n = 2 is for 1/d 2 , and n = 5 is for 1/d 5 . Three inverse distances were constructed with the "spdep" package in R [60].

Geostatistical Matrix
The autocorrelations of spherical variogram, Gaussian variogram and exponential variogram were defined as the geostatistical matrix under the presumption of intrinsic stationarity [49] and these three geostatistical matrices are currently most commonly used [44,63]. Equation (8) is for spherical variogram (Spher), Equation (9) is for Gaussian variogram (Gaus) and Equation (10) is for exponential variogram (Exp). where r is the range, the remaining terms are as shown in the above equations. Spher, Gaus and Exp were constructed with the "nlme" package in R [64].

Local Statistics Model Matrix
Local statistics model matrix (LSM) was proposed by Getis and Aldstadt [49] based on the G i * local statistic [65,66]. A positive G i * indicates that there is high-value clustering around observation I, whereas a negative G i * indicates low-value clustering. These G i * values accumulate with increasing distance until these values never rise with distance, implying that any continuity in spatial autocorrelation over distance ends at that distance, which is regarded as the critical distance (d c ), whereas d c is merely an empirical value because there is no statistical test involved. The specific formula is as follows: is the G i * score at d ij , and G i *(0) is the G i * score for ith central observation only. The response variable stand top height has been used as attributed for SLM and SDM, and the model residuals of the OLS model have been used as attributed for SEM to create LSM [44]. LSM was written and constructed based on the "spdep" package in R [60]. All of the spatial weight matrices were row standardized with coding scheme "W" in R [57].

Test Spatial Autocorrelation of Model Residuals
We selected Moran's I to test the spatial autocorrelation of model residuals. The formula is as follows:

12)
where I denotes the Moran's I of model residuals, n is the number of selected PSPs, y i and y j are the model residuals y at PSPs i and j, y is the mean of all the model residuals y, and w ij is the spatial weight matrix between PSPs i and j. The expected value of I under the null hypothesis of no autocorrelation is not equal to zero but given by I 0 =´1/(n´1). If I is significantly (α = 0.01) greater than I 0 , then model residuals y presents positive autocorrelation; otherwise, y presents negative autocorrelation [67]. Moran's I of model residuals was measured with the function "moran.test" of the "spdep" package in the R statistical language to perform this analysis [60].

Model Fitting
The spatial autocorrelation patterns of OLS, SLM, SDM and SEM model residuals were compared using correlograms. To identify the best fitting model among OLS, SLM, SDM and SEM, we selected three statistics, i.e., coefficient of determination (R 2 ), root mean square error (RMSE) and Akaike information criterion (AIC), as the approximation criteria. Both R 2 and RMSE biased only for invalid models with dependent error terms because of these two statistics underlying the assumption of independent observations [37]. AIC can correspond to other goodness of fit measures based on the likelihood function, which does not underlie the assumption of independent error terms [63] and is also heavily influenced by the number of independent variables, penalizing formulations with more independent variables than those with fewer independent variables [49].

Relationship between Stand Top and Stand Mean Height
The stand top and stand mean height relationship was modeled by Equation (1). The model coefficients were estimated with OLS, and the coefficients were tested with the T-test. The estimates of the intercept coefficient β 0 and the slope coefficient β 1 were 3.121 (p < 0.001) and 1.043 (p < 0.001), respectively. Their standard errors of β 0 and β 1 were 0.690 and 0.047, respectively. Table 2 shows the Moran's I of model residuals and a p-value of two-sided with function "moran.test" of the "spdep" package [60] for three SAR models (i.e., SLM, SDM and SEM) against the base model (i.e., OLS) within nine spatial weight matrices. Model residuals of OLS showed a positive spatial autocorrelation pattern because all of Moran's I were greater than I 0 (I 0 =´1/(199´1) =´0.00505) and were significant (p < 0.01) regardless of which spatial weight matrix was used. SEM was the best spatial model to reduce the spatial autocorrelation for the smallest Moran's I of model residuals and the highest p value, followed by SDM, for each spatial weight matrix, except the 1/d matrix; however, the difference of the 1/d matrix between SDM and SEM was trivial. LSM was the best spatial weight matrix for its Moran's I closest to I 0 and its p value up to 0.9983, followed by the Spher geostatistical matrix and queen contiguous matrix, whereas the 1/d matrix was the worst. The Moran's I of SLM residuals was close to that of OLS, implying that SLM almost did a poor job of reducing the spatial dependence because the p-values of seven spatial weight matrices were far less than 0.01, except for 1/d 5 and Gaus.

Spatial Correlogram of Model Residuals
The spatial correlograms of OLS, SLM, SDM and SEM model residuals were constructed with the "pgirmess" package in R [68]. Figures 3 and 4 show the spatial correlograms of model residuals and their p-values for the three SAR models with nine spatial weight matrices and the OLS model. Moran's I of model residuals were a downtrend, and their p-values were an uptrend with increasing distance class in all models. When distance class increased from 400 m to 500 m, Moran's I decreased the most, and the p value showed the largest increase. The difference between Moran's I and I 0 was not significant (p > 0.1) after the distance class of 500 m, indicating that there was no spatial autocorrelation among PSPs when the distance between PSPs was larger than 500 m. Moreover, when the distance class was 250 m, different spatial weight matrices had different performances on reducing spatial dependence in SDM and SEM, whereas there were similar performances with all spatial weight matrices in SLM. LSM was the best spatial matrix to reduce spatial dependence for the lowest Moran's I distance class in all models. When distance class increased from 400 m to 500 m, Moran's I decreased the most, and the p value showed the largest increase. The difference between Moran's I and I0 was not significant (p > 0.1) after the distance class of 500 m, indicating that there was no spatial autocorrelation among PSPs when the distance between PSPs was larger than 500 m. Moreover, when the distance class was 250 m, different spatial weight matrices had different performances on reducing spatial dependence in SDM and SEM, whereas there were similar performances with all spatial weight matrices in SLM. LSM was the best spatial matrix to reduce spatial dependence for the lowest Moran's I and the highest p value, whereas the queen contiguity matrix was the worst in SDM and SEM (Figures 3 and 4).

Model Fitting
Two SAR models (SDM and SEM) fitted the relationship of stand top and stand mean height better than OLS according to the three criteria, i.e., R 2 , RMSE and AIC, no matter which spatial weight matrix was used. However, SLM was not always better than OLS, which was attributed to AIC within different spatial weight matrices, although the R 2 and RMSE of SLM were better than those of OLS (Table 3). SEM was slightly better than SDM in AIC. SEM with the rook contiguous matrix is the best model fitting for its highest R 2 , smallest RMSE and lowest AIC, followed by SEM with the 1/d 2 matrix and SEM with LSM (Table 3). Additionally, when the combination of spatial model incorporated with spatial weight matrix resulted in Moran's I not being significant (p > 0.1), the R 2 of the combination was equal or more than 0.73, which was higher than that of OLS, whereas the RMSE and AIC of the combination would be lower than those of OLS (Tables 2 and 3). This implied that reducing the spatial dependence could improve the model fitting.

Model Parameter Estimates
The intercept coefficients β 0 and the slope coefficients β 1 of SLM, SDM and SEM were all estimated and tested by the t-test (Tables 4-6), and the autoregressive parameters ρ, γ and λ were all estimated and tested by the likelihood ratio test (Table 7) using the function "lagsarlm" and "errorsarlm" in the "spdep" package [60]. The intercept coefficients β 0 of two spatial models (SLM and SDM) were close to that of OLS and were significant (p < 0.001) in four spatial weight matrices, including the queen contiguous matrix, rook contiguous matrix, Spher geostatistical matrix and LSM, but in the other five spatial weight matrices, the intercept coefficients β 0 were very different and could not reach a significance level of 0.01. Moreover, the intercept coefficients β 0 of SEM were nearly equal to that of OLS and were significant (p < 0.001), regardless of which spatial weight matrix was used. The slope coefficients β 1 of SLM, SDM and SEM were similar to that of OLS and were significant (p < 0.001) for each spatial weight matrix. The standard errors of β 0 and β 1 of SLM, SDM and SEM with nine spatial weight matrices were all equal to or higher than those of OLS (Tables 4-6).    It was obvious that different spatial weight matrices and spatial autoregressive models produced very different estimates of the spatial autoregressive parameters (Table 7). Table 7 shows that the estimates of ρ in SLM were less than those in SDM for each spatial weight matrix, although both SDM and SLM had the same spatially lagged response variable WH t (i.e., the weighted neighboring PSP of stand top height). The estimates of ρ in SLM did not reach the significant level of 0.01, except 1/d 5 and Gaus, whereas the estimates of ρ in SDM were all significant (p < 0.01), no matter which spatial weight matrix was used. The estimates of γ were negative and significant in five of nine spatial weight matrices. On the other hand, the estimates of λ were positive and significant in all spatial weight matrices (p < 0.01).

Spatial Autoregressive Model Selection
Recently, researchers have utilized three spatial autoregressive models (SLM, SDM and SEM) at the plot or stand level and the individual-tree level to reduce the spatial autocorrelation of model residuals [27,42,44]. At the plot or stand level, Kissling and Carl [27] used these models to fit the artificial data that had 1108 cells (10ˆ10 m), and the results showed that SEM was the most reliable spatial autoregressive model for reduction of spatial dependence. In individual-tree level, Meng et al. [42] fitted the tree height-diameter relationship in case 1 (49 artificial loblolly pine trees on a 0.049 ha plot) by using these models and showed that SLM was the better than the other two models, whereas in case 2 (a 4.6451 ha plot with 690 slash pine), the results indicated that SEM was superior to SLM and SDM. Lu and Zhang [44] fitted the same relationship as Meng et al. [42], but in a large softwood stand (400ˆ200 m) with 3982 trees, showed that SDM was the best, followed by SEM. These previous studies indicate that the performance of spatial autoregressive models depends on the study level and data size.
In this study, we used these spatial models to fit the stand top and stand mean height relationship with 199 selected PSPs (20ˆ20 m) in a plot or stand level. Our results showed a similar finding as the studies of Kissling and Carl [27] and Meng et al. [42] in case 2, that SEM was the best spatial autoregressive model to decrease the spatial dependence and improve the model fitting, independent of which spatial weight matrix was used. SDM was close to SEM. SLM was the worst to cope with spatial autocorrelation of model residuals because SLM with seven of nine spatial weight matrices cannot reduce the spatial dependence of model residuals (except 1/d 5 and Gaus). Moreover, when the difference of Moran's I of spatial model residuals and the expected value I 0 was not significant (p value was greater than 0.1), the one spatial model was superior to OLS in model fitting because of its higher R 2 , lower RMSE and smaller AIC, regardless of which spatial models and spatial weight matrix were used.

Model Parameter Estimates
The estimate of model parameter from spatial autoregressive models is a very significant issue in geographical ecology and forestry (e.g., [24,27,69,70]), and biased estimates and incorrect model specifications directly influence hypothesis testing (e.g., [44,69,71]). Lu and Zhang [44] suggested that various spatial weight matrices may induce unequal estimates of model parameters, the intercept coefficients can be affected by the model specifications (e.g., SDM and SLM) and spatial weight matrices, and the estimates of the model coefficient in SEM were very close to those of OLS. Our results confirmed their suggestion. Additionally, the intercept coefficients β 0 in SLM and SDM were similar to those of OLS when SLM or SDM combined with some spatial weight matrix (e.g., queen, rook, Spher and LSM), which enabled β 0 to reach the significance level of 0.001. The standard errors of the model coefficient estimates of OLS were equal or lower than those of the spatial autoregressive models for reducing the spatial autocorrelation, which confirmed the argument of West et al. [29], i.e., when OLS was used to fit the data that contained spatial dependence, the covariance matrix of the parameter estimates was underestimated.

Spatial Weight Matrices
Getis and Aldstadt [49] created three types of artificial data; each type had 9000 observations (30ˆ30 raster) and their spatial patterns were random, two clusters, and six clusters, respectively. Three artificial data were modeled with SLM and nine spatial weight matrices. They concluded that LSM was the best matrix among all of the spatial weight matrices through comparative analysis with AIC, ρ and Moran's I of model residuals, and geostatistical matrices were the next best generally. However, Lu and Zhang [44] concluded that the LSM was not the best but that geostatistical matrices were superior using large forest stand data, no matter which spatial autoregressive model used.
Our results yielded a similar finding as the study of Getis and Aldstadt [49] as LSM, rather than geostatistical matrices, with SEM performed better than the other eight spatial weight matrices to reduce the spatial autocorrelation of model residuals, followed by Spher geostatistical matrix and queen contiguous. The reason may have resulted from the study level (plot or stand level, individual-tree level) of our study and that of Getis and Aldstadt [49] as both were at the plot or stand level, whereas the study level of Lu and Zhang [44] was at the individual-tree level. However, using the SEM to model the data, rook contiguous matrix, instead of LSM, showed optimal performance in terms of model fitting (maximum R 2 , minimum RMSE and minimum AIC), and LSM and 1/d 2 were second best. This suggests that the optimal spatial matrix depends on specificity in study data. In terms of model intercept coefficients estimates, five spatial weight matrices, including 1/d (worst), 1/d 2 (second worst), 1/d 5 , Exp and Gaus, were not significant at a risk level of 0.01 in SLM and SDM, whereas the other four spatial weight matrices (queen, rook, Spher and LSM) did well in SLM and SDM.

Stand Top and Stand Mean Height Relationship
A high linear correlation between stand top and stand mean height was found in pure plantations as the correlation coefficients of European larch, Scots pine and Norway spruce are 0.994, 0.989 and 0.974, respectively [8], and the correlation coefficient of Chinese fir is 0.995 [21]. However, there was a gap in the relationship for natural mixed forests. In this study, we found that the correlation coefficient in a mixed natural stand was 0.847 and relatively lower than that of pure stands. This difference was mainly due to the complicated species composition. However, Moran's I of OLS residuals was different, with the expected value I 0 at a significance level of 0.01. We used three SAR models (SLM, SDM and SEM) and had a finding that SEM was the best to reduce spatial dependence, SDM was very close to SEM, and SLM did not adequately account for spatial autocorrelation. SLM incorporates spatial dependence with the WH t , and SDM considers spatial dependence in both the WH t and weighted neighboring PSP of stand mean height (called WH m ), and the primary objective of SEM is to achieve more efficient estimators by correcting potential estimated bias affecting spatial autocorrelation due to unobserved spatial variables [42]. Therefore, our finding indicated that stand top height at one PSP is affected not by the spatial variable WH t but by the combination of the spatial variables WH t and WH m . Furthermore, stand top height at one PSP is significantly affected by unobserved spatial variables at neighboring PSPs. In the future, we need to incorporate environmental factors, stand non-spatial structure characteristics and stand spatial structure characteristics into SAR models to further identify which is the most effective for stand top height. In addition to three SAR models, we could incorporate other models dealing with spatial autocorrelation, such as generalized linear mixed models, generalized additive models, spatial eigenvector mapping and Bayesian spatial models in further studies.
Stand top height is a key variable for which to use the site index method that is commonly species dependent, but the site index method has limitations when applying to mixed uneven-aged forests because of difficulties in tree species-specific age measurements and in height estimates that dominant trees must be available for each species [56]. An estimate of stand top height that is independent of species could potentially address this limitation [55]. Raulier et al. [72] used stand top height without accounting for species to compare the different site index curves from different data sources, including permanent sample plots, temporary sample plots and stem analyses. Anyomi et al. [56] used stand top height to study spatial and temporal heterogeneity of forest site productivity drivers within the eastern boreal forests of Canada. Consequently, we used stand top height and mean height without considering species.
In addition, mean height of 100 tallest trees per hectare was regarded as a top height in this study; other definitions of top height may be allowed for further study, such as the mean height of 100 thickest trees per hectare, mean height of a fixed percentage of occurring trees, especially the definition of top height, which considered the plot size selection effect to reduce estimate bias [52]. These definitions of top height may be available in forest inventory.

Modeling Approach Applications in LiDAR Data and Crown Fire Modeling
Circumvention of time and cost constraints in field-based forest inventory, foresters have increasingly regarded remote sensing data as a means of assisting and complementing, and forest canopy height was retrieved easily using LiDAR data and InSAR [22,[73][74][75]. In reality, forest canopy height in a resolution cell is more similar to its nearby resolution cells because they have similar soil, topography, microclimate, etc., that is, there is a spatial autocorrelation between a resolution cell and its nearby cells. Therefore, considering the spatial autocorrelation into forest canopy height from LiDAR could improve the utilization of LiDAR in forest monitoring and forest inventory. Mcroberts [76] reviewed that the k-Nearest Neighbors (kNN) was a suitable approach to deal with spatial autocorrelation and improve the prediction of forest variables those were from most similar, or nearest resolution cells. There are three common selections for spatial weight matrices in kNN, including a constant value of 1, 1/d and 1/d 2 [76][77][78]. These common spatial weight matrices within kNN were investigated to predict forest canopy height by Mcinerney et al. [74] and the result showed the performances of different spatial weight matrices depended on different k values and different bands. In addition to these common spatial weight matrices, other spatial weight matrices such as 1/d 5 , Gaus, Spher, Exp, queen, rook and LSM, may be suitable for reducing spatial dependence because different nearby resolution cells may generate the different spatial pattern (structure). Meanwhile, SLM could be regarded as an alternative model in height prediction for its assumption that the autoregressive process occurs only in the response variable, and SEM is also an alternative model for its assumption that the autoregressive process occurs in the unobserved variable. The spatial autoregressive model and spatial weight matrices incorporated into the height model of LiDAR can greatly increase height prediction accuracy.
Furthermore, the height model plays a key role in crown fire modeling because crown base height is one of the three main characteristics (canopy bulk density, canopy base height, and foliar moisture content) of canopy fuels and is an important variable in the crown fire initiation model that predicts whether a surface fire would be transited into some type of crown fire in a given fire environment [79][80][81]. Therefore, accurate prediction of crown base height is very important. However, crown base height (or crown diameter) of an individual tree may be influenced by its neighbors for tree competition and microenvironment. If crown base height (or crown diameter) measurements and individual tree coordinates are given, three SAR models (SLM, SEM and SDM) assume that the autoregressive process occurs in the response variable, in the error term, and in both response and explanatory variables, and different spatial weight matrices could be applied to improve prediction accuracy of crown base height.

Conclusions
In this paper, three SAR models with different nine spatial weight matrices were used to model the relationships of stand top and stand mean height in mixed natural forests, with OLS as a benchmark. Our results showed that the correlation between stand top height and stand mean height in mixed stands was strong (R 2 = 0.717) but relatively lower than that in pure stands. Significant spatial dependence exists in the relationship of stand top and stand mean height for different Moran's I of OLS residuals with I 0 , as well as tree height-diameter relationships [36,37,39,40,[42][43][44]. Among these three SAR models, SEM was the optimal spatial model to reduce the spatial autocorrelation of model residuals and model fitting (e.g., R 2 , RMSE and AIC), independent of which spatial weight matrix was used. The relationship provides a way to estimate stand top height from stand mean height and facilitate site quality assessment in natural mixed forests when only the latter is available, for example, in the China NFI system. Height modeling is an important topic for LiDAR (e.g., forest canopy height) and crown fire modeling (e.g., crown base height), and considering spatial regressive models can improve the height prediction accuracy.