A tree crown is characterized by crown height, crown width, crown density, leaf area, and crown ratio, and their measurements are useful for forest management and research. The crown ratio is considered a reliable indicator of the vigor and potential growth of a tree [1
]. Height to crown base (HCB) is an important tree measure used to derive crown ratio and is also regarded as an indicator of log quality. HCB is usually defined as the vertical height from the ground to the base of the lowest live whorl of branches on the bole of a tree [5
]. The ground-based measurement of HCB is a time-consuming and labor-intensive process; thus, it is rarely done during field inventory [6
]. Most researchers have obtained the HCB value by establishing linear or nonlinear HCB models with other variables as predictors, such as DBH, tree height, basal area, basal area larger than a target tree, the sum of basal area of all trees with diameter bigger than a target tree, crown competition factor, climate, and site index [8
]. Tree diameter at breast height (DBH) is also an important tree attribute that is used as a main predictor in forest growth and yield, taper, and biomass models. In general, the measurement of DBH is very common in ground-based inventory; however, field-inventory data can have low accuracy, and collecting them takes considerable time and cost, especially over extensive forest areas. Therefore, HCB data collection has shifted from traditional forest field inventory to modeling and prediction based on remote sensing technology [13].
Light detection and ranging (LiDAR) can accurately determine the geographical position of surface objects by transmitting and receiving laser pulses. Laser pulses travel down the forest canopy, and detailed information on the three-dimensional structures of the forest canopy and understory topography can be obtained [17
]. Many tree attributes, such as tree height and crown dimensions [18
] can be obtained from LiDAR data. Approaches to HCB prediction may be divided into two categories: direct and indirect. Direct approaches derive HCB from various geometrical shapes of the crown [12] or predict HCB from descriptive statistics of the LiDAR-based data distribution [4]. Direct approaches do not require any ground-measured HCB data, which are costly and time-consuming to collect; they only require point-cloud data processing and analysis, including tree detection and the determination of crown base positions. However, this approach can introduce considerable uncertainty in determining the base of the first normal green branch as a part of the crown, so its application to estimating HCB is quite limited. The indirect approach, on the other hand, predicts HCB through statistical modeling [22]. This approach requires field-measured HCB data to establish the prediction models. Models for the accurate prediction of individual tree HCB can be built using LiDAR-based information, so this method has been frequently used in recent years [22].
The application of ordinary least squares (OLS) regression to estimate the parameters of LiDAR-based DBH and HCB models is not generally preferred, but it is still used [16]. This estimation method usually assumes that (i) response variables are random variables with errors, (ii) regressors are fixed variables without errors, and (iii) the associated error is normally distributed with zero mean and constant variance [30]. Any violation of the second assumption leads to substantially biased parameter estimates [30], which eventually reduces the prediction accuracy.
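The consequence of violating the second assumption can be illustrated with a small simulation. The sketch below is not the authors' analysis: it assumes a simple linear relationship, synthetic data, and hypothetical variable names and error variances. It shows how the OLS slope shrinks toward zero when the regressor is observed with error.

```python
import numpy as np

def ols_slope(x, y):
    """Return the OLS slope of y regressed on x (intercept included)."""
    x_centered = x - x.mean()
    return float(np.dot(x_centered, y - y.mean()) / np.dot(x_centered, x_centered))

# Synthetic data; all values below are illustrative assumptions.
rng = np.random.default_rng(0)
n = 20000
true_slope = 2.0
x_true = rng.normal(10.0, 2.0, n)                       # error-free regressor, variance 4
y = 1.0 + true_slope * x_true + rng.normal(0.0, 1.0, n)
x_obs = x_true + rng.normal(0.0, 2.0, n)                # regressor observed with error, variance 4

slope_clean = ols_slope(x_true, y)   # assumption (ii) holds
slope_noisy = ols_slope(x_obs, y)    # assumption (ii) violated

# Classical attenuation: slope is multiplied by var(x) / (var(x) + var(error)).
expected_noisy = true_slope * 4.0 / (4.0 + 4.0)

print(f"slope with error-free regressor: {slope_clean:.3f}")
print(f"slope with error-prone regressor: {slope_noisy:.3f}")
print(f"slope predicted by attenuation formula: {expected_noisy:.3f}")
```

With equal variances for the regressor and its measurement error, the attenuation formula predicts that the estimated slope is about half the true slope.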
The prediction accuracy of HCB and DBH models that use LiDAR-based tree height, crown width, and crown area may not always be satisfactory, for two reasons. First, LiDAR-based tree height, crown width, and crown area have random or systematic errors caused by LiDAR system configuration and parameter estimation. Any error in these variables could increase the residual variance of the model and also lead to invalid statistical tests [31]. Second, the DBH estimated from a LiDAR-based DBH model contains non-negligible and unavoidable errors [33]. If such erroneous DBH is used as a predictor in a LiDAR-based HCB model, substantial bias would occur due to error transfer [34]. In addition, fitting a LiDAR-based DBH model and a LiDAR-based HCB model separately with OLS disregards the inherent correlation of HCB with DBH and thus fails to ensure the compatibility of the estimated HCB and DBH. Estimating the parameters of both models independently with OLS may therefore be problematic, especially when errors are associated with both the regressors and the response variables. An appropriate solution to this problem is to apply error-in-variable (EIV) modeling, which takes the errors into consideration and can guarantee compatibility between HCB and DBH [29
] first introduced the theory on the development and application of linear EIV models, and Carroll et al. [32] later extended this concept to nonlinear EIV modeling in detail. Kangas [31] investigated the effects of errors in variables on the parameters of a diameter growth model and applied the simulation extrapolation algorithm to correct the estimated parameters. Lindley [38] proved that when the validation data come from the same population as the fitting data, predictions are usually unbiased even though the regressors are subject to error. Tang and Zhang [36] developed an EIV model to obtain unbiased parameter estimates. Tang and Wang [39] proposed the two-stage EIV method to estimate the model parameters. In their study, the EIV concept was introduced into forest attribute modeling, which provides a theoretical basis for studying the influence of errors on stand growth and harvest models. Li and Tang [40] compared three methods (simulation extrapolation, regression calibration, and EIV) for estimating the models and found that EIV performed better, with smaller variances than the other two methods.
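The simulation extrapolation idea mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used in the cited studies: the model is linear, the data are synthetic, the measurement-error standard deviation `sigma_u` is assumed known, and the function names and the quadratic extrapolant are choices made here for illustration.

```python
import numpy as np

def ols_slope(x, y):
    """Return the OLS slope of y regressed on x (intercept included)."""
    x_centered = x - x.mean()
    return float(np.dot(x_centered, y - y.mean()) / np.dot(x_centered, x_centered))

def simex_slope(x_obs, y, sigma_u, lambdas=(0.0, 0.5, 1.0, 1.5, 2.0), n_rep=50, seed=0):
    """Simulation extrapolation for a slope, assuming a known error SD sigma_u."""
    rng = np.random.default_rng(seed)
    mean_slopes = []
    for lam in lambdas:
        slopes = []
        for _ in range(n_rep):
            # Simulation step: add extra error with variance lam * sigma_u**2.
            extra = rng.normal(0.0, np.sqrt(lam) * sigma_u, x_obs.size)
            slopes.append(ols_slope(x_obs + extra, y))
        mean_slopes.append(np.mean(slopes))
    # Extrapolation step: fit a quadratic in lambda and evaluate it at lambda = -1,
    # which corresponds to zero measurement error.
    coefs = np.polyfit(lambdas, mean_slopes, 2)
    return float(np.polyval(coefs, -1.0))

# Synthetic data; all values are illustrative assumptions.
rng = np.random.default_rng(1)
n = 10000
x_true = rng.normal(10.0, 2.0, n)
y = 1.0 + 2.0 * x_true + rng.normal(0.0, 1.0, n)
sigma_u = 1.0
x_obs = x_true + rng.normal(0.0, sigma_u, n)

naive = ols_slope(x_obs, y)
simex = simex_slope(x_obs, y, sigma_u)
print(f"naive slope: {naive:.3f}, SIMEX slope: {simex:.3f}, true slope: 2.000")
```

The quadratic extrapolant only approximates the true attenuation curve, so the corrected slope moves toward the true value without necessarily reaching it.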
Few studies have applied EIV modeling to DBH using remote sensing data. For example, Fu et al. [33] developed an individual tree DBH and above-ground biomass (AGB) EIV model with LiDAR-based tree height and crown projection area as predictors, using two-stage error-in-variable modeling (TSEM) and nonlinear seemingly unrelated regression (NSUR) to estimate model parameters. Both TSEM and NSUR account for the correlation of DBH with AGB and for the effect of errors in DBH on the prediction of AGB. Zhang et al. [29] reported that the DBH EIV model developed with errors associated with both response and regressor variables and fitted with the maximum likelihood method was most appropriate. To the authors’ knowledge, no studies have developed compatible LiDAR-based HCB EIV models.
This study thus aimed (a) to develop a compatible simultaneous equation system of DBH and HCB EIV models based on LiDAR data at the individual tree level for Picea crassifolia Kom. forests in northwest China, (b) to evaluate the compatibility of two different nonlinear OLS-based DBH and HCB models with the leave-one-out cross-validation method, and (c) to compare various unbiased fitting algorithms including NSUR. To simplify the proposed simultaneous equation system and to facilitate its future application, only the response variables (HCB and DBH) were treated as error-in-variables [39], and the predictor variables were treated as error-out variables [33]. The presented compatible simultaneous equation system of DBH and HCB models will be applicable to other Picea species whose growth and stand conditions are similar to those of the studied species. This tree species is crucial to the economic and social development of the rural population, as well as to regional carbon storage and cycling and to the maintenance of the structures and functions of the forest ecosystems in northwest China. This article is mainly concerned with the methodology employed in this study, which is described in the Methods section; the major strengths and weaknesses of the methodology and the main findings of the study are discussed, and the potential contribution of the study is highlighted.
HCB is an important tree attribute for assessing tree productivity and vigor. DBH is commonly used as a predictor in HCB models, but DBH estimated from LiDAR-based attributes contains non-negligible errors. In addition, the compatibility between DBH and HCB needs to be considered when estimating HCB. In this study, we compared four algorithms for estimating DBH and HCB in an EIV equation system (NSUR, 2SLS, 3SLS, and FIML) with two model structures. The prediction accuracy of the four algorithms and the two model structures was assessed with RMSE and MAE. The results showed that the impact of the measurement error of DBH on HCB and the compatibility between DBH and HCB were well accounted for by the NSUR algorithm.
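The two accuracy statistics can be computed under leave-one-out cross-validation as sketched below. This is a simplified illustration, not the study's procedure: it assumes a linear model fitted by least squares in place of the nonlinear equation system, and the function name, variable names, and synthetic data are hypothetical.

```python
import numpy as np

def loocv_rmse_mae(X, y):
    """Leave-one-out RMSE and MAE for a linear least-squares model with intercept."""
    y = np.asarray(y, dtype=float)
    n = y.size
    design = np.column_stack([np.ones(n), np.asarray(X, dtype=float)])
    residuals = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                      # drop observation i
        beta, *_ = np.linalg.lstsq(design[keep], y[keep], rcond=None)
        residuals[i] = y[i] - design[i] @ beta        # predict the held-out tree
    rmse = float(np.sqrt(np.mean(residuals ** 2)))
    mae = float(np.mean(np.abs(residuals)))
    return rmse, mae

# Synthetic example: HCB predicted from LiDAR-derived tree height (values are assumptions).
rng = np.random.default_rng(0)
lh = rng.uniform(8.0, 25.0, 60)
hcb = 0.5 + 0.35 * lh + rng.normal(0.0, 0.8, 60)

rmse, mae = loocv_rmse_mae(lh, hcb)
print(f"LOOCV RMSE = {rmse:.3f} m, MAE = {mae:.3f} m")

# Sanity check on exactly linear data: both statistics should be essentially zero.
rmse_exact, mae_exact = loocv_rmse_mae([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
```

Because each tree is predicted from a model fitted without it, the statistics reflect out-of-sample error rather than goodness of fit.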
HCB is an important indicator of tree vigor and tree stem form, as well as an indispensable measure for deriving the crown ratio. However, measuring HCB in situ is labor-intensive and costly, especially over large forest areas. An efficient method of obtaining precise HCB is therefore necessary, and this is possible with an HCB prediction model developed from LiDAR-derived variables, such as tree height, crown projection area, and crown width, together with ground-measured DBH. The first three variables can be measured relatively accurately and easily with advanced remote sensing techniques. The HCB can be estimated from the established HCB model, which may also contain DBH as a predictor [11
]. The DBH estimation model can also be developed using the LiDAR-derived information [33
]. The estimation of HCB and DBH from their corresponding prediction models would be substantially biased if separately developed models were used, i.e., DBH model and HCB models developed independently from each other from the same tree data. In order to overcome such a bias, developing a compatible simultaneous equation system is the most appropriate solution. However, this equation system of DBH and HCB models is still unavailable in forest modeling literature. As mentioned in the introduction, other compatible simultaneous equation systems developed through the EIV modeling approach are available, e.g., a system of equations of DBH and individual tree above-ground biomass models [33
]. Considering the knowledge gap, we developed the simultaneous equation system of DBH and HCB models using the tree-level predictors (LH, LCW, and LCA), the information of which was derived from the LiDAR imagery. Four different algorithms (NSUR, 2SLS, 3SLS, and FIML) were used to estimate this equation system.
The data used in our study originated from the Picea crassifolia Kom. forest, which is crucial to the economic and social benefits of the rural population, as well as to regional carbon storage, regional carbon cycling, and the maintenance of the balanced functions of forest ecosystems in northwest China. Two different model structures (the NLS and NBD model and the NLS and BD model), built by assuming errors associated with all the regressors and response variables, were found to be inappropriate because this approach did not account for the inherent correlations of DBH with HCB, and all the estimated parameters and variances were biased.
Generally, the structural estimators or fitting algorithms (NSUR, 2SLS, 3SLS, and FIML) should be preferred to NLS, as each of them accounts for the errors in variables in an appropriate way. Surprisingly, however, we found that NLS could sometimes provide estimates close to those of the structural estimators applied in this study, as was the case for the NLS and NBD model. The NLS and NBD model had a smaller bias and variance, so it could produce a smaller RMSE. However, NLS standard errors are, in all likelihood, not useful for inference purposes [57
]. The prediction accuracy of the NLS and BD model was the worst with the highest
and the biggest RMSE; in this case, the EIV modeling approach clearly showed its advantage over NLS. In general, individual tree DBH and HCB models based on LiDAR data and field measurements contain errors that arise during image capture, image processing, and information extraction, and these errors are very hard to avoid completely [29].
The NLS and NBD model could neither address the compatibility problem of DBH and HCB nor account for their inherent correlations. However, a simultaneous equation system (Equation (3)) can effectively address these issues. Among the four algorithms used to fit the simultaneous equation system (Equation (3)), NSUR and 2SLS are classified as limited-information estimators, while 3SLS and FIML are full-information estimators. The former two estimators make use of the reduced model information, while the latter two make use of the full information from the model [33]. Based on the model validation results with LOOCV, the prediction accuracy of NSUR was slightly better than that of the other algorithms (2SLS, 3SLS, and FIML). This was probably because NSUR is better able to address the error transfer caused by DBH in the simultaneous equation system of the DBH and HCB models. Possibly for this reason, Parresol [49] applied NSUR to develop additive tree biomass models in a pioneering modeling study of a simultaneous equation system in forestry. The prediction accuracy of 3SLS was slightly better than that of 2SLS, confirming the findings of Tang et al. [34], who found that 3SLS outperformed 2SLS when errors across equations were correlated, and that 2SLS outperformed 3SLS when errors across equations were uncorrelated.
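How a simultaneous-equation estimator handles an error-prone DBH predictor can be sketched with two-stage least squares. The example below is a linear simplification under stated assumptions, not the nonlinear system of Equation (3): the coefficients, error covariance, and synthetic data are hypothetical, and LCW is assumed to enter only the DBH equation so that it can serve as the identifying variable.

```python
import numpy as np

def lstsq_fit(design, y):
    """Least-squares coefficients for the given design matrix."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

# Synthetic two-equation system; every number here is an illustrative assumption.
rng = np.random.default_rng(0)
n = 20000
lh = rng.normal(15.0, 3.0, n)     # LiDAR-derived tree height
lcw = rng.normal(4.0, 1.0, n)     # LiDAR-derived crown width
errors = rng.multivariate_normal([0.0, 0.0], [[4.0, 1.4], [1.4, 1.0]], size=n)
dbh = 2.0 + 1.0 * lh + 3.0 * lcw + errors[:, 0]
hcb = 1.0 + 0.2 * dbh + 0.3 * lh + errors[:, 1]   # true DBH coefficient is 0.2

ones = np.ones(n)

# Separate OLS fit of the HCB equation: DBH is correlated with the HCB error term.
b_ols = lstsq_fit(np.column_stack([ones, dbh, lh]), hcb)

# Stage 1: regress DBH on all exogenous variables and keep the fitted values.
exog = np.column_stack([ones, lh, lcw])
dbh_hat = exog @ lstsq_fit(exog, dbh)

# Stage 2: replace DBH by its fitted values in the HCB equation.
b_2sls = lstsq_fit(np.column_stack([ones, dbh_hat, lh]), hcb)

print(f"DBH coefficient, separate OLS: {b_ols[1]:.3f}")
print(f"DBH coefficient, 2SLS:         {b_2sls[1]:.3f}")
```

In this setup the separate OLS fit overstates the DBH coefficient because the errors of the two equations are positively correlated, while the two-stage fit recovers a value close to the one used to generate the data.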
The HCB equation system developed in this study was based on the base model with the best fit statistics among the five frequently used HCB base candidate models [10]. The analysis of correlations between the regressors and HCB showed strong connections among LCA, LH, DBH, and HCB. In other words, these tree characteristics strongly influenced HCB variations. Our DBH base model, which replaced LCPA with LCW in the models of Fu et al. [33], showed a better fitting performance with a smaller RMSE. Both the HCB model (whose predictors were all LiDAR-based except for DBH, which was obtained from ground measurement) and the DBH model were developed from LiDAR data. This made the compatible DBH–HCB EIV models possible and suggests that the equation system could be applied to extensive forest areas. The validation results based on LOOCV for NSUR, 2SLS, 3SLS, and FIML were almost identical; NSUR slightly outperformed the others, but the differences in prediction were insignificant (Table 3). In this study, we only considered DBH and HCB as error-in-variables; however, other regressors may contain various errors, including measurement errors, tree crown delineation errors, and errors of parameter estimation. Ignoring all these errors can introduce complex uncertainties into model development. Readers therefore need to be cautious when considering the conclusions of this study, and future research should focus on these issues.
As mentioned in the introduction section, this study was based on a novel methodology, which resulted in a system of compatible simultaneous equations of DBH and HCB models in which various LiDAR-derived tree attributes were used. The measurement errors of both DBH and HCB were simultaneously taken into consideration to address the problem of compatibility between DBH and HCB models and to account for inherent correlations between these tree variables through a simultaneous modeling approach. The presented equation system of DBH and HCB models fills the gap left by the absence of such an HCB EIV model system in the forest modeling literature. A compatible simultaneous equation system of the DBH and HCB models, developed using the tree-level predictors (LH, LCW, and LCA) derived from LiDAR imagery and ground-based measurements, provided accurate predictions of HCB and DBH. Compared to previously developed HCB models using only ground measurements [11] and those based on LiDAR-derived databases [22], the equation system presented in this article will be interesting and useful to both researchers and forest managers, as this system is able to accurately predict HCB. Furthermore, the modeling approach and algorithm presented in this article will be useful for establishing similar compatible equation systems of DBH and HCB EIV models for other tree species and for other tree variables that are inherently correlated.