1. Introduction
Emerging bioeconomy strategies consider lignocelluloses from forests as an alternative to fossil raw materials [1,2,3], suggesting that the demand for woody biomass is likely to increase globally [4,5]. At the same time, forest reference level scenarios are gaining importance with regard to the United Nations Framework Convention on Climate Change (UNFCCC) [6,7]. National forest inventories (NFIs) and aggregated forest growth and wood supply modeling (AFGWSM) can be used to estimate the future availability of woody biomass and the correlated effects on forests and climate. NFIs have been established in several countries to generate data about forest conditions, dynamics, and productivity. Today, most European and North American countries conduct forest inventories based on statistical sampling [8,9,10]. In addition, many countries use or are developing AFGWSMs that are directly connected to the respective NFI data [11].
Although wood supply scenarios are not explicitly forecasts, they are expected to generate results that qualify as decision support for policymakers, administrations, industry, and other interest groups [12]. As such, they are supposed to be “a coherent, internally consistent and plausible description of a possible state of the world” [13] (p. 799) [14]. Among possible themes for wood supply scenarios, business-as-usual (BAU) scenarios are of specific importance. Although they focus mainly on short-term trends, these scenarios can establish a baseline against which other scenarios can be shaped or referenced [15].
To design BAU scenarios for AFGWSMs, “heterogeneous forest land and owners with heterogeneous objectives” [16] (pp. 200–201) have to be integrated. Concept-driven studies that apply theoretical preconceptions to the data [17,18,19,20] can be differentiated from data-driven methods that (as is the goal of this research) aim to learn these concepts from the data first [21].
In the past, numerous data-driven studies have focused on the objectives and harvest decisions of forest owners. An overview of research on the non-industrial private forest (NIPF) sector, which was the subject of the majority of these studies, is provided by Amacher et al. [22] and Beach et al. [23]. A common method is to use panels or surveys that are analyzed with various specifications of tobit or logit models [24,25,26,27,28]. Only a few panel-based studies have had access to large datasets [29,30], and the attitude towards or intention to harvest is often measured rather than the actual harvest [31].
NFI-based studies offer the advantage that forest development and timber harvests can be derived from a large number of inventory plots that are often permanent and repeatedly measured [8,32]. Unlike in survey-based studies, a representative fraction of the actual resource is considered as the principal research unit (PRU), rather than the individual forest owner or manager. Several studies used NFI data to review AFGWSM scenarios [33,34,35]. Another group of studies used NFI or other forest inventory data to investigate and/or project harvest behavior using regression models or other machine learning algorithms [21,36,37,38,39].
However, when BAU scenarios are learned and projected from NFI data, researchers are confronted with two major issues. First, unlike survey research, NFI-based studies often lack specific and relatable information about the relevant decisionmakers (forest owners or managers). For example, a study with access to linked inventory and survey data reported impacts of the more specific ownership characteristics of education and income on harvest probabilities [36]. However, this information is often not available. In the case of the German NFI, for example, opportunities to collect information on individual plot owners are limited, since the exact location and land ownership of NFI tracts cannot be revealed. Second, on the PRU level, NFIs often produce noisy data that are not representative of the forest area upon which a harvest decision is made. Instead, data only become representative at higher aggregation levels.
The principal aim of this research was to parameterize a BAU wood supply scenario from NFI data by dealing explicitly with this restriction. The quality of the scenario was judged on its ability to reproduce 2002–2012 timber harvests reported by the NFI. For a BAU scenario to be considered well calibrated, it is not enough that overall NFI harvested volumes are well captured. The scenario should also retrace NFI harvested volumes across characteristics of the inventory data (e.g., stand types and timber dimensions). Furthermore, it was considered important that the observed occurrence or non-occurrence of harvest interventions at plot level be reflected in the scenario.
To address this issue, a stratified machine learning approach based on the Classification and Regression Tree (CART) algorithm [40] was designed and tested in the scope of this research. The method optimizes the prediction of harvest occurrences and harvest shares towards patterns of the original inventory. It can be used as an upstream model that predicts harvest decisions (yes or no) for individual inventory plots. Commonly, overall cut-off benchmarks [41] are used to convert predicted probabilities into a binary decision. In contrast, the presented method uses the learned harvest probabilities of strata as decision criteria. The first rule is that this probability must be met by the number of plots selected as harvested in each stratum. For the selection of individual plots to be harvested per stratum, two alternative options are presented: random selection (assuming that no other attributes would influence the harvest decisions in the strata), or a Random Forest (RF) algorithm [42] trained on the training data of the corresponding stratum. Once the plots are predicted as harvested by the stratified approach, linear regression trained on harvested inventory plots only can be used to predict the harvested volumes for these plots.
To assess the pros and cons of the presented stratified harvest prediction approach, results are compared to a direct harvested volume prediction with linear regression, as well as to a two-step approach with logistic regression (using an overall cut-off benchmark), followed by linear regression trained on harvested inventory plots only.
A unique feature of this paper is that existing machine learning techniques are adapted and combined to serve specific requirements of large-scale, inventory-based harvest decisions and harvested volume predictions. The results of this research can be used to project business-as-usual timber supplies directly for a 10-year period (in this case 2012–2022). Furthermore, the results can inform harvest scenarios, which are used in combination with NFI-based forest growth projections, as described by Kändler and Riemer [43].
3. Results
Table 2 and Figure 4 compare the results of the upstream models that were used to predict the harvest occurrence with NFI results. Logistic regression resulted in higher classification accuracies when compared to the methods based on stratification (CART). Harvest decisions predicted randomly within the strata resulted in the lowest classification accuracies. When the selection within the strata was informed by RF, accuracy values were closer to, but still below, those of the logistic regression.
Figure 4 shows how well the models were able to predict harvest shares within various subsets of the data. Logistic regression (based on max kappa cut-offs) delivered good overall results for the entire test set. However, shares of harvested plots were strongly underestimated for subgroups with lower harvest probabilities (e.g., average plot age of <30 years, lowest standing volume class, and unfavorable harvest conditions). Conversely, the share of harvested plots was overestimated for groups with higher harvest probabilities (e.g., highest standing volume groups). A comparison of these results with the prediction based on the Youden index shows that lower or higher cut-off benchmarks could not solve this problem, but would rather shift the entire figure.
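As a minimal sketch of how a single overall cut-off can be chosen, the following Python code scans candidate cut-offs and picks the one that maximizes either the Youden index (sensitivity + specificity − 1) or Cohen's kappa. The function names and the toy labels and probabilities are illustrative assumptions, not the data or implementation used in this study.

```python
def confusion(y_true, y_prob, cutoff):
    """Count (tp, fp, tn, fn) when probabilities >= cutoff are classed as harvested."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = p >= cutoff
        if predicted and y:
            tp += 1
        elif predicted and not y:
            fp += 1
        elif not predicted and not y:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def youden_j(tp, fp, tn, fn):
    """Youden index: sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity + specificity - 1.0


def cohen_kappa(tp, fp, tn, fn):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = tp + fp + tn + fn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return (observed - expected) / (1 - expected) if expected != 1 else 0.0


def best_cutoff(y_true, y_prob, score):
    """Return the candidate cut-off with the highest score."""
    candidates = sorted(set(y_prob))
    return max(candidates, key=lambda c: score(*confusion(y_true, y_prob, c)))


# Toy data (hypothetical): 1 = harvested, 0 = not harvested
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
y_prob = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.9]

cutoff_youden = best_cutoff(y_true, y_prob, youden_j)
cutoff_kappa = best_cutoff(y_true, y_prob, cohen_kappa)
```

Whichever criterion is used, the result is one threshold applied to all plots, so a subgroup whose predicted probabilities all fall below it receives no predicted harvests.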
The two methods based on stratification (lower panel of Figure 4) were able to retrace harvest shares of the NFI more accurately across the various subsets of the test set. For some subsets, the method based on random prediction within the strata performed better than the RF-informed stratified prediction (e.g., altitude >700 m a.s.l., small and medium private forest).
Before results for combined harvest decision and harvested volume models are presented, the harvested volume regression model should be evaluated separately. At the same time, results of the direct OLS-based harvest volume prediction can be considered.
Figure 5 shows the results of the two versions of the harvested volume models applied to test set plots that were actually harvested according to the NFI.
OLS models that were trained on the entire training set and used to predict harvested volumes directly delivered good overall average harvested volume estimations. However, predicted harvested volumes were generated from 97.6% of the test set plots (for 2.4% of the plots, predicted volumes were ≤0). At the same time, the harvest intensities of the plots harvested according to the NFI were strongly underestimated (mid-panel of Figure 5). When the OLS regression was trained on the harvested plots of the training set only (downstream model), average NFI harvest intensities could be retraced well.
Figure A2 shows the results of downstream OLS tested on plots harvested according to the NFI for various subsets of the test set. Harvested volume predictions were close to NFI values for most characteristics. Harvested volumes of some characteristics were under- or overestimated.
Figure 6 shows the final harvested volume predictions derived by combining upstream and downstream models. The results of direct OLS harvested volume predictions are not shown in the figure, since the results and deficits of this approach were already presented in Figure 5. When analyzing the final harvested volume predictions in Figure 6, Figure A2 and Figure 4 should also be considered. This comparison can provide answers to the question of whether underestimations, fits, or overestimations were caused by the respective upstream or downstream model, or by a combination of both models.
The best harvested volume predictions were delivered by the stratified approach, where plots were selected randomly as harvested within the individual strata. With only a few exceptions, predicted harvested volumes remained within the standard error (confidence interval: 95%) of the NFI of the various subsets of the test set.
The stratified approach that used RF to inform the selection of harvested plots within the strata delivered the second-best results. However, in this case, harvested volumes of several subsets of the test data were under- or overestimated.
Predictions based on logistic regression overestimated harvested volumes from conifer and mixed stands. At the same time, harvested volumes from several data subsets of the deciduous stands were underestimated.
4. Discussion
Research results showed that each of the tested methods had strengths and weaknesses. Logistic regression achieved the highest overall classification accuracies, but overestimated harvest shares for some characteristics, while underestimating the harvest shares of others. The stratified method with random selection resulted in better prediction of harvest shares, but achieved much lower classification accuracies. The RF-informed stratified method achieved results in between the logistic and the stratified random approach. Results are discussed in more detail below, starting with the direct harvested volume prediction.
The linear OLS regression that was trained on both harvested and not-harvested plots and used to predict harvested volumes directly appeared at first glance to deliver good results. However, this approach neglected that a substantial share of NFI plots was not harvested within the 10-year period. Since the absence of logging interventions is part of current business-as-usual, it should also be captured in BAU scenarios (compare also Eid [66] cited in Antón-Fernández and Astrup [21]). A related shortcoming is that, on average, harvested plots were in reality treated with considerably higher harvest intensities than predicted by the directly applied models. Using the direct approach to predict future wood supply might result in adequate business-as-usual overall quantities for the first decade. However, instead of realizing these harvested volumes from around 65% of the plots, the direct model would harvest these quantities from nearly 100% of the plots treated with comparably low intensities. Thus, the prediction would result in a forest inventory structure that is clearly different from business-as-usual at the end of the decade. This direct approach would be especially unsuitable for projection periods longer than 10 years.
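The effect can be illustrated with a deliberately simplified case. An intercept-only least-squares fit returns the mean of its training values, so fitting it to all plots (including zeros for unharvested plots) yields a lower per-plot volume than fitting it to harvested plots only. The volumes below are made-up numbers in arbitrary units, chosen so that 65% of the plots are harvested; they are not NFI results, and the models in this study use predictors rather than a constant.

```python
from statistics import fmean

# Hypothetical harvested volume per plot (arbitrary units); 0.0 means no harvest.
volumes = [0.0] * 7 + [100.0] * 13  # 13 of 20 plots harvested (65%)


def intercept_only_fit(values):
    """Least-squares estimate of a constant model, i.e., the sample mean."""
    return fmean(values)


# "Direct" variant: trained on harvested and not-harvested plots alike.
direct = intercept_only_fit(volumes)

# "Downstream" variant: trained on harvested plots only.
downstream = intercept_only_fit([v for v in volumes if v > 0])

share_harvested = sum(v > 0 for v in volumes) / len(volumes)

# Both routes give the same total, but the direct one spreads it over every plot.
total_direct = direct * len(volumes)
total_two_step = downstream * share_harvested * len(volumes)
```

In this toy case, the direct fit assigns a volume of 65 to every plot, whereas the two-step route assigns 100 to the 65% of plots that are harvested and nothing to the rest.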
However, it was also shown that the OLS was able to predict the harvested volumes of harvested NFI plots well when trained and tested on subsets of harvested plots only (the downstream model). To make use of these versions of the model in a BAU scenario, it was necessary to first predict the occurrence or non-occurrence of harvest interventions.
Logistic regression is a common approach for binary decision problems [21,36]. In the context of this research, it delivered better results than the tested stratified methods with regard to classification accuracy, precision, specificity, and Cohen’s kappa. Nevertheless, the logistic regression failed to predict accurate harvest shares for several subsets of the test set. When combined with the downstream OLS model, the logistic model resulted in overestimations of harvested volumes for some characteristics, and underestimations of others. This problem is directly linked to the use of only one overall benchmark to convert the predicted harvest probabilities into a yes-or-no decision. Predicting harvest occurrences from CART with an overall cut-off would create similar problems.
Logistic regression resulted in higher but, at least in the case of this research, not sufficiently high classification accuracies. An important factor here is the noisiness of the large-scale inventory data. The forest area based on which an owner or manager makes a harvest decision might look different from what is suggested by the inventory plot. With the exception of selection (plenter) felling, a harvest decision is usually made for an area (e.g., a forest stand or a section of a forest stand). In the case of the German NFI, only one or two inventory plots might be found per included decision area. However, a higher number of plots would be needed in order to produce a representative picture of that area [67].
The most accurate way of designing BAU wood supply scenarios would be to generate models with high classification accuracies. However, if this is not feasible (e.g., due to data noise) an alternative is to design models that deliver accurate harvest proportions throughout subsets of the dataset. The stratified prediction methods tested in the scope of this research can be used to optimize the model towards this goal.
In this research, CART was used for the stratification process. However, comparable stratifying machine learning algorithms such as C4.5 [68] might also be applicable for this purpose. The CART algorithm divided NFI plots of the training set into strata, and delivered common harvest probabilities for each stratum. The fitted CART could then be applied to the test set. The algorithm assigned the test set NFI plots to the respective stratum where corresponding probabilities (derived from the training set) could be used to select an adequate proportion of the test set plots as harvested. Thus, instead of assuming that plots with harvest probabilities of 20% were not harvested at all (as done by the cut-off method), it was possible to select 20% of these plots as harvested. The question is now: which of the plots should be selected as harvested?
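The difference between the two decision rules can be made concrete with a small sketch. The stratum names, sizes, and probabilities are hypothetical, and rounding the product to the nearest integer is an assumption about how the number of harvested plots is fixed.

```python
def harvested_by_cutoff(n_plots, harvest_prob, cutoff=0.5):
    """Overall cut-off rule: the whole stratum is harvested or not harvested."""
    return n_plots if harvest_prob >= cutoff else 0


def harvested_by_quota(n_plots, harvest_prob):
    """Stratified rule: harvest a share of plots equal to the stratum probability."""
    return round(n_plots * harvest_prob)


# Hypothetical strata: name -> (number of test set plots, learned harvest probability)
strata = {"low": (50, 0.2), "high": (40, 0.75)}

for name, (n_plots, prob) in strata.items():
    print(name, harvested_by_cutoff(n_plots, prob), harvested_by_quota(n_plots, prob))
```

With a cut-off of 0.5, the low-probability stratum contributes no harvested plots and the high-probability stratum contributes all of its plots, whereas the quota rule marks 10 of 50 and 30 of 40 plots as harvested.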
In a first approach, it was simply assumed that no further factors, but rather only noise, influenced the harvest decision at the plot level, and plots were selected randomly as harvested according to the specific harvest probability of a stratum. This approach resulted in good predictions of harvest shares and harvested volume across the various subsets of the test set. An important drawback was that the classification accuracy, precision, specificity, and Cohen’s kappa values dropped considerably when compared to the cut-off approach. Furthermore, it was necessary to repeat the procedure and average the outcomes of the repetitions.
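A minimal sketch of this random variant is given below. The plot identifiers, predicted volumes, number of repetitions, and seed are hypothetical and chosen only for illustration.

```python
import random
from statistics import fmean


def select_random(plot_ids, harvest_prob, rng):
    """Randomly mark a share of plots equal to the stratum harvest probability."""
    quota = round(len(plot_ids) * harvest_prob)
    return set(rng.sample(plot_ids, quota))


def mean_harvested_volume(volumes, harvest_prob, n_repeats=100, seed=0):
    """Average the stratum's total harvested volume over repeated random selections."""
    rng = random.Random(seed)
    plot_ids = list(volumes)
    totals = []
    for _ in range(n_repeats):
        chosen = select_random(plot_ids, harvest_prob, rng)
        totals.append(sum(volumes[pid] for pid in chosen))
    return fmean(totals)


# Hypothetical stratum: plot id -> volume predicted by the downstream model
volumes = {f"plot{i}": 50.0 + 10.0 * i for i in range(10)}
average_total = mean_harvested_volume(volumes, harvest_prob=0.2)
```

Each repetition marks exactly two of the ten plots as harvested, so the harvest share of the stratum is met in every draw; only the plots chosen, and therefore the total volume, vary between repetitions.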
In a second approach, an RF algorithm was used to inform the selection of harvested plots within the various strata. In this way, each NFI plot within the test set obtained an additional individual harvest probability. Again, shares of plots corresponding to the harvest probabilities of specific strata were selected. However, this time, the plots with the highest individual harvest probabilities were selected first, and the selection stopped as soon as the respective harvest share of the stratum was reached. Harvest share predictions were comparable to the results of the random approach. However, for some subsets (e.g., plots above 700 m a.s.l., plots above 50-cm average DBH, and small private forest plots), the random selection delivered better results. At the same time, classification accuracy, precision, specificity, and kappa values increased considerably, but still remained below the results of the logistic models.
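The ranked selection can be sketched as follows. The individual probabilities stand in for RF output and are invented for illustration, as are the function and variable names.

```python
def select_ranked(plot_probs, stratum_prob):
    """Pick the plots with the highest individual probabilities until the
    harvest share of the stratum is reached."""
    quota = round(len(plot_probs) * stratum_prob)
    ranked = sorted(plot_probs, key=plot_probs.get, reverse=True)
    return set(ranked[:quota])


# Hypothetical stratum with a learned harvest probability of 0.4:
# plot id -> individual harvest probability (e.g., from an RF model)
plot_probs = {"p1": 0.1, "p2": 0.6, "p3": 0.3, "p4": 0.9, "p5": 0.2}
harvested = select_ranked(plot_probs, stratum_prob=0.4)
```

Because the quota is derived from the stratum, the number of harvested plots is the same as under random selection; only the identity of the selected plots changes.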
With reference to factors impacting harvest decisions, the outcome of this research mostly confirmed the results of previous research. Harvest condition and slope [21,38,50] were found to have impacts on the general decision about whether or not a plot is harvested. These factors were not found to influence the harvest intensity once this decision was made. However, it is possible that plots with more difficult harvest conditions are harvested less frequently, but with higher intensity, and that this pattern is blurred due to the given period of 10 years. The factor ownership type [17,18,37] was also found to have important impacts. An interesting finding was that, for harvest occurrence, differences were found between community, state, and large private forests (as one group), medium private forests, and small private forests (with decreasing harvest probability across the groups listed from first to last, see Table A4). However, only the large private forest was found to be harvested at higher intensities. No differences in harvest intensities could be found between the other ownership types.
A restriction of the research is that the results remain on a relatively broad level as concerns timber species and dimensions. Instead of referring to individual timber species, plot level stand types grouped into five broad groups were used. Included dimension information referred to average values for the NFI plots. However, results of this research could be used to integrate BAU scenarios into AFGWSMs, where results could be further broken down to individual tree levels and even the computation of timber assortments would be possible [43,69].
The available information for NFI plots was dominated by stand or site-specific attributes. With regard to the human factor, only rough information on ownership type and property sizes was available. The objectives and behavior of NIPF owners are considered to be heterogeneous, dynamic, and complex [29,70,71,72,73,74]. However, given this complexity, and the fact that NFI plots are not representative of decision areas, it is unclear to what extent more ownership-specific information would help to improve the accuracies of the models.
Within the reference period, business-as-usual procedures were disturbed by storms and a drought period. Here, it should be asked how far and to what degree of severity natural disturbances should be considered as part of usual business. For instance, bark beetle impacts related to elevation and timber dimensions as described by Sterba et al. [33] might still be considered normal. In contrast, Seidl et al. [75] dedicated a study to the projection of bark beetle disturbances in the context of climate change impacts and adaptive management strategies. This can be considered as diverging from business-as-usual. Within the German NFI, the specific timing of and reason for tree harvests are not recorded. Furthermore, the occurrence of salvage fellings is likely to impact other harvest decisions. Thus, in any case, it would be difficult to separate out harvest occurrences caused by disturbances.
Plots of the German NFI are not representative of the stands or decision areas in which they are located. The non-occurrence of harvest interventions on an NFI plot does not imply that the entire stand was not harvested and vice versa. Thus, it has to be kept in mind that a BAU scenario generated with the presented method is, first of all, a projection of the NFI. However, the NFI is considered a random sampling (by neglecting the systematic arrangement of the tracts) [46] that delivers representative and valid data on forest conditions on a larger scale [8].
CART turned out to be a valuable tool for the stratified prediction. An additional advantage is that the algorithm resulted in a classification tree that can be easily interpreted. In this context, Domingos [57] (p. 2) argues: “[…] it is not enough for a learned model to be accurate; it also needs to be understood by its human users, if they are to trust it and deem it acceptable”. However, a disadvantage of CART is its instability [57]. Setting the minimum number of observations per terminal leaf to 100 and pruning the tree helped to gain more stability and avoid over-fitting. Nevertheless, smaller changes in the input data could still result in different tree structures. However, alternative algorithms that could be used for stratification such as C4.5 [68] have similar issues [55,57], and more stable methods such as RF are unsuitable for the stratification process.
5. Conclusions
It can be summarized that the stratified method generated good results. Among the group of tested prediction methods and under the given data conditions and restrictions, it can be considered as the most suitable for generating BAU timber supply scenarios. The direct prediction of harvested volumes with a linear regression trained on both harvested and not-harvested NFI plots is not an option, since this method resulted in harvesting patterns that were clearly different from BAU. Using logistic regression with an overall cut-off benchmark resulted in strong harvested volume overestimations for some characteristics, and underestimations for others. A trade-off of the stratified method is the decrease in classification accuracy and related quality measures. However, when the selection of plots to be harvested was not generated randomly, but rather informed by RF, this impairment remained within an acceptable range. Furthermore, in the scope of large-scale timber supply projections, it might be considered as more important to find the right number of harvested plots than to identify those plots that were actually harvested.
Results of this research must be interpreted in respect of the regional and periodical background conditions, the data quality of the German NFI, and the availability of attributes that could be used and linked to the NFI data. However, considering that large-scale forest inventories are often rather noisy, the presented stratified methods could also be helpful for generating useful BAU forest development and wood supply scenarios in other regions.
The stratified method should be tested and challenged when applied to other time periods, regions, and inventory variations. A further challenge is to combine this method with AFGWSMs to generate BAU scenarios for forest development and timber supply projections beyond one decade. In this context, another challenge is to learn and predict BAU harvesting choices for individual trees. The results of this research showed that BAU harvesting patterns could be retraced well without having more detailed background information on individual harvest decisions. However, the parameterized BAU scenario could also be used as a baseline for the design of alternative scenarios. For this purpose, underlying social and economic grounds for harvest choices would have to be studied specifically in relation to inventory-based scenario design. More research explicitly dedicated to this task is needed.