Site Index Models for Main Forest-Forming Tree Species in Poland

: Site index is the most commonly used measure of potential site productivity, providing key information for forest management practices. It is determined using species-speciﬁc site index models that take into account climatic and edaphic factors. To reliably estimate the site index, appropriate models are necessary. In Poland, however, outdated guidelines, i.e., yield tables, are used to determine site classes, which result in the inappropriate estimation of height growth and increments of stands. Therefore, the aim of this study is to develop new site index models for the main forest forming tree species in Poland, in a total of eight species. For the development of site index models, we used growth trajectories of 3052 sample trees, representing the whole range of geographic locations and site conditions. Five dynamic models were selected and parametrized to develop the site index models. The models were evaluated using quantitative measures of goodness of ﬁt (MAE, R2, and AIC), the analysis of residuals, and the assessment of how the model reﬂects the biological phenomena of height growth. Results showed that depending on the species, di ﬀ erent models have the highest predictive ability. There are signiﬁcant di ﬀ erences in results using traditional yield tables and developed site index models. For most of the species, the largest di ﬀ erences characterized either the youngest or the oldest age classes. These di ﬀ erences can be attributed to the changes in growth conditions from the time when yield tables were developed. Growth dynamics of forest stands may also show spatial variability, thus, in future research additional site variables and, regional variability should be taken into account. black alder in the age classes. The yield tables showed that black alder’s stands reach nearly half of the total height in the ﬁrst 15 years of life, after which their growth slows down, while the site index model developed in this study shows a fast growth rate up to 40 years. Large di ﬀ erences in growth trajectories were observed for Norway spruce and silver ﬁr. For the former, the results show that the height growth of younger stands is considerably faster than what is found according to yield tables. For the latter, there were di ﬀ erences particularly in the oldest age classes, where the obtained model, in contrast, to yield tables, did not show the strong inhibition of height increase. Di ﬀ erences in height growth trajectories for European larch were rather small, with the larger height increment from developed models than in yield tables for the oldest stands. Growth curves from yield tables and the developed models provided similar trajectories for oak and silver


Introduction
Information concerning site productivity remains a fundamental variable in forestry [1][2][3][4]. The unbiased estimation of site potential is essential for forest management because it is the main determinant of species-specific decisions, connected with the selection of silvicultural treatments, species composition, and the determination of the allowable cut and rotation periods [3,5,6]. Site productivity is one area of strategic information in forest management, which determines the economic and environmental effects of forest management [6,7] and allows the forecasting of stand growth [2]. The most frequently used and commonly accepted measure of potential site productivity is the site index, determined using species-specific site index models based on the height of the stand at a certain age [1,[8][9][10]. Site index models are the main functions used in the empirical models of tree stand growth [11][12][13]. GADA allow for the development of local site-specific site index models, which properly describe local height growth trajectories and enable the reliable assessment of site productivity.
Site index models constitute an important decision-making tool in forest management. As a result of the inadequacy of existing yield tables, there is a high demand from the State Forests and Forest Management Bureau in Poland to develop new site index and growth models for the main forest-forming tree species. In response to this demand, in the years of 2015-2018, the research project titled "actual and potential productivity of the main forest-forming tree species in Poland," was implemented, financed by State Forests in Poland. As part of the project, a grid of over 600 permanent research plots was established, a representative for the main forest-forming tree species growing in Polish forests. Trees for stem analysis were located close to the research plots, allowing for retrospective growth reconstruction when felled. In addition, in the frame of the project, over 2400 growth trajectories were gathered from different research projects implemented in Poland in recent years. The collected database constitutes the empirical material used in the presented research. The aim of this research is to develop new dynamic site index models for the main forest-forming tree species in Poland.

Materials and Methods
The methodology used in this study involved the following steps: (i) the collection of the research material on trees from the projects, (ii) stem analysis, (iii) the selection of models of height growth, (iv) the estimation of models parameters, and (v) evaluation of the models.

Collection of the Research Material
The research material consists of the height growth trajectories of 3052 dominant and codominant trees of the eight main forest-forming tree species in Poland (Table 1), which were collected from the whole territory of Poland ( Figure 1). Sampled stands are mainly pure and even-aged. In part sampled stands were artificially planted, however, those are not typical forest plantations and the methods of forest management used in these forests, with the exception of Scots pine, are mainly close to continuous cover forestry. The research uses data from two sources. About 20% of the data were collected from the project titled "actual and potential site productivity in Poland for main forest-forming tree species," realized in 2015-2017 and financed by the State Forests of Poland. The project established 612 sample plots for eight of the main forest-forming tree species, where stands were selected for each species, representing the whole range of geographic distributions and site conditions. The material collected thus represented the whole variability of Poland's growth conditions for the forest-forming tree species analyzed. The second data source was the database of individual tree growth trajectories collected in Poland over the last few decades as part of research projects conducted by the main forest research institutions in Poland, namely, the Forest Research Institute in Sękocin Stary, the Faculty of Forestry, Warsaw University of Life Sciences (SGGW), and the Faculty of Forestry, University of Agriculture, Krakow. Both data sets used in the study meet the standards required for site index models development; the sample trees for the height growth reconstruction have been dominant without visible damage and were collected in pure stands of target species.

Stem Analysis
Stem analysis used the material of the 3052 trees to reconstruct the past height growth of the trees. The trees for stem analysis were selected in the direct neighborhood of the permanent sample plots established within the research projects. For each plot, one to three trees with a diameter and height close to the top height (TH) defined as the average height of the 100 thickest trees per 1 ha were selected and cut for stem analysis (SA). To reconstruct height growth, only trees with typically developed crowns with no visible damage or growth anomalies were chosen. After cutting, the total lengths of the trees were measured and rings were collected for stem analysis. Rings were taken from the base of the felled tree at a height of 0.5 m, at the diameter at breast height (DBH), and then, for trees over 15 m high, subsequent discs were collected at heights of 2 m, 4 m, and then every 2 m up to the top of the tree. For trees less than 15 m high, subsequent discs were collected at heights of 1.5, 2.5, 3.5, and every 1 m further up the tree. The course of growth of the individual tree was reconstructed using the height of the discs and the number of annual rings. The growth of the trees was also visually examined for patterns of suppression and release as well as growth anomalies. As cross-section lengths do not coincide with periodic height growth, the height-age data from the stem analysis were corrected using Carmean's method [57]. Table 1. The number and basic characteristics of growth series for the analyzed species, obtained from stem analysis and used for the development of the site index models.

Species
Number

Stem Analysis
Stem analysis used the material of the 3052 trees to reconstruct the past height growth of the trees. The trees for stem analysis were selected in the direct neighborhood of the permanent sample plots established within the research projects. For each plot, one to three trees with a diameter and height close to the top height (TH) defined as the average height of the 100 thickest trees per 1 ha
To reliably estimate the site index, a model should have the following desirable attributes [47,50,70]: 1.
Polymorphism, allowing the model to take into account different site-specific growth patterns, depending on site conditions; 2.
Variable asymptotes for different sites; 3.
Equality of the site index and the height at base age; 4.
The possibility of using the same function as both a height growth model and a site index model.
The GADA method, which enables one to use one or two site-specific parameters representing the site's quality, provides dynamic site index models that meet these criteria [47,59].
In the present study, we selected five dynamic models with symbols from M1 to M5 for further analysis ( Table 2). All of these models meet the most criteria above and have been successfully applied to site index modeling, which makes them potential candidates for reliable site index models. Model M1, derived by Cieszewski [48], was applied in site index modeling in Denmark [71] and Poland [69,72]. Model M2 from the base function of Chapman-Richards, was developed by Krumland and Eng [73] and was used as one of the candidate functions for development site index models for Norway spruce and Scots pine by Sharma [60]. GADA formulation of the model M3 was developed by Cieszewski [74] based on the Bertalanffy-Richards function. Models M4 [60] and model M5 are the growth models that have frequently been tested in many studies concerning site index modeling [60,61,75]. In the case of the model, M5 GADA is equivalent to ADA [76].
Cieszewski [48] M2 Krumland and Eng [73] M3 Anta et al. [76] a 1 , a 2 , a 3 , a 4 , a 5 are parameters in base models; b 1 , b 2 , b 3 are parameters in dynamic models; S is the parameter related to site; H 0 and H 1 are heights (in m) at age t 0 and t 1 (in years), respectively; X 0 is the solution of X for initial height and age.

Parameter Estimation
What distinguishes the selected equations is that the choice of the base age does not affect the estimates of the model parameters [47,48,50]. Local (site-specific) and global parameters (GPs) of site index models were simultaneously fitted using the iterative nested procedure (INP) that has been described by Cieszewski [77], Sharma [60], and Socha [69].
The INP starts by calculating the GPs of Function (1), using the height measured in base age for each of the growth trajectories of the individual trees. Next, for each growth trajectory, the GPs are assumed as constants, and the site-specific parameter (SI) is estimated. The SI values fitted for a given tree then become the "observed" values and the GPs are re-estimated. This procedure was repeated here until the GPs were stabilized.
To fit the models, we wrote a devoted R script [77]. As the data represent a time series, the models used a first-order autocorrelation error structure. We did so to deal with possibly autocorrelated residuals from the models, a problem often encountered in models fitted to data representing successive (and thus often correlated) observations in time, especially when all the possible time intervals are used. Such autocorrelation can adversely affect the estimates of model parameters and their standard errors [78].
To compare the fitted models with the corresponding growth trajectories from yield tables [26,28], we plotted height growth curves for a range of site indices.

Model Evaluation
To select the best models, we used three tools, namely, (i) graphical methods to compare residuals with the corresponding fitted values, (ii) statistics of the goodness-of-fit, and (iii) the evaluation of a model's suitability to represent biological realism [70]. Such a procedure has been frequently used to compare and evaluate models [45,55,60,61,66]. To assess biological realism, we analyzed the trajectories and age of culmination of height increment. According to the general rules of growth [30], we assumed that the height increment of stands growing on the weakest site conditions should not exceed the increment in more fertile sites. Besides, we assumed that as the site productivity decreases, the age of culmination of height increment should increase.
In the first step, to evaluate the quality of the fit, the procedure analyzes three statistics, namely, the mean absolute error (MAE, Equation (1)), coefficient of determination (R 2 adj , Equation (2)), and the Akaike information criterion (AIC, Equation (3)): where h i h i ,ĥ i are the measured, mean, and predicted values of the heights, respectively, n is the total number of observations used to fit the model, p is the number of parameters in the model, and L is the residual sum of squares.
In the second step, we analyzed plots of model residuals, which allow for assessing the model's fit across the range of the data. In the last step of the procedure, we analyzed whether each model had the desired biological properties.

Results
The five individual growth functions for the eight forest-forming species yielded 40 site index models (Table 3). Due to the properties of the differential equations, they were also height growth models. Table 3 summarizes parameter estimates and goodness-of-fit statistics of these candidate models. Figure 2 shows the model fit to growth series and Figure 3 shows the residuals against the predicted values. As emphasized earlier, to select the best model, we cannot use just one particular statistical measure, but instead, have to make a compromise between the statistical aspects and biological realism of the model. In the case of models with similar statistics of the goodness-of-fit, the model, which meets the criteria of biological realism was recommended.     Hence, the below analyses used quantitative measures of the goodness-of-fit (MAE, R 2 , and AIC), the analysis of residuals, and the evaluation of how the model reflects the biological phenomena of height growth as criteria.
The evaluation of fit statistics and the graphical analysis for model M1 indicates that this function had the best predictive ability as compared to the other models studied for Scots pine ( Figure  2) and most of the analyzed species (Table 3). The European beech models, built based on functions M1, M3, and M5, had comparable fitting statistics, but after analysis of the residuals of models, we found systematic errors of model M3 in the dependence on stand age. Therefore models M1 and M5 ( Figure 2) proved to be best suited to the empirical data among all of the other analyzed models. For Hence, the below analyses used quantitative measures of the goodness-of-fit (MAE, R 2 , and AIC), the analysis of residuals, and the evaluation of how the model reflects the biological phenomena of height growth as criteria.
The evaluation of fit statistics and the graphical analysis for model M1 indicates that this function had the best predictive ability as compared to the other models studied for Scots pine ( Figure 2) and most of the analyzed species (Table 3). The European beech models, built based on functions M1, M3, and M5, had comparable fitting statistics, but after analysis of the residuals of models, we found systematic errors of model M3 in the dependence on stand age. Therefore models M1 and M5 (Figure 2) proved to be best suited to the empirical data among all of the other analyzed models. For black alder, model M1 had the best predictive ability in terms of the results of residual distribution and biological realism compared to the other models. Models M1 and M4, tested for European larch, showed a good fit for statistics with small differences between models, however, considering the plot of residuals, we propose the application of model M4 for site index determination for this species. Among all the equations tested for silver birch, models M1 M4 and M5 were characterized by good fitting statistics but after a detailed analysis of growth trajectories and plots of residuals, we recommend model M1 here. For oak, we recommend model M5, which provided the best results for the distribution of residuals. Goodness-of-fit statistics showed that model M1 for Norway spruce had the best predictive ability as compared to model M3, however, based on analysis of residuals we recommend model M3. For silver fir, the model built based on model M5 had the best fitting statistics as compared to the other models tested for silver fir, and was also well-fitted across the range of data.
Residual plots were used as one of the criteria of the model evaluation. For the models chosen, the most appropriate plots of residuals are presented in Figure 3. None of the models showed any apparent trends and structures across the range of predicted values, which indicates that the site index curves for the recommended models well represent the observed height age patterns across the range of site conditions.
High values of the adjusted (by the number of parameters in the model) coefficients of determination (R 2 adj ) indicate that selected models accounted for at least 97.5% of the total variation in the data (Table 3), which represents a very good fit in every single case. Here, mean absolute errors ranged between 0.60 m (Scots pine) and 0.91 m (silver fir).
Growth curves obtained from the developed site index models were compared directly with the growth curves from the yield tables ( Figure 4). For Scots pine, the curves were similar, with the largest differences observed in the old and young age classes, especially for less productive sites. The models differed more significantly for European beech and black alder in the youngest age classes. The yield tables showed that black alder's stands reach nearly half of the total height in the first 15 years of life, after which their growth slows down, while the site index model developed in this study shows a fast growth rate up to 40 years. Large differences in growth trajectories were observed for Norway spruce and silver fir. For the former, the results show that the height growth of younger stands is considerably faster than what is found according to yield tables. For the latter, there were differences particularly in the oldest age classes, where the obtained model, in contrast, to yield tables, did not show the strong inhibition of height increase. Differences in height growth trajectories for European larch were rather small, with the larger height increment from developed models than in yield tables for the oldest stands. Growth curves from yield tables and the developed models provided similar trajectories for oak and silver birch.
considerably faster than what is found according to yield tables. For the latter, there were differences particularly in the oldest age classes, where the obtained model, in contrast, to yield tables, did not show the strong inhibition of height increase. Differences in height growth trajectories for European larch were rather small, with the larger height increment from developed models than in yield tables for the oldest stands. Growth curves from yield tables and the developed models provided similar trajectories for oak and silver birch.

Discussion
We developed site index models using dynamic functions derived from the GADA method, which meet the crucial requirement of polymorphism and variable asymptotes. This enables the good representation of empirical data, from both statistical and biological perspectives. To assess biological behavior, we analyzed height increment curves. We found the higher height increment in weaker site conditions, which was observed in the case of some models in the stands of older age classes, as incompatible with generally known growth rules [30]. We also assumed that as the site productivity decreases, the age of culmination of height increment should increase.
All the selected models had the desired properties of a good site index model. Compared to the methods still used in Polish forestry practice, such as yield tables, our models provide better predictive ability, which was also reported by many other studies that applied the GADA

Discussion
We developed site index models using dynamic functions derived from the GADA method, which meet the crucial requirement of polymorphism and variable asymptotes. This enables the good representation of empirical data, from both statistical and biological perspectives. To assess biological behavior, we analyzed height increment curves. We found the higher height increment in weaker site conditions, which was observed in the case of some models in the stands of older age classes, as incompatible with generally known growth rules [30]. We also assumed that as the site productivity decreases, the age of culmination of height increment should increase.
All the selected models had the desired properties of a good site index model. Compared to the methods still used in Polish forestry practice, such as yield tables, our models provide better predictive ability, which was also reported by many other studies that applied the GADA methodology, for instance [47].
It is important to note that yield tables were developed at the end of the nineteenth century and the beginning of the twentieth century. Each yield table value represents information obtained on the basis of only one age and site class, while each point of the curve of the developed models is the result of information from the entire range of empirical data. The purpose of modeling is to generalize the information contained in the measurement data and to compensate for random errors. The use of mathematical models developed by the simultaneous estimation of their parameters on the basis of all observations from all height growth trajectories collected for individual species is, statistically speaking, much more effective than tables containing the results of the individual adjustment of growth curves for individual site classes [79]. These attributes are one of the probable reasons for the differences between site index curves from yield tables and the newly developed dynamic site index curves.
The height growth trajectories obtained by the dynamic site index models have different courses from those of the yield tables. For most species, the two types of models showed the largest differences for either the youngest or the oldest age classes. For Scots pine, we observed some differences in both old and young age classes, especially for less productive sites (Figure 4), most likely due to the fact that yield tables poorly represent data within this range [80].
For European beech, the largest differences were observed in the youngest age classes, which is likely a consequence of the use of the longer regeneration periods for natural stands, which have been used in the past. A common current approach in the practice of beech forest management uses shorter regeneration periods, which could explain the observed changes.
The two types of models provided different results for black alder as well. The site index curves suggested a slower height growth speed in the youngest age classes than that resulting from the yield tables. This overestimation by yield tables may result from their use of data for coppice alder stands, which most probably present very intensive growth in the initial period of growth.
The differences we observed for European larch were similar to those observed for Scots pine, with less productive sites being poorly represented in the data and the oldest age classes being either underestimated or overestimated by yield tables.
Site index curves from yield tables and the developed model's curves for oak and silver birch have almost the same course, demonstrating the possibility of the use of yield tables for site index estimation for this species. The large differences observed for Norway spruce probably have resulted from using different regeneration methods in the past (such as the shelterwood system) and a long regeneration period. The guidelines for the forest silviculture of this species have changed quite significantly during the twentieth century. In particular, its regeneration period was shortened, which is a key factor affecting the growth dynamics of this species, particularly for trees up to 60-80 years in age. In comparison to the height growth trajectories obtained from the developed model for Norway spruce, the site index curves from yield tables showed large errors in determining site productivity.
Analyzing growth trends for fir in the oldest age classes showed significant differences. The comparison of the curves suggests that, currently, stands of the oldest age classes do not show as significant slowdown as yield tables may suggest, which is consistent with the biology of this species. The current knowledge indicates that, in favorable conditions, silver firs can grow for 200 years [81].
In general, the results indicate that for most forest-forming tree species, the currently used yield tables provide different height growth curves than those actually observed. On the one hand, this observation might be due to the small amount of data used to construct yield tables for the youngest and oldest age classes, as well as the less productive sites, which are relatively rare. On the other hand, site indices for young stands are commonly considered to be very variable, because early growth is irregular and determined not only by site conditions but also by other factors [82]. Two of the key roles in this variation are the regeneration period and the regeneration method, the former likely being a reason for the differences between the growth curves fitted for Norway spruce and European beech, while the latter for black alder.
Another important factor affecting growth curves is the selection of trees. It is known that not all trees in the top layer of the stand belong to this layer throughout their entire life. As such a change in the canopy position of a tree is quite probable [83], models based on the data from stem analysis can sometimes overestimate growth rates [10]. We used the verification procedure to assess the growth curve of the individual growth series against the background of more trees living in the given area. This allowed us to limit the probability of choosing trees whose growth curves significantly differ from the growth curves of other trees growing in similar site conditions.
Most likely, the differences in growth curves obtained using dynamic site index modeling and yield tables, the latter obtained using data older than 100 years, were due to the changes in growth conditions over this time. As shown by the latest research, these were changes mainly in nitrogen deposition and the climatic parameters [31,84], as well as an increase in carbon dioxide concentration [85]. Changing site conditions over time [31] undoubtedly constitutes a disadvantage of the site index concept [3], as they lead to a negative correlation between stand age and the site index [21,60,69]. This problem is sometimes solved by the inclusion of a stand age as one of the variables when modeling site productivity [33,86].
Site index, which is traditionally used to determine current growth conditions, has increasingly used in new research areas, such as mapping the site index from remote sensing data, simulating productivity changes, and analyzing the CO 2 sequestration of forest ecosystems [25,87]. Such analyses and tools provide valuable data to develop improved management practices in terms of mitigation and adaptation strategies for climate change. The appropriate prediction of site productivity is also key to developing effective strategies of CO 2 sequestration [25]. Such activities should form the basis of modern forest management, which would take into account the need to adapt forest ecosystems to the effects of climate change. In this research, we have observed differences in the growth dynamics and site requirements of the main forest-forming species. This knowledge may be key in many areas of forest management, especially in the context of maintaining the sustainability of forests in the face of climate and environmental changes.
What is important from the perspective of forest management is that Poland is geologically and climatically diverse, particularly along the northeastern and southwestern gradients. Thus, the growth dynamics of forest stands may show spatial variability. If this variability is significant, future research should focus on dynamic site index models that take into account additional site variables. This thought is not new, however, as various authors have pointed out that when analyzing the site index, regional variability should be taken into account [3,56,71]. Therefore, future research should focus on the development of regional site index models, which would help foresters to estimate the potential productivity of forest sites. Although the development of such regional models requires extensive research material, the recent research suggests that for building them, not only the results of stem analysis may be used, but also other data sources. Data from permanent plots that are part of the large-scale National Forest Inventory and data from aerial laser scanning have also successfully been used for site index modeling [60,69].

Conclusions
We have developed new site index models for eight of the main forest-forming tree species in Poland, namely Scots pine, European beech, black alder, European larch, silver birch, oak, Norway spruce, and silver fir, using over 3000 data series describing the trajectories of tree height growth. Selected for each species, the best dynamic forms of the growth function allow for a good fitting of the models to the empirical data, as evidenced by the explanation of about 98% variation in height growth, which was crucial for the elimination of systematic errors. The main purpose of the developed models is the determination of site productivity. Thanks to the properties of the functions used for the development of the models, however, they can also be used as models describing the height growth of the stands. This research is the response to demands from the State Forests in Poland, Forest Management Bureau and other stakeholders to develop new yield tables (growth models) for the main forest-forming tree species in Poland. The presented research is the first stage in the construction of new empirical growth models/yield tables. The site index models presented in this research will be one of the functions of the developed growth models.