Modeling Tree Species Count Data in the Understory and Canopy Layer of Two Mixed Old-Growth Forests in the Dinaric Region

: The distribution of tree species has traditionally been analyzed based on tree diameter (DBH) as a continuous variable. However, this approach does not usually provide information on how species are distributed across the area of interest. In this study, an inverse approach was applied to investigate tree distribution patterns in two Dinaric old-growth forest stands composed primarily of European beech, silver ﬁr, and Norway spruce. Speciﬁcally, the variance-to-mean relationship of tree counts based on 80 plots (40 in each old-growth stand) were evaluated by using a dispersion index. Understory trees exhibited clumped and random patterns, whereas canopy trees were mostly distributed in a random manner. A regular pattern was only determined for beech and all trees in the canopy layer (two cases out of ten). The observed discrete variables were further compared with three theoretical distributions. It was found that a Poisson, binomial, and negative binomial model best ﬁtted the observed count data, which, based on the dispersion index, exhibited a random, regular, and clumped pattern, respectively. The frequency of plots with low species presence and complete absence of species was also revealed. Consequently, the analysis and modeling of tree counts can be of practical use for species conservation purposes.


Introduction
Modeling the distribution of tree species in mixed forests has been an important task in forest ecology in the last two decades [1][2][3][4]. For this purpose, researchers usually tend to gather both discrete (count) and continuous data on variables of interest in forest ecosystems. However, in many instances, continuous data are limited or not available at all due to financial constraints or because standard inventory procedures encompass only count data. In addition, the focus of the research might solely be species richness for which only count data are necessary. Count data that originate from different fields of study typically follow a Poisson, negative binomial, or in some cases, binomial distribution [5]. For instance, these distributions have often been used to analyze and model count data in scientific fields such as parasitology [6], veterinary medicine [7], ornithology [8], and estimation of ore reserves [9]. However, their application in the analysis of the distribution of tree species is not very common. Instead, the distribution of tree species in forest research has traditionally been studied by measuring the tree diameter at breast height (DBH), that is, 1.30 m above the ground. Consequently, the conventional approach to analyzing tree species distributions assumes the construction of frequency distributions with DBH classes that usually range from 4(5) to 10 cm. Then, a proper model can be fitted to such grouped data or to the raw data [10]. Such models based on DBH as a continuous variable are still valuable in forest research as they provide previous studies on this topic in beech-coniferous old-growth forests (e.g., [4]), the null hypothesis in this study was formulated. Under the null hypothesis the clumped pattern is expected to be found in the understory and the random pattern in the canopy layer. The alternative hypothesis assumes that the most of the examined tree species, and all tree species combined, will deviate significantly from the clumped pattern in the understory, and likewise, from the random pattern in the canopy layer.

Study Area
The study was carried out in the Janj (44 • 08 N, 17 • 16 E) and Lom (44 • 27 N, 16 • 27 E) old-growth forests in Bosnia and Herzegovina. Both forests are classified as forest association Piceo-Abieti-Fagetum dinaricum, and include a mixture of European beech (Fagus sylvatica L.), silver fir (Abies alba Mill.), and Norway spruce (Picea abies (L.) H. Karst) with a negligible share of other species [30]. These forests are located in the central part of the Dinaric Mountains in south-east Europe, approximately 90 km from the Adriatic Sea. The Norway spruce in this region is considered to be an endangered species due to climate warming [31], while the silver fir seems to be less vulnerable [30]. The investigated old-growth forests are situated in an altitudinal belt between 1260 and 1400 m above sea level. The mean annual temperature in the study area is around 5 • C, and the annual precipitation ranges between 1400 and 1900 mm. The bedrock is composed of dolomite and limestone in Janj and Lom, respectively, while brown soils prevail in both forests. Considering the high levels of live and dead wood [23,30] and the long history of forest protection [32,33], the core areas of Janj and Lom rank among the best-preserved old-growth forests in Europe. In addition, the core areas of Janj (57.2 ha) and Lom (55.8 ha) are surrounded by relatively large buffer zones (237.8 ha and 297.8 ha, respectively) in which only low-intensity salvage cutting has been performed.

Field Measurements
A regular 100 m grid with 40 sampling points was superimposed on the core areas of Janj (summer 2011) and Lom (summer 2010), resulting in 80 plots in total. This square lattice arrangement of sample plots is a conventional forest inventory sampling procedure [14]. Each grid intersection defined the center of a circular sampling plot (radius = 12 m, area = 452 m 2 ). In each plot, all live trees with DBH >7.5 cm were tallied and sorted by species. In the understory of both old-growth forests, a total of 1090 trees were counted including 880 beeches, 125 firs, and 85 spruces, whereas in the canopy layer of both forests, a total of 605 trees were counted, which included 191 beeches, 252 firs, and 162 spruces. In addition, the DBH of all trees exceeding the inventory threshold were measured in two perpendicular directions to the nearest 0.1 cm. In the subsequent analysis, only a single DBH value was used for each tree obtained as a mean of the two perpendicular measurements.

Data Analysis
Contrary to commonly used frequency distributions based on DBH as a continuous variable [34,35], in this study the focus was on the discrete variable, that is, the tree count data. The frequency distributions of count data are based on the number of sample plots, where each plot contains 0, 1, 2, . . . , n trees of a particular species, or of all species combined. In this study, the discrete (count) data were divided into understory (≤27.5 cm DBH) and canopy trees (>27.5 cm DBH). The ecological rationale for this division is that live trees with a DBH above 27.5 cm represent "definitive" gap-fillers in Dinaric old-growth forests [36]. Another reason for using 27.5 cm as a dividing value is of a mathematical nature. Namely, DBH categories that are too narrow (e.g., 5 or 10 cm wide) often result in zero values, or such a low number of individuals per category/class that it would prevent the proper application of the chi-square test. It is well known that this test requires at least five individuals per class or category [37]. Thus, the two broader DBH categories were used to "capture" enough trees for robust statistical analysis.
Following this division of trees into the understory and canopy layer, in the next step the dispersion index Ic was applied to quadrat (plot) counts per individual species, for conifers combined, and for all trees combined (all species). This index is also called the variance-to-mean ratio as it is based on the relationship of the sample mean to the sample variance [38], and its computation was conducted at the stand level [9]. Theoretically, if index values are equal to 1, then the tree count data are randomly distributed. However, in forest ecosystems it is a rare phenomenon for the mean and variance to be absolutely equal, so small deviations from 1 are still "allowed" for tree count data to be classified as random. Specifically, the Ic index is based on the Poisson distribution [11]. Thus, for statistical inferences about significant deviations from 1 (randomness), confidence envelopes were constructed by using a χ 2 test with n−1 degrees of freedom, where n is the number of quadrats (plots). The testing was set at the p > 0.05 level. Namely, if the value for χ 2 fell within an envelope between the χ 2 tabular values of 0.975 and 0.025 probability levels, then agreement with a random distribution was reached, indicating that the variance virtually equals the mean. Considering the sample size of 40 plots in each studied stand, the count data in this study were classified as random when their computed Ic values fell between 0.68 and 1.42. The computed index values above and below this range denoted a significant deviation from randomness. Specifically, Ic values smaller than 0.68 represented an evenly-scattered (regular) distribution of individuals in the population, whereas values above 1.42 indicated a clumped (aggregated) pattern.
With respect to the division of trees into understory and canopy layers, the count data were also modeled per individual species, for conifers combined, and for all trees combined (all species). A Poisson distribution was applied under the assumptions that each sample plot has an equal probability of hosting a tree, the occurrence of a tree in a plot is not influenced by other trees, and the mean number of trees per plot remains constant for all sample plots in a given stand [9].
The Poisson distribution describes the probability p of 0, 1, 2, 3, . . . , n trees occurring in any selected sample plot, while the constant e is Euler's number, which equals 2.718282. If the Poisson model was accepted, then a random pattern was confirmed. However, if the Poisson model was rejected, then binomial and negative binomial distributions were employed for regular and clumped patterns, respectively.
A binomial distribution applies if the probability where p(x) is the probability of a sample plot containing a specified number of trees x, and N is the total number of observed trees in sampled plots.
In the negative binomial model, the expected probability of obtaining a given value of a count, r, is given by where p(r) is the probability of getting r individuals in the sample plot, m is the mean, and k is the "shape" parameter. Γ(k) is the gamma function of k, and it equals Γ(k) = [k+1]! All probabilities were obtained by the recurrence relation [39]. If a negative binomial distribution could not be rejected, then it was concluded that the studied tree species exhibits a clumped pattern. If a binomial distribution could not be rejected, then a regular species distribution pattern was confirmed. The goodness-of-fit of all applied models was tested by applying the χ 2 test, that is, by comparing the observed frequencies with the expected ones, but now with n-1-q degrees of freedom, where n is the number of frequency classes after necessary pooling [37], and q is the number of distribution parameters. In each case one degree of freedom was lost due to the overall sum, while additional degrees of freedom were lost depending on the number of distribution parameters.

Results
The values of the Ic index and best fitting models presented in Table 1 and Figures 1 and 2, suggest that in the two old-growth forests, the same distribution pattern, but in some cases different distribution patterns, may characterize a tree species. For instance, understory beech trees (≤27.5 cm DBH) had a clumped pattern in both of the studied old-growth forests. However, in the canopy layer (>27.5 cm DBH) this species exhibited a random pattern in Janj and a regular pattern in Lom. Silver fir trees in both the understory and canopy layer were characterized by a random pattern in both old-growth forests. Similarly, Norway spruce generally exhibited a random pattern, except for its understory trees in Lom, which exhibited a clumped pattern. However, when both conifers were jointly analyzed (fir and spruce as one variable), their distribution in the understory of both old-growth forests was clumped. On the other hand, the joint distribution of conifers in the canopy layer remained random as in the case of single coniferous species. When all trees (all species combined including beech, fir, spruce) were considered, they clearly exhibited a clumped pattern in the understory, while their distribution in the canopy layer varied from random to regular in Janj and Lom, respectively ( Table 1, Figures 1 and 2). In the understory layer in both old-growth forests, Ic index values ranged from 1.06 to 5.45, whereas these values for the canopy layer varied from 0.42 to 1.39. Generally, the Poisson distribution was the best fit to model species count data when the respective Ic index values were between 0.68 and 1.42. For index values below 0.68 and above 1.42, the binomial distribution and negative binomial distribution were found to be the best fitting models, respectively.
With respect to the count distributions of understory trees (Figure 1), the span of the beech counts in plots was much greater compared to that of fir and spruce, while the conifers combined resembled the beech distribution in the Lom old-growth forest. In this stand layer, the negative binomial distribution was the best fit for beech counts, for conifers combined, and for all trees combined in both old-growth forests. The only inconsistency was for spruce trees as the counts for this species in the understory were best modeled with the Poisson distribution in Janj (Figure 1c), while in Lom the negative binomial distribution was the best fit for this species (Figure 1h). Fir understory counts were best fitted with Poisson distributions in both studied old-growth forests. What was also interesting with respect to the understory figures, was that all plots contained beech trees, while the absence (plots with 0 tree count) of fir and spruce ranged from 8 to 21 (20% to 52.5% of plots), respectively. The observed and expected tree counts for individual tree species, for conifers combined, and for all species combined in the understory layer (7.5-27.5 cm diameter at breast height (DBH)) in the Janj (left: a, b, c, d, e) and Lom (right: f, g, h, i, j) old-growth forests. The expected counts were shown based on the models that best fitted the observed counts: negative binomial distribution (NBD) and Poisson distribution. Contrary to the understory layer, the span of count distributions in the canopy was fairly similar for beech, fir, and spruce. Also, in contrast to the understory layer where the negative binomial model prevailed, the trees in the canopy layer followed a Poisson distribution in most cases (Figure 2). In this stand layer, the binomial distribution was found to be the best fitting model only for beech counts and for all trees combined in the Lom old-growth forest (Figure 2f,j, respectively). It is important to emphasize that all fitted models were significant at the 0.05 α-level, except in the case of all trees in Lom (Figure 2j), namely, in the latter case, a Poisson and negative binomial distribution clearly deviated from the observed counts, and a binomial distribution followed it much better. Therefore, the binomial distribution was selected as the best fit. However, it should be noted that fitting all canopy trees in Lom, even with the binomial distribution was also non-significant. Consequently, explaining real (observed) tree count distributions with theoretical models seems to be more challenging in the case of a regular data pattern than in the case of random and clumped patterns.  left: a, b, c, d,  e) and Lom (right: f, g, h, i, j) old-growth forests. The expected counts were based on the models that best fitted the observed counts of canopy trees: Poisson distribution and binomial distribution (BD).

Discussion
This study showed that in mixed old-growth forests composed primarily of beech, fir, and spruce, the aggregated (clumped) pattern mainly characterized understory beech trees with a DBH between 7.5-27.5 cm. In the canopy layer (>27.5 cm DBH), the count data patterns of beech trees were

Discussion
This study showed that in mixed old-growth forests composed primarily of beech, fir, and spruce, the aggregated (clumped) pattern mainly characterized understory beech trees with a DBH between 7.5-27.5 cm. In the canopy layer (>27.5 cm DBH), the count data patterns of beech trees were more variable compared to fir as the counts of the latter in both studied old-growth forests followed a Poisson (random) distribution in the understory as well as in the canopy layer. Spruce clearly exhibited a random pattern in the canopy layer, whereas its count distributions followed a Poisson and negative binomial distribution in the understory of Janj and Lom, respectively. Contrary to single coniferous species, joint conifers (fir plus spruce) had clumped understory patterns that were best modeled with a negative binomial distribution, whereas in the canopy layer, their common pattern was random and followed a Poisson distribution. Interestingly, all trees (all species combined) exhibited patterns identical to those of beech in the understory and canopy. This study partly confirms the results reported by Gu et al. [40], which found that the degree of tree clumping decreases from juvenile to adult stages. In addition, it is important to note that different values of dispersion index for beech and for all trees also indicate different degrees (different intensity) of clumping on one hand, or regularity on the other.
The quadrat count method applied in this study has certain advantages and disadvantages compared to spatially-explicit methods where the distance between trees is used. Therefore, the results of this method should be treated with caution as they partly depend on the size of the sample plots [11]. When the purpose is to compare different studies that applied different sample plot sizes, this issue may be solved by recalculating tree counts to one (equal) sample plot size for all compared sites. However, such an approach is feasible only when the raw data are available or readable from figures; otherwise, comparison of the dispersion index between sites where the size of sample plots is different must be interpreted with caution. The second limitation of the quadrat count method is its inability to detect the spacing distance between trees, which might be useful information when the trees are clumped and/or regularly distributed. Consequently, there is no insight into the scales at which processes such as positive and/or negative autocorrelation between trees occur [9]. For instance, the application of the Ripley K-function and/or g(r) pair correlation function [41] not only provides information about tree patterns (random, clumped, or regular), but also information about facilitation (positive interaction between neighboring tree individuals) and competition (negative interaction between trees), which usually occur at different spatial scales within a forest stand.
Nevertheless, when spatially-explicit data are limited or missing, the quadrat count method seems to be a sound analytical approach to investigate whether the point pattern associated with individual trees in the stand exhibits complete spatial randomness or a clumped or regular pattern. This method also answers the question of how densely the sample plots are populated by constituent tree species, thus, the absence of any species of interest from a large percentage of plots may be an indication that something is wrong or that something unusual is happening with that species. Such information cannot be obtained based on traditional DBH distributions.
For instance, classical DBH distributions of a tree species may have virtually the same forms (shapes), when in fact a species may have very different data count patterns. Let us consider two cases: (a) a tree species may be densely present in very few plots, while at the same time it might be missing in most others; and (b) it may be regularly present in a similar number in all, or almost all, plots. The difference between these two cases cannot be detected by classical DBH distributions, and therefore, there is a good reason to supplement them with the quadrat count analysis whenever the goal is to investigate tree species distribution in detail.

Conclusions
Models based on count data are not meant to replace models based on continuous variables (e.g., DBH), but they may complement them by providing additional information about species count distributions across a forest stand. As previous studies in the Janj and Lom old-growth forests [30] have shown, a species distribution based on DBH as a continuous variable may indicate a form of sustainable distribution such as negative exponential or rotated sigmoid. However, such information is only partly useful to forest managers and nature conservationists as conventional DBH distributions do not disclose how a species is spatially distributed over an observed area. On the other hand, the distributions of species count data, as applied in this study, reveal how a tree species is distributed within a forest stand, that is, whether it more or less equally occurs in all parts of a stand, exhibits a random pattern, or tends to group in a few plots. This study also demonstrates that the above information can be obtained separately for trees in the understory and canopy layers providing that both data types (species counts and DBH) are available.
So far, modeling of species count data has usually been performed on single large plots, however, this study shows that it can be effectively conducted on small scattered plots as well. Such an approach might be used to supplement future studies of DBH distributions based on scattered plots, especially in mixed forests. Then, conclusions about sustainability of a tree species would be more reliable. The observations of real species counts and fitted theoretical models are important as they reveal not only the count (abundance) and the variability of an observed species in sample plots across a study area, but they also show the number of empty plots (absence of a species). In a spatial context, specifically at the stand level, such information might be highly useful to forest managers and nature conservationists interested in monitoring and sustaining a species at such a spatial scale.