Spatial prediction of diameter distributions for the alpine protection forests in Ebensee, Austria, using ALS/PLS and spatial distributional regression models

A spatial distributional regression model is presented to predict the forest structural diversity in terms of the distributions of the stem diameter at breast height (DBH) in the protection forests in Ebensee, Austria. In total 36,338 sample trees were measured via a handheld mobile personal laser scanning system (PLS) on 273 sample plots each having a 20 m radius. Recent airborne laser scanning (ALS) data was used to derive regression covariates from the normalized digital vegetation height model (DVHM) and the digital terrain model (DTM). Candidate models were constructed that differed in their linear predictors of the two gamma distribution parameters. Non-linear smoothing splines outperformed linear parametric slope coefficients, and the best implementation of spatial structured effects was achieved by a Gaussian process smooth. Model fitting and posterior parameter inference was achieved by using full Bayesian methodology and MCMC sampling algorithms implemented in the R-package BAMLSS. Spatial predictions of stem count proportions per DBH classes revealed that regeneration of smaller trees was lacking in certain areas of the protection forest landscape.

36,338 sample trees were measured via a handheld mobile personal laser scanning system (PLS) on 273 sample plots each having a 20 m radius.Recent airborne laser scanning (ALS) data was used to derive regression covariates from the normalized digital vegetation height model (DVHM) and the digital terrain model (DTM).Candidate models were constructed that differed

Introduction
The forests of the Alps provide a wide range of ecosystem services.In addition to offering wood production and unique habitat for a diverse set of species, these forests protect people, buildings, and infrastructure from natural hazards such as snow avalanches, rockfalls, and mudslides (Brang, 2001).Forests that protect human infrastructure are declared as "protection forests," often through a public authority's formal decree.In Austria, between 2009 and 2015, there were 20-70 severe rockfall events per year, resulting in 10 human injuries and damage to settlement areas and infrastructure (Perzl et al., 2017).Given its mountains terrain, ∼15.7% (615,852 hectares, ha) of Austria's total forest area offer some level of protective function (BML, 2022).
In an in-situ rockfall experiment, Dorren et al. (2005) show forest cover significantly reduced velocity, rebound height, residual hazard of rockfall, and depending on the quality and quantity of the forest structure, the number of rocks involved in a rockfall could be reduced by 64%.To a large degree, protective forest effectiveness is a function of the forest's vertical and horizontal structure.Specifically, relevant structural measures are stem density, tree sizes (i.e., diameter), and patch size (see, e.g., Brang 2001 and other references herein).To assess hazard risk and support the decision-making in forest management activities, Teich and Bebi (2009) use computer simulation models informed with spatially explicit forest structural summary inputs.These forest structure inputs were traditionally derived from aerial image analysis (Bebi et al., 2001).More recently, however, laser imaging detection and ranging (LiDAR) data are being used to support forest inventory (Köhl et al., 2006), and offer improved accuracy and efficiency through automation for mapping forest structure (Adnan et al., 2019).
Mean tree diameter and total stem count are often inadequate at characterizing forest structure, especially for structurally diverse settings.Building structural diversity, through silvicultural treatments, is a common management objective that has been shown to enhance biological diversity, carbon storage, and possibly climate resilience (Schütz, 2002;D'Amato et al., 2011;Gough et al., 2019).This desired structural diversity is often produced by "plentering," a highly intensive silvicultural prescription designed to move homogeneous stands to an uneven age structure with stratified crown layering (Schütz, 2001).Characterizing such structurally diverse forests is best accomplished using more detailed summaries of possibly complex size-class distributions.
Characterizing size-class distributions has traditionally been done using either a parameter prediction model (PPM) or a parameter recovery model (PRM) approach (Hyink and Moser, 1983).In PPM, a probability density function (pdf) is chosen to characterize the size-distribution (e.g., diameter distribution), and the pdf parameters are then estimated separately for each sample plot.Finally, the parameter estimates are regressed against covariates via regression analysis.In PRM, the mean and dispersion parameters are directly regressed to meaningful characteristics, and the estimates of the pdf parameters are finally achieved by the "method of moments."The PRM is often used to avoid confounding problems, which can occur with PPM, as similar pdfs can be achieved with different parameter combinations making it difficult to find unambiguously meaningful covariates.As consequence, the PPM equations can usually explain only little variation in the parameters and have typically a low R 2 .A shortcoming of the traditional PPM and PRM approaches to model tree size distributions is that they require separate model steps-estimating size distribution parameters happens separate from regression used to explain variability in those parameters.This can easily produce ill-behaved prognoses of diameter distributions when new covariate data is used.To make future predictions of diameter distributions more reliable and accommodate temporal dependence among repeat measurements, a bivariate distribution modeling was demonstrated in Knoebel and Burkhart (1991).Finley et al. (2014) proposed a non-parametric Bayesian approach to estimating diameter distributions that modeled each diameter class using a Poisson regression informed using LiDAR covariates and random effects designed to accommodate correlation among diameter classes and across spatial locations.While highly flexible, their proposed approach was computationally demanding and required a greater degree of user input to choose appropriate prior distributions and assess model convergence.
Here, we demonstrate and assess a different inferential approach aimed at overcoming key limitations of previously proposed methods for characterizing size-class distributions.Specifically, we apply recent advancements in distributional regression using generalized additive models that facilitate joint estimation of shape and scale parameters in parametric distributions.In particular we use a generalized additive models for location, scale, and shape (GAMLSS) based approach proposed by Rigby and Stasinopoulos (2005).In a series of papers, Klein et al. (2015c,b,a) and Umlauf et al. (2018) extended the original maximum likelihood mode of inference for GAMLSS parameters to a Bayesian approach using Markov chain Monte Carlo (MCMC) referred to Bayesian additive models for location, scale, and shape (BAMLSS).This Bayesian approach accommodate a richer set of models and uncertainty quantification.Many proposed BAMLSS features have been made available in user-friendly software (Umlauf et al., 2021).
Whereas within classical regression models, where the conditional mean of the response is regressed against covariates, distributional regression addresses the complete conditional response distribution, in that each distribution parameter is modeled in terms of covariates and, potentially, random effects.Compared to maximum likelihood based GAMLSS, the BAMLSS distributional regression supports a wider selection of distribution families, for which the parameters are not necessarily directly related to the location, scale, and shape of the given distribution but can form these measures indirectly via functional relationships.Kneib et al. (2023) offers an excellent review of distributional regression approaches including GAMLSS (and its Bayesian extensions) and the traditionally more conspicuous quantile regression.The review underscores key advantages to GAMLSS approaches with regard to modeling distribution parameters using versatile additive structure, nonlinear functions, varying coefficients, and spatially and temporally structured random effects.
In this paper, GAMLSS spatial distributional regression models are used to quantify forest structural diversity based on stand-level DBH distributions in a protection forest landscape near Ebensee, Austria.Sample plot data was collected using a handheld mobile personal laser scanning (PLS) and processed using automated software routines.
Regression covariates were derived from a digital vegetation-height model (DVHM) and a digital terrain model (DTM) provided by recent airborne laser scanning (ALS) campaigns.

Study region and model data
The study area was located in the southern region of the federal state of Upper Austria, near the village of Ebensee, and covers an area south of Traunsee lake (Fig. 1).The forest district Ebensee had a total area of 4,898 hectares (ha) and was partitioned into Q =1,237 forest stands.The average stand size was 3.96 ha, the minimum 0.14 ha, the median 2.33 ha, and the maximum 89.99 ha.
Forest inventory data was collected on n = 273 sample plots, which were spatially aligned in a regular 400 m×400 m grid (Fig. 1).Plot measurements were collected using a handheld mobile PLS GeoSLAM ZEB Horizon (GeoSLAM Ltd., Nottingham, UK).grid locations using fully automated routines detailed in Gollob et al. (2019Gollob et al. ( , 2020)), in Tockner et al. (2022) and in Ritter et al. (2017Ritter et al. ( , 2020)).Stem volume was calculated using a traditional stem-form function (Pollanschütz, 1965).The mean DBH of measurement trees was 14.6 cm, the standard deviation was 10.6, the minimum was equal to the predefined 5 cm threshold, and the maximum DBH was 92.4 cm.For each plot, growing stock timber volume (GSV) was expressed as m 3 /ha (i.e., computed as the sum of tree volume scaled by the 7.958 fixed-area plot tree expansion factor).The mean GSV of the sample plots was 259.6 m 3 /ha, the standard deviation was 177.9 m 3 /ha, and the minimum and maximum were 0.6 m 3 /ha and 980.7 m 3 /ha, respectively.

Model construction
A distributional regression model was built for the DBH distributions observed at the n PLS forest inventory plots with the model form where y i is the DBH distribution vector at the i-th plot, D is a parametric density distribution function with parameters ϑ 1 (x i ), . . ., ϑ K (x i ) that depend on a set of plot specific covariates x i .
The parameters are usually not directly formed by a regression predictor, but often through a monotonically increasing response function, which maps the predictor η ϑ l i to the l-th parameter via Assuming monotony and inverting the response function via the link-function g l (•) achieves the parameter predictor The vector of covariates x ′ i = (x ′ i , ν ′ i , s ′ i ) optionally contains measures x ′ i having a linear effect, ν ′ i having nonlinear effects, and s i representing generic geo-locations.Hence, the structured additive predictor becomes and is composed of linear covariate effects with parameters β ϑ l in the first summand, smooth nonlinear functions f ϑ l j (•) in the second summand, and a spatial effect f ϑ l geo (•) at geo-locations s i in the third summand.
Distributional regression models were constructed with the gamma distribution as proper candidate for D. Trials were also made with the Weibull distribution, but the Weibull distribution proved less flexible than the gamma distribution.Following Stasinopoulos et al. (2019), the gamma distribution's probability density function (pdf) considered here is defined by for y > 0, and with µ > 0 and σ > 0. Herein, E(Y ) = µ and Var(Y ) = σ 2 µ 2 .Linear predictors (Eq.4) were constructed for both parameters µ and σ.
The catalogue of the possible covariates for x i and ν i included summary statistics of the DVHM across the pixels per sample plot area, such as the mean vegetation height (MVH), its standard deviation (SDVH), and various percentiles of the distributions of the pixellated vegetation heights.In addition, topographic metrics were derived from the DTM, i.e., the elevation above sea level (ESL), and the average slope (SLO), and aspect (ASP) of the terrain.Finally, the geo-locations of the sample plot centroids were used to index a spatial Gaussian process (i.e., the f ϑ l geo (s i ) summand in 4).Various candidate models were tested that differed in their complexity, especially in terms of smooth nonlinear models for the covariate effects versus simple linear parametric coefficients, and with respect to presence/absence of a structured spatial effect.
The model fitting and subsequent prediction was performed within a Bayesian inferential framework and by using the R-package BAMLSS (Umlauf et al., 2018(Umlauf et al., , 2021)).Smooth nonlinear covariate effects (second summand in Eq. 4) were consistently modeled with BAMLSS's default thin plate regression splines (Wood, 2003), and the spatially structured effect for the continuously indexed sample plot location coordinates (third summand in Eq. 4) was alternatively represented by a spatial Gaussian process model or by a bivariate tensor product smooth.When a Gaussian process was chosen, a simplified form of the Matern covariance function was applied, according to suggestions by Kammann and Wand (2003).Parameter inference was derived from posterior distributions that were sampled via MCMC techniques.Computations were performed on a multi-core processor workstation.On 7 cores, 5,000 iterations were computed per each core.From each of the 7 chains, the first 2,000 iterations were discarded as "burn-in," and the from the remaining 3,000 iterations every 10-th sample was kept.This burn-in and thinning yielded approximately M =2,100 nearly independent MCMC samples upon which parameter and predictive posterior inference was based.The performances of the different candidate models were assessed by means of the deviance information criterion (DIC) (Spiegelhalter et al., 2002) and two varieties of the widely applicable information criterion (WAIC1 and WAIC2) defined in Gelman et al. (2014).For both, DIC and WAIC lower values indicate improved model fit.

Prediction
Our primary interest was in stem diameter distribution prediction for each of the 1,237 forest stands delineated in Fig. 1.For this purpose, the entire Ebensee forest district domain was partitioned into 35.5 m×35.5 m squared prediction pixels, where each pixel's area equals that of the 20 m radius sample plot.For each of the prediction pixels the same set of covariates as used in the candidate models were derived.Then, given these prediction pixel covariate values and centroid coordinates, posterior predictive distribution samples were generated via composition sampling for each pixel's µ, σ, and subsequently gamma pdf based stem diameter distribution.
To assess the protective function of the forest stands in terms of their structural diversity, forest practitioners from the Austrian Federal Forest Service were especially interested in the percentage shares of the stems that were allocated to broader diameter classes: (1) small (DBH<25 cm), ( 2) intermediate (25 cm≤DBH≤50 cm), and ( 3) large (DBH>50 cm).
To produce such estimates, the gamma distribution function F GA (•) was evaluated with the µ (q) j,m and σ (q) j,m estimates from each posterior sample m = 1, . . ., M for each prediction pixel j = 1, . . ., J q of the in total J q pixels within each forest stand indexed by q = 1, . . ., 1237 via: (1) j ), and ( 3) Complete posterior predictive distributions of the M aggregated estimates per forest stand were generated by an area-weighting through with a j being the non-constant area of pixel j that falls into stand q, and that might be reduced by stand border intersections.

Candidate models
In total 15 candidate models were constructed that differed in their covariates and in their constructions of the linear predictors for the µ and σ parameter of the gamma distribution (Table 1).The covariate effects were either modeled through a linear trend that was represented by a single parametric slope coefficient, or via non-parametric smoothing splines.The effect of the terrain aspect (ASP) was throughout represented by a cyclic version of a cubic regression spline smooth.The spatially structured effects were either modeled by a bivariate tensor product smooth with the continuous x,ycoordinates of the sample plots, or alternatively, by Gaussian process.
The distributional regression framework provided high flexibility and generally enabled different specifications of the linear predictors for both the µ and σ parameter of a single distributional regression model.However, it was found that a unique specification of both linear predictors worked well throughout all candidate models.Consequently, the two linear predictors of the µ and σ parameter of each distributional regression model were constructed with the same set of covariates and by using the same model representations (parametric term vs. smoothing spline) for the respective covariate effects.
Comparisons of the model performances in terms of the DIC and two calculations of the WAIC suggested that smoothing splines were more useful than the parametric linear trends; see diagnostics in

Analysis and inference of the best model
The effect curves of model m 14 (Fig. 2) showed the mean vegetation height (MVH) and elevation above sea level (ESL) had positive effects on µ and σ parameters.The slope of the terrain (SLO) as well as the 97.5-th percentile of the pixellated vegetation height measures (P97.5) had almost strictly negative effects on µ and σ.The standard deviation of the vegetation heights (SDMVH) had a strictly positive effect on µ.However, the effect of SDMVH on σ behaved ambiguous for values below 10, had a negative effect between 10 and 13, and acted positively for values greater than 13.The cyclic effect of the topographic aspect (ASP) on µ had a local minimum at 150 it had a positive effect on µ and on σ.
The quantile-quantile plot (qq-plot) in Fig. 3 shows quantile residuals lay close to the bisecting line between -2 and 3.This suggests the gamma distributional assumptions fit very well to the data and the distributional regression model m 14 was adequately specified.
For the model data of the 273 sample plots, the posterior mean estimates from the M MCMC samples of the µ parameter ranged between 2.93 and 35.62, and the 273 posterior mean estimates of σ parameter lay between 0.43 and 3.1 (Fig. 4).The correlation between the 273 posterior mean estimates for µ and σ was 0.42.The average relative mean squared error (MSE%) of the µ estimates was 5.0%, and the σ estimates had a MSE% of 7.2%.To assess what influence the covariates simultaneously had on both distribution pa-rameters µ and σ, the gamma density was evaluated under ceteris paribus conditions for grid values within the range of a single covariate, while the other covariates were kept fixed at their respective median values (Fig. 5).As MVH acted positively on the expectation as well as on the variance of the gamma distribution, an increasing MVH flattened the density and shifted the mass towards higher DBH values.Similar effects occurred for increasing SDVH and ESL values.A completely opposite effect became obvious for an increasing P97.5.More complex and nonlinear effects on the DBH distribution were observed for SLO, ASP, and P2.5.

Distributional prediction
As noted previously, the Ebensee forest district domain was partitioned into 35.5 m×35.5 m pixels.For each of these prediction pixels, covariate data were derived from the DVHM and the DSM in terms of the MVH, SDMVH, ESL, SLO, ASP, P2.5, P97.5, and the pixel centroid coordinates.The MCMC samples from the posterior parameter distributions for the covariate effects of model m 14 were then applied with these covariates to achieve posterior predictive distributions of µ and σ for each prediction pixel.Consequently, the gamma distribution was evaluated using these parameter estimates to produce predictions of stem count proportions that fall into the DBH classes such as specified in Section 2.3.Finally, an area-weighting scheme was applied to achieve aggregated predictions of these stem count proportions throughout all prediction pixels per forest stand.Such as demonstrated in Fig. 6, the 95% credible intervals were relatively tight and the size class predictions became highly precise across all forest stands.
Maps of these size class predictions (Fig. 7) revealed that smaller tree sizes were especially lacking in the central area of the Feuerkogel region located in the western part of the forest district Ebensee, while these stands also possessed a relatively high proportion of larger trees.Such as reported by the sample plot field crew, these sites were actually  Size class predictions and 95% credible intervals were relatively precise for both classification schemes and across all forest stands.The average MSE was only 0.85 percentage points, and the maximum was 9.6.The MSE was less than 1.6 percentage points for 90% of the size class predictions, and 95% of the predictions had a MSE less than 2 percentage points.

Discussion
Posterior standard deviations of the DBH class predictions were considerably low due to the spatially dense network of PLS sample points.Some areas of higher uncertainty are in the top corner of the Eastern half, which were characterized by irregular forests on steep slopes, and due to rock faces no PLS sample points could be captured.Also the South-Western ridge in the Western half has higher errors, caused by missing points due to inaccessibility and generally ragged terrain in high altitudes with less forest cover.
Spatially coherent diameter distribution predictions and subsequently derived prob-  The proposed modeling framework is flexible and able to represent all the structural differences among the sample plots.However, forest stands could theoretically possess a layered structure with an understorey of younger and thinner trees growing under a shelter of older and thicker trees.These circumstances often result in a multimodal DBH distribution.Practically, such structure could be modeled through a mixture of two or more different density functions.However, the methodology so far provided by the BAMLSS package is restricted to a single density and does not allow construction of com-posed mixture models.This limitation was practically irrelevant for our study, because none of the sample plots showed clear signs of a strict multimodal DBH distribution; compare Figs.S1-S8 in the supplemental material.
In addition, any problems that could have been caused by BAMLSS's limitation to unimodal distributions were ameliorated by our approach to achieve the predictions at forest stand level.Hereby, the forest stands were partitioned into smaller prediction pixels.For each of the prediction pixels, the gamma distribution parameters were predicted, and the density was evaluated.The forest stand level prediction of the DBH distributions was finally achieved via an area-weighted aggregate of the pixel-level densities.In so doing, the final prediction of the DBH distribution at stand level was no longer restricted to a unimodal parametric shape.
Forest inventory field work was conducted using a PLS to create "digital twins" of the vegetation and the terrain on a 20 m radius plot.The field work was very efficient with approximately 12 minutes labor time per plot, including the set-up of the equipment and the scanning process.By using fully automated routines, 133 trees were measured on average per sample plot, with LiDAR-derived information being not only DBH, but also other parameters such as height or crown base.This is in contrast to the traditional forest inventory practice, in which measurements are conducted with optical and mechanical instruments.Because these instruments consume high labor costs, traditional forest inventory uses much smaller plot sizes than our 20 m radius plots, so that often not more than 10 trees were measured per plot.With such smaller sample sizes of the traditional forest inventories, the distributional regression modeling would have been hardly possible, and the novel PLS-supported forest inventory can be regarded as key to successful DBH distribution modeling and prediction.
In this study, an approach was presented to model and predict stem diameter distributions in terms of a parametric probability density function.To produce a quantitative prediction of the absolute stem count per DBH class, a further estimate of the total stem count per area unit is needed.A possible approach to achieve such estimate would be to couple the proposed spatial distribution regression model with an extra spatial spatial regression model that considers the tree count per hectare as response.Appropriate methodology for the spatial regression modeling of the growing stock timber volume per area unit is presented in Nothdurft et al. (2021) and could be adopted to model number of trees per hectare.To date, high density ALS data is available for the complete Ebensee forest district domain and will probably be maintained in future, as the area is designated as research zone and has been of particular interest of the Austrian Federal Forest Service.In future work, we will therefore also test an individual tree segmentation from the ALS canopy height model using methodology implemented in the R-package lidR by Roussel et al. (2020); Roussel and Auty ( 2023) that provides a comparative approach to the spatial regression model of tree counts.

Conclusion
This study presented a method to estimate stem diameter distributions by linking PLS and ALS data in the protection forest landscape Ebensee.The Bayesian distributional regression framework was based on Gamma distributions as implemented in the BAMLSS R-package .The Gamma distribution's shape and scale parameters were modeled using linear predictors dependent on covariates from the PLS and ALS data.BAMLSS offered the modeling of nonlinear covariate effects by using penalized regression spline smoothers, which proved more favorable than linear parametric slope coefficients.Including spatially structured effects on both gamma parameters significantly enhanced the model performance.Thereby, the modeling of a spatial Gaussian process outperformed a bivariate tensor product smooth across the sample plot location coordinates.
A spatial wall-to-wall prediction of the gamma distribution was achieved by partitioning the entire domain into prediction pixels having an area equal to the sample plot.
The DBH distributions were predicted at forest stand level via area-weighted aggregates of the evaluated posterior predictive densities.
The proposed model framework can be easily adopted to other tasks when informa- Empirical histograms and posterior distributional predictions of the DBH distributions on the 273 model data sample plots are presented in Figs.S1-S8 of the supplemental material.These figures show the distributional predictions fit very well to the empirical histograms across all sample plots.

Figure 6 :
Figure 6: Predictions of DBH class allocations across the forest stands for the two classification variants.

Figure S1 :
Figure S1: Histograms and posterior predictive distributions of the DBH at the sample plots.

Figure S2 :
Figure S2: Histograms and posterior predictive distributions of the DBH at the sample plots.

Figure S3 :
Figure S3: Histograms and posterior predictive distributions of the DBH at the sample plots.

Figure S4 :
Figure S4: Histograms and posterior predictive distributions of the DBH at the sample plots.

Figure S5 :
Figure S5: Histograms and posterior predictive distributions of the DBH at the sample plots.

Figure S6 :
Figure S6: Histograms and posterior predictive distributions of the DBH at the sample plots.

Figure S7 :
Figure S7: Histograms and posterior predictive distributions of the DBH at the sample plots.
The 180 plots in the study area's Eastern half were scanned in autumn 2021, and the remaining 93 plots in the Western half were scanned in spring 2023.Position, diameter at breast height (DBH), and height for the approximately 36,338 measurement trees were derived from 3D point clouds collected on each 20 m radius plot centered on the n Figure 1: Location and extent of the Austrian forest district Ebensee and locations of the 273 sample points in the southern region of the federal state of Upper Austria.

Table 1 :
Construction of the candidate models."p" stands for a parametric effect and "s()" for a non-parametric smoothing spline.
"s(cc)" indicates a cyclic version of a regression spline smooth.Spatially structured effects are either modeled by assuming a Gaussian process "s(gp)" or via a tensor product smooth "te".Diagnostics show the deviance information criterion (DIC), two calculation variants of the widely applicable information criterion (WAIC1, WAIC2), and the associated effective number of model parameters (edf, p1, p2).Lowest DIC and WAIC are in bold.
tion is required on forest structural diversity across broader forest landscapes.The latter aspect might be of special interest to forestry enterprises charged with protection forest management.In such settings, estimating DBH distributions and other forest structure measures can inform management decisions focused on sustaining protective forest characteristics.