Assessing the Spatial and Spatio-Temporal Distribution of Forest Species via Bayesian Hierarchical Modeling

Climatic change is expected to affect forest development in the short term, as well as the spatial distribution of species in the long term. Species distribution models are potentially useful tools for guiding species choices in reforestation and forest management prescriptions to address climate change. The aim of this study is to build spatial and spatio-temporal models to predict the distribution of four different species present in the Spanish Forest Inventory. We have compared the different models and showed how accounting for dependencies in space and time affect the relationship between species and environmental variables.


Introduction
Tree species distributions are undoubtedly associated with climatic factors through the direct effects of climate circumstances on tree biological processes [1].Consequently, climate change is likely to affect forest development in the short term [2], as well as spatial distribution of species in the long term, through demographic processes [3,4].For temperature-limited boreal woods, the main expectation is a deep northward shift of appropriate tree species environment, while the situation in temperate areas tends to be more complex and is different between Mediterranean, continental, and maritime climates [5].Climate impacts, such as water deficit and the elevated risk of forest fires, will threaten Mediterranean forests [6,7], while forest development might benefit in continental and Atlantic forests, but only at sites where an increased evaporative demand can be satisfied by enough water availability [8][9][10].
Although the predicted impacts of global warming, there are uncertainties around the magnitude of the effects.Among forest researchers, awareness has increased that the global warming poses a huge impact to the management and environmental value of woodland areas [11].In Europe, it was projected that financial losses may come to several billion euros by the end of this century if policies for the forest sector do not change in response to the predicted climate changes [12].
To help species selection in reforestation and forest management treatments to manage climate change, species distribution models (SDMs) are a valuable tool [13].Species distribution models (SDMs) can be defined as a mathematical approach built on combination of observations of species presence or abundance with environmental factors.These models are treated to evaluate species distributions across sceneries [14].Although there are exceptions (e.g., [15]), SDMs usually predict the appropriate niches of species [13].Even though the limits of SDMs for global warming impact assessments on complex ecological structures, it has been recognized that species distribution models are theoretically sufficiently appropriate for simpler practical tasks: for example in leading global warming adaptation policies that include habitat restoration or species selection for reforestation or forest management [16][17][18][19].For such management treatments, the key is to match the source and the ambient target.However, it is uncertain whether subsequent long-term forest developments are correctly described by species distribution models that can be used to influence early decisions on species selection for a geographic area [13].
SDM typically consists of the following process: (1) compilation of the sites of occurrence of species; (2) collection of environmental variables from databases (pluviometry, soil composition, etc.) for the registered location; (3) regression algorithms to understand the connection between sites of presence or species abundance and the environmental variables collected in (2); (4) prediction of the outcome variable (occurrence or species richness) through the space/time of interest, based on the models in (3) [20].
Some of the newest SDMs only use the presences of the groups in the modeling process.Other approaches use presence/absence data or pseudo-absences.Logistic regression is the most common approach to studying presence/absence data [20].Currently the statistical knowledge of applied researchers is growing, and new approaches can handle bigger, more complex datasets, so that applied statisticians are faced with the necessity to specify sophisticated statistical approaches.Logically, as the difficulty of these models grows, it becomes more difficult to perform inferences.The Bayesian approach is mostly suitable as it is flexible and can deal with complex models, for instance, naturally accounting for a hierarchical structure, which could describe the data well, or deal with missing data imputation.Unquestionably, the most popular family of approximate inference methods in Bayesian statistics is the class of Markov Chain Monte Carlo (MCMC) approaches.These approaches, which exploded into popularity in the mid-1980s, have continued at the vanguard of Bayesian statistics ever since, with the basic structure being expanded to cope with progressively more difficult problems [21].
The modeling patterns of the presence/absence of species using local ecological variables has been a rising problem in the field of ecology over the last few years [22].This type of modeling has been highly used to address numerous questions, as the identification of essential wildlife habitats with the purpose of classifying and managing conservation regions [23], and predicting the reaction of species to environmental structures [24,25].Several methodologies and approaches have been presented in this perspective (see for instance [26][27][28][29], with generalized linear models and additive models (GLM and GAM) [30], species envelope models such as BIOCLIM [31], and the multivariate adaptive regression splines (MARS) [32] being some of the most commonly used models [33].
Most of these approaches consist of regression models to assess the role of environmental factors (e.g., precipitation, bathymetry, etc.) in explaining the species presence [30].However, some difficulties appear: for example, spatial autocorrelation must be taken into account, even if the data were captured through a consistent sampling scheme, as the observations are often adjacent and exposed to similar environmental characteristics [34,35].Furthermore observer error [36,37], gaps in the sampling, missing data, and the mobility of the species [38] can also influence the models.
Even though traditionally climatic variables have been believed the principal factor in the spatial distribution of the European forest species [39], several paleobotanic studies have shown that the Iberian forest structure has been influenced by the activity of the first agricultural societies from the Neolithic and Chalcolithic period [40][41][42][43].Hence, more complex analysis of forest ecosystems are required to comprehend how land-cover variations can affect vegetation dynamics and spatial distribution [44].Unfortunately, some of these activities are not possible for inclusion in our spatio-temporal framework, not only because of the absence of geographical references, but also because of the difference of timestamps (i.e., Forest Inventory is developed each 10 years).
The forest ecosystems in Europe were affected by several disturbances during the last years.One of the most important issues is degradation due to stress caused by anthropogenic processes.
Variation in forest circumstances were not connected to a particular issue but rather to a combination of stress factors that intensified one another.In order to understand the evolution of these ecosystems, the pertinent processes need to be approached by spatio-temporal modeling on detailed spatial and temporal scales [44].Spatio-temporal processes include the development of spatial patterns over time, thus providing a connection between pattern and process in ecological communities, and having a crucial role in understanding the ecosystem processes [45].Most of the analyses developed in forest communities were motivated by forest growth (i.e., [46][47][48][49] for general surveys, and [50] for a specific study), while in this paper, we take a slightly different perspective, and our interest is to show that trees communities are dynamic systems that are affected by environmental disturbances, and that these can also cause changes in the species distribution and dispersion in short periods of time.
Hierarchical models can manage complex interactions by specifying parameters varying on several levels via the introduction of random effects.The predicted value of the response is then articulated to be conditional on these random effects [51].The benefits of applying hierarchical Bayesian models arises moreso as complexity rises, when, for instance, spatio-temporal change needs to be modeled explicitly [37].The Bayesian structure similarly offers the benefit of supplying the full posterior probability of the set of parameters of interest, so that point estimates and measures of uncertainty can be easily computed, but with the added benefit that any other function of the parameters can be obtained with no additional effort [52,53].
Hierarchical Bayesian models have commonly relied on MCMC simulation techniques, which are challenging from a technical perspective and are computationally intensive, consequently limiting their use.However, a new statistical method is now available, namely integrated nested Laplace approximations (INLA) via the R-INLA package (http://www.r-inla.org)[51].The INLA approach and its potent application to handle complex datasets has been introduced to a wider nontechnical researchers [54].Differently from MCMC simulations, INLA applies an approximation for inference, and hence prevents the intense computational requests, convergence, and combining issues sometimes faced by MCMC algorithms [55].It is only implemented for latent Gaussian models, but this includes the class of models that we consider here for species distribution (for example, logistic regression).Moreover, when the interest lies in a continuous spatial phenomenon, for which realizations are obtained at discrete locations, R-INLA can be coupled with the stochastic partial differential equations (SPDE) approach [56] which performs discretization of the underlying continuous Gaussian field.This is the case of environmental inventories, which are typically characterized by clustered spatial patterns, and at the same time record large regions with absences.Jointly, these statistical approaches and their implementation in R allow researchers to fit intricated spatio-temporal models considerably faster and more reliably [57], due to the characteristics of this approach.
The aim of this paper is to build spatial and spatio-temporal models to predict the distribution of four different species present in the Spanish Forest Inventory.We want to compare the different models and show how accounting for dependencies in space and time affect the relationship between species and environmental variables.We will work with real data on four species of trees obtained from forest inventories developed in Galicia as part of the National Inventories 1970-2010, supported with environmental variables.In particular we consider the II (1980's), III (1990's), and IV (2000's) inventories.For these species we have their presence/absence at specific geographical coordinates, and we generate SDMs for each species.We specify a Bayesian hierarchical geostatistical modeling structure accounting for the spatial dependency.
Other studies have been developed to understand species distribution through a Bayesian approach using INLA, i.e., [58] analyzing the spatial distribution of caribou, or [59] analyzing the presence of invasive species and shrubs in Azores.One of the most interesting differences for our case is the temporal approach, as we work with data from three different inventories.
We chose the following four species from the Spanish National Inventory according to their characteristics, usage, and distribution in the Spanish Peninsula, and more specifically in Galicia: Silver fir (A.alba) is a huge evergreen tree located in central Europe, and in some parts of southern and eastern Europe.It is one of the largest tree species of the genus Abies in Europe.This species is considered to be a significant ecological and efficient balancer of European forests, and an essential species for preserving high biodiversity in forest ecosystems.Its future distribution is subject to debate between palaeoecologists and modelers, with contrasting climate-response forecasts [60].

Castanea sativa Mill.
The sweet chestnut is the single natural species of the genus in Europe.Extensive dispersion and active management caused the establishment of the species at the boundaries of its prospective ecological range.For this reason, it is difficult to trace its original natural area.In Europe, chestnut forests are mainly concentrated in a few countries such as Italy, France, Spain, and Portugal.This species has an extraordinary multipurpose nature, and can be managed for timber production, for chestnut production, and also for a broad range of secondary products and ecosystem services [60].

Pinus pinaster Ait.
The maritime pine is a widespread medium-size tree native to the western Mediterranean basin.This pine dwells well in temperate-warm locations, from coasts to high mountains.It does not tolerate shade.Due to its undemanding behavior, salt spray tolerance, and fast growth, it has been used for soil protection, reforestation of degraded areas, and dune stabilization as shelterbelts and also in intensive plantations.The maritime pine has been also traditionally utilized for the extraction of resin for turpentine and rosin.In the Southern Hemisphere, where maritime pine has been introduced for environmental and economical purposes, it has been considered as a highly invasive species [60].

Quercus robur L.
Pedunculate oak is a common deciduous tree species in Europe, found from the north (Scandinavia) to the southwest (Spain and Portugal).This genus has cultural importance for people through Europe, and the trees or leaves are commonly used in national or regional emblems.This genus can live several centuries and grow to about 40 m in height.The wood from oaks is strong and robust, and has been valued for centuries.It is preferred for structures, and also for barrels (to contain wine and spirits); overall, it was a main source of ship timbers.Currently, acute oak decline is one of the biggest concerns faced by this genus [60].

Environmental Variables
We have used the following variables to elaborate our models: mean annual temperature, mean of the maximum temperatures of the warmest month, mean of the minimum temperatures of the coldest month, and mean annual rainfall, calcareous soil and elevation.
The environmental variables used in this analysis were obtained from [61,62], and are those typically considered in this type of studies: mean annual temperature, mean of the maximum temperatures of the warmest month, the mean of minimum temperatures of the coldest month, and mean annual rainfall.We also considered the distribution of the calcareous parent materials as a useful predictor of plant species distribution in our study area [63].We used the European Soil Database [64] to assign each plot to a parent material class.All of the values were related to the data points.In each data point, we have obtained all the environmental variables apart from the presence/absence of the species and coordinates (X, Y, Z).
We can see the summary of the different meteorological variables (mean annual temperature, mean of maximum temperatures of the warmest month, mean of minimum temperatures of the coldest month, and mean annual rainfall) below (Figure 1).As we can see, there are no large fluctuations between the three inventories.

Environmental Variables
We have used the following variables to elaborate our models: mean annual temperature, mean of the maximum temperatures of the warmest month, mean of the minimum temperatures of the coldest month, and mean annual rainfall, calcareous soil and elevation.
The environmental variables used in this analysis were obtained from [61,62], and are those typically considered in this type of studies: mean annual temperature, mean of the maximum temperatures of the warmest month, the mean of minimum temperatures of the coldest month, and mean annual rainfall.We also considered the distribution of the calcareous parent materials as a useful predictor of plant species distribution in our study area [63].We used the European Soil Database [64] to assign each plot to a parent material class.All of the values were related to the data points.In each data point, we have obtained all the environmental variables apart from the presence/absence of the species and coordinates (X, Y, Z).
We can see the summary of the different meteorological variables (mean annual temperature, mean of maximum temperatures of the warmest month, mean of minimum temperatures of the coldest month, and mean annual rainfall) below (Figure 1).As we can see, there are no large fluctuations between the three inventories.On the other hand, the presence of the different species varied among the different inventories: three of the species analyzed showed a decrease in the presence in the III inventory, while A. alba was almost constant across time (Table 1).On the other hand, the presence of the different species varied among the different inventories: three of the species analyzed showed a decrease in the presence in the III inventory, while A. alba was almost constant across time (Table 1).
where D is a (fixed) subset of R d (here we consider d = 2).The actual data can be then represented by a collection of observations y = {y(s 1 ), ..., y(s n )}, where the set (s 1 , ..., s n ) indicates the spatial units where the measurements are taken.Depending on D being a continuous surface or a countable collection of d-dimensional spatial units, the problem can be specified as a spatially continuous or discrete random process, respectively [65].In our case, we can consider a collection of data points with their presence/absence obtained from the inventory, and the sampled points being the set (s 1 , ..., s n ) of n points; y s is the presence of each species in each point, and it is specified as: where π s is the probability of the species being present.Then, on the logit(π s ) a linear model is specified including the different covariates, x ms (Temperatures, precipitation, soil and elevation) and a spatial field ξ s : where M is the number of parameters, and a discretely indexed spatial random process (see [56]) is included to approximate the continuous process: where G is the total number of vertices of the triangulation.In practice, the discretization is done dividing the study region in triangles, and writing ξ s as a linear combination of basis functions ϕ g weighted by some zero means terms ξ g (for more details see [66]).
The vector ξ = {ξ 1 , ..., ξ G } can be modeled as a Gaussian Markov Random Field with a structured covariance function of the distance.

Spatio-Temporal Model
The concept of the spatial process can be extended to the spatio-temporal case, including a time dimension.The data are then defined by a process: As we define in the spatial model, we can consider a collection of data points with presence/absence obtained from the inventory and the sampled points are the set (s 1 , ..., s n ) of n points; y st is the species presence at each point in space and time, specified as: where π st is the probability of the species being present.Then, on the logit(π st ) a linear model is specified including the different covariates, x ms (Temperatures, precipitation, soil and elevation) and a spatio-temporal field ω st : where ω st refers to the latent spatio-temporal process that changes in time with autoregressive dynamics and spatial correlation innovations, which we model as follows: Forests 2018, 9, 573 with t = 2, . . .T, |a| < 1 and ω s1 ~Normal (0, σ 2 /(1 − a 2 )).ξ st , is a zero-mean Gaussian field that is temporally independent with the following spatio-temporal covariance: Cov(ξ st ,ξ ju ) = {(0, t = u; Cov(ξ s ,ξ j ), t = u), for s = j, where Cov(ξ s ,ξ j ) is modeled through the Matern spatial covariance function [56].

Implementation
We have used the Integrated Nested Laplace Approximation (INLA) implemented in R-INLA within the R statistical software.
The R-INLA package solves models using INLA, which is an approach to statistical inference for latent Gaussian Markov random field (GMRF).The approximation is divided in three stages.The first stage approximates the posterior marginal of θ using the Laplace approximation.The second stage calculates the Laplace approximation, or the simplified Laplace approximation, of π(x i |y,θ), for selected values of θ, in order to improve on the Gaussian approximation.The third process combines the previous two using numerical integration [57].
In R-INLA, the first step needed to process the geostatistical spatial model through SPDE, is the triangulation of the spatial domain of the study.We have used inla.mesh.createproviding the spatial coordinates used for estimation.This function executes a constrained refined Delaunay triangulation for a set of spatial locations: firstly the vertices of the triangles are placed at the observation coordinates, and then additional vertices are added, in order to satisfy triangulation quality constraints [56].Depending on the values selected for the arguments of the function, the total number of vertices changes, with a trade-off between the accuracy of the spatial field representation and the computational and time costs.
Given the mesh, we create the SPDE model object, to be used later in the specification of the final expression in our case, the different spatial and spatio-temporal models.

Implementation
We have used the Integrated Nested Laplace Approximation (INLA) implemented in R-INLA within the R statistical software.
The R-INLA package solves models using INLA, which is an approach to statistical inference for latent Gaussian Markov random field (GMRF).The approximation is divided in three stages.The first stage approximates the posterior marginal of θ using the Laplace approximation.The second stage calculates the Laplace approximation, or the simplified Laplace approximation, of π(xi|y,θ), for selected values of θ, in order to improve on the Gaussian approximation.The third process combines the previous two using numerical integration [57].
In R-INLA, the first step needed to process the geostatistical spatial model through SPDE, is the triangulation of the spatial domain of the study.We have used inla.mesh.createproviding the spatial coordinates used for estimation.This function executes a constrained refined Delaunay triangulation for a set of spatial locations: firstly the vertices of the triangles are placed at the observation coordinates, and then additional vertices are added, in order to satisfy triangulation quality constraints [56].Depending on the values selected for the arguments of the function, the total number of vertices changes, with a trade-off between the accuracy of the spatial field representation and the computational and time costs.
Given the mesh, we create the SPDE model object, to be used later in the specification of the final expression in our case, the different spatial and spatio-temporal models.
We consider now the triangulation of National Inventories using the inla.mesh.create(.)function.The function subdivides the region in the triangles, placing the initial vertices at the 2000 station locations and adding 1328 additional vertices (see Figure 2).

Results
In this section we show how each of the four species previously described has evolved in different ways during the last 30 years.Note that we present the maps of the posterior mean of the spatial field from the spatial model, as this model represents better the evolution of the different species.

Abies alba
As we can see in Figure 3, A. alba started being located in the western and southwestern area of Galicia.As time passes, it moves to the northern part and expands in the later years to occupy the western and northern part of the region.This pattern could be explained by the fact that this species does not tolerate high temperatures, and the southeast of Galicia is characterized by high temperatures during the summer, so the introduction of this species in this area should be avoided.

Results
In this section we show how each of the four species previously described has evolved in different ways during the last 30 years.Note that we present the maps of the posterior mean of the spatial field from the spatial model, as this model represents better the evolution of the different species.

Abies alba
As we can see in Figure 3, A. alba started being located in the western and southwestern area of Galicia.As time passes, it moves to the northern part and expands in the later years to occupy the western and northern part of the region.This pattern could be explained by the fact that this species does not tolerate high temperatures, and the southeast of Galicia is characterized by high temperatures during the summer, so the introduction of this species in this area should be avoided.As we can see in Figure 4, A. alba did seem to be affected by the environmental variables in a different way, depending on the time (inventory) and model (spatial vs spatio-temporal).In the first instance, we could see that all the variables, except the soil characteristics, have the same behavior in all the different models, with small negative point estimates, but with credibility intervals excluding zero.On the other hand, Soil showed a larger point estimate, but with an interval including zero, which was narrower in the spatio-temporal model.As we can see in Figure 4, A. alba did seem to be affected by the environmental variables in a different way, depending on the time (inventory) and model (spatial vs spatio-temporal).In the first instance, we could see that all the variables, except the soil characteristics, have the same behavior in all the different models, with small negative point estimates, but with credibility intervals excluding zero.On the other hand, Soil showed a larger point estimate, but with an interval including zero, which was narrower in the spatio-temporal model.

Results
In this section we show how each of the four species previously described has evolved in different ways during the last 30 years.Note that we present the maps of the posterior mean of the spatial field from the spatial model, as this model represents better the evolution of the different species.

Abies alba
As we can see in Figure 3, A. alba started being located in the western and southwestern area of Galicia.As time passes, it moves to the northern part and expands in the later years to occupy the western and northern part of the region.This pattern could be explained by the fact that this species does not tolerate high temperatures, and the southeast of Galicia is characterized by high temperatures during the summer, so the introduction of this species in this area should be avoided.As we can see in Figure 4, A. alba did seem to be affected by the environmental variables in a different way, depending on the time (inventory) and model (spatial vs spatio-temporal).In the first instance, we could see that all the variables, except the soil characteristics, have the same behavior in all the different models, with small negative point estimates, but with credibility intervals excluding zero.On the other hand, Soil showed a larger point estimate, but with an interval including zero, which was narrower in the spatio-temporal model.

Castanea sativa
This species is less clustered than the previous one, being mostly present in the central and northern parts of the region at the beginning of the period considered; as time passes, its presence becomes more pronounced (III inventory), but it becomes scattered across the whole Galicia during the last inventory (see Figure 5).

Castanea sativa
This species is less clustered than the previous one, being mostly present in the central and northern parts of the region at the beginning of the period considered; as time passes, its presence becomes more pronounced (III inventory), but it becomes scattered across the whole Galicia during the last inventory (see Figure 5).This species showed a different relationship with the environmental variables if we analyzed the different models (see Figure 6 below).In the spatio-temporal model, the average annual temperature was the most important variable, followed by the temperatures of the coldest and warmest months.Also, in the spatial models, the type of substrate was an important variable followed by the temperatures.This could be explained by the adaptability of this species to the substrate: it generally preferred siliceous substrate, but it could be present also in certain calcareous soils if there were optimum conditions.Also, in this case, we could see different variables behavior between spatial and spatio-temporal models.Looking at the Elevation, in the II and III inventories, all the values in the credible interval were positives, but then in the IV inventory and in the spatio-temporal model, there were negative and positive values.If we compared the credible intervals for the remaining variables (Soil, Precipitation, Temperature, Maximum emperature, and Minimum Temperature), we can see similar behaviors in the spatial models with negative and positive values.Moreover, the spatio-temporal model showed that Soil has a similar performance than the spatial models, while the other variables show differences.Precipitation and Temperature show positive values in the credible interval, with values close to zero in Precipitation; Max and Min Temperature showed negative values in this interval.These results suggest that this species is affected positively by the temperature, but extreme temperatures can also affect its presence.This species showed a different relationship with the environmental variables if we analyzed the different models (see Figure 6 below).In the spatio-temporal model, the average annual temperature was the most important variable, followed by the temperatures of the coldest and warmest months.Also, in the spatial models, the type of substrate was an important variable followed by the temperatures.This could be explained by the adaptability of this species to the substrate: it generally preferred siliceous substrate, but it could be present also in certain calcareous soils if there were optimum conditions.Also, in this case, we could see different variables behavior between spatial and spatio-temporal models.Looking at the Elevation, in the II and III inventories, all the values in the credible interval were positives, but then in the IV inventory and in the spatio-temporal model, there were negative and positive values.If we compared the credible intervals for the remaining variables (Soil, Precipitation, Temperature, Maximum emperature, and Minimum Temperature), we can see similar behaviors in the spatial models with negative and positive values.Moreover, the spatio-temporal model showed that Soil has a similar performance than the spatial models, while the other variables show differences.Precipitation and Temperature show positive values in the credible interval, with values close to zero in Precipitation; Max and Min Temperature showed negative values in this interval.These results suggest that this species is affected positively by the temperature, but extreme temperatures can also affect its presence.

Pinus Pinaster
This species shows similarities with C. sativa: it starts with a low presence, then it becomes present in the whole area of analysis, but in this case, the posterior mean in inventory IV shows that this species becomes concentrated in the southern area, with low presences in the interior (see Figure 7).Moreover, as we can see in Figure 8 below, not only temperatures, but also soil characteristics (calcareous) are very important in the different models for this species; we could see interesting differences between the spatial and the spatio-temporal models: in the latter, Temperature (average) positively affected the presence of the species, and extreme values (Maximum and Minimum), participated negatively.On the other hand, in spatial models, the most important variable was that the type of soil, followed by Temperatures in III and IV inventory.This species had a similar behavior to the previous one.Elevation and Soil had similar credible intervals in spatial and spatio-temporal models, while there are some differences in the other variables.Looking at Precipitation, in spatial models, the credible interval had positive and negative values, but only positive values were represented in the spatio-temporal model.Finally, the Temperature variables (average, maximum and minimum), had positive and negative values in the credible interval for the

Pinus pinaster
This species shows similarities with C. sativa: it starts with a low presence, then it becomes present in the whole area of analysis, but in this case, the posterior mean in inventory IV shows that this species becomes concentrated in the southern area, with low presences in the interior (see Figure 7).

Pinus Pinaster
This species shows similarities with C. sativa: it starts with a low presence, then it becomes present in the whole area of analysis, but in this case, the posterior mean in inventory IV shows that this species becomes concentrated in the southern area, with low presences in the interior (see Figure 7).Moreover, as we can see in Figure 8 below, not only temperatures, but also soil characteristics (calcareous) are very important in the different models for this species; we could see interesting differences between the spatial and the spatio-temporal models: in the latter, Temperature (average) positively affected the presence of the species, and extreme values (Maximum and Minimum), participated negatively.On the other hand, in spatial models, the most important variable was that the type of soil, followed by Temperatures in III and IV inventory.This species had a similar behavior to the previous one.Elevation and Soil had similar credible intervals in spatial and spatio-temporal models, while there are some differences in the other variables.Looking at Precipitation, in spatial models, the credible interval had positive and negative values, but only positive values were represented in the spatio-temporal model.Finally, the Temperature variables (average, maximum and minimum), had positive and negative values in the credible interval for the  Moreover, as we can see in Figure 8 below, not only temperatures, but also soil characteristics (calcareous) are very important in the different models for this species; we could see interesting differences between the spatial and the spatio-temporal models: in the latter, Temperature (average) positively affected the presence of the species, and extreme values (Maximum and Minimum), participated negatively.On the other hand, in spatial models, the most important variable was that the type of soil, followed by Temperatures in III and IV inventory.This species had a similar behavior to the previous one.Elevation and Soil had similar credible intervals in spatial and spatio-temporal models, while there are some differences in the other variables.Looking at Precipitation, in spatial models, the credible interval had positive and negative values, but only positive values were represented in the spatio-temporal model.Finally, the Temperature variables (average, maximum and minimum), had positive and negative values in the credible interval for the II and III inventories, but in the IV inventory and spatio-temporal model, all the values were positive in the average, and negatives in Max. and Min.

Quercus robur
This species showed an increasing presence in Galicia.During the third inventory, it was present in almost all of the areas, except the southeastern area (see Figure 9).As can see in Figure 10, the variables in Q. robur have a similar behavior to C. sativa, except in the IV inventory model, where the calcareous soil had a very important part in the species distribution.As we have seen in most of the other species, Elevation had similar credible intervals in all the different models, in this case, always showing positive values that were close to zero.As we have seen in C. sativa, the rest of the variables had similar performances in the spatial models and different performances in the spatio-temporal model (except the Max.Temperature, with similar credible intervals in all the models).All the variables had 95% credible intervals, including zeros in the spatial models, while the spatio-temporal model showed positive values in Min.Temperature and negative values in the Soil, Precipitation, and Temperature.

Quercus robur
This species showed an increasing presence in Galicia.During the third inventory, it was present in almost all of the areas, except the southeastern area (see Figure 9).

Quercus robur
This species showed an increasing presence in Galicia.During the third inventory, it was present in almost all of the areas, except the southeastern area (see Figure 9).

Summary
As we can see in Table 2, most of the species show different relationships with environmental and climatic variables between the spatial and spatio-temporal models.The positive symbol (+) summarizes a positive relationship between the variable and the presence; the negative symbol (−) represents the opposite; Rn does not show a clear relationship with a credible interval with the positives and negatives values.Also, if we generalize, species with more presences (see Table 1) show larger differences between models.Also, if we analyse the results from spatial to spatio-temporal models, typically variables not showing a clear effect become positively or negatively associated with presence, depending on the species and the variable.An usual way to estimate out-of-sample prediction error is cross-validation (see [67,68] for a Bayesian approach), but scientists have always looked for alternative methods, as cross-validation involves repeated model fits and it can run into trouble with sparse data [69].When the aim is model comparison, the most common index is the DIC [70,71], which, in the same way to the Akaike information criterion AIC involves two components, a term that measures the goodness of fit, and a penalty term for growing model complexity.More recently, the Watanabe-Akaike information criterion WAIC [72] has been suggested as an appropriate alternative for estimating the out-of-sample expectation in a fully Bayesian approach.This method starts with the calculated log pointwise posterior predictive density, and then adds a correction for the effective number of parameters to adjust for overfitting [69].WAIC works on predictive probability density of observed

Summary
As we can see in Table 2, most of the species show different relationships with environmental and climatic variables between the spatial and spatio-temporal models.The positive symbol (+) summarizes a positive relationship between the variable and the presence; the negative symbol (−) represents the opposite; Rn does not show a clear relationship with a credible interval with the positives and negatives values.Also, if we generalize, species with more presences (see Table 1) show larger differences between models.Also, if we analyse the results from spatial to spatio-temporal models, typically variables not showing a clear effect become positively or negatively associated with presence, depending on the species and the variable.An usual way to estimate out-of-sample prediction error is cross-validation (see [67,68] for a Bayesian approach), but scientists have always looked for alternative methods, as cross-validation involves repeated model fits and it can run into trouble with sparse data [69].When the aim is model comparison, the most common index is the DIC [70,71], which, in the same way to the Akaike information criterion AIC involves two components, a term that measures the goodness of fit, and a penalty term for growing model complexity.More recently, the Watanabe-Akaike information criterion WAIC [72] has been suggested as an appropriate alternative for estimating the out-of-sample expectation in a fully Bayesian approach.This method starts with the calculated log pointwise posterior predictive density, and then adds a correction for the effective number of parameters to adjust for overfitting [69].WAIC works on predictive probability density of observed variables rather Forests 2018, 9, 573 than on model parameter; hence, it can be applied in singular statistical models (i.e., models with non-identifiable parameterization, see [73]. We have also considered the conditional predictive ordinate (CPO) [74] to perform model evaluation.The conditional predictive ordinate (CPO) is established on leave-one-out cross-validation.CPO estimates the probability of observing a value, after having already observed the others.The mean logarithmic score (LCPO) was calculated as a measure of the predictive quality of the model [75,76].High LCPO values indicate possible outliers, high-leverage, and influential observations.
In Table 3, we can see the summary of the WAIC and LCPO values obtained in the different models for each species (spatial and spatio-temporal models); this shows that, looking at the WAIC, most of the species have a better fit for the spatio-temporal model with vegetation (except C. sativa), also looking at LCPO spatial model has more outliers than spatio-temporal models.Summarizing the computational costs of performing the different models, all the models were executed from the same terminal (laptop Core i7 with 12 GB RAM).Spatial models need between 10 and 30 min to obtain the results.However, spatio-temporal models need between 5 and 12 h to finish the process.

Conclusions
We have built spatial and spatio-temporal models to predict the distributions of four different species present in the Spanish Forest Inventory.We have compared the different models and show the relationship between species and environmental variables.We have shown that this relationship changes between spatial and spatio-temporal models.Most of the spatial models show a vague relationship with the environmental variables, which becomes more clear when we analyzed all of the time series when developing the spatio-temporal model.Also, we have shown how the species evolve in space along time, changing their distributions between the II to the IV inventory.
Initially our aim in this project was to apply these models to the whole of Spain, assuming that the spatial continuity is essential to understand species distribution.However due to technical issues, we were not able to finalize this.Currently not all the data from the last inventory are available for all the provinces, and also some of the areas have different reference systems to locate the parcels.Another problem was the computational cost; the available resources were not powerful enough to work with this data volume (more than 90,000 points per inventory).
There are interesting differences between spatial and spatio-temporal models for the different species.As we have shown, not all the same variables have the same weight in the different models.
Several factors can affect spatial distribution of species.Environmental factors are not the only variables that can affect this distribution, but socioeconomic factors, policies, and management criteria can also be important agents that have different impacts in the species presence.
Analysing the models, we can affirm that the use of spatio-temporal models is an advantage for the understanding of the different ecological dynamics, given the the temporal perspective is not very frequent in environmental research projects.
We have analyzed the credible interval of the different variables in order to understand the relationship between the environmental variables and species presence.We can see that some variables change their "weight" depending of the inventory, and several variables also have the same behavior in all the inventories, and also along the spatio-temporal model.Summarizing, we can generalize that permanent and theoretical inalterable variables have similar performances in spatial and spatio-temporal models, showing a similar relationship between the presence of species and these variables along time.Moreover, species presence does not always have a similar relationship with "non static" variables.This relationship is changing, not only due to changes in environmental factors, but also based on species management and possible human disturbances.
Spatio-temporal models and the R-INLA package appear to offer additional benefits beyond the common SDM or spatially-explicit modeling.The combination of using a complex spatial latent field to capture spatial processes and an underlying simple additive regression model for the response variables relationship to environmental factors, means that the fixed effects are potentially more straightforward to interpret [77].Another benefit of a Bayesian approach is the capture of uncertainty for each predicted value, with predictive uncertainty being an often ignored aspect of SDM modeling and prediction.R-INLA models are extremely flexible in their specifications, with spatial autocorrelation and observer bias being straightforwardly incorporated as random effects, while standard error distributions, such as Gaussian, Poisson, binomial, and a variety of zero-inflated models, can be used interchangeably [57].This method, therefore, has a built-in potential for extending SDM analysis away from simple binomial models by, for example, incorporating two or more types of data [78], hierarchical seasonal models [79], or fitting point-process models [80].We hope that our research will aid in the uptake of such fast spatial Bayesian methods, as this approach shows great promise for other analyses in ecology.

Figure 1 .
Figure 1.Boxplot summary of meteorological variables during the different inventories.From top to bottom: mean annual rainfall (Ppr) in mm; mean annual temperature (Tas) in °C, mean of the maximum temperatures of the warmest month (Tas MAX) in °C, mean of minimum temperatures of the coldest month (Tas MIN) in °C.

Figure 1 .
Figure 1.Boxplot summary of meteorological variables during the different inventories.From top to bottom: mean annual rainfall (Ppr) in mm; mean annual temperature (Tas) in • C, mean of the maximum temperatures of the warmest month (Tas MAX) in • C, mean of minimum temperatures of the coldest month (Tas MIN) in • C.

Figure 2 .
Figure 2. The National Inventory region triangulation with the extended boundary.Blue dots are the monitoring stations and the black bold line represents the region border.

Figure 2 .
Figure 2. The National Inventory region triangulation with the extended boundary.Blue dots are the monitoring stations and the black bold line represents the region border.

Figure 3 .
Figure 3. Maps of the posterior mean of the spatial field from the spatial model of A. alba, II, III, and IV forest inventories (coordinates in m.).

Figure 3 .
Figure 3. Maps of the posterior mean of the spatial field from the spatial model of A. alba, II, III, and IV forest inventories (coordinates in m.).

Figure 3 .
Figure 3. Maps of the posterior mean of the spatial field from the spatial model of A. alba, II, III, and IV forest inventories (coordinates in m.).

Figure 5 .
Figure 5. Maps of the posterior means of the spatial field from the spatial model of C. sativa, II, III, and IV forest inventories (coordinates in m.).

Figure 5 .
Figure 5. Maps of the posterior means of the spatial field from the spatial model of C. sativa, II, III, and IV forest inventories (coordinates in m.).

Figure 7 .
Figure 7. Maps of the posterior mean of the spatial field from the spatial model of P. pinaster, II, III, and IV forest inventories (coordinates in m.).

Figure 7 .
Figure 7. Maps of the posterior mean of the spatial field from the spatial model of P. pinaster, II, III, and IV forest inventories (coordinates in m.).

Figure 7 .
Figure 7. Maps of the posterior mean of the spatial field from the spatial model of P. pinaster, II, III, and IV forest inventories (coordinates in m.).

Forests 2018, 9 ,
x FOR PEER REVIEW 11 of 17 II and III inventories, but in the IV inventory and spatio-temporal model, all the values were positive in the average, and negatives in Max. and Min.

Figure 9 .
Figure 9. Maps of the posterior mean of the spatial field from the spatial model of Quercus robur, II, III, and IV forest inventories (coordinates in m.).

Forests 2018, 9 ,
x FOR PEER REVIEW 11 of 17 II and III inventories, but in the IV inventory and spatio-temporal model, all the values were positive in the average, and negatives in Max. and Min.

Figure 9 .
Figure 9. Maps of the posterior mean of the spatial field from the spatial model of Quercus robur, II, III, and IV forest inventories (coordinates in m.).

Figure 9 .
Figure 9. Maps of the posterior mean of the spatial field from the spatial model of Quercus robur, II, III, and IV forest inventories (coordinates in m.).As can see in Figure10, the variables in Q. robur have a similar behavior to C. sativa, except in the IV inventory model, where the calcareous soil had a very important part in the species distribution.As we have seen in most of the other species, Elevation had similar credible intervals in all the different models, in this case, always showing positive values that were close to zero.As we have seen in C. sativa, the rest of the variables had similar performances in the spatial models and different performances in the spatio-temporal model (except the Max.Temperature, with similar credible intervals in all the models).All the variables had 95% credible intervals, including zeros in the spatial models, while the spatio-temporal model showed positive values in Min.Temperature and negative values in the Soil, Precipitation, and Temperature.

Table 1 .
Percentage of presences of the different species in Galicia during the three inventories.

Table 1 .
Percentage of presences of the different species in Galicia during the three inventories.

Table 2 .
Posterior estimates summary.Comparison between variables and presence of different species in spatial and spatio-temporal models (+) represents a positive relationship, (−) a negative relationship, and (Rn) not a clear relationship.

Table 2 .
Posterior estimates summary.Comparison between variables and presence of different species in spatial and spatio-temporal models (+) represents a positive relationship, (−) a negative relationship, and (Rn) not a clear relationship.