Regeneration of Native Forest Species in Mainland Portugal: Identifying Main Drivers

: Persistence of native forests is a global concern. We aimed at unveiling the main factors affecting tree recruitment in Portuguese native forests, modelling sapling data collected during the 5th Portuguese Forest Inventory, for ﬁve main Quercus taxa. Zero-inﬂated count data models allowed us to examine simultaneously (i) the absence of tree recruitment and (ii) the density of tree recruitment. Using Akaike weights, we obtained importance values for 15 relevant explanatory variables. Results showed that seed availability and climatic variables were determinant to understand regional absence of regeneration for all taxa. Seed availability was also an important driver of sapling density, except for Quercus suber . Other variables impacted on regeneration density: grazing hindered Q. suber regeneration; regeneration of Q. rotundifolia and Q. suber was lower in ﬂat areas; recurrent ﬁre hampered the regeneration of Q. robur and Q. pyrenaica ; Q. broteroi and Q. pyrenaica showed depressed regeneration in regions where forest plantations abound, while Q. robur and Q. suber seemed selectively protected. We conclude that caution is warranted when analysing pooled data for Quercus spp. regeneration, as different variables affected Quercus taxa differently. Finally, we suggest dedicated management actions to enhance the establishment of new native forests.


Introduction
Native forests are of great concern for nature conservation [1]. On the one hand, they are known to shelter significantly more biodiversity than planted forests [2][3][4][5][6][7][8][9]; on the other hand, they have been exploited by man since long and eventually replaced on a large scale with agriculture, urban fabric and other land uses [10].
Native forests are nowadays subjected to new challenges that endanger their long-term persistence, such as regeneration failure [11][12][13], climatic changes [14,15] and replacement with non-native forest species [1]. The latter are already widely distributed in Portugal. Portuguese native forests were reduced to less than 7% of the territory by the end of the 19th century [16,17] and never recovered conspicuously [17,18] for three main reasons: (i) during the first half of the 20th century, Pinus pinaster Aiton was the preferred forestry species [17]; (ii) during the second half of the 20th century, preference was given to the exotic Eucalyptus globulus Labill., which is currently the most abundant forest species in Portugal, corresponding to 26% of the forested areas [17,19]; and (iii) since the end of the 20th century, a great increase in the extent of wildfires [20], which recurrently burn shrublands [21], delaying the progression of ecological succession.
We searched for high-resolution/high-quality variables, as the NFI5 dataset has high spatial accuracy (the maximum admissible positioning error of the plots centre was 6 m). Aiming at covering the widest possible range of variable types, we gathered 15 putative explanatory variables known to affect forest regeneration. We describe these variables in the following paragraphs.
From data collected during the NFI5, we used basal area (G), tree density (N), evidence of grazing (GRAZI) and evidence of understory clearing (UnderClear). Also using data from the NFI5 on the dominant and co-dominant trees, we built maps of the density of plots dominated or co-dominated by each of the studied Quercus taxa, within three different predefined neighbourhoods (2 km, 5 km and 10 km). Such maps (Figures S1-S5 within Supplementary Materials) were obtained using the focal function from R raster package [58] and statistically estimate the local to regional presence of adult trees, being fair proxies of (local to regional) seed availability (SeedAvail). SeedAvail derived for a taxon (e.g., Q. robur) was used only for modelling the regeneration of that same taxon (i.e., Q. robur saplings), thus this variable can be summarized as "the density of plots with conspecific adult trees, in a predefined neighbourhood". In the same way, using the dominant and co-dominant species information coming from NFI5, we calculated the density of plots dominated or co-dominated by Pinus pinaster trees (DensPp) and the density of plots dominated or co-dominated by Eucalyptus globulus trees (DensEg) (Figures S6 and S7 in Supplementary Materials). With such variables we wanted to test if (local to regional) forestry intensity could influence Quercus regeneration.
Using data from the Portuguese digital fire atlas (available online at http://www2.icnf.pt/portal/ florestas/dfci/inc/mapas) we produced two variables: fire frequency (FireFreq), as the number of fires experienced by each plot between 1975 and 2006 and time since last fire (TSLFire) coded as a factor with two levels ("≤5 years" and ">5 years"). In this way we could test if regeneration is hindered where fire recurrence is high or if it is triggered by a recent fire.
Concerning climatic data, Monteiro-Henriques et al. [59] published 13 climatological indices for mainland Portugal. From those we chose the summer quarter ombrothermic index (OMBRO) and the compensated thermicity index (THERM), as these are fairly uncorrelated and synthesise important climatological information known to affect plant species distribution [60]. OMBRO is related to climatic summer drought, which is important to discriminate between Mediterranean and temperate bioclimates and THERM reflects annual average temperature with extra weight given to winter temperatures, which are known to limit plant growth and distribution.
Using the digital elevation model produced by the Shuttle Radar Topography Mission version 4.1 [61] we obtained two physiographic variables (TRASP and SLOPE) and one edaphic variable (SoilMoist). TRASP was calculated as described in Evans [62], transforming topographic aspect into a 0 to 1 scale, higher values indicating the sunniest/driest aspects (S-SW exposures); flat stands were treated as if exposed southwards. SLOPE, that is, drop per distance, was obtained using TAUDEM 5.06 (Terrain Analysis Using Digital Elevation Model) D-infinity method [63,64]. SoilMoist, that is, the logarithm of the contributing area, was also calculated using TAUDEM (D8 method) and R STATISTICAL SOFTWARE 3.0.1 [65]. Several packages were used in the process: lattice 0.20-9 [66], permute 0.7-0 [67], foreign 0.8-55 [68], raster 2.1-49 [69], rgdal 0.8-10 [70] and sp 1.0-11 [71,72]. rgdal package relied on libraries GDAL version 1.10.0 and PROJ4 version 4.8.0 [PJ_VERSION: 480]. We calculated two variants of SoilMoist using two different weights: (i) OMBRO, which reflects summer soil moisture; and (ii) the annual ombrothermic index (AOI) [59], which reflects the accumulated soil moisture throughout the year. In order to calculate OMBRO and AOI for the Iberian Peninsula we merged precipitation interpolations coming from Nicolau [73] and Gonzalo Jiménez [74] and used data on temperature, for the Iberian Peninsula, coming from Hijmans et al. [75,76]. In this way we obtained two datasets for the Iberian Peninsula (with 35-m resolution) of weighted contributing areas. Further information can be found at http://home.isa.utl.pt/~tmh/ as well as links to download the contributing area variables and other associated hydrologic variables.
Soil parent material types (SPMType) were also tested as an edaphic explanatory variable, derived from the 1/500,000 Portuguese Geologic Map [77]. Appendix A details the methodology used to construct this variable; see also, in the same appendix , Tables A1 and A2 where further information is given on the construction of SPMType and Figure A1 where a map of SPMType is presented. Table 1 lists the 15 putative explanatory variables, the respective abbreviations, an informal classification of the variable type and the physical phenomenon related to each variable.

SeedAvail
Density of plots with conspecific adult trees (no. of plots in a predefined neighbourhood).
Seed source Local to regional seed availability/conspecific adult presence.

DensPp
Density of plots with adult P. pinaster trees (no. of plots in a predefined neighbourhood).

DensEg
Density of plots with adult E. globulus trees (no. of plots in a predefined neighbourhood).

SoilMoist
Logarithm of the contributing area weighted by OMBRO or AOI (mm • C −1 ).

Edaphic
Parent material base richness and potential soil texture.

Data Pre-Treatments
Pearson correlation coefficient (r) among all variable pairs was checked and multicollinearity was further assessed calculating the variance inflation factors for each taxon dataset using function vif from R package usdm [78]. For all taxa datasets r among all possible pairs of variables used in models was lower than 0.7 (see Figures S8-S12 in Supplementary Materials.). Variance inflation factors were under 2.8 (9.0 excluding zero saplings) for Q. broteroi and under 2.9 (5.2 excluding zero saplings) for all other species, showing low multicollinearity levels. Tables S1-S5 in Supplementary Materials present summary statistics for all variables in all datasets.
The presence of spatial autocorrelation on saplings data was checked with the Mantel test and 10,000 Monte Carlo permutations using mantel.rtest function from package ade4 [79]. For all taxa, the simulated p-values were between 0.75 and 0.97, indicating no correlation between saplings data (excluding zero saplings) and the respective Euclidean distance matrices.
Extreme observations can bring additional ecological insight. However, it was not feasible to compare the fitting results with and without extreme observations, due to the long computation time of the fitting process. After inspecting saplings boxplots for each species, we decided to exclude a few clearly isolated data points corresponding to suspicious extremely high counts; Figure S13 in Supplementary Materials shows the boxplots produced and Table 2 indicates the number of extreme observations removed. Table 2 shows, additionally, which soil parent material types (SPMType) were tested for each taxon and the total number of plots used to implement the models for each taxon (i.e., the total number of plots with available information on both saplings and the 15 explanatory variables).

Zero-Inflated Count Data Models
Zero-inflated count data models (ZICDM) are two-component mixture models that extend the classical generalized linear models (GLM) to account for excess of zeros in the dataset. One component incorporates a point mass at zero while the other corresponds to a count distribution such as Poisson or negative binomial. Therefore, zeros might come both from the point mass and from the count component. ZICDM using a negative binomial count model include a dispersion parameter θ in the modelling procedure, enabling the incorporation of both the excess of zeros and over-dispersion [80].
Following Zeileis et al. [80] the regression equation for the mean of a ZICDM, using the canonical log link for the count component, can be written as: where π i is the probability of belonging to the point mass component, that is, in our case, the probability that regeneration is absent (zero saplings). The second component e (β 0 +x 1i β 1 +...+x mi β m ) is simply the conditional mean coming from the count model (as in a classical Poisson or negative binomial GLM).
β's correspond to the m coefficients of the count model (β 0 the intercept), which multiply by each of the m selected explanatory variables (x i ). The unobserved probability π i is modelled, simultaneously, by a binomial GLM, which conditional mean can be written, using the standard logit link, as: where γ's correspond to the n coefficients of the zero inflation component (γ 0 the intercept), which multiply by each of the n selected explanatory variables (z i ) [80]. The coefficients β and γ, as well as the dispersion parameter θ (for the negative binomial count model) can be estimated, simultaneously, by maximum likelihood [80][81][82]. ZICDM are particularly appropriate to model saplings count data coming from national forest inventories. As count data, saplings number is appropriately handled by the count component of the model (by using Poisson or negative binomial distributions). Additionally, it is plausible that some species do not regenerate in some regions due to, for example, environmental constraints (which ultimately result in the absence of that species in that regions); as a result, an excessive number of zeros occur in the data. However, in the regions where regeneration is more probable, plots presenting zero saplings can still occur (e.g., by a simple matter of chance); these zeros are modelled by the count component.
Interestingly, by combining a logistic response (zero inflation component) with an exponential response (count component), it is possible for ZICDM to fit unimodal responses for a variable (thus, not only monotonic, as logistic or exponential by their own, when using first-degree polynomial terms). Additionally, such combined responses can also be asymmetric/skewed in shape. This added flexibility is only achievable when the variable in question is included in both components of the ZICDM.

Model Fitting
All the 15 chosen variables were tested as predictors in the count component. Regarding the zero inflation component, however, we decided to test OMBRO, THERM, SeedAvail and SoilMoist (i.e., as predictors of the absence of saplings). Other variables such as FireFreq, GRAZI, UnderClear can completely hinder regeneration (e.g., through very high fire frequency or overgrazing) but extremes in FireFreq are very rare in the studied area, thus not expected to produce a true inflation of zeros, neither the number of grazed or managed plots justifies their testing as zero inflating variables. Inversely, climatic variables can impede regeneration over extensive areas, for example, low OMBRO values represent severe summer drought, which in southern Portugal is expected to completely preclude the regeneration of temperate species like Q. robur. Another variable able to produce zero inflation is SPMType (e.g., Q. suber is known to be calcifugous); however, as it is a categorical variable, to avoid numerical problems and perfect fittings we decided simply to remove from the analysis the classes of SPMType presenting less than 10 sapling occurrences, for Q. robur, Q. pyrenaica, Q. suber and Q. rotundifolia, and, exceptionally, 5 for Q. broteroi (as the number of sapling occurrences was considerably lower).
To sample the number of saplings, NFI5 used two different subplot sizes, depending on the species dominating the forest plot, respectively 200 m 2 subplots in plots dominated by Q. suber or Q. rotundifolia and 50 m 2 subplots for all other species. Consequently, the logarithm of the subplot area was used as an offset in all of the fitted models, ensuring that saplings density was correctly modelled.
To decide between the Poisson and negative binomial distribution we firstly looked at the mean and variance of the response variable and checked the Wald-tests of log(θ).
Model fitting was performed in R STATISTICAL SOFTWARE 3.2.0 [83] using packages stats, utils, parallel [83], vegan [84] and pscl [85]. After rescaling continuous independent variables to zero mean and unit variance using function decostand (vegan package), we used zeroinfl function (pscl package) to fit the models and function AIC from package stats to extract the Akaike Information Criteria (AIC) of each model.
As variables influence is expected to vary between taxa, we individually fitted models for each Quercus taxon. Even if we understand that the inclusion of interactions between variables could certainly improve the obtained models [13,38], including 2-way interactions would have raised the number of possible model formulas to an impracticable number. Therefore we opted to not include interaction terms, focusing only on the main terms importance. In this way, the 15 variables used for the count component of the model and the 4 variables selected for the zero inflation component were combined in all possible ways using function combn from utils package to produce 524,288 different model formulas for each taxon.
We performed two sequential runs. In the first run we used: (i) SeedAvail with 2-km neighbourhoods, as we hypothesized that local scale effects on seed availability would be stronger than regional effects, due to the limited seed dispersal range usually found in the Quercus genus [33,34]; (ii) DensPp and DensEg with 10-km neighbourhoods, as we supposed the wider scale would best capture forestry intensity effects on regeneration; (iii) SoilMoist weighted by OMBRO, as the gradient of summer precipitation is pronounced in mainland Portugal and is an important limiting variable for the survival of seedlings and saplings. Therefore, in a first run we fitted models for all possible combinations of: After fitting all models and sorting them by increasing AIC we checked if SeedAvail, DensPp and DensEg were selected amongst the best model (lower AIC), or on some of the alternative models (∆AIC < 2). In case one (or more) of those variables was extant in this set of best models, we studied if other neighbourhood size (or a combination of neighbourhood sizes) minimized AIC further. Finally, when AIC decreased by replacing the neighbourhood size(s), we performed a second run, again fitting all the 524,288 combinations but now using the neighbourhood(s) which decreased AIC. Simultaneously, we also checked for AIC changes by replacing SoilMoist (summer) by SoilMoist (annual), whenever this variable was present in the set of best models. In case the replacement produced a decrease in AIC, we used SoilMoist (annual) in the second run. Even if we wanted to sense the differences between the different neighbourhood sizes and SoilMoist weights, we must stress that we expected minimal changes in the model performance, as the changes in neighbourhood or weight produced, generally, highly correlated variables (see Figures S8-S12 in Supplementary Materials.).

Variables Importance Value
AIC difference to the best model was calculated for each of the fitted models (∆ i = AIC i − AIC min ). As the relative likelihood is directly proportional to a simple calculation using the ∆ i 's ( L(g i data) ∝ e −(1/2)∆ i ), Akaike weights (w i ) can be computed for each model as: where K is the total number of models. Finally, variable importance values were obtained for each of the tested variables, summing the Akaike weights of all models containing the variable: See [86,87] for further information.
Besides variable importance, we found essential to inspect the signal of the effect (i.e., positive or negative), in order to discuss the results from an eco-physiological point of view. We examined the signal of the fitted coefficients in the following way: after ordering all models by their respective Akaike weight (w i ), we selected the first j models that totalized 0.95 of summed w i and checked if each variable sign was positive or negative among those models. A variable signal was considered consistent when always positive or negative within the first j models.

Results
As expected, a large proportion of plots had zero saplings; moreover, saplings variance was systematically greater than the mean (Table 3). Therefore the negative binomial distribution was used and, inspecting the set of best models, the logarithm of the dispersion parameter log(θ) was always significant in the Wald-test for all taxa. Neighbourhood sizes and SoilMoist weights producing lower AIC values varied among the studied taxa. Considering SeedAvail, 2-km neighbourhoods produced better models for Q. robur and Q. pyrenaica, yet for Q. broteroi and Q. rotundifolia higher values were chosen. For DensPp, 10-km produced lower AIC values for Q. robur and 2-km for Q. pyrenaica, while for DensEg, the 10-km neighbourhood was selected both for Q. broteroi and Q. suber. Table 4 summarizes best neighbourhoods results, as well as SoilMoist best weight (summer or annual ombrothermic index). The number of models used to study variable signal consistency was quite different among taxa ( Table 5), meaning that, for some of the studied taxa, Akaike weights were more concentrated within the first models with lower AIC, than for other taxa. Figures S14-S18 in Supplementary Materials show graphically, for each taxon, the sorted AIC of all models and the proportion of the first (sorted) models that summed up to 0.95 of Akaike weight. Importance values based on Akaike weights (w + ), both for the count and zero inflation components, allowed us to identify the most and the least relevant variables for the regeneration of each taxon. Table 6 presents the obtained importance values (values greater than 75% are in bold). On the side of each importance value the result of the signal consistency analysis is also presented: an up arrow indicates a consistent positive effect of the variable in the taxon regeneration; a down arrow indicates a consistent negative effect; if both arrows are present, signal was not consistent in the analysed models. Signals were consistent for all variables with importance values greater than 80%.
In order to allow the inspection of (the best) obtained fits, we present in Tables S6-S10 in Supplementary Materials the summary outputs of the best model (lower AIC) for each taxon.

Zero Inflation Component
Importance values for the zero inflation components of all studied taxa were very high for climatic variables, particularly OMBRO and, to a less extent, THERM. This was expected, as mainland Portugal encompasses two distinct macrobioclimates, temperate in the NW and Mediterranean in the rest of the territory [59,88] and climate is known to determine species absence from entire regions [89]. However, species and/or regeneration can be missing from areas that are climatically suitable. In fact, seed availability (SeedAvail), at relatively much smaller scales (from 2 to 10 km), presented also very high importance values, showing that the presence of nearby adult trees is critical for regeneration to occur. SoilMoist was less relevant by far. Still, zero-sapling occurrences of Q. robur decreased towards higher summer moisture areas (i.e., the bottom of the hills and valleys in areas with rainier summers), which agrees with the species preference for deep, cool soils under temperate climate [25,90].

Count Component
As expected, variable importance values were much different among each taxon. Notwithstanding, local to regional seed availability (SeedAvail) resulted positively linked to higher sapling densities for all taxa except Q. suber; the importance of nearby seed sources in Iberian Quercus regeneration was also found relevant by Gómez-Aparicio et al. [91]. OMBRO was also relevant for all taxa and its selection on both count and zero inflation components (with the same signal) indicates the modelling of non-monotonic responses. We now highlight other important variables for each taxon.

Q. robur
Q. robur regeneration was dependent on forest structure, increasing with basal area (G). In fact Q. robur ability to regenerate under intermediate to low levels of light is well-known [92,93]. However, higher tree densities (N) showed an opposite effect, which can be related to regeneration failure in younger and denser forests, probably plantations, where very low light levels can end up hindering Q. robur regeneration [94,95]. Fire frequency (FireFreq) was particularly relevant at hindering Q. robur regeneration, which agrees with this species sensitivity to recurrent fires that can lead to its replacement by shrubland [45]. DensPp had a moderate positive effect on Q. robur sapling density. Fast-growing pine plantations have long been introduced in the Mediterranean countries assuming they would facilitate hardwoods in the long-term [96]. In Portugal, this strategy was mostly unsuccessful but some old growth pine stands in the centre and north of Portugal currently comprise a substantial Q. robur component. We believe this relates to the effect found for DensPp, alongside with greater awareness of this species value. Sapling density was lower in southern aspects (TRASP), where summer water deficit is likely to be higher (or more severe) than in less sun-exposed terrain aspects.

Q. pyrenaica
The negative response of regeneration to higher basal area (G) or denser forests (N) concurs with [41], who studied the regeneration patterns of Q. pyrenaica and linked it to shade-intolerant species. Q. pyrenaica regeneration particularly suffered with understory clearing (UnderClear) and, to a lesser extent, with grazing (GRAZI). Indeed, this species occupies northern mountains where human settlements are scattered but extensive livestock grazing is still practiced [18,56]. Northern mountains display the highest fire frequency in Portugal, due to intricate relationships between land use, climate and topography [20,97]; accordingly, FireFreq affected sapling density negatively, as it did for Q. robur. Soils with heavier textures and higher base content (SPMType) had a positive effect on sapling abundance, similarly to Plieninger et al. [11] findings. Q. pyrenaica stands on such clayish and base-rich soils are rare in Portugal and are mainly limited to the NE region, where very dense stands with high and rapid regeneration have been described [56].
Although Q. pyrenaica occurrence is linked to the transitional areas between temperate and Mediterranean macrobioclimates, thus enduring summer drought better than Q. robur, southern aspects (TRASP) had lower sapling densities, like Q. robur. Contrariwise, DensPp had a negative effect on Q. pyrenaica, which might be related to forest management actions (as shown also by UnderClear) and possibly to the additional pressure this species suffers from its use as firewood [98].

Q. broteroi
After SeedAvail, regeneration of Q. broteroi was chiefly negatively affected by the regional density of E. globulus stands (DensEg). Q. broteroi use for firewood is documented in the centre west of Portugal, where it comprises a considerable proportion of burned firewood [99]. E. globulus stands planted for pulp production are widespread in this region. This possibly explains the regeneration depression in these areas due to both forest management practices and the overuse of Q. broteroi for firewood.

Q. suber
Basal area (G) affected Q. suber regeneration positively, in agreement with other studies where a nursing effect of tree cover was reported [39,100]. It is striking to note that Q. suber was the only taxon for which saplings density did not increase with the density of stands dominated or co-dominated by Q. suber adult trees in the neighbourhood (SeedAvail). In fact, regeneration failure in the montado system is a well-known and well-studied phenomenon [11,101,102] and most Portuguese Q. suber stands appear in the form of such cultural systems [103].
Grazing (GRAZI) is a common component of the agro-silvo-pastoral montado system and was highly relevant in depressing Q. suber regeneration, in agreement with studies showing that the montado sustainability is threatened by overgrazing, among other factors [101,102,104,105]. Slope arose also as an important variable, with flat areas associated with a decrease in regeneration; interestingly, montados are usually associated with flat areas [106]. Additionally, steeper terrain is usually less accessible, being less prone to human disturbances, as referred by Maia et al. [107].
Another remarkable finding is that Q. suber regeneration was positively associated to E. globulus stands, which was unexpected but it is justifiable. On one hand, E. globulus currently occupies a great fraction of the most oceanic Q. suber potential area in Portugal [19,47]. On the other hand, in this central-western part of Portugal it is common to find Q. suber trees exploited for cork, mainly in marginal areas, as a source of extra income for landowners. High environmental suitability for Q. suber and its presence either as scattered individuals or small stands, might explain its denser regeneration in those E. globulus dominated regions.
Q. suber is known to prefer siliceous substrata [90]. Considering SPMType, its saplings occurred in more than 10 stands for "siliceous loamy" (34.3%), "siliceous heavy" (60.7%) and "basic" (5%) types. Stands on "basic" substrata are considerably less in number, clearly evidencing its preferences. Yet, the variable did not result important as expected. We believe that the 15 stands on "basic" substrata might mainly result from mapping generalization errors (recall base data was on a scale of 1:500,000; see Appendix A) or from sites located in rainy areas where soil characteristics might change due to leaching.

Q. rotundifolia
From all studied taxa, Q. rotundifolia is the one able to withstand higher summer drought [108]. Its aptitude to thrive under dry climates is documented [109,110]. In fact, it was the only Quercus for which the probability of zero saplings increased with summer climatic moisture (OMBRO). As for the other Quercus, this was compensated by OMBRO in the count component (thus producing a non-monotonic response, as the others); however, it reflects the fact that sapling presence is concentrated towards summer drier regions, which are extensive in the centre and south of Portugal [59].
Recent fire events (TSLFire) triggered Q. rotundifolia regeneration, most probably linked to resprouting. Notwithstanding, high Firefreq resulted also negative for Q. rotundifolia regeneration but less importantly than for the temperate sub-Mediterranean species.
As for Q. suber, Q. rotundifolia is also frequently found dominating the tree layer of the Portuguese montado system. Yet, it is also commonly observed dominating residual and marginal forests throughout the Mediterranean areas of mainland Portugal. In Spain, the agro-silvo-pastoral system similar to montado, the dehesa, is also associated with a regeneration failure of Q. rotundifolia [106]. Yet, grazing (GRAZI) did not result relevant in obstructing Q. rotundifolia regeneration, as it did for Q. suber. This is somehow puzzling, as a similar pattern would be expected. However, both species co-occur frequently in Portugal and herbivore preference for Q. suber is documented [111,112], which can shed some light on why grazing affected these species differently. Nevertheless, as with Q. suber, slope resulted equally very relevant, with flat areas (common in montados) having decreased regeneration.

Selected Neighbourhoods
For all taxa, the most important variable affecting regeneration was the density of conspecific trees, that is, the proxy of local to regional seed availability (SeedAvail), being determinant in modelling regeneration absence (producing excess of zeros) and with the exception of Q. suber, it was also positively linked to sapling abundance. Quercus spp. seeds are known to be recalcitrant, with viabilities usually shorter than one year [113]. Thus, Quercus do not form a seed bank and, as expected, the proximity of adult trees resulted relevant for regeneration. Although different neighbourhoods performed differently for SeedAvail, changing neighbourhoods did not produce relevant changes in models. Nevertheless, for the temperate species Q. robur and Q. pyrenaica, 2-km neighbourhoods performed better (confirming our initial hypothesis), while for the Mediterranean species Q. broteroi and Q. rotundifolia, respectively, 5-km and 10-km performed better. In the case of Q. broteroi the neighbourhood increase might relate with the low number of adult stands in the dataset. The neighbourhood increase in Q. rotundifolia might relate with the highly continuous and even occurrence of this species in the southeast of Portugal.
Concerning DensPp and DensEg (which reflect local to regional forestry intensity), the 10-km neighbourhoods performed better in Q. robur, Q. broteroi and Q. suber models, as initially hypothesized.
For Q. pyrenaica the 2-km neighbourhood was the best performing one; in this case we believe the relatively simpler rural landscapes where this species generally occurs allows the smaller neighbourhood to capture forestry pressure more readily. SeedAvail, DensPp and DensEg different neighbourhood performance could be better clarified by further work.

Conclusions
Mediterranean taxa sapling abundance (Q. suber and Q. rotundifolia) increased on less accessible sites, reflecting the long lasting and widespread anthropization of such landscapes. Actually, Q. suber regeneration was particularly impeded by cultural practices, namely grazing.
P. pinaster and E. globulus plantations had a negative effect on less noble essences (Q. broteroi and Q. pyrenaica), which are selectively collected for firewood in their regions of occurrence. Results suggest the more valuable Q. robur and Q. suber saplings are selectively protected under plantations, especially the latter, which is kept for cork harvesting and is strictly protected by law.
Disturbance is a well-known trigger of regeneration as it creates site availability. From the set of chosen variables, FireFreq, TSLFire, UnderClear and GRAZI are vegetation disturbances known to be very frequent in the Portuguese context. Wildfires became very frequent in mainland Portugal since the 1970s. Fire triggered the regeneration of Q. rotundifolia (TSLFire), nonetheless, high FireFreq reduced the abundance of its saplings. FireFreq was particularly relevant in hampering the regeneration of temperate sub-Mediterranean taxa (Q. robur and Q. pyrenaica). Although Quercus species resprout after fire [114], frequent fires can exhaust belowground storage reserves precluding further resprouting and kill saplings that have not yet stored enough reserves for resprouting [115].
Pooled analyses, for example, Vayreda et al. [116], although relevant for the study of more general traits (e.g., pines vs. broadleaves), can mislead the importance and relevance of variables. Vayreda et al. [116] concluded that climate and even disturbances, were less relevant drivers of regeneration. In fact, climate cannot be expected to be much relevant in a pooled analysis, as Quercus spp. are ubiquitous in the Iberian Peninsula, thus pooling them is likely to result in a no-effect for climatic variables. In this study we showed that different disturbances, as well as climatic variables, can be important and their relevance changes from taxon to taxon. Eventually, sapling density, as well as sapling absence, showed strong correlation to climate, thus, climatic changes are expected to impact regeneration in the future.
In a nutshell, management actions aiming at improving native forests regeneration should consider fire control (Q. robur, Q. pyrenaica and Q. rotundifolia), thinning of existing forest plantations, for example, P. pinaster (Q. robur and Q. pyrenaica) and avoiding practices that eliminate the taxon, like understory clearing (Q. pyrenaica and Q. broteroi), firewood overexploitation (Q. broteroi) and overgrazing (Q. suber).

Supplementary Materials:
The following are available online at http://www.mdpi.com/1999-4907/9/11/694/s1, Figure S1: Maps of the no. of plots dominated or co-dominated by Q. robur adult trees (SeedAvail) in predefined neighbourhoods. Black dots represent sites where Q. robur saplings were found, Figure S2: Maps of the no. of plots dominated or co-dominated by Q. pyrenaica adult trees (SeedAvail) in predefined neighbourhoods. Black dots represent sites where Q. pyrenaica saplings were found, Figure S3: Maps of the no. of plots dominated or co-dominated by Q. broteroi adult trees (SeedAvail) in predefined neighbourhoods. Black dots represent sites where Q. broteroi saplings were found, Figure S4: Maps of the no. of plots dominated or co-dominated by Q. suber adult trees (SeedAvail) in predefined neighbourhoods. Black dots represent sites where Q. suber saplings were found, Figure S5: Maps of the no. of plots dominated or co-dominated by Q. rotundifolia adult trees (SeedAvail) in predefined neighbourhoods. Black dots represent sites where Q. rotundifolia saplings were found, Figure S6: Maps of the no. of plots dominated or co-dominated by P. pinaster adult trees (DensPp) in predefined neighbourhoods, Figure S7: Maps of the no. of plots dominated or co-dominated by E. globulus adult trees (DensEg) in predefined neighbourhoods, Figure S8: Pearson correlation coefficients (represented as ellipses) between pairs of variables used in the Q. robur models (n = 4167) [117], Figure S9: Pearson correlation coefficients (represented as ellipses) between pairs of variables used in the Q. pyrenaica models (n = 4461) [117], Figure S10: Pearson correlation coefficients (represented as ellipses) between pairs of variables used in the Q. broteroi models (n = 4631) [117], Figure S11: Pearson correlation coefficients (represented as ellipses) between pairs of variables used in the Q. suber models (n = 4485) [117], Figure S12: Pearson correlation coefficients (represented as ellipses) between pairs of variables used in the Q. rotundifolia models (n = 4642) [117], Table S1: Summary statistics for the Q. robur dataset (n = 4167), Table S2: Summary statistics for the Q. pyrenaica dataset (n = 4461), Table S3: Summary statistics for the Q. broteroi dataset (n = 4631), Table S4: Summary statistics for the Q. suber dataset (n = 4485), Table S5: Summary statistics for the Q. rotundifolia dataset (n = 4642), Figure S13: Sapling counts boxplots for all datasets. Whiskers extend 1.5 times the interquartile range. Outside whiskers, circles indicate observations that were used in the analysis, while crosses those removed, Figure Table S6: Summary output of the best fitted model for Q. robur, Table S7: Summary output of the best fitted model for Q. pyrenaica, Table S8: Summary output of the best fitted model for Q. broteroi, Table S9: Summary output of the best fitted model for Q. suber, Table S10: Summary output of the best fitted model for Q. rotundifolia.

Appendix A
This appendix details how the variable SPMType (soil parent material types) was constructed. The chemical composition and texture/grain size of the parent material influence pedogenesis and mostly determine soil richness and texture, which also strongly depend on other important factors such as climate, biota and so forth. Soil parent material types were obtained by reclassifying the Portuguese Geological Map 1:500,000 [77]. We reclassified each geological map unit (rock types and formations) into seven classes, crossing the associated potential soil base richness and potential soil texture (see Table A1). To position each rock type/formation in each of the seven classes we used information on the richness in SiO 2 , K and Ca and also information on texture/grain size of the parent material. Table A2 presents a detailed characterization of the seven used classes, namely presenting the relative content in SiO 2 , K and Ca of the parent material, the central (potential) soil texture (i.e., usually associated with that parent material) and finally the rock types/formations that were included in each class (mainly in view of the context of mainland Portugal geologic principal features).  Legend: + and -signs indicate relative richness; * generally poor but rare exceptions occur; var.: very variable content.
Igneous rocks were positioned mainly using data from QAPF diagrams (Quartz, Alkali feldspar, Plagioclase and Feldspathoid) and the TAS (Total Alkali vs. Silica) diagram [118,119] and other specific mineralogical information of each rock type/formation, gathered in geological and similar publications. Sedimentary and metamorphic rocks/formations were positioned following specific mineralogical information of the respective rock type/formation, gathered in geological and similar publications. For the present work, the "basic heavy" and "basic very heavy" classes of Table A2 were merged into one single class called "basic heavy".
After the reclassification, the new map was entered into a Geographical Information System (GIS) and orthoimagery was used to improve the map accuracy. In particular, using ESRI ARCMAP 10.0 SP5 Spatial Adjustment tools (rubersheet method) the obtained polygons were adjusted to the Portuguese official boundary [120], a highly accurate dataset of the Portuguese border, as well as, adjusted to match natural boundaries when unequivocally recognized in the orthoimagery provided by ESRI (ESRI basemaps, 50 cm resolution otrthoimages from DGRF collected from 2004 to 2006). Several natural boundaries between geological units could be recognized and were very useful to improve the map georeferencing, for instance: (i) boundaries between granitoid rocks and metasediments were sometimes recognizable in the orthoimages and particularly valuable in the centre and north of mainland Portugal. (ii) in the south, the transition between limestone and metasediments was often distinguishable in the orthoimages. (iii) ultramafic rocks, which are known to affect greatly the vegetation cover, allowed to add some correction points in the north-east of mainland Portugal, using the shift to other basic rocks. (iv) alluvial plains have usually dendritic shapes, matched to the easily recognizable river networks; additionally, their limits are also identifiable straightforwardly in orthoimages, as they often produce clear-cut geomorphologic transitions to the surrounding hills. Therefore, almost all the patches corresponding to alluvial sediments produced several georeferencing correction points.
In total 9875 (circa 25% on the border, 75% inland) control points were used to improve spatial accuracy of the produced dataset. Adjustments were slight (most, up to 600 m) but significantly improved by the matching with natural shapes and with the Portuguese border, thus certainly contributing to overall accuracy of the dataset. This spatial adjustment is, in practice, not much relevant at the 1:500,000 scale (recall 600 m correspond to 1.2 mm in the printed map) but is of utmost importance if the researcher wants to cross the dataset with other data with high spatial accuracy (e.g., point data coming from global positioning devices), as it was performed in the present article.
Some geological formations are composed of intricate layers of very different types of rocks (sometimes very thick) that might point to different classes. In such cases, we kept such information by defining different levels of classes. Although it is difficult to determine which of those complex layers dominate in terms of outcropping matter, based mostly on geological literature, we produced a first level of information (Level 0) where only the dominant class is mentioned. In Level 1 we ascribed (when present) up to two classes to such complex formations (thus, including the co-dominant class). In Level 2 we ascribed all the occurring classes, preserving the original geological information. In the present work, we used Level 0 (dominant class only, see Figure A1). This information can be downloaded from: http://home.isa.utl.pt/~tmh/. Forests 2018, 9, x FOR PEER REVIEW 16 of 22 Figure A1. Soil parent material types map. The map contains other two classes: "alluvia" which are not easily classifiable within any of the defined classes, as their chemical composition and texture vary and "water", corresponding to rivers, estuaries, etc.