Disentangling the Regeneration Niche of Vatica odorata (Griff.) Symington Using Point Pattern Analysis

Abstract: Seed dispersal and environmental heterogeneity, and the effects of their interaction, are perceived to be determinants of the spatial patterns of trees. We applied the spatial point process to analyse Vatica odorata (Griff.) Symington (Dipterocarpaceae) in Cuc Phuong National Park of Vietnam to understand its spatial patterns, and to decipher the main factors affecting seedling establishment of the species. We established a total of 12 replicated plots, each of which had one or two seed trees in the centre, and recorded all regeneration plants of V. odorata with their positions. A total of 671 regeneration plants were found. Covariates, including canopy, ground cover, and distance to seed trees, were measured on systematic grids of 4 × 4 m. In the context of the spatial point processes, we used a generalised linear mixed model, considering a random effect of the plot. In the model, the greatest distance observed is about 35 m from the seed tree. The canopy and ground cover have a significant impact on the regeneration of the species: The intensity of regenerating stems was greatest with a canopy cover of 70%. The ground cover range for good development of regenerating plants was between 10% and 30%.


Introduction
Seed dispersal and environmental heterogeneity, and the effects of their interaction, are perceived to be determinants of the spatial patterns of trees in tropical forests [1,2]. Researchers have proposed various methods within the framework of spatial point process statistics to disentangle determining factors in the field by analysing the spatial pattern itself, for example, by applying full survey plots and a priori hypotheses about pattern effects of dispersal and/or environment. For the assumption of randomness, see [3,4], and for randomness and clustering, see [2,5].
The authors are not aware of studies based on spatial point processes that directly fitted and tested spatial models using data, taking into account both seed dispersal and effects of environmental factors in tropical forests. However, an attempt by Pranchai, et al. [6] to fit inhomogeneous point process models for the effects of environmental variables on seedling patterns in mangroves revealed the potential of such an approach. Clearly, certain prerequisites have to be met in order to render this approach feasible in forests.
isolation [8,9]. The extent of aggregation is tree size-dependent. It is intense at the sapling stage and weakens in most species as trees attain greater sizes [10]. This highlights the importance of the stage of tree development and the appropriate corresponding spatial scale of observation when conducting spatial pattern analyses.
The aggregation of juveniles around single adult trees can be observed in some species (Vouacapoua americana: Traissac and Pascal [9]; Shorea pauciflora: Suzuki, et al. [4]; Diospyros sylvatica: Nguyen, et al. [8]). The limitations of seed dispersal in these species appear to be an important cause of this clumping around adults in the earlier stages of development [10,11]. An opposite effect, namely an absence of young plants from the immediate vicinity of mature trees of the species, has also been described by Nguyen, et al. [8]. Janzen [12] and Connell [13] have documented low regeneration densities in the vicinity of seed trees of the same species as part of the famous 'escape theory'. Spatial aggregation induced by dispersal limitations may be disrupted by density-dependent mortality to some extent [2].
In the case of species with low dispersal ability, the origin of seedlings and saplings in the vicinity of isolated adult individuals can be traced back to the source tree without laborious marking systems with a high degree of probability. This seems particularly true for Dipterocarps [10,14]. Knowledge of the origin of seedlings in combination with a series of spatial positions can provide information about the basic dispersal ability of the species. This constellation of a single adult tree and associated regeneration can allow comparatively simple algorithms to determine dispersal kernels, and so circumvent the obstacles associated with so-called 'inverse modelling' [15]. This may pave the way for a deeper analysis of further environmental influences on regeneration intensity [15].

Environmental Factors
After dispersal, the succeeding regeneration process is affected by environmental factors [16]. A number of environmental factors are well known to impact the germination, survival, and growth of seedlings and saplings in forests. Mean elevation, slope, exposition, direct and diffuse radiation, soil moisture, soil fertility, leaf litter, grass and herbaceous cover, shrub cover, root competition, and herbivory are the factors most often investigated [17][18][19]. Many of the primary effects might be influenced by the overstorey structure [20].
If soil and slope conditions are homogeneous, light is a crucial environmental factor. Too little light often results in limitations on growth [21] and may be detrimental to the very survival of forest plant species [22][23][24]. Nicotra, et al. [23] showed that light availability varies under different canopy densities and structures. Many plant species exhibit high vitality and competitive strength in open conditions [25]. Light-demanding species may lose their competitive power due to the changed environmental conditions created as the canopy closes, which may also contribute to the increased regeneration of shade-tolerant species under a closed canopy [25]. Busing [26] indicated that the densities of all species that regenerate by seed are high near or in canopy openings, i.e., gaps. This was found to be true for both shade-tolerant and shade-intolerant species [26]. Turner [27] showed that the survival and growth of seedlings of Dipterocarps depend on the light availability at microsite level. Shade-tolerant Dipterocarps can survive under closed canopy conditions for a long period of time [28] by building seedling banks. However, even the seedlings of these species eventually die under a fully closed forest canopy [29] because of the loss of root weight and photosynthetic rate at 1-3% relative light intensity [30].
Ground vegetation has a strong potential influence on the dynamics of tree regeneration [31] and forest succession [32], due to the competition for space, light, water and nutrients that it generates [33]. The herbaceous layer is the first hurdle in the regeneration stage, the stratum under which a new generation of trees must germinate and grow [31]. George and Bazzaz [33] found that the presence of understorey cover influences the growth and size structure of the seedling community. The presence of a ground vegetation layer generally tends to reduce seedling density [34].

Choice of Study Object
The aforementioned characteristics, namely limited seed dispersal, high shade tolerance, and a significant proportion of isolated adult trees in natural forests, may be observed for many species of the Dipterocarpaceae family [10,14]. Many Dipterocarps are also considered to be economically valuable [35].
The existence of these ecological conditions for many Dipterocarp species should enable us to apply spatial point process statistics to directly fit and test spatial models, taking into account both seed dispersal and the effects of environmental factors in tropical forests. We applied point pattern analyses to data collected for Vatica odorata (Dipterocarpaceae) seedlings. We chose a natural environment in the north of Vietnam to test the appropriateness of the method in the field. The aim behind the study was to understand and disentangle the main factors that affect the establishment of V. odorata seedlings. Inventory plots were established in Cuc Phuong National Park to collect data pertaining to canopy cover, ground cover, and the distance to seed trees. The data were used to test the following hypotheses: (i) The density of regeneration plants depends on the distance to the nearest conspecific adult. (ii) The ground vegetation cover has a significant influence on the density of V. odorata regeneration. (iii) Density of V. odorata regeneration varies according to the canopy cover.

Study Area
The study took place in Cuc Phuong National Park, Vietnam. The park is located in the north of Vietnam, between 22°14′-22°24′ N and 105°29′-105°44′ E. The area covers a total of 22,220 ha [36,37], situated between 100-600 m above sea level. The geology is limestone karst [36], and the area comprises two mountains separated by a central valley [37]. The topography features many caves, small valleys, and rocky cliffs. Cuc Phuong is subject to a monsoon and tropical climate. The temperature ranges from 16.6-39 °C, with an average annual temperature of 24.7 °C [36]. In the core zone of the park, where the elevation is higher, the average temperature is about 21 °C, with a minimum of 9 °C in the winter [38]. The annual precipitation exceeds 2130 mm year⁻¹ [36,38] and the humidity is about 90% [36]. The soils are derived from the underlying limestone and quartz [36]. Several types of soil may be encountered from the bottom to the top of the limestone mountains, such as red-gneiss-yellow, red-yellow soil, and yellow margaliste-ferralite [36]. The soils formed from karst are stable and have a high nutrient content [39], but the soil texture is light, and the top layer is shallow and moist [36]. Cuc Phuong has a wide variety of higher plant species, with about 1926 species belonging to 990 genera and 229 families [36,40]. The most dominant species in the park belong to the Sterculiaceae and Dipterocarpaceae. The vegetation consists of evergreen tropical forest including many lianas. The number of plant species decreases from the valleys to the mountain peaks [36]. The plots studied were located at the bottom of the valley, along the single road within the park. The plot locations ranged from 200-400 m above sea level. The forest is characterised by five strata, including three woody layers, one shrub layer, and one herb layer. The number of floral species in the valleys is approximately 1219 species, including Bryophyta, Pteridophyta, and Angiosperma [36].

Sampling Design
We searched for and mapped single potential seed trees of V. odorata, positioned along the forest road running through the central valley of the park. Chosen trees were selected from a much larger population of potential seed trees in the area. An individual tree was regarded as a 'seed tree' when the tree was deemed mature. Although the species may reach 40 m in height and 1 m diameter at breast height (dbh) [41], we recorded an individual as a seed tree when it was standing alone or as a pair in the forest and the diameter at breast height was ≥35 cm. Note that the species has bisexual flowers [41].
We established 8 single-tree plots and 4 plots with 2 seed trees each, for a total of 12 plots and 16 seed trees. The plots had different dimensions. The 8 plots with a single source tree had dimensions of 80 × 80 m. The 4 plots with two source trees were 100 × 100 metres (2 plots), 80 × 120 metres (1 plot), and 70 × 70 metres (1 plot). The dimensions of the plots were adapted to the individual conditions of the site, i.e., the plots should include all plants of the regeneration of V. odorata that could be associated with the source tree(s). The positions of the seedlings found in the plot were mapped using a Cartesian coordinate system, applying a level of precision of 0.10 m. Germinants were excluded from the investigation. See Figure A1 in Appendix A for illustrations of the plots.
We classified regenerating stems, which we here define as individuals with a height x of more than 20 cm and a dbh of less than 30 cm, into four height classes: h1 for x < 100 cm, h2 for 100 cm ≤ x < 200 cm, h3 for 200 cm ≤ x < 300 cm, and h4 for 300 cm ≤ x. A total of 671 plants of regeneration were found and mapped. Throughout this paper we will refer to plants of regeneration as 'seedlings' regardless of their actual height.
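The height classification above can be written as a small lookup. The following Python sketch is illustrative only: the function name and the sample heights are hypothetical, and the dbh criterion (< 30 cm) is assumed to have been checked beforehand.

```python
def height_class(height_cm):
    """Assign a regenerating stem to h1..h4 following the class limits
    given in the text. Returns None for stems of 20 cm or less, which
    do not count as regenerating stems here."""
    if height_cm <= 20:
        return None
    if height_cm < 100:
        return "h1"
    if height_cm < 200:
        return "h2"
    if height_cm < 300:
        return "h3"
    return "h4"


# Hypothetical heights (cm), chosen to hit every class boundary.
sample_heights = [25, 99.9, 100, 250, 300, 15]
sample_classes = [height_class(h) for h in sample_heights]
```

Class limits are treated as left-closed intervals, so a stem of exactly 100 cm falls into h2, in line with the definition 100 cm ≤ x < 200 cm.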

Systematic Grid and Environmental Factors
To analyse the effect of environmental factors we measured two covariates, canopy cover and ground cover, on a systematic grid of 4 × 4 m spacing, which was established within the boundaries of each plot. Covariates were measured at the grid points.

Canopy Cover
Measurements of canopy cover were taken with a crown mirror densiometer [42]. The percentage of canopy cover included any area covered by leaves and branches above a height of 1.3 m. The Paine scale was applied to assign a percentage cover [43].

Ground Cover
Ground cover was defined for the purposes of this study as the vegetation associated with herbaceous plants < 1 m in height. Higher grasses and palms were not included in the percentage ground cover but regarded as canopy cover instead.
At grid points, a sub-quadrat of 2 × 2 m was used to document the ground cover. The point of intersection of two lines in the grid was in the centre of the sub-quadrat. The percentage of a sub-quadrat covered by ground vegetation was recorded ocularly. The data were recorded in classes [44] and later transformed into a percentage ground vegetation cover (ground cover).

Establishing Image Datasets for Covariates
The environmental covariates, i.e., the ground vegetation and canopy cover, were observed at grid-points in plot windows. These observed numerical values were then spatially smoothed with an isotropic Gaussian kernel. The result is a pixel image providing a covariate value at any given point. Pixel values are values of the interpolated function.
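To make the smoothing step concrete, the following Python sketch computes a kernel-weighted average (a Nadaraya-Watson type smoother) of grid-point values with an isotropic Gaussian kernel. It is a minimal illustration under stated assumptions, not the routine used in the study: the function name, the bandwidth of 4 m, the 1 m pixel size, and the four example grid values are all hypothetical.

```python
import math


def smooth_covariate(grid_points, values, x, y, sigma=4.0):
    """Kernel-weighted mean of the grid-point values at location (x, y),
    using an isotropic Gaussian kernel with standard deviation sigma (m).
    sigma = 4.0 is an assumed bandwidth, not a value from the paper."""
    num = 0.0
    den = 0.0
    for (gx, gy), z in zip(grid_points, values):
        d2 = (x - gx) ** 2 + (y - gy) ** 2
        w = math.exp(-d2 / (2.0 * sigma ** 2))
        num += w * z
        den += w
    return num / den


# Hypothetical canopy-cover readings (%) at four neighbouring grid points
# spaced 4 m apart.
grid_points = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]
canopy = [60.0, 70.0, 80.0, 90.0]

# Value at the centre of the grid cell: all four points are equidistant,
# so the result is their plain average.
centre_value = smooth_covariate(grid_points, canopy, 2.0, 2.0)

# A small "pixel image": one smoothed value per 1 m pixel centre.
pixel_image = [
    [smooth_covariate(grid_points, canopy, i + 0.5, j + 0.5) for i in range(4)]
    for j in range(4)
]
```

Because each pixel value is a weighted average, the smoothed image never leaves the range of the observed grid-point values.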
The distance to the nearest potential seed tree was determined for any point in the plot area from the coordinates of the seed trees and the point coordinates (via Euclidean distance). This distance information was saved as a pixel image and further on treated like a covariate.
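The distance covariate can be sketched in a few lines of Python. The function names, the 1 m pixel size, and the seed-tree coordinates below are assumptions made for illustration; only the use of Euclidean distance to the nearest seed tree is taken from the text.

```python
import math


def nearest_seed_tree_distance(x, y, seed_trees):
    """Euclidean distance (m) from (x, y) to the nearest seed tree."""
    return min(math.hypot(x - tx, y - ty) for tx, ty in seed_trees)


def distance_image(width, height, seed_trees, pixel=1.0):
    """Distance to the nearest seed tree, evaluated at pixel centres.
    Returns a list of rows (y direction), each a list of values (x direction)."""
    nx = int(width / pixel)
    ny = int(height / pixel)
    return [
        [
            nearest_seed_tree_distance((i + 0.5) * pixel, (j + 0.5) * pixel, seed_trees)
            for i in range(nx)
        ]
        for j in range(ny)
    ]


# Hypothetical single-tree plot of 80 x 80 m with the seed tree in the centre.
single_tree = [(40.0, 40.0)]
d_single = nearest_seed_tree_distance(43.0, 44.0, single_tree)

# Hypothetical two-tree plot: the nearer of the two trees determines the value.
two_trees = [(30.0, 50.0), (70.0, 50.0)]
d_pair = nearest_seed_tree_distance(45.0, 50.0, two_trees)

image = distance_image(80, 80, single_tree)
```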

Investigating the Relationship between Canopy Cover and Ground Cover
The covariate data measured at grid points were analysed for cross-correlation. We computed a bivariate moving window correlation, thereby assessing the significance of the correlation between two spatial processes [45,46]. In addition, we performed non-parametric smoothing to visually represent the overall relationship between the two covariates.

Spatial Point Pattern Analysis
The general technique used to describe the spatial pattern of V. odorata regeneration was spatial point process analysis [47][48][49]. A prerequisite for an analysis of this kind is a common observation window (see description of plots) for the different data, e.g., point data and covariates. We established 12 replicates of the V. odorata source tree and regeneration settings. Each of these replicates was subjected to a known set of experimental conditions (described by the values of the covariates), and each replicate yielded a response, which is a spatial point pattern. The point patterns were assumed to be independent, but not necessarily having the same distribution.
The several point patterns and the associated covariate data were stored in a special format, i.e., a 'hyperframe' [47]. The covariate data were numeric data, stored as pixel images. In the 12 replicates, observation windows of different sizes were established (see description of plots).
An important characteristic of a spatial point pattern is its intensity measure, which specifies the expected number of points in subsets of the plane. It is here determined by its density, the intensity function λ(u), which expresses this per unit of area at each location u. We will refer to λ(u) as intensity or simply density ([48], chapter 4, page 175).
Prior to modelling the density of V. odorata seedlings, the spatial point pattern of the V. odorata seedlings had to be described. The effects of the covariates were assessed individually.
The average density of V. odorata seedlings in the observation window was estimated by the following equation ([48], p. 80):
$$\bar{\lambda} = \frac{N(W)}{\nu(W)} \qquad (1)$$
where N(W) is the number of V. odorata seedlings in the observation window and ν(W) is the area of the window. In addition to the question of mean V. odorata seedling densities, our analyses addressed the issue of the spatial patterns of V. odorata seedlings throughout the window.
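As a worked example of the average density, the following Python sketch pools all 12 plots. It assumes that each observation window is exactly the nominal rectangle given in the sampling design; the actual windows may differ, so the result is indicative only. It is also a pooled estimate (all seedlings divided by the total area), which need not equal a mean of per-plot densities.

```python
# Nominal plot dimensions (m) from the sampling design: 8 single-tree plots
# of 80 x 80 m, and two-tree plots of 100 x 100 m (2), 80 x 120 m (1) and
# 70 x 70 m (1). Treating these as the exact windows is an assumption.
plot_dims = [(80, 80)] * 8 + [(100, 100)] * 2 + [(80, 120)] + [(70, 70)]

areas = [w * h for w, h in plot_dims]   # window areas nu(W) in m^2
total_area = sum(areas)                 # pooled window area in m^2

n_seedlings = 671                       # total number of mapped seedlings
pooled_density = n_seedlings / total_area  # seedlings per m^2

print(f"total area: {total_area} m^2")
print(f"pooled density: {pooled_density:.6f} seedlings per m^2")
```

Under these assumptions the pooled window area is 85,700 m², which gives roughly 0.0078 seedlings per m².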

Modelling V. odorata Seedling Intensity
To test our hypotheses (see introduction), we assumed that each replicate, given the plot random effect, is an inhomogeneous Poisson point process with intensity function λ(u) as specified by Equation (2) below.
We sought to analyse a spatial trend caused by environmental covariates. More specifically, we tried to describe the corresponding mathematical function for that spatial trend. We did not account for an internal process of clustering, i.e., point interaction. However, inhomogeneity and clustering may be related [47].
Indeed, because the random effect, which affects λ(u) in a multiplicative way (see Equation (3) below), is itself a random variable, the resulting point process is a 'mixed Poisson process' and belongs to the class of 'Cox processes'. We assumed that the log-intensity is a normally distributed random variable. We checked for the Poisson distribution of the mean intensity of the replicates.
An important step in the analysis of a non-stationary point process is specifying a function ρ which provides information on how the intensity at a point u depends on the vector Z(u) of covariates at that location. Thus,
$$\lambda(u) = \rho(Z(u)) \qquad (2)$$
To assess the effects of individual covariates on seedling density and get an initial understanding of the complexity of the models needed, we first created graphs of the estimated functional relationship between the response, i.e., intensity, and the given covariate, assuming a function ρ depending on that covariate only. Such a relationship might be called a resource selection function (RSF). We used a nonparametric method based on kernel smoothing, as described in [47] (chapter 6.6.3), which does not assume a particular form for the relationship. Applying this procedure, we selected relevant covariates and visually gained a rough idea of the functional relationship. Where the shape of the RSF indicated it, the covariates were included in the model with exponents to allow for higher-degree polynomials.
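The idea behind a kernel-smoothed estimate of ρ for a single covariate can be sketched as a ratio: kernel mass of the covariate values at the seedling locations, divided by kernel mass of the covariate values over the whole window. The Python code below is a simplified illustration of this ratio form and is not claimed to reproduce the estimator described in [47]. The function names, the bandwidth h, and the example values are hypothetical.

```python
import math


def gauss(t, h):
    """One-dimensional Gaussian kernel with bandwidth h."""
    return math.exp(-0.5 * (t / h) ** 2) / (h * math.sqrt(2.0 * math.pi))


def rho_hat(z, cov_at_points, cov_at_pixels, pixel_area, h):
    """Ratio-type estimate of rho at covariate value z.

    cov_at_points: covariate values at the seedling locations
    cov_at_pixels: covariate values at all pixel centres of the window
    pixel_area:    area of one pixel (m^2)
    """
    num = sum(gauss(zi - z, h) for zi in cov_at_points)
    den = pixel_area * sum(gauss(zu - z, h) for zu in cov_at_pixels)
    return num / den


# Sanity check: if the covariate is constant everywhere, the estimate
# reduces to the average density N(W) / nu(W).
rho_const = rho_hat(50.0, [50.0] * 10, [50.0] * 100, pixel_area=1.0, h=5.0)

# Hypothetical example with 'distance to seed tree' as covariate:
# 40 pixels with distances 0.5 .. 39.5 m, seedlings mostly near 10 m.
pixel_dist = [i + 0.5 for i in range(40)]
point_dist = [8.0, 9.0, 10.0, 11.0, 12.0, 30.0]
rho_near = rho_hat(10.0, point_dist, pixel_dist, pixel_area=1.0, h=2.0)
rho_far = rho_hat(30.0, point_dist, pixel_dist, pixel_area=1.0, h=2.0)
```

In the hypothetical example, the estimated intensity at 10 m exceeds that at 30 m because five of the six seedlings lie close to 10 m while the window contains about the same area at either distance.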
An initial model was then built, including all of the relevant covariates identified by the RSF as variables with fixed, unknown coefficients and a random intercept for each plot; i.e., this model was a mixed model.
The model was evaluated within a generalised linear mixed model (GLMM) framework with the intensity function
$$\lambda_i(u) = \exp\left(\beta^{T} x_i(u) + \gamma_i\right) \qquad (3)$$
for each plot i, where β^T is a vector of fixed coefficients, x_i(u) denotes the fixed covariates of plot i at location u, and γ_i is the random intercept of plot i. Following the fit of the first model, we assessed the importance of each covariate using the AIC (Akaike information criterion) value. This was done by stepwise elimination of those variables whose elimination reduced the AIC of the previous model. Once further reduction of the AIC by eliminating additional variables was no longer possible, we checked for interactions among variables in the fixed model and the plot. As could be seen from the RSF, only the distance to the seed source showed some evidence of an interaction with the plot.
In all steps, we also compared the new, smaller model to the old, larger model using a likelihood-ratio test (LRT), assuming an asymptotic chi-squared distribution for the deviance difference. After the described sequence of model improvements, we finally obtained the model (see Section 3.2):
$$\lambda_i(u) = \exp\left(\beta_0 + \sum_{k=1}^{3} \beta_{1k}\,\mathrm{dist}(u)^{k} + \sum_{k=1}^{3} \beta_{2k}\,\mathrm{cover}(u)^{k} + \sum_{k=1}^{2} \beta_{3k}\,\mathrm{ground}(u)^{k} + \gamma_i\right) \qquad (4)$$
where i denotes the i-th plot, u denotes the location within that plot, and the random effect parameter γ_i relates to the i-th plot. In addition, 'dist' denotes the distance to the nearest source tree, 'cover' is the canopy cover, and 'ground' indicates the ground cover. Finally, the 'cumulative distribution function' (CDF) test [47] was employed to test the goodness-of-fit of the model. By choosing the covariate 'distance to the nearest source tree' for the test, we ensure that the test is particularly sensitive to deviations from the null hypothesis, where intensity depends on the distance to the nearest source tree.
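The two comparison criteria used during model reduction can be illustrated numerically. The Python sketch below computes the AIC and a likelihood-ratio test p-value for two nested models that differ by one parameter. The log-likelihoods and parameter counts are invented for illustration and are not results from this study; the p-value formula is the chi-squared survival function for one degree of freedom only.

```python
import math


def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * n_params - 2 * loglik


def lrt_pvalue_df1(loglik_small, loglik_large):
    """Likelihood-ratio test for nested models differing by ONE parameter.
    The deviance difference is compared with a chi-squared distribution
    with 1 degree of freedom, whose survival function is erfc(sqrt(x / 2))."""
    deviance_diff = 2.0 * (loglik_large - loglik_small)
    return math.erfc(math.sqrt(max(deviance_diff, 0.0) / 2.0))


# Hypothetical fits: the larger model has 10 parameters, the smaller one 9.
loglik_large, k_large = -1000.0, 10
loglik_small, k_small = -1000.4, 9

aic_large = aic(loglik_large, k_large)
aic_small = aic(loglik_small, k_small)
p_value = lrt_pvalue_df1(loglik_small, loglik_large)

# Elimination rule as described in the text: drop the variable if doing so
# reduces the AIC.
drop_variable = aic_small < aic_large
```

In this invented case the smaller model has the lower AIC and the LRT gives no evidence that the dropped term is needed, so both criteria point the same way.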
Predictions of the effect functions of single covariates were computed, and confidence intervals for the values of the effect functions were constructed as described in Appendix B.
We used the standard techniques for mixed effects models described in [47]. Computations were performed using the R software version 4.1.1 (package 'spatstat' (2.2-0); R Core Team [50]).
More information on the theory of replicated point patterns in ecology is available in [51].

Basic Statistics
The plot number was assigned according to the number of the source tree involved. If two source trees were present in a plot, the plot number was composed of the numbers of the two trees. Table 1 contains a summary of the characteristics of the source trees. The smallest trees recorded were located in plots 45-28 and 33, with a dbh of 38 cm and heights of 17 and 18 m, respectively. The largest trees recorded were found in plots 39 and 48, with dbh of 100 and 95 cm and heights of 45 and 40 m, respectively.
The general shape of the regeneration pattern and the spatial variation of the covariates are presented in Figure 1. The seed tree is in the centre of the plot. Covariate values based on the systematic grid data were estimated for the area of the whole plot by means of smoothing, as explained in Section 2.3.
Table 1. Characteristics of the source trees.
Tree  Plot    dbh (cm)  Height (m)  (header not recovered)
18    18      79        35          26
25    25      56        37          25
27    27      79        36          23
28    45-28   38        17          8
33    33      38        18          11
34    35-34   79.5      35          22
35    35-34   84        40          24
36    36      54        30          19
39    39      100       45          26
42    42      50        32          16
45    45-28   47        25          15
48    48      95        40          23
50    50      …         …           …
The frequency distribution of the height classes of the seedlings was biased towards the smallest individuals. There were 317, 246, 66, and 42 trees assigned to the classes h1, h2, h3, and h4, respectively. Altogether 47% of the individuals were smaller than 1 m in height, and more than 83% of the 671 seedlings were smaller than 2 m.
The frequency distributions of distance to the nearest conspecific adult for the different height classes are shown in Figure 2. The median of the frequency distribution is a distance of approximately 12 m in h1 and h2, 15 m in h3, and 18 m in h4. The median distance to the nearest conspecific adult was smallest amongst the small seedlings and largest amongst the tallest. The difference between these two classes was approximately 5 m.
Ecologies 2022, 3 343
The ranges and mean values of the covariates are provided in Table 2. The canopy cover ranged from 40-95%, with few outliers below 40%. The mean value was 75.37%. The ground cover ranged from 10-90%, with a mean of 35.5%. The distance to the nearest potential seed tree depended on the size of the window, the number of seed trees, and their positions. Distances of up to 72 m between seedlings and potential seed source were observed in the windows.
The degree of cross-correlation between the covariates 'canopy cover' and 'ground cover' was investigated. The cross-correlation was almost always negative within plots, i.e., it varied between −0.41 and 0.09 with a mean of −0.22 (see Table A1 in Appendix A). However, the cross-correlation within a plot was never significant according to Dutilleul's modified t-test [46]. The overall relationship between ground cover and canopy cover is shown in Figure 3.
Table 2. Summary of the means and ranges of the covariates in 12 plots. In the case of plots with two seed trees, the plot name is a combination of the two tree numbers.

Seedling densities differed from plot to plot, ranging from 0.003281 to 0.019896 seedlings per m², with a mean of 0.007632 seedlings per m². However, densities varied considerably within the plots, and the covariates seem to have had a strong impact on seedling density. Figure 4 shows the simple estimates based on kernel smoothing for the general relationship between the intensity of regeneration and the distance to the nearest seed source found in the 12 plots. In most plots, seedling density was highest at some distance from the source, between 10 and 20 m, and levelled out at distances ≥ 30 m. This course of seedling density can be described with a third-degree polynomial. The relationship between regeneration density and the other covariates can be found in Appendix A (Figures A2 and A3).
Figure 4. Estimated intensity of regeneration as a function of the distance to the nearest seed tree. The solid black line represents the estimated function (ρ(z)), the red dotted line represents the mean, and pointwise 95% confidence intervals for the true value are plotted as grey shading.

Model Results
The final generalised linear mixed model (GLMM) for V. odorata seedling density is summarised in Table 3.
As discussed in Section 2.4, the generalised linear mixed model was configured based on the observed general trends, and polynomials of the second degree (ground cover) and third degree (distance to seed source, canopy cover) were assumed for the effects of single variables.


Quality of Fit
The appropriateness of the model can be seen in the residual diagnostic plot (see Figure A4, Appendix A). Most of the observed values are covered by the cloud of values predicted by the model. In most of the plots, the cumulative sums of residuals along both the x-axis and the y-axis are within the confidence interval. In addition, in the smoothed residual field, the difference between the model and the true value is close to zero for most of the plots.

The main effects of each covariate on regeneration are shown in Figure 5.


Distance to Seed Tree
In the model, the effect of the distance to the seed tree on the density of young trees was striking. The effect was always negative (Figure 5A). The maximum distance we modelled was 35 m to the source tree. The highest densities of young trees were found at 5-10 m from the parent trees. As can be seen in Table 4, the distance effect was present in all plots, as the model with the interaction between distance and plot was inferior to the model without this interaction (but see the discussion of the effect of tree height on distance). A model taking the interaction of canopy cover and ground cover into account provided no added value either (Table 4).
The cumulative distribution function (CDF) test, which tests the null hypothesis that the real distribution and the model distribution of the distance to the nearest potential seed tree are equal, indicated that the model provided an acceptable fit to the regeneration pattern of V. odorata (p-value = 0.503) (Figure 6). However, some systematic deviations from the predicted distribution were observed. Whereas plots 33 and 42 had higher real values of cumulative seedlings at shorter distances than the model predicted, the opposite effect was observed for plots 25, 48 and 50-51.

Ground Cover
Increasing ground cover had a negative impact on the regeneration of V. odorata. Figure 5B shows that the regeneration status is good between 10 and 30% ground vegetation cover. V. odorata seedlings failed to establish where the ground vegetation cover exceeded 70%.

Canopy Cover
The general trend is shown in Figure 5C. The young tree density increased from a degree of canopy cover of 40%, the lowest value in our plots. The density reached a peak at around 70% and then declined at values higher than 75%.
The effect curves in Figure 5 were calculated by fixing the other variables at the following values: canopy cover at 70%, ground cover at 20%, and distance to seed tree at 10 m.
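How such curves arise from a log-linear intensity model can be illustrated with a short R sketch. The polynomial terms follow the model formula given in Appendix B, and the reference values are those stated above, but the coefficients are hypothetical placeholders chosen only so that the curves have a plausible shape. They are not the fitted values of the study, and `effect_curve` is a helper introduced here for illustration.

```r
# Hypothetical coefficients of a log-linear intensity; NOT the fitted values of the study.
beta <- c("(Intercept)" = -13,
          "Cover" = 0.28, "I(Cover^2)" = -0.002, "I(Cover^3)" = 0,
          "Ground" = -0.01, "I(Ground^2)" = -0.0005,
          "Dist" = 0.15, "I(Dist^2)" = -0.01, "I(Dist^3)" = 0)

# Right-hand side of the model formula (same terms as in Appendix B).
rhs <- ~ 1 + Cover + I(Cover^2) + I(Cover^3) + Ground + I(Ground^2) +
  Dist + I(Dist^2) + I(Dist^3)

# Reference values at which the other covariates are held fixed (from the text).
ref <- list(Cover = 70, Ground = 20, Dist = 10)

# Vary one covariate over 'values' while keeping the others at their reference values.
effect_curve <- function(var, values) {
  newdata <- as.data.frame(ref)
  newdata <- newdata[rep(1, length(values)), , drop = FALSE]
  newdata[[var]] <- values
  X <- model.matrix(rhs, newdata)
  # Intensity = exp(linear predictor); coefficients are matched by column name.
  data.frame(value = values, intensity = exp(drop(X %*% beta[colnames(X)])))
}

dist_curve  <- effect_curve("Dist",  seq(0, 35, by = 0.5))
cover_curve <- effect_curve("Cover", seq(40, 100, by = 1))
dist_curve$value[which.max(dist_curve$intensity)]
```

Because the model is additive on the log scale, changing the reference values rescales each curve by a constant factor but does not move the position of its maximum.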

General Observations
The primary seed dispersal distances observed in Dipterocarpaceae have generally been short [14]. Although small mammals may act as secondary dispersal agents [52], the observed distances between seedlings and the nearest seed source are short for most species, rarely exceeding 40 m [53]. The seedling positions relative to the nearest potential seed sources and the distances measured in our study are in line with these observations (see Figure 2).
The pattern of distance distributions ( Figure 2) shows that the distances of seedlings from the source tree shift between growth stages, with a significantly higher frequency of seedlings under the canopy of the adult conspecifics than of saplings [54]. The V. odorata seedlings that survive tend to be those located farther away from the parents than others [55]. This corresponds to the 'escape theory' put forward by Janzen [12] and Connell [13].
Takeuchi, et al. [53] found the same trend in a Dipterocarp forest in Malaysia, where the mortality of the seedlings and saplings of four Dipterocarp species was highest close to the conspecific adults (≤5 m). In our study, the median distance from the V. odorata seed tree increased from 12.8 m in the case of seedling and sapling height class h1, to 15.9 m, 15.6 m, and 17.9 m for height classes h2, h3, and h4, respectively. The largest saplings tended to be those that had 'escaped' the immediate vicinity of the closest conspecific adults.
Thus, we may conclude, in agreement with Takeuchi, et al. [53], that V. odorata is a 'long-distance disperser' on the Dipterocarpaceae scale; this is based on a maximum observed distance of almost 50 m from the nearest conspecific mature tree, the shift in the mean distance to the nearest conspecific adult by height class, from 12.8 m (h1) to 17.6 m (h4), and, finally, the virtual absence of height class h4 seedlings at distances of less than 10 m from the nearest conspecific adult.
The environment facing many of the seedlings in our plots was characterised by high canopy cover (mean: 75%) and a layer of ground cover (mean: 35.5%) lower than 1 m in height. Canopy cover and ground cover were negatively correlated in our data. This is consistent with reports that an opening of the canopy leads to an increase in ground cover, including grass cover [56] and the herbaceous layer [25]. A ground cover value below 35%, i.e., less than the mean value, only occurs where canopy cover exceeds 75% (see Figure 3).
If one takes the findings presented in Figure 3 into account, and simply sums up the percentages of canopy cover and the corresponding ground cover, the result is almost always a total cover of more than 100%. The germinants, and subsequently the seedlings, must, therefore, cope with a heavily shaded environment, typical of multi-layered old-growth forests in tropical conditions. However, there is much variation on a small scale, and the spatial resolution of our data, e.g., the covariates, probably does not entirely reflect this variation.

Distance between Seedlings and the Nearest Potential Seed Source
Although the seedlings belonging to the different height classes exhibited different frequency distributions relative to the distance to the nearest potential seed source, the model did not account for the different height classes of the regeneration. Furthermore, the model predicted the spatial density (N/m²) of regeneration and not the total frequency (N) at particular distances. Thus, in contrast to the maxima of the distance-frequency distributions of the seedlings, which were between 12 and 18 m (Figure 2), the maximum of the predicted density distribution in terms of distance to the nearest potential seed source was between 5 and 10 m. However, the model accounted for this hump-shaped feature of the distance-density relationship quite well, which again fits the Janzen-Connell hypothesis.
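The difference between the two maxima follows from geometry: the number of stems at distance r is the density multiplied by the area of the ring at that distance, and ring area grows with r. The R sketch below illustrates this under simplifying assumptions that are not part of the study: a single seed tree, an isotropic density, complete rings within the plot, and a hypothetical density function with its peak between 5 and 10 m.

```r
# Distance classes of width 0.25 m around a single (assumed) seed tree.
width <- 0.25
r <- seq(0.25, 35, by = width)

# Hypothetical density (stems per m^2) with a hump at 7.5 m; not the fitted model.
density_r <- exp(-3 + 0.15 * r - 0.01 * r^2)

# Expected number of stems per distance class = density x ring area (approx. 2 * pi * r * width).
ring_area <- 2 * pi * r * width
freq_r <- density_r * ring_area

r_dens <- r[which.max(density_r)]  # distance of maximum density
r_freq <- r[which.max(freq_r)]     # distance of maximum frequency
c(density_peak = r_dens, frequency_peak = r_freq)
```

With these assumed numbers the frequency peak lies several metres beyond the density peak, which is the same direction of shift as between the modelled density maximum and the observed distance-frequency maxima.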
Taking into account the observations made in relation to the seedling positions in our study, and the distances to the nearest potential seed source mentioned above, the model results pertaining to the distance to the nearest seed source generally agreed well with the observations (Figure 6). However, the model fit in terms of distance between seedlings and the nearest potential seed source sometimes underestimated the actual values (plots 48, 25, and 50-51) and sometimes overestimated these values (plots 33, 42, and 45-28), as can be seen in Figure 6. This observation suggests that the distance of seed dispersal may be related to the height of the corresponding seed tree. A positive relationship between seed dispersal distance and the height of the source tree has been reported [32,57] and is a general phenomenon observed for 41 tropical tree species [58]. Future studies involving regeneration modelling in V. odorata should consider the height of the seed tree and other characteristics to arrive at a better understanding of recruitment.

Ground Cover
The ground cover exhibited a significant (negative) impact on the density of seedlings and saplings (Table 3 and Figure 5B). Rizza, et al. [59] showed that an increase in ground vegetation cover from 20-50% to 75% reduces photosynthetically active radiation (PAR) from 50% to 20%. Viewed in combination with the height class distribution of the seedlings in our investigation (47% of the individuals were <1 m and 83% were <2 m), it seems probable that this shading effect affects a large proportion of the seedlings. A high percentage of ground vegetation cover is a strong constraint on the survival of young trees. Similar to northern red oak (Quercus rubra) and eastern redbud (Cercis canadensis), both moderately shade-tolerant species [59], as well as Tetracentron sinense, a species native to Asia [60], and Homalanthus novoguineensis and Polyscias murrayi, woody species of tropical regions [61], survival in our investigation was highest in the 20-30% ground cover range. The impact of ground vegetation cover on the regeneration of Dipterocarpaceae has rarely been studied to date. Döbert, et al. [17] found no significant effect of aboveground biomass of native plant species <2 m in height on Dipterocarp regeneration. However, the filtering effect of ground vegetation on tree seedlings is generally well documented [33].
During our study, it became clear that ground cover is negatively correlated with canopy cover (Figure 3). Therefore, a prerequisite for a deterministic distribution of tree regeneration is fulfilled [1]. The effect of ground cover on regeneration density should not be assessed without taking canopy cover into account simultaneously. However, a model of seedling density incorporating the interaction between canopy cover and ground cover was not found to be superior to a model without this interaction (Table 4). This phenomenon of non-significant interactions between canopy cover and ground vegetation cover on seedling density could be explained by the inappropriate spatial resolution of the covariate data, masking closer relationships. At the same time, it is likely that recent ground cover is related to earlier canopy cover, as plants need time to adapt to the constantly evolving environmental conditions in forests. Thus, the correlation will always be imperfect.

Canopy Cover
Canopy cover and its effect on the regeneration of Dipterocarpaceae in humid tropical forests have been described previously [18,62]. Canopy cover strongly influences light distribution in forests [23] and is also related to belowground resource availability [63]. Moreover, the quantity (solar radiation) and quality (wavebands) of the light that influences the growth of, and competition between, tree species in the forest are determined by the structure of the canopy [25]. Previous studies have shown that the regeneration of woody species is closely related to canopy cover in temperate regions [64] and in old-growth and secondary forests in tropical regions [23,26]. This relationship has also been demonstrated in gap research. Dipterocarp species exhibit their highest growth rates in the gap centre [65,66], where abundance is also highest [65]. Studies have shown that pioneer (light-demanding) trees can grow and rapidly occupy larger gaps (higher radiation) [67,68], while climax (shade-tolerant) species exhibit better competitive ability in small gaps (low radiation) [9,69-71].
In our study, regeneration density follows canopy density in the form of an optimum relationship. V. odorata regeneration occurs from 40% to almost 100% canopy cover, with the highest seedling density predicted at 65-70% canopy cover (Figure 5C). Similarly, the optimum threshold for the establishment and growth of seedlings is about 40-60% canopy cover for some temperate species [64]. However, the confidence bands are very broad below 70% canopy cover (Figure 5C), so this feature should be interpreted with caution.
At 75% canopy cover, the ground cover in our study was less than 35% of that found under more open conditions (Figure 3). The density of seedlings, in turn, was highest at a ground cover of less than 35% ( Figure 5B). Thus, canopy cover influences regeneration density both directly ( Figure 5C) and indirectly ( Figure 5B) by reducing ground cover ( Figure 3). These two effects cannot be disentangled on the basis of our data. However, the indirect effect mentioned corresponds to the indirect facilitation model [72]. This model assumes that the direct negative effect of adult trees on regeneration of fast-growing competitors produces an indirect positive effect on tree seedlings, which cancels out the direct negative effect of adult trees on seedlings [25,73]. The effect of the adult trees on the competing vegetation is therefore stronger than the same effect on the tree seedlings. Due to this stronger effect on the competitive vegetation by the old trees, the tree seedlings gain relative competitive strength. Due to the direct effect and the indirect facilitation, the canopy cover becomes an environmental factor of exceptional importance for the regeneration process. Silvicultural considerations should start here when it comes to creating conditions favourable for regeneration, e.g., during stand restoration.
In addition, canopy gaps impact plant regeneration by changing the forest microenvironment, litter layer, and other factors [25,73]. Seedling recruitment and mortality are influenced not only by direct but also by interacting effects of canopy opening, leaf litter, and understory vegetation in tropical secondary forests [25,73]. At the seedling stage, decomposing litter can release nutrients or phytotoxic compounds [25,73] and alter soil physical and chemical characteristics [25,73], thus changing the microsite environment. Brearley, et al. [25,73] showed that litter addition increased the biomass (10-60%) and leaf area (25-55%) of seedlings of three Dipterocarp species (Hopea nervosa, Parashorea tomentella, and Dryobalanops lanceolata) in tropical forests. In our study, the model took into account canopy cover and ground vegetation as covariates, both of which had a striking effect on the recruitment of the species. However, leaf litter thickness and its interaction with these variables were not considered. Therefore, future studies involving regeneration modelling in V. odorata should consider the leaf litter to arrive at a better understanding of recruitment.

Conclusions
As the results of the statistical model in Section 3 showed, the density of V. odorata seedlings and saplings varied under the influence of different environmental variables and based on the distance to the nearest conspecific adult tree. The effects of all three of the factors considered were quantifiable and tested using a sound statistical framework. We utilised standard software, namely the R-package spatstat, to analyse the data. The method employed would seem, therefore, suitable to the aim of disentangling important determinants of the spatial patterns of trees in tropical forests.
Apart from the methodological aspects and the gain in ecological understanding, we also see potential for the application of the approach in restoration efforts. If we want to apply a strategy of promoting natural regeneration through the targeted creation of favourable environmental conditions, we need information on the regeneration ecology of the tree species of concern. However, restoration efforts in tropical forests are often hampered by a lack of knowledge of the species' natural regeneration ecology. The high species diversity of these forests remains a challenge for effective research on the regeneration ecology of individual species. This method has been shown to be appropriate for improving our knowledge of species' natural regeneration ecology.

Acknowledgments: Firstly, we would like to thank the German Federal Ministry of Food and Agriculture for funding our project "Economic revaluation of degraded tropical secondary forests through natural regeneration in Vietnam" (OekAuNat, Funding code: 28I-032-01). We sincerely thank the three reviewers for their valuable comments. During the writing process, Dietrich Stoyan gave constructive comments on an earlier version of the manuscript. In addition, we would like to thank the Administration of Cuc Phuong National Park for the support during our inventory in the field.

Conflicts of Interest:
The authors declare no conflict of interest.

Figure A2. Estimated intensity of regeneration as a function of the canopy cover. The solid black line represents the estimated function (ρ(z)), the red dotted line represents the mean, and pointwise 95% confidence intervals for the true value are plotted as grey shading.

Figure A3. Estimated intensity of regeneration as a function of the ground cover. The solid black line represents the estimated function (ρ(z)), the red dotted line represents the mean, and pointwise 95% confidence intervals for the true value are plotted as grey shading.

Figure A4. The plots display the residuals from the fitted model. As the model estimates a point density, the residuals are also densities. These plots can be used to assess goodness-of-fit, to identify outliers in the data, and to reveal departures from the fitted model. For each one of the 12 plots (compare to Figure A1, Appendix A), there is a four-panel graph. The colour ribbon at the bottom of the figure applies to the top left and the bottom right panels, i.e., to the residual maps. Please note that "Green" represents density deviations (residuals) close to zero, i.e., from −0.005 to about 0.005, while "Blue" represents larger negative deviations, and "Yellow" larger positive ones. The individual panels show in detail: Top left: the residual measure A. This displays circles centred at the points of the data pattern with radii proportional to the (positive) residuals, while the negative part of the residual measure, which is a density, is plotted as a colour image. Bottom right: the residual measure B. Each data or dummy point is taken to have a 'mass' equal to its residual. This point mass is then replaced by a bivariate isotropic Gaussian density with standard deviation sigma. The value (which can be positive or negative) of the smoothed residual field at any point in the window is the sum of these weighted densities. The field is plotted as an image and contour plot. Bottom left: the residual measure C. This is a plot of the cumulative sum of the residuals for all points which have an X coordinate less than or equal to x. The confidence limits are also shown. Top right: the residual measure C. This is a plot of the cumulative sum of the residuals for all points which have a Y coordinate less than or equal to y. The confidence limits are also shown.

Appendix B. Computation of Confidence Intervals for Effect Functions
As described in Section 2.4, we estimated the parameters of the point pattern model using the function mppm in the R package 'spatstat' (Version 3.6.3):

    library(spatstat)
    # Polynomial covariate effects plus a random intercept per plot;
    # H is the data argument passed to mppm.
    fit <- mppm(Reg ~ 1 + Cover + I(Cover^2) + I(Cover^3) +
                  Ground + I(Ground^2) +
                  Dist + I(Dist^2) + I(Dist^3),
                H, random = ~1|Plot, rbord = 3.0)

To compute 95% confidence intervals for values of the effect function, shown as confidence bands in Figures 1 and 2, we used the estimated covariance matrix of coefficient estimates from the auxiliary lme model returned in the $Fit$FIT component of the mppm output as follows:

    ci_lme <- function(model, newdata, level = 0.95) {
      # Design matrix of the fixed effects (response dropped from the formula)
      X <- model.matrix(formula(model)[-2], newdata)
      # Point estimate of the linear predictor
      est <- X %*% fixef(model)
      # Half-width: normal quantile times the standard error sqrt(diag(X V X'))
      len <- qnorm(.5 * (1 + level)) *
        sqrt(rowSums(X %*% vcov(model$Fit$FIT) * X))
      data.frame(fit = est, lwr = est - len, upr = est + len)
    }

Readers might question this procedure for a number of reasons: (1) The computation is based on the auxiliary linear mixed effects model instead of directly on the point pattern model; (2) The ratio of the two variances (plot random effect and residual error) is estimated first and then considered fixed, cf. Section 2.3 in Pinheiro and Bates [73]. This means that the uncertainty in this ratio is neglected; (3) The quantiles are taken from a standard normal distribution, essentially a t-distribution with infinitely many degrees of freedom. This might not be appropriate.
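The standard-error computation inside ci_lme, and the third concern about normal versus t quantiles, can be checked on a toy example. The sketch below is an illustration only: it uses an ordinary linear model fitted to simulated data (all names and numbers are hypothetical) rather than the mppm fit, so it says nothing about the first two concerns. It shows that rowSums((X %*% V) * X) gives the diagonal of X V X', which agrees with the standard errors returned by predict for a linear model, and that a t quantile yields somewhat wider intervals than the normal quantile.

```r
# Toy data: simulated response with a quadratic distance effect (hypothetical).
set.seed(42)
d <- data.frame(Dist = runif(60, 0, 35))
d$y <- 1 + 0.15 * d$Dist - 0.01 * d$Dist^2 + rnorm(60, sd = 0.3)
toy <- lm(y ~ Dist + I(Dist^2), data = d)

# Design matrix at new covariate values, built as in ci_lme.
newdata <- data.frame(Dist = seq(0, 35, by = 5))
X <- model.matrix(formula(toy)[-2], newdata)

# Point estimate and standard error of the linear predictor.
est <- drop(X %*% coef(toy))
se  <- sqrt(rowSums((X %*% vcov(toy)) * X))  # diag(X V X') without forming X V X'

# Interval half-widths with a normal and with a t quantile.
z_q <- qnorm(0.975)
t_q <- qt(0.975, df = df.residual(toy))
ci_normal <- data.frame(fit = est, lwr = est - z_q * se, upr = est + z_q * se)
ci_t      <- data.frame(fit = est, lwr = est - t_q * se, upr = est + t_q * se)

# Relative widening of the interval when the t quantile is used.
t_q / z_q
```

With 57 residual degrees of freedom the difference between the two quantiles is small. In a mixed model with only 12 plots the appropriate degrees of freedom could be much lower, which is why the choice of quantile is listed as a possible concern.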