Brown Bear Food-Probability Models in West-European Russia: On the Way to the Real Resource Selection Function

: Most habitat suitability models


Introduction
Habitat suitability models (HSMs) and the distribution maps built on them are necessary for the conservation and management of animal populations [1][2][3][4]. In almost all cases, habitat selection is based on the species' choice of certain resources needed for their vital activities [5,6]. According to Boyce and McDonald [5], if we know the distribution of resources on which the organisms depend, their distribution and abundance can often be characterised by resource selection functions (RSFs). In general form, RSF is a way of detecting and measuring the degree to which a resource is preferred (used) or avoided (4) identifying the richest areas in terms of brown bear food resources in two diverse landscape scenarios, i.e., natural and human-modified landscapes.

Study Area
The Central Forest State Nature Biosphere Reserve (hereinafter CFNR) is located in the west of European Russia at the southwestern edge of the Valdai Upland on the Main Caspian-Baltic watershed of the Russian Plain (Tver Region, Russia; 56 • 25 59 -56 • 31 01 N and 32 • 28 59 -33 • 01 01 E; Figure 1). The CFNR territory is characterised by a temperate moderately continental climate with relatively cold winters and warm summers. More than half of the precipitation falls as rain in the summer-autumn period; the remaining amount falls as snow during winter-spring [37]. The mean air temperature of the study area is +4.21 • C (+16 • C in July and −10 • C in January). The mean annual precipitation is 730.9 mm. This territory represents a ridge-hilly elevated plain with absolute elevation of 217-316 m a.s.l. The following geomorphological structures are present here: moraine ridges of Moscow and Valdai age (the latter with well pronounced kames) with elevation of 248-316 m a.s.l., and lake basins with elevation of 217-243 m a.s.l. The CFNR includes a core area (244.15 km 2 ) and a buffer zone (460.61 km 2 ). The terrain of CFNR core area is flat (inclination < 1 • ) consisting of plains. The drained parts of the moraine ridges are characterised by southern taiga broadleaved-spruce forests and their derivatives. The lake basins and fluvioglacial plains are characterised by boreal spruce forests. The raised bogs are developed on the gentle slopes of moraine ridges and in separate depressions of lake plains. On Sphagnum-dominated raised bogs and their edges, there are plant communities with the participation of Scots pine Pinus sylvestris L. [38]. In general, CFNR core area forests have not been heavily impacted by humans, and they are considered as mostly intact forest areas. At present, windfalls are the only natural drivers of vegetation dynamics. The CFNR forest flora is mainly composed of boreal species widespread in the taiga zone with a dominance of Picea abies L. (hereinafter-spruce) and nemoral species such as The buffer zone of the CFNR includes predominantly human-modified landscapes, where active agricultural and forest logging activities occurred until the 1990s. At present, in the CFNR buffer zone, as a result of forest cutting there are many clearcuts of various ages, from young clearcuts without trees to old-growth mixed forests. Nine settlements remain here with a total population of 130-150 people. Only one of them, Zapovedny village, is located in the immediate vicinity of the CFNR core area border. Current meadow plant communities are associated with moraine ridges and represent the cultural landscapes (pastures and hayfields) abandoned before the 1990s and currently overgrown with forest vegetation (Figure 1).
The CFNR forest flora is mainly composed of boreal species widespread in the taiga zone with a dominance of Picea abies L. (hereinafter-spruce) and nemoral species such as Tilia cordata Mill., Corylus avellana L., Ulmus glabra Huds., and U. laevis Pall. Populus tremula L., Betula spp., Alnus incana L., and A. glutinosa L. usually predominate in the disturbed areas after windfalls or forest logging. A special place belongs to Sphagnum-dominated raised bogs and Sphagnum forests with Pinus sylvestris. The CFNR fauna is of European origin, but it includes striking Siberian representatives, the European distributions of which are usually related to boreal landscapes [38]. The common species among large carnivores are brown bear, gray wolf Canis lupus L., and Eurasian lynx Lynx lynx L. Among ungulates, there is a high density of moose Alces alces L. and moderate densities of roe deer Capreolus capreolus L. and wild boar Sus scrofa L.

Food-Probability Models
All key steps for model building are presented according to the general recommendations for SDM studies [4,12]. The model objectives were aimed at studying the relationships between brown bear food resources and their environment (explanation), and mapping their spatial distribution. We describe our models following the Overview, Data, Model, Assessment, Prediction (ODMAP) protocol (ver. 1.0) for SDM [4] with the use of the range model metadata standards (RMMS) dictionary [39]. Here, we provide an overview and the main settings of the distribution models while the remaining ODMAP sections and data processing are detailed in Table S1. The general view of the modelling workflow with the seven main steps is shown in Figure 2. All data processing, analysis, and modelling were done using RStudio 1.1.447 software based on R 4.0.3 [40].
The main food resources of the brown bear have been determined on the basis of a previously conducted diet analysis [41]. We built 14 spatial distribution models for 13 brown bear food resources: Angelica sylvestris L., Aegopodium podagraria L., and Chaerophyllum aromaticum L. individually and all these Apiaceae forbs together, aspen Populus tremula, bilberry Vaccinium myrtillus L., cranberry (Vaccinium microcarpum (Turcz. ex Rupr.) Schmalh. and Vaccinium oxycoccos L., hereinafter Oxycoccus spp.), hazel Corylus avellana, rowan Sorbus aucuparia L., domestic apple Malus domestica Borkh., anthills, social wasps, xylobiont insects, and moose Alces alces. The importance of the main herbaceous plants in the brown bear diet differs within a season (S.S.O., unpublished data), i.e., A. podagraria is important in spring, A. sylvestris in spring and summer, and C. aromaticum in summer. For this reason, we built a general model for all three Apiaceae species and separate models for each of them. We divided all ants into two categories, according to their nest construction: (i) species that build nests with aboveground structure, such as soil nest mounds or anthills from plant remains (anthills); and (ii) species that build nests in dead wood: snags, logs, or stumps (coarse woody debris, according to Frank et al. [42]). The latter category also included the larvae of other xylobionts (mostly beetles), so we combined all insects from coarse woody debris into one category-xylobiont insects. All wasps (yellowjackets) were assigned to the general group of social wasps. On the basis of data from camera traps, brown bears mainly hunted moose in the CFNR during spring and early summer (S.S.O., unpublished data). For this reason, we built a distribution model for moose only for the period of April-June. The main food resources of the brown bear have been determined on the basis of a previously conducted diet analysis [41]. We built 14 spatial distribution models for 13 brown bear food resources: Angelica sylvestris L., Aegopodium podagraria L., and Chaerophyllum aromaticum L. individually and all these Apiaceae forbs together, aspen Populus tremula, bilberry Vaccinium myrtillus L., cranberry (Vaccinium microcarpum (Turcz. ex Rupr.) Schmalh. and Vaccinium oxycoccos L., hereinafter Oxycoccus spp.), hazel Corylus avellana, rowan Sorbus aucuparia L., domestic apple Malus domestica Borkh., anthills, social wasps, xylobiont insects, and moose Alces alces. The importance of the main herbaceous plants in the brown bear diet differs within a season (S.S.O., unpublished data), i.e., A. podagraria is important in spring, A. sylvestris in spring and summer, and C. aromaticum in summer. For this reason, we built a general model for all three Apiaceae species and separate models for each of them. We divided all ants into two categories, according to their nest construction: i) species that build nests with aboveground structure, such as soil nest mounds or anthills from plant remains (anthills); and ii) species that build nests in dead wood: snags, logs, or stumps (coarse woody debris, according to Frank et al. [42]). The latter category also included the larvae of other xylobionts (mostly beetles), so we combined all insects from coarse woody debris into one category-xylobiont insects. All wasps (yellowjackets) were assigned to the general group of social wasps. On the basis of data from camera traps, brown bears mainly hunted moose in the CFNR during spring and early summer (S.S.O., unpublished data). For this reason, we built a distribution model for moose only for the period of April-June.

Presence Records of Brown Bear Food Resources
The collection of data on the distribution of brown bear food resources was carried out along hiking routes through the whole study area during March-November of 2008-2020 ( Figure A1). We adhered to the general principles for mapping food resources [1,11,36]. In all cases, we recorded the presence of intact (undamaged by brown bears) focus plants and insect nests. Exceptions were the nests of social wasps, dead wood ants, and xylobiont insects, which are difficult to detect if they are not damaged.
Among plants, we recorded only those suitable as brown bear food. For instance, this included all vegetative shoots of herbaceous plants and all dwarf shrubs, regardless of their fruiting. For hazel, this included adult fruiting individuals, and young individuals for aspen and rowan. For each plant species, we recorded its abundance in rectangular "5 m × 5 m" plots using the Braun-Blanquet scale following Shores et al. [36]. In the models, we included as presence records only those locations where the percent cover of a plant was more than 25% (3-5 scores according to the Braun-Blanquet scale). The collection of data on plant distribution was performed using the specially developed form in the ArcGIS QuickCapture (Esri Inc., Redlands, CA, USA) mobile application. We used a specified minimum georeference accuracy of 4.6 m to record the locations of the vegetation plots at their centre. The presence of insects was recorded at each finding of their nests. Moose occurrence was estimated by recording hoof prints, scats, eaten plants, bed sites, and visual observations from April to June, when brown bears intensively prey on adult moose individuals and calves [28,43]. The collection of data on animal distribution was performed using the specially developed form in the ArcGIS Survey123 (Esri Inc., Redlands, CA, USA) mobile application with a given minimum georeference accuracy of 4.6 m (Table S1). To reduce the effects of spatial sampling bias, we thinned all food resource records [44][45][46]. As a measure of clustering, the average nearest neighbour index (NNI) was calculated for both plants and animals using the spatialEco R package [47]. Index values less than 1 exhibit clustering, while values greater than 1 exhibit dispersion or competition [47] ( Table A1). The spatial thinning of points was applied with the spThin R package [45]. Thinned points, satisfying a dispersed distribution (NNI > 1), were used as training points for model building.
It is recognised that the filtering (thinning) of species records allows for solving the problem of clustering caused by sampling bias but does not address the issue of spatial autocorrelation (SAC) [48]. Spatial autocorrelation can be observed not only in the model residuals, but also in the input data, i.e., presence records and predictor variables [48][49][50].
Since the calculation of residuals in presence-background models is still controversial, we can only reduce SAC in input datasets. Spatial autocorrelation must be avoided between training and testing data, in order to guarantee independence between them [12], i.e., spatially separated training and testing datasets should be used [48,50]. To account for spatial autocorrelation, we used block cross-validation [49] with a spatial blocking strategy with a random pattern and 100 iterations [50], where the block size was determined from the median of the spatial autocorrelation range among all predictor variables (664 m for plants, 663 m for insects, and 1590 m for mammals) ( Figure A2). Ten sets of spatially independent training and test records were performed using the blockCV R package [50].

Environmental Variables
Climatic, edaphic (soil), and topographical (terrain) variables are most commonly used to model the spatial distribution of plants [11,25,27,33,44,51], while topographical characteristics are more often used for ants [42,52]. In our case, the size of the study area does not allow us to use climatic variables, and edaphic variables are not available at high resolution for the CFNR. We used data from multispectral satellite imagery (vegetation indices, VI), a digital elevation model (DEM), proximity (distance) rasters, landcover types, and forest canopy cover. All of these variables are most often used in modelling food resource distributions at high resolution (30 m) [10,27,35,36].
For the multispectral satellite imagery data, we used nine Landsat 8 OLI-TIRS satellite images [53] (http://www.earthexplorer.usgs.gov; accessed on 10 March 2021) with cloudless coverage of the study area for the period April 2014-October 2020. Whenever possible, two scenes for different years were selected for each month of the vegetative season (Table S1). Before calculating the vegetative indices, a radiometric correction was performed using radiometric coefficients according to the accepted formula [53]. For the first category of predictor variables, we calculated five vegetation indices, including (1) the enhanced vegetation index (EVI) [54], which was developed to correct the vegetation signal through a decoupling of the canopy background signal and a reduction in atmosphere influences. It provides increased sensitivity in areas with dense vegetation and high biomass, while minimising the influence of soil and atmosphere [55]. The EVI is similar to the normalised difference VI (NDVI) [56] and can be used to assess the greenness of vegetation. The other calculated vegetation indices were: (2) the green NDVI (GNDVI), a modification of the NDVI used to assess photosynthetic activity and water consumption by plants [57]; (3) the normalised difference moisture index (NDMI), which measures the moisture content of vegetation [58]; (4) the green chlorophyll VI (GCVI), an estimation of the chlorophyll content in the leaves of various plant species [59]; and (5) the atmospherically resistant VI (ARVI), the first VI that is resistant to atmospheric influences and which also reflects differences in phytomass [60]. For all indices, mean values were calculated on the basis of all Landsat scenes. Additionally, orthogonal vegetation indices were calculated by tasseled cap transformation [61] with corrections for the OLI sensor of the Landsat 8 satellite [62]. Since there were no exposed rocks or large expanses of bare soil in the study area, we did not use brightness. We also did not use greenness, which was highly correlated with EVI. Thus, we used only the third component, wetness, which is a measured value for interactions of soil moisture, water, and other moisture features [62]. It was also calculated as a mean over all Landsat scenes. Calculations of all VI were performed in ArcMap 10.6.1 (Esri Inc., Redlands, CA, USA).
The second category of predictor variables included terrain characteristics calculated from the global DEM (SRTM 1 Arc-Second Global; http://www.earthexplorer.usgs.gov; accessed on 16 November 2018) with a spatial resolution of 30 m, i.e., elevation a.s.l., slope steepness, aspect (northness and eastness), hillshade, and the topographic ruggedness index (TRI) [63]. Before calculating these variables in ArcMap, the DEM layer was projected into UTM 36N according to the recommendations of Sillero and Barbosa [48]. We also calculated the incoming solar radiation (direct insolation) from the DEM. This is also an important variable for building distribution models of animal foraging plants [10,25,27,33,35]. Because soil moisture has been identified as an important predictor for the distribution of brown bear food resources [10,21,25], we calculated the compound topographic index (CTI) [64], also known as the topographic wetness index (TWI) [65], or soil wetness, which is highly correlated with soil moisture. Calculations of solar radiation and CTI were performed in SAGA GIS 7.7.1 [66].
As proximity rasters, distances to rivers were calculated in ArcMap on the basis of 1:500,000 topographic maps and a panchromatic band of Sentinel-2B satellite images. The landcover types included the following categories: boreal spruce forests, nemoral spruce forests, Sphagnum spruce-pine forests, deciduous mixed small-leaved forests, young forests and overgrown clearcuts, raised bogs, dry meadows, floodplain meadows in river valleys, and disturbed lands (villages and new clearcuts). We defined overgrown clearcuts as secondary forest stands older than 30 years [42]. These data were obtained from our landcover map, which was constructed on the basis of the results of visual classification of Landsat 8 scenes ( Figure 1) and performed utilizing the Image Classification tool in ArcMap GIS using the maximum likelihood classification method [67] (Table S1). The resulting map was then compared with ground descriptions and available landcover data from the Global Land Cover service (https://lcviewer.vito.be/; accessed on 5 April 2021). The overall classification accuracy was 88% and kappa was 0.85. For comparison, in similar RSF studies on brown bears, the assessment of the classification accuracy of landscape types ranged from 80 to 88% [1,13,21]. The forest canopy cover was used at a resolution of 30 m (product treecover 2010) [68] (data taken from the Global Land Analysis and Discovery-GLAD service; https://glad.umd.edu/; accessed on 26 April 2021).
All environmental variables were transformed to the same geographic extent and WGS84 coordinate system in the UTM 36N projection. A set of 24 preliminary environmental parameters was created at a resolution of 30 m (Table S1). This high resolution did not exceed the permissible error in recording the coordinates (4.6 m) of the presence records [12,48].
All environmental variables were tested for multicollinearity. At the pre-selection step, Spearman's rank correlation was performed and variables with r > |0.7| were excluded. This included GNDVI, ARVI, and GCVI. Additionally, a variance inflation factor (VIF) test with a threshold value of 10 was performed. The VIF test excluded the forest canopy cover variable for plant food-probability models and EVI for insect models. Because a large number of variables (>15) contribute to model overfitting [69], we also removed TRI and hillshade, as the least ecologically significant ones for the studied species in the CFNR. Floodplain meadow cover (for plant and mammal models) and CTI (for insect models) were deleted during the post-selection step after model fitting because of their low importance values. As a result, a set of 19 environmental variables was obtained, from which subsets of 15 independent environmental variables were compiled for plants, insects, and mammals (Table 1).

Model Tuning and Calibration (Training)
To build models of spatial distribution, the Maximum Entropy (MaxEnt) algorithm was used, as it is the most optimal for presence-only data [70][71][72][73]. For this we used the Maxent 3.4.1 software [74] freely available on the Internet (https://biodiversityinformatics. amnh.org/open_source/maxent/; accessed on 10 March 2021). Uneven collection of data across a study area inevitably leads to bias and skewed sampling [71], which can distort model realism [75,76]. To correct this error, we created a bias file based on the minimum convex polygon covering all record points [77]. Within this polygon, 10,000 random points were generated, which were indicated to MaxEnt as background samples for comparison [78]. In all cases, a maximum of 1000 iterations and a convergence threshold of 10 −5 were used.
Selection of the optimal settings for the model hyperparameters was performed using a genetic algorithm in the SDMtune R package [79]. Combinations of linear, quadratic, product, and hinge feature types were used, as they represent the most appropriate relationships of focal species with environmental variables [78]. The values of the regularization multiplier (RM) were chosen as optimal for data testing and to prevent model overfitting [71,79,80]. The choice of optimal settings was determined by running models in various combinations of the indicated feature types and the RM range from 0 to 8 with a step of 0.5. The final settings for each model are shown in Table 2. To validate such models, k-fold cross-validation was applied using the spatial block method and k = 10 [50]. The best combination of feature types and regularization multiplier was determined using the area under the receiver operating characteristic (ROC) curve (AUC) [81]. After model calibration, a variable reduction procedure was performed (variable post-selection) to remove the least important variables based on the testing true skill statistic (TSS) value [79]. Model tuning, building, and evaluation were performed using the SDMtune [82] and dismo [83] R packages (Table S1). Table 2. The input data, regularization multiplier (RM), and feature types (FT) for 14 brown bear foodprobability models in the Central Forest State Nature Reserve (West-European Russia) in 2008-2020. Feature types shown are linear (L), quadratic (Q), product (P), and hinge (H).

Model Evaluation (Testing) and Exploration
For the evaluation of model accuracy, we used different threshold-independent and threshold-dependent metrics [84]. The threshold-independent AUC was used as a measure of model discrimination power. Values of AUC range from 0 to 1, where 1 indicates the model's perfect ability to differentiate between presence and background points, and 0.5 indicates random-level discrimination [85]. The model with an AUC value > 0.7 suggests that it has good discrimination ability [76,86]. We used AUC test as the ability of the model to correctly differentiate a random presence point in the test data from a random point of the background [71,75,80]. To assess the overfitting of the model, AUC diff (the difference between AUC train and AUC test ) was calculated. A smaller difference between the values indicates a lower degree of overfitting [84,87]. Because AUC has been somewhat criticised and is not recommended for use as a baseline in assessing the accuracy of presence-only models [87,88], we also used TSS as the main metric. This statistic includes both sensitivity and specificity and can be expressed as sensitivity + specificity −1 [89]. Its values vary from −1 to 1 and follows grades similar to that of the kappa statistic: −1 to 0.40 = poor quality, 0.40 to 0.55 = fair quality, 0.55 to 0.70 = good quality, 0.70 to 0.85 = very good quality, 0.85 to 0.99 = excellent quality [90]. The TSS constitutes a threshold-dependent performance measure, and it was calculated using a TSS-maximization threshold [3,82]. We also utilised the continuous Boyce index (CBI), using the moving window of width W = 0.1 [8,85], to assess the presence probability maps [24]. This index evaluates how strongly the values predicted by the model differ from the randomly expected ones. Its values vary from −1 to 1, where 1 means that the distribution of the predicted values corresponds to the distribution of test values, 0 means the distribution does not differ from the random one, and −1 means that the opposite effect is observed [85]. Because the Boyce index is based on the Spearman rank correlation, we interpreted values > 0.7 (high correlation) as good quality of the model [85]. The CBI was calculated with the help of the ecospat R package [91].
We used multiple lines of independent evidence according to the "gold standard" from Araújo et al. [12]: spatial block cross-validation and held apart fully independent data. Independent test data were taken from the archive of the CFNR. These data were collected by rangers and researchers of the CFNR for the same period of time, independently of the main author. The number of independent test points for each model is shown in Table 2. For each food resource, ten models were tested, built on ten folds of training records according to the partitioning into spatial blocks. In the first case, average TSS, AUC train , AUC test, and AUC diff values were calculated on the basis of 10-fold block cross-validation. In the second case, the average probability map from the ten models was tested on the independent dataset by calculating the CBI.
To estimate variable importance, we ran a jackknife test removing one variable at a time and calculating the testing TSS to evaluate the final model with the help of the SDMtune R package [79,82]. We also plotted response curves for the most important variables for each food resource.

Mapping Spatial Distribution
The maps were constructed according to averaged predictions of 10-fold cross-validation models based on the cloglog output format, since we were interested in the presence probabilities of brown bear food resources, with predictions being interpreted as the probability of presence (0 = absence, 1 = presence) [72]. As a threshold for dividing predictions into binary classes (presence/absence of a species) when constructing binary maps, we used the maximum sum of sensitivity plus specificity (maxSSS) threshold value, which is considered to produce the best results for models based on presence-only data [24,92]. For estimating prevalence of each food resource in the CFNR core area and its buffer zone, we calculated area of its presence from a binary map for each territory type and multiplied it by a correction factor (1.32 for the reserve core and 0.68 for the buffer zone). The correction factor was estimated from the proportion of area of each territory type. To build a general map of food resource richness, we combined all binary maps and calculated the sum of the food resource categories for each pixel (Table S1). According to TSS values, low quality was noted only for models on xylobiont insects (TSS = 0.34) and social wasps (TSS = 0.39). Taking into account AUC test values, low quality was found for models on xylobiont insects (0.60), social wasps (0.66), rowan (0.64), and aspen (0.64) ( Table 3; Figure 3). At the same time, AUC diff had maximal values for the xylobiont insects (0.12) and Aegopodium podagraria (0.11) models, which also indicates model overfitting. The value of the continuous Boyce index was lowest for the rowan (CBI = 0.35) and xylobiont insects (CBI = 0.54) models. The most nonlinear shapes of CBI curves were demonstrated by the following models: XYLO (xylobiont insects), ANHI (anthills), SOWA (social wasps), and SOAU (rowan) (Figure 4).

Evaluation of Food-Probability Models
Models for brown bear diet-based RSF were selected on the following basis (in order of importance priority): values of TSS ≥ 0.40 (a quality of higher than fair), AUC test ≥ 0.70 (good quality), CBI ≥ 0.7 (high correlation and mostly smooth Boyce curves), and AUC diff ≤ 0.05 (not too overfitted models). Results of the model evaluation are shown in Table 3 and Figure 3. Based on obtained estimates, models on xylobiont insects (XYLO) and social wasps (SOWA) have the lowest predictive ability. In this regard, we cannot include maps of their predictions in the final RSF as reliable predictors of its food resources. POTR (aspen) and SOAU (rowan) models passed the check according to the TSS, but they showed low values of AUC test , CBI (for rowan), and AUC diff ; thus, they have only been used for building seasonal RSF (spring and autumn, respectively). Models for Aegopodium podagraria (AEPO) and bilberry (VAMY) were overfitted, but they had appropriate values for the other estimates. Therefore, we also retained them in accordance with the priority of importance of estimation criteria.    Models for brown bear diet-based RSF were selected on the following basis (in order of importance priority): values of TSS ≥ 0.40 (a quality of higher than fair), AUCtest ≥ 0.70 (good quality), CBI ≥ 0.7 (high correlation and mostly smooth Boyce curves), and AUCdiff ≤ 0.05 (not too overfitted models). Results of the model evaluation are shown in Table 3 and Figure 3. Based on obtained estimates, models on xylobiont insects (XYLO) and social wasps (SOWA) have the lowest predictive ability. In this regard, we cannot include maps of their predictions in the final RSF as reliable predictors of its food resources. POTR (aspen) and SOAU (rowan) models passed the check according to the TSS, but they showed low values of AUCtest, CBI (for rowan), and AUCdiff; thus, they have only been used for building seasonal RSF (spring and autumn, respectively). Models for Aegopodium podagraria (AEPO) and bilberry (VAMY) were overfitted, but they had appropriate values for the other estimates. Therefore, we also retained them in accordance with the priority of importance of estimation criteria.

Variable Importance
For the Apiaceae spp. model, the most important variables were EVI (TSS = 0.50), NDMI (TSS = 0.33), and DecFor (TSS = 0.31) (Table A2), with which a positive correlation was found ( Figure A3). The variable Eastness (TSS = 0.27) was also important; but at the same time, the presence probability of Apiaceae forbs was greatest on southeastern slopes and least on northern slopes ( Figure A3). All three Apiaceae species preferred areas with increased values of EVI and NDMI. For the Angelica sylvestris model, the most important variables were the same as for Apiaceae spp., with the addition of Northness (TSS = 0.27), which showed maximal probability of plant presence on southern slopes (Table A2; Figure A3). In addition, the strong positive correlation with the percent cover of mixed small-leaved forests was found only for the A. sylvestris model. For the Aegopodium podagraria model, besides EVI and NDMI, the most important variables were Elevation (TSS = 0.49) and Wetness (TSS = 0.46), while for Chaerophyllum aromaticum, DryMead (TSS

Variable Importance
For the Apiaceae spp. model, the most important variables were EVI (TSS = 0.50), NDMI (TSS = 0.33), and DecFor (TSS = 0.31) (Table A2), with which a positive correlation was found ( Figure A3). The variable Eastness (TSS = 0.27) was also important; but at the same time, the presence probability of Apiaceae forbs was greatest on southeastern slopes and least on northern slopes ( Figure A3). All three Apiaceae species preferred areas with increased values of EVI and NDMI. For the Angelica sylvestris model, the most important variables were the same as for Apiaceae spp., with the addition of Northness (TSS = 0.27), which showed maximal probability of plant presence on southern slopes (Table A2; Figure A3). In addition, the strong positive correlation with the percent cover of mixed small-leaved forests was found only for the A. sylvestris model. For the Aegopodium podagraria model, besides EVI and NDMI, the most important variables were Elevation (TSS = 0.49) and Wetness (TSS = 0.46), while for Chaerophyllum aromaticum, DryMead (TSS = 0.47) was also important. If the probability of A. podagraria presence decreased sharply in high humidity, this decrease was slightly more pronounced for C. aromaticum, which also manifests itself in a less pronounced relationship with an increase in NDMI ( Figure A3). At the same time, the probability of A. podagraria presence increased with elevation, while for C. aromaticum the probability of presence increased with increasing percent cover of dry meadows. For the aspen model, the predictors EVI (0.45), Elevation (0.38), NDMI (0.36), and Eastness (0.33) were the most important. With an increase in the values of EVI, Elevation, and NDMI, the probability of aspen presence also increased. In addition, this species also prefers eastern slopes ( Figure A3 (Table A2). Bilberry distribution is associated with low values of EVI and NDMI, increased humidity, and high percent cover of boreal spruce forests ( Figure A4 (Table A2; Figure A4). Only for Elevation did we find a sharp decrease with a consequent increase in presence probability. For the rowan model, the variables Elevation (TSS = 0.38), NDMI (TSS = 0.31), BorSpr (TSS = 0.31), and Wetness (TSS = 0.30) were the most important. Rowan preferred areas of elevated terrain, but this species was also able to inhabit wet locations. In addition, the model revealed the wide distribution of rowan in the study area, including the understory of boreal spruce forests. For domestic apple, the highest TSS values were found for the variables Wetness (TSS = 0.89), EVI (TSS = 0.88), DryMead (TSS = 0.79), and Eastness (TSS = 0.53). Domestic apple preferred dry meadows with a small amount of soil moisture, as its presence probability decreased with an increase in soil moisture and was associated with western and eastern slopes ( Figure A4 (Table A2). Response curves indicated the preference of ants for dry meadows with rich herbaceous vegetation, dry soils, and moderate (up to 60%) tree canopy cover ( Figure A5). According to the model, ants can inhabit sparse, well-illuminated forests and forest edges. In spring and early summer, moose preferred river floodplains with rich herbaceous vegetation and sparse mixed forests ( Figure A5).

Food-Probability Maps
Predicted food presence probability maps were generated for all reliable models ( Figure 5). According to the maps, we can assume that the vast majority of brown bear food resources were prevalent in the CFNR buffer zone, while only bilberry was much more widespread in the CFNR core area (Figure 6). Cranberry and rowan had more or less equal prevalence in both territories. By combining the predicted maps of food resource distribution, we obtained a general map reflecting the richness of brown bear food resources in the study area (Figures 7 and A6). Again, food richness was more relevant in the CFNR buffer zone than within its core area. Actually, the CFNR core area was characterised by twice as many areas without food resources (21%) as compared to the CFNR buffer zone (11.5%). In addition, in the CFNR core area there was a higher proportion of areas with one or two types of food resources (Table 4), while the CFNR buffer zone had considerably more areas with three to six different types of food resources. Finally, unique areas where all seven food resource types occur together were found only in the CFNR buffer zone.   resources. Finally, unique areas where all seven food resource types occur together were found only in the CFNR buffer zone.

Discussion
The abundance and distribution of main food resources determine seasonal and annual variations in their consumption, together with the spatial distribution, behaviour, demography [27,34,93], and habitat selection of the brown bear [10,26,94]. At the same time, food resource attractiveness is determined by not only its spatial distribution affecting the time required for its search and consumption, but also its energy value and local abundance [95]. As a result, spatial variations of food resources have a great impact on habitat suitability for the brown bear [96].
The use of direct food resource variables in brown bear habitat suitability modelling is largely based on a thorough study of its trophic ecology, including knowledge of diet, seasonal dynamics of food consumption, food resource energy values, and the characteristics of resource usage [10]. Such studies are necessary for a deeper understanding of the relationship between a species and the habitats producing its main food resources, and the peculiarities of the distribution of individuals across its territory [97][98][99].
According to the estimates obtained, most of our food-probability models predict the distribution of food resources over the study area quite well. The values of TSS ranged from 0.34 to 0.95. For example, forage plant distribution models for the brown bear in Alberta (Canada) were of inferior quality; their kappa estimates (which can be regarded as analogous to TSS) ranged from 0.09 to 0.48 [35]. At the same time, Penteriani et al. [33] obtained TSS estimates for brown bear food resources in the Cantabrian Mountains ranging from 0.52 to 0.75.
Spatial autocorrelation reduction and the thinning of presence records deserve special attention. The problem of SAC influence on the results of modelling using MaxEnt is still controversial [100][101][102][103]. Until now, the easiest way to solve this has been to avoid the spatial dependence of the training and test datasets [12,48,50]. By avoiding the problem of spatial sampling bias by rarefying the points, we considerably reduce the information available for training the model. Some of the studied species (cranberry, Chaerophyllum aromaticum, domestic apple, ant species with soil nest mounds) inhabit limited areas under specific conditions. For such plants and animals, the thinning of presence records reduced the realism of the obtained maps and was reflected in the highly nonlinear CBI curves.
The SDMs of herbaceous plants turned out to be quite reliable both for the whole group (TSS = 0.51) and for individual species in particular (TSS = 0.59-0.92). For example, model quality check for Hedysarum alpinum L. in Alberta (Canada) had a kappa value of 0.48 [35]. Models for Angelica sylvestris and Aegopodium podagraria had the lowest values of quality assessment among herbaceous plants. This is caused by the wide distribution of these species in the study area and their ability to inhabit both meadows and forests, including forest edges [104,105].
In general, the distribution of Apiaceae forbs was associated with areas having rich phytomass and moist conditions on southeastern slopes. The distribution of Angelica sylvestris was associated with the percent cover of mixed forests and southern slopes. The main habitats of this species were sparse forests (small forest edges and glades, open gaps in forests, forest roads, and narrow clearings) and the periphery of overgrown meadows, which are ecotone habitats. This is confirmed by its ability to renew itself with the help of renewal buds in case of damage [106], particularly that of being eaten by the brown bear, and its wide environmental valence in the study area [107].
The distribution of Aegopodium podagraria was similar to that of Angelica sylvestris, but it was also influenced by elevation and soil moisture. In contrast to A. sylvestris, based on our model, A. podagraria was to a greater extent confined to wet (but not waterlogged) soils, and to elevated terrain. This is confirmed by data on the ecological preferences of the species when A. podagraria is widespread throughout a study area, as its habitats are associated with moist tallgrass broadleaved-spruce and small-leaved forests [105]. According to our observations, it often preferred former farmlands and backyards in abandoned villages and their surroundings. The adaptability of A. podagraria to a wider range of habitats can be determined by its ability to adapt at the physiological level to various environmental conditions [108], although this species grows better in higher illumination conditions [107].
The distribution of Chaerophyllum aromaticum was associated to a greater extent with dry meadows than with forests. Unlike Aegopodium podagraria, C. aromaticum more strongly avoided waterlogged soils and was more associated with EVI, preferring areas in dry meadows rich in phytomass. This species was especially widespread within abandoned settlements, forming large thickets in former vegetable gardens with fertile soils. This is partly a consequence of the burrowing activity of wild boars Sus scrofa, which dig up the soil in dry meadows, feeding on the plant rhizomes. Thus, these animals create favourable conditions for C. aromaticum seeds, which, as early as the following year, can germinate at sites dug by wild boars. For this reason, C. aromaticum distribution was so strongly influenced by the percent cover of dry meadows. In the Smolensk Region, a short distance south of the study area, a plant community with a predominance of this species was recently described, namely the association Scirpo sylvatici-Alnetum incanae variant C. aromaticum [109]. It was noted that this plant community was formed in disturbed habitats in the vicinity of settlements on the richest and moistest loamy soils [109].
The aspen range covers all of Eurasia, from Iceland to the Russian Far East and from northern Scandinavia to the Mediterranean coast of Africa [110]. In forests of the taiga zone, aspen is a pioneer species of post-fire successions, and it is associated with many species of animals, bryophytes, and lichens in boreal forests [111]. Our model for aspen was not very reliable, as its AUC test did not reach the lower threshold value of reliability. Most likely, this was caused by the wide distribution of aspen in the study area and by limiting the sampling to only young trees suitable for brown bear consumption. Our model was able to reveal that young individuals of this tree species preferred small-leaved mixed forests, mainly on elevated areas and eastern slopes, with high EVI and NDMI values. This is generally consistent with the environmental requirements of aspen [107,[110][111][112].
Bilberry is a widely distributed species in many European regions associated with boreal forests [44,113,114]. In the study area, it is found in boreal spruce forests, Sphagnum spruce forests, and pine forests in areas surrounding raised bogs [105]. In this regard, there was a tendency towards wet areas with low values of EVI and NDMI, which was convincingly shown by our model, together with a close relationship with boreal spruce forests and Sphagnum spruce-pine forests. Bilberry distribution was also influenced by the sum of temperatures; however, it manifests itself mainly indirectly due to the predictors of elevation and forest type, as demonstrated in other studies, e.g., [114]. In our case, bilberry avoided deciduous forests, while it preferred boreal spruce forests and Sphagnum spruce-pine forests. In the boreal forests of Finland, bilberry abundance in pine forests was 1.13 times higher than that in spruce forests, due to different illumination conditions [114]. Our field observations also confirm this. In the study area, however, boreal spruce forests are distributed wider than Sphagnum pine forests. Therefore, the value of Sphagnum spruce-pine forests in our model was slightly lower (TSS = 0.22) than that of spruce forests (TSS = 0.33). Estimates of the quality of models for bilberry turned out to be quite high (AUC test = 0.75 and TSS = 0.50) and generally in agreement with those in other studies. For example, Penteriani and coauthors [33] found the following estimates for models of bilberry in the Cantabrian Mountains: AUC = 0.94 and TSS = 0.52. Nijland et al. [35] observed lower estimates for models of Vaccinium cespitosum Michx. (kappa = 0.32) and V. vitis-idaea L. (kappa = 0.38) in Alberta, Canada. Bilberry was the only food source whose distribution was associated more with the natural landscapes in the reserve core area than the human-modified buffer zone, which was also confirmed by our previous work [41]. This is explained by the active logging of boreal forests everywhere in the surrounding area in both the past and the present, and only because of the protected status of the reserve core area do the large expanses of intact boreal forests still persist to the present time.
Vaccinium microcarpum and V. oxycoccos are distributed in raised bogs. Both species inhabit raised bogs of the CFNR core area and buffer zone and Sphagnum pine forests. Thus, their distribution was associated with the percent cover of raised bogs and with depressions in the terrain. At the same time, the model revealed the avoidance of raised bog areas oversaturated with water, which is also consistent with environmental preferences of both species [115].
Hazel is the typical inhabitant in the understory of broadleaved and mixed forests in Europe [116]. According to the COAV model, in the study area, hazel was confined to mixed forests, where it grows in the understory. Such areas have higher EVI values. We noted that hazel was confined to elevated areas, but at the same time, this species can also exist under certain soil moisture levels. Estimates of model quality for hazel were at the good level (AUC test = 0.77 and TSS = 0.53). However, our estimates were lower than those in models for species with a similar food value, such as Fagus sylvatica L. (AUC = 0.97 and TSS = 0.75) and Castanea sativa Mill. (AUC = 0.89 and TSS = 0.54), for bears in the Cantabrian Mountains [33].
According to the SOAU model, the distribution of rowan was mainly determined by the presence of mixed and deciduous forests in areas of elevated terrain, which is consistent with the results of models in other studies, e.g., [51]. However, in our case, the models revealed that rowan (in a depressed state) can also grow in the understory of coniferous forests, including boreal ones, where this species successfully lives in moist soils. Although surviving in such conditions, rowan productivity in this area is very low, as there are practically no fruits. The model for this species has a borderline TSS value (0.40) and a low AUC test value (0.64), which is also associated with its wide distribution in the study area.
At the same time, Mellert with coauthors [51] reported an AUC test value of 0.68 for rowan, which is slightly greater than our estimate.
The distribution of domestic apple was related to past human activity in the study area. The CFNR buffer zone was densely populated 40 years ago. There were many villages and agricultural activities were carried out. In all the villages, people cultivated orchard apples, which had a great taste. Despite the fact that most of the villages were abandoned back in the 1980s, the remaining apple orchards still continue to actively produce fruit. Many trees have retained the good taste of their apples, which serve as a favourite food for bears in the study area.
Almost all the villages were located along moraine-kame ridges, which are related to the Valdai glaciation, and found primarily in the CFNR buffer zone. The study area has generally poor soils for agriculture. The cultivation of various crop plants was possible only on partially drained soils in elevated areas of moraine-kame ridges. In such places, villages existed for a long time. Around these villages, the forest was cut down, and arable lands, pastures, and hayfields appeared. Today, in these places there are dry meadows, which are actively overgrown with forest vegetation [38]. This was confirmed by our model, which indicated that domestic apple distribution was associated primarily with dry soils in dry meadows with rich herbaceous vegetation (and, as a consequence, high EVI values) on western and eastern (but not northern) slopes.
Our anthills model was very similar and showed that soil nest mounds were associated primarily with dry meadows with rich herbaceous vegetation (high EVI values), dry soils, and not high tree canopy cover. These landscapes were also associated with abandoned pastures and hayfields in the CFNR buffer zone. Field observations demonstrated that Lasius niger L. and Formica fusca L. often inhabit such places, as well as overgrown forest glades and forest roadsides, where Formica pratensis Retz., Formica exsecta Nyl., and Myrmica rubra L. live, and even along the edges of raised bogs (L. niger, M. rubra). It is known that dry soils and high illumination are among the most important factors for many ant species from this [52,117] and other groups [118]. This has been confirmed by other studies. For example, in Transylvania (Romania), nests of Lasius flavus F., L. niger, and Tetramorium caespitum L. were distributed in wood-pastures, i.e., meadows, with partial shrub and tree cover [117], as were anthills in the Slovakian Pol'ana Mountains [119]. However, in the study area, soil ants also often inhabit roadsides, small forest glades, and forest edges, which was also confirmed by our model. The model for moose had a relatively low TSS value (0.44), which was associated with the wide distribution of this species in the study area. It is a common species inhabiting almost all possible habitats in both the core area and buffer zone of the CFNR. Having no telemetry data, it is very difficult to reliably predict moose distribution even in a particular time period. Nevertheless, our model ALAL was able to reveal moose preferences in spring and early summer for river floodplains with rich phytomass and sparse mixed forests. After the winter and with the beginning of vegetation growth, moose actively consume young vegetation, including that found in floodplains (e.g., Equisetum spp., Filipendula ulmaria L., Sonchus spp.). In central Norway, moose selected good foraging habitats in spring, such as young forest stands and cultivated land, and older forests as shelter habitats. At the same time, reproducing females avoided open, food-rich areas in the first months after their calves were born, whereas males and females without young selected these areas in spring and summer [120].
In general, we found that the territory of the CFNR buffer zone contains a higher variety of brown bear food resources than the CFNR core area. According to the built models, the presence of up to seven food resources was predicted in the CFNR buffer zone. All main food resources apart from bilberry were more prevalent in the buffer zone. This is explained by a set of different factors. Terrain origin associated with the Valdai glaciation determined the nature of human land use and human settlement on elevated areas of moraine-kame ridges. Human agricultural and logging activities in these places modified the landscape and created a mosaic of heterogeneous landscapes (dry meadows overgrown with forest vegetation, secondary mixed forests, forest glades, and clearcuts) [38], which have high productivity for most food resources favoured by brown bears to the present time. Moreover, the distribution of many food resources, such as apples, hazelnuts, anthills, and many herbaceous plants, were associated mainly with these types of landscapes, which was demonstrated by our models.
On the basis of these results, the human-modified CFNR buffer zone seems to be more suitable for brown bears from a nutritional point of view in comparison to the natural landscapes of the CFNR core area. Strictly protected since 1931, spruce forests of the CFNR core area became overmature and have low nutritional value for brown bears. Only bilberry had greater prevalence in the core area, which was also confirmed by our previous work [41].

Conclusions
We believe that when building habitat suitability models for the brown bear it is necessary to use proximal resource variables instead of their environmental surrogates. With the help of additional field studies, it is possible to build food-probability models for further RSF with direct resources, or as we call them real RSFs. This can lead us not to assumptions, but to direct logical connections between a species and its resources and habitats. Combined maps of food resource distribution allowed us to conclude that the territory of the human-modified CFNR buffer zone contains a higher variety of food resources than the strictly protected CFNR core area, where there is a concentration of closed, intact spruce forests of boreal and nemoral structure. More research is thus needed to understand the real impact of protected areas on the conservation and management of brown bear populations (and, more generally, on large carnivore populations) compared to larger, less protected landscapes in human-modified regions. Actually, the typical large displacements of brown bears [121] increase the interactions that this species may have with areas characterised by diverse protection rules, a factor that needs to be addressed in conservation and management plans.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.
Appendix A Table A1. Check for spatial clustering with the average nearest neighbour index (NNI), its z-score, and p-level for 14 brown bear food-probability models in the Central Forest State Nature Reserve (West-European Russia) in 2008-2020.       Table 1 of the main text. The curves show the mean response of the ten cross-validation MaxEnt runs and the mean ± one standard deviation. EVI-enhanced vegetation index; NDMI-normalised difference moisture index; Wetness-tasseled cap wetness; Elevation-elevation (m a.s.l.); Northness-northness (aspect) from the digital elevation model; Eastness-eastness (aspect) from the digital elevation model; SolRad-solar radiation; BorSpr-proportion of boreal spruce forests (%); DecFor-proportion of deciduous forests (mixed small-leaved forests) (%); DryMead-proportion of dry meadows (%). Units of the variables are shown in Table 1 of the main text.  Table 1 of the main text.  Table 1 of the main text. Figure A5. Response curves for the five (insects) and six (mammals) most important variables of the brown bear animal food-probability models in the Central Forest State Nature Reserve and its buffer zone (European Russia) in 2008-2020. Designations: ANHI-anthills, ALAL-Alces alces. The curves show the mean response of the ten cross-validation MaxEnt runs and the mean ± one standard deviation. EVI-enhanced vegetation index; Wetness-tasseled cap wetness; Elevation-elevation (m a.s.l.); RivDist-distance to rivers; DecFor-proportion of deciduous forests (mixed small-leaved forests) (%); YoungFor-proportion of young forests and overgrown clearings (%); DryMead-proportion of dry meadows (%); TreeCov-forest canopy cover. Units of the variables are shown in Table 1 of the main text. Figure A5. Response curves for the five (insects) and six (mammals) most important variables of the brown bear animal food-probability models in the Central Forest State Nature Reserve and its buffer zone (European Russia) in 2008-2020. Designations: ANHI-anthills, ALAL-Alces alces. The curves show the mean response of the ten cross-validation MaxEnt runs and the mean ± one standard deviation. EVI-enhanced vegetation index; Wetness-tasseled cap wetness; Elevation-elevation (m a.s.l.); RivDist-distance to rivers; DecFor-proportion of deciduous forests (mixed small-leaved forests) (%); YoungFor-proportion of young forests and overgrown clearings (%); DryMead-proportion of dry meadows (%); TreeCov-forest canopy cover. Units of the variables are shown in Table 1 of the main text. Figure A6. The global map of brown bear food resource richness in the Central Forest State Nature Reserve (West-European Russia) in 2008-2020 showing individual food objects. The colours indicate different combination of food resources (one to seven) per pixel among the following items: Apiaceae forbs, aspen, bilberry, cranberry, hazel, rowan, domestic apple, anthills, and moose. The CFNR core area is inside the red line polygon; the CFNR buffer zone is outside the red line polygon. The colours indicate different combination of food resources (one to seven) per pixel among the following items: Apiaceae forbs, aspen, bilberry, cranberry, hazel, rowan, domestic apple, anthills, and moose. The CFNR core area is inside the red line polygon; the CFNR buffer zone is outside the red line polygon.