Natural and Human-Transformed Vegetation and Landscape Reflected by Modern Pollen Data in the Boreonemoral Zone of Northeastern Europe

Modern pollen composition obtained from waterbody surface sediment represents surrounding vegetation and landscape features. A lack of detailed information on modern pollen from Latvia potentially limits the strength of various pollen-based reconstructions (vegetation composition, climate, landscape, human impact) for this territory. The aim of this study is to compare how modern pollen from natural and human-made waterbodies reflects the actual vegetation composition and landscape characteristics. We analyzed modern pollen in surface sediment samples from 36 waterbodies in Latvia along oceanic-continental, lowland-upland, urban-rural and forested-agricultural gradients. In addition, we considered the dominant Quaternary sediment, soil type and land use around the studied waterbodies in buffer zones with widths of one and four km. Climate information for the last 30 years was obtained from the closest meteorological station for each study site. Data were analyzed using Pearson correlation and principal component analysis. Results show that relative pollen values from surface sediment of waterbodies reflect dominant vegetation type and land use. Modern forest biomass had a positive correlation with pollen accumulation rate, indicating the potential use of pollen-based forest biomass reconstructions for the boreonemoral zone after additional research and calibration.


Introduction
Pollen is one of the most abundant microfossils (sub-fossils) preserved in sediment archives, and its sedimentary assemblages are related to regional and local vegetation [1,2]. Whilst fossil pollen can be found in lake sediments extending back thousands of years, modern pollen surface samples represent the part of that record deposited in the last decades. Modern pollen samples from lake surface sediment reflect differences in vegetation in a similar way to moss polsters and pollen traps, and these sample types might be combined for vegetation or climate calibration purposes [3].
At a time when modeling approaches are expanding significantly, the production of raw data is less attractive to new and established researchers because it is time- and labor-consuming. Nevertheless, modeling of climate, environmental change, forest biomass or the distribution of biota, vegetation functionality and phylogenetic diversity requires input data and validation [4][5][6][7][8][9]. Although limited in spatial coverage and affected by uncertainties [10], proxy records are used in model-data comparisons and quantitative syntheses [11,12]. For example, climate models are commonly validated against proxy-based (e.g., pollen) reconstructions. These models should be able to reproduce past climate and vegetation change to be useful in future projections (e.g., [13]). To improve our ability to reconstruct environments and climate, modern proxy calibration studies along climatic and ecological gradients are needed. Modern pollen surface samples currently have a low spatial resolution in Latvia, i.e., 10 samples [14], which limits their potential for estimating current, past and possible future natural and anthropogenic processes.
The aim of this study is to compare how modern pollen from natural and human-made waterbodies in Latvia, northeastern Europe, reflects actual vegetation composition and landscape characteristics. For the first time in Latvia, modern pollen from the surface sediments of 36 waterbodies was analyzed and compared with local vegetation composition, biomass, dominant Quaternary sediment type and climate.

Study Sites
The study sites are situated at 55-58° N and 20-28° E in Latvia, northeastern Europe (Figure 1; Table 2), in the hemiboreal forest zone, which is characterized by a mixture of coniferous and deciduous tree species, such as Norway spruce (Picea abies), Scots pine (Pinus sylvestris), birch (Betula spp.), alder (Alnus glutinosa, Alnus incana), wych elm (Ulmus glabra), European ash (Fraxinus excelsior), small-leaved lime (Tilia cordata) and pedunculate oak (Quercus robur).
A combination of continental (Eurasia) and maritime (Atlantic Ocean) climate is typical for this area, where the east is more continental and the west more maritime (Figure 1; Table 3). Waterbodies for this study were selected to represent all four continentality index zones: weak (15 sites), moderate (seven sites), average (six sites) and strong (eight sites) (Figure 1). The average annual air temperature in Latvia is +6.8 °C (range 5.7 to 8.0 °C) for the climatic normal 1991-2020. The lowest mean monthly air temperature is observed in February, when it reaches −3.1 °C (range −1.1 °C to −5.1 °C), while the highest mean monthly air temperature, 17.8 °C, is observed in July (range 17.0 to 19.4 °C). The annual precipitation averages 679 mm (range 585 mm to 885 mm) according to data from the Latvian Environment, Geology and Meteorology Centre. The highest precipitation falls during the summer months (July and August) and in autumn, while the driest periods are winter and early spring.

Sediment Sampling
The topmost 3 cm of sediment was sampled from each waterbody using a gravity corer (© KC Denmark), through ice in winter 2021 and from a boat in winter 2020 and spring 2021. Samples were taken at the deepest point in the middle of each waterbody. Surface sediment samples were subsampled in the field, put in small plastic bags, transported to the University of Latvia, and stored at 4-6 °C.

Pollen Analysis
Pollen subsamples of known volume (1 cm³) were treated with 10% HCl, boiled in 10% KOH, and then acetolyzed for 3 min using the standard acetolysis procedure [18]. Prior to chemical treatment, tablets containing Lycopodium spores were added to the sediment samples in order to estimate the concentration of pollen per cm³ [19]. The prepared samples were stored in glycerine. A minimum of 500 terrestrial pollen grains was counted from each slide. Pollen identification was carried out to the lowest possible taxonomic level, with the help of the identification guide of [20] and modern pollen reference collections stored at the University of Latvia and Tallinn University of Technology. Additional pollen data were obtained from [14,15] (Table 2). Results from these publications are comparable because the pollen analyses for these sites and publications were done by Normunds Stivrins with the same preparation and identification procedure. The percentage of dry-land taxa was calculated using the sum of arboreal (AP) and non-arboreal (NAP) pollen (excluding sporomorphs of aquatic and wetland plants). Counts of spores were calculated as percentages of the total sum of terrestrial pollen. The pollen diagram was compiled using TILIA software [21]. Pollen accumulation rate (PAR) was estimated by multiplying the pollen concentration of each sample by the mean sediment accumulation rate in Latvian lakes (0.5284 cm/year).
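The concentration and PAR calculations described above can be illustrated with a minimal Python sketch. It assumes the commonly used marker-grain ratio (counted pollen divided by counted Lycopodium spores, scaled by the number of spores added and the sample volume); the function names, the example counts and the number of spores added are hypothetical and are not taken from this study. Only the sample volume (1 cm³) and the mean sediment accumulation rate (0.5284 cm/year) come from the text.

```python
def pollen_concentration(pollen_counted, marker_counted, marker_added, volume_cm3=1.0):
    """Pollen grains per cm^3, estimated from the marker-grain ratio.

    marker_added is the number of Lycopodium spores added to the sample;
    it depends on the tablet batch and is an input here, not a constant.
    """
    if marker_counted <= 0 or volume_cm3 <= 0:
        raise ValueError("marker count and volume must be positive")
    return pollen_counted / marker_counted * marker_added / volume_cm3


def pollen_accumulation_rate(concentration, sed_rate_cm_per_year=0.5284):
    """PAR in grains cm^-2 year^-1: concentration times sediment accumulation rate."""
    return concentration * sed_rate_cm_per_year


def percentages(counts):
    """Percentage of each taxon relative to the sum of all counts supplied."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("pollen sum must be positive")
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}


# Hypothetical counts for one sample (illustration only).
conc = pollen_concentration(pollen_counted=500, marker_counted=100, marker_added=10000)
par = pollen_accumulation_rate(conc)
shares = percentages({"Pinus": 250, "Betula": 150, "Poaceae": 100})
```

Because a single mean sedimentation rate is applied to every site, PAR in this sketch is simply a rescaled concentration; site-specific accumulation rates would change the values.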

Climate and Landscape
Climate data for each pollen surface sample site were derived from the nearest meteorological station. Time series of air temperature and precipitation were obtained from the Latvian Environment, Geology and Meteorology Centre and represent the climate normal reference period 1991-2020. For each sampling site, long-term observations from the nearest meteorological station were used to calculate the average annual air temperature, the average temperature in the winter and summer seasons, and the amount of precipitation annually and in the winter and summer seasons.
According to [17], the main climatic factors that reflect the degree of climate continentality, which increases eastward with distance from the sea in Latvia, are minimum winter air temperature, duration of snow cover and depth of soil freezing. Based on these climatic indicators, the authors of [17] defined four continentality indices, which are used in this study. We characterized land cover using Corine Land Cover 2018 data [22] with aggregated thematic land cover categories (forests, agricultural areas, urban areas). We estimated forest biomass in 1 km and 4 km buffer zones around waterbodies from the State Forest Register database [23] by summing the standing timber volume of all forest compartments inside the buffer zones. We used the Geological Map of Latvia [24] to characterize Quaternary sediments in the buffer zones.
Buffer zones were set to test how reliably pollen reflects the surrounding setting. There are various options and publications related to this topic, but following [4] we selected two buffer zones: 1 km and 4 km. We measured each buffer zone from the margin of the lake, following the shape of the waterbody as suggested by [25]. Although one buffer zone would have been sufficient, we estimated modern tree biomass around the sites in both 1 km and 4 km buffer (calibration) zones to test how reliably pollen can be used in tree biomass reconstructions (see [4]). All other landscape characteristics were therefore assessed in the same buffer zones.
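As an illustration of the station summaries described above, the following Python sketch derives annual and seasonal values from twelve long-term monthly values. The season definitions (winter as December-February, summer as June-August), the function name and the input values are assumptions made for this example; the text does not specify how seasons were delimited. Temperatures are averaged as simple means of monthly means, which ignores differences in month length.

```python
WINTER_MONTHS = (12, 1, 2)   # assumed season definition
SUMMER_MONTHS = (6, 7, 8)    # assumed season definition


def _mean(values):
    values = list(values)
    return sum(values) / len(values)


def summarize_station(monthly_temp_c, monthly_precip_mm):
    """Annual and seasonal summaries from dicts keyed by month number (1-12)."""
    months = range(1, 13)
    return {
        "t_annual": _mean(monthly_temp_c[m] for m in months),
        "t_winter": _mean(monthly_temp_c[m] for m in WINTER_MONTHS),
        "t_summer": _mean(monthly_temp_c[m] for m in SUMMER_MONTHS),
        "p_annual": sum(monthly_precip_mm[m] for m in months),
        "p_winter": sum(monthly_precip_mm[m] for m in WINTER_MONTHS),
        "p_summer": sum(monthly_precip_mm[m] for m in SUMMER_MONTHS),
    }


# Hypothetical monthly values, chosen only to make the arithmetic easy to follow.
example_temp = {m: float(m) for m in range(1, 13)}
example_precip = {m: 10.0 for m in range(1, 13)}
summary = summarize_station(example_temp, example_precip)
```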

Data Analyses
Relationships among the analyzed pollen taxa, sites and variables were explored using principal component analysis (PCA), which finds components accounting for as much as possible of the variance in the data [26,27]. The significance of components was tested using the broken stick method [28]. PCA was done in Past v.4.05 [29].
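The broken stick criterion can be sketched in a few lines of Python. Under this model, the expected share of variance for component k out of p is (1/p) multiplied by the sum of 1/i for i from k to p, and a component is retained when its observed share exceeds that expectation. This is a generic sketch and not the Past implementation; stopping at the first component that fails the test is an assumption of this example, and the eigenvalues are hypothetical.

```python
def broken_stick(p):
    """Expected proportions of variance for p components under the broken stick model."""
    return [sum(1.0 / i for i in range(k, p + 1)) / p for k in range(1, p + 1)]


def n_significant(eigenvalues):
    """Number of leading components whose variance share exceeds the expectation."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must sum to a positive value")
    observed = [ev / total for ev in ordered]
    expected = broken_stick(len(ordered))
    count = 0
    for obs, exp in zip(observed, expected):
        if obs <= exp:
            break  # assumed rule: stop at the first component that fails
        count += 1
    return count


# Hypothetical eigenvalues (illustration only).
retained = n_significant([5.0, 1.0, 0.5, 0.5])
```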
Cross-correlations between pollen results and CORINE land cover data (forests, agricultural and urban area; %), as well as between PAR and tree biomass (m³) around lakes within 1 km and 4 km, were calculated with Pearson correlation tests using the cor.test() function in R (version 4.0.5; [30]). Pearson's correlation was applied under the assumption that there is a linear relationship between pollen composition and the actual data. Lake size was taken into account under the assumption that it may indirectly influence the sediment pollen composition and thus the closeness of the relationship between the two variables. Study sites were divided into three groups: all, large (>50 ha) and small-medium lakes (<50 ha). Correlations were calculated to determine whether modern pollen data can be quantified and validated against current vegetation and tree biomass estimates, thus contributing to the reconstruction of long-term boreonemoral vegetation and biomass characteristics in northeastern Europe based on sub-fossil pollen data acquired from waterbodies of different sizes. Because of the non-homogeneous site sizes and specific soil types, alternative methods such as REVEALS modelling were not applied.
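The grouping by lake size can be illustrated with a short Python sketch. The analysis itself was run with cor.test() in R; the version below is a stand-in that computes only Pearson's r, without the p-value and confidence interval that cor.test() also reports. The field names (area_ha, forest_pct, tree_pollen_pct) and all site values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient for two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


def correlations_by_size(sites, x_key, y_key, threshold_ha=50.0):
    """Pearson's r for all sites, large (> threshold) and small-medium (< threshold) sites."""
    groups = {
        "all": sites,
        "large": [s for s in sites if s["area_ha"] > threshold_ha],
        "small_medium": [s for s in sites if s["area_ha"] < threshold_ha],
    }
    return {
        name: pearson_r([s[x_key] for s in members], [s[y_key] for s in members])
        for name, members in groups.items()
        if len(members) >= 3  # skip groups too small to be informative
    }


# Hypothetical sites (illustration only, not data from this study).
example_sites = [
    {"area_ha": 10, "forest_pct": 20, "tree_pollen_pct": 80},
    {"area_ha": 20, "forest_pct": 40, "tree_pollen_pct": 60},
    {"area_ha": 30, "forest_pct": 60, "tree_pollen_pct": 70},
    {"area_ha": 100, "forest_pct": 30, "tree_pollen_pct": 50},
    {"area_ha": 200, "forest_pct": 50, "tree_pollen_pct": 70},
    {"area_ha": 300, "forest_pct": 70, "tree_pollen_pct": 90},
]
result = correlations_by_size(example_sites, "forest_pct", "tree_pollen_pct")
```

With groups as small as some of those in this study, r is sensitive to individual sites, so a significance test would normally accompany each coefficient.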

Climate
The analyzed sites are not distributed linearly with respect to climate parameters, but certain links can be drawn. Coastal (oceanic) sites have higher average temperatures and precipitation (Figure 2), while lower average temperatures and precipitation are characteristic of inland (continental) sites. There are also differences in the distribution of summer and winter temperatures: the lowest summer and highest winter temperatures are typical of sites near the open Baltic Sea coast, because of the large thermal inertia of the sea and the flow of mild maritime air from the west. For sites with average and strong continentality and higher elevations, winter temperatures are considerably lower. The same factors affect the distribution of seasonal precipitation: summer precipitation is higher at inland sites and lower at coastal locations.

Quaternary Sediment and Soil
The majority of sites are dominated by glacigenic and glaciolacustrine Quaternary sediment (Figure 3). Glacigenic sediment consists of till, i.e., unsorted clay and clay-gravel material deposited directly by glacial ice. Glaciolacustrine sediment, by contrast, was deposited in lakes fed by glacial meltwater. Only one waterbody is located in the eolian sand and alluvial sand region. Quaternary sediment largely forms the parent material for topsoil: dystric cambisols (E1Pv) and stagnosols/planosols (Pgv) occur on glacigenic sediment, while luvisols/alisols (Pv) dominate on glaciolacustrine sediment (Figure 3).

Pollen
Altogether 81 pollen taxa were identified from 36 sites (Figure 4; Supplementary Material). The dominant tree taxa at the majority of sites were Pinus, Betula, Picea, Alnus, and Corylus. Higher Pinus values were recorded at lower altitudes, while a higher relative share of Betula was identified at sites at higher altitudes.
Pollen records of plants related to human activities and their influence on vegetation composition are variable along the altitudinal gradient. As ruderal species tend to be characteristic of disturbed environments, most often affected by human activities, an increase in their abundance in lake surface sediments is expected where the open landscape around the waterbody is wider. It can also be caused by the urban environment and the use of adjacent areas for agriculture. In this study, more than 25 different cultivated and ruderal plant species or genera were found in surface sediments. Chenopodiaceae, Artemisia and Urtica predominate among the pollen of ruderal plants. Pollen of cultivated plants was found in low numbers, predominantly Hordeum, Triticum and Brassicaceae pollen in waterbodies located near agricultural land. An increased amount of pollen from grasses and sedges can also be considered an effect of human activity, indicating the intensity of lake overgrowth. This evidence is more characteristic of sites with a throughflow hydrological regime, which collect pollen from wider catchment areas, for instance, Lake Ķikuru, Lake Durbes, Lake Liepājas and Lake Sesavas (Table 2; Figure 4). The composition of pollen in man-made reservoirs (ponds) is quite different, which is probably related to their management.
The broken stick method indicated four main principal components (PC), strongly weighted by Pinus (PC1), Betula (PC2), Picea (PC3) and Alnus (PC4) (Figure 5A). Although insignificant, PC5, weighted by Poaceae and human-related cultivated and ruderal plants, is also worth noting. PCA results show that sites in glaciolacustrine (clay, silt, and sand) areas are mostly forested, while sites on glacigenic (till) sediments display the opposite situation: open landscape and human-related agricultural pollen and vegetation (Figure 5B).
Results obtained in our study show a positive linear correlation between the actual forest cover around the sites (CORINE data) and tree pollen in surface sediment samples, which reflects the relative share (%) of forest in the landscape. When all studied sites are included, the correlation between forest cover (1 km buffer zone) and tree pollen (%) reached r: 0.39 (Figure 6A). The correlation was lower (r: 0.30) for the 4 km buffer zone. The highest correlations, up to r: 0.86 (4 km buffer zone) and r: 0.87 (1 km buffer zone), were obtained when including only large sites (Figure 6B). Tree biomass and tree PAR had a correlation of r: 0.33 for sites smaller than 50 ha (Figure 6C) and only r: 0.1 when all sites were considered (Figure 6D). A slightly higher correlation was found between agricultural land (%, CORINE) and agricultural pollen (cultivated plants, %), where r: 0.42 was reached when including all sites. A negative correlation (r: −0.27) was found between urban area (%, CORINE) and agricultural pollen (%). Only the highest correlations are shown in Figure 6.

Discussion
Pine and spruce pollen were more dominant at sites located at lower altitudes (Figure 4). Across sites, pine is closely associated with glaciolacustrine and eolian sediments, where forest cover is >40% at nearly all sites, as reflected by the PCA results (Figure 5). The opposite situation is found at sites dominated by herbs and cultivated plants, where forest cover is <40% and the dominant Quaternary sediment is glacigenic till. While modern pollen is commonly displayed along an altitudinal or temperature gradient [31], we found this was not meaningful at elevations as low as those in Latvia. Instead, Quaternary sediment and the resulting soil type are the main controlling factors for vegetation distribution and patterns. Historically, areas dominated by glaciolacustrine, glaciofluvial, marine, and eolian Quaternary sediment form acidic sandy podzol soils that are poor in nutrients and thus unattractive for farming.
Pine, spruce and birch are among the most common and also economically valuable tree species. Today pine (Pinus sylvestris) makes up 40%, spruce (Picea abies) 22% and birch (Betula pubescens, Betula pendula) 36% of all tree species in Latvia [32]. Considering the high value of ecosystem services, it is crucial to clarify, for instance, how the forest biomass of the main tree species has changed. This can be assessed using pollen-based tree biomass reconstructions [4]. Before proceeding with any reconstruction, pollen accumulation rate data need to be calibrated to modern tree biomass values. Here we provide the first calibration of modern pollen to modern tree biomass values for Latvia, and the results show relatively low correlations (r: 0.18 to r: 0.33) (Figure 6C,D). A possible reason for such low correlations is site selection. In their seminal paper on biomass reconstructions, Seppä et al. [4] indicated that only small-medium sized lakes characterized by predominantly constant long-term sedimentation rates and relatively simple floristic composition are the most suitable sites for tree biomass reconstructions. Our results show a higher correlation (r: 0.33) for the group of sites in which small to medium-sized waterbodies predominated. From these results it is possible to outline several steps to consider in tree biomass reconstructions. As the results show, not all sites are suitable, and therefore only selected sites should be included in the forest biomass training set.
Our results show that modern pollen assemblages from waterbody surface sediments reflect the actual landscape characteristics and overall vegetation cover. A fairly good correlation (r: 0.87) was found between forest cover and tree pollen percentages from large (>50 ha) lakes (Figure 6B). Based on this, we consider that pollen from large sites (>50 ha) better reflects the relative forest cover. Surprisingly, the results show a lower correlation between forest cover and tree pollen from small-medium sized waterbodies. A possible explanation for this finding is a larger portion of regional pollen. Models and observations based on empirical investigations confirm that small lakes have smaller pollen source areas than large lakes [33,34]. The Prentice-Sugita pollen dispersal and deposition model predicts that within the same landscape pollen percentages are highly variable in small ponds but uniform in larger lakes (lakes with diameter > 250 m) [1,35]. Considering that the relevant source area of pollen for lakes with a radius of up to 100 m has been shown to be <2000 m from the center of the lake [36], our small-medium sized waterbodies (radius 20-200 m) should, theoretically, show a pollen spectrum from within 1 to 4 km. Differences between buffer zones and correlation outcomes suggest that more calibration work is needed in the hemiboreal zone.
Human impact on the landscape and subsequently on vegetation is evident in the modern pollen data. For instance, we found that when the urban area around the waterbody is larger, the relative share of agricultural pollen is lower (Figure 6F). Although this correlation was weak and negative (r: −0.27), it still points to a general trend. For agricultural land around the waterbodies, the correlation with agricultural pollen was higher (r: 0.42) (Figure 6E). A lower correlation than expected can be linked to various aspects of land management. Theuerkauf et al. [37] studied the effects of changes in land management practices on pollen productivity in Germany and found that the decline in herb and grass pollen since the 1950s occurred not only due to a shift towards crops that emit comparatively small amounts of pollen but also due to earlier and more frequent mowing of grasses.
Modern pollen data are highly valuable in climate reconstructions and modeling. For instance, the most common climate reconstructions are sub-fossil pollen-based approaches that use modern pollen training sets for calibration purposes [14,38]. As Seppä et al. [31] noted, due to human impact on the landscape, not all sites can be used for pollen training sets. For instance, small-medium sized waterbodies (20-50 ha) that are >1.5 m deep, dominated by glacigenic and glaciolacustrine sediment, and have no significant human impact (e.g., large fields, forest management, housing areas) in the vicinity [38] are more suitable for pollen-climate calibration sets. According to these requirements, only seven waterbodies (Lake Bricu, Lake Čertoks, Lake Lielais Svētiņu, Lake Lielais Vipēdis, Lake Pinku, Lake Sesavas and Lake Gluhoje) from this study could be selected for climate reconstructions, although selection also depends on the chosen technique. Fortunately, climate reconstruction techniques are advancing, and most of these results and sites may be applicable for reconstructions [8,38].

Conclusions
(1) Large waterbodies reflect forest cover better than small-medium-sized waterbodies.
(2) Pollen accumulation rate can be used for forest biomass reconstructions after additional site selection and calibration work.