Factors Controlling Contemporary Suspended Sediment Yield in the Caucasus Region

: This paper discusses the joint impact of catchment complexity in topography, tectonics, climate, landuse patterns, and lithology on the suspended sediment yield (SSY, t km − 2 year − 1 ) in the Caucasus region using measurements from 244 gauging stations (GS). A Partial Least Square Regression (PLSR) was used to reveal the relationships between SSY and explanatory variables. Despite possible signiﬁcant uncertainties on the SSY values, analysis of this database indicates clear spatial patterns of SSY in the Caucasus. Most catchments in the Lesser Caucasia and Ciscaucasia are characterized by relatively low SSY values (<100–150 t km − 2 year − 1 ), the Greater Caucasus region generally have higher SSY values (more than 150–300 t km − 2 year − 1 ). Partial correlation analyses demonstrated that such proxies of topography as height above nearest drainage (HAND) and normalized steepness index (K sn ) tend to be among the most important ones. However, a PLSR analysis suggested that these variables’ inﬂuence is likely associated with peak ground acceleration (PGA). We also found a strong relationship between land cover types (e.g., barren areas and cropland) and SSY in different elevation zones. Nonetheless, adding more gauging stations into analyses and more reﬁned characterizations of the catchments may reveal additional trends.


Introduction
Water erosion is the primary mechanism of sediment transport from the catchment area and, at the global level, determines the flow of pollutants into permanent streams and reservoirs [1,2]. A particularly significant effect of water erosion on surface water pollution was noted in the Anthropocene period [3]. Together with sediments, many organic fertilizers, in particular phosphorus [4] and other nutrients, enter the streams, which cause eutrophication of water bodies [5,6]. The suspended sediment yield is an integral characteristic of the erosion processes in river basins, considering the stream and watershed connectivity, which varies significantly depending on the combination of natural and anthropogenic factors [7,8]. These variations are especially noticeable in the mountains and foothill zone, where the most pronounced relief and lithology differences. Moreover, human activity and vegetation cover vary considerably. Finally, a climate that controls the bedrock weathering rates and the surface runoff significantly contribute to the sediment yield and the pollutant flux [9].
The Caucasian region, which includes the mountain ranges of the Greater and Lesser Caucasus and the foothill zones, is a unique territory due to the high proportion of the arable land in the foothill/low mountain zone, high population density in the foothills, and a wide variety of lithological composition of the bedrock that composes the area. The Caucasus is also essential for developing medical and health resorts, hydropower energy, industrial production, mining, and transport infrastructure. Erosion intensity in various altitudinal zones of the Caucasus and their changes over time can reflect the climate change and anthropogenic impact fluctuations, including recent decades [10,11]. The increasing recurrence of events associated with the formation of extreme floods and powerful destructive debris flows requires detailed quantitative assessments of the characteristics of the sediment yields (SSY, t km −2 year −1 ) in various altitudinal belts. However, there are currently no robust quantitative estimates of the denudation rates or sediment yields for various altitudinal belts of the Caucasus [12].
Due to practical constraints, most studies aiming to quantify denudation rates and sediment yields mostly remain limited to case studies in this region [13,14]. Several studies have addressed this issue regionally and, if so, only focus on a limited set of larger river systems [15][16][17]. These studies fail to fully identify and constrain the role of the various factors that may control SSY in diverse regions such as the Caucasus Mountains. Developing a model that can simulate the catchment sediment yield may shed some light on how natural factors influence sediment fluxes and, by extent, denudation rates. Moreover, such a model may allow analyzing to what extent various factors better explain variations in observed SSY [18]. Previous studies indicate that the interaction between tectonics, lithology, soils, climate, and landscape responses should be considered [15,16,[18][19][20][21][22].
Several studies show that indices like river steepness, chi maps, and knickpoint locations better explain long-term denudation rates than proxies like average slope [23,24]. Such indices could be extracted using available tools [23,25,26]. Nevertheless, in highly erodible landscapes, catchment responses to tectonics may remain difficult to detect with topographic indices [27].
In the previous sediment yield models [22], lithology was only considered in a rudimentary way. This factor can be better constrained using global lithology and soil maps and building on earlier approaches [28,29]. Furthermore, earlier work indicated that the interaction between lithology and seismicity could strongly influence the erosion and denudation rates through the effect of rock fracturation [30][31][32][33].
Although earlier works have not demonstrated a relevant climatic impact on SSY [22,34], this factor is bound to be more significant on a global scale [35,36]. We aim to thoroughly explore this by analyzing the link between SSY and a wide range of potentially relevant climatic parameters and investigating relationships between SSY and other controlling factors separately for different climatic/ecological zones. This may help better identify interactions between climate, vegetation cover, and other factors and processes driving SSY. Likewise, the role of glaciers on natural SSY can be explored using available global datasets and based on earlier proposed empirical approaches [21,37].
Numerous studies have shown that humans can significantly impact SSY [3,18,38]. First, the construction of dams and reservoirs can lead to a drastic decrease in SSY. Secondly, humans can have had a significant impact on land cover, e.g., through deforestation and agriculture, leading to increased soil erosion rates and corresponding sediment yield values [39,40]. Recent studies point to the influence of population density [20] on sediment yield. It uses in some models as a human impact parameter [16,21].
The authors have already published an overview database for the Caucasus region with SSY-observations for more than 200 catchments [12]. These SSY values are subject to uncertainties resulting from temporal variations in SSY, measuring errors, and observation period length. Nevertheless, entries in that database cover an extensive range of land uses (i.e., from nearly natural to completely cultivated), catchment sizes, and climatic, soil, geomorphic, and tectonic conditions. The unprecedented number of catchments considered provide the first (to the author's knowledge) statistically robust answers on regional questions about SSY variability in the Caucasus region. More specifically, the research objectives were: to find which climates and environments cause the highest SSY values in the Caucasus region; -to evaluate what factors most control SSY in the Caucasus region.

Study Area
The Caucasus region is located between the Black Sea and the Caspian Sea. It includes the Greater and Lesser Caucasus, separating them from the Rion and Kura River lowlands and the Ciscaucasia (the North Caucasus), adjacent to the Greater Caucasus from the north (cf. Figure 1b). The Caucasus Mountains belong to the European-Asian (Alpine-Himalayan) mountain belt, which crosses the whole of Eurasia from west to east. The upper elevation ranges of the Caucasus are occupied by young terrain of the Alpine type, which was formed with the help of glacial-nival processes in the Late Quaternary period [41].
The Greater Caucasus is an important barrier for air masses moving in the longitudinal direction, determining the difference in climate between the Ciscaucasia and the Transcaucasia (the South Caucasus). Hence, the Caucasus Mountains are subject to strong longitudinal variations in climate [42]. For example, Western Caucasus receives three to four times more precipitation than Eastern Caucasus [43]. At the same time, the Transcaucasia region is also characterized by higher precipitation, which can be up to 4000 mm year −1 in the southwest [44].

The Sediment Yield Data
The suspended sediment yield (SSY, t km −2 year −1 ) is the mean annual sum of suspended sediment load normalized by catchment area [45]. Many works carried out in different world regions are devoted to SSY's study, and some contain data on Caucasus Region [15,19,[46][47][48][49]. This study used a database recently created by Tsyplenkov et al. [12] updated with gauging stations (GS) for the Terek basin from Tsyplenkov et al. [11]. The overview of the database is reported in Figures 1 and 2.
The database consists of mean annual suspended sediment yield values for 257 gauging stations located in Russia (147), Azerbaijan (44), Georgia (41), Armenia (25). To avoid the potential impact of reservoirs on SSY, we excluded 13 gauging stations located downstream from the reservoirs (see Figure 1a). The location of dams and reservoirs was taken from the GRanD database [50]. Therefore, we end up with 244 gauging stations with catchment areas that vary from 4 to 22,000 km 2 and a median measuring period of 18 years (see Figure 2b). The SSY values were calculated as follows: in which SSY is the mean annual suspended sediment yield, t km −2 year −1 ; Q is the mean annual water discharge, m 3 s −1 ; SSC is the mean annual suspended sediment concentration, g m −3 ; A is the catchment area, km 2 ; n is the measuring period length in years. According to the Köppen-Geiger classification [51], all stations were classified into nine climatic zones (Figure 2c). We also used a map of landscape provinces (see Figure 1b) created by Isachenko A.G. [52] to assign the most common landscape province in the basin. The majority of class/zone within every basin determined both climate classes and landscape zones, i.e., the raster value occupying the greatest number of cells, considering cell coverage fractions [53].

Potential Controlling Factors
For every basin, several variables describing the topography, lithology, climate, and other factors that can potentially affect SSY were derived. Table 1 gives an overview of these parameters. Variable values were extracted for each catchment, using catchment boundaries derived from the digital elevation model (DEM). These boundaries and morphometric indices were calculated from the void-filled 30 m ALOS World 3D (AW3D30) DEM by the Japanese Aerospace Exploration Agency (JAXA). Nevertheless, this dataset has had horizontal and vertical errors in the range of tens of meters. It currently offers tremendous potential for regional mountain geomorphological analyses [55]. DEM preprocessing was done in ArcMap ver. 10.8 (http://www.esri.com (accessed on 9 November 2021)) using the Spatial Analyst toolbox.
From preprocessed DEM, we calculated an elevation's coefficient of variation (DEM [-] in Table 1) by dividing the standard deviation by mean catchment elevation. In line with several other studies focusing on the relation between denudation rates and topography [23,24,49], a normalized steepness index (K sn [m 0.9 ]) was also calculated. It represents the upstream area-weighted channel gradient [56]: in which S is the channel slope (m m −1 ); A is the upstream drainage area (m 2 ); θ is the concavity index set to 0.45 to ease comparisons between streams with differing catchment areas and concavities. We calculated K sn in MATLAB using the Topo Toolbox Software [57,58].

Potential Controlling Factors
For every basin, several variables describing the topography, lithology, climate, and other factors that can potentially affect SSY were derived. Table 1 gives an overview of these parameters. Variable values were extracted for each catchment, using catchment boundaries derived from the digital elevation model (DEM). These boundaries and morphometric indices were calculated from the void-filled 30 m ALOS World 3D (AW3D30) DEM by the Japanese Aerospace Exploration Agency (JAXA). Nevertheless, this dataset has had horizontal and vertical errors in the range of tens of meters. It currently offers tremendous potential for regional mountain geomorphological analyses [55]. DEM preprocessing was done in ArcMap ver. 10.8 (http://www.esri.com (accessed on 9 November 2021)) using the Spatial Analyst toolbox.
From preprocessed DEM, we calculated an elevation's coefficient of variation (DEM [-] in Table 1) by dividing the standard deviation by mean catchment elevation. In line with several other studies focusing on the relation between denudation rates and topography [23,24,49], a normalized steepness index (Ksn [m 0 . 9 ]) was also calculated. It represents the upstream area-weighted channel gradient [56]: in which S is the channel slope (m m −1 ); A is the upstream drainage area (m 2 ); θ is the concavity index set to 0.45 to ease comparisons between streams with differing catchment areas and concavities. We calculated Ksn in MATLAB using the Topo Toolbox Software [57,58]. As a proxy of slope lengths, we used the precalculated height Above the Nearest Drainage (HAND, m) in the MERIT dataset [59]. HAND normalizes DEMs according to distributed vertical distances relative to the drainage channels [60].
In order to assess the potential importance of tectonic activity in explaining the observed SSY variation, we computed the area-weighted average peak ground acceleration with a 10% exceedance probability in 50 years (PGA, m s −2 ) derived from the GSHAP Global Seismic Hazard dataset [61,62] following the recommendations from Vanmaercke et al. [22].
The global lithological map database (GLiM) [28] and average chemical weathering rates (CWR, t km −2 year −1 ) calculated for every individual GLiM class by Hartmann et al. [63] were used. The spatial distribution of CWR rates is presented in Supplementary Material. Then for every catchment in our database, the area-weighted average CWR was estimated.
We derived the sand-fraction map based on topsoil (0-30 cm depth) structure from the SoilGrids database [64] at 250 m resolution (Supplementary Material). Therefore, the SAND parameter (see Table 1) represents the average fraction of sand in the first 30 cm of soil.
The authors used several regional studies to assess the fraction of landuse/landcover classes (see Supplementary Material). Thus, cropland, forest, and barren areas for 2015 were calculated based on research by Buchner et al. [65]. The glacier areas for 2014 were derived from the GLIMS database [66]. We also estimated a mean tree canopy height (CANOPY, m) within a catchment as a proxy of forest structure and aboveground biomass. For this purpose, a 30 m global forest canopy height map [67] was used.
Additionally, the fragmentation of landuse/landcover classes within basins was determined using Shannon's Diversity Index (SHDI) [68]. The SHDI shows the relative profusion of each land cover class and is frequently used as a diversity metric in biodiversity and ecology studies [69][70][71]. The SHDI calculates as follows: where P i is the proportion of class i. This study used mean annual basin-averaged climate parameters calculated from a TerraClimate [72] dataset for 1958-2018. We estimated mean annual precipitation (P, mm), a mean annual air temperature (AET, • C), a mean annual downward solar radiation flux at the surface (SRAD, W m −2 ), a mean annual cumulative streamflow (Q, mm). We also estimated basin-averaged mean annual rainfall erosivity based on Global Rainfall Erosivity Dataset (GLORED, MJ mm ha −1 h −1 year −1 ) created by Panagos et al. [73] based on high temporal resolution rainfall records from 3530 stations worldwide.

Uncertainty Assessment
The errors of the suspended sediment yield values were estimated by taking into account the most important sources of uncertainties. In this research, other potential SSY values were simulated using the following equation, modified from Vanmaercke et al. [18]: in which SSY cor is a corrected value of SSY after considering the various sources of uncertainty. U ME reflects the uncertainties associated with measuring errors; U MP represents the uncertainty associated with the length of the measuring period. A first significant source of uncertainty (U ME ) is the effects of various methodological errors during the SSY measurements. The degree of these measuring errors depends on various (unknown) methodological details. In previous studies, it has been reported that these errors are usually 20-30% [18,76,77]. Therefore, we have assumed that 30% gives a realistic and relatively conservative estimate of the SSY-value uncertainty associated with measurement errors. Hence, U ME was simulated as a random number from a normal distribution with a mean of 1 and a standard deviation of 0.30.
Uncertainties related to the duration of the measuring period describe interannual sediment export variability. Recent studies by Vanmaercke et al. [78] indicate that Weibull distribution describes such interannual variations in SSY (with a scale of 172.7 and shape 1.22). We selected n values from the median Weibull distribution to simulate UMP with n equal to the measuring period duration. Then we calculated the sample's mean and divided it by the real average of the Weibull distribution [18]. The average value (24 years) was used when the duration of the measuring period was not available.
Likewise, data derived from global datasets generally have critical errors and uncertainties. We, therefore, try to consider them in this study by applying a simplified version of Equation (4). For every parameter examined as a potential controlling factor (F), we simulate alternative values (F sim ) using the following formula: in which U er is the uncertainty associated with dataset errors and accuracies (see Table 1). It was simulated as a random number from a normal distribution with a mean of 1 and a standard deviation of uncertainties from Table 1.
Previously, Boulton and Stokes [55] found that river network derivatives from AW3D30 (2017th release) are highly accurate (even the K sn ). However, there is no data available on relative errors of the AW3D30 river derivatives. The overall elevation accuracy of the AW3D30 DEM was estimated as <3 m by Boulton and Stokes [55] and as <5 m by mission specifications [79]. Previous studies found that the vertical root mean square error was lower (<5 m) in flat areas [80,81] and higher (up to 12-14 m) in mountainous areas [81]. Considering the average elevation difference within the studied watersheds equal to 2700 m and the elevation range for the whole studied region from 0 to 5639 m, we assumed a 1% elevation accuracy is a realistic uncertainty measure. For this study, we hypothesized that river network derivatives from AW3D30 would have the same relative error.
While accuracy assessment has not been previously reported for the HAND dataset, it is known from previous studies that it is mainly controlled by the definition of the accumulated area [60]. As in this study, we derived HAND from the MERIT dataset [59]. Thus, we assumed that uncertainty is attributable to errors in MERIT's flow accumulation estimates. In most cases, the reported relative error of the accumulation area is <0.05 [59]. There were only a dozen validation stations in the Caucasus region with accumulation area relative error ±5%. Hence, we could estimate the overall uncertainty of the HAND dataset as 5%. However, this value should be used with caution as it does not include uncertainties associated with validation station location, possible hydrologic fluctuations, and unreported HAND model accuracy.
Buchner et al. [65] reported the 2015th landuse map accuracy by classes: 29% for barren areas, 8% for cropland areas, and 15% for forest areas (averaged among various forest types). Therefore, uncertainties of the landcover change associated with measuring errors were simulated as a random number from a normal distribution with a mean of 1 and a standard deviation of 0.29 for barren, 0.8 for cropland, and 0.15 for a forest. Potapov et al. [67] indicated that the accuracy of the canopy tree dataset varies from 83% to 92.9%, depending on the forest class threshold and reference source. We assumed a relative error of 12% provides a realistic estimate of canopy height uncertainty.
Tielidze and Wheate [74], in their study, reported that the accuracy of glacier's outline identification in 1960 varies between 4.8% and 5.9%. The error was associated with mapping and georeferencing errors. We used a 5% value as an uncertainty measure.
Shedlock et al. [62] provided no further details on the accuracy of the PGA map. Seeing that that map has a probabilistic nature of hazard analyses makes it hard to validate [82]. Nevertheless, the relative error of the estimated peak ground acceleration should likely be about 20% since some studies [83] indicate that this is the percentage of large earthquakes that are not predicted.
Equations (4) and (5) were used to simulate 1000 alternative SSY and controlling factors for every basin. We calculated 95% confidence intervals on every SSY value (i.e., the difference between the 97.5% and 2.5% quantile of the 1000 simulated values).

Statistical Analyses
The relative importance of control variables in explaining SSY was assessed through partial correlation analyses (using the Spearman rank test). All statistical analysis was done in R 4.1.0 [84] unless otherwise noted.
The partial least squares regression (PLSR) technique was also used to analyze the relationship between SSY and controlling factors. This method combines Principal Component Analysis (PCA) and Multiple Linear Regression [85,86]. The PLSR model is based on a linear combination of the original scores that aim at the best representation of the response variable [87], i.e., suspended sediment yield. One of the strengths of the PLSR over the PCA is that both quantitative and qualitative variables can be used to explain the dependent variables. The Q 2 (i.e., the cross-validated R 2 ) index was used to assess the PLSR model quality [85]. The Q 2 represents the global component contribution and determines the most stable model when close to 1. Applying this goodness of prediction estimator is very common in PLSR models studies [85,[87][88][89].
There were two models run with various sets of explanatory variables. Model A was based only on numeric variables (see Table 1, everything except KG and LZ). In contrast, in Model B, we added qualitative (dummy) variables such as climatic region (KG) and landscape zone (LZ).
We applied variable importance in the PLSR projection (VIP) filter measure to select the most relevant variables. The authors used a VIP value threshold equal to 1, suggesting that variables with VIP higher than 1 are essential in explaining SSY [90]. We performed the PLSR analysis using the XLSTAT ver. 2021.3.1 software by Addinsoft. Table 2 gives descriptive statistics of collected data. The observed mean annual suspended sediment yield in the Caucasus region is 426 ± 71.9 t km −2 year −1 . It varies from 5 to 4100 t km −2 year −1 . The map presented in Figure 1a can reveal some general patterns of SSY in the Caucasus region. The map shows that low SSY values generally characterize large parts of the Lesser Caucasia and Ciscaucasia (<100-150 t km −2 year −1 ). While in the Greater Caucasus, values higher than 150 t km −2 year −1 and even above 300 t km −2 year −1 are a lot more frequent. Furthermore, the western part and the high mountain belt of the Greater Caucasus can be clearly distinguished by values above 600 t km −2 year −1 . Figure 1b displays the distribution of SSY per climate class, while Table 2 reports descriptive statistics of the SSY-values per climate class. This comparison indicates some apparent differences: the Dfc climate type (snow climate, fully humid with cold summer) has the highest SSY values, it is significantly (p < 0.00008) higher than the most common Dfb class (snow climate, fully humid with warm summer) and the Cfa (warm temperate climate, fully humid with hot summer). The ET regions have intermediate values. For other classes, the number of observations is too low to draw any conclusions. Grouping SSY entries by landscape provinces reveals much more SSY distribution diversity (see Figure 2d). Thus, SSY in 2c and 2e provinces (eastern part of the Greater Caucasus) is the highest in the region. Conducted pairwise Wilcoxon rank-sum test revealed that mean SSY values in the eastern part (landscape zone 2e and 2c) of the Great Caucasus are significantly (p < 0.002) higher than in the western (2f and 2a). We observe the lowest SSY values in the Lesser Caucasus (5d-f), while the central part of the Greater Caucasus (2b) generally has the intermediate values.

Database Overview
Once the SSY data are aggregated by altitude zone (Figure 2a, Table 2), the findings suggest that SSY distribution for lowland basins (GS altitude < 500 m a.s.l.) is overall lower than the SSY distribution for mountain catchments. Low mountain catchments (500-1000 m a.s.l.) also have lower SSY values than middle and high mountain ranges. However, all these differences are not significant according to the Wilcoxon rank-sum test. Table 3 shows the partial correlations between all considered variables and SSY. The partial correlation analyses indicate that catchment sediment yields are mainly controlled by HAND, K sn, and the contemporary fraction of barren land (Barren). We observe a negative correlation between the SSY and the CWR (Spearman r = −0.37, CI = [−0.4; −0.33]). However, the further study points to differences depending on the altitude zone (Figures 3-5). Therefore, if we consider only the high mountain ranges (>1500 m) catchments, there is a highly significant negative correlation between SSY and CWR (Spearman r = −0.65). The CWR explains 64% of the variation in SSY of high mountain catchments (Figure 6d). Table 3. The mean, lower and upper bounds of the 95% confidence interval of 1000 Spearman r-values between the corrected SSY (Equation (4)) and simulated controlling factors (Equation (5)). Every value was retrieved by calculating the correlation between randomly generated sets of SSY and controlling variables values (see Section 2.4). An explanation and source of each variable are given in Table 1.   (4)) and simulated controlling tectonics and lithology factors (Equation (5)) grouped by altitude zone. Every value was retrieved by calculating the correlation between randomly generated sets of SSY and controlling variables (Section 2.4). The red values represent the median Spearman r score.  (4)) and simulated controlling tectonics and lithology factors (Equation (5)) grouped by altitude zone. Every value was retrieved by calculating the correlation between randomly generated sets of SSY and controlling variables (Section 2.4). The red values represent the median Spearman r score.

Variable
Water 2021, 13, x FOR PEER REVIEW 12 of 2 Figure 4. The distribution of 1000 Spearman's rank correlation coefficients between the corrected SSY (Equation (4)) and simulated controlling landuse/landcover factors (Equation (5)) grouped by altitude zone. Every value was retrieved by calculating the correlation between randomly generated sets of SSY and controlling variables (Section 2.4). The red values represent the median Spearman r score. Figure 5. The distribution of 1000 Spearman's rank correlation coefficients between the corrected SSY (Equation (4)) and simulated controlling climate factors (Equation (5)) grouped by altitude zone. Every value was retrieved by calculating  (4)) and simulated controlling landuse/landcover factors (Equation (5)) grouped by altitude zone. Every value was retrieved by calculating the correlation between randomly generated sets of SSY and controlling variables (Section 2.4). The red values represent the median Spearman r score.
Water 2021, 13, x FOR PEER REVIEW 12 of 22 Figure 4. The distribution of 1000 Spearman's rank correlation coefficients between the corrected SSY (Equation (4)) and simulated controlling landuse/landcover factors (Equation (5)) grouped by altitude zone. Every value was retrieved by calculating the correlation between randomly generated sets of SSY and controlling variables (Section 2.4). The red values represent the median Spearman r score. Figure 5. The distribution of 1000 Spearman's rank correlation coefficients between the corrected SSY (Equation (4)) and simulated controlling climate factors (Equation (5)) grouped by altitude zone. Every value was retrieved by calculating Figure 5. The distribution of 1000 Spearman's rank correlation coefficients between the corrected SSY (Equation (4)) and simulated controlling climate factors (Equation (5)) grouped by altitude zone. Every value was retrieved by calculating the correlation between randomly generated sets of SSY and controlling variables (Section 2.4). The red values represent the median Spearman r score.
the correlation between randomly generated sets of SSY and controlling variables (Section 2.4). The red values represent the median Spearman r score.

Partial Least Squares Analyses
To visualize the multidimensional data structure between controlling variables (see Table 1), a biplot of scores and correlation loadings was constructed based on a PLSR model B (Figure 7a). The plot graphically summarizes which of the factors controls SSY. In component 1, 15% of the explanatory variables (X) variance was used to explain 43.8% of the response variables (Y) variance. In component 2, 23.1% of the variance of the controlling factors was used to explain 57.7% of the SSY variance. A measure of goodness of prediction Q 2 showed that component 1 and component 2 almost equally contribute to the model B quality (40.7% and 49.3%, respectively). Among factors controlling the SSY, the HAND and Ksn values were highly positively correlated (r = −0.95, p < 0.001) (Figure 7a). Suspended sediment yield was generally less associated with catchment area (r = 0.13 p = 0.04) and rainfall erosivity (r = 0.16 p = 0.01). Whereas forest fraction (FOREST) and mean annual temperature (AET) appeared not to be statistically significant factors (p > 0.05). Important explanatory variables (VIP > 1) in the projection for SSY (Figure 7b) created the following decreasing order: HAND > LZ > PGA > BARREN > Ksn > Cropland > Q > CWR > KG > SAND > SHDI. Contrary, HAND seems to have a positive effect on the SSY values. Again, we may observe elevation tendency: Spearman's r-value tends to increase with the GS altitude (see Figure 3). In middle and high mountain catchments, HAND explains 63 and 42% of the SSY variation (Figure 6e).

Partial Least Squares Analyses
To visualize the multidimensional data structure between controlling variables (see Table 1), a biplot of scores and correlation loadings was constructed based on a PLSR model B (Figure 7a). The plot graphically summarizes which of the factors controls SSY. In component 1, 15% of the explanatory variables (X) variance was used to explain 43.8% of the response variables (Y) variance. In component 2, 23.1% of the variance of the controlling factors was used to explain 57.7% of the SSY variance. A measure of goodness of prediction Q 2 showed that component 1 and component 2 almost equally contribute to the model B quality (40.7% and 49.3%, respectively). Among factors controlling the SSY, the HAND and K sn values were highly positively correlated (r = −0.95, p < 0.001) (Figure 7a). Suspended sediment yield was generally less associated with catchment area (r = 0.13 p = 0.04) and rainfall erosivity (r = 0.16 p = 0.01). Whereas forest fraction (FOREST) and mean annual temperature (AET) appeared not to be statistically significant factors (p > 0.05). Important explanatory variables (VIP > 1) in the projection for SSY (Figure 7b Adding extra dummy variables (climate region and landscape province) made it possible to describe 10% more variation of the SSY (Table 4) with the same number of components. However, the overall quality of the model (Q 2 ) does not change significantly. Table 4. PLSR models quality and quality indexes. Model A is the partial least square regression without qualitative variables (KG and LZ), while B is the PLSR model trained with qualitative variables.

SSY Overview
To the authors' knowledge, a database used in this research is currently the most extensive and detailed database of suspended sediment yield measurements in the Caucasus. However, we are aware that much more SSY data probably exists that is not included in this analysis. For example, the gauging station density in Transcaucasia (i.e., Georgia, Azerbaijan, and Armenia) is low, while earlier studies [41,91] indicate that there are some more GS exist.
Overall, the average SSY (426 t km −2 yr −1 ) agrees well with the average SSY of the European Alps (451 t km −2 yr −1 [15]). Earlier estimates of the denudation rates in the Caucasus region suggest a mean annual denudation rate of 0.2 mm yr −1 [91]. More recent Adding extra dummy variables (climate region and landscape province) made it possible to describe 10% more variation of the SSY (Table 4) with the same number of components. However, the overall quality of the model (Q 2 ) does not change significantly. Table 4. PLSR models quality and quality indexes. Model A is the partial least square regression without qualitative variables (KG and LZ), while B is the PLSR model trained with qualitative variables.

SSY Overview
To the authors' knowledge, a database used in this research is currently the most extensive and detailed database of suspended sediment yield measurements in the Caucasus. However, we are aware that much more SSY data probably exists that is not included in this analysis. For example, the gauging station density in Transcaucasia (i.e., Georgia, Azerbaijan, and Armenia) is low, while earlier studies [41,91] indicate that there are some more GS exist.
Overall, the average SSY (426 t km −2 year −1 ) agrees well with the average SSY of the European Alps (451 t km −2 year −1 [15]). Earlier estimates of the denudation rates in the Caucasus region suggest a mean annual denudation rate of 0.2 mm year −1 [91]. More recent estimates imply that the mean annual denudation rate should range from 0.005 to 2.32 mm year −1 [41]. Mozzherin and Sharifullin [41] calculated denudation rates (h c , mm year −1 ) using the following equation: in which SSY is a suspended sediment yield [t km −2 year −1 ]. Equation (6) was further used to compare this study's results with previous findings. As such, the mean denudation rate for the Caucasus region based on this study database is 0.16 ± 0.027 mm year −1 with a maximum of 1.55 mm year −1 . These values are in line with previous studies [41,91]. As indicated by Figure 1, the spatial patterns of our study correspond relatively well with those estimated by Mozzherin and Sharifullin [41]. However, these values should not be treated as actual denudation rates as the equation used (Equation (6)) to obtain them does not incorporate the bedload sediment transport. While the bedload fraction in mountain rivers can be 20-40% from total sediment load on average [92][93][94], it may increase up to 90-99% in some years [95].

Controlling Factors
We found that SSY distribution differs less between altitude zones than climatic regions and landscape provinces ( Figure 1). These findings are in line with Vanmaercke et al. [15]. They found that hilly catchments (<500 m a.s.l.) generally have lower SSY values, while other differences between mountain ranges are insignificant. While the SSY distribution within landscape provinces (Figure 2d) may somehow repeat general spatial climate patterns for the Caucasus [43,44], a closer look at the relationships between climate factors and SSY does not reveal any strong explaining power (Table 3, Figures 5 and 7). On the one hand, this is reasonably expected considering previous findings [22,34]. On the other, the sediment yield in mountain areas is mainly formed during the snow melting or rainfall [13,14], which can also lead to the formation of landslides, rainfall-driven floods, and glacier lake outburst floods [96,97]. Meanwhile, moraine deposits, traces of the Last Glacial Maximum [98], are actively eroded by streams at any water level rise, similar to that observed in the Rocky Mountains of eastern Canada [99]. Additional friable material is regularly exported by the first-second order tributaries (Horton system) to the bottoms of higher-order valleys with debris-and mudflows. For example, the modern sedimentation rate in the bottom of the Baksan river (North Caucasus) valley reaches tens of centimeters per year [100].
The most robust relationship was found between the SSY and topographic factors (Figures 3 and 6), which is easily explained by the latter's close relationship with river water discharge. Hillslope erosion is also controlled mainly by topographic variables such as height above the nearest drainage (HAND) and normalized steepness index (K sn ). However, it is well known that water discharges primarily determine the mountain river's sediment loads since streambank erosion is important in total load [101]. Nevertheless, a normalized steepness index corresponds to a more pronounced topography of the catchment area and, hence, higher erosion from slopes [102]. Notably, the non-linear increase of suspended sediment yield with basin average K sn (Figure 6d) appears to be consistent with the expectation from fluvial incision models [24,103].
The peak ground acceleration (PGA) significantly impacts the SSY in the Caucasus region since this territory belongs to the area of alpine folding. Noticeable movements of the Earth's crust constantly occur, which provokes mass movements and other slope processes. In general, the tectonics influence on the relief formation of the Caucasus region and denudation is currently much more significant than the climate impact [104].
A close relationship has been established between SSY and chemical weathering rate (CWR) only for basins in the high-mountain belt of the Caucasus region (see Figure 3). It generally agrees with the known concepts like when the threshold value in the denudation rate is reached, the contribution of chemical weathering begins to decrease with a denudation increase [105]. Namely, this is observed in the high mountain belt of the Caucasus, where the maximum SSYs are observed in proglacial rivers, sometimes exceeding the sediment load of similar rivers in other mountainous regions [13].
The most significant anthropogenic impact on SSY values could be expected in the hilly and lowland zones of the Caucasus region. This area is characterized by the maximum agricultural development of the territory and, consequently, the highest soil erosion rates on cropland for the European part of Russia [106,107]. However, the performed analysis does not reveal the presence of a close relationship between SSY and cropland area for given zones (Figure 6f). There are several reasons for this. Firstly, most rivers flowing in the hilly and lowland zones originate in the middle and high mountains, and most of the water is formed there. As a result, the share contribution of sediments from plains and foothills differs between basins, depending on the catchment areas' ratio in various elevation zones. Secondly, in the foothill and lowland zones, rainstorm soil erosion dominates. The sediment delivery ratio varies greatly depending on the length of the dry valley network [108] and the presence of other sediment traps in the catchment.
Thirdly, during extreme floods with low recurrence, which account for the bulk of the suspended sediment yield in mountain river basins [109], a significant part of the transported sediment is redeposited in the bottoms of river valleys within the foothill sections of their channels [100,110]. This is due to a sharp drop in slope steepness and the transport capacity of streams. It also has the most significant impact on the high variability of relationships between sediment yield in the foothills and plain zones of the Caucasus region and the other controlling factors. On the other hand, a reasonably close negative relationship between the cropland area and SSY was established for river basins in the high mountain zone (see Figure 6f). This is because croplands occupy the expansions of the valley bottoms in this zone and serve as sinks for sediments eroded from valley slopes.
Among other factors for all basins of the Caucasus region, a reasonably strong relationship was revealed between SSY and the proportion of barren lands (Figure 6e). This is also expected since barren lands, including badlands, are the primary sediment sources for areas slightly disturbed by humans [111]. Moreover, a good relationship between SSY and the fraction of barren lands is observed for all high-altitude zones. Barren lands are usually located close to streams, or their fraction is quite significant, as is observed in the mountains of Dagestan (Eastern Caucasus), where badlands distribution is associated with the destruction of the soil cover due to overgrazing [112].
The PLSR analyses revealed interdependencies between various explanatory parameters ( Figure 7). Thus, the PGA was likely a controlling parameter of the K sn and HAND, affecting the fraction of barren lands. While a fraction of cropland is somehow positively correlated with the population density, suggesting that arable lands are usually located near settlements. Surprisingly, we did not find any close relationship between rainfall erosivity (GLORED) and SSY values (see Figure 5) for hilly and low mountains catchments where a significant part of sediments should be formed by soil erosion [106]. Contrariwise, a high correlation between GLORED and SSY was observed for high-altitude basins (r = 0.51). This and the interdependency of GLORED with elevation covariates (DEM, GS elevation, see Figure 7) indicate that a possible impact of rainfall erosivity is associated with an elevation used to interpolate WorldClim data [113] used to create GLORED dataset [73]. Another possible explanation is the scarcity of meteorological stations in the mid-mountain and high-mountain belts of the Caucasus Mountains. In this regard, the reliability of the spatial variability of rainfall erosivity and its changes with altitude is generally relatively low.

Uncertainties and Further Implications
As a result of the uncertainties associated with the reported SSY data (see Section 2.4), our results should be interpreted carefully. Vanmaercke et al. [15] suggested that total uncertainties may be over 100% for some SSY measurements based on their overview of European sediment yield. Inaccurate SSY data measurements could partly explain some poor correlations (see Table 3, Figures 3-5). Nevertheless, considering various sources of uncertainties for both SSY and explanatory variables, high correlations with topographic and landuse factors reported here should be realistic.
It is worth noting that the relatively low correlation coefficients and quality indexes are expected in SSY studies [15,18,114,115]. Such low performance can be likely explained by uncertainties in SSY values (see Section 2.4), as our research was based on a similar empirical approach.
The SSY modeling was not the goal of this study; here, on the contrary, we only suggested variables that can be further used in the models. Nevertheless, a preliminary run of the PLSR showed that qualitative variables significantly increase the model's predictive power (Table 4). Therefore, adding dummy variables such as ecological zones or landscape provinces should be recommended for further studies.

Conclusions
Suspended sediment yield (SSY) in various environments has been studied for a long time, but SSY in the Caucasus region has received only limited attention. This study aimed to address this research gap by first exploring an extensive database of measured SSY data. Despite potentially significant uncertainties on SSY measurements and controlling factors, it is possible to point out major regional differences in SSY. Therefore, most catchments in the Lesser Caucasia and Ciscaucasia are characterized by relatively low SSY values (<100-150 t km −2 year −1 ), the Greater Caucasus region generally have higher SSY values (more than 150-300 t km −2 year −1 ). As these differences are based on extensive observations, they cannot be attributed to uncertainties in the available data. However, they must result from regional variation of the controlling factors of SSY and different dominant erosion processes.
Based on the exploratory analyses presented here, the particular importance of the various potential controlling factors of SSY could not be determined. We found that such proxies of topography as height above nearest drainage (HAND), normalized steepness index (K sn ) tend to be among the most important ones. However, a PLSR analysis suggested that these variables' influence is likely associated with peak ground acceleration (PGA) but much easier to measure. We also reveal a strong relationship between various land cover types (e.g., barren areas and cropland) in different elevation zones. We did not aim to create a tool to predict SSY in the Caucasus region. However, our research suggested that including such factors as HAND, K sn , a fraction of barren and cropland areas in models may be essential to shed some light on the SSY spatial and temporal patterns.
This line of investigation is continuing, looking at a broader range of data and such questions as possible relationships between SSY and natural or anthropogenic control factors and motifs involving their combinations. Using landscape provinces and other qualitative variables as controlling factors in line with PLSR modeling is also an intriguing possibility. In addition, the use of such models to predict suspended sediment yield is of great practical interest, especially in tectonically active regions.