Mesoscale Mapping of Sediment Source Hotspots for Dam Sediment Management in Data-Sparse Semi-Arid Catchments

Abstract: Land degradation and water availability in semi-arid regions are interdependent management challenges that are influenced by climatic and anthropogenic changes. Erosion and high sediment loads in rivers cause reservoir siltation and decrease storage capacity, which poses risks to water security for citizens, agriculture, and industry. In regions where resources for management are limited, identifying the spatial-temporal variability of sediment sources is crucial to decrease siltation. Despite the widespread availability of rigorous methods, approaches that simplify the spatial and temporal variability of erosion are often inappropriately applied to very data-sparse semi-arid regions. In this work, we review existing approaches for mapping erosional hotspots and provide an example of a spatial-temporal mapping approach in two case study regions. The barriers limiting data availability and their effects on erosion mapping methods, their validation, and the resulting prioritization of leverage management areas are discussed.


Introduction
Siltation of reservoirs (also called sediment entrapment or infilling of reservoirs) is a major problem in semi-arid regions where water is scarce and land degradation frequently results in high sediment loads in rivers entering reservoirs [1][2][3][4]. The sediment inflow in reservoirs reduces water storage volume, thereby endangering the water available for citizens, irrigated agriculture, and hydropower generation [5]. When water is the dominant mechanism of soil loss across a catchment, soil-water conservation measures in sediment source areas can drastically decrease catchment sediment yields [6,7]. Modeling and monitoring studies showed that timely and spatially adjusted management can significantly decrease downstream sediment delivery [8][9][10]. However, sediment sources are unknown in many semi-arid catchments, and no data are available to adjust management accordingly. Moreover, limited financial resources to apply soil-water conservation measures and institutional barriers pose further challenges for water and land managers [11]. In such situations, identifying and managing areas contributing to high sediment yields (hereafter, sediment hotspots) is crucial to decrease sediment inputs to reservoirs. We define "leverage areas" as sediment hotspots where management would have the greatest impact on reducing erosion and sediment delivery into downstream bodies (e.g., a dam or catchment outlet).
Several methods have been developed to define hotspot areas (or areas with high erosion risk) which would particularly benefit from further management. Methods differ mainly in: (i) the subprocesses and mechanisms of water erosion being considered [12][13][14]; (ii) the factors controlling water erosion and sediment delivery [15]; (iii) the representation of relationships between factors, including representation within conceptual, empirical, statistical, or physics-based models [16][17][18]; and (iv) study areas and scales. Common challenges among these approaches are: (i) ensuring the same input data quality for the whole study area; (ii) ensuring the same spatial and temporal precision and quality of data for each controlling factor; (iii) validation of mapped areas; and (iv) disclosure of the most appropriate decision rules for identifying hotspots and leverage areas [19][20][21][22][23].
Application of sediment hotspot mapping and delivery of reliable outputs is limited by data availability and by understanding of the processes driving sediment delivery in the respective catchments. The misidentification of sediment sources and pathways increases the risk of reservoir infilling, with serious consequences for water shortages in semi-arid regions that are already disadvantaged in many aspects (e.g., economically and environmentally). Retaining reservoir volumes is of primary importance for sustaining drinking water, irrigation water availability, and hydropower generation, and is a primary concern of water managers. This study aimed to illustrate why and how data availability constrains the identification of key leverage areas for sediment management. We investigated approaches for erosion hotspot mapping and, using two examples of data-sparse mesoscale catchments (in Iran and Sudan), illustrated how mapping of sediment hotspots in data-sparse semi-arid catchments can be enhanced using freely available global datasets. Lastly, we discuss which methodological approaches can be applied for mapping spatial and temporal variability and for validating results, and which challenges are common for research and management in data-sparse areas.

Constraints on Data Availability
When it comes to data availability and sharing, researchers across the globe, administrators, managers of land and water resources, and planners at companies designing new infrastructure (hereafter, stakeholders) encounter similar constraints, all rooted in: 1. general data unavailability due to a lack of, a decline in, or unequal distribution of monitoring and measuring stations [24,25]; 2. low or questionable quality of existing data due to flawed research design or poor documentation [26][27][28]; 3. inaccessibility of data due to official or unofficial institutional policies or stakeholders' personal preferences regarding cross-institutional data sharing and co-production, including digitalization and use of data [26,28,29].
The above-listed data limitations define an operating space for the preparation of a management plan. The degree to which a management plan can fit the actual natural and socioeconomic processes influencing sediment delivery is directly linked to the data available to stakeholders (hereafter, the data inventory), and to the methodological tools they have available and are able to use. In principle, two inventory types, minimum and maximum (Figure 1), define stakeholders' operating space. Maximum data inventories are all data collected in a region of interest by various research, administration, and commercial institutions, which might not be available for analysis before or during the entirety of a management-planning project. On the other hand, minimum data inventories are the minimal data needed to identify leverage areas. These data are (or rather, in principle, should be) freely available (globally and regionally). The accessible data inventory includes the data which the stakeholder uses in a specific time period for conducting research or analysis to support management decisions. The accessible data inventory can differ between the stakeholders (institutions or persons) responsible for the analysis.

Figure 1. An inventory is any dataset collected in a region of interest (including by administration and commercial institutions) that is available for analysis supporting management decisions. Three types of inventories can be defined based on data availability. The minimum data inventory is the minimum dataset needed to identify leverage areas for sediment management. These data are freely available (globally and regionally). The maximum data inventory contains all data needed for an analysis in ideal spatial and temporal resolution and precision. The accessible data inventory is a dataset a stakeholder can collect for an analysis in a region of interest for a point or period of time. Conditions I-VI are conditions (requirements or processes) challenging the stakeholder (institution or person) who is responsible for the analysis in internal and external institutional environments. The arrows show the increase in complexity. Conditions V and VI are dataset-related conditions challenging the stakeholder, and they differ between types of datasets. Further explanation can be found in the text.
Six conditions (requirements or processes) challenge the stakeholder conducting research or analysis to support management decisions: (i) requirements on data and knowledge in the region of interest; (ii) inter-disciplinary and inter-sectoral cooperation network; (iii) stakeholder analysis and involvement; (iv) resources (time, labor, money) to acquire data; (v) processing power and labor requirements; and (vi) spatial coverage, resolution, precision, and validation. Conditions I-IV (in Figure 1) are related to internal and external institutional environments, and Conditions V and VI are more data related. The conditions vary in complexity (an increase in complexity is indicated by arrows in Figure 1). The complexity of data-related challenges is not indicated, because it differs between types of datasets. For example, complexity can increase in sediment yield evaluation using multiple methods and a maximum inventory in a catchment. In this scenario, researchers may have rainfall simulations, hillslope and runoff plot monitoring, lysimeters and gully head retreat measurements, sediment fingerprinting, legacy sediment analysis, and river discharge and sediment monitoring at their disposal. Alternatively, remote sensing-based monitoring of regional vegetation growth using a minimum inventory provides better spatial information coverage than field-based vegetation mapping, which is usually not performed at large scales or continuously due to enormous labor requirements. In this scenario, processing of remote sensing-based monitoring data requires high processing power, sophisticated algorithms, and well-trained users.
Data inventories can limit or enhance sediment hotspot mapping and delivery. In the following, we review mapping approaches developed under constraints of data unavailability.

Mapping Approaches under Constraints of Data Unavailability
Mapping sediment management hotspots requires: (i) conceptualizing the processes causing erosion (sediment production), transport (sediment delivery), and accumulation (deposition or sedimentation); and (ii) identifying and describing the functional relationships between processes and dominant factors in a specific geographic area. Multiple geomorphic processes lead to reservoir siltation, e.g., runoff or subsurface-flow induced water erosion, wind erosion, or mass movements, such as landslides.
Mapping erosional hotspots for management is spatially and temporally scale dependent. A single map of erosional hotspots is simply a snapshot of the state (current or future) of a coupled natural-human system and represents a combination of potential and actual properties. On each scale (for example, a hillslope or catchment; event or year), and at each resolution (e.g., 1 m, 30 m, or 1 arc sec), different system properties and functions can be analyzed [30,31]. The results of mapping are further dependent on technique, spatial distribution of observation points, and maintaining measurements/records for adequate temporal scales [25,32,33]. Therefore, conceptualizing sediment hotspot mapping for management requires clarified aims, which help define dominant processes (e.g., hillslope erosion or sediment delivery to a reservoir); spatial scales, which help to define the area of interest and appropriate resolution (e.g., a particular hillslope or the catchment above a reservoir); and temporal scales, which help to define the period of interest and appropriate resolution (e.g., long-term mean, seasonal, and actual state after an event). Figure 2A conceptualizes the sub-processes leading to sediment delivery in semi-arid landscapes, and their controlling factors. In this example, the dominant controlling factors are: (i) rainfall properties; (ii) soil properties; in interaction with (iii) vegetation properties; (iv) topography; and (v) landscape design (the patterns of artificial or natural linear landscape elements and land uses). Here, the focus is on sediment production and transport through the landscape rather than on runoff generation and propagation, and sediment production and transport in permanent streams are not considered. The occurrence of high-intensity interval rainfall events is closely linked to peaks in sediment delivery [34,35].
Density and diversity of above-ground and sub-surface (roots) biomass influence infiltration, while vegetation patterns impact infiltration and connectivity of water and sediments in the landscapes [36][37][38][39][40]. From soil properties, the erodibility of soils and antecedent soil moisture are considered in Figure 2A, whereby specific catchment conditions may reflect the effects of other soil properties on runoff generation and particle detachment, i.e., duplex soils, content of shrinking clays, hydrophobicity, and/or exposure of parent material or subsurface soil horizons with erodibility differing from surface soil horizons [41]. Topography affects the sediment production via hillslope inclination and length ( Figure 2A) and sediment transport by influencing structural hydrological and sediment connectivity via surface roughness [42]. Finally, landscape structure (spatial organization of landscape elements and land uses) can enhance, modify, or retard sediment transport. Other factors (see, e.g., [16,43] and Figure  2B,C) can be chosen to represent the model of physical reality.

Figure 2. Example concepts of relationships between dominant factors leading to detachment of sediment, sediment transport, and delivery to a target: (A) Approach applied in this study. The focus here is on sediment transport, rather than on runoff generation only. Simplifications are made for sediment detachment in transport routes (e.g., gullies, permanent streams). Factors are in italics; sub-processes are underlined. Red asterisks indicate variables considered in this study. (B) Complex relationship structure between controlling factors to evaluate the susceptibility to sediment production (soil erosion). NDVI, normalized difference vegetation index; SPI, stream power index; TWI, topographic wetness index. Reproduced with permission from [43].

The type of data that are selected, measured, or analyzed for particular spatial and temporal resolutions will be heavily influenced by the concepts that each researcher is attempting to map. For example, the relationship between controlling factors of sediment delivery could be represented in multiple ways. Firstly, expert knowledge can be applied to determine qualitative decision rules and the hierarchical organization of controlling factors [16]. Alternatively, statistical approaches can be used [24,34,44-46]. Other approaches include empirical models (mainly the Universal Soil Loss Equation, USLE, or the Revised Universal Soil Loss Equation, RUSLE) [14,18,20,22,23,47], or physically based models, either spatially distributed or spatially lumped (e.g., the Soil and Water Assessment Tool, SWAT [12,48-50], and Water Availability in Semi-Arid environments-SEDiments, WASA-SED [51]).

Leverage Areas for Sediment Management
To address some methodological aspects of hotspot mapping for dam sediment management, we provide an example for leverage areas mapping under constraints of sparse data in two semi-arid catchments described below.

Case Study Areas
The two study areas were Karun in southwest Iran and Upper Atbara, which spans southeast Sudan, north Ethiopia, and Eritrea (Figure 3 and Table 1). Karun, the river with the largest discharge in Iran, has its source in the Zagros Mountains. The river flows westward towards the Persian Gulf Basin, which it enters near the Euphrates and Tigris confluence (Arvand Rud/Shatt al-Arab). The Karun catchment (~66,600 km²) has many sub-basins (e.g., Koohrang and Dez) with multiple operational dams (Karun 1-4, Gotvand, Dez, Ludbar Lorestan, and Kamal el-Saleh Dams). The catchment has a hot arid desert climate, with most rainfall falling from November to March [52,53]. The Upper Atbara catchment (~95,750 km²) is located in the northern part of the Ethiopian Highlands. It can be subdivided into two sub-basins, the Setit/Tekeze River and the Atbara River, which flow into the reservoir of the Upper Atbara and Setit Dam complex shortly before their junction (Table 1). The climate is temperate in the south and southeast, with increasing aridity towards the north, where a hot arid steppe climate prevails [52]. Most precipitation falls during a rainy season between May and September [53]. In both regions, shrub and tree cover prevail at the highest altitudes and transition to grasslands with rain-fed agriculture. The driest areas in the Karun basin are bare or sparsely vegetated [54].

Table 1 (excerpt). Driest month: Karun, September (mean rainfall 0-2 mm); Upper Atbara, January (mean rainfall 0-24 mm) [53]. Rainiest month: Karun, January (mean rainfall 28-98 mm); Upper Atbara, August (mean rainfall 152-358 mm).

Data Inventories
The minimum data inventory was represented by selecting data available from global repositories and published literature. The accessible data inventory was established during two years of continuous stakeholder involvement in cooperation with local scientists and water management authorities (Table 2). In both regions, it was not possible to find local data covering the entire area of interest at finer resolutions than the datasets available in the minimum inventory. Meteorological and hydrological monitoring data were available for some Karun sub-catchments, while no more detailed data were available for soils and vegetation. Several land use maps available at global scales [54,59] provided differing spatial distributions of land use classes, crop types, and irrigated agriculture. In both case study areas, no land use maps with spatial coverage comparable to global datasets were available, with information mostly restricted to agricultural areas. In contrast to local data, metadata (e.g., acquisition date and mapping methods) were available for global datasets, which made them better suited for approaches aiming to tackle spatial-temporal process dynamics. The global data best representing reality were selected based on the expert opinion of water authorities and scientists [54].

Mapping Approach
Previously, we showed that different data inventories lead to differences in the spatial and temporal accuracy of representations of physical reality, and that a range of methods can be applied to high-quality datasets. However, to demonstrate the effect of different inventories on the mapping of erosion hotspots, basic GIS mapping methods were chosen. To do so, we used the conceptual approach visualized in Figure 2A, where red asterisks indicate the considered variables. The inputs for our analysis (Table 2) were processed as shown in Figure 4. The hotspots and emerging hotspots were calculated using the Getis-Ord Gi* statistic [64,65] in ArcGIS 10.4. This geostatistical approach allowed distinguishing areas where features with high (hotspots) or low (cold-spots) values were surrounded by other features with high (low) values. Each feature was analyzed within the context of its neighborhood, and the sum for each neighborhood was compared with the sum of all features. The emerging hotspots were analyzed in the spatial-temporal domain. The analysis allowed distinguishing whether hotspots are stable, emerging, or diminishing in relation to a given point in time.
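As an illustration of the statistic named above, the following is a minimal sketch of the Getis-Ord Gi* z-score in plain Python. It is not the ArcGIS implementation: the binary weights (1 inside a distance threshold, including the feature itself, and 0 outside), the function name `getis_ord_gi_star`, and the 5 × 5 toy grid are assumptions made for this example only.

```python
import math


def getis_ord_gi_star(points, values, threshold):
    """Return one Gi* z-score per feature.

    Assumption for this sketch: binary weights, w_ij = 1 if feature j lies
    within `threshold` of feature i (feature i itself included), else 0.
    """
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    z_scores = []
    for xi, yi in points:
        weights = [
            1.0 if math.hypot(xi - xj, yi - yj) <= threshold else 0.0
            for xj, yj in points
        ]
        sum_w = sum(weights)
        sum_w2 = sum(w * w for w in weights)
        local_sum = sum(w * v for w, v in zip(weights, values))
        denom = std * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        # A zero denominator occurs when all values are equal or when every
        # feature is a neighbor; no local contrast can be measured then.
        z_scores.append((local_sum - mean * sum_w) / denom if denom > 0 else 0.0)
    return z_scores


# Toy 5 x 5 grid: a cluster of high values in one corner, low values elsewhere.
points = [(x, y) for y in range(5) for x in range(5)]
values = [10.0 if (x <= 1 and y <= 1) else 1.0 for x, y in points]
z = getis_ord_gi_star(points, values, threshold=1.5)
print(round(z[0], 2), round(z[-1], 2))
```

In the toy grid, the corner inside the high-value cluster receives a strongly positive z-score, while the opposite corner receives a negative one; significance classes such as those used in this study would then be derived from the z-scores and their p-values.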
Topography data [55] were pre-processed as required for the connectivity index calculation [42,66]; surface roughness was used as the weighting factor. We calculated the connectivity index to targets, i.e., dams and rivers (vectorized from QGIS and ArcInfo base maps). The connectivity index map was aggregated (by a factor of 30, using the median, providing a pixel size of ~900 m) to decrease the number of points for hotspot analysis ([64,65] ArcGIS 10.4). The setting for the spatial conceptualization of relationships was fixed-threshold distance, which means that effects of neighboring features diminish with distance. The Euclidean distance (connecting two points with a straight line) was chosen as the distance calculation method. We applied the false discovery rate correction (ArcGIS 10.4), which reduced the critical values of statistical significance (p) due to spatial dependence. Hotspots identified with 95% confidence provided the final output (settings: GiBin > 2, GiZScore > 0, and GiPvalue < 0.1).
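The aggregation step described above can be sketched as a block-median resampling. This is an illustrative sketch only, not the GIS tool used in the study; the function name `aggregate_median`, the use of `None` as a no-data marker, and the small 4 × 4 grid are assumptions. With a factor of 30, cells of roughly 30 m would become cells of roughly 900 m.

```python
from statistics import median


def aggregate_median(grid, factor):
    """Aggregate a 2-D grid over non-overlapping factor x factor blocks.

    Each output cell is the median of the valid (non-None) input cells in
    its block; a block without valid cells stays None.
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for r0 in range(0, rows, factor):
        out_row = []
        for c0 in range(0, cols, factor):
            block = [
                grid[r][c]
                for r in range(r0, min(r0 + factor, rows))
                for c in range(c0, min(c0 + factor, cols))
                if grid[r][c] is not None
            ]
            out_row.append(median(block) if block else None)
        out.append(out_row)
    return out


# Hypothetical connectivity index values; None marks no-data cells.
fine_grid = [
    [1, 2, 5, 6],
    [3, 4, 7, 8],
    [9, 10, None, None],
    [11, 12, None, None],
]
coarse_grid = aggregate_median(fine_grid, factor=2)
print(coarse_grid)
```

The median is less sensitive than the mean to single extreme cells within a block, which may be one reason to prefer it when reducing the number of points before a hotspot analysis.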
Soil Grid [58] layers (1 km)-silt, clay, and loam-were used to calculate texture [67]. Texture, together with GIS layers (dominant soil type, organic matter, and coarse sediments), was used to calculate K factors using previously published methodology [68]. An African soil map [61,62] was used for soil types in Upper Atbara. The hotspot analysis and selection followed the above-mentioned settings.
Rainfall station data from Karun (2000-2016; 29 stations; 15 hydrological years) were aggregated to monthly rainfall depths, and a shapefile containing the station locations and the corresponding number of data points was used to build a spatial-temporal cube (settings: time step: 30 days; distance interval: 20 km; and summary fields: max, mean, sum, and med). These were included in an emerging hotspot analysis ([63,64] ArcGIS 10.4; settings: neighborhood distance: 60 km; neighborhood time step: 1; and statistically significant hotspots selected: GiZScore > 0, GiPvalue < 0.1, and GiBin > 2).
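The binning behind such a spatial-temporal cube can be sketched as follows. This is a simplified illustration, not the ArcGIS tool: it assumes projected station coordinates in kilometers, uses calendar months instead of the 30-day time step, and the function name `build_cube` and the sample observations are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median


def build_cube(observations, cell_km=20.0):
    """Bin observations into (column, row, year, month) cells.

    observations: iterable of (x_km, y_km, year, month, depth_mm).
    Returns a dict mapping each bin to its max, mean, sum, and median.
    """
    bins = defaultdict(list)
    for x_km, y_km, year, month, depth_mm in observations:
        key = (int(x_km // cell_km), int(y_km // cell_km), year, month)
        bins[key].append(depth_mm)
    return {
        key: {
            "max": max(depths),
            "mean": mean(depths),
            "sum": sum(depths),
            "med": median(depths),
        }
        for key, depths in bins.items()
    }


# Hypothetical monthly rainfall depths (mm) at three stations.
observations = [
    (5.0, 5.0, 2015, 12, 40.0),
    (15.0, 8.0, 2015, 12, 60.0),   # same 20 km cell as the first station
    (25.0, 5.0, 2015, 12, 10.0),   # neighboring cell
    (5.0, 5.0, 2016, 1, 30.0),     # same cell, next month
]
cube = build_cube(observations)
print(cube[(0, 0, 2015, 12)])
```

Each bin then carries one summary value per time step, which is the form needed to compare a location with both its spatial and its temporal neighbors.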
Vegetation data [63] were selected to visually represent the maximum spatial distribution of low NDVI values (at the end of the season, or in December or May) from the 2000-2016 time series. Afterwards, we filtered out negative values (no ecological meaning) and focused on point values, from which hotspots were calculated. Statistically significant cold-spots (settings: GiPValue < 0.1, GiZScore < 0, and GiBin > −2) were selected, turned into points, and spatial-temporal cubes (time step: 1 year; and distance interval: 1000 m) were built for emerging hotspot analysis ([64,65,69] ArcGIS 10.4). Statistically significant hotspots (GiZScore > 0, GiPValue < 0.1, and GiBin > 2) were selected for the remaining hotspot analyses (connectivity index, gridded rainfall data, and K factor). In the emerging hotspot analysis, the "last step" was used to define the targeted temporal overlay (e.g., end of December).
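Emerging hotspot analysis, as documented for ArcGIS, combines the Gi* statistic with a Mann-Kendall trend test applied to the time series at each location. The sketch below shows only the trend test, in its basic form without a correction for tied values; the function name `mann_kendall` and the sample z-score series are assumptions for illustration, and the sketch does not reproduce the tool's pattern categories.

```python
import math


def mann_kendall(series):
    """Return the Mann-Kendall S statistic and its z-score.

    Simplification for this sketch: the variance is not corrected for ties.
    """
    n = len(series)
    # S counts later-minus-earlier pairs: +1 for an increase, -1 for a decrease.
    s = sum(
        (series[j] > series[i]) - (series[j] < series[i])
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z


# Hypothetical Gi* z-scores of one location over ten yearly time steps.
intensifying = [0.2, 0.5, 0.9, 1.1, 1.4, 1.8, 2.0, 2.3, 2.6, 3.0]
s_up, z_up = mann_kendall(intensifying)
s_down, z_down = mann_kendall(list(reversed(intensifying)))
print(s_up, round(z_up, 2), s_down, round(z_down, 2))
```

A significantly positive trend in a location's z-scores would point to an intensifying pattern, and a significantly negative one to a diminishing pattern, relative to the last time step considered.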
The leverage areas represented the spatial-temporal overlay of hotspots or emerging hotspots of the four controlling variables. Only statistically significant hotspots and emerging hotspots of the four controlling factors (with >90% probability of occurrence) were overlaid. For the spatial intersect, a buffer of three times the minimal layer spatial resolution (3 × 1 km) was allowed.
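One way to read the overlay rule above is sketched here for rasterized hotspot layers on a common 1 km grid: each layer is buffered by three cells, and a cell counts as a leverage area only where all buffered layers overlap. This is an assumed interpretation for illustration; the square (rather than circular) buffer, the function names `dilate` and `leverage_areas`, and the one-row toy layers are not taken from the study.

```python
def dilate(mask, radius):
    """Buffer a boolean grid by `radius` cells (square neighborhood)."""
    rows, cols = len(mask), len(mask[0])
    out = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                    for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                        out[rr][cc] = True
    return out


def leverage_areas(layers, radius=3):
    """Return cells where every buffered hotspot layer is True."""
    buffered = [dilate(layer, radius) for layer in layers]
    rows, cols = len(layers[0]), len(layers[0][0])
    return [
        [all(layer[r][c] for layer in buffered) for c in range(cols)]
        for r in range(rows)
    ]


def single_hotspot(col, width=10):
    """Hypothetical one-row layer with a single hotspot cell."""
    return [[c == col for c in range(width)]]


# Four controlling factors: connectivity, soil erodibility, rainfall, vegetation.
layers = [single_hotspot(2), single_hotspot(4), single_hotspot(3), single_hotspot(6)]
result = leverage_areas(layers, radius=3)
print(result[0])
```

Without the buffer, the four single-cell hotspots in this toy example would never intersect; with it, the overlap covers the cells lying within three cells of all four hotspots.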

Validation Approach
To discuss challenges in the validation procedure, the spatial distribution of erosion hotspots was compared to observed or modeled sediment yield in Karun. Sediment data from the sediment monitoring stations (39) in Karun consisted of continuous measurements with a frequency of mostly one day per month between October 1999 and September 2012. They were considered to represent monthly sediment volume, and, if more measurements were available in a month, the cumulative value was considered. We calculated the area-specific sediment yield (Mg km⁻² month⁻¹) for each measuring station. Subsequently, the long-term mean of annual mean values and the long-term sediment yield in December were related to the spatial distribution of hotspots.
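The arithmetic described above can be written out as a short sketch. The station loads and the contributing area are invented for illustration, the function names are hypothetical, and calendar years are used for grouping, which is a simplifying assumption because the text does not state how years were delimited.

```python
from collections import defaultdict
from statistics import mean


def specific_yield(monthly_load_mg, area_km2):
    """Convert monthly loads (Mg) to area-specific yield (Mg km^-2 month^-1)."""
    return {key: load / area_km2 for key, load in monthly_load_mg.items()}


def long_term_mean_of_annual_means(yields):
    """Average the monthly yields per year, then average those annual means."""
    by_year = defaultdict(list)
    for (year, _month), value in yields.items():
        by_year[year].append(value)
    return mean([mean(values) for values in by_year.values()])


def long_term_month_mean(yields, month):
    """Average the yields of one calendar month over all years."""
    return mean([v for (_year, m), v in yields.items() if m == month])


# Hypothetical station: 200 km2 contributing area, loads keyed by (year, month).
loads = {
    (2010, 11): 400.0,
    (2010, 12): 1200.0,
    (2011, 11): 200.0,
    (2011, 12): 1000.0,
}
yields = specific_yield(loads, area_km2=200.0)
annual_mean = long_term_mean_of_annual_means(yields)
december_mean = long_term_month_mean(yields, month=12)
print(annual_mean, december_mean)
```

In this toy example the December mean exceeds the mean of annual means, which illustrates why a comparison against a seasonal hotspot map and one against a long-term mean map can lead to different conclusions.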

What to Map: Means or Extremes?
Maps are diagrammatic representations of a system's state during the mapping period and at a particular spatial scale; the map itself remains unchanged, whereas slow (e.g., soil texture), fast (e.g., soil moisture and rainfall volume), and emerging (connectivity) system variables, and the relations between them, change at different rates and intensities. If a map is used for decision making, it is preferable to define the conditions which the map represents, and, when possible, to represent multiple possible system states, conditions, and behaviors. Such an approach is well established for flood risk mapping, which displays various spatial distributions for floods of different recurrence intervals, or distinguishes seasons in which flooding can occur. To our knowledge, this is rarely the case in sediment management. For example, studies showed that the amount of sediment leaving a catchment at its outlet (hereafter, sediment yield) is closely related to rainfall seasonality and rainfall extremes in semi-arid catchments (e.g., [35,86]). However, the relation between rainfall amount and sediment yield is not linear, and the redistribution of sediment within the catchment or in the river channel influences sediment yield when the catchment is well connected during the event. Therefore, ideally a manager of a dam would be aware of sediment production areas, temporary storage areas, and connectivity in their catchment, and consider their spatial-temporal dynamics. For this, modeling approaches that distinguish long-term means and extreme rainfall conditions, or that hindcast long-term trends, can be useful. Ideally, modeling would be applied for strategically planning management for upcoming seasons [10,87,88]. Again, the availability of data inventories for continuous model calibration and validation, the availability of seasonal rainfall predictions, and modeling costs (labor, financial, time, and programming) hinder the application of more complex models at large scales.
Seasonally variable mapping cannot substitute for precise modeling approaches, but it can provide more detailed information than approaches using mean conditions (e.g., long-term average rainfall). Seasonal rainfall volumes and vegetation state derived from remote sensing (e.g., Sentinel-1, Sentinel-2 I/II) are examples where it is possible to incorporate variable controlling factors. Figure 5 provides an example of spatial-temporal datasets based on different inventories in Karun (Iran). Minimum data inventories based on global raster datasets were applied for rainfall (Figure 5A,B,D,E), structural connectivity, soil erodibility, and vegetation. The rainfall data inventory was based on station-based long-term measurements of rainfall (Figure 5C,F). Three possible representations of rainfall data were documented. Rainfall data in Figure 5B,E represent hotspots of mean yearly rainfall erosivity, and rainfall data in Figure 5A,D reflect hotspots of monthly rainfall depth in the rainiest month. The highest monthly rainfall depth was reached in January according to global data (Figure 5A,D) [53], while the analyzed station data showed it was December. Spatially and temporally significant hotspots based on station data (points, not interpolations) at the end of the last analyzed hydrological year (September 2016) provided more detailed results, but were limited in spatial validity/extent. Another temporal dimension is provided by the vegetation data, which represent areas where the lowest biomass repeatedly occurred at the beginning of a growing season (end of October during 2000-2016), and approximately one month later when the rainiest months started (beginning of December, during 2000-2016). The temporal discrepancy between the overlaid rainfall and vegetation hotspot maps was intentionally introduced to demonstrate the necessity of evaluating the temporal validity of input sources, and of selecting the appropriate time periods relative to the application of management measures.
Figure 6 shows leverage areas for sediment management derived from matching temporal sources: global data of monthly rainfall and vegetation in December (patterns relative to the beginning of December 2016; Figure 6A), and station-based rainfall and MODIS-based vegetation (both patterns relative to the end of October 2015; Figure 6B). In the first example, the vegetation, soil, and connectivity hotspots are likely to occur before the December rains (annual peak in rainfall volume). This information can be used to prepare ad-hoc sediment capture measures in upcoming months. The latter example reflects leverage areas for sediment management surrounding rainfall monitoring stations. They cover areas where statistically significant high soil erodibility and high connectivity are located. Additionally, statistically significant spatial-temporal hotspots of rainfall and low vegetation cover repeatedly occurred there in the last 15 years (October 2000-October 2015). This map can assist sediment management by identifying permanent hotspots in proximity to a measurement station, which are likely to remain active in subsequent periods.

What to Map: Spatial or Temporal Extremes?
Representing spatial-temporal relationships to indicate when minimal management efforts could have maximized effects is desirable [10]. The use of gridded time-series products enables the application of geospatial statistical methods to distinguish spatial and temporal extremes. Many methods can be applied to spatial environmental variables [24,89-91], and additional tools are available for mapping changes in temporal trends of spatial data (e.g., [92][93][94][95], and this study). In our study, emerging hotspots (Figures 5 and 6) represent statistically significant relationships of a spatial component within its spatial and temporal neighborhood. Analysis of spatial-temporal relationships was only possible for factors considered as dynamic (temporally and spatially variable), and only where appropriate datasets were available within the applied inventory. The connectivity index used here highlights how conceptual representations can restrict the applicability of analyses to particular spatial-temporal domains.
Availability of a single digital elevation model constrains application of topography-based time series, which can only be derived from multiple digital elevation models (similar to [69,96]). We used surface roughness as a weighting factor for connectivity index calculation, while others [95,97] have opted for temporally changing (remote sensing derived) crop erodibility factor. The latter enabled derivation of a connectivity index time series in order to study vegetation-based changes in connectivity. Computational limitations of applied algorithms [64] at regional scales further limit the applicability of temporal time-series, and therefore the analysis framework can be simplified [95], ideally matching management's spatial and temporal requirements.

How to Validate and Prioritize Hotspots?
When mapping sediment hotspots using physically based models, the calibration and validation of model efficiency and sensitivity analysis are well established for cases where input data are available [31,48,98]. Although result validation is common in modeling studies, it is rather uncommon for hotspot mapping. One can argue that the conceptualization of hotspot mapping (models) focuses more on potential than on actual erosion (sediment production or sediment delivery), which makes validation troublesome [16]. Generally, two validation approaches are applied in modeling studies: (i) validation using monitoring data; and (ii) multi-model comparison. Use of monitoring data for validation is limited by availability, as well as, in cases of spatially lumped data (i.e., station measurements), by the ability of point measurements (plot or catchment outlet sediment yield) to describe spatially and temporally distributed processes such as sediment cascades [16,99]. The same is valid for the mapping approach itself, and therefore caution should precede selecting appropriate variables and temporal snapshots for comparison [33]. For example, the long-term annual mean sediment yield measured at the catchment outlet (during 2000-2013) is compared to leverage areas for sediment management at the end of the dry season (September in 2000-2016, Figure 7A), and December leverage areas are compared to the long-term December mean (Figure 7B). To show spatial extremes, a heat map of spatially lumped long-term means of area-specific sediment yield from outlet measurements (represented as one value for the entire catchment area) is compared to spatially and temporally significant, spatially distributed data (leverage areas). However, such comparisons are dubious, as most catchments with identified sediment leverage areas lack sediment measurements.
In the study area, spatially distributed information on eroded/eroding areas (such as soil profile truncation maps, and maps of rills and gullies) was unavailable, and therefore validation by geostatistical methods [17,100,101] was not applied. An alternative way to validate results is to compare them with one or multiple models [102], as shown in Figure 7C,D. In these figures, leverage areas are plotted against heat maps of modeled, spatially lumped long-term means of area-specific sediment production on hillslopes.
The advantage is full spatial coverage of modeling results across the entire catchment, including areas where no data were available. The visual relation between hotspot areas and the upper quartile of sediment yield for each catchment is stronger than in Figure 7A,B, but high area-specific sediment production was also modeled in catchments where no hotspots were detected. Non-calibrated model runs were used for this comparison in order to point out possible misuse of modeling approaches in data-sparse regions. Even though graphical validation of observed versus modeled values (discharge and sediment load in the river) showed relatively good process representation for naturalized flow, the results themselves should be treated with caution and further compared with a calibrated model. Under constraints of data availability, attention should be paid to data pre-processing; model validation should go beyond model performance coefficients, and the uncertainty introduced by missing data should be carefully considered and discussed [103].
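As a minimal sketch of how the visual comparison described above could be made explicit, the following Python snippet cross-tabulates catchments flagged as hotspots against catchments in the upper quartile of modeled sediment production. The function name, the catchment identifiers, and the example memberships are hypothetical and are not taken from the study; the tabulation is only one possible way to summarize such a comparison.

```python
def cross_tabulate(catchments, hotspot_ids, high_yield_ids):
    """Group catchments by agreement between mapping and modeling.

    catchments     -- all catchment identifiers in the study region
    hotspot_ids    -- catchments where the mapping detected hotspots
    high_yield_ids -- catchments in the upper quartile of modeled
                      area-specific sediment production
    """
    all_ids = set(catchments)
    hot = set(hotspot_ids) & all_ids
    high = set(high_yield_ids) & all_ids
    return {
        "both": sorted(hot & high),             # mapping and model agree
        "hotspot_only": sorted(hot - high),     # mapped, not modeled as high
        "model_only": sorted(high - hot),       # modeled as high, no hotspot
        "neither": sorted(all_ids - hot - high),
    }


# Hypothetical example with five catchments.
table = cross_tabulate(
    catchments=["A", "B", "C", "D", "E"],
    hotspot_ids=["A", "B"],
    high_yield_ids=["B", "C"],
)
for group, members in table.items():
    print(group, members)
```

The "model_only" group corresponds to the case noted above, in which high sediment production is modeled in catchments where no hotspots were detected.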
A third possibility for validating mapping results where no validation data exist is an assessment based on local expert knowledge or participatory mapping [20,21,104]. The accuracy of participatory mapping approaches is limited by, among other factors: (i) the diversity of definitions of and approaches to participatory mapping; (ii) the spatial attributes measured in participatory mapping; (iii) sampling, participation, and data quality; and (iv) the relationships between participatory mapped attributes and physical places [105]. Participatory mapping further requires the active involvement of stakeholders (Figure 1), and its application in the study areas is a topic of ongoing discussions with Sudanese partners.
The primary motivation of risk, hotspot, or leverage area mapping is to support decision making in water and land management by providing means to prioritize hotspot areas through data analysis and/or visualization [106]. While the decision to prioritize management lies within the competence of the responsible stakeholders (including institutions), scientific results can support multiple decision support frameworks [107]. Scientists can assist hotspot prioritization for the variables and scales considered in their analysis in order to identify the temporal and spatial domains when/where action is required. Local knowledge or short-term lumped data series can be compared to mapping outputs for a given point in time in order to prioritize timescales and areas for ad hoc management interventions (Figure 8). Furthermore, environmental and societal factors can be used in analyses targeted at soil and water conservation measures [20,102]. Prioritization has been performed in multiple ways: (i) ordinary decision rule matrix [16]; (ii) threshold-based prioritization [20]; (iii) value-based classification, including heat maps or percentiles (e.g., [49] and Figure 7); and (iv) geostatistically based prioritization (e.g., statistics in the spatial-temporal domain [69,95,97], this study).
In doing so, the stakeholders' aims and the spatial and temporal scales of interest should be considered [108][109][110].
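The difference between threshold-based and percentile-based prioritization listed above can be illustrated with a short Python sketch. The catchment identifiers, the sediment yield values, the threshold, and the function names are hypothetical and serve only as an illustration; they are not data or code from this study.

```python
from statistics import quantiles


def prioritize_by_threshold(values, threshold):
    """Return ids whose value is at or above a fixed threshold."""
    return sorted(cid for cid, v in values.items() if v >= threshold)


def prioritize_by_upper_quartile(values):
    """Return ids whose value is at or above the 75th percentile."""
    # quantiles(..., n=4) returns the three quartile cut points;
    # method="inclusive" treats the data as the full population.
    _, _, q3 = quantiles(values.values(), n=4, method="inclusive")
    return sorted(cid for cid, v in values.items() if v >= q3)


# Hypothetical area-specific sediment yields per catchment.
sediment_yield = {
    "C1": 120, "C2": 45, "C3": 300, "C4": 80,
    "C5": 210, "C6": 60, "C7": 150, "C8": 95,
}

print(prioritize_by_threshold(sediment_yield, threshold=100))
print(prioritize_by_upper_quartile(sediment_yield))
```

A fixed threshold selects however many catchments exceed it, whereas a percentile always selects a fixed share of the catchments; which behavior is preferable depends on the stakeholders' aims and on the resources available for management.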

Common Challenges for Research and Management of Leverage Areas with Sparse Data Availability
In the previous sections, we discussed how a lack of available data limits the identification of sediment source areas contributing to reservoir siltation. We reviewed existing approaches and developed a simple method using a minimum inventory of globally available datasets that can be applied in situations where the additional precise data needed for physically based modeling are not available. This simple mapping approach builds upon the latest advancements in understanding connectivity as a factor of sediment delivery [111][112][113][114][115].
Our work highlights how a lack of data limits mapping in multiple ways. Firstly, it can limit the understanding of the factors controlling sediment delivery and of the relationships between them. Despite a good understanding of the functioning of bio-physical systems in semi-arid areas [37,38,116,117], local eco-hydrological and eco-geomorphological conditions can stimulate specific catchment sediment responses [118,119]. Regional studies showed that rainfall-discharge-sediment response curves and some of the controlling factors varied among catchments and/or seasons [14,35,44,120]. Events of similar magnitude can deliver differing amounts of sediment if intra- or inter-event catchment connectivity changes, or if stored sediment is released within an event [121]. Simplifying existing relations, or underrepresenting some processes in favor of others, is a recognized issue in mapping and modeling studies, and limited data availability can pose additional challenges [31]. Simplifications can be introduced by considering elementary factors with general, but possibly not site-specific, influences on sediment delivery (e.g., our approach). Alternatively, controlling factors lacking explanatory power in the given environment can be derived and mistakenly taken into consideration.
Secondly, data availability limits the spatial and temporal scales over which the landscape can be represented and understood. Globally available remote sensing and gridded products (meteorology, soils, etc.) offer information of great quantity and quality, with an indisputable spatial-temporal distribution. However, ground truthing is necessary to fully utilize their potential for applications such as the classification of land use, crop cover, and phenophases [122,123]. Like digital elevation models, remote sensing products (and their derivatives, such as those used in this study) have a defined resolution that can lead to misrepresentation of processes occurring at the sub-grid scale [17,124,125], which in turn can limit management at sub-grid scales. For example, most sediment management measures applicable in sediment source areas, such as check dams, buffer strips, soil cover management (crop, land use change, and geotextiles), terracing, and others [17,126], would ideally be planned at scales finer than those at which leverage areas can be distinguished from global datasets. Recent developments may enable researchers to use fine-resolution remote sensing products such as Sentinel-2, with a resolution of up to 10-20 m, in the near future. Such methods were not applied here because we lacked sufficient time series data (acquisition started in 2015). Spatial resolution is further important if scale-dependent variables (e.g., slope and flow path length) are analyzed [17,69,127]. Therefore, the scale should be carefully matched to the purpose of the analysis. It will differ between regional-scale mapping focused on distinguishing dominant sediment sources and the development of area-specific management plans. Additionally, when detailed scales are considered in management support systems, the usability of the minimum inventory (globally available datasets) decreases.
Thirdly, data availability can limit the applicability of advanced scientific methods, and possibly the validation of results. Comparing models or methods can sometimes substitute for validation or provide results that are more complex. However, the following should be considered: (i) many existing approaches consider similar variables and similar physical representations of the relations between them [31]; and (ii) although many models are available that account for all known influential variables, coarse data resolution, poor precision or quality, and/or general availability may limit their usage. One example encountered here is rainfall intensity and the associated kinetic energy, a crucial variable leading to sediment detachment. During a rainstorm event, rainfall intensity changes several times [128], but daily aggregates, or maximum 30-, 15-, or 6-min intensities, are used to calculate rainfall energy in soil erosion models, because finer-scale data on intensity variability are not available.
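The effect of temporal aggregation on rainfall energy can be illustrated with a minimal Python sketch. It assumes the unit-energy relation of Brown and Foster (1987) as used in RUSLE, e = 0.29 [1 - 0.72 exp(-0.05 i)] in MJ ha^-1 mm^-1 with intensity i in mm h^-1; this relation is not necessarily the one applied in the studies cited here. The storm record and the function names are hypothetical.

```python
import math


def unit_energy(intensity_mm_h):
    """Unit kinetic energy (MJ ha^-1 mm^-1) for a rainfall intensity in
    mm h^-1, after Brown and Foster (1987); an assumed relation."""
    return 0.29 * (1.0 - 0.72 * math.exp(-0.05 * intensity_mm_h))


def storm_energy(depths_mm, step_h):
    """Total kinetic energy (MJ ha^-1) of a storm recorded as rainfall
    depths (mm) in consecutive intervals of equal length step_h (hours)."""
    return sum(unit_energy(d / step_h) * d for d in depths_mm)


# Hypothetical storm: rainfall depth (mm) in six consecutive 10-min steps.
depths = [1.0, 2.0, 10.0, 5.0, 1.0, 1.0]

# Energy computed from the 10-min intensities.
fine = storm_energy(depths, step_h=10 / 60)

# Energy computed from the same rainfall lumped into one hourly value.
lumped = storm_energy([sum(depths)], step_h=1.0)

print(f"10-min record:    {fine:.2f} MJ/ha")
print(f"hourly aggregate: {lumped:.2f} MJ/ha")
```

In this hypothetical storm, the aggregated record yields a lower energy estimate than the 10-min record, because the short high-intensity burst is averaged out. The size of the difference depends on the storm and on the energy relation chosen.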
Many approaches have been developed to tackle the challenges posed by data unavailability. These include new acquisition methods, new models, and the development of performance metrics. Their development and application in land degradation research and other fields have grown in recent decades ([129], this study). However, in the current era of data-driven science, where terabytes of information are collected every second, some key components are still missing. This reflects institutional barriers or simply a lack of cooperation in data sharing and data coproduction. For example, water managers often record the quantity of water allocated to agriculture, but lack information on its actual distribution among farmers and fields, as well as on actual groundwater withdrawal. Additionally, managers lack information on field management practices, such as seeding/planting dates (influencing plant cover of soil, evaporation changing as plants grow, etc.), amounts of fertilizer, or numbers of cattle grazing on semi-natural pastures, all of which contribute to land and water quality degradation. The multi-factor character of many environmental problems is well acknowledged in science, but is not reflected in institutional structures and institutions' cooperation networks in many countries and transboundary basins [35,130,131]. The call for transdisciplinary research in the land degradation community has existed for a long time, and the need to address societal challenges is ever more widely acknowledged. Despite this, data sharing, open-source publishing, and truly inter-sectoral and transdisciplinary research projects are still uncommon scientific practices [29,132,133,134].
Researchers and managers are hereby invited to define the temporal and spatial scales over which land degradation and/or water balance should be managed. Furthermore, we encourage open, sharable, and suitable data acquisition, which supports the development and application of methods that do not omit or simplify the complexity and inter-scale characteristics of environmental issues linked to sediment delivery.
Another point raised during this study is how management considers the spatiotemporal patterns of sediment delivery and its controlling factors. In both research and management, it is necessary to account for the continuously changing nature of coupled socio-ecological systems, as well as the compounding non-linear responses of landscape systems [135]. In a continuously changing world, sediment management is likely to fail if it continues to target long-term mean values rather than variability. Sediment management might fall behind ongoing and future environmental trends if it continues to omit the most influential factors and their compound effects [128,136]. Ecohydrological hindcast modeling for water and sediment management [87,88] provides a step forward in addressing management challenges related to ongoing change. While ecosystem stewardship principles [137] that focus on preserving system functions are well suited for reservoir sediment management, the complex and systemic action needed to apply such principles is in direct conflict with the multiple barriers described in this study.
Therefore, spatial-temporal trend analysis of socio-ecological hazards related to reservoir siltation must be included in risk mapping. Furthermore, examples (such as in this study) of the impacts of data unavailability on the applicability of state-of-the-art scientific methods in management need to be discussed in the context of sustainable management of socio-ecological systems from the perspective of both scientists and managers.
Mohammed Saleim (DIU) are acknowledged for their help. We are thankful for fruitful discussions with scientists and practitioners in Iran, Sudan, Brazil, Tunis, Algeria and Morocco. We appreciate the inspiration provided by a growing number of anonymous academic and non-academic stakeholders who openly share their data, discuss their approaches and pitfalls, and bear with the challenges of open, transdisciplinary, and inter-sectoral research projects or science-policy dialogues.