Accuracy assessment of digital terrain model dataset sources for hydrogeomorphological modelling at Mediterranean catchments

Digital Terrain Models (DTMs) are currently a fundamental source of information in Earth Sciences. However, DTM-based studies can contain remarkable biases if the limitations and inaccuracies of these models are disregarded. In this work, four freely available datasets (SRTM C-SAR DEM, ASTER GDEM V2 and two airborne LiDAR-derived DTMs at 5 and 1 m spatial resolution, respectively) were analysed in a comparative study in three geomorphologically contrasted catchments located in Mediterranean geoecosystems under intensive human land use influence. Vertical accuracy, as well as the influence of each dataset's characteristics on its applicability to hydrological and geomorphological modelling, was assessed by using classic geometric and morphometric parameters and the more recently proposed index of sediment connectivity. Overall vertical accuracy, expressed as Root Mean Squared Error (RMSE) and Normalized Median Absolute Deviation (NMAD), was highest for the 1 m (RMSE = 1.55 m; NMAD = 0.44 m) and 5 m LiDAR DTMs (RMSE = 1.73 m; NMAD = 0.84 m). The vertical accuracy of SRTM was lower (RMSE = 6.98 m; NMAD = 5.27 m) but considerably higher than that of ASTER (RMSE = 16.10 m; NMAD = 11.23 m). All datasets were affected by systematic distortions. As a consequence, propagation of these errors had negative impacts on flow routing, stream network and catchment delineation and, to a lesser extent, on the distribution of slope values. These limitations should be carefully considered when applying DTMs for hydrogeomorphological modelling.


Introduction
Digital Terrain Models (DTMs) provide a continuous mathematical representation of the Earth's bare surface [1]. In hydrogeomorphological modelling, DTMs are the most important data inputs because the replication of landscape elements strongly influences model accuracy [2][3][4][5]. Adequate representation of landscape elements is often limited by intrinsic errors related to the data acquisition process as well as to the post-processing steps [6,7] used to generate the DTM. Moreover, DTM dataset characteristics such as spatial resolution [8][9][10][11] and vertical accuracy [13][14] may affect terrain representation and subsequent modelling.
Most of the DTMs used in hydrogeomorphological applications are satellite-derived datasets such as SRTM DEM [15] and ASTER GDEM [16], owing to their nearly global topographic coverage and unrestricted availability. However, their spatial resolution is coarse (~30 m) and their targeted vertical accuracy is ~16 m [17] and ~17 m [18], respectively. Airborne Light Detection and Ranging (LiDAR) derived DTMs are recognized to overcome these restrictions in spatial resolution and vertical accuracy [19][20][21][22], but their spatial coverage is limited by their cost-intensive acquisition and processing. Although the use of LiDAR DTMs is still mainly limited to small-scale modelling such as landslide and hillslope failure assessment [23][24][25], a broader range of potential applications in hydrology and geomorphology has been identified [26][27].
At the catchment scale, studies focussing on vertical accuracy and DTM effects on hydrogeomorphological modelling have relied heavily on satellite-derived SRTM and ASTER DTM data. These studies have been performed in a wide range of geomorphologically contrasted study areas, including low relief terrains [28][29][30] and steep gradient areas [31][32][33], as well as in different geographic regions such as mid-latitudes [34][35][36], tropics and subtropics [37][38][39]. However, Mediterranean environments are clearly underrepresented in such validation and comparison studies, apart from a few relatively old studies [40,41] and some more recent investigations on the detection of karst features in satellite-derived DTMs [42]. This underrepresentation must be seen as a critical point, as the Mediterranean region is arguably one of the most human-influenced areas on Earth. For instance, a major characteristic is the massive presence of traditional soil and water conservation structures constructed since Roman times [43]. However, changes in traditional land use systems, such as urban expansion and increases in irrigated agriculture [44], have altered complex geo-ecosystems, thereby favouring the progress of land abandonment and further degradation [45,46] as well as profound changes in hydrological systems [47].
In order to adequately address current and future hydrogeomorphological changes in Mediterranean environments, the use of reliable and accurate DTMs in modelling is fundamental. Accordingly, the objectives of this study were twofold. Firstly, the vertical accuracy of SRTM C-SAR DEM, ASTER GDEM V2 (each with a spatial resolution of 1 arc-second, ~30 m at the equator) and two airborne LiDAR DTM datasets with 5 and 1 m spatial resolution was assessed in three small catchments on the island of Mallorca (Spain), which are representative of characteristic Mediterranean landscapes with morphologies and land uses altered to different degrees by a long-standing history of human-induced changes. In doing so, the influence of anthropogenic features and vegetation patterns on the vertical accuracy of these DTMs was assessed. Secondly, the effects of DTM data sources and dataset characteristics on subsequent hydrogeomorphological modelling applications were evaluated by calculating widely used statistics and descriptors. In addition, representative plots were selected at the catchments to investigate the role of characteristic landscape features, including man-made artefacts, on the sediment connectivity index proposed by Borselli et al. [48] and improved by Cavalli et al. [49], as a critical assessment of water and sediment transfer between two landscape compartments or within a system. The results of this study are expected to expand the existing knowledge about the influence of DTM data source and dataset characteristics on terrain representation, and to provide valuable information about the applicability of remote sensing derived DTMs in hydrogeomorphological modelling beyond Mediterranean environments.

Study Area
Three small contrasting catchments (Sa Font de la Vila, Es Fangar and Es Telègraf; ranging from 3 to 5 km²) located on the island of Mallorca (Spain) were selected for this study (Fig. 1a). All catchments are located in the alpine Tramuntana Range, although each of them is characterized by a unique underlying morphology, with decreasing terrain complexity and land use intensity from the mountainous Es Telègraf through the mixed Es Fangar to the terraced Sa Font de la Vila catchment. This gradient in terrain complexity and land use intensity is caused by the underlying geological setting of the Tramuntana Range, which is aligned in a NE to SW direction with several folds and thrusts, and with relief energy and hillslope steepness increasing towards its northern part (Fig. 1b; [50]).
The southernmost catchment is Sa Font de la Vila (4.8 km²). It is located in the southwestern part of Mallorca (2°24'50'' E; 39°35'20'' N; Fig. 1c). The relief of the catchment is complex due to the interaction of soft and hard lithologies and different tectonics [50]. Moreover, the catchment is characterized by the massive presence of terraces, which are the most important land management practice, cover 37% of the whole catchment area and are supported by 147 km of dry-stone walls [51]. Since the mid-twentieth century, the abandonment of traditional agriculture in these marginal areas has caused an afforestation process with an increase in fuel availability [52]. As a result, two severe wildfires affected the catchment in 1994 and 2013. Current land use is mainly dominated by agriculture (44%), sparsely vegetated terrains (38%) and forests (18%).
The catchment with the highest relief energy and steepest hillslopes is Es Telègraf (named after a nearby homonymous mountain pass; 2°51'0'' E; 39°48'30'' N; Fig. 1c). It is located in the highest part of the Tramuntana Range (Fig. 1b). The main characteristic of the mountainous Es Telègraf catchment (ca. 3 km²) is its tectonic structure, dominated by a NW-directed thrust system including several cliffs and steep slopes. The vegetation shows a sharp contrast between the headwaters (25% of total area), dominated by sparse vegetation (the so-called culminal Balearic stage) [53], and the lower parts (75% of total area), mostly covered by shrublands and forested hillslopes.
Es Fangar catchment (3.2 km²; 2°60'00'' E, 39°50'10'' N; Fig. 1b) is mainly characterized by a combination of thrust and normal faults and synclinal-anticlinal structures, as well as by the presence of different land use classes including forests (31%), sparsely vegetated areas (20%) and agriculture (49%), the latter also affected by check-dam terraces and terraced fields. Thus, the catchment shows a mosaic of different land uses and management practices as a result of a complex interaction between human and natural environments [46]. It is therefore referred to as the "mixed" catchment.
Three small plots of 50,000 m², including the characteristic landscape elements in each of the three catchments (Fig. 1c), were selected to assess how the representation of typical Mediterranean landscape elements (caused either by natural conditions or by human activity) in DTMs can affect subsequent hydrogeomorphological modelling, using a morphometric hydrological and sediment connectivity index (see Section 3.2). The selected plot at the Sa Font de la Vila catchment was located on a terraced hillslope (Fig. 1d). At the mountainous Es Telègraf catchment, the plot was located in a high mountainous relief with bedrock outcrops and steep hillslopes of ca. ≥ 60% slope gradient (Fig. 1e). The selected plot was centred on a massive bedrock outcrop mainly formed by erosion-resistant limestone, while the surrounding area is dominated by softer materials. At the mixed Es Fangar, the plot encompassed a flat agricultural field covered by rainfed herbaceous crops and also affected by traditional drainage systems combining man-made channels and subsurface tile drains [54] (Fig. 1f).
[…] Sa Font de la Vila, 32 GCPs which had been surveyed in January 2016 following the same procedure and using the same equipment (covering mainly terraced hillslopes) were also included in the analysis, assuming there had been no changes (confirmed by field observations). As a result, a total of 140 GCPs were included in the accuracy assessments.
After field acquisition, the GPS measurements were converted from ellipsoidal to orthometric height to ensure comparability with the DTM datasets, which refer to the EGM1996 geoid (ASTER and SRTM) and to the EGM2008-REDNAP geodetic vertical datum (IGN data). The robustness of the accuracy assessment in terms of statistical measures was ensured by applying a threshold of a maximum absolute vertical error of 7 cm to the GPS data, which is nearly 3 times more accurate than the accuracy given by IGN for the 5 m LiDAR data [59]. Under this condition, the derived DTM accuracy had an error of 5%, which was considered tolerable [63].
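The height conversion and the 7 cm screening described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the processing chain used in the study: the geoid undulation values, the GCP heights and the function names are hypothetical, and the relation H = h - N (orthometric height = ellipsoidal height minus geoid undulation) is the standard first-order conversion.

```python
import numpy as np

def orthometric_height(h_ellipsoidal, geoid_undulation):
    """Convert ellipsoidal heights h to orthometric heights H = h - N."""
    return np.asarray(h_ellipsoidal, float) - np.asarray(geoid_undulation, float)

def keep_precise_gcps(heights, vertical_errors, max_abs_error=0.07):
    """Keep only GCPs whose absolute vertical error is <= max_abs_error (m)."""
    heights = np.asarray(heights, float)
    mask = np.abs(np.asarray(vertical_errors, float)) <= max_abs_error
    return heights[mask], mask

# Hypothetical values, for illustration only (all in metres).
h = [152.40, 310.15, 98.72]      # ellipsoidal heights from the GPS survey
N = [49.10, 49.25, 49.05]        # geoid undulations at the GCP locations
gps_error = [0.03, 0.12, 0.07]   # reported vertical error of each GCP

H = orthometric_height(h, N)
H_kept, mask = keep_precise_gcps(H, gps_error)
print(H_kept, mask)
```

Only the first and third points pass the 7 cm threshold in this example; the second would be excluded from the accuracy assessment.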
Vertical accuracy of the DTM datasets was expressed as an error statistic. In order to determine whether the underlying distribution of elevation errors follows a normal distribution, histogram plots were examined visually. Moreover, robust measures such as the Normalized Median Absolute Deviation (NMAD) and the 68.3 and 95.0% sample quantiles of the error between GPS and DTM derived elevation were computed. All these measures were reported by Höhle and Höhle [63] and Höhle [64] to be reliable even for non-normal distributions. NMAD is computed according to:
$$\mathrm{NMAD} = 1.4826 \cdot \mathrm{median}_j\left(\left|\Delta h_j - m_{\Delta h}\right|\right) \quad [1]$$
where $m_{\Delta h}$ denotes the median of the elevation errors $\Delta h_j$ computed from the $n$ differences between GCPs and DTM derived elevation values. For comparability with other accuracy assessments, the Root Mean Squared Error (RMSE) is also provided:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{j=1}^{n}\Delta h_j^{2}} \quad [2]$$
Additionally, the error statistics were computed separately for open terrain and densely vegetated areas in each catchment.
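The two accuracy measures can be computed in a few lines. The sketch below assumes that the elevation error is defined as GCP elevation minus DTM elevation (so that a negative error means the DTM overestimates) and that the sample quantiles are taken over absolute errors; both are assumptions for illustration, and the function name and input values are hypothetical.

```python
import numpy as np

def vertical_error_stats(z_gcp, z_dtm):
    """Return NMAD, RMSE and robust quantiles of the elevation error (m)."""
    dh = np.asarray(z_gcp, float) - np.asarray(z_dtm, float)  # assumed sign convention
    m_dh = np.median(dh)
    nmad = 1.4826 * np.median(np.abs(dh - m_dh))
    rmse = np.sqrt(np.mean(dh ** 2))
    return {
        "median": m_dh,
        "nmad": nmad,
        "rmse": rmse,
        "q68.3": np.quantile(np.abs(dh), 0.683),  # quantiles of |error| (assumption)
        "q95.0": np.quantile(np.abs(dh), 0.95),
    }

# Hypothetical GCP and DTM elevations, for illustration only.
stats = vertical_error_stats([100, 100, 100, 100, 100], [99, 101, 100, 102, 98])
print(stats)
```

For normally distributed errors NMAD and the standard deviation coincide, which is why the factor 1.4826 is applied; with outliers NMAD stays close to the spread of the bulk of the errors while RMSE grows.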

Quality assessment of DTMs for hydrogeomorphological modelling
All DTM datasets were projected to the UTM system (Zone 31 N, ETRS89 ellipsoid) to ensure comparable geolocation. The original vertical datum of the datasets remained unchanged, as the difference in height between EGM1996 and EGM2008-REDNAP was < 1 m at all study areas, which can be considered small compared to the vertical resolution and precision of the satellite-derived DTMs. In order to correct the data for hydrogeomorphological purposes, the Planchon and Darboux [65] surface filtering algorithm was applied, imposing a minimum slope gradient between grid cells of 0.1%. The algorithm adds a virtual layer of water to the DTM data, filling all depressions up to their outlet point. This results in a surface that ensures drainage even in flat areas. Elevation values before and after sink filling were compared to assess the influence of surface filtering.
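The before/after comparison can be expressed as the share of cells whose elevation was raised by the filling step. The sketch below does not implement the Planchon and Darboux algorithm itself; it only compares two rasters that are assumed to be already available as arrays. The function name, the tolerance and the example grids are hypothetical.

```python
import numpy as np

def filled_sink_share(dtm_raw, dtm_filled, tol=1e-6):
    """Percentage of valid cells raised by sink filling, and their mean fill depth (m)."""
    raw = np.asarray(dtm_raw, float)
    filled = np.asarray(dtm_filled, float)
    valid = np.isfinite(raw) & np.isfinite(filled)   # ignore NoData cells stored as NaN
    diff = filled[valid] - raw[valid]
    changed = diff > tol
    pct = 100.0 * changed.sum() / valid.sum()
    mean_fill = diff[changed].mean() if changed.any() else 0.0
    return pct, mean_fill

# Hypothetical 2 x 2 rasters: one cell was raised by 0.5 m during filling.
raw = [[10.0, 9.0], [8.0, 7.0]]
filled = [[10.0, 9.0], [8.0, 7.5]]
pct, mean_fill = filled_sink_share(raw, filled)
print(pct, mean_fill)
```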
Widely used geomorphometric parameters such as slope and flow direction, upslope contributing area and catchment area were computed using the ESRI ArcGIS® (Version 10.3) Spatial Analyst toolbox. Flow accumulation was obtained by applying the deterministic non-dispersive D8 algorithm [66], which simulates water flow in the direction of the steepest slope gradient. Stream networks were delineated from the flow accumulation raster by imposing a minimum upslope contributing area of 30 ha, which was considered to produce the most realistic results when compared to field observations.
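To make the D8 rule and the 30 ha threshold concrete, the following sketch assigns each cell the neighbour with the steepest downward gradient and converts the area threshold into a number of cells for a given resolution. It is a didactic illustration, not the ArcGIS implementation: directions are stored as indices into a hypothetical `OFFSETS` list rather than ArcGIS direction codes, and pits and flats are simply marked with -1.

```python
import numpy as np

# Neighbour offsets (row, column); the list index serves as the direction code here.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def d8_direction(dtm, cell_size):
    """Index of the steepest downslope neighbour per cell, or -1 for pits and flats."""
    dtm = np.asarray(dtm, float)
    rows, cols = dtm.shape
    direction = np.full((rows, cols), -1, dtype=int)
    for r in range(rows):
        for c in range(cols):
            best_slope = 0.0
            for k, (dr, dc) in enumerate(OFFSETS):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    dist = cell_size * np.hypot(dr, dc)   # diagonal neighbours are farther away
                    slope = (dtm[r, c] - dtm[rr, cc]) / dist
                    if slope > best_slope:
                        best_slope = slope
                        direction[r, c] = k
    return direction

def stream_threshold_cells(threshold_ha, cell_size):
    """Number of cells equivalent to a contributing-area threshold given in hectares."""
    return threshold_ha * 10_000 / cell_size ** 2

example = [[3, 3, 3], [3, 2, 3], [3, 3, 1]]   # hypothetical elevations (m)
directions = d8_direction(example, cell_size=1.0)
cells_5m = stream_threshold_cells(30, 5)
cells_1m = stream_threshold_cells(30, 1)
print(directions, cells_5m, cells_1m)
```

The same 30 ha threshold corresponds to 12,000 cells at 5 m resolution and 300,000 cells at 1 m, but only to a few hundred cells at ~30 m, which illustrates how few cells describe the hillslope domain in the satellite-derived models.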
In addition to these basic geomorphometric parameters, the hypsometric curve [67], the slope-area relationship [68,69] and the cumulative area distribution function [70] were computed. The hypsometric curve relates relative area to relative elevation and allows the estimation of runoff response, dominant erosion processes [71] and landform maturity [72]. Thus, it is a simplistic measure of the mass and energy stored within a landscape. Furthermore, the slope-area relationship (defined as the mapping of local slope gradient to contributing area) and the cumulative area distribution (defined as the proportion of catchment area that has a drainage area greater than or equal to a specified drainage area) provide information about characteristic fluvial processes [69] and flow aggregation structures [70], respectively. In more detail, slope-area plots are usually employed to determine an inflection point that separates relatively small catchment areas dominated by interrill and rain splash erosion from flatter areas characterized by fluvial erosion and transport [69]. In theory, the shape of the slope-area plot of a catchment resembles a "boomerang", with the roll-over (inflection) point corresponding to the threshold in drainage area where hillslopes transition into channels [73]. As with the slope-area relationship, breaking points between hillslopes and channels can be identified in the cumulative area distribution function [70]. Moreover, the cumulative area relationship is focused on flow aggregation structures of the stream network [74] and on aggregation structures of hillslope elements [75]. Herein, the leftmost section of the distribution curve is associated with diffusive erosional processes at hillslopes, whereas the middle section describes channelized flow as a straight line in log-log space and the rightmost section is related to boundary effects. In principle, such an organization of flow structures can be determined for all datasets at all catchments [74].
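As an illustration of the hypsometric descriptors, the sketch below derives the hypsometric curve from a DTM array and approximates the hypsometric integral by the elevation-relief ratio, (mean - min) / (max - min). Using this ratio as the integral is an assumption made here for brevity, and the function names and the example grid are hypothetical.

```python
import numpy as np

def hypsometric_curve(dtm, n_points=101):
    """Relative area lying at or above each relative elevation level."""
    z = np.asarray(dtm, float)
    z = z[np.isfinite(z)]
    rel_z = (z - z.min()) / (z.max() - z.min())
    levels = np.linspace(0.0, 1.0, n_points)
    rel_area = np.array([(rel_z >= lv).mean() for lv in levels])
    return rel_area, levels

def hypsometric_integral(dtm):
    """Elevation-relief ratio, used here as an approximation of the integral."""
    z = np.asarray(dtm, float)
    z = z[np.isfinite(z)]
    return (z.mean() - z.min()) / (z.max() - z.min())

example = [[0.0, 50.0], [50.0, 100.0]]   # hypothetical elevations (m)
rel_area, levels = hypsometric_curve(example)
hi = hypsometric_integral(example)
print(hi, rel_area[0], rel_area[-1])
```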
In addition, to assess the effect of the DTMs in terms of landscape stability and soil erosion, the empirical hill-length and slope steepness factor (LS factor) was computed. In its original formulation, as part of the Universal Soil Loss Equation (USLE) [76], the morphological LS factor is computed as:
$$LS = L \cdot S = \left(\frac{\lambda}{22.1}\right)^{m} \cdot \left(0.065 + 4.56\,\sin\theta + 65.41\,\sin^{2}\theta\right) \quad [3]$$
where $L$ accounts for the length of the hillslope, with $\lambda$ being the erosive hillslope length that is usually calculated from flow length, $m$ the hill-length exponent, which depends on the hillslope angle $\theta$ [rad], and 22.1 m the approximate length of the parcels used in the USLE experiment. $S$ denotes the steepness of the terrain and depends upon the hillslope angle $\theta$. However, due to its widespread use and better performance, the improved approach developed by Desmet and Govers [77] was used, as implemented in SAGA GIS [78]; it replaces flow length by upslope contributing area for LS factor calculation and was applied assuming a rill/interrill erosivity of 1.
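The original USLE formulation can be evaluated directly, as in the sketch below. It is not the Desmet and Govers variant applied in the study, and the exponent is passed in as a plain parameter with an illustrative default of 0.5 because its slope-dependent definition is not reproduced here; the function name is hypothetical.

```python
import numpy as np

def usle_ls(slope_length_m, slope_rad, m=0.5, unit_length_m=22.1):
    """LS factor in the original USLE form; m = 0.5 is an illustrative default."""
    lam = np.asarray(slope_length_m, float)
    theta = np.asarray(slope_rad, float)
    length_factor = (lam / unit_length_m) ** m
    steepness_factor = 0.065 + 4.56 * np.sin(theta) + 65.41 * np.sin(theta) ** 2
    return length_factor * steepness_factor

# On a slope as long as the reference parcel, LS reduces to the steepness term.
ls_flat = usle_ls(22.1, 0.0)
ls_unit = usle_ls(22.1, np.arcsin(0.0896))   # roughly a 9% slope
print(ls_flat, ls_unit)
```

The second call returns a value close to 1, which reflects the normalisation of the factor to a reference parcel of about 22.1 m length on a slope of roughly 9%.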
The sediment Connectivity Index (IC) proposed by Borselli et al. [48] and improved by Cavalli et al. [49] was applied, considering that a dynamic assessment of landscape connectivity helps to incorporate aspects of the process linkages that essentially drive sediment flux [79]. It is therefore a dynamic property of a catchment, indicating the probability of a particle at a certain location reaching a defined target area, which in this study was set at the catchment outlet; its effects were analysed at a representative plot from each catchment (Fig. 1):
$$IC = \log_{10}\left(\frac{D_{up}}{D_{dn}}\right) = \log_{10}\left(\frac{\bar{W}\,\bar{S}\,\sqrt{A}}{\sum_{i}\frac{d_i}{W_i S_i}}\right) \quad [4]$$
where $D_{up}$ and $D_{dn}$ are the up- and downslope components respectively, $\bar{S}$ the average percentage slope, $A$ the size of the upslope contributing area, $\bar{W}$ an averaged weighting factor representing terrain roughness and $d_i$ the flow length of the $i$-th cell along the steepest downslope direction, with $W_i$ and $S_i$ being the weighting factor and slope of that cell. IC was calculated by using the freely available SedInConnect (Version 2.3) software developed by Crema and Cavalli [80].
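For a single cell, the index reduces to a ratio of two scalars, which the following sketch evaluates. It is not the SedInConnect implementation: the function name and all input values are hypothetical, and slope gradients are given in m/m here as an assumption (a different slope unit shifts IC by a constant).

```python
import numpy as np

def connectivity_index(mean_weight_up, mean_slope_up, area_up_m2,
                       path_lengths, path_weights, path_slopes):
    """IC = log10(D_up / D_dn) for one cell and its downslope flow path."""
    d_up = mean_weight_up * mean_slope_up * np.sqrt(area_up_m2)
    d = np.asarray(path_lengths, float)
    w = np.asarray(path_weights, float)
    s = np.asarray(path_slopes, float)
    d_dn = np.sum(d / (w * s))   # weighted flow-path length to the target
    return np.log10(d_up / d_dn)

# Hypothetical cell: 1 ha contributing area, two 10 m steps down to the target.
ic = connectivity_index(1.0, 0.1, 10_000.0, [10.0, 10.0], [1.0, 1.0], [0.1, 0.1])
print(ic)
```

In this example D_up = 10 and D_dn = 200, so IC = log10(0.05), about -1.30. A longer or gentler downslope path increases D_dn and lowers the index, whereas a larger or steeper contributing area raises it.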

DTM vertical accuracy
Errors in elevation were observed in all analysed datasets when comparing DTM derived elevation values against the elevation values obtained from GCPs surveyed with a dGPS (Table 1). Accuracy increased from ASTER through SRTM to the 1 m IGN LiDAR model. A systematic overestimation of elevation values was observed for all datasets, as indicated by the relative frequency plots (Fig. 2).
According to the RMSE (16.10 m) and NMAD (11.23 m) values, ASTER showed the lowest overall accuracy among all datasets. The frequency distribution of elevation errors showed that ASTER overestimated elevation values, although notable underestimations were also identified (Table 1). No clear difference between vegetated and open terrain could be derived from the frequency distributions, as over- and underestimation were similarly distributed for both land cover types (Fig. 2).
The SRTM DTMs showed an overall accuracy of 6.98 m (RMSE) and 5.27 m (NMAD), and SRTM tended to overestimate surface elevation, as depicted in the histogram of elevation errors (Fig. 2). Among the three studied catchments, SRTM accuracy in terms of RMSE was worst at the terraced Sa Font de la Vila catchment (8.28 m), whereas the mountainous Es Telègraf catchment showed the lowest accuracy in terms of NMAD (Table 1). The highest accuracy of the SRTM DTMs was observed at the mixed Es Fangar (RMSE = 4.06 m; NMAD = 2.98 m; Table 1). Comparing open and densely vegetated terrains (green bars in Fig. 2), dense vegetation led to an overestimation of elevation, while slight underestimation occurred at open-terrain points. The accuracy of the IGN LiDAR DTMs was considerably higher for both spatial resolutions (i.e., RMSE = 1.73 m and NMAD = 0.84 m for the 5 m model; RMSE = 1.55 m and NMAD = 0.44 m for the 1 m model). For both LiDAR models, elevation at open-terrain GCPs was slightly overestimated, whereas vegetated points revealed minor underestimation of elevation values (Fig. 2).

Basic terrain statistics
Descriptive statistics of minimum, maximum and mean elevation as well as average slope values before and after applying surface filtering are summarized in Table 2. Surface filtering did not significantly affect the terrain models, as the percentage of filled sink areas was between 0.1 and 0.3% for SRTM, 0.6 and 1.7% for ASTER and < 0.1% in the case of the LiDAR models. Since the DTM-derived catchment areas were partly different (Table 2), absolute values of relief (difference between maximum and minimum elevation) cannot be directly compared. SRTM and ASTER showed clearly larger catchment areas at the mountainous Es Telègraf catchment (i.e., 3.55 and 3.17 km², respectively) than the LiDAR models (i.e., 2.73 and 2.72 km² for the 5 and 1 m model, respectively); conversely, the ASTER derived catchment areas coincided well with the LiDAR derived values at the terraced Sa Font de la Vila and at the mixed Es Fangar catchment. Among all datasets, SRTM showed the largest catchment extent at Sa Font de la Vila (i.e., 5.06 km²) and the smallest at the mixed Es Fangar (i.e., 3.04 km²). The differences between the two LiDAR models were negligible at all catchments. In addition to the observed discrepancies regarding the catchment area, a clear and general increase in the average slope from SRTM through ASTER to the 1 m LiDAR model can be seen in Table 2.

Geomorphometric parameters
Both the hypsometric integral and LS factor decreased from the mountainous Es Telègraf over the terraced Sa Font de la Vila to the mixed Es Fangar catchment, whereas flowlength was largest at the Sa Font de la Vila catchment (Table 3) that also had the largest catchment areas (Table 1).
The hypsometric integral values were nearly equivalent for all datasets except at the mountainous Es Telègraf catchment, where SRTM showed a higher integral value than the other models (Table 3). In contrast, clear differences among the datasets appeared in the mean and SD values of flowlength at all catchments (Table 3). SRTM flowlength was the smallest at all catchments, followed by ASTER at the Es Telègraf and Sa Font de la Vila catchments, but not at the Es Fangar catchment, where ASTER generated the largest value. IGN LiDAR produced the largest flowlength values except at the mixed Es Fangar catchment. However, a notable difference in magnitude (i.e. decametres) was observed between the two IGN LiDAR models, with the higher values (mean and SD) being generated by the 1 m dataset (Table 3). Considering the LS factor, IGN LiDAR values (both mean and SD) were larger at all catchments than those obtained from the other datasets, followed by SRTM at the mountainous Es Telègraf and the terraced Sa Font de la Vila catchments, but by ASTER at the mixed Es Fangar catchment (Table 3). Again, the mean LS values obtained from the IGN LiDAR 1 m dataset were much larger (i.e. twice as large) than those obtained from the 5 m dataset, and the SD was also considerably higher (Table 3).
A further comparison between datasets is provided in Figure 3, which plots the hypsometric curve (i.e. relative elevation vs relative area), the slope-area relationship (i.e. slope vs contributing area; simplified by binning the data into 200 equidistant classes of contributing area and calculating the corresponding average slope per class to facilitate visual interpretation) and the cumulative area distribution (i.e. cumulative area vs contributing area). Hypsometric curves were similar for all catchments and models except SRTM (Figure 3a). Slope-area plots, which describe erosional processes in a catchment, allowed the detection of breaking points separating different sections of the catchments for the two IGN models but not for the SRTM and ASTER ones (Figure 3b). Finally, the cumulative area distribution plots also allowed breaking points between hillslopes and channels to be identified (Figure 3c). However, such breaking points could hardly be determined in the case of the satellite-derived DTMs, as the hillslope region was described by a very small number of cells (leftmost section of the cumulative area plots).
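The binning of the slope-area data and the cumulative area distribution can be sketched as follows. The sketch assumes classes that are equidistant on a linear scale of contributing area (the text does not state whether the spacing is linear or logarithmic), and the function names and example values are hypothetical.

```python
import numpy as np

def binned_slope_area(area, slope, n_bins=200):
    """Mean slope per equidistant class of contributing area (empty classes dropped)."""
    area = np.asarray(area, float).ravel()
    slope = np.asarray(slope, float).ravel()
    edges = np.linspace(area.min(), area.max(), n_bins + 1)
    idx = np.clip(np.digitize(area, edges) - 1, 0, n_bins - 1)
    sums = np.bincount(idx, weights=slope, minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    centres = 0.5 * (edges[:-1] + edges[1:])
    keep = counts > 0
    return centres[keep], sums[keep] / counts[keep]

def cumulative_area_distribution(area):
    """Proportion of cells whose contributing area is >= each observed value."""
    a = np.sort(np.asarray(area, float).ravel())
    first = np.searchsorted(a, a, side="left")   # index of first occurrence of each value
    return a, 1.0 - first / a.size

# Hypothetical contributing areas (m²) and slopes (m/m) of four cells.
areas = [1.0, 1.0, 2.0, 4.0]
slopes = [0.4, 0.2, 0.3, 0.1]
centres, mean_slope = binned_slope_area(areas, slopes, n_bins=3)
a_sorted, exceedance = cumulative_area_distribution(areas)
print(centres, mean_slope, exceedance)
```

Both outputs are normally plotted on log-log axes, where the roll-over of the slope-area curve and the breaks of the exceedance curve become visible.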

Stream Networks
Using a uniform threshold of 30 ha of upslope contributing area, stream networks were delineated from all terrain models (Fig. 4). At the mountainous Es Telègraf catchment, SRTM produced straight-line stream features which did not reflect the actual terrain topography. Furthermore, the satellite-derived models indicated the existence of second order streams which were not represented in the IGN LiDAR data. It should be noted that Es Telègraf was also the catchment with the largest differences in delimited catchment area among the datasets (Fig. 4; Table 1). In contrast, a better consistency between the models was observed at the terraced Sa Font de la Vila catchment, despite some shifts in the position of second order streams at the headwaters and at the outlet point. The position of the outlet was identical for all models at the mixed Es Fangar catchment, but remarkable differences were again observed in the second order streams; here ASTER showed the largest deviations, in contrast to its good match with the IGN LiDAR data in the other catchments. In the terraced Sa Font de la Vila catchment, the stream networks matched well, although a few sections showed discrepancies. As a result, the area of interest (AOI) presented in

Analysis of the Connectivity Index at plot scale
In all three catchments, the coarser spatial resolution of the satellite-derived models (~27 m) limited the applicability of the IC, whereas the IC estimated from the LiDAR DTMs was useful, with the 1 m data providing more detail than the 5 m data (Fig. 5). Larger differences between SRTM, ASTER and the IGN LiDAR models were accordingly observed. SRTM and LiDAR datasets showed disconnecting effects of a massive outcrop located at the mountainous Es Telègraf catchment (see corresponding orthophoto in Fig. 1) that enforced high values of IC along a pathway from NW to SE, whereas such patterns were hardly visible in the SRTM derived map (Fig. 5). The representation of the connectivity pathway in the SRTM model is also limited, as the high connectivity area appeared blurred and not as sharp as in the LiDAR data.
However, the IC applied to the ASTER dataset did not reproduce the disconnection and thus the concentration effect on sediment connectivity, despite the existing bedrock formation in the easternmost part of the plot. At the Sa Font de la Vila catchment, where the plot was centred on a terraced hillslope, similar spatial patterns of connectivity were observed for the two satellite-derived models, neither of which represented the effects of terracing on connectivity (Fig. 5). The dense dendritic high connectivity network in the N-S direction reproduced by the IGN LiDAR derived maps was replaced by a nearly uniform area of high connectivity in the case of SRTM and ASTER. In addition, the effects generated by terracing were better represented by the 1 m than by the 5 m LiDAR DTM.
Finally, slightly different results were observed at the Es Fangar catchment (Fig. 5), where the plot was located in an agricultural field delimited by a road and an artificial channel flowing at the southeastern corner (Fig. 5). Both SRTM and LiDAR data showed a high-connectivity pathway from NE to SW in the eastern part, while the ASTER derived IC values were highest in the northern part of the plot and lowest in the south, without any linear features. It should be noted that the position of the line of highest IC values did not correspond exactly with the artificial channel. Further comparison of SRTM and LiDAR-derived IC patterns revealed that SRTM could not resolve minor features indicating high connectivity crossing the plot from N to S.

Vertical Accuracy
The vertical accuracy results of the satellite- and airborne-derived models (Table 1) were in good agreement with other similar studies in hilly and mountainous areas (e.g. [31,32,81]), although these were mainly focused on regions whose climate, vegetation patterns and land use systems differ strongly from those of Mediterranean areas. SRTM was less affected by systematic errors than ASTER, as indicated by the corresponding histogram plots, and its errors tended to show minor deviations from the normal error distribution (Fig. 2). SRTM clearly outperformed ASTER (lower RMSE and NMAD) at all catchments and vegetation classes [58,39].

Datasets evaluation
The accuracy of the IGN LiDAR models was considerably higher than that obtained from SRTM and ASTER. However, the vertical accuracy of 0.2 m officially stated by the IGN for the 5 m DTM was never reached. The error magnitude (RMSE and NMAD) of both LiDAR models was in agreement with Estornell et al. [82] and Simpson et al. [22], with the 1 m dataset having the lowest RMSE and NMAD at all study sites and vegetation classes. The LiDAR data showed shifts of the centre of the elevation error distribution towards negative values (Fig. 2), although the deviation and error magnitude were clearly lower than for the satellite-derived models. Additionally, the increase in spatial resolution from 5 to 1 m in the LiDAR derived models did not involve remarkable changes in the frequency distribution of the errors. This might be related to the fact that both models were generated from the same 3D point cloud, to which the same pre-defined classification scheme was applied to separate ground from non-ground returns; thus, identical classification errors are likely to affect both models [83,84]. The point density of the underlying point cloud was ca. 1.65 points per m² on average at all three catchments, with the highest density values in flat areas reaching up to 10 points per m², whilst the lowest values were found at steep vegetated hillslopes (> 1 point per m² on average). Therefore, grid interpolation algorithms such as the multilevel B-spline interpolation employed here are likely to produce artefacts in the 1 m data wherever filling data gaps becomes necessary. Gap filling, which particularly affects vegetated areas, accounts for non-negligible errors in the resulting DTM [85,86].

Assessing the effects of catchment characteristics on the datasets
The classification of elevation errors by land use type and catchment revealed further information about the DTM datasets. Regarding land use, the satellite-derived models partially overestimated the elevation at vegetated areas, whereas it was slightly underestimated at open terrain areas (Table 1). In the case of SRTM, Kellndorfer et al. [87] and Ludwig and Schneider [35] reported that vegetation introduces a negative bias (i.e. overestimation) when comparing model-derived elevation to in-situ derived values. This can be explained by the relatively short wavelength (i.e. 5.6 cm) of the C-band SAR used for DTM generation, which is mainly affected by (back-)scattering within the vegetation canopy rather than at the true ground [80,88,89]. Similar findings can be drawn from the ASTER results, despite the fact that ASTER imagery, as an optical remote sensing system, captures the top of the vegetation canopy (e.g. tree-tops). Consequently, a larger offset to the true surface elevation is expected in comparison to SRTM. In this context, Li et al. [90] defined ASTER as a "first-return system". In contrast, last returns are usually taken from discrete waveform LiDAR data to ensure that backscattering occurs as close as possible to the bare surface. However, as mentioned above, severely filtered LiDAR returns are reported to affect DTM accuracy. Clark et al. [83] showed that classification and, thus, filtering errors resulted in a less predictable LiDAR DTM accuracy, observing that accuracy depended more upon the topographic curvature than upon the vegetation density. Furthermore, the two IGN LiDAR models showed under- and overestimation of the elevation values at both bare and vegetated areas (Table 1), contrasting with the results of Harding et al. [90], who observed higher underestimation in areas with dense vegetation compared to open terrain due to the dispersion of LiDAR pulses within the canopy. The results of the present work are more consistent with Clark et al. [83], who observed that the precision of the models can vary when different vegetation patterns and types are combined with different slope gradients and morphological characteristics. Estornell et al. [82] also reported that the effect of slope on LiDAR DTM accuracy may be even larger than the effect of vegetation. Accordingly, in the studied Mediterranean catchments, the effect of vegetation on LiDAR-based DTM accuracy may be less important than other error sources related to morphologically complex features (e.g. terraces or artificial channels) and the associated technical limitations of filtering or interpolation processes in these areas.

Evaluating the effect of slope and morphology on DTM accuracy
SRTM showed low vertical accuracy at the mountainous Es Telègraf and the terraced Sa Font de la Vila catchments (Table 1), probably due to geometric distortions in the underlying RADAR imagery. Such distortions are caused by shadowing effects under extreme viewing conditions in very steep and fragmented areas [91,92]. These findings may also explain the straight-line stream features observed at the mountainous Es Telègraf catchment (Fig. 5), which are likely related to severe intrinsic errors of the SRTM data in highly mountainous areas. Conversely, optical imagery is less affected by relief-induced distortions because viewing angles are usually small. Nevertheless, the vertical accuracy of the ASTER model was extremely poor at the Es Telègraf catchment. Such poor performance can be directly related to the number of stack layers used for the DTM generation, which was between 2 and 7 in the specific case of Es Telègraf. Low numbers of stack layers (i.e. <10) are reported to remarkably reduce the vertical accuracy of the ASTER model, as the amount of residual artefacts is large (ASTER-GDEM-Validation-Team, 2011). These findings were confirmed in steep terrain [93] as well as in low-gradient areas [28]. Therefore, in the case of ASTER the main problem was the insufficient amount of data available for the DTM generation, while in the case of SRTM the intrinsic errors of the underlying remotely sensed data were mainly responsible for the poor performance related to slope and terrain morphology. Thus, these variables (i.e. slope and terrain morphology) must be considered as limiting factors for SAR-derived terrain models.
An insufficient number of data points might also be responsible for the observed deviations in the vertical accuracy of the LiDAR data, due to vegetation but also to the morphological complexity added by anthropogenic features. The model performance was relatively good at the mountainous Es Telègraf catchment (second in the rank; Table 1), being considerably better than at the terraced Sa Font de la Vila catchment. The relatively poor results obtained at the Sa Font de la Vila catchment might be related to the fragmented surface [94], which favours a frequent change between oversampling in the flat areas of terraces and undersampling at the abrupt elevation changes along the dry-stone walls, an effect that could be further enhanced by erroneously classified LiDAR returns [82]. As a consequence, characteristic landscape elements might be missed. At the Es Telègraf and Es Fangar catchments, where the terrain is also complex but less fragmented, these effects are also likely to be present, but to a lower extent.

Assessing the reliability of DTMs for hydrogeomorphological modelling
The reliability of hydrogeomorphological modelling applications mainly depends upon the accuracy with which topographic models can replicate the landscape morphology. In this study, remarkable differences were found between the descriptive statistics and hydrogeomorphological characteristics of the four assessed datasets.

Basic terrain attributes
Hypsometry (i.e., both hypsometric integral and hypsometric curve) was not observed to be clearly sensitive to DTM grid size or data source (Fig. 4 and Table 3). Therefore, the ability of the DTMs to replicate general patterns of mass and energy stored within a landscape is believed to be nearly independent of the horizontal resolution and dataset characteristics. This independence is mainly related to the use of relative values of elevation and area, which outweigh the absolute differences [95]. Results also showed a consistent smoothing of the slope values in the case of SRTM and ASTER at all catchments (Table 2). The coarse spatial resolution of these models (i.e. 1 arc-second) caused a certain oversimplification of the landscape elements, favouring a considerable loss of hydrological and geomorphological detail, similar to results obtained in other geographic regions (e.g. [10,11,96]). Consequently, such coarse DTM resolution not only underestimated hillslope gradients and thus simplified the landscape elements, but also affected the subsequent modelling results in terms of physical measures. More specifically, Zhang and Montgomery [8] reported the implications of slope value distributions for process-based hydrological modelling, as the derived transport laws were not analogous to those obtained from field studies, especially in small catchments.
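The role of relative values in hypsometry can be illustrated with a short sketch. It assumes the elevation-relief ratio, (mean - min) / (max - min), as an approximation of the hypsometric integral and compares it with the area under a numerically sampled hypsometric curve. The surface, the 7 m offset and all function names are hypothetical. The example only shows that a uniform vertical bias cancels out when relative elevations are used; it does not reproduce the resolution comparison of this study.

```python
import numpy as np


def hypsometric_curve(dem, n_levels=101):
    """Relative area lying at or above each relative elevation level."""
    z = np.asarray(dem, dtype=float).ravel()
    z = z[np.isfinite(z)]
    rel_h = np.linspace(0.0, 1.0, n_levels)
    thresholds = z.min() + rel_h * (z.max() - z.min())
    rel_area = np.array([(z >= t).mean() for t in thresholds])
    return rel_h, rel_area


def hypsometric_integral(dem):
    """Elevation-relief ratio, used here as an approximation of the integral."""
    z = np.asarray(dem, dtype=float).ravel()
    z = z[np.isfinite(z)]
    return float((z.mean() - z.min()) / (z.max() - z.min()))


# Hypothetical surface (m a.s.l.), not one of the studied catchments.
y, x = np.mgrid[0:100, 0:100].astype(float)
dem = 200.0 + 2.0 * x + 1.0 * y + 10.0 * np.sin(x / 6.0)
biased = dem + 7.0  # same surface with an assumed constant vertical offset

rel_h, rel_area = hypsometric_curve(dem)
# Trapezoidal area under the hypsometric curve.
area_under_curve = float(np.sum((rel_area[1:] + rel_area[:-1]) / 2 * np.diff(rel_h)))

print(round(hypsometric_integral(dem), 3),
      round(hypsometric_integral(biased), 3),
      round(area_under_curve, 3))
```

Errors that vary in space, such as the artefacts discussed for ASTER, do not cancel in this way, so the sketch should be read as an explanation of the mechanism rather than as a guarantee of robustness.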
Differences in the distribution of the slope values were observed among all datasets, with slope being one of the most sensitive parameters and showing a clear non-linear relationship between the horizontal raster resolution and the derived terrain representation. In this case, only the differences among the LiDAR models were mainly related to spatial resolution, while the ASTER-derived average slope values exceeded those obtained from SRTM (Table 2). Thus, the residual artefacts and errors related to the intrinsic characteristics of the dataset were relevant not only in terms of vertical accuracy but also in terms of landscape representation, such as stream network patterns and catchment area delineation. This was also supported by the observed differences in flow routing characteristics (Table 3).
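The smoothing of slope values at coarse resolution can be demonstrated on a synthetic surface. The sketch below is an illustration under stated assumptions: the surface (a regional gradient with 20 m wide ridges), the block-averaging step used to imitate a 30 m grid and the function names are all hypothetical, and slope is derived with simple finite differences rather than with the GIS tools used in this study.

```python
import numpy as np


def block_mean(dem, factor):
    """Coarsen a grid by averaging non-overlapping factor x factor blocks."""
    z = np.asarray(dem, dtype=float)
    rows = (z.shape[0] // factor) * factor
    cols = (z.shape[1] // factor) * factor
    z = z[:rows, :cols]
    return z.reshape(rows // factor, factor, cols // factor, factor).mean(axis=(1, 3))


def slope_degrees(dem, cell_size):
    """Slope in degrees from finite differences on a regular grid."""
    dz_dy, dz_dx = np.gradient(np.asarray(dem, dtype=float), cell_size)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))


# Hypothetical 1 m surface: a regional 30 % gradient plus 20 m wide ridges.
y, x = np.mgrid[0:300, 0:300].astype(float)
fine = 0.3 * x + 3.0 * np.sin(2 * np.pi * x / 20) + 3.0 * np.sin(2 * np.pi * y / 20)
coarse = block_mean(fine, 30)  # 30 m cells, comparable to a 1 arc-second grid

mean_fine = slope_degrees(fine, 1.0).mean()
mean_coarse = slope_degrees(coarse, 30.0).mean()
print(f"mean slope: 1 m grid = {mean_fine:.1f} deg, 30 m grid = {mean_coarse:.1f} deg")
```

In this example the ridges are narrower than the coarse cell, so the 30 m grid retains little more than the regional gradient. The same mechanism would apply to terraces and dry-stone walls that are smaller than the cell size.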

Geomorphic parameters
In order to check the ability of the investigated DTMs to adequately address hydrogeomorphological processes, different geomorphometric parameters were used (Fig. 4). The slope-area relationship and the cumulative area distribution indicated a poor performance at the hillslope scale and, hence, a poor ability to assess erosion and flow accumulation processes in the case of the satellite-derived models. In contrast, both LiDAR models performed better in those areas dominated by interrill and splash erosion. These types of erosion characterize effective catchment areas that contribute sediment to the catchment conveyor belt [97]. Despite the advances in new technologies and geographic information system (GIS) tools, a meaningful assessment of the physical structure of catchments and a holistic understanding and, thus, representation of hydrogeomorphological processes require DTMs that can provide an interpretation of dominant processes at different spatial and temporal scales within a catchment. Thus, SRTM and ASTER might be insufficient for hydrogeomorphological modelling in terms of sediment transfer processes in small (Mediterranean) catchments. Additionally, both SRTM and ASTER illustrated very similar patterns in slope-area impossible the detection of such traditional drainage systems. Consequently, the model-derived stream networks and pathways of sediment connectivity could produce a drainage system that comes close to the "original" state of the system, but they are still unable to reproduce the current state caused by man-made changes (even in the case of the high-resolution 1 m LiDAR model). Thus, the use of LiDAR models with very high resolution is only recommended if the point density is high enough. Otherwise, the increase in computing time and required hardware resources cannot be justified in terms of increased hydrological detail and accuracy.
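The slope-area relationship referred to above is commonly summarised by averaging local slope within logarithmically spaced classes of contributing area. The following sketch assumes that per-cell contributing area and slope have already been derived from a DTM with any flow-routing tool; the binning scheme, the function name and the synthetic power-law data are hypothetical and are not taken from this study.

```python
import numpy as np


def slope_area_curve(area, slope, n_bins=10):
    """Mean slope within logarithmically spaced contributing-area bins."""
    area = np.asarray(area, dtype=float).ravel()
    slope = np.asarray(slope, dtype=float).ravel()
    ok = np.isfinite(area) & np.isfinite(slope) & (area > 0) & (slope > 0)
    area, slope = area[ok], slope[ok]
    edges = np.logspace(np.log10(area.min()), np.log10(area.max()), n_bins + 1)
    # Assign each cell to a bin; values on the outer edges go to the end bins.
    idx = np.clip(np.digitize(area, edges) - 1, 0, n_bins - 1)
    centers = np.sqrt(edges[:-1] * edges[1:])  # geometric bin centres
    mean_slope = np.array([
        slope[idx == i].mean() if np.any(idx == i) else np.nan
        for i in range(n_bins)
    ])
    return centers, mean_slope


# Hypothetical per-cell values following S = 0.5 * A^-0.3.
area = np.logspace(2, 6, 2000)   # contributing area (m^2)
slope = 0.5 * area ** -0.3       # local slope (m/m)

centers, mean_slope = slope_area_curve(area, slope)
valid = np.isfinite(mean_slope)
exponent = np.polyfit(np.log10(centers[valid]), np.log10(mean_slope[valid]), 1)[0]
print(f"fitted slope-area exponent: {exponent:.2f}")
```

On real data, a DTM that cannot resolve hillslopes would be expected to lack the small-area end of this curve, which is the part associated with interrill and splash erosion in the discussion above.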

Conclusions
This study has examined the vertical accuracy of four DTMs with regard to data source and dataset characteristics as well as terrain morphology in three small Mediterranean catchments. The reliability of these DTMs for subsequent hydrogeomorphological modelling purposes has also been assessed. The following key findings can be summarized:

1. The airborne LiDAR models and, to a lower extent, SRTM C-SAR (1 arc-second) provided a reliable source for most of the discussed hydrological and geomorphological modelling aspects, with the exception of highly mountainous areas, where SRTM failed because of intrinsic errors associated with RADAR shadowing. In the case of the LiDAR models, attention should be paid to the influence of data processing steps such as grid interpolation and point cloud classification.

2. ASTER clearly showed the lowest vertical accuracy, and residual artefacts produced strongly non-normally distributed elevation errors that clearly reduced the reliability of the ASTER data.

3. Vegetation patterns as well as terrain morphology and fragmentation of relief (especially in highly anthropised landscapes such as the Mediterranean region) influenced all datasets, resulting in over- and underestimation of elevation values.

4. Vertical accuracy of the datasets was found to directly influence subsequent modelling applicability through systematic errors. Error propagation had impacts on flow routing, stream network and catchment delineation and, to a lower extent, on the distribution of slope values. Coarse horizontal raster resolution was found to reduce the degree of hydrological and geomorphological detail available from a DTM and its applicability in resolving processes at different scales within a catchment.

The results presented in this study are transferable to other geographic regions dominated by fluvial processes and are not restricted to Mediterranean environments. However, further research is required to assess the influence of vegetation. Moreover, current trends in LiDAR-derived DTMs, such as the use of unmanned aerial vehicles (UAV) and full-waveform LiDAR systems, and their possible gains in modelling applicability should be addressed by future research.

Figure 1. (a) Location of Mallorca within the Western Mediterranean Sea. (b) Location of the three contrasting catchments on the Island of Mallorca. (c) Sa Font de la Vila, Es Telègraf, and Es Fangar catchments, in which the main land uses (extracted from CORINE 2012), height contour lines (h = 100 m; numbers indicate minimum and maximum elevation in meters above sea level), and stream features (extracted from the 5 m LiDAR DTM) have been plotted. (d-f) Images of the analysed representative plots at each catchment (d = Sa Font de la Vila, e = Es Telègraf, f = Es Fangar).

5 and 1 m) in comparison to SRTM and ASTER. The IGN 5 m model illustrated an overall vertical accuracy of 1.73 and 0.84 m expressed as RMSE and NMAD, respectively, with a lower accuracy than officially reported by the IGN (i.e., 0.2 m). In contrast to SRTM and ASTER, the vertical accuracy increased clearly from the terraced Sa Font de la Vila catchment (RMSE = 2.09 m; NMAD = 0.98 m) to the mixed Es Fangar catchment (RMSE = 1.35 m; NMAD = 0.71 m), with the mountainous Es Telègraf in the intermediate position (RMSE = 1.59 m; NMAD = 0.89 m).

Figure 2. Relative frequency distribution of elevation errors (binned to 5 m intervals). Blue bars indicate the distribution of all Ground Control Points (GCPs, N = 140), whereas the red bars account for GCPs considered as open terrain (N = 87) and the green bars for densely vegetated GCPs (N = 53).

Fig. 4a illustrated how the streams derived from the IGN LiDAR data tended to reproduce the natural behaviour, following the slope gradient along the valley floor as indicated by the 10 m contour lines, while SRTM and ASTER streams partly crossed those height contour lines. Therefore, a clear inaccuracy in the position of the streams derived from the satellite data was observed. The empirical Inverse Cumulative Distribution Function (i.e., iCDF) of flow length probabilities was also calculated (Fig. 4b) and revealed differences among the datasets, especially at the Es Telègraf catchment, which showed large differences in stream network patterns (lower iCDF discrepancies were found at the other catchments).
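A comparison of flow length distributions of this kind can be sketched as follows. The example interprets the iCDF as the exceedance probability P(X >= x), which is an assumption about the definition used here, and it summarises the discrepancy between two datasets as the largest vertical gap between their curves. The function names and the flow length values are hypothetical.

```python
import numpy as np


def exceedance_probability(values, thresholds):
    """Fraction of values that are greater than or equal to each threshold."""
    v = np.asarray(values, dtype=float).ravel()
    v = np.sort(v[np.isfinite(v)])
    t = np.asarray(thresholds, dtype=float)
    # searchsorted(side="left") counts the values strictly below each threshold.
    return 1.0 - np.searchsorted(v, t, side="left") / v.size


def max_icdf_gap(values_a, values_b, n_levels=200):
    """Largest vertical distance between two exceedance curves."""
    a = np.asarray(values_a, dtype=float).ravel()
    b = np.asarray(values_b, dtype=float).ravel()
    lo = min(np.nanmin(a), np.nanmin(b))
    hi = max(np.nanmax(a), np.nanmax(b))
    levels = np.linspace(lo, hi, n_levels)
    gap = np.abs(exceedance_probability(a, levels) - exceedance_probability(b, levels))
    return float(gap.max())


# Hypothetical flow lengths (m) derived from two DTMs of the same catchment.
flow_lidar = np.array([10.0, 20.0, 30.0, 40.0, 80.0, 160.0])
flow_srtm = np.array([30.0, 60.0, 90.0, 120.0, 240.0, 480.0])

print(exceedance_probability(flow_lidar, [25.0, 100.0]))
print(max_icdf_gap(flow_lidar, flow_srtm))
```

A gap close to zero would indicate that two DTMs produce nearly the same distribution of flow lengths, whereas larger values point to the kind of divergence in stream network patterns described for the Es Telègraf catchment.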

Figure 4. Stream network patterns (solid lines) and catchment boundaries (dashed lines) derived from the DTM datasets showing the studied catchments. (a) Area of interest (AOI) selected at the Sa Font de la Vila (solid black lines indicate height contours with Δh = 10 m). (b) Empirical Inverse Cumulative Distribution Function (iCDF) of the derived flow length values for the three catchments.

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 24 October 2018 doi:10.20944/preprints201810.0558.v1
/land.copernicus.eu/pan-european/corine-land-cover) and high-resolution orthophotography (0.25 m ground resolution) imagery provided by the Government of the Balearic Islands (http://www.ideib.cat/). GCPs were also adjusted manually later (if needed) to ensure a representative proportion of densely and sparsely vegetated areas in the survey. At Sa Font de la

Table 1. Results of the vertical accuracy assessment expressed as Root Mean Squared Error (RMSE) and Normalized Median Deviation (NMAD) between the GPS-measured elevation values and those values derived from the different DTM datasets (in meters).

Table 2. Descriptive statistics of the DTM datasets used. Uncorr refers to results before surface filtering and Corr to results after surface filtering.

Table 3. Geomorphological and hydrological characteristics of the studied datasets and catchments, including the hypsometric integral, mean flow length and mean flow length-hillslope (LS) factor, and the corresponding standard deviation (SD).