A Relief Dependent Evaluation of Digital Elevation Models on Different Scales for Northern Chile

Abstract: Many geoscientific computations are directly influenced by the resolution and accuracy of digital elevation models (DEMs). Therefore, knowledge about the accuracy of DEMs is essential to avoid misleading results. In this study, a comprehensive evaluation of the vertical accuracy of globally available DEMs from the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER), the Shuttle Radar Topography Mission (SRTM), the Advanced Land Observing Satellite (ALOS) World 3D and the TanDEM-X WorldDEM™ was conducted for a large region in northern Chile. Additionally, several very high-resolution DEM datasets were derived from Satellite Pour l'Observation de la Terre (SPOT) 6/7 and Pléiades stereo satellite imagery for smaller areas. All datasets were evaluated against three reference datasets, namely elevation points from both Ice, Cloud, and land Elevation (ICESat) satellites, as well as very accurate high-resolution elevation data derived by unmanned aerial vehicle (UAV)-based photogrammetry and terrestrial laser scanning (TLS). The accuracy was also evaluated with regard to the existing relief by relating the accuracy results to slope, terrain ruggedness index (TRI) and topographic position index (TPI). Among the datasets with global availability, the highest overall accuracy is reached by the TanDEM-X WorldDEM™ and the lowest by the ASTER Global DEM (GDEM). On the local scale, Pléiades DEMs showed a slightly higher accuracy than DEMs derived from SPOT imagery. Generally, accuracy depends strongly on topography: in steeply sloped terrain, the error rises by up to four times for high-resolution DEMs and by up to eight times for low-resolution DEMs compared to flat landscapes.


Introduction
Relief plays a major role in numerous geomorphological, climatic, hydrologic and ecologic processes. Therefore, a detailed understanding of the prevailing terrain conditions is essential [1]. Nowadays, geomorphometric relief information is available through digital elevation models (DEMs), which provide a 2.5-dimensional digital representation of the Earth's relief using regularly spaced elevation data.
The first digital elevation datasets with global coverage were DEMs from the Shuttle Radar Topography Mission (SRTM) and the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) with a resolution of 30 m, which have revolutionized the use of DEMs [2,3]. Whereas analyses were formerly possible only for small areas, these DEMs made it possible to analyze larger areas, up to almost the whole Earth. In recent years, the Advanced Land Observing Satellite (ALOS) World 3D was made publicly available as a third global DEM with a ground sampling distance (GSD) of 30 m [4,5]. The most recent dataset with global coverage is the TanDEM-X WorldDEM™ with a GSD of 12 m, which is expected to be the new standard in geometric resolution and accuracy [6,7]. However, all of these DEMs are usually less accurate and capture less terrain detail, owing to their coarser GSD, than very high-resolution elevation models derived from stereo satellite imagery. The latter normally have a finer GSD and a higher vertical accuracy. However, they are often not suitable for larger areas because of high cost and time-consuming processing. Therefore, they are usable only for large-scale (i.e., local) analyses.
Elevation models generally enable the possibility of a quantitative characterization of relief and are used by geomorphometry as a multifaceted interdisciplinary subject in a multitude of different scientific fields [8]. Hence, DEMs are widely used sources in numerous geospatial studies for the terrain-based identification of environmental features. Many studies about landform distribution analyses [9][10][11][12][13], geomorphology [14] and the human impact on geomorphology [15] were conducted. Furthermore, in the field of hydrology DEMs are required for stream network analysis [16,17] and ground-water flow modelling [18]. Terrain features derived by DEMs are also used as a predictor for digital soil mapping [19][20][21]. Additionally, DEMs are crucial for ecological analysis, such as vegetation and plant distribution research [22][23][24]. Climatic issues, like the observation of glacier changes [25], sea-level rise [26] or climatic modelling [27] are also fields of application, where terrain information is needed.
All of these applications have raised the need for accessible DEMs of higher resolution and accuracy. However, if the process or object of interest is spatially smaller than the GSD of the utilized elevation dataset, the risk of misleading results increases [28]. Likewise, it is well known that the GSD of a digital elevation model directly influences derived terrain variables such as slope or aspect. For instance, Kienzle [29] showed that the mean slope can differ from 13.9% on a 50 m grid to 8.8% on a 250 m resolution DEM. This effect increases further for higher resolutions and steeper terrain. Zhang and Montgomery [30] derived mean slope differences of up to 24% between a 2 m and a 90 m DEM for the same area. Kramm et al. [11] showed that the accuracy of detected landforms can vary by up to 30% for the same algorithm and area when DEMs of different resolutions are used. Moreover, different DEMs with the same grid size also result in significant differences in the delineated landforms.
Thus, the effects of scale and the impact of a DEM's GSD on derived topographical features are well documented [31][32][33][34], and the first techniques for multiscale analysis are available [35][36][37]. Furthermore, it is crucial to analyze the accuracy of DEMs in order to select the most suitable one with regard to the aim, accuracy and scale of the study. Whereas large-channel profiles over wide distances can easily be identified even with 90 m resolution data, landscapes at large scales require elevation data with 1-30 m GSD for a successful identification of individual hillslopes and ridges [38,39]. However, a DEM with a finer GSD is not always advantageous, as elevation models with a very high resolution can depict too many details that are not relevant for the study.
In recent years, numerous studies have investigated the accuracy of available DEMs. Much research has already been done on the accuracy of SRTM and ASTER Global DEM to examine their performance in different situations and sites [40][41][42][43][44]. All of these investigations indicate that the expected root mean square error (RMSE) for the ASTER GDEM is about 3-4 m in flat terrain and 7-8 m, with up to 16 m, in steeper relief. For the SRTM1 DEM, the results show an average error of 3-4 m in flat landscapes and up to 7-8 m in mountainous areas. Additionally, the accuracy of the newer ALOS World 3D [45] and the TanDEM-X WorldDEM™ [46][47][48] was assessed by comparing them with various reference datasets. Several studies directly compared the performance of different global elevation models in various geographical settings. The accuracy of the 30 m resolution DEMs from SRTM, ASTER Global DEM and ALOS W3D was assessed by many studies [49][50][51][52][53][54]. They indicate a slightly higher accuracy of ALOS W3D in comparison to SRTM and ASTER GDEM. The results showed an average error of 2-3 m (RMSE) in flat terrain and 6-7 m (RMSE) in steeper sloped landscapes for the ALOS W3D. However, fewer studies are available so far that compare the performance of the newly available TanDEM-X WorldDEM™ with other globally available DEMs. Some studies compared the accuracy of the TanDEM-X WorldDEM™ with that of elevation models from SRTM and ASTER Global DEM [55][56][57]. Others compared the accuracy of SRTM, ASTER GDEM, ALOS World 3D and TanDEM-X WorldDEM™ for coastal relief settings [26,58] or for relatively small areas [59,60]. They all showed a relatively high performance of the TanDEM-X WorldDEM™, which is mostly superior to that of the 30 m global DEMs. The results indicate an average error of less than 3 m for the 12 m TanDEM-X WorldDEM™, but some recent studies also showed weaknesses of this DEM in very steep terrain [57].
Nevertheless, a comprehensive analysis of the accuracy of all four global DEMs over large areas is still missing. Furthermore, only a few studies evaluated the accuracy of DEMs on different scales and compared them to very high-resolution elevation models derived from stereo satellite imagery. Alganci et al. [50] included some local DEMs derived from Satellite Pour l'Observation de la Terre (SPOT) and Pléiades satellite imagery in their study of an urban area with an anthropogenic landscape. Thus, an analysis of the performance of these DEMs in a landscape that is not anthropogenically influenced is still missing.
The goal of this study is to conduct a comprehensive assessment of the vertical accuracy of a multitude of different DEMs, both for a regional coverage and for local coverages. The regional coverage for the selected study area, the Atacama Desert in northern Chile, is given by datasets with a nearly global coverage, namely the TanDEM-X WorldDEM™, ASTER Global DEM, ALOS World 3D and SRTM DEM. Local areas of 100 to 400 km² are covered by DEMs derived from stereo satellite imagery recorded by the SPOT 6/7 and Pléiades satellites. The accuracy assessment was performed with three control datasets, which are the light detection and ranging (LiDAR)-based elevation points from both Ice, Cloud, and land Elevation (ICESat) satellites and very accurate high-resolution elevation data derived by unmanned aerial vehicle (UAV)-based photogrammetry, as well as terrestrial laser scanning (TLS). The accuracy analysis is based on the root mean squared error (RMSE) and the normalized median absolute deviation (NMAD). Furthermore, only a few studies have systematically investigated the influence of different terrain conditions on the vertical accuracy of DEMs. Some work has been done to investigate the appropriateness of DEMs for delineating different landforms [55] and the relationship of DEM errors to various landform types and altitude [61]. Additionally, the accuracy of DEMs for several small areas with plain, hilly and mountainous terrain [49] and with different slopes [57] was assessed. Nevertheless, more information about the impact of relief over larger areas on the vertical accuracy of DEMs is necessary. Thus, in this study the accuracy is evaluated with regard to the existing topography by linking terrain ruggedness index (TRI), topographic position index (TPI) and slope to error values, as terrain has a direct influence on accuracy [42,62,63].

Study Area
The study was conducted in the northern part of Chile (Figure 1). The area covers the Chilean part of the Atacama Desert, represented by the administrative regions of Tarapacá and Antofagasta. The region, which is one of the driest areas on Earth, lies in the 'Arid Diagonal' of South America and is characterized by its hyperarid climate with less than 10 mm/year rainfall on average [64]. This hyperaridity of the Atacama is caused by a combination of subtropical subsidence, coastal upwelling of the cold Humboldt current, and rain-shadow effects of the high Andes [65], which might have been established since the mid-Miocene or earlier [66].
The relief shows large height differences from the coast of the Pacific to the mountains of the Andes with altitudes up to 6700 m above sea level. Furthermore, the study area consists of a diverse topography with steep, seaward cliffs and deeply incised canyons, as well as large alluvial fans and volcanoes in the mountain range of the Andes. Thus, the landscape offers a cross section of different relief types from flat and broad landscapes to steep and dissected terrain, with hardly any vegetation cover. The morphodynamic zonation of the Atacama from west to east is described by the coastal ranges with the coastal cordillera reaching up to 2500 m above sea level, the central depression at about 1000 m above sea level and the pre-Andean or western cordillera, as well as the Altiplano (~3800 m above sea level). DEMs are, for instance, used for geomorphometric analysis of alluvial fans at the coastal range [67], as well as for the geomorphometric characterization of the unique, so-called zebra stone stripes, described as contour-parallel bands of dark gravels with contrasting bands of fine-grained soil [68].

Global Digital Elevation Models (GDEMs)
In this study, the accuracy of several DEMs with global coverage was validated; they are described in the following sections. Except for the 12 m TanDEM-X WorldDEM™, all utilized DEMs are freely available. All of these DEMs were evaluated for an area of around 190,000 km² (Figure 1). To make the heights of all DEMs comparable, a conversion to the same vertical datum is essential. Therefore, in this study all elevation models were converted to the WGS84 ellipsoid as vertical datum.

Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) GDEM
The ASTER Global DEM was produced by processing the entire optical imagery archive from the ASTER sensor on board the National Aeronautics and Space Administration's (NASA) Earth Observing System Terra satellite, which was launched in December 1999 [3,69]. The mission's aim was primarily to collect multispectral data of the Earth, but in addition to the multispectral bands, the ASTER sensor has a near-infrared sensor, which is inclined by 27.6° and enables stereoscopic recording according to the "along-track" principle [70].
A first version of this dataset was released for open access in June 2009 by NASA and the Japanese Ministry of Economy, Trade and Industry (METI), covering all land areas from 83° N to 83° S latitude. The second version was released in October 2011 with a GSD of 1 arc-second (~30 m). It includes additional scenes from 2008 to 2011 and an improved water mask to achieve various improvements in overall accuracy and to reduce artifacts mainly caused by cloud edges [3]. The latest update (Version 3) was created by including even more Level 1-A ASTER scenes acquired between March 2000 and November 2013 and by conducting a more effective cloud masking to reduce artifacts. Furthermore, voids were filled with additional data from SRTM1 and the Global Multi-resolution Terrain Elevation Data 2010 (GMTED2010) for most areas of the world. The average vertical accuracy of the ASTER GDEM Version 3 was estimated with a standard deviation of 12.1 m, which is 0.5 m better than for the prior version [71].
The ASTER GDEM Version 3 utilized in this contribution was originally referenced horizontally to the World Geodetic System 1984 (WGS84) and vertically to the Earth Gravitational Model 1996 (EGM96). Thus, the DEM was converted to the WGS84 ellipsoid with a calculated raster of the undulation between the EGM96 geoid and the WGS84 ellipsoid for the whole region. The undulation raster was created with the software MSP GEOTRANS v.3.8. To do so, a net of points equally distributed over the whole area was created, for which the undulation was calculated by the software. Subsequently, a raster was produced by interpolating the undulation points with a Kriging algorithm. Finally, the values of the undulation raster were added to the ASTER GDEM heights.
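The final step of this datum conversion is a cell-wise addition of the geoid undulation N to the orthometric height H, giving the ellipsoidal height h = H + N. The following is a minimal sketch of that step only, assuming both rasters are already co-registered NumPy arrays on the same grid; the function name and the sample values are hypothetical and not taken from the study.

```python
import numpy as np


def orthometric_to_ellipsoidal(dem_egm96, undulation):
    """Return WGS84 ellipsoidal heights h = H + N for co-registered rasters.

    dem_egm96  : orthometric heights H above the EGM96 geoid (m)
    undulation : geoid undulation N between EGM96 and WGS84 ellipsoid (m)
    """
    dem = np.asarray(dem_egm96, dtype=float)
    n = np.asarray(undulation, dtype=float)
    if dem.shape != n.shape:
        raise ValueError("DEM and undulation raster must share the same grid")
    return dem + n


# Illustrative (made-up) values, not data from the study area.
dem_egm96 = np.array([[1000.0, 1010.0], [2500.0, 2510.0]])
undulation = np.array([[30.0, 30.5], [35.0, 35.5]])
dem_wgs84 = orthometric_to_ellipsoidal(dem_egm96, undulation)
print(dem_wgs84)
```

The undulation raster itself (point net, GEOTRANS computation, Kriging interpolation) is not reproduced here; the sketch only covers the addition described in the last sentence above.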

Advanced Land Observing Satellite (ALOS) World 3D
ALOS was launched in 2006 by the Japan Aerospace Exploration Agency (JAXA). On board the satellite was the Panchromatic Remote-sensing Instrument for Stereo Mapping (PRISM) sensor, which operated from 2006 to 2011 with the aim of generating global elevation data from along-track triplet stereoscopic panchromatic images with 2.5 m GSD [4,72]. During the sensor's operation time, approximately 6.5 million scenes covering the entire globe were acquired, which were used to generate a global DEM with a GSD of 5 m. To check data quality during the generation process, the data were compared automatically with reference information from ICESat GLAH14 heights and SRTM and were additionally inspected visually, in order to achieve a target height accuracy of 5 m [4]. Besides the 5 m DEM, which is only distributed commercially, JAXA released a freely available 1 arc-second (~30 m) ALOS DEM for non-commercial purposes in 2016, which was produced by resampling the original 5 m version [73]. The provided ALOS World 3D (W3D) dataset (Version 1), which was used in this study, is already referenced to the WGS84 horizontal datum and the WGS84 ellipsoid as vertical datum. Therefore, no further georeferencing was necessary for this dataset.

TanDEM-X WorldDEM™
The TanDEM-X mission, which ran from 2010 to 2015, was launched as a public-private effort between the German Aerospace Center (DLR) and Airbus Defence and Space to produce a precise global DEM between the latitudes 90° N and 90° S with higher accuracy and resolution than previously existing ones. The Earth was measured from two satellites (TerraSAR-X and TanDEM-X) in a controlled orbit with a baseline of 250-500 m using X-band radar interferometry (InSAR) [7,76]. The TanDEM-X WorldDEM™, which is subsequently denoted as 'TanDEM-X', was produced in its original version with 0.4 arc-seconds (~12 m) GSD as a commercial product of the TanDEM-X mission. The DEM heights were calibrated with heights of the ICESat GLA14 data product [46]. Furthermore, a 1 arc-second (~30 m) version was generated from the unweighted mean values of the underlying 12 m pixels (Wessel, 2016). Additionally, a 3 arc-second (~90 m) elevation model was released by the DLR in 2018, which is free of charge for use in academic research. This DEM was also created by resampling the original 0.4 arc-second dataset.
The originally intended accuracy for the TanDEM-X DEM was an absolute error of less than 10 m in the horizontal and vertical directions [46,60,77]. Several studies showed that the DEM product reaches this goal [7,46] and indicate that the accuracy is even higher than originally assumed. For the 90 m DEM, only a few studies investigating its accuracy are available so far, but they suggest a higher accuracy than that of the SRTM 90 m DEM [47,78].
In this study the TanDEM-X DEM was used in two different resolutions of 12 m and 90 m. The horizontal datum for both DEMs is WGS84 and the heights were already referenced to the WGS84 ellipsoid.

Local Digital Elevation Models
In addition to the globally available elevation models with lower spatial resolution, 10 elevation models were derived from Pléiades satellite stereo imagery, seven DEMs from SPOT 6 and four from SPOT 7 satellite imagery. The DEMs are distributed over the whole area depicted in Figure 1, and each DEM covers an area of about 100-400 km². These datasets were processed with the software PCI Geomatica 2018 OrthoEngine with automatic ground control point (GCP) and tie-point collection. For the GCP collection process, an additional orthorectified image was used to improve the accuracy of the extracted GCPs. The orthorectified image was calculated with the provided rational polynomial coefficients (RPCs) and elevation data from the TanDEM-X 12 m DEM to reduce topographical distortions in the original satellite images. For each dataset around 80 GCPs and 50 tie-points were extracted. The points were checked manually to achieve a calculated residual error of less than 1 m (RMSE). All derived elevation models were resampled to 5 m during the generation process to avoid small artifacts in the DEM product.

Pléiades
The Pléiades system consists of a constellation of two satellites operated by the French Space Center (CNES) and ASTRIUM GEO-Information Services. The first satellite (Pléiades 1A) was brought into a sun-synchronous orbit on 16 December 2011. The second one (Pléiades 1B) followed on 2 December 2012 [79]. Both satellites are equipped with optoelectronic, charge-coupled device (CCD) scanners, which scan the Earth's surface transversely to the direction of flight and convert the measured radiation into a measurable electrical signal. It is recorded in a panchromatic channel and four multispectral channels each with five line sensors [80,81]. The line sensors of the panchromatic sensor have a width of more than 6000 pixels and the multispectral sensors have a resolution of 1500 pixels. Thus, the satellite achieves a GSD of 0.5 m in the panchromatic channel and 2 m in the multispectral channels [79]. The Pléiades satellites thus belong to the satellite systems with a very high GSD. The positional accuracy is indicated with 8.5 m at nadir and 10.5 m within an angle of 30°. Due to the high agility of the satellites, the Pléiades system is able to acquire three or more nearly synchronous images of the same area [82,83].

Satellite Pour l'Observation de la Terre (SPOT) 6/7
The SPOT 6 satellite was launched in 2012 by EADS Astrium; SPOT 7 followed in 2014. Both satellites operate with high-resolution pushbroom sensors and record images in one panchromatic channel and four multispectral channels. They are able to produce images with a GSD of 1.5 m in the panchromatic and of 6 m in the multispectral channels [84]. They also have the capability of tri-stereo imaging. The expected geolocation accuracy for SPOT image products of the Primary standard, which are also used in this study, is stated as a circular error of less than 10 m at the 90th percentile [84].

Ground Truth Elevation Data
A vertical accuracy assessment requires highly accurate evaluation data, which should be at least three times more accurate than the evaluated dataset [85]. In this study the evaluation was conducted by comparing the DEM heights with several highly accurate elevation datasets, which are described in the following.

Ice, Cloud, and Land Elevation satellite (ICESat)
The primary goal of NASA's ICESat mission was to observe the cryosphere and to measure changes in the polar ice sheet mass balance [86]. One of the utilized instruments of the ICESat satellite is the Geoscience Laser Altimeter System (GLAS) which has a 1064 nm laser channel for surface altimetry measurements [87]. It operated between February 2003 and October 2009 and the surface elevation data was measured during two to three observation periods each year of about 1 month each. The laser footprints have 172 m spacing along-track, and approximately 42 km cross-track spacing [86,87].
During its operation period ICESat has acquired a huge database of raw and processed data organized in 15 data products. Of interest for this contribution is the 14th product ICESat/GLA14 data, as this dataset contains highly accurate elevation data with a vertical accuracy of 0.1 m for flat locations and 1 m for undulated terrain [86,88]. Thus, several studies showed a successful vertical accuracy assessment over a broader regional extent with ICESat data [89,90].
The ICESat/GLA14 land surface elevation data points originally provided were referenced to the Topex/Poseidon ellipsoid. To make them comparable with the elevation models of this study, a conversion to the WGS84 ellipsoid was conducted by using Equation (1) [91].
To remove outliers, e.g., from cloud reflections, all ICESat points with a height difference greater than 60 m compared to the TanDEM-X heights were eliminated prior to the evaluation. Finally, a total of around 450,000 points was used to evaluate the accuracy of the regional elevation models. Their locations are depicted in Figure 1. For the local elevation models an average of 500 elevation points was used for each scene. Scenes with fewer than 50 points were not evaluated in this study. Therefore, for 12 local DEMs no accuracy assessment with ICESat points was possible due to insufficient availability of elevation points.
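The threshold-based outlier removal described above can be sketched as follows. This is a minimal illustration, assuming the reference heights and the TanDEM-X heights sampled at the same locations are available as equally long arrays; the function name and the sample values are hypothetical.

```python
import numpy as np


def filter_outliers(ref_heights, dem_heights, max_diff):
    """Keep reference points whose absolute height difference to the DEM
    does not exceed max_diff (m). Returns the kept heights and the mask."""
    ref = np.asarray(ref_heights, dtype=float)
    dem = np.asarray(dem_heights, dtype=float)
    # Points with NaN in either array compare as False and are dropped.
    keep = np.abs(ref - dem) <= max_diff
    return ref[keep], keep


# Illustrative (made-up) heights; the third point mimics a cloud return.
icesat = np.array([1203.4, 1250.1, 3999.0, 1310.7])
tandemx = np.array([1204.0, 1248.9, 1290.5, 1312.0])
kept, mask = filter_outliers(icesat, tandemx, max_diff=60.0)
print(kept, mask)
```

Passing the threshold as a parameter keeps the same function usable for other reference datasets that are screened against the TanDEM-X heights with a different limit.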

ICESat-2
The Ice, Cloud, and land Elevation Satellite-2 (ICESat-2) was launched in September 2018 as the follow-on mission for ICESat [92,93]. It collects altimetry data from the Earth's surface with the Advanced Topographic Laser Altimeter System (ATLAS) instrument, which is a LiDAR system with a photon-level detection sensitivity. The outgoing single laser beam (532 nm) is split into three pairs of beams spaced approximately 3.3 km apart with a 90 m distance within the pairs. Furthermore, the laser emits a pulse signal every 0.7 m [93]. Therefore, it has a denser sampling and point coverage in comparison to its predecessor.
For the accuracy assessment, the measured terrain heights of the ATL03 and ATL08 version 001 products were used in this study. The ATL03 product contains height information of all received photons with a point spacing of 0.7 m along each track. All heights of the ATL08 dataset are processed in fixed 100 m segments along track, which contain at least 50 signal photons. They include the best-fit terrain elevation of each 100 m segment calculated by interpolating all photons within the segment. Only a few studies about the accuracy of these datasets are available so far. The terrain height accuracy of the ATL08 best-fit dataset was reported as an RMSE of 0.82 m for a large region in Finland [94].
Prior to the accuracy assessment, all points in both datasets that differ by more than 30 m from the TanDEM-X 12 m DEM heights were eliminated. In this study a total of around 400,000 points from the ATL03 dataset and around 650,000 points from the ATL08 dataset were used to evaluate the accuracy of the regional elevation models (Figure 1). For each local DEM between 800 and 270,000 additional ATL03 points were used for evaluation. From the ATL08 dataset an average of 500 height points was used to evaluate each scene. The datasets were already provided with the WGS84 horizontal datum and the WGS84 ellipsoid as vertical datum. Therefore, no further georeferencing was required for this study.

Very High-Resolution DEMs Derived by Unmanned Aerial Vehicle (UAV)
To evaluate the accuracy of the DEMs for large-scale terrain, their heights were compared to 19 very high-resolution elevation models derived by UAV-based photogrammetry. Figure 2 gives an overview of the terrain covered by these DEMs. The imagery was captured with two different systems. The first was a 12-megapixel FC330 camera with a 20 mm full-frame-equivalent lens, mounted on a shock-absorbing gimbal on a rotary-wing quadrocopter (type: DJI Phantom 4) and set to capture an image every 10 s. The camera was set to shutter speed priority (1/1000 s) with ISO 100. The second was an octocopter (type: Mikrokopter MK-Easy) carrying a 36.4-megapixel full-format Sony Alpha 7R with a Sony 28 mm lens (type: SEL28F20). Flights at all sites were conducted manually between 10 a.m. and 12 p.m. local time on cloud-free days in a line-based pattern at two different heights, flying slower than 2.5 m s⁻¹ to improve the accuracy of planimetry and altitude. The missions resulted in a high overlap of more than nine images per point. Subsequent image processing was conducted with AgiSoft Photoscan Professional (vers. 1.4.2). Images were mostly aligned using evenly distributed GCPs measured by Real-Time Kinematic (RTK) positioning (type: Topcon GR5), and at 7 sites the direct Global Positioning System (GPS) measurements of the UAV recorded for each image were used. Processing in ultra-high quality for the dense point cloud generation resulted in a GSD of 1 cm to 8 cm for the DEM, and each scene covers an average area of ca. 0.04 km². Average errors range from 3 cm to 1.5 m for the datasets without GCPs in the horizontal direction and from 3 cm to 1 m in the vertical direction. All data was exported in WGS84 UTM Zone 19S (EPSG: 32719).
To evaluate the vertical accuracy of all DEMs their GSD was up-sampled to the resolution of the UAV elevation models. Then pixel-wise errors were derived by subtracting the heights of UAV derived DEMs from the other elevation models.
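The up-sampling and differencing step can be illustrated with a small sketch. The resampling method is not stated in the text, so nearest-neighbour replication with an integer factor is assumed here purely for illustration; function names and sample values are hypothetical, and real rasters would additionally need to be aligned to a common grid and extent.

```python
import numpy as np


def upsample_nearest(coarse, factor):
    """Replicate every coarse cell factor x factor times (nearest neighbour)."""
    return np.repeat(np.repeat(coarse, factor, axis=0), factor, axis=1)


def pixelwise_error(dem_coarse, dem_uav, factor):
    """Return (up-sampled DEM - UAV DEM), i.e. the UAV heights subtracted
    from the evaluated elevation model, cell by cell."""
    up = upsample_nearest(np.asarray(dem_coarse, dtype=float), factor)
    uav = np.asarray(dem_uav, dtype=float)
    if up.shape != uav.shape:
        raise ValueError("grids do not match after up-sampling")
    return up - uav


# Illustrative (made-up) 2 x 2 coarse DEM and 4 x 4 UAV DEM.
dem_coarse = np.array([[100.0, 102.0], [104.0, 106.0]])
dem_uav = np.array([
    [100.0, 100.5, 102.0, 102.5],
    [100.2, 100.7, 102.2, 102.7],
    [104.0, 104.5, 106.0, 106.5],
    [104.2, 104.7, 106.2, 106.7],
])
errors = pixelwise_error(dem_coarse, dem_uav, factor=2)
print(errors)
```

Negative values mean that the evaluated DEM lies below the UAV surface at that cell.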

Terrestrial Laser Scanning
The topography at several areas (see Figure 1) was recorded by a terrestrial laser scanner (type: Riegl VZ-2000) in combination with the same RTK positioning system (type: Topcon GR5) for registration of the final point cloud. The derived raw point clouds were filtered and afterwards interpolated by inverse distance weighting to a raster dataset with a 50 cm cell size in ArcGIS Pro 2.2.4 (Environmental Systems Research Institute) for the estimation of statistics comparable to the other analyses. The terrain covered by these raster datasets is depicted in Figure 3. In addition, subsampled point clouds with a similar mean point distance were compared with the raster datasets in CloudCompare and analyzed by the M3C2 algorithm in order to calculate detailed, reliable differences [95].

Accuracy Assessment
To assess the quality of the digital elevation models, height differences were calculated against all previously presented reference datasets. First, the root mean square error (RMSE) of each digital elevation model with respect to each available ground truth dataset was calculated from the height differences with the following equation:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \Delta h_i^{2}}$$

where $\Delta h_i$ = elevation difference between assessed DEM and reference DEM and $n$ = number of pixels. Additionally, the normalized median absolute deviation (NMAD) was calculated, as height differences tend not to be normally distributed and the NMAD is a more robust measure against outliers [96]. The equation is:

$$\mathrm{NMAD} = 1.4826 \cdot \operatorname{median}_i\left(\left|\Delta h_i - m_{\Delta h}\right|\right)$$

where $\Delta h_i$ = elevation difference between assessed DEM and reference DEM and $m_{\Delta h}$ = median of all elevation differences.
If the error values are normally distributed (and unbiased), the NMAD is equivalent to the RMSE; otherwise the RMSE will be larger than the NMAD.
Accuracy values are only comparable if they can be related to the existing relief, since different landscapes evidently affect the accuracy of DEMs [42]. Therefore, a relief-adjusted evaluation against the ICESat heights was conducted by relating the accuracy of the digital elevation models to specific terrain characteristics. To this end, several terrain parameters were calculated. First, the TRI was computed and divided into seven classes after Riley et al. [97], from level surfaces to extremely rugged terrain. Second, the slope was calculated and classified into five classes: flat (<5°), gentle (5°-15°), moderate (15°-25°), steep (25°-35°) and extreme (>35°). Additionally, the TPI after Weiss [98] was computed to assign the height errors to specific landforms. The number of classes was reduced to seven by combining the three ridge classes and the two drainage classes into one class each. The TPI evaluation was conducted only for the globally available datasets, as the coverage of the other datasets was too small to obtain enough evaluation data for all classes. All landforms and terrain features were derived over the whole region on the basis of the 12 m TanDEM-X elevation model (see Figure 4).
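The two accuracy measures used in this section can be sketched in a few lines of Python. The function names `rmse` and `nmad` and the sample values are illustrative assumptions. The factor 1.4826 is the usual scaling that makes the NMAD comparable to a standard deviation for normally distributed errors.

```python
import numpy as np

def rmse(dh):
    """Root mean square error of elevation differences (DEM minus reference)."""
    dh = np.asarray(dh, dtype=float)
    return float(np.sqrt(np.mean(dh ** 2)))

def nmad(dh):
    """Normalized median absolute deviation of elevation differences."""
    dh = np.asarray(dh, dtype=float)
    return float(1.4826 * np.median(np.abs(dh - np.median(dh))))

# Hypothetical height differences in metres with one gross outlier.
dh_example = np.array([-1.0, 0.0, 1.0, 0.0, 50.0])
print(rmse(dh_example), nmad(dh_example))
```

In this made-up example the single outlier dominates the RMSE, whereas the NMAD stays close to the spread of the remaining values. This is the behaviour the text relies on when large gaps between RMSE and NMAD are interpreted as a sign of outliers.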

Overall Accuracies
All determined overall accuracies are presented in Table A1 of Appendix A and in Figure 5. For the most part, each dataset shows similar results across all reference data. The ICESat-2 ATL08 dataset generally produced the highest error values for the DEMs in comparison to the other reference datasets. For the TanDEM-X 90 m and the SPOT 125 datasets in particular, a very high RMSE (13.9 m and 11.6 m, respectively) was calculated. Overall, the highest RMSE for 17 DEMs and the highest NMAD for 21 DEMs were calculated with the ICESat-2 ATL08 dataset. For the UAV data, the lowest differences between RMSE and NMAD values can be observed for most datasets in comparison to all ICESat and TLS datasets. While the mean difference between RMSE and NMAD values is 0.4 m for the UAV dataset, it is more than 1.7 m for the other datasets. The highest differences between RMSE and NMAD can be observed for the TLS point cloud dataset, with a mean difference of 4.8 m.
The lowest accuracies were detected for the 30 m ASTER GDEM; they also tend to be lower than the calculated accuracies of both DEMs with 90 m GSD. The calculated RMSEs for the ASTER GDEM V3 are between 5.7 m and 10.9 m and the NMAD ranges between 5.8 m and 9.5 m. The smallest differences between RMSE and NMAD are detectable for the SRTM 30 m dataset, with a relatively small range between 4.8 m and 6.0 m (RMSE) and 3.2 m and 4.6 m (NMAD).
In contrast, for the 90 m TanDEM-X it is noticeable that the discrepancy between the calculated RMSE and NMAD values is rather high, especially for all three ICESat datasets. The RMSE ranges between 6.4 m and 13.9 m, the calculated NMAD between 2.0 m and 6.

Terrain-Dependent Accuracies
The results depicted in Figure 6 show the RMSE and NMAD of all digital elevation models calculated with the ICESat reference dataset according to their terrain ruggedness index. Error values were calculated here only for datasets with at least 10 available elevation points in more than one class. All other datasets were not considered.
For most elevation models, the diagram shows only a slight increase of error from class 'level' to class 'highly rugged'. Both DEMs with a 90 m GSD show a stronger increase of uncertainty from 'intermediate rugged' to 'extremely rugged' terrain. For all elevation models, the biggest loss in accuracy is visible in the category 'extremely rugged'. The 90 m TanDEM-X in particular shows a very high accuracy in level terrain, similar to the 12 m TanDEM-X, but in rough terrain its accuracy decreases more than for all other DEMs and is even lower than for the SRTM 90 m DEM in 'extremely rugged' terrain. The freely available ALOS W3D dataset shows a better overall accuracy and terrain independence than the other 30 m DEMs. The lowest accuracies were detected for the 30 m ASTER GDEM. Only in the category 'extremely rugged' do the 90 m TanDEM-X and the 90 m SRTM show similar error values.
With regard to terrain ruggedness, the highest accuracies were detected for the high-resolution Pléiades S DEM, both in flat and rough terrain. For the Pléiades Rio Loa W and Pléiades Badlands E DEMs, a strong increase of RMSE is detectable in the category 'extremely rugged', whereas the RMSE is rather low for these DEMs in all other categories. Furthermore, the NMAD of these two elevation models in the highest category is also rather low and does not show such an increase of error.
Figure 7 shows the RMSE and NMAD of all elevation models according to their slope. All values were calculated against the ICESat reference dataset, and only datasets with at least two classes with more than 10 reference points are considered here. Similar to the results for the TRI classes, an increase of RMSE and NMAD values for steeper slopes is observable. For DEMs with coarser GSD, a stronger decrease of accuracy is detectable with rising slope. Generally, the local DEMs derived from Pléiades and SPOT scenes achieved the highest accuracy values, although the curves indicate a slightly higher accuracy for Pléiades datasets than for SPOT scenes. For very steep slopes, only the TanDEM-X with a GSD of 12 m achieves accuracy values (RMSE: 9.7 m, NMAD: 6.0 m) similar to those of the very high-resolution DEMs, which have an average RMSE of 6.6 m and NMAD of 6.1 m here.
Figure 8 depicts the RMSE and NMAD for all globally available DEMs according to their respective TPI landform class. For all elevation models, the results show the lowest RMSE and NMAD values for the class 'plains'. All other classes achieved significantly lower accuracies. The highest error values were determined for the landform classes 'gully', 'drainage' and 'ridge'. The highest accuracies are calculated for the 12 m TanDEM-X.
For the very high-resolution Pléiades and SPOT scenes, varying differences are observable. Especially for SPOT Paposo N, the results show relatively large differences, which are greater than the differences of TanDEM-X 12 m and ALOS W3D. For SPOT 120 and Pléiades S, almost no deviation of the results was measured.
A visual interpretation on a local scale shows that height differences are mainly affected by small-scale landforms with large height differences over small areas. Figure 10 shows for two example sites that DEM heights in depressions tend to be higher than the heights of the reference elevation data. In contrast, the heights of ridges and summits tend to be lower than in the reference data. This effect is stronger for the TanDEM-X 12 m DEM than for the 5 m Pléiades S elevation model. None of the DEMs with a coarser GSD was able to depict the sample canyons sufficiently.
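The slope-dependent evaluation reported in this section can be approximated by grouping the height differences at the reference points by slope class and computing RMSE and NMAD per class. The sketch below is a hedged illustration, not the authors' workflow. The class limits follow the five slope classes given in the text; the function name, the dictionary layout and the treatment of values exactly on a class boundary are assumptions. The default of 10 points per class loosely mirrors the minimum point count mentioned above.

```python
import numpy as np

# Slope class limits in degrees, taken from the five classes named in the text.
SLOPE_EDGES = [5.0, 15.0, 25.0, 35.0]
SLOPE_LABELS = ["flat", "gentle", "moderate", "steep", "extreme"]

def errors_by_slope_class(dh, slope_deg, min_points=10):
    """Return RMSE, NMAD and point count per slope class.

    dh        : elevation differences (DEM minus reference) at reference points
    slope_deg : slope in degrees sampled at the same points
    Classes with fewer than min_points samples are skipped (assumed rule).
    """
    dh = np.asarray(dh, dtype=float)
    slope_deg = np.asarray(slope_deg, dtype=float)
    # digitize assigns 0 for slope < 5, 1 for 5 <= slope < 15, ..., 4 for slope >= 35.
    class_index = np.digitize(slope_deg, SLOPE_EDGES)
    results = {}
    for i, label in enumerate(SLOPE_LABELS):
        sel = dh[class_index == i]
        if sel.size < min_points:
            continue
        rmse = float(np.sqrt(np.mean(sel ** 2)))
        nmad = float(1.4826 * np.median(np.abs(sel - np.median(sel))))
        results[label] = {"rmse": rmse, "nmad": nmad, "n": int(sel.size)}
    return results
```

The same grouping could be applied to TRI or TPI classes by swapping the class limits and labels, provided the terrain parameter is sampled at the reference point locations.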

Discussion
The overall accuracies achieved for the global DEMs are within the expected range when compared with findings from other studies. As the relief of the study area represents a cross section from flat to very steep landscapes, the overall accuracies can be taken as an average error value over a broad range of terrain. Thus, the achieved overall accuracies are lower than those reported by studies in mostly flat terrain [26,45,56,58]. However, the results of these studies mostly fit well with the results achieved here in flat landscapes. Likewise, the overall accuracies calculated in this study are generally higher than the findings of many studies with predominantly undulating to very steep terrain conditions [51,55,57,61]. Nevertheless, those results are also consistent with the error values calculated here for very steep terrain.
Of all globally available elevation datasets, only the 12 m TanDEM-X was able to achieve accuracies similar to those of the local DEMs derived from Pléiades and SPOT imagery. Small-scale analyses also show that this DEM is able to depict most terrain features, in contrast to the other global DEMs. The results are generally consistent with findings of other studies, which also showed that the accuracy of the new TanDEM-X generally outperforms the accuracy of ASTER GDEM and SRTM [56,58-60]. Only the freely available ALOS W3D dataset was able to achieve similar results, with only slightly lower overall accuracies. Among all 30 m DEMs, ALOS W3D seems to be superior to SRTM and ASTER, which agrees with similar findings from other studies [50,99]. It shows good agreement on both evaluation scales and is more stable over all terrain types, slopes and landforms, being only slightly worse than the TanDEM-X 12 m dataset. In mountainous areas with steep slopes in particular, the performance of ALOS W3D seems to be far superior to that of the other 30 m elevation models. This is probably because it is resampled from a higher-resolution dataset, so that more terrain features remain in the 30 m elevation data. Furthermore, optical imagery is often less affected by relief distortions due to usually small viewing angles. Nevertheless, the goal of 5 m vertical accuracy for ALOS W3D can only be reached here in flat to undulating terrain. In very steep terrain the uncertainties are still higher.
Possibly, the high accuracy values of TanDEM-X and ALOS W3D compared to the ICESat dataset are affected by the fact that ICESat points were already used for quality assessments during the generation process of both elevation models [100-102]. Thus, some correlation between those DEMs and the evaluation dataset cannot be excluded. However, the evaluation with completely independent elevation data from UAV and TLS measurements produced similar results, and a significant positive influence of ICESat data on the accuracy of both DEMs cannot be observed here.
The ASTER GDEM achieved the lowest overall accuracies of all global datasets. Furthermore, except for very steep terrain, its accuracies seem to be lower than the accuracies of both 90 m DEMs. Numerous studies showed that the previous second version of the ASTER GDEM achieved the least accurate terrain representation compared to other freely available DEMs [41,43,103,104]. Although a direct comparison of the latest two ASTER GDEM versions was not conducted here, the results indicate that the latest update of the ASTER GDEM also does not reach the accuracies of the other global elevation datasets.
For all DEMs, a decrease in accuracy in rougher terrain compared to flat landscapes can be observed. This effect is particularly strong for elevation models with coarser GSD, which show a larger decrease in accuracy than high-resolution elevation models. For the 90 m TanDEM-X in particular, which is about as accurate in flat terrain as the 12 m TanDEM-X elevation model, a very large drop in accuracy can be observed in rougher terrain with steep slopes. A similar trend was also observable for the SRTM 90 m DEM, but the decrease in accuracy is even larger for the 90 m TanDEM-X. This is in accordance with the findings of Altunel [47], who already noticed some overestimations of the 90 m TanDEM-X for cliffy terrain and a high accuracy in flat areas. Generally, a comparison of the two 90 m elevation datasets leads to the conclusion that TanDEM-X is significantly more accurate in flat landscapes, but the SRTM 90 m still seems advantageous in steeper relief. Furthermore, the difference between the calculated RMSE and NMAD is conspicuously high for the TanDEM-X 90 m dataset. It can be assumed that more outliers exist in this DEM in its first version than in the other global elevation models, which have already been revised several times. Nevertheless, the results lead to the conclusion that both DEMs with 90 m GSD are not suitable for accurate large-scale terrain analyses, especially in rough landscapes. Likewise, the results for the ASTER GDEM show that the accuracy of this DEM in flat landscapes is already lower than the accuracy of high-resolution DEMs in rough terrain. Thus, the results indicate that this DEM is the least suitable for geomorphometric analyses in this area.
The results for the local DEMs derived from stereo satellite imagery show a varying overall accuracy, which highly depends on the topography of each scene. Relating them to slope or TRI yields, in most cases, lower error values for each terrain category than for the globally available elevation models. Furthermore, the accuracies of DEMs derived from Pléiades imagery seem to be slightly higher than those of DEMs derived from SPOT imagery.
For most DEMs, similar accuracy values were obtained with the different reference datasets. However, for some DEMs anomalies could be detected. For example, the Pléiades S scene achieved relatively low accuracies compared to the TLS point clouds, whereas the error values were significantly lower for all other reference datasets. This is mainly caused by the different locations of the reference datasets, even in this small area. While all ICESat datasets and the UAV data mostly cover flatter areas, the TLS data is situated on a hillside with relatively steep slopes. Therefore, it can be assumed that for this DEM, too, the error values are much higher in steeper areas than in flatter landscapes. The relatively high error values for the Pléiades Rio Loa W and Pléiades Badlands W DEMs can also be explained by very steep relief conditions. Both scenes and the evaluation data cover the Rio Loa canyon, which is extremely steep at this point, with average slope angles of more than 30°. This steepness possibly produces more outliers, which are reflected in the RMSE values, whereas the NMAD values are much lower.
In contrast, the large differences in the overall accuracy results of the Pléiades Shoreline DEM between the ICESat-2 ATL03 and ATL08 reference data cannot be explained by the relief, as both point datasets cover the same track. Therefore, it can be supposed that these differences originate in the reference dataset. A similar contradiction is also evident in the results of the SPOT Paposo S DEM. However, in contrast to the results of the Pléiades Shoreline DEM, there is also a large difference between the ICESat-2 ATL03 RMSE and NMAD. While the RMSE is very high at 15.1 m, the NMAD is extremely low at 1.0 m. This possibly indicates that a large number of outliers exists in the ICESat-2 ATL03 dataset at this location.
It is conspicuous that for most DEMs the highest error values were calculated with the ICESat-2 ATL08 dataset, and that these values are often not in line with the error values calculated with the other datasets. It can be assumed that the interpolated ATL08 heights are less suitable for DEM accuracy assessment. Thus, it is likely that the values calculated with the ATL08 dataset at this early stage overestimate the error of the DEMs. In contrast, the ICESat-2 ATL03 results mostly fit well with the results from the other reference datasets.
The TLS raster data produced higher error values than the point clouds from TLS measurements. This is possibly caused by the height interpolations of some areas, which were not covered during the measurement process. During the generation process, it was not completely possible to exclude all of these areas and some small areas with probably lower accuracy remained for the evaluation process.
A comparison of elevation differences on a local scale shows that the heights of small incised canyons are overestimated at the bottom and underestimated at upper elevations. The results reveal that even for the very high-resolution DEMs a slight reduction in the depth of such a canyon is detectable. This loss of depth increases with coarser resolution.

Conclusions
In this contribution, the accuracy of a multitude of digital elevation models was evaluated against various reference datasets. Furthermore, the influence of terrain on the accuracy of these DEMs was analyzed by relating the accuracy values to several extracted terrain features and landforms on a regional scale. The results reveal that the rougher and steeper the landscape is, the higher the resolution needed to depict it accurately. For instance, the 90 m TanDEM-X elevation model showed eight times higher RMSE values in terrain with steep slopes (25-35°) than in landscapes with flat slopes (0-5°). Thus, an average rise of about 5 m RMSE per 10° slope can be assumed for this DEM. In contrast, for the 12 m TanDEM-X the error in steep terrain (25-35°) is only four times as high as in flat landscapes with slopes of less than 5°, and an average rise of 1.5 m RMSE per 10° slope can be assumed here. The results of the very high-resolution DEMs from Pléiades and SPOT satellites reveal that the RMSE increases by about 1 m per 10° slope. Therefore, an increase of error by about three times in terrain with steep slopes compared to flat landscapes could be expected for these DEMs. Hence, for analyses in flatter landscapes a 30 m or 90 m DEM could possibly be sufficient. If the relief is steeper, only high-resolution DEMs show satisfying accuracies. This applies to regional coverage, but even more to analyses on a local scale with smaller landforms. Furthermore, the presented results are valid for regions with almost no vegetation cover. It cannot be stated here how the accuracies of different DEMs are affected by vegetation and to which degree an increase of error is to be expected in densely vegetated areas.
The results of this study point out that, of all globally available datasets, only the TanDEM-X 12 m and partly the 30 m ALOS World 3D are able to depict the landscape with an accuracy comparable to that of very high-resolution DEMs with a GSD of 5 m. Thus, it can be assumed that the 12 m TanDEM-X data are suitable not only for global-scale analyses but also have a sufficient accuracy for local-scale analyses in flat to moderately sloped landscapes. Only in landscapes with very steep terrain do they seem to be less accurate than DEMs derived from Pléiades and SPOT imagery. All other freely available global elevation models were not able to achieve promising accuracies here and seem less suitable for delineating small terrain features at large scales.
Furthermore, it can be concluded that most reference datasets from different sources produced coherent values. Only the ICESat-2 ATL08 dataset seems to significantly underestimate the accuracy, especially of the local-scale DEMs.

Appendix A
Table A1. Overall accuracies of all digital elevation models compared to the reference point datasets of Ice, Cloud and Land Elevation Satellite (ICESat), ICESat-2 ATL03, ICESat-2 ATL08 and Terrestrial Laser Scanning (TLS) point clouds as well as raster datasets derived from unmanned aerial vehicle (UAV)-based photogrammetry and TLS datasets. Listed are the ground sampling distance (GSD), the calculated root mean square error (RMSE), the normalized median absolute deviation (NMAD) and the total amount of applied points or raster for each DEM.