Digital Elevation Models: Terminology and Definitions

Abstract: Digital elevation models (DEMs) provide fundamental depictions of the three-dimensional shape of the Earth’s surface and are useful to a wide range of disciplines. Ideally, DEMs record the interface between the atmosphere and the lithosphere using a discrete two-dimensional grid, with complexities introduced by the intervening hydrosphere, cryosphere, biosphere, and anthroposphere. The treatment of DEM surfaces affected by these intervening spheres depends on their intended use and on the characteristics of the sensors that were used to create them. DEM is a general term, and more specific terms such as digital surface model (DSM) or digital terrain model (DTM) record the treatment of the intermediate surfaces. Several global DEMs generated with optical (visible and near-infrared) sensors and synthetic aperture radar (SAR), as well as single/multi-beam sonars and products of satellite altimetry, share the common characteristic of a georectified, gridded storage structure. Nevertheless, not all DEMs share the same vertical datum, not all use the same convention for the area on the ground represented by each pixel in the DEM, and some of them have variable data spacings depending on the latitude. This paper highlights the importance of knowing, understanding and reflecting on the sensor and DEM characteristics and consolidates terminology and definitions of key concepts to facilitate a common understanding among the growing community of DEM users, who do not necessarily share the same background.


Introduction
Digital topography, expressed as a digital elevation model (DEM), has been embraced by a wide variety of disciplines and communities including geomorphometry, cryospheric and soil sciences, precision agriculture, defense, sport, tourism, telecommunications, land planning, hydrology, natural hazards, remote sensing, and even video games. In addition to the users, the data producers include scientists, government mapping agencies, and commercial vendors. These user communities and data producers do not read the same literature and consequently many have adopted various definitions for digital topography and other digital representations (see Section 3.1) of topographic surfaces, as well as their derivatives. This paper aims to consolidate the definitions as agreed upon by participants of the 2019 Joint Research Centre (JRC) meeting and the Digital Elevation Model Intercomparison eXperiment (DEMIX) working group [1], which began work in 2020.
The paper starts with a definition of surface types. We separate physical phenomena into six distinct categories, known in geospatial sciences as "spheres" resulting in a set of complete, mutually exclusive object classes based on observable properties of the contained matter (e.g., lithosphere, hydrosphere, and atmosphere). The defined spheres have surfaces (or boundary layers) along which they interface with other spheres in their surroundings. These surfaces are defined as "real surfaces" in accordance with Florinsky [2] and categorized according to the adjacent spheres.
These physical "real" surfaces that can be referenced by elevation have limited practical value if they are too complex for rigorous mathematical handling [3]. This is overcome by simplifying and mathematically representing "real" surfaces as "topographic" surfaces [2]. These topographic surfaces have been chosen to be the underlying concept of every DEM and its immediate derivatives. Mathematical methods for calculating morphometric variables such as slope and aspect are collated and scrutinized according to the established definitions and translated into algorithms suitable for DEMs, including their extension to grids in latitude and longitude, taking into account the differences in data spacing with non-rectangular pixels.
The term "model" within DEM has multiple meanings, and different communities interpret it differently. The two most distinct interpretations consist of a data model in raster form, and a mathematical model. The natural world is a physical terrain, which can be conceptualized and abstracted into a nominal terrain that can be modeled either as an empirical grid, or as a formal mathematical model. In a computer science or data science view, "model" refers to a data storage model, and in the case of a DEM refers to a grid, matrix, or array data structure, all of which have the same meaning for different user communities. A DEM in this context is a very convenient and efficient way of storing and analyzing digitized elevation data as opposed to storing the elevations as lists of vectors/coordinates. A data elevation model merely represents the terrain, with no assumptions about its mathematical properties. The first developments of digital terrain models defined them in 1958 as "a statistical representation of the continuous surface of the ground by a large number of selected points with known xyz coordinates" [4]. These were not mathematical models, since they made no assumptions about the nature of the surface, but soon after their introduction they were being used to help create physical 3D terrain models [5]. Purely physical terrain models have a long history, and they have a scale based on the actual detail incorporated into the model. The original digital terrain model definition would comprise rasters (grids), Triangular Irregular Networks (TIN), and point clouds.
A mathematical elevation model consists of a series of analytical expressions to represent a surface, with coefficients, and these expressions can be integrated and differentiated rigorously [2,6-8]. Widely used mathematical models with similarities to elevation models include the World Magnetic Model (WMM) [9] and the Earth Gravitational Model (EGM) [10], which consist of coefficients and source code to compute magnetic and gravity values at any desired location up to a resolution defined by the coefficients. These models, which are not DEMs because they do not represent the Earth's surface, have a scale determined by knowledge of the relevant geophysical field and by the number of coefficients chosen for the representation, which together determine the level of detail and smoothing.
Some users prefer to consider a DEM as a mathematical model because such representations are favorable for interpolation, generalization and denoising, and computation of derivative geomorphometric variables [2]. These mathematical modeling approaches have not been widely adopted, mainly because they are not easily implemented in most available GIS software.
The term DEM has been accepted across a wide range of disciplines as a GIS raster format for elevation, and its use of "model" agrees with common meanings of model. If a fully mathematical model of topography becomes available, with both data and software to manipulate it, a new name will be required, but it is premature to provide a name for something that does not exist and would only confuse DEM users in many disparate fields.

Spheres and Interfaces
For the purpose of defining different surfaces that can be represented by elevation models, we introduce the concept of "spheres" as a collective of (physical) matter which share specific properties [11]. On planet Earth and some other solid celestial bodies, two of these spheres can be considered ubiquitous and persistent, meaning they exist at any geographical location and at any point in time (at least for the purpose of terrestrial elevation models). These are:
• Lithosphere: The rigid outer layer of planet Earth and other solid celestial bodies. Although not consistently depicted as such in the literature, for the purposes of elevation models, the lithosphere is considered to include soils ("pedosphere").
• Atmosphere: The layer of gases, commonly known as air, including suspended liquid and solid particles known as aerosols (including dust, clouds, snow, and ice).
Because of their ubiquity and relative permanence, the upper boundary of the lithosphere and the lower boundary of the atmosphere can be defined as unique, globally contiguous surfaces, with the caveat that caves and overhanging cliffs need to be generalized.
The following spheres are temporary and local, i.e., they may or may not appear at any given location or point in time. For the purpose of elevation models, they are only considered where they occur between the upper surface of the lithosphere and the lower surface of the atmosphere (and have interfaces with at least two other spheres).

• Hydrosphere: The masses of liquid water, such as oceans, lakes, and rivers;
• Cryosphere: The masses of frozen water, such as sea ice, glaciers, and snow cover;
• Biosphere: The masses of living organisms, such as vegetation and animals, including dead but still connected parts such as trunks or branches;
• Anthroposphere: The masses consisting predominantly of matter processed by humans, such as wood, bricks, concrete, asphalt, glass, or plastics.
Digital topography records the surface separating the lithosphere from the atmosphere (or space for atmosphere-free solid celestial bodies). Challenges arise when intermediate layers such as the hydrosphere, cryosphere, biosphere and anthroposphere intervene (Figure 1 and Table 1), and these layers may not be handled consistently even within a single elevation model. For example, ETOPO1 [12] removes sea ice from the oceans but leaves the continental ice sheets for Greenland and Antarctica. It also removes the oceanic hydrosphere and some lakes (e.g., the Caspian Sea and the North American Great Lakes), but not the lakes in the East African Rift Valleys. The treatment of the biosphere and anthroposphere adds to the challenge of picking the desired surface and accurately recording it. In addition, the particular technology used to collect the topographic data also influences which surface is captured. For example, radar signals might penetrate the vegetation canopy to some degree, depending on the band or wavelength used; laser altimeters can measure distinct intermediate surfaces in the vegetation structure; and the raw data from stereophotography represent the top of the canopy.
Any DEM must balance the ideal surface that users need with the limitations of the measurement system and the first observable surface that can be detected. A DEM will always be an abstraction of the real surface, and in some cases will approximate a ground surface without man-made features such as buildings. Removal of the hydrosphere and cryosphere may or may not be possible, or desirable, depending on the end use of the DEM. Derived grids, such as ice thickness or vegetation canopy height, are not DEMs but fill many specialized needs.
A DEM requires a coordinate system and a reference frame, with horizontal, vertical, and temporal components, which need to be specified in the metadata. Datums are defined at different scales (global, regional, national, or local) and timeframes, with each datum ideally assigned a unique European Petroleum Survey Group (EPSG) code. The horizontal datum, generally WGS84 or an equivalent, determines how the latitude and longitude coordinates map to the Earth's surface. The vertical datum defines the zero elevation and can be expressed in terms of an ellipsoidal or a geoidal (mean sea level) reference frame; the two can differ by up to about 100 m (see Figure 2). Geoidal datums can be global, such as EGM2008 [10], continental, national, or local. The temporal component reflects when the data were acquired, which can be almost instantaneous (e.g., the Shuttle Radar Topography Mission, SRTM, was collected in less than two weeks), or the collection can extend over a significant period of time during which the Earth's surface could have changed. Over time, the land surface can change through natural or human activity and even plate tectonics, which creates measurable displacements. Previous documents summarizing topographic surfaces, and the resulting definitions for digital topography, include [2,3,13-17] and form the background for our definitions.
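The relation between the two kinds of vertical reference frame can be written compactly. In the sketch below, the symbols h (ellipsoidal height), H (orthometric height above the geoid), and N (geoid undulation, i.e., the height of the geoid above the ellipsoid) are conventional geodetic notation introduced here for illustration only; they are not used elsewhere in this paper.

```latex
h \approx H + N, \qquad |N| \lesssim 100\ \text{m}
```

Under this approximation, converting a DEM from a geoidal to an ellipsoidal vertical datum amounts to adding, at each pixel, the undulation given by a geoid model such as EGM2008.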
A large majority of the participants in the 2019 JRC meeting and the 2020 DEMIX exercise agreed with the definitions below, treating DEM as a general term, as do the general public (e.g., [18]), commercial data providers, and mapping agencies in Western Europe that supply both digital surface model (DSM) and digital terrain model (DTM) products (e.g., United Kingdom, Norway, Netherlands, Denmark). These definitions seek to provide clear explanations that match current usage as much as possible. They match the usage for the Shuttle Radar Topography Mission (SRTM) DEM ([19]; 752,000 results on Google and 50,600 on Google Scholar in April 2021), which is actually an SSG (sensor surface grid) that approximates a DSM (see definitions in Section 3.3). The use of the term DEM does not match usage by the U.S. Geological Survey (USGS), which produces a DTM but calls it a bare-earth DEM [20], or the Spanish Instituto Cartográfico Nacional, which includes lidar point clouds within the DEM types [21]. From these semantic differences, it is clear that providers and users of DEMs must clearly understand what the datasets intend to represent.

Basic Geometric Definitions
• Height: Distance of a point from a chosen reference surface, positive upward along a line perpendicular to that surface (Figure 2) [22]. A height below the reference surface will have a negative value. Without one of the specific descriptors below, height is an informal term which will most often be interpreted as an orthometric height or the vertical size of a feature such as a tree or a building. To avoid ambiguity, the description must include the reference surface, since even minor differences in these surfaces can have a significant impact on analysis.
• Elevation: Informal equivalent to height, which will most often be interpreted as an orthometric height.
• Depth: Distance below the surface of a body of water, with an implicit negative sign, referenced to mean sea level or a local lake or river datum. When the body of water is the ocean, the depth is also an elevation but with an explicit negative sign.
• Surface: A surface in the context of topography is a geographic feature that marks the (uppermost) boundary layer (in the gravitational direction) between two spheres as defined earlier. Figure 1 shows several types of real surfaces for the Earth's lithosphere, hydrosphere, and cryosphere. These real surfaces are too complex for rigorous mathematical treatment because they are not smooth and regular [2,3]; they are therefore approximated by topographic surfaces.
• Topographic surface: The topographic surface is a closed, oriented, continuously differentiable, two-dimensional manifold (S) in the three-dimensional Euclidean space (E³). Five key characteristics (constraints in a mathematical sense) of topographic surfaces include: (1) single-valued (caves and overhanging cliffs are not allowed); (2) smooth, with the topographic surface having derivatives of all orders; (3) uniform local gravity, approximated by a plane; (4) planar size limitedness, so that Earth or planetary curvature can be ignored in computations; and (5) scale dependence (non-fractality and any fractal component is noise) [2,16].
• Grid: A network composed of two or more sets of curves in which the members of each set intersect the members of the other sets in an algorithmic way [22]. The curves partition a space into grid cells. A grid can be understood as a regular network of grid nodes (points at which curves intersect) or a mesh of grid cells (areas which are enclosed by curves). In practical terms, a grid is an efficient way of storing and accessing digital data.
• Grid spacing: The horizontal distance between neighboring samples in a grid. It is most commonly given in either meters or arc seconds, and is generally, but not always, the same in the x and y directions.
• Spatial resolution of gridded data: The horizontal dimensions of the smallest feature detectable by the sensor, as modified by the gridding procedure, generally given in meters.
• Sparse grid: A grid in which not all nodes or cells have values attached to them. These missing values or "voids" need to be filled using interpolation or be treated separately in grid operations.
• Area-based grid: In this type of grid sampling, the values stated are representative for the entire area of the grid cell to which they refer (see Figure 6A). They can generally be assumed to be close to the median or (weighted) average of the original distribution of values within a given cell and its immediate surroundings. In this case, the spatial extent of the measurement is on the order of the sampling distance or even slightly larger ("oversampling", as shown in Figure 3). This is usually the case for DEMs based on technologies such as InSAR and photogrammetric techniques, including Satellite Pour l'Observation de la Terre (SPOT)-derived DEMs, and all the DEMs discussed in Table 1. As we will discuss in Section 4, the sampling strategy must be differentiated from the grid storage format.
• Point-based grid: In this type of grid sampling, the values stated are only representative for the grid node with which they are associated (see Figure 6B). Point-based grids could be based on ground surveys, but this is no longer a common production method for DEMs.
• (Geo)Rectified grid: Grid for which there is an affine transformation between the grid coordinates and the coordinates of an external coordinate reference system [23]. If the coordinate reference system is related to the Earth by a datum, the grid is a georectified grid.
• Pixel reference point: The single point that can represent the pixel for DEM manipulation. For point-based grids, this is the point, while for area-based grids, it is the pixel centroid.
• Irregular networks: These are networks which do not qualify as grids because they lack algorithmic regularity. An example is a TIN, which can be constructed from any set of nodes (points), such as those resulting from data collection with an irregular distribution of ground survey points, topographic cross-sections, or a single-beam bathymetric survey. They can be used to produce regular grids by interpolation or extrapolation.
• Tile: A rectangular representation of geographic data, often part of a set of such elements, covering a tiling scheme and sharing similar information content and graphical styling. Tiles are mainly used for fast transfer and easy display at the resolution of a rendering device [24]. Tile boundaries are usually parallels and meridians, similar to the map quadrangles used for paper maps from national mapping agencies. Distribution files are named for the tile, and generally use the SW corner location in the DEM. For the quasi-global DEMs, the tile size is usually 1° × 1°.
• DEM bounding box: The smallest rectangle that will contain all pixel reference points in the DEM in a point-based grid, and all the pixel areas in an area-based grid. Some software, notably GDAL, adds a half-pixel buffer to create the bounding box for point-based DEMs such as SRTM.
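The bounding box and tile definitions above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any particular software: the function names, the argument layout, and the "N45E006" naming pattern (borrowed from the style of SRTM distribution files) are chosen here for illustration, and the grid is assumed to be north-up with equal spacing in latitude and longitude.

```python
import math


def dem_bounding_box(sw_lat, sw_lon, n_rows, n_cols, spacing_deg, pixel_is_point):
    """Return (south, west, north, east) in degrees.

    sw_lat, sw_lon: pixel reference point of the south-west pixel.
    For a point-based grid the box contains the reference points only;
    for an area-based grid it is extended by half a pixel on every side
    so that it contains the full pixel areas.
    """
    north = sw_lat + (n_rows - 1) * spacing_deg
    east = sw_lon + (n_cols - 1) * spacing_deg
    if pixel_is_point:
        return (sw_lat, sw_lon, north, east)
    half = spacing_deg / 2.0
    return (sw_lat - half, sw_lon - half, north + half, east + half)


def tile_name(sw_lat, sw_lon):
    """Name a 1-degree tile for its SW corner (naming pattern is an assumption)."""
    ns = "N" if sw_lat >= 0 else "S"
    ew = "E" if sw_lon >= 0 else "W"
    return f"{ns}{abs(math.floor(sw_lat)):02d}{ew}{abs(math.floor(sw_lon)):03d}"
```

With illustrative 30′ (0.5°) pixels, a point-based tile of 3 × 3 nodes starting at (45°, 6°) and an area-based tile of 2 × 2 cells whose first centroid is at (45.25°, 6.25°) both yield the same 1° × 1° bounding box.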

Topographic Grid Surface Definitions
• DEM (digital elevation model): General term for a digital representation of elevations (or heights) of a topographic surface in the form of a georectified point-based or area-based grid, covering the Earth or other solid celestial bodies. Currently, the most common DEMs use rectangular grids ("arrays") and raster image file storage formats. Alternative structures for digital topography, such as triangulated irregular networks (TINs), contours, and point clouds, are not DEMs as defined here because they are not grids.

Definitions of Specific (Topographic) Surfaces
• DSM (digital surface model): A DEM that records the lower boundary of the atmosphere (and either the lithosphere, hydrosphere, cryosphere, biosphere or anthroposphere) (Figure 4).
• DTM (digital terrain model): A DEM that records the boundary between the lithosphere and the atmosphere, without biosphere and anthroposphere, also called a "bare-earth" DEM. The treatment of hydrosphere, cryosphere, and voids (e.g., excluded buildings, water and trees) must be specified and clearly localized, e.g., by respective masks (Figure 4). Removing elements of a DSM creates prominent voids and artefacts whose values need to be interpolated, and consequently adds uncertainty to the studied processes.
- NVS (non-vegetated surface): An NVS might better reflect the needs of some applications, and it can also be easier to compute because it merely selects the lowest point in each pixel and does not require building/vegetation classification and hypothesizing a surface below. The ability to create an NVS will depend on the collection method and data resolution [25] (Figure 4). For instance, removing vegetation can reveal archeological structures hidden by vegetation, and archeologists have called this a Digital Feature Model (DFM) [26].
- NUS (non-urbanized surface): A DSM that excludes the anthroposphere but includes the biosphere (mainly as a closed vegetation canopy). An example application is the creation of surfaces used for the orthorectification of high to very high resolution satellite images [27]. The top-reflective height information of a DSM in the anthroposphere can lead to distortions in the rectified satellite images. Transforming these regions to bare-ground-like height information ensures a better interpretation of the high to very high resolution satellite images, at the cost of geolocation accuracy for the tops of tall urban elements. For dense canopy areas in the image, this effect is of less importance, and conserving the canopy top surface therefore favors a more realistic representation with high geolocation accuracy.

Definitions of Terrain Representations That Are Not DEMs
DEMs can be used to generate a range of other layers that do not represent elevations and thus are not DEMs. Examples of such DEM derivatives include:
• Water depth in rivers, lakes, or oceans as the difference between the DSM over water bodies (hydrosphere) and the bathymetric surface (lithosphere);
• Ice thickness of glaciers or ice sheets as the difference between the DSM over ice bodies (cryosphere) and the subglacial topography (lithosphere);
• Geomorphometric surfaces, such as slope, aspect, several types of curvature, and hill-shading;
• Landscape parameters such as drainage basin area and upslope contributing area.
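The first two derivatives in the list are simple cell-by-cell differences between two co-registered grids. The following minimal sketch shows the idea in plain Python; the function name, the list-of-lists grid representation, and the void marker of -32768 are assumptions made for illustration, and both grids are assumed to share the same georeferencing and vertical datum.

```python
import math

NODATA = -32768  # assumed void marker; real products document their own value


def grid_difference(upper, lower, nodata=NODATA):
    """Subtract two co-registered grids cell by cell.

    upper, lower: lists of rows (lists of numbers) with identical shape,
    e.g. a DSM over water and the bathymetric surface below it.
    Cells that are void in either input become NaN in the result.
    """
    if len(upper) != len(lower) or any(
        len(row_u) != len(row_l) for row_u, row_l in zip(upper, lower)
    ):
        raise ValueError("grids must have identical shape")
    return [
        [
            math.nan if (up == nodata or lo == nodata) else up - lo
            for up, lo in zip(row_u, row_l)
        ]
        for row_u, row_l in zip(upper, lower)
    ]
```

The result is a thickness or depth grid rather than a DEM, since its values are no longer elevations relative to a vertical datum.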

Point-Based versus Area-Based DEMs
The terms point-based and area-based DEM can refer to three very different things: (1) how the raw data are sampled, (2) how the data are stored, and (3) how the data are displayed. Disentangling these three can be complex.

DEM Sampling Methods
In area-based sampling, each point in the DEM is collected such that it is representative (e.g., weighted mean, median) for an interval (or cell) that has about the same nominal extent (usually slightly larger) as the interval between adjacent samples. This is done to prevent undersampling, which otherwise could have repercussions on the ability to resample or interpolate the data.
Except for NASADEM, all of the DEMs in Table 1 are stored in the GeoTIFF format. The 1″ grids all use area-based sampling, and the elevations from the radar or optical sensors reflect a single, integrated elevation over the sampling area. The 3″ grids vary in their sampling strategy; some average a number of 1″ cells, and others take the central value so that the points common to the 1″ and 3″ grids will have the same elevation (Figure 3). Different versions of the SRTM dataset have used both approaches to resampling.
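The two strategies for deriving a 3″ grid from a 1″ grid can be sketched as follows. This is an illustration only and does not reproduce the processing of any specific product: the function names are hypothetical, the grid is a list of rows with shared edge samples (so that 3601 samples per side become 1201), and the averaging window is simply clipped at the tile edges.

```python
def thin_by_three(grid):
    """Keep every third sample, so retained nodes keep their 1-arc-second values."""
    return [row[::3] for row in grid[::3]]


def average_by_three(grid):
    """Replace each retained node by the mean of the 3x3 window centered on it.

    The window is clipped at the grid edges (an assumption of this sketch).
    """
    n_rows, n_cols = len(grid), len(grid[0])
    out = []
    for r in range(0, n_rows, 3):
        out_row = []
        for c in range(0, n_cols, 3):
            window = [
                grid[i][j]
                for i in range(max(r - 1, 0), min(r + 2, n_rows))
                for j in range(max(c - 1, 0), min(c + 2, n_cols))
            ]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out
```

Both functions return grids of the same size, but only the thinned grid preserves the original elevations at the common nodes; the averaged grid is smoother and represents a larger ground area per value.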
Point-based techniques, such as using lidar to create a grid with spacing much larger than the point footprint, offer many options for picking the point elevation to use. OpenTopography [28] implements a number of these algorithms, and the chosen algorithm can greatly affect the resulting DEM, as can any filtering to selectively choose points. Figure 5 compares area-based and point-based approaches to creating a 90 m DEM, and the effect of shifting the pixel's starting location.
For the area-based approach (Figure 5B), the means remain centered in the point cloud, and a bilinear interpolation between the center points of the pixels produces a smooth profile that is not greatly affected by shifts of the starting location for the grid. For the point-based approach (Figure 5C), where the profile points represent the lidar value closest to the pixel center, the starting location shifts make a much larger difference in the interpolated profiles. The point-based algorithms might best be employed to create a DEM with a spacing that places a relatively small number of points in each pixel (the pixels in Figure 5 have tens of thousands of points per pixel), and then downscaling the DEM. In this case, a DSM or DTM might be easier to create.
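The contrast between the two approaches can be reproduced with a small sketch. The code below is not the algorithm used by OpenTopography or for Figure 5; it is a minimal illustration in which the function name, the (x, y, z) tuple layout, and the grid origin parameters x0 and y0 are assumptions. It bins points into square pixels and returns, for each pixel, both the mean elevation (area-based) and the elevation of the point nearest the pixel center (point-based).

```python
import math
from collections import defaultdict


def grid_points(points, spacing, x0=0.0, y0=0.0):
    """Grid a point cloud two ways.

    points: iterable of (x, y, z) in projected coordinates (meters).
    spacing: pixel size in meters; x0, y0: origin of the grid.
    Returns (mean_z, nearest_z), two dicts keyed by (col, row).
    """
    cells = defaultdict(list)
    for x, y, z in points:
        col = math.floor((x - x0) / spacing)
        row = math.floor((y - y0) / spacing)
        cells[(col, row)].append((x, y, z))

    mean_z, nearest_z = {}, {}
    for (col, row), pts in cells.items():
        # Area-based value: mean of all elevations falling in the pixel.
        mean_z[(col, row)] = sum(p[2] for p in pts) / len(pts)
        # Point-based value: elevation of the point closest to the pixel center.
        cx = x0 + (col + 0.5) * spacing
        cy = y0 + (row + 0.5) * spacing
        nearest = min(pts, key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2)
        nearest_z[(col, row)] = nearest[2]
    return mean_z, nearest_z
```

Calling the function twice with different x0 and y0 values shows how a shift of the grid origin changes both outputs; which one is more sensitive depends on the density and roughness of the point cloud.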
To distinguish between these sampling concepts, the common GeoTIFF specification [29] includes a GTRasterTypeGeoKey (#1025), which may be set to either RasterPixelIsArea (the default) or RasterPixelIsPoint. Unfortunately, nothing prevents the providers of such files from using an inappropriate flag; for example, legacy reasons force the DGED standard [30] to use RasterPixelIsPoint irrespective of whether this corresponds to the nature of the data.
Figure 6C shows the difference between the elevation storage in the GeoTIFF files for the ASTER, ALOS, and SRTM 1″ DEMs (Copernicus DEM is stored the same way as SRTM). For the purpose of visualization, the figure shows 30′ pixel sizes in a 1° tile. The red rectangle shows the extent of the tile (usually named for the latitude and longitude of the SW corner), with the other corners 1° away. The gray star symbols, computed from the GeoTIFF file, show the RasterPixelIsPoint positions for the SRTM and Copernicus elevations. The gray rectangles show the RasterPixelIsArea areas for the ASTER elevations. Because the corner coordinates of the ASTER edge pixels lie outside the nominal tile boundaries, the pixel centroids coincide with the SRTM RasterPixelIsPoint positions. ASTER elevations can therefore, in principle, be compared directly with SRTM and Copernicus elevations, because the latter are in reality area-based rather than point-based elevations, even though they are stored as RasterPixelIsPoint. Although ASTER and SRTM use different parts of the electromagnetic spectrum, their sensors sample the same areas, slightly larger than the 1″ pixels. ASTER, SRTM, and Copernicus DEMs also duplicate a row or column with each neighboring tile. For the illustrative 30′ pixels in Figure 6C, there will be 3 columns and 3 rows of data; for the actual 1″ pixels, the 1° tile will have 3601 × 3601 elevations.
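The effect of the raster type flag on pixel positions can be sketched as below. The sketch assumes a north-up raster with no rotation and a single tie point at raster position (0, 0), where RasterPixelIsArea ties the upper-left corner of the first pixel and RasterPixelIsPoint ties its center. The function name and argument order are illustrative and do not belong to any GeoTIFF library.

```python
def pixel_center(tie_x, tie_y, dx, dy, row, col, pixel_is_point):
    """Return (x, y) of the reference point of pixel (row, col).

    tie_x, tie_y: model coordinates tied to raster position (0, 0).
    dx, dy: pixel size (positive values); rows increase southward.
    pixel_is_point: True for RasterPixelIsPoint, False for RasterPixelIsArea.
    """
    offset = 0.0 if pixel_is_point else 0.5
    return (tie_x + (col + offset) * dx, tie_y - (row + offset) * dy)


# Illustrative 30-arc-minute (0.5 degree) pixels in the tile with SW corner (45N, 6E).
srtm_like = pixel_center(6.0, 46.0, 0.5, 0.5, 0, 0, pixel_is_point=True)
aster_like = pixel_center(5.75, 46.25, 0.5, 0.5, 0, 0, pixel_is_point=False)
alos_like = pixel_center(6.0, 46.0, 0.5, 0.5, 0, 0, pixel_is_point=False)
```

In this example the SRTM-like and ASTER-like conventions place the first pixel's reference point at the tile's NW corner, whereas the ALOS-like convention places it half a pixel to the east and south.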

Data Storage and Image Metadata
The ALOS pixels in Figure 6C are shown in black, with the black star for the center point displaced by a half pixel in both the X and Y directions from the corresponding points in the other DEMs. The tile has one fewer column and one fewer row compared to the other DEMs, which means that the 1° ALOS tiles have 3600 × 3600 1″ pixels. Direct comparisons of single elevations with the other DEMs are difficult, because the ground area sampled by ALOS is one quarter of the sampling area for each of 4 adjacent pixels in the other DEMs. The difference between point and area-based approaches to DEM storage is largely a convention, as defined by the Digital Terrain Elevation Data (DTED) standard from the U.S. military [31]. DTED and the formats that have followed it repeat the edge rows and columns in adjacent cells to accommodate 3601 × 3601 elevations in each 1° tile and use the RasterPixelIsPoint model. As noted above, ASTER uses RasterPixelIsArea, but through an alternative selection of the coordinates for each pixel, it has the same effective locations as the RasterPixelIsPoint DEMs. The area-based ALOS DEM has 3600 × 3600 elevations, which results in a negligible 0.06% saving in storage, but requires a half-pixel shift to align with ASTER, SRTM and Copernicus data. These pseudo point-based DEMs could drop adherence to DTED standards [31] and store only 3600 × 3600 elevations by eliminating the top row and rightmost column, and software should handle them correctly. The exception is formats such as the SRTM HGT files, which have no internal metadata about the number of rows and columns and rely on software to recognize the simple format, which has the 3601 × 3601 size hard-wired into the definition.
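Both the storage saving and the reliance on a hard-wired grid size can be checked with simple arithmetic. The sketch below assumes a headerless file of square dimensions with 2 bytes per elevation, which is how SRTM HGT files are commonly documented; the function names are hypothetical.

```python
import math


def hgt_grid_size(n_bytes, bytes_per_sample=2):
    """Infer the side length of a square, headerless grid from its file size."""
    n_samples, remainder = divmod(n_bytes, bytes_per_sample)
    side = math.isqrt(n_samples)
    if remainder or side * side != n_samples:
        raise ValueError("file size does not match a square grid")
    return side


def storage_saving_percent(with_overlap=3601, without_overlap=3600):
    """Percent of samples saved by dropping the duplicated row and column."""
    return 100.0 * (with_overlap**2 - without_overlap**2) / with_overlap**2
```

A file of 25,934,402 bytes is recognized as 3601 × 3601 samples, and dropping the duplicated row and column saves 7201 of 12,967,201 samples, which rounds to the 0.06% quoted above. A 3600 × 3600 file in the same format would only be readable by software that does not assume the 3601 size.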

DEM Altering Operations
The first steps in DEM editing attempt to remove or correct sensor-related effects, such as data voids or water-related artefacts, and will commonly be done by the data producer [17,32]. Some DEMs have metadata at the pixel level to record the source data or edit history. These layers can greatly assist in interpretation and analysis. Users should look for them and be aware that some secondary sources of DEMs strip out these metadata layers, which can more than double the size of the DEM. Evaluations of DEM quality should avoid areas that were changed to fill voids or by building or water edits, although it would be desirable to report the percentage of the area affected by void filling or building/water edits.
DEMs can be modified for particular purposes such as geomorphometry [33], sink/pit removal and hydrological flow enforcement, or flattening of water bodies or for the production of derivative products (e.g., from DSM to DTM) [17]. While the modifications will improve the DEM for a particular purpose, they could degrade its performance for other purposes by discarding valid measurements in favor of adhering to a particular landscape model. Removal of karst sinkholes to create an artificial drainage network would be one example that many DEM users would not want.
Users must be aware that resampling a DEM from one grid (e.g., geographic/unprojected) to another grid (e.g., projected) will almost always result in artefacts due to pixel shifting and cell size adjustments, which require interpolation. The interpolation technique must be chosen with great care and with knowledge of the specifics of the original data and the resampling process.
Table 1 lists 8 quasi-global DEMs with grid spacings of approximately 1 and 3 arc seconds; these are commonly and informally called 30 m or 90 m (sometimes 100 m) DEMs. Copernicus 1 and 3 arc second DEMs cover the entire land surface of the Earth; SRTM and NASADEM cover about 80% of the Earth's land area; the others cover more of the high latitude areas. All of these stop at the hydrosphere or cryosphere, so they do not include subglacial topography or the land surface below oceans, rivers, or lakes. At coarser scales, ETOPO1 (1 arc minute, [12]), SRTM 30+ (30 arc seconds, [34]), and SRTM 15+ (15 arc seconds, [35]) merge various DEMs on land with the best bathymetry, generally derived from radar altimetry. The DEMs in Table 1 share the following characteristics:
• All use the WGS84 horizontal datum;
• All use the default TIFF orientation (code 274 = 1), with the first points in the file in the NW corner. The DTED format starts with the SW corner. Copernicus 1″ and 3″ are available in both DTED and DGED formats;
• All name tiles for the SW corner (USGS NED/3DEP names for the NW corner);
• All have been reprocessed, and most have voids filled with other DEMs;
• All use area-based sampling, but the grid storage specifying the extent of each pixel's area uses both point and area conventions (Figure 6).

Implications for Applications
DEMs are regular grids, with two main types. Most local and regional DEMs use a projected coordinate system; they are plane square grids with square pixels, based on a map projection such as the Universal Transverse Mercator (UTM) or a Lambert Conformal Conic. Manipulation of these grids uses the projected coordinates, which form a Cartesian reference system. Almost all quasi-global, global, and continental DEMs use a geographic coordinate system (latitude/longitude) or spheroidal equal angular grid. This is an equirectangular projection; the pixels in these DEMs are spheroidal trapeziums and differ slightly from rectangles in terms of their linear dimensions in meters. At finer mapping scales, with DEM grid spacing up to 15 pixels, the differences between the spheroidal trapezium shapes of pixels and rectangles are insignificant for almost all computations. GIS software should use geodetic formulas to account for differences in grid spacing and correctly display the DEM in a cartographically reasonable way and correctly compute derivatives such as the slope, but this is problematic in some software.
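To illustrate why geodetic formulas matter for geographic grids, the sketch below computes the metric pixel size of an arc-second grid at a given latitude from the WGS84 ellipsoid's radii of curvature, and then uses it in a slope calculation. The central-difference slope is one common choice and is an assumption here, not a method prescribed by this paper; the function names are likewise illustrative.

```python
import math

# WGS84 ellipsoid constants (semi-major axis in meters, flattening).
WGS84_A = 6378137.0
WGS84_F = 1.0 / 298.257223563
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)  # first eccentricity squared


def pixel_size_m(lat_deg, spacing_arcsec):
    """Return (dx, dy): east-west and north-south pixel size in meters."""
    phi = math.radians(lat_deg)
    step = math.radians(spacing_arcsec / 3600.0)
    w = math.sqrt(1.0 - WGS84_E2 * math.sin(phi) ** 2)
    prime_vertical = WGS84_A / w                      # east-west radius of curvature
    meridional = WGS84_A * (1.0 - WGS84_E2) / w ** 3  # north-south radius of curvature
    return prime_vertical * math.cos(phi) * step, meridional * step


def slope_deg(window, dx, dy):
    """Slope in degrees at the center of a 3x3 window (row 0 is the northern row)."""
    dz_dx = (window[1][2] - window[1][0]) / (2.0 * dx)
    dz_dy = (window[0][1] - window[2][1]) / (2.0 * dy)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
```

For a 1″ grid, the east-west pixel size shrinks from roughly 31 m at the equator to roughly 15.5 m at 60° latitude while the north-south size changes little, so a slope computed with a single fixed spacing would be biased at high latitudes.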
The quasi-global DEMs suffer severe geographic distortion near the poles, in common with any cylindrical map projection. Various solutions have been proposed, with grids using other geometric forms [53,54] or selecting a map projection to minimize distortion and breaking it into continental regions [55]. The polar community has long faced this problem, and uses the polar stereographic projection to produce DEMs [56]. This approach requires distinct grids, with a break between the polar grids and those for low latitudes, perhaps with some overlap. New geometries for the grids require adoption by mainstream GIS software, as well as the creation of these data grids for such GIS software.
The best choice of DEM depends on the application and on the characteristics of the available products. For land cover mapping, having both a DSM and a DTM allows the creation of an nDSM or CHM, which estimates the height of the land cover (e.g., vegetation canopy height) and can be used in classification techniques. Rectification of satellite imagery ideally needs a DEM that includes the vegetation cover, since optical images map the surface of the canopy, whereas including anthropogenic elements might add artefacts to the results, depending on the grid spacing of both the image and the DEM. A hybrid DSM/DTM can be produced in those cases [57]. Telecommunication applications that rely on line-of-sight also need a DSM. Hydrological applications prefer a DTM, which may require drainage enforcement; depending on the scale, they might combine it with locally added buildings from a 3D product or from a DSM with finer grid spacing.
Resolution in the geosciences refers to the real-world dimension of the smallest observable feature. In a DEM, this could be a small canyon, a small ridge, or a boulder. Sampling theory, often named after Shannon, Nyquist, or Whittaker, states that the smallest features that can reliably be resolved in the DEM have linear dimensions of at least twice the grid spacing [58,59] and [2] (Section 3.3, pp. 104-105). In that sense, grid spacing sets a lower limit on the size of an identifiable object within that grid. The actual spatial resolution of a DEM is initially limited by the imaging and processing systems (and by the atmospheric clarity, especially for optical systems). The primary goal of resampling to a grid (gridding or "pixelization") during DEM production is to avoid wasting that spatial resolution by choosing a cell size (and grid spacing) that is too large. The secondary goal of pixelization in DEM production is to avoid producing wasteful volumes of data by selecting a grid spacing that excessively oversamples the inherent spatial resolution of the imaging system. For area-based sampling, signal theory recommends that the sampling area (over which values are integrated) be slightly larger (1.2-1.4 times) than the grid spacing (or cell size, see Section 4 and Figure 3), as this slight oversampling helps to avoid artefacts due to aliasing. Consequently, well-produced DEMs have a grid spacing that is not much different from (only slightly smaller than) the spatial resolution of the measurement device. However, grid spacing, and with it pixel size, can never be assumed to equal the spatial resolution of a DEM, and later resampling to smaller grid spacings via interpolation will always cause pixel size and DEM spatial resolution to diverge further.
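The factor of two in the sampling criterion can be illustrated numerically. The sketch below is an idealized one-dimensional example with hypothetical function names: a sinusoidal "terrain" with a 40 m wavelength, sampled at 30 m posts, yields exactly the same samples as a (sign-flipped) 120 m wave, so the short feature cannot be distinguished from a much longer one.

```python
import math


def min_resolvable_feature(grid_spacing_m):
    """Smallest reliably resolvable feature size: twice the grid spacing."""
    return 2.0 * grid_spacing_m


def sample_sine(wavelength_m, grid_spacing_m, n_posts):
    """Sample a unit-amplitude sinusoidal surface at regularly spaced posts."""
    return [
        math.sin(2.0 * math.pi * k * grid_spacing_m / wavelength_m)
        for k in range(n_posts)
    ]


spacing = 30.0                       # nominal grid spacing in metres
limit = min_resolvable_feature(spacing)

# A 40 m wave is shorter than the 60 m limit, so it aliases.
short_wave = sample_sine(40.0, spacing, 8)
# Its alias has wavelength 1 / |1/40 - 1/30| = 120 m (with opposite sign).
alias = [-v for v in sample_sine(120.0, spacing, 8)]

aliased = all(abs(a - b) < 1e-9 for a, b in zip(short_wave, alias))
print(f"limit = {limit} m, samples identical to 120 m alias: {aliased}")
```

Real terrain is not a single sinusoid, so this only demonstrates the principle; it does not replace an assessment of the actual spatial resolution of a given DEM.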
Although a smaller pixel size does not always equate to higher spatial resolution, DEMs with smaller pixels are generally preferred for most applications, as such datasets can potentially accommodate finer detail. However, there is often a trade-off between pixel size and data usability (storage, management and analyses). Resampling can upscale (create a new DEM with smaller spacing, which will generally be more generalized than a native DEM at that resolution, so the increased detail may be illusory) or downscale (the new DEM has larger spacing and less detail). Downscaling loses the extremes in the DEM, both the high topography and the valleys. Recognizing the significant challenges resulting from this loss, some of the DEMs provide auxiliary information such as the minimum, median, and maximum elevations in each cell of the averaged DEM (e.g., Global Multi-resolution Terrain Elevation Data (GMTED2010) [57]). Others provide an indication of certain classes, e.g., water, which can then be used within a resampling process that respects the water border lines [60]. The main resampling options are:
• Downscale by thinning. This ensures that the elevations at common locations in grids with different pixel sizes have the same elevation;
• Downscale by averaging. If the DEM is area-based, this preserves the statistical sampling integrity of the data, but at the cost of generalization;
• Reinterpolation using various techniques, such as bilinear interpolation, bicubic interpolation, or kriging. This allows the creation of any desired grid size, even pixels that are smaller than, but smoothed relative to, those of the original DEM. It can also reproject the data to a new map projection. Some redistributions of the global DEMs use reinterpolation to simplify and standardize their data handling.
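The difference between thinning and averaging, and the value of auxiliary minimum/median/maximum layers, can be seen on a toy grid. This is a minimal sketch in plain Python with hypothetical function names and invented elevations; it is not the procedure used to produce any of the DEMs discussed here.

```python
from statistics import median


def thin(grid, factor):
    """Downscale by thinning: keep every `factor`-th post in each direction."""
    return [row[::factor] for row in grid[::factor]]


def block_stats(grid, factor):
    """Downscale by aggregating factor x factor blocks.

    Returns a dict of coarse grids: mean, min, median and max per block.
    """
    n_rows = len(grid) // factor
    n_cols = len(grid[0]) // factor
    out = {"mean": [], "min": [], "median": [], "max": []}
    for r in range(n_rows):
        rows = {name: [] for name in out}
        for c in range(n_cols):
            block = [
                grid[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            rows["mean"].append(sum(block) / len(block))
            rows["min"].append(min(block))
            rows["median"].append(median(block))
            rows["max"].append(max(block))
        for name in out:
            out[name].append(rows[name])
    return out


# Invented 4 x 4 elevations (metres) with a 180 m peak and a 60 m pit.
dem = [
    [100, 102, 110, 140],
    [101, 103, 120, 180],
    [90, 80, 105, 107],
    [85, 60, 106, 108],
]

thinned = thin(dem, 2)
stats = block_stats(dem, 2)
print("thinned:", thinned)
print("mean:   ", stats["mean"])
print("min:    ", stats["min"])
print("max:    ", stats["max"])
```

In this example both the thinned and the averaged grids miss the 180 m peak and the 60 m pit, whereas the per-block minimum and maximum layers retain them, which is the motivation for distributing such auxiliary layers.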
The larger the change in scale, the more severe the changes in the DEM and the greater the importance of selecting an appropriate method. Figure 7 shows the changes in a DEM, in this case a DSM that averages the surface, as the grid spacing increases. The larger grid spacings produce greater averaging, losing the high and low elevations and driving the elevations toward the mean. Any comparison of the global DEMs must take into account that they may sample different areas and store the points at different locations on the ground. Direct comparison is challenging, and the necessity of interpolation to compare elevations may affect the results. As shown in Figure 6C, the ALOS pixels sample an area that includes 1/4 of each of 4 pixels in the other grids. This makes a direct point-to-point comparison impossible, and requires interpolation in either the ALOS grid, the other grids, or both. Additionally, the differences in acquisition date should always be considered. Figure 8 shows a comparison of 5 global DEMs to a lidar point cloud from Bled, Slovenia, along an east-west profile. ASTER, SRTM, NASADEM, and Copernicus DEM all have elevations at the same locations, and the figure shows how the elevations vary. ALOS has a half pixel shift in both the latitude and longitude directions; the east-west shift is obvious, with the points being at different locations along the longitude axis. The shift in the north-south direction shows up as differences in the point cloud, most clearly at the castle on the left side of the profile.
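One way to compare a grid that is shifted by half a pixel with the others is bilinear interpolation. The sketch below uses a hypothetical `bilinear` helper and invented elevations, and is not the method used to produce Figure 8. It shows that at a half-pixel offset in both directions the bilinear estimate reduces to the mean of the four surrounding posts, so the interpolated value is itself a smoothed quantity.

```python
import math


def bilinear(grid, row, col):
    """Bilinearly interpolate `grid` at fractional (row, col) indices.

    Assumes 0 <= row <= n_rows - 1 and 0 <= col <= n_cols - 1.
    """
    r0 = math.floor(row)
    c0 = math.floor(col)
    r1 = min(r0 + 1, len(grid) - 1)
    c1 = min(c0 + 1, len(grid[0]) - 1)
    fr = row - r0   # fractional offset between the two rows
    fc = col - c0   # fractional offset between the two columns
    top = grid[r0][c0] * (1.0 - fc) + grid[r0][c1] * fc
    bottom = grid[r1][c0] * (1.0 - fc) + grid[r1][c1] * fc
    return top * (1.0 - fr) + bottom * fr


# Invented elevations (metres) at four neighbouring posts.
posts = [
    [100.0, 110.0],
    [120.0, 150.0],
]

# Value at the location of a post shifted by half a pixel in both directions.
shifted = bilinear(posts, 0.5, 0.5)
four_post_mean = sum(posts[0] + posts[1]) / 4.0
print(shifted, four_post_mean)
```

Because the interpolated value averages four posts, part of any difference found between the shifted grid and the others reflects the interpolation rather than the DEMs themselves.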
Some of the DEMs, as noted in Table 1, change the longitudinal grid spacing at higher latitudes. Users must be aware of this, as it may make merging DEMs with different grid spacings difficult. Near 60° latitude, nominal 1″ DEMs can have 1″ × 1″, 1″ × 1.5″, or 1″ × 2″ grid spacings, and average computed slopes vary systematically as the grid spacing changes. Some distributions of these DEMs also reinterpolate them to preserve the 1″ × 1″ spacing at all latitudes.

Figure 8 (caption fragment). As shown in Figure 6C, the area sampled by each ALOS pixel covers 1/4 of the area of each of 4 pixels in the other DEMs. In this profile, half of the lidar points in the two profiles are the same, but the other half are to the north or south of the common points.

Data Quality
Users must assess the quality of a DEM [61], particularly when multiple comparable DEMs cover the region of interest, such as the 1 arc second DEMs discussed in Section 6. Recent reviews [62,63] highlight the challenges, which have led to a variety of inconsistent, ad hoc approaches. The assessment can use robust quantitative [64] or qualitative visual [65] methods, and can use external reference data or consider the internal consistency and characteristics of the DEM. In making any comparison, the user must ensure that the horizontal and vertical datums of the DEMs to be compared match, that the pixel representations match, and that geolocation errors do not introduce a horizontal shift. The user must also consider issues such as how a geodetically measured benchmark elevation should correspond to a DEM elevation representing a 1 arc second pixel. Comparisons of the same two DEMs can produce opposite rankings in floodplains [66] and mountainous areas [67].
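As an illustration of a quantitative assessment against external reference data, the sketch below computes a few commonly used summary statistics of the elevation differences. The choice of statistics (mean error, RMSE, median, and the normalized median absolute deviation, NMAD) is an assumption made for this example and is not necessarily the set used in the cited studies; the function name and the elevations are hypothetical. It presumes that datums and pixel representations have already been matched.

```python
import math
from statistics import mean, median


def difference_stats(test_elev, reference_elev):
    """Summarize elevation differences (test minus reference), in metres."""
    diffs = [t - r for t, r in zip(test_elev, reference_elev)]
    med = median(diffs)
    abs_dev = [abs(d - med) for d in diffs]
    return {
        "mean_error": mean(diffs),
        "rmse": math.sqrt(mean([d * d for d in diffs])),
        "median_error": med,
        # NMAD: robust spread estimate, comparable to a standard deviation
        # when the differences are normally distributed.
        "nmad": 1.4826 * median(abs_dev),
    }


# Invented elevations; the last test value is a 30 m outlier.
test_elev = [101.0, 99.0, 102.0, 98.0, 130.0]
reference_elev = [100.0, 100.0, 100.0, 100.0, 100.0]

result = difference_stats(test_elev, reference_elev)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

In this invented example the single outlier dominates the mean error and the RMSE, while the median and NMAD remain close to the typical differences, which is why robust measures are often reported alongside the classical ones.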

Conclusions
This paper provides an overview of fundamental concepts and terminology relating to DEMs. It consolidates the findings of the Digital Elevation Model Intercomparison eXperiment (DEMIX) working group established in 2020. One of the aims of DEMIX is to guide end-users in selecting the most appropriate DEM for their specific application and to highlight the main characteristics that should be considered in the selection process, as they can influence the interpretation of the results. In addition, it aims to find consensus and reduce ambiguities among DEM-related terms. Many of the terms defined in this paper are well known and frequently used by geospatial practitioners. Others are more obscure and there is often disagreement among experts about their meaning.
Given that DEMs are in essence data models to digitally represent topographic surfaces, we started with an overview of the different interfaces between the lithosphere and atmosphere, as well as the hydrosphere, cryosphere, biosphere and anthroposphere. Due to limitations in measurement technologies, the interfaces between these "spheres" are not consistently represented in DEMs and users are advised to take these differences into consideration. A major source of confusion among users is that DEMs can (among others) represent DTMs, DSMs, and NVSs, or even combinations of these. This paper clearly defined and illustrated each of these surfaces to help users understand how they differ and how the selection of a particular DEM may impact their application. We explained that most DEMs are sensor surface grids (SSGs), created by a sensing system (e.g., lidar, optics, radar, or sonar), and that these often record elevations between those of a DSM and a DTM.
Users should also be aware that DEMs use different reference frames (e.g., horizontal and vertical datums), which can have a significant impact on applications, especially if multiple DEMs are combined or when DEMs are used along with other geospatial data. Scale is another important consideration: the elevation posting interval (pixel size) is not necessarily an effective measure of how much topographic detail is contained in a DEM, although the pixel size does limit the potential spatial resolution [9,68]. The technology used and the scale at which the measurements were taken should be considered instead. One should also consider how elevations in a particular DEM are sampled (point-based or area-based) and resampled (nearest-neighbor, bilinear or cubic), as this may result in misalignment with other data. To demonstrate, we summarized and compared free global DEMs at 1 and 3 arc second grid spacing. The paper concluded with a synopsis of what users should look out for when selecting a DEM for their application.