High Resolution Digital Terrain Models of Mercury

: We reﬁned our Shape from Shading (SfS) algorithm, which has previously been used to create digital terrain models (DTMs) of the Lunar and Martian surfaces, to generate high-resolution DTMs of Mercury from MESSENGER imagery. To adapt the reconstruction procedure to the speciﬁc conditions of Mercury and the available imagery, we introduced two methodic innovations. First, we extended the SfS algorithm to enable the 3D-reconstruction from image mosaics. Because most mosaic tiles were acquired at different times and under various illumination conditions, the brightness of adjacent tiles may vary. Brightness variations that are not fully captured by the reﬂectance model may yield discontinuities at tile borders. We found that the relaxation of the constraint for a continuous albedo map improves the topographic results of an extensive region removing discontinuities at tile borders. The second innovation enables the generation of accurate DTMs from images with substantial albedo variations, such as hollows. We employed an iterative procedure that initializes the SfS algorithm with the albedo map that was obtained by the previous iteration step. This approach converges and yields a reasonable albedo map and topography. With these approaches, we generated DTMs of several science targets such as the Rachmaninoff basin, Praxiteles crater, fault lines, and several hollows. To evaluate the results, we compared our DTMs with stereo DTMs and laser altimeter data. In contrast to coarse laser altimetry tracks and stereo algorithms, which tend to be affected by interpolation artifacts, SfS can generate DTMs almost at image resolution. The root mean squared errors (RMSE) at our target sites are below the size of the horizontal image resolution. For some targets, we could achieve an effective resolution of less than 10 m/pixel, which is the best resolution of Mercury to date. We critically discuss the limitations of the evaluation methodology.


Introduction
Historically, Mercury has received comparatively little attention among the terrestrial planets, and many scientific questions remain unanswered. Only two spacecraft have ever reached the planet. The Mariner 10 probe of NASA (National Aeronautics and Space Administration) completed three flybys in 1974 and 1975 [1] but imaged only one side of the planet. The first orbiter was NASA's MESSENGER (MErcury Surface, Space ENvironment, GEochemistry and Ranging), which entered the orbit of Mercury in 2011 and remained there until the end of April 2015 [2]. MESSENGER revolutionized and expanded the understanding of Mercury, which is elaborated in Solomon et al. [3], eventually drawing the interest of the community to the innermost planet. At the time of writing, the Mercury Planetary Orbiter (MPO) and Mercury Magnetospheric Orbiter (MMO) of the BepiColombo mission are on their way to Mercury. BepiColombo is conducted jointly by the European Space Agency (ESA) and Japan Aerospace Exploration Agency (JAXA) and will enter orbit in 2025 to improve, extend, and complement the findings of MESSENGER [4]. Hapke et al. [20] 1975 Photoclinometry Mariner 10 20,000 Profiles Mouginis-Mark and Wilson [21,22] 1979 Photoclinometry Mariner 10 n.a. Profiles, topographic maps Watters et al. [23] 1997 Stereo, Photoclinometry Mariner 10 500 Local DTM of Discovery Rupes Zuber et al. [33] 2012 Laser Altimetry MLA 250 Northern hemisphere DTM Fassett [29] 2016 Stereo MDIS >45 Local DTMs Becker et al. [24] 2016 Stereo MDIS 665 Global DTM Preusker et al. [27] 2018 SfS originated in the computer vision community and eventually replaced early photoclinometry. SfS formulates the reconstruction problem in terms of variational calculus such that the 3D-surface is the result of minimizing an objective function. This objective function is termed intensity error and penalizes the deviation between a rendered version of the DTM and the original reflectance image of the surface under investigation. This approach was described by Horn [14] and most variants used today are based on this technique. Kirk [34] presented another early photoclinometry approach and later combined photoclinometry and photogrammetry to create topographic maps of possible landing sites on Mars [35]. Because SfS is essentially an ill-posed problem, suitable regularizations are necessary. Table 2 gives an overview of the SfS approaches developed over the last three decades and the regularizations used.
Horn [14] imposed an integrability constraint to ensure that the estimated slope field has zero curl and the resulting surface is physically reasonable. To further stabilize the algorithm and prevent it from getting stuck in local minima, Horn [14] introduced a penalty term for departure from smoothness. The weight for the smoothness constraint has to be chosen carefully such that the surface neither moves away from the optimal solution nor becomes over-smoothed. In addition, small errors of photometric gradient estimation may accumulate and lead to a deviation of the low-frequency component of the surface. To counteract these effects, constraints with a priori known depth data of low horizontal resolution can be used instead. Shao et al. [36] introduced an absolute depth constraint that minimizes the deviation from a previously known DTM. Grumpe and Wöhler [10] expanded on this by making use of a relative and an absolute depth constraint. The relative depth constraint penalizes the deviation of the reconstructed gradients from the gradients of the initial DTM. The absolute depth constraint does the same for the absolute heights of the surface. In addition to the height and gradient, Grumpe and Wöhler [10] also calculated the albedo at every pixel in the image to correctly separate brightness variations that are due to slope or due to local changes of the surface properties. Comparable approaches were subsequently presented by Alexandrov and Beyer [12], Wu et al. [15], Jiang et al. [16], Liu and Wu [37] and Douté and Jiang [38] with several distinctions. Jiang et al. [16] did not estimate the albedo and Wu et al. [15] and Liu and Wu [37] did not use an absolute depth constraint. To relate image intensity to surface gradients, SfS requires a reflectance model. The community standard of planetary photometry are the Hapke model [17] that was used by Grumpe and Wöhler [10] and Hess et al. [11] and the less commonly used Shkuratov model [39]. Approximations of the Hapke model, such as the Lunar-Lambert model [40] or the RTLS-Kernels [41] may be handier but lack physical interpretability of the model parameters.

Smooth-Ness
Horn [14] 1990 Lambert no no yes no Shao et al. [36] 1991 Lambert yes no yes no Kirk [34], Kirk et al. [42] 1987 2003 Minnaert, Lunar-Lambert no no n.a. no Grumpe and Wöhler [10], Hess et al. [11] 2014 2019 Hapke yes yes no yes Wu et al. [15], Liu and Wu [37] 2018 2020 Lunar-Lambert no yes no yes Alexandrov and Beyer [12] 2018 Lunar-Lambert yes no yes yes Jiang et al. [16], Douté and Jiang [38] 2019 Ross-Thick Li-Sparse kernels yes no no no This work 2020 Hapke yes yes no yes MESSENGER-based 3D-models of Mercury were all generated with either stereo methods or laser altimetry (see Table 1). Stereo modeling suffers from a variety of artifacts and is only possible if two or more images are available at a given location. Laser altimetry is designed for global mapping and is comparatively coarse such that it can not capture small topographic details. Even though the horizontal resolution is limited, both methods have high vertical fidelity. On the other hand, shape and albedo from shading yields high horizontal resolution up to the pixel level of the input image but may not have a reliable low-frequency behavior without suitable constraints. Furthermore, SfS only needs one input image which is an advantage over stereo, especially for coarser imaging campaigns. Recent approaches (see Table 2) have combined the advantages of photogrammetry and SfS. They use stereo DTMs as a low-frequency height constraint and substantially refine the DTM with a SfS procedure to arrive at accurate height and slope estimates on a pixel level of the input image. Consequently, the next obvious step is to apply Shape and Albedo from shading to MESSENGER data to improve the resolution and the quality of the DTMs that aid the planetary research on Mercury. In this work, we apply our established approach [11,43], and tailor it to the specific conditions and datasets of Mercury. Our approach is well suited for the innermost planet because its diverse surface makes it necessary to account for albedo variations. Further, our approach already incorporates the Hapke model that can readily be implemented with the Hapke parameters of Mercury [44].

Data and Methods
SfS requires a reflectance image, an initial DTM with a low horizontal resolution, a reflectance model, and the reconstruction algorithm. The reflectance images are provided by the multispectral cameras of NASA's MESSENGER probe, and the initial DTMs are derived from this image dataset by photogrammetric methods. For reflectance modeling, we employ the Hapke model [17,45], which is the community standard of planetary reflectance modeling. Finally, we introduce the SfS algorithm and the methodological novelties to address the specialties of Mercury.

Instruments and Image Dataset
All data used in this work were captured by instruments onboard NASA's MESSENGER spacecraft, primarily the Mercury Dual Imaging System (MDIS) [8]. MDIS consisted of a Wide Angle Camera (WAC) and a Narrow Angle Camera (NAC), which shared the same pivoting platform [46]. While NAC captured only monochrome images centered near 750 nm, WAC contained a mechanical filter wheel with 12 filters in the visible and near-infrared, ranging from 430 nm to 1020 nm [8]. In this work, we use either NAC images or WAC images with filter 7 (WAC-G) because the filters have approximately the same center-wavelength of 750 nm [46]. The original images are available in the NASA Planetary Data System (PDS). The images are calibrated with the Integrated Software for Imaging Spectrometers (ISIS) [47][48][49]. Because the images have a maximum size of 1024 × 1024 pixels, single high-resolution images only cover small areas of the surface. To investigate contiguous surface features, individual images are combined into mosaics.
Because MESSENGER did not have a dedicated stereo camera, several stereo imaging campaigns were carried out with MDIS [26]. The goal of these campaigns was to find image pairs with different emission angles both at a global level and targeted for areas of particular scientific interest. We employ stereo data either to initialize the SfS algorithm or to evaluate the final results. The stereo DTMs used in this work are described in Section 1 and summarized in Table 1. Parts of the global stereo DTM by Becker et al. [24] and the quadrangle DTMs from Preusker et al. [27] serve as default initialization of our SfS algorithm. The results discussed in Section 3.1 show that even the low-resolution global DTM allows for a good initialization. Because the regional DTMs of Fassett [29] are not rectangular and comparatively small, they are not suited for initialization but allow for a comparison with the results of this work. Note that the effective resolution of the stereo DTMs is often lower compared to the nominal resolution for the reasons mentioned above. A more detailed investigation of the influence of stereo artifacts on the evaluation of the SfS results is carried out in Section 4.3.
The orbit of MESSENGER around Mercury was strongly elliptical [1]. From entry into planetary orbit on 18 March 2011 until the end of the primary mission phase on 18 March 2012, the periapsis altitude was 200 km and the apoapsis altitude measured approximately 15,000 km [1,50]. In the last phase of the extended mission in 2015, the orbit was reduced such that the periapsis was between 15 and 25 km [51]. Therefore, the images from MDIS vary greatly in resolution across latitudes. Because the periapsis was almost always above 60 • N, particularly high-resolution images of the northern hemisphere are available.
MESSENGER carried the Mercury Laser Altimeter (MLA) [52]. The time-of-flight laser rangefinder measured Mercury's surface height with <1 m accuracy when the distance from the MLA to the surface was less than 1600 km. The elliptical shape of MESSENGER's orbit allowed only measurements north of 20 • S. The footprint size lies between 15 and 100 m with an average lateral distance of about 400 m between each sampling point [33]. Individual MLA shots are used in this work to evaluate the SfS results, but special care has to be taken to account for the different scales, as discussed in Section 2.8.

Reflectance Model
Consider a surface that is illuminated by collimated irradiance J that comes from a direction s. The surface reflects the incoming light, and radiance I travels into the direction v of a probe. A bi-directional reflectance model determines the ratio of reflected radiance to incoming irradiance for a given illumination geometry and a set of specific surface properties. A reflectance model is at the core of SfS because it relates the image intensity that is measured by the probe to surface slopes that are used to refine the DTM. The community standard for planetary reflectance modeling is the Hapke model [17,45]. It was successfully integrated into our SfS framework and has been applied to both the Moon [10] and Mars [11]. Domingue et al. [53] employed the Hapke model and the Shkuratov model [39,54] for photometric analysis of Mercury. Shkuratov's model yielded better results for high emission angles, but in the range that is relevant to SfS with emission angles e < 40 • and incidence angles 40 • < i < 70 • , the photometric functions of Hapke and Shkuratov can be considered equal. Because the Hapke model has already been successfully integrated into our SfS framework and works well for the relevant settings on Mercury, we use it in this work. The Hapke AMSA (anisotropic multiple scattering approximation) [17,45] is given by with µ 0 = cos i and µ = cos e. The relationship among the illumination direction s, the view direction v, and the angles i, e, and α is illustrated in Figure A1. The parameter w denotes the single-scattering albedo. The albedo describes the fraction of incident light reflected by the surface and ranges from 0 to 1 accordingly. It is the dominant parameter of the Hapke model, is dependent on the material, and varies with the wavelength of the scattered radiation. The parameters µ 0 , µ, and α encode the illumination geometry. The phase function p(α) describes the anisotropic single-scattering of the incident light. Different formulations of the phase function exist with varying numbers of parameters.
To adequately describe forward and backward scattering, the Double Henyey-Greenstein (DHG) function is used as formulated by [45]: The values determined by Warell [44] for Mercury are used for the free parameters a and b. These are given in Table 3. The term M(µ 0 , µ) describes multiple-scattering and depends primarily on the illumination conditions. H(x) is the Ambartsumian-Chandrasekhar H-function for isotropic scattering given by Because the function is not analytically solvable, Hapke [45] approximated it by with Because the surface of Mercury is covered by a porous regolith similar to the Moon, the opposition effect occurs for small phase angles α < 10 • . Hapke [45] modeled the opposition effect with the terms B SH and B CB for the Shadow Hiding Opposition Effect (SHOE) and the Coherent Backscatter Opposition Effect (CBOE), respectively. However, the two effects can only be separated from each other at phase angles α < 1 • [43]. Therefore, values greater than one are allowed for the SHOE amplitude B S0 to model both effects with just one term [43,44]: B CB is set to 1 and has no further influence on the model.

Shape from Shading
This section briefly discusses the objective function and regularization terms used in the SfS algorithm (for further detail and update equations, see [10,43]). The intensity error is modeled as the mean squared difference between the reconstructed and measured brightness by Horn [14]. The components of the photometric gradient p and q are subject to model uncertainties and image noise. To regularize this ill-posed problem, Horn [14] introduced the integrability error This term ensures the integrability of the gradient field by minimizing the deviation of the surface gradient, described by ∂z ∂x and ∂z ∂y , from the photometric gradient. To further stabilize the algorithm, especially if it is still far from a solution, Horn [14] introduced a smoothness constraint. However, because a DTM z DEM of lower horizontal resolution is already known, we instead use two constraints that penalize the deviation of the reconstruction from the a priori known DTM. The relative depth constraint penalizes the deviation of p and q from the gradient of z DEM . Low-pass filtering is applied to the gradient field, because the DTM is of lower horizontal resolution. This has the additional advantage that high-frequency details are suppressed, which can be determined more precisely by the SfS algorithm. The relative depth constraints is modeled with a Gaussian filter g σ grad of width σ grad , where • is the correlation operator. The absolute depth constraint restricts the deviation of the reconstruction from the surface z DEM . Again, a Gaussian filter g σ abs of width σ abs is applied, where * is the convolution operator. The resulting total error is E total = E I + γE int + δE grad + τE abs (11) with the weights λ, δ, and τ, which depend on the problem at hand and must be adjusted individually.

Mosaic Shading
Usually, SfS is applied to single or multiple images of the same target (Multi-Image-SfS [12]). However, because MDIS images have a maximum size of 1024 × 1024 pixels, a single image is often insufficient to examine a given surface feature with adequate coverage. Instead, multiple images have to be pieced together to form mosaics. We introduce a new method to apply SfS to mosaics of MDIS images. This method allows for the generation of large DTMs with seamless transitions between mosaic tiles. Because the illumination conditions of individual images can vary greatly, s and v must be calculated for each sub-image. Therefore, it is required to know the original image of each pixel. This information is saved by ISIS3 and used to create an image-sized matrix, in which every point (x, y) refers to the original image of the corresponding pixel (u, v). The metadata of the original images were assigned to the correct pixels in the mosaic using this reference matrix.
In addition to illumination and observation geometry, the estimation of the single-scattering albedo has to be revisited. The intensity of the reflected radiance is in principle a function of the illumination geometry and the surface properties which are modeled by the albedo and the phase function. To correctly estimate the geometry and the slope that is needed for SfS, it is necessary to disentangle the surface properties from the slopes. We assume that the phase function is known and the albedo varies locally. Grumpe and Wöhler [10] calculated w on the pixel level before executing the SfS algorithm depending on the current DTM. The deviation of the measured from the modeled reflectance is minimized. The shape of the surface is specified by the DTM. Assuming constant values for the remaining reflectance parameters given by Warell [44], the albedo w can be determined on the pixel level. Grumpe and Wöhler [10] used the Levenberg-Marquardt algorithm and processed the albedo with a gaussian filter which results in a smooth albedo map without discontinuities. Different image tiles of the mosaic are associated with different observation conditions. Because planetary regolith reflects light anisotropically, the reflectance behavior differs from tile to tile. This may lead to sharp transitions between adjacent tiles (e.g., Figure 1a).
These variations in reflectance may not be entirely explained by the reflectance model, because it may be under-constrained. Usually, geometrically variant reflectance changes that are not entirely covered by the reflectance model are absorbed by the single-scattering albedo w. If the illumination differs vastly from tile to tile, it may occur that adjacent tiles absorb the deviations differently which results in albedo variations between tiles. Gaussian filtering of the albedo map as done by Grumpe and Wöhler [10] enforces continuity at the tile borders and consequently blurs the sharp transitions. Applying the algorithm without modifications results in strong artifacts in the reconstructed topography, particularly at tile borders ( Figure 2a). To counteract this effect, we relax the continuity of the albedo map at tile borders by prohibiting the Gaussian filter from smoothing across these borders. The pixel-level albedo is calculated for each mosaic tile individually. We found that this approach sustainably eliminates boundary-artifacts and a reasonable topography map can be estimated ( Figure 2b). On the contrary, the albedo map may exhibit discontinuities at tile borders ( Figure 1b). To obtain a more realistic, continuous albedo map, more accurate parameters of the reflectance model are necessary. A possible solution is offered by Multi-Image-SfS. This approach attempts to estimate reflectance parameters from multiple images yielding a more accurate reflectance model. Alexandrov and Beyer [12] employed this strategy for Lunar NAC images and the Lunar-Lambert model. Jiang et al. [16] applied the method of Ceamanos et al. [41] to tailor their empirical reflectance model to Martian imagery. In the case of Mercury, the availability of multiple images of a given region that share a considerable overlap and the same level of resolution is limited. Therefore, we opt against a multiview approach. 30

Compensation of Albedo Effects
Hollows are a particularly interesting landform on Mercury, which are not known to occur on any other terrestrial body. Hollows are irregular, round depressions with flat floors and without raised rims, which often occur on the floors and rims of craters [55]. Their diameters are between a few tens of meters and several kilometers with shallow depths of about 20 m [55]. Hollows could already be observed as bright spots on Mariner 10 images, but they were first identified as a novel entity in MDIS images of higher resolution [55]. Due to their small size, there is often only one high resolution image available for an individual hollow. This prevents the use of stereo algorithms and necessitates SfS for the 3D reconstruction of hollows. In the beginning of their formation, hollows are characterized by a very bright ground and a halo, which consists of highly reflective material distributed around their edge [55]. Therefore, hollows tend to have an unusually high albedo and a strong contrast to the surrounding region. These contrasts require special consideration in the reconstruction. If the albedo is estimated incorrectly during the SfS procedure, the surface normals will be calculated incorrectly and the resulting DTM will be corrupted.
SfS works in several pyramid levels. The resolution of the DTM and the corresponding albedo map increase by a factor of two at each pyramid level. At the end of the procedure, brightness variations that arise either from slope variations or the material's albedo are disentangled. Strong albedo contrasts within a scene may not be captured accurately and may be mistaken for variations in slope. This is the case for hollows which exhibit a bright floor and are surrounded by bright haloes. To overcome this issue, we propose an iterative procedure. We repeatedly re-run the entire SfS algorithm and initialize the SfS procedure with the albedo map that was output in the previous run. The albedo map of the highest pyramid level is scaled down and serves as the initialization of the first pyramid level in the next iteration. We found that this approach already converges after a few iterations The results are verified qualitatively.
The effectiveness of the proposed method is illustrated using the Hopper crater ( Figure 3). Due to the high presence of hollows, the bright floor could already be observed on Mariner 10 images as bright spots. However, hollows could first be identified as a novel entity in MDIS images of higher resolution [55]. Running the SfS algorithm once yields an exaggerated elevation in the western part and an unrealistic depression east of the central peak ( Figure 4a). This is because the albedo features are misinterpreted as a slope variation. If we apply our iterative approach, the DTM ( Figure 4b         (b) Albedo map after the last iteration. Figure 5. (a) In the first iteration, the process underestimates the albedo of the hollows, which leads to fuzzy borders. (b) After the fourth iteration, the borders are more pronounced and modeled with a higher albedo.

Summary of the Process
The workflow of the entire 3D reconstruction process of Mercury's surface is shown in Figure 6. After defining a target area, suitable MDIS images are radiometrically calibrated with ISIS3 [48]. The iterative pyramid scheme begins with the downscaled image and initial DTM, for which we choose one of the stereo DTMs described in Sections 1 and 2.1. If necessary, the initial DTM is manually co-registered with the calibrated image or mosaic. After estimating the albedo, the SfS algorithm calculates the surface heights. The resulting DTM is used to initialize the next stage. After running through a predefined number of pyramid levels, the resulting DTM z Final and albedo map w Final are of the same resolution as the original image or mosaic. w Final can be used as an initialization for a new run if required.

Complexity of the Algorithm
To allocate adequate computing resources and to ensure efficient scheduling, the runtime complexity and storage complexity of an algorithm must be known. Therefore, we provide a complexity analysis of our SfS approach with some simplifications. For the sake of argument, we assume rectangular input images of the size n × m pixels.
The SfS procedure of this study falls into three parts: radiometric calibration, co-registration, and our SfS algorithm. For the complexity of stereo reconstruction, which is used for the absolute depth constraint, we refer to the rich literature of stereo algorithms. The input images are radiometrically calibrated by applying the calibration formula to each pixel. Consequently, the complexity of this step becomes O(n × m). Co-registration is carried out manually, but, as the images grow, more control points are necessary for a robust co-registration. SfS runs N pyramid levels. It starts at the top of the pyramid which means that 3D reconstruction is applied only to a downsized version of the original image. As one descends from the top of the pyramid, the resolution increases by a factor of two at each level until the original resolution of the input image is met. As described by Grumpe et al. [43] and Grumpe and Wöhler [10], the 3D-reconstruction of each pyramid level runs two nested loops. The number of iterations until the reconstruction converges is not a function of the image's input size but highly depends on the specific surface under investigation and the parameter settings of the weights. Therefore, the number of iterations is generally hard to predict. Experience from this study and previous studies [10,11,43] showed that the routine usually converges within 100-1000 iterations. Consequently, the runtime complexity of the ith level is Given the number of N pyramid levels, the total complexity becomes In practice, the last pyramid level is by far the most dominant contributor to runtime. On a Desktop PC with an Intel i7 processor, a region of 1024 × 1024 pixels will take between several minutes and several hours until convergence. If one opts for the iterative albedo retrieval, as described in Section 2.5, the whole procedure has to be carried out multiple times which is only recommended for specific sites with unusually high albedo contrast. Because 3D-reconstruction is usually applied to a manageable number of target sites and without real-time constraints, it can be run over night. Larger mapping campaigns would benefit from parallelization, which is possible and has been applied in the past. The algorithm needs at least the image, the height map, the gradient map, and the albedo map in the storage such that the memory complexity becomes O(n × m).

Evaluation Methods
The evaluation of high-resolution DTMs of Mercury is challenging. No actual ground truth (GT) has ever been acquired on Mercury's surface directly. A viable surrogate is to compare the SfS DTMs against other reference datasets that are available for the planet. In the literature, the SfS results are commonly compared with laser altimetry tracks or stereo DTMs. Jiang et al. [16] used points of the Marts Orbiter Altimeter (MOLA [56]) to evaluate SfS DTMs on Mars. Grumpe and Wöhler [10] and Alexandrov and Beyer [12] employed tracks of the Lunar Orbiter Laser Altimeter (LOLA [57]) to evaluate lunar SfS DTMs. SfS DTMs were compared to stereo DTMs by Wu et al. [15]. However, to carry out a meaningful evaluation, the vertical and horizontal accuracy of the reference datasets must be considered.
The vertical deviations between the best stereo reference DTMs available and the MLA tracks are in the range of tens of meters [29], which is similar to the vertical deviation of the SfS DTM but not much less, as it should be when considering the stereo DTMs a GT. Consequently, we do not to ground truth stereo DTMs but rather to reference stereo DTMs, similar to Wu et al. [15]. Nevertheless, stereo DTMs contain a variety of details that are not present in the sparse MLA points, which help to assess the topographic details that are recovered by SfS. The range error (vertical accuracy) of the MLA instrument is less than 1 m [33]. The horizontal resolution of the images that are used for SfS is about tens to hundreds of meters and the vertical accuracy of the DTMs is in a similar range, as we shown in Sections 3.3 and 3.4. Consequently, the vertical accuracy of the MLA instrument is significantly better than the vertical deviation between the SfS DTMs and MLA data in these cases. This justifies the assumption that MLA points closely approximate the GT (see also [10,43] for DTMs of the Moon).
In theory, the SfS DTMs can achieve a horizontal resolution that comes close to the resolution of the images used [10]. However, due to stereo artifacts, stereo DTMs have an effective resolution that may be about 5-10 times lower than the resolution of the input images [12,13]. Since the images originate from the same dataset, a comparison of larger areas or high-resolution SfS DTMs is not possible directly.
For evaluation, the SfS procedure is applied in Sections 3.1 and 3.2 to low-resolution WAC images of areas where local stereo DTMs are available. This ensures a sufficient difference between the resolution of the SfS DTM and that of the stereo reference DTM, which is required to conduct a meaningful comparison on the same spatial scale. Only the stereo DTMs created by Fassett [29] and discussed in Section 2.1 meet these criteria and are therefore used as reference. The general approach is as follows. First, the SfS DTM and stereo DTM are scaled to the same size. Due to inaccuracies in the camera and image calibration, the DTMs show a shift and must be co-registered manually. The two DTMs are compared qualitatively as well as quantitatively by extracting profiles with distinct surface properties. Due to a vertical offset of several hundred meters caused by the different reference radii, the height of the SfS profile is shifted by minimizing the RMSE. Finally, the RMSE is calculated for the heights and the first derivative along the profiles, as the SfS procedure is particularly suitable for determining surface gradients. We indicate the gradient along the profile by ∆z/∆d. The viability of this evaluation method is discussed in Section 4.3.
The horizontal resolution of DTMs derived from MLA data and the effective resolution of MLA measurements themselves generally do not match the resolution that SfS can provide. A meaningful evaluation is therefore only possible in a resolution range between MLA footprint size (about 15 m to 100 m [33]) and distance between sampling points (about 400 m [33]). This ensures the assignment of one MLA measurement to roughly one SfS pixel. We chose to directly compare our SfS results to the sampling points instead of comparing them to gridded DTMs derived from MLA data. This ensures that we compare to a measured value instead of an interpolated point. Consequently, SfS DTMs with a resolution between the footprint size and the sampling interval are chosen to carry out the comparison. The MLA tracks and the SfS DTM are co-registered, and a vertical offset may be removed by minimizing the RMSE. Finally, we calculate the RMSE of the height differences and the RMSE of the first derivatives along the track. The limitations of the method are critically discussed in Section 4.3.

Results
The evaluation is carried out with areas that are well suited for the evaluation methods described in Section 2.8 and which place different demands on the SfS algorithm. Praxiteles crater (Section 3.1) exhibits strong albedo variations that are counteracted with the iterative albedo estimation. The ghost crater (Section 3.2) and Rachmaninoff basin (Section 3.3) are reconstructed from image mosaics. A tectonic fault (Section 3.4) is chosen to evaluate the performance for input DTMs that are already highly resolved. Additional results of important science targets but with no available reference DTMs are also presented.

Evaluation Method 1: Praxiteles Crater
The Praxiteles crater has several large depressions at the floor and is therefore classified as a pit-floor crater [29,58]. These pits are marked by their irregular to round shapes, an average size of 10 km to 40 km, and the absence of raised rims or centers [58,59]. The floors and immediate surroundings of most pit craters are covered by materials that are interpreted as pyroclastic deposits. The adjacent spectral units cause strong contrasts in the albedo, which necessitates the method of initializing the single-scattering albedo, as discussed in Section 2.5. The final albedo map is shown in Figure 7b. Note the high albedo on the inner ring of Praxiteles crater and around the pit on the left side of the image. The image (Figure 7a) used for the SfS algorithm is part of the MDIS-WAC image EW1048451439G and has a resolution of about 250 m/pixel. Because the stereo DTM 301E27N_0 [29] with a nominal resolution of 65 m/pixel is in sinusoidal projection, the WAC image is also projected sinusoidally with a central meridian of 300.98 • . The global stereo DTM [24] was used to initialize the SfS procedure. Table 4 summarizes all properties of the original images and DTMs used for the first evaluation method. In Figure 7c,d, the stereo DTM is compared to the color-coded SfS DTM. The improvements are immediately apparent. Even though the resolution of the image is only about 2.7 times higher than the nominal resolution of the initial DTM, the SfS algorithm successfully reconstructs all large and medium-sized surface features that were unrecognizable before. Even small features that are not visible in the initial stereo DTM, such as craters with a diameter of 1-2 km, can be easily recognized in the SfS DTM.   For quality assessment, the stereo reference DTM is cropped and scaled to the size of the SfS DTM. Due to inaccuracies in the camera and image calibration, as well as the assumed reference radius of Mercury, the DTMs have to be manually co-registered. The final results are shown in Figure 8. Figure 8a,b shows a similar level of detail despite the large difference in nominal resolution of the input data (the reference was generated from NAC, SfS was generated from WAC images). There is, however, a noticeable deviation at the pit crater in the middle of the image. While there is no raised edge in the reference DTM, as expected for this type of depression, a slight height increase of the left crater rim is visible on the reconstructed surface. This indicates that the albedo estimation could not fully model the high reflectance of the deposits on the lower rim. Note, however, that the height and albedo determination is carried out with a single image. The left edge is strongly illuminated because the surface is lit from the south-east. These circumstances make it difficult for the SfS procedure to separate the causes for the high reflectance values. SfS with multiple input images could alleviate this effect as it imposes more constraints to separate height and albedo. N   For a more detailed comparison, we chose three profiles (northern, central, and southern) that cut through the pit crater ( Figure 8) and plot their elevations (Figure 9a) and derivatives along the profiles (Figure 9b). Figure 9a shows that the largest vertical deviations approximately correspond to the size of a pixel of the SfS DTM and the general shape of the crater was reconstructed well. After shifting the SfS DTM vertically, the final RMSE of each profile is shown in Table 5. In addition to the elevation, the SfS algorithm is also able to accurately determine the first derivative of the surface. We used the same profiles as before and calculated the derivative of the elevation along the profile (Figure 9b). The derivative of the SfS DTM is close to that of the reference DTM.

Evaluation Method 1: Ghost Crater
The Northern Smooth Plains (NSP), which cover a large part of Mercury's surface in the higher northern latitudes, are probably of volcanic origin and consist of deposited flood basalts [58]. In contrast to the surrounding terrain, the NSP have a lower density of superimposed craters and are therefore estimated to be younger [60,61]. The NSP can also be spectrally separated from their surroundings by their assignment to the High-Reflective Red Plains (HRP) [62]. Volcanic flooding has partially and sometimes entirely buried many older craters [60]. These ghost craters [61] can be recognized by subtle tectonic features such as wrinkle ridges and graben, which are caused by the local cooling of the lava and the global contraction of Mercury's surface [63]. Klimczak et al. [64] divided ghost craters into three subtypes, based on the relations between wrinkle ridges and graben. To evaluate the SfS procedure, we choose a crater of the second type [61]. A crater of this type is marked by a nearly circular ridge ring and graben on the inside, especially near the center. Furthermore, its diameter is always more than 40 km. The ghost crater used here was already described by Watters et al. [63] and has a diameter of about 100 km.
In contrast to the image examined in Section 3.1, the challenge for the SfS method in this case is not the albedo of the surface but the large incidence angles at higher latitudes. Although these circumstances make it easier to identify fine surface features in the ghost crater, the shadows in the superimposed craters impair the reconstruction. The stereo DTM 36E60N_0 [29] with a nominal resolution of 55 m/pixel is chosen for comparison. The resolution of the mosaic of WAC images is about 2.8 times coarser. For the initialization of the SfS procedure, a section of the DLR DTM of the quadrangle H05 [65] is chosen. Contrary to the initial stereo DTM, the SfS algorithm reconstructs the shape of the superimposed craters (see Figure 10a). The graben that run through the ghost crater are also only visible in the SfS DTM (compare the centers of Figure 10a,b). A problem that occurs in both methods is the preferred direction of the reconstruction. In the initial DTM, the crater walls on the eastern and western sides are elevated and indentations are visible on the northern and southern walls. In the SfS DTM, the northern and southern walls are unrealistically elevated instead. In the case of the SfS DTM, this is due to illumination from a south-southwest direction. The use of several images with different illumination conditions could potentially compensate for this effect.
The reference DTM in Figure 11a shows a strip of the region and contains all of the defining features of a ghost crater. The reference DTM gives a better estimate of the three craters at the top of the image, but a part of their interior is missing, which is likely due to faulty block-matching. The SfS method, however, can plausibly reconstruct the interior of the crater using a single image (Figure 11b). The network of graben in the center of the ghost crater is visible in both DTMs. The reference DTM has a noticeably rougher surface than the SfS DTM. The associated limitations of the evaluation are discussed in Section 4.3. In this section, the reference DTM is assumed to be an accurate representation of the surface.
The elevations and derivatives along the northern and the southern profile marked in Figure 11 are shown in Figure 12. The vertical offset of the SfS and initial DTM profiles is adjusted again by minimizing the RMSE. The final RMSE for both, height and derivative along the profile, is given in Table 6. The profiles illustrate the capability of the SfS algorithm to reconstruct surface features, even if they are of a small lateral size. In Figure 11a, the SfS DTM follows the shape of the reference DTM, while the initial DTM does not. Only on the first and last 5 km of the upper profile a deviation is apparent. On the right-hand side, the shape of the surface is estimated correctly, but not the absolute height. In the lower profile, the SfS algorithm modeled the crater very accurately. The derivatives along the profiles (Figure 11b) support this conclusion. The derivative of the southern profile can be seen as correctly reconstructed, considering the difference in resolution. The large proportion of high frequencies complicates the qualitative comparison of the derivatives along the northern profile. However, the low RMSE of 0.056 • indicates good modeling.    (b) SfS DTM Figure 11. Comparison of the (a) reference DTM [29] to the (b) SfS DTM. The profiles indicated by the red dashed lines are investigated in Figure 12 from top to bottom. To allow for the comparison, the stereo DTM is scaled to the size of the SfS DTM. Table 6. RMSE of elevation and along profile derivative of the SfS profiles to the reference DTM profiles from Figure 11 at a resolution of 155 m/pixel.

Evaluation Method 2: Rachmaninoff Basin
Rachmaninoff (Figure 13a) is one of the youngest and therefore best preserved peak-ring basins on Mercury [61,66]. The inner ring has a diameter of about 135 km and is open on the southern side. The material that makes up the smooth plains in the inner ring can be spectrally assigned to the High-Reflectance Red Plains (HRP). It is probably of volcanic origin and, with an age of one to two billion years old, is interpreted as a trace of recent volcanism on Mercury [66,67]. The plains are crossed by approximately circular graben. These are likely caused by thermal contraction of the cooling volcanic material [68]. Although the basin was flooded with lava, Rachmaninoff contains the lowest elevation on Mercury with 5.8 km below the planet's global average. The bright spot located southeast between the inner and outer ring is named Suge Facula [69]. Faculae are Red Unit areas on Mercury with particularly high albedo [70]. These are interpreted as pyroclastic deposits, often surrounded by volcanic vents [69]. One of these volcanic vents is located northeast of the Rachmaninoff basin and is examined in Section 3.5.  For the comparison of the SfS DTM with the MLA data, all sample points of the Rachmaninoff crater are selected that have a channel ID of 0, which means they were recorded with the highest threshold for the signal-to-noise ratio [50]. The points are then assigned to their respective tracks and the four MLA tracks with the most points are selected for further investigation. The color-coded SfS DTM is shown in Figure 13b. The positions of the MLA sample points are shown in white. For all tracks, the spacecraft flew across the surface from south to north. The elevation of the SfS DTM is compared to the MLA measurements along the plotted tracks in Figure 14. We also compute the first derivative of the SfS DTM and the MLA points along the track ( Figure 15). As mentioned above, uncertainties in the calibration of the instruments lead to small deviations in the calculated positions. Therefore, co-registration of the data is advised. This is achieved by correcting the lateral shift and the vertical offset by minimizing the RMSE. The final RMSE for each track is given in Table 7. Overall, the MLA data and the reconstructed surface elevation show a good fit and the RMSE of the elevations improved for every track compared to the initial stereo DTM. As shown in Figure 13b, the SfS algorithm estimates the lowest point to be at −5.4 km. The deviation of approximately 300 m to the actual minimum can be explained by the fact that the determination of absolute depth in the SfS procedure strongly depends on the initial DTM. For the reconstruction the global DTM of Becker et al. [24] was chosen, with a minimum at −5.38 km. Table 7. RMSE of the elevation (Figure 14) and the first derivative ( Figure 15) along the tracks in Figure 13b calculated for the SfS DTM and the initial stereo DTM.    In many situations, SfS can plausibly recreate the derivative of the surface as for example in Figure 15b between sampling points 300-400 and 500-600. The stereo profile is smooth, but SfS gradually aligns with the derivatives of the MLA track. Considering the RMSE of the along track derivatives, SfS is better in three out of four cases. For the outer western track, the RMSE values of the derivatives can be considered equal, but no improvement is evident. Even though all profiles followed the same co-registration procedure, the SfS derivatives in Figure 15a between sampling points 500 and 600 appear to be shifted. This likely worsens the RMSE. Figure 15 generally shows that the first derivative of the MLA track exhibits stronger amplitudes compared to the SfS or the stereo DTM.

Evaluation Method 2: Tectonic fault
The fault that runs through the heavily eroded crater in the center of WAC image EW1014041070G (Figure 16a) is a sign of Mercury's rich tectonic activity. This shortening structure [5] is probably a lobate scarp [71], which is characterized by its asymmetrical shape. The western side of the scarp has a defined and steep edge, while the slope on the other side is much less pronounced [5,72] (white arrows in Figure 16b). The cause of the shortening structure is likely the global contraction resulting from internal cooling [73]. Along with impact events and volcanic activity, tectonics contributed significantly to the formation of Mercury's surface. We created a SfS DTM (Figure 16b) that covers this representative region with a resolution of the input image of 178 m/pixel. The resolution of the initial stereo DTM is 222 m/px.  Again, the elevation of the SfS DTM and the initial stereo DTM are compared to MLA measurements along the tracks that are shown in Figure 14. To adjust the MLA tracks and the DTMs, the lateral shift and the vertical offset are corrected by minimizing the RMSE. Furthermore, the first derivative is determined in the direction of the MLA tracks. The final RMSEs of the elevation error and the error of the along track derivatives are given in Table 8. For the inner western track, the RMSE of the SfS DTM is clearly better than the RMSE of the initial stereo DTM. The plot of the elevation profile shows that small details are better retrieved with the SfS algorithm (Figure 17b). In two cases, the RMSE values are rather similar. Given the resolution of the image with 178 m/pixel, we do not consider these RMSE differences as a clear indication that either SfS or stereo is better than the other method. For the outer easter MLA track, the RMSE of the SfS DTM is worse compared to the initial stereo DTM. In that case, the RMSE increased because the bottom of the crater between sampling points 30 and 50 deviates from the MLA track (Figure 17d). This can either arise from a wrong reconstruction because the crater is nearly in shadow impairing the SfS or from a misalignment between height profiles. The RMSE of the along track derivatives clearly improved in three of four cases (see Figure 18b-d). In one case (outer western track, Figure 18a), the improvement can not be considered significant. Again, there is a small crater that is partly shadowed, which may worsen the RMSE of the along track derivative.     Figure 17) and the first derivative ( Figure 18) along the tracks in Figure 16b calculated for the SfS DTM and the initial Stereo DTM.

Nathair Facula
Nathair Facula is the most extensive accumulation of Red Unit material on Mercury [62,69]. During the MESSENGER mission, the spot received great interest and became a dedicated science target. Several high-resolution images were acquired with MDIS-NAC. The highly reflective material is distributed within a radius of 130 km around a central vent [6]. Pegg et al. [70] classified the pit with a depth of 4 km as one of 64 known compound vents within the 184 faculae discovered on Mercury so far. Wright [67] suggested the subdivision into five eruption centers, which differ in location and time of the eruption, but whose boundaries cannot be separated as clearly as in other compounds vents. The earliest eruption took place in the southeast corner of the pit. The present smooth and uniform surface was formed by the flowing material of eruptions that occurred later on. The most recent eruption took place in the northern part, evident by the much rougher structure of the ground and further small depressions [67,70]. These characteristics are reconstructed very well in the SfS DTM (Figure 19b). Although the stitching of the mosaic is not optimal, the method produces a DTM with enhanced details. Even details of small lateral extent, such as small craters superimposed on the smooth plains in the southeast, can be seen in the reconstruction. The fan-shaped walls of the vent, interpreted as a sign of downslope mass movement by Malliband et al. [74], are also visible. The rendered representation of the compound vent (Figure 19b), in conjunction with the color-coded DTM, shows that the wall on the south-eastern side has a much more gentle slope and does not exhibit the fan-like structure of the northern and western walls. Although the resolution of the mosaic at 30 m/pixel is relatively high for Mercury, it is not sufficient to resolve the hollows on the walls identified by Blewett et al. [55].

Hollows
The characteristics and the formation process of hollows are discussed in Section 2.5. In this section, we present the first high resolution DTMs of these unique surface features.
The mosaic in Figure 20a was already examined by Blewett et al. [55]. They proposed a formation and aging sequence for hollows, in which growing hollows are characterized by their high albedo. Eventually, the hollows stop to become wider and lose their bright halo until they are finally eroded by space weathering [55,75,76]. SfS enables the reconstruction of surface features this small in size, of which only single targeted images are available. The innovations introduced in Section 2.5 are able to compensate the strong albedo effects in Figure 20a. The global stereo DTM [24] was used to initialize the SfS algorithm. The resulting DTM is shown in Figure 20b at a resolution of 15.3 m/pixel. Blewett et al. [55] noticed that hollows are often located on steep crater walls and central peaks (Figures 20b and A2). The hollows are located slightly offset from the ridge of the crater ring on the Sun-facing side. Blewett et al. [55] suspected that the Sun heating the surface could contribute to the formation of hollows.
The hollows shown in Figure 21a at a resolution of 3.3 m/pixel are located on a small hill at the inner crater rim of the heavily eroded Sholem Aleichem impact basin [77]. The sudden transition from the flat floor to the straight wall, visible in the image and the SfS DTM (Figure 21b), distinguishes hollows from other types of small depressions [76]. No superimposed craters are visible inside the hollows. The smallest craters which are recognizable at this resolution are located on the outside. The dots visible on the floors of the hollows are not impact craters, but probably residues of old surface material. Figure A3 shows the rendered SfS DTM.

Discussion
To adapt the SfS algorithm to Mercury and MDIS image data, we introduced two innovations. The DTMs that were created with the expanded SfS algorithm and evaluated in Section 3 are discussed in Section 4.1. The viability of the methods and future work are reviewed in Section 4.2. As already mentioned in Section 2.8, the evaluation of SfS DTMs is challenging. We approached this task by comparing our results with two different datasets, i.e., stereo images and laser altimetry data. However, the evaluation methods pose several issues that we discuss in Section 4.3.

Results of SfS
We compared the SfS DTMs with MLA tracks that approximate the GT and with stereo reference DTMs. For Praxiteles crater (Section 3.1), the ghost crater (Section 3.2), and Rachmaninoff basin (Section 3.3), a large resolution gap between the initial DTM and the image exists, and the SfS algorithm significantly improves the initial DTM. Figures 9a,b and 12a,b indicate that the height and slopes all improve well. We found that the elevation RMSE of the reconstructed Praxiteles crater and the ghost crater lies in the range of 20.05 m to 120.00 m, and the RMSE of the derivative along the profile is in the range of 0.025-0.062. The deviations in height are smaller than the horizontal resolution of the original images, which are 250 m/pixel for Praxiteles, and 155 m/pixel for the ghost crater. Furthermore, the results in Section 3.3 show that the reconstruction of the Rachmaninoff crater yields an improved RMSE toward MLA tracks compared to the initial stereo DTM (see Table 7). The RMSE of the derivative along the MLA track also improved in three of four cases.
For the fault line (Section 3.4), the resolution difference between the initial DTM and the image is small, and the RMSE of the absolute heights remains in the range of the RMSE of the initial stereo DTM (Section 3.4 and Table 8). However, the RMSE of the derivative along the track improves. This aligns with the results from Hess et al. [11] and is further underpinned by the visual impression that the SfS DTMs appear sharper and more details are visible (see Figures 8b, 11, 16b, 19b, 20b, and 21b). The improved slopes are relevant for any study that involves surface roughness, thermal modeling, analysis of fine structures, and hazard assessment for landers. We also found that very dark regions may impact the SfS result. The two MLA profiles in Figure 17a,d intersect crater-walls that are nearly shadowed, which increased the RMSE of these profiles. The DTMs presented in this article are available in the Supplementary Material.

Viability of the Novel Methods
The ghost crater and the Rachmaninoff basin were reconstructed from image mosaics, and the Praxiteles crater was reconstructed with our iterative albedo estimation. From the results, we conclude that the SfS routine that is modified by both methods can successfully retrieve the shapes that are present in the reference stereo DTMs. However, both extensions require a critical discussion.
The first innovation we introduced is a way to apply SfS to mosaics of MDIS images, which consist of tiles with considerably different lighting and observation conditions. This method became necessary because many target areas comprise several images that were taken at various times. Because the phase function of the Hapke [17] reflectance model, with the global parameters of Warell [44], was not able to model the reflectance transitions at the image borders adequately, the albedo determination had to be adjusted. By separately estimating the single-scattering albedo for each image of the mosaic, the reflectance variations are absorbed, and realistic DTMs can be produced. However, this method leads to visible discontinuities in the albedo, which are not necessarily physically justified. A further step may be introduced in the future to make the approach more physically plausible by estimating the parameters b and c of the DHG phase function for the entire mosaic until the albedo is continuous at the image boundaries.
The second innovation refers to the effect that some surface features such as pit craters, hollows or volcanic vents have on the albedo. The strong local and topographically correlated albedo variations could not be correctly estimated before. This resulted in inaccurate slope and elevation calculations. To solve this problem, an iterative process was introduced in which the final albedo map is used as initialization for a new SfS run. Usually, only a few iterations were necessary to significantly increase the quality of the final DTM. Additional image processing techniques such as segmentation of bright spots may improve the albedo estimation further.

Limitations of the Evaluation Methods
In Sections 3.1 and 3.2, the stereo DTMs of Fassett [29] are used as reference DTMs, because they have the highest horizontal resolution of DTMs available so far. To allow for a direct comparison, the respective reference DTM was scaled down to suppress artifacts known to occur in stereo DTMs [11]. However, there must be a sufficiently large gap between the nominal resolutions of the reference DTMs and SfS DTMs to make sure that no stereo artifacts impede the evaluation. In Section 3.1, the nominal resolution of the original image used in the SfS process is coarser than the nominal resolution of the reference DTM by a factor of 3.9 (reference: 65 m/pixel; WAC image: 250 m/pixel). Note that Fassett [29] already specified the stereo DTMs as three times coarser than the resolution of the MDIS images that were used for stereo. Consequently, there is a significant gap (11.7) between the nominal resolution of the NAC images used for the stereo DTM and the WAC image that is used for SfS. In Section 3.2, the overall factor is 8.4 (reference: 55 m/pixel; WAC image: 155 m/pixel). We examined the limitations of this approach and rendered profiles sampled from Figure 11 and compared them to their respective original images ( Figure 22). All DTMs were artificially illuminated from the same direction as the corresponding original images which are shown next to them as reference. The NAC image is an orthoimage created by Fassett [29] from the stereo image pairs. The WAC image data are extracted from the mosaic, which is the basis for the SfS DTM (Figure 11b). On the one hand, the stereo DTM is covered by a pattern of high-frequency diagonal structures that are not present in the original MDIS images. In Figure 22, the impression arises that the SfS DTM represents the surface more precisely, because the rendered SfS DTM and the WAC image match. On the other hand, the stereo DTM more convincingly represents some surface features, such as the graben on the right hand side of the image and the elevation west of the crater. Consequently, it can be seen that there are still stereo artifacts in the reference DTM and the resolution gap is somewhat small. A direct comparison between the stereo reference DTM and SfS DTM has to be extended by considering the local context of absolute height data and surface plausibility. Regardless, it is remarkable that the SfS was constructed from the low resolution WAC image and largely agrees with the stereo data that were derived from the NAC images of much higher resolution.  Figure 11. The DTMs are compared to the corresponding image parts. In the case of the stereo DTM, the original image is an orthophoto from NAC images [29], and, for the SfS DTM, it is the mosaic of WAC images. Below that, the elevation of the profile (red dashed lines) over the length of the section is shown. The SfS DTM is drawn in red and the stereo DTM in black.
The evaluation with MLA data has the same basic problem as the comparison with stereo reference DTMs on Mercury. The resolution of DTMs derived from MLA data and the effective resolution of individual MLA measurements generally do not match the resolution that SfS can provide. The SfS DTM at 255 m/pixel in Section 3.3 and the SfS DTM at 178 m/pixel in Section 3.4 were specifically created for this evaluation and fit the criteria we establish in Section 2.8. The results suggest that SfS generally compares well to MLA tracks. However, in the case of Rachmaninoff, we found that the amplitude of the derivatives of the MLA track are stronger compared to the SfS DTM and the stereo DTM ( Figure 15). This is not the case for the tectonic fault ( Figure 18) where the amplitudes of the MLA derivatives and SfS derivatives are much more alike. A potential explanation could be the different resolution of the input images. The images of the Rachmaninoff crater have a coarser resolution, which may, in fact, yield SfS DTMs that are too coarse for comparing the derivatives with MLA tracks. For the tectonic fault, the scales match, which allows for a direct comparison. Furthermore, the RMSE or derived quantities are a common choice for comparison, but it comes with limitations. If there is one surface feature that deviates from the reference, it worsens the results even if the rest of the DTM is perfect. Because no actual ground truth of Mercury has ever been acquired, the RMSE values between the SfS DTMs and the approximative GT or reference DTMs are good performance measures but no absolute truth. All these limitations have to be kept in mind when interpreting the RMSE values of the elevation or the derivatives. Therefore, we emphasize that it is necessary to consider the context of the DTM and compare the SfS results with different datasets to arrive at a comprehensive picture of what SfS can provide.

Conclusions
We adapted the SfS algorithm of Grumpe and Wöhler [10] and used it to generate digital terrain models of Mercury's surface. The challenges posed by the properties of the surface and the available datasets lead to the introduction of two methodical innovations. Seamless mosaic shading and iterative albedo estimation were proven to work on Mercury. We found that these innovations can overcome the challenges that are introduced by the sparsely available imagery of Mercury, the comparatively small image patches that require mosaicking, and the brightness variations of the planet's unique surface features. The selection of the areas investigated in this paper served primarily to evaluate and demonstrate the capabilities of the SfS procedure. We confirmed that SfS is well suited to retrieve the shape of the surface by creating accurate heights and derivatives. If the initial stereo DTM is already well resolved, SfS could not improve the RMSE of the absolute heights, but the DTM is sharpened such that the derivatives became more accurate. Furthermore, we presented improved DTMs of Mercury's unique science targets and were the first to compute DTMs with 3.3 m/pixel of Hollows.

Conflicts of Interest:
The authors declare no conflict of interest. Appendix A Figure A1. Illumination geometry. The normal of a surface element is n. The vector s points to the light source and spans the angle of incidence i with n. v points towards the observer with the emission angle e. The angle between s and v is the phase angle α.