Ultra-High-Resolution 1 m/pixel CaSSIS DTM Using Super-Resolution Restoration and Shape-from-Shading: Demonstration over Oxia Planum on Mars

Abstract: We introduce a novel ultra-high-resolution Digital Terrain Model (DTM) processing system using a combination of photogrammetric 3D reconstruction, image co-registration, image super-resolution restoration, shape-from-shading DTM refinement, and 3D co-alignment methods. Technical details of the method are described, and results are demonstrated using a 4 m/pixel Trace Gas Orbiter Colour and Stereo Surface Imaging System (CaSSIS) panchromatic image and an overlapping 6 m/pixel Mars Reconnaissance Orbiter Context Camera (CTX) stereo pair to produce a 1 m/pixel CaSSIS Super-Resolution Restoration (SRR) DTM for different areas over Oxia Planum on Mars, the future ESA ExoMars 2022 Rosalind Franklin rover’s landing site. Quantitative assessments are made using profile measurements and the counting of resolvable craters, in comparison with the publicly available 1 m/pixel High-Resolution Imaging Experiment (HiRISE) DTM. These assessments demonstrate that the final 1 m/pixel CaSSIS DTM from the proposed processing system achieves comparable, and sometimes more detailed, 3D reconstruction compared to the overlapping HiRISE DTM.


Introduction
Over the last 50 years, mankind's knowledge of Mars has greatly increased as a direct result of the various orbital and robotic missions. Since the early 1970s, digital imaging sensors aboard these missions have been pivotal, because they not only show what the surfaces look like but also can provide three-dimensional (3D) information. Over that time period, the resolution and quality of orbital imagery has improved from hundreds of metres down to tens of centimetres. In parallel, through stereo and/or photoclinometry techniques, detailed surface feature studies can now be undertaken with 3D and terrain corrected imagery from multiple orbiting spacecraft.
In Europe, stereo acquisitions are now routine from sensors such as the European Space Agency (ESA) Mars Express High-Resolution Stereo Camera (HRSC) [1] or the ESA Trace Gas Orbiter's Colour and Stereo Surface Imaging System (CaSSIS) [2], whereas the U.S. sensors, including the Mars Reconnaissance Orbiter's Context Camera (CTX) [3] and the High-Resolution Imaging Experiment (HiRISE) [4], tend to acquire 3D information only

The layout of the paper is as follows. In Section 1.1, we review previous work that is relevant to the proposed processing system. In Section 2.1, we introduce the input and validation datasets used in this work. In Sections 2.2-2.6 and 2.6.1, we describe the proposed processing system both in overview and for each individual component. Experimental results and associated assessments are given in Sections 2.7.1 and 2.7.2, respectively. In Section 2.8, we discuss issues arising before drawing conclusions in Section 2.9.

Previous Work
One of the two key techniques employed in this work is SRR. SRR refers to the process of enhancing the spatial resolution, quality, and resolvable details of a given lower-resolution (LR) image by combining non-redundant information contained in multiple LR inputs or through prediction of the best higher-resolution (HR) solution via deep learning. Over the past 10 years, deep learning-based SRR techniques have been highly successful in the field of real-life photo and video enhancement. These include residual-based SRR networks [16][17][18], recursive SRR networks [19,20], attention-based SRR networks [21,22], and GAN-based SRR networks [23][24][25][26]. More recently, SRR techniques have been applied to Earth observation images [27][28][29][30] and also to Mars orbital images [10,31] to maximise the potential outcome from the original imaging data products. With particular relevance to this work, the MARSGAN network is proposed in [10] for single-image SRR of CaSSIS images. In this work, we further explore the MARSGAN CaSSIS SRR results, to be coupled with SfS techniques to produce high-resolution CaSSIS DTMs.

Datasets
In this work, we use the 6 m/pixel CTX [3] and 4 m/pixel CaSSIS [2] images for experiments and the 1 m/pixel stereo-derived HiRISE [4] DTM products [5] for intercomparison. The CTX images are accessible through the U.S. National Aeronautics and Space Administration (NASA) Planetary Data System (PDS) [72]

Our experiments were performed using the CTX and CaSSIS data over the ExoMars 2022 Rosalind Franklin [14] rover landing site at Oxia Planum [15]. Many CaSSIS, HiRISE, and CTX images have been repeatedly captured over Oxia Planum (please refer to Supplementary Materials for a complete list of available CaSSIS images, HiRISE DTMs and potential stereo pairs, and CTX stereo pairs). In order to provide intercomparison of the resultant CaSSIS DTM, we demonstrate the new functionality with a specific CTX stereo pair, F23_044811_1985_XN_18N024W and F23_044956_1984_XN_18N024W, and CaSSIS image MY34_001934_162_0_PAN, for areas that overlap with the PDS HiRISE DTM DTEEC_036925_1985_037558_1985_L01.

Overview of the Proposed Processing System
The overall workflow of the proposed CaSSIS DTM processing system is shown in Figure 2. It takes an arbitrary CaSSIS panchromatic band image and an overlapping CTX stereo pair as inputs and uses a sequence of novel techniques to produce DTM results that have higher resolution than the original input image. The complete workflow has 6 stages (a-f), which are listed as follows:
(a) Photogrammetric stereo reconstruction of the input 6 m/pixel CTX stereo pair using CASP-GO [8] to produce an 18 m/pixel CTX DTM and a 6 m/pixel CTX orthorectified image (ORI).
(b) Co-registration of the input 4 m/pixel CaSSIS image with the reference 6 m/pixel CTX ORI from (a) using the MSA-SIFT tie-point based algorithm [9].
(c) SfS 3D reconstruction of the co-registered 4 m/pixel CaSSIS image from (b), using HDEM [11] with the CTX DTM from (a) as the initial DTM, to produce an intermediate 4 m/pixel CaSSIS DTM.
(d) SRR of the co-registered 4 m/pixel CaSSIS image from (b), using MARSGAN [10], to produce a 1 m/pixel CaSSIS SRR image.
(e) SfS refinement of the 4 m/pixel CaSSIS DTM from (c) with the 1 m/pixel CaSSIS SRR image from (d), using HDEM [11], to produce a 1 m/pixel CaSSIS SRR DTM.
(f) 3D co-alignment of the 1 m/pixel CaSSIS SRR DTM from (e) with the CTX DTM from (a), using B-spline fitting [13], to produce the final output product.

The proposed CaSSIS DTM processing chain starts from a 4 m/pixel CaSSIS image and an initial low spatial resolution DTM (in this work we use CTX, but HRSC could also be employed) and uses a coarse-to-fine strategy to gradually exploit intensity variations of the CaSSIS/SRR images and refine the DTM from 18 m/pixel to 4 m/pixel, and eventually down to 1 m/pixel. We reuse previously developed algorithms (or in-house software) [8][9][10][11][12][13] to achieve the 6 processing stages listed above. It should be noted that the HDEM SfS process requires an initial DTM as input, which can be produced via photogrammetry, laser altimetry, or SfS using any overlapping data source. In our case, a stereo-derived CTX DTM was a good option given the complete CTX stereo coverage of Oxia Planum and the similar spatial resolution between CTX and CaSSIS. However, if there is no CTX stereo available, we can always use a SfS-reconstructed HRSC DTM (at 12.5 m/pixel) or SfS-reconstructed CTX DTM (using HRSC DTM as initial input) to achieve similar outcomes with the same coarse-to-fine strategy used in this work.
High-level technical details of the 6 processing stages are summarised in the following sections. For detailed descriptions of each of these, please refer to the original work in [8][9][10][11][12][13].

CTX DTM from CASP-GO
In the first processing stage (a), CASP-GO [8] is employed to produce an initial 18 m/pixel CTX DTM and 6 m/pixel ORI. The CASP-GO stereo reconstruction pipeline is based on the NASA ASP [6] framework with specific enhancements, dealing with matching artefacts, disparity gaps, and co-registration issues. In particular, CASP-GO uses adaptive least squares correlation (ALSC) and region-growing algorithms [7] to iteratively refine stereo matching accuracy and improve the matching completeness (fill in gaps). A simplified flow diagram of the CASP-GO stereo reconstruction pipeline is shown in Figure 3 and can be summarised with the following 6 main steps: (1) Stereo image pre-processing, including conversion of CTX raw data, denoising, camera model initialisation, epipolar map projection, and image enhancement. (2) Stereo matching using the ASP's normalised cross-correlation and Bayes expectation maximisation weighted affine adaptive sub-pixel cross-correlation. (3) Matching refinement, achieved interactively with (2) using fast maximum likelihood matching, outlier rejection, and ALSC. (4) Use of initial stereo matching results from (2) and (3) as seed points and use of ALSC with region growing to obtain matches for the neighbours of the seed points to gradually fill in any gap areas. (5) Camera Triangulation and DTM creation. (6) DTM post-processing, including outlier filtering, smoothing, grid-spacing, co-kriging interpolation, ORI and DTM co-registration and corrections (with a given reference data, e.g., HRSC/MOLA), and georeferencing.
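The correlation-based matching in step (2) can be illustrated with a minimal sketch. This is not the CASP-GO or ASP implementation: it only shows plain normalised cross-correlation (NCC) along one image row, and the function names `ncc` and `best_disparity`, the window size, and the sample values are all hypothetical.

```python
import math


def ncc(a, b):
    """Normalised cross-correlation of two equally sized patches (flat lists)."""
    if len(a) != len(b) or not a:
        raise ValueError("patches must be non-empty and the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # a constant patch carries no texture to correlate
    return sum(x * y for x, y in zip(da, db)) / denom


def best_disparity(left_row, right_row, x, half_window, max_disp):
    """Return (disparity, score) maximising NCC for the left window centred at x.

    Assumes epipolar-aligned rows, so the match lies at x - d in the right row.
    """
    lo, hi = x - half_window, x + half_window + 1
    if lo < 0 or hi > len(left_row):
        raise ValueError("window falls outside the left row")
    ref = left_row[lo:hi]
    best = (0, -2.0)  # any valid NCC score is >= -1
    for d in range(max_disp + 1):
        if lo - d < 0 or hi - d > len(right_row):
            break
        score = ncc(ref, right_row[lo - d:hi - d])
        if score > best[1]:
            best = (d, score)
    return best


# Hypothetical rows: the right row is the left row shifted by 2 pixels.
left = [0, 0, 0, 0, 1, 5, 2, 0, 0, 0, 0, 0]
right = left[2:] + [0, 0]
disparity, score = best_disparity(left, right, x=5, half_window=1, max_disp=4)
```

A production pipeline would add sub-pixel refinement and outlier rejection on top of such integer disparities, as steps (2) to (4) describe.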

CaSSIS to CTX Image Co-Registration
At the second processing stage (b), the MSA-SIFT tie-point based image co-registration [9] method is used to co-register the input 4 m/pixel CaSSIS image with the 6 m/pixel CTX ORI so that the initial height values from the CTX DTM can be associated, as accurately as possible, with each CaSSIS image pixel. The MSA-SIFT based image co-registration algorithm provides sub-pixel co-registration accuracy through iterative refinement of SIFT feature matching and computes the image transformation function with globally minimised residuals. A simplified flow diagram of the MSA-SIFT based image co-registration method is shown in Figure 4. This method has 4 main steps, which are summarised as follows: (1) Feature detection and matching of the input CaSSIS image and CTX ORI using the Perspective-SIFT [74,75] algorithm.

Given the large resolution difference between the CaSSIS image (4 m/pixel) and the CTX photogrammetric DTM (18 m/pixel), the image co-registration accuracy was not considered essential in this work and is thus not discussed any further.
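To make the idea of a transformation with globally minimised residuals concrete, the sketch below fits a 2D affine transform to a set of tie-points by ordinary least squares. It is an illustration only and not the MSA-SIFT algorithm of [9]: the affine model, the function names `solve3` and `fit_affine`, and the sample tie-points are assumptions.

```python
def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [list(row) + [rhs] for row, rhs in zip(m, v)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("tie-points are degenerate (e.g. collinear)")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= f * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = a[r][n] - sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / a[r][r]
    return x


def fit_affine(src, dst):
    """Least-squares affine transform mapping src (x, y) onto dst (u, v).

    Returns two coefficient rows so that u = a*x + b*y + c and v = d*x + e*y + f.
    """
    if len(src) != len(dst) or len(src) < 3:
        raise ValueError("need at least 3 matched tie-points")
    rows = [(x, y, 1.0) for x, y in src]
    # Normal equations (A^T A) w = A^T t, solved separately for u and v.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atu = [sum(r[i] * d[0] for r, d in zip(rows, dst)) for i in range(3)]
    atv = [sum(r[i] * d[1] for r, d in zip(rows, dst)) for i in range(3)]
    return solve3(ata, atu), solve3(ata, atv)


# Hypothetical tie-points: image pixels and their positions in the reference ORI.
src = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0), (50.0, 20.0)]
dst = [(0.66 * x + 0.02 * y + 12.0, -0.02 * x + 0.66 * y - 7.0) for x, y in src]
row_u, row_v = fit_affine(src, dst)
```

With noisy tie-points the same fit returns the coefficients that minimise the summed squared residuals, which is where iterative outlier removal would be applied in practice.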

HDEM SfS
At the third and fifth processing stages (c) and (e), the HDEM [11] SfS method is employed to derive fine-scale 3D information from the 4 m/pixel co-registered CaSSIS image and subsequently the 1 m/pixel CaSSIS SRR image. HDEM SfS takes a single image and a co-registered coarse DTM as inputs, and iteratively refines the coarse input DTM via minimisation of a total cost function. The total cost function comprises 3 terms, i.e., the image irradiance function cost that combines a surface atmospheric radiative transfer scheme and a realistic bidirectional reflectance distribution function (BRDF; the Ross-Thick Li-Sparse model [76][77][78]), the integrability constraint regularisation term, and the photogrammetry constraint regularisation term.
In particular, HDEM SfS uses several novel approaches to produce robust high-resolution DTM products with few artefacts. These include, firstly, a Cartesian coordinate system aligned with the directions parallel and perpendicular to the sun azimuth instead of the image coordinates; secondly, a separately weighted integration regularisation term, which poses more penalisation on the direction normal to the sun azimuth; and thirdly, a Gaussian convolution scheme for the weighted photogrammetric regularisation term to take into account any difference of spatial resolution between the initial and updated DTMs. Depending on the characteristics of the scene in terms of composition and spatial albedo distribution, the BRDF model is homogeneous, piece-wise constant, or spatially variegated. The spectrophotometric properties of the endmember terrains can be, where available, extracted from a near-coincident CRISM multi-angular acquisition sequence using the MARS-Reco method [79]. They can also come from reflectance measurements of Martian analogue materials in the laboratory or be reduced down to a standard photometric model [80]. Figure 5 shows a simplified flow diagram of the HDEM SfS process. The HDEM SfS process has 4 main steps, which can be summarised as follows: (1) For the given input image and its geometrical acquisition conditions, the pre-calculated surface reflectance model, and the initial DTM input, calculate image intensity fields and thus minimise the cost function for the image irradiance equation and integrability constraint with respect to the surface gradients to obtain updated surface gradients. (2) For the given initial DTM input and pre-computed and updated surface gradients from (1), minimise the cost function for the integrability and photogrammetry constraints with respect to the DTM height to obtain an updated DTM. (3) For the given input image, pre-calculated surface reflectance model, and updated surface gradients from (1), minimise the cost function for the image irradiance equation with respect to the scaling factor of the image irradiance equation to obtain an updated scaling factor. (4) Repeat steps (1) to (3) but use the updated DTM from (2) to replace the initial DTM input until a pre-set maximum number of iterations is reached or until the total cost function for the image irradiance equation, integrability constraint, and photogrammetry constraint converges.
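Step (2) can be pictured with a toy one-dimensional analogue. The sketch below is not HDEM: it assumes a single profile, a simple forward-difference integrability term, and a plain quadratic pull towards the initial heights (no Gaussian convolution, no sun-azimuth weighting). The function name `integrate_gradients`, the weight, and the sample values are hypothetical.

```python
def integrate_gradients(p, z0, weight, iterations=500):
    """Heights z minimising sum((z[i+1] - z[i] - p[i])**2) + weight * sum((z[i] - z0[i])**2).

    p      : updated surface gradients between neighbouring samples (len(z0) - 1 values)
    z0     : initial (coarse) heights acting as the photogrammetric constraint
    weight : positive weight of the constraint towards z0
    Solved with Gauss-Seidel sweeps over the normal equations.
    """
    n = len(z0)
    if len(p) != n - 1:
        raise ValueError("need exactly one gradient per pair of neighbouring heights")
    if weight <= 0:
        raise ValueError("weight must be positive")
    z = list(z0)
    for _ in range(iterations):
        for i in range(n):
            num = weight * z0[i]
            den = weight
            if i > 0:  # gradient term linking sample i to its left neighbour
                num += z[i - 1] + p[i - 1]
                den += 1.0
            if i < n - 1:  # gradient term linking sample i to its right neighbour
                num += z[i + 1] - p[i]
                den += 1.0
            z[i] = num / den
    return z


# A flat initial profile refined with gradients that indicate a uniform slope.
refined = integrate_gradients(p=[1.0, 1.0, 1.0, 1.0], z0=[0.0] * 5, weight=0.1)

# Gradients that already agree with the initial profile leave it unchanged.
unchanged = integrate_gradients(p=[1.0, 2.0, 3.0], z0=[0.0, 1.0, 3.0, 6.0], weight=0.1)
```

In the first call the refined profile tilts towards the slope implied by the gradients but stays anchored to the flat input, which is the trade-off that the constraint weight controls.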

MARSGAN SRR
At the fourth processing stage (d), MARSGAN [10] is used to super-resolve the 4 m/pixel co-registered CaSSIS image and produce a 1 m/pixel CaSSIS SRR image, which will then be used as an input for the HDEM SfS 3D retrieval. MARSGAN is a deep learning-based single-image SRR network that is based on the GAN framework [81]. MARSGAN contains a generator network to generate potential SRR solutions that are highly similar to the HR version of the same scene, and in parallel, to compete with the generator network, a relativistic discriminator network [82] is used to divide the potential SRR solutions into 2 classes, i.e., "more realistic" or "more like fake". In particular, the generator of MARSGAN contains 23 adaptive weighted residual-in-residual dense blocks, followed by an adaptive weighted multi-scale reconstruction block, providing high network capacity and restoration of information at different image scales. Details of the MARSGAN network architecture can be found in [10]. Figure 6 shows a simplified flow diagram of the MARSGAN SRR training and inference process. MARSGAN is able to produce state-of-the-art SRR results for CaSSIS images at 4 times the original scale with an effective resolution enhancement factor of about 3 times [10]. In this work, we reuse the pre-trained MARSGAN network based on HiRISE images described in [10] to produce CaSSIS SRR results at 1 m/pixel spatial resolution. Subsequently, the intermediate 4 m/pixel CaSSIS DTM is further refined using the resultant 1 m/pixel CaSSIS SRR image via the same HDEM SfS process described in the previous section.
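The role of a relativistic discriminator can be sketched numerically. The example below assumes the commonly used relativistic average formulation, in which each sample is scored relative to the mean score of the opposite class; whether MARSGAN uses exactly this variant is not stated here, so the loss below is an assumption, and the function names and score values are hypothetical.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def relativistic_d_loss(real_scores, fake_scores):
    """Relativistic average discriminator loss on raw (pre-sigmoid) scores.

    Real images should score higher than the average fake image, and fake
    (super-resolved) images should score lower than the average real image.
    """
    mean_real = sum(real_scores) / len(real_scores)
    mean_fake = sum(fake_scores) / len(fake_scores)
    loss_real = -sum(math.log(sigmoid(r - mean_fake)) for r in real_scores) / len(real_scores)
    loss_fake = -sum(math.log(1.0 - sigmoid(f - mean_real)) for f in fake_scores) / len(fake_scores)
    return loss_real + loss_fake


# Discriminator cannot tell the two classes apart: loss equals 2 * ln(2).
undecided = relativistic_d_loss([0.0, 0.0], [0.0, 0.0])

# Discriminator separates the classes well: loss approaches zero.
confident = relativistic_d_loss([5.0, 5.0], [-5.0, -5.0])
```

The generator is trained against the mirrored objective, so it is rewarded when its outputs look at least as realistic as the average HR reference.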

3D Co-Alignment
At the sixth processing stage (f), a B-spline fitting algorithm (as originally described in [13]) is used to check and correct any systematic errors of the resultant 1 m/pixel CaSSIS DTM with respect to the 18 m/pixel stereo-derived CTX DTM. Although systematic errors from HDEM SfS are generally rare, they could still occur when using a standard "Mars surface dust covered" as the BRDF model, i.e., without involving separate calculation of a regional Ross-Thick Li-Sparse model using multi-angular sequences of the Compact Reconnaissance Imaging Spectrometer for Mars (CRISM) data [83], as originally used in [11]. In this work, we use the default pre-computed "dusty-Mars" model for HDEM SfS and subsequently apply an in-house B-spline fitting method to perform a systematic correction and finally produce the 1 m/pixel CaSSIS DTM that is co-aligned with the stereo-derived CTX DTM. Figure 7 shows a simplified flow diagram of the B-spline fitting algorithm, which has 4 main steps that can be summarised as follows: (1) Compute two planar B-spline surfaces that represent the large-scale topography of the 1 m/pixel CaSSIS DTM and 18 m/pixel CTX DTM. (2) Assign initial local affine transformations for each 3D point of the B-spline surface of the CaSSIS DTM from (1). (3) Update local affine transformations by minimising the cost function that comprises 3 terms, i.e., the weighted distance between the target and reference 3D points, the weighted stiffness term to penalise transformations of neighbouring 3D points, and the weighted landmark term, which in this case is simply a collection of the closest 3D points. (4) Lower the stiffness weights to allow more localised transformations and go back to (3) until the cost function in (3) is minimised, then update the input CaSSIS DTM.

Demonstration and Visual Analysis
We provided intercomparison of the resultant CaSSIS DTM with the existing PDS HiRISE DTM (DTEEC_036925_1985_037558_1985_L01) using overlapping cropped areas from the CTX stereo pair (F23_044811_1985_XN_18N024W and F23_044956_1984_XN_18N024W), a CaSSIS panchromatic band image (MY34_001934_162_0_PAN), and the PDS HiRISE ORI (ESP_036925_1985_RED_A_01_ORTHO). It should be noted that the HiRISE ORI and DTM were co-aligned with the CTX ORI and DTM, which were themselves co-aligned with HRSC and MOLA (as part of the CASP-GO processing), in order to perform intercomparisons within the same geospatial context. Table 1 shows a list of the image IDs with associated geometrical metadata. It should be noted that, except for image-specific parameters, we used the default sets of parameters for each processing step. There was still room for further improvement of the demonstrated results via fine-tuning of these parameters. However, in this work, we aimed at demonstrating the overall processing concept in the simplest possible manner.

Results are shown in Figure 9 for Area-A, Figure 10 for Area-B, Figure 11 for Area-C, and Figure 12 for Area-D.

From Figure 9 (Area-A), we can observe significant improvements of the CaSSIS DTMs in comparison to the CTX DTM, in terms of resolvable features, local topography variations, and clarity of large-scale features. Although the HiRISE DTM seemed to be smoother and therefore more visually pleasing, there were many surface features and peaks that had been smoothed out as a result of window-based matching products. For example, in the HiRISE DTM, the rippled dune features in the crater centre were mostly shown as flat terrain, and the crater shape had also been smoothed out into a "perfect" circle. Overall, in this example, the CaSSIS SRR DTM seemed to have similar or even higher resolution compared to the HiRISE DTM.
The zoom-in views show that the fine-scale topography was significantly improved by using the proposed approach, and the topography features aligned with the features shown in the HiRISE image at 0.25 m/pixel.
In terms of errors (or artefacts), it was difficult to determine whether the CaSSIS SRR DTM or HiRISE DTM was more "correct", because there was no available higher-resolution ground-truth for validation. However, the artefacts, if there were any, would be at a slightly larger-scale for the CaSSIS SRR DTM, while the artefacts were mostly at small-scale for the HiRISE DTM, from a visual sense. Regarding the possible errors of the resultant CaSSIS SRR DTM, they could either have originated from any inherited errors of the initial DTM input (CTX DTM in our case) or come from any localised errors from using a generalised (instead of a locally computed) BRDF model. However, the latter issue (assuming this is correct) should only cause minor variations, as the final CaSSIS DTM was corrected against the CTX DTM at the final stage of the proposed processing chain, i.e., medium-to-large scale errors should have been eliminated unless they were in the CTX DTM. This can be observed, from the CaSSIS DTM and CaSSIS SRR DTM, for the inner crater ridge on the west, and just outside the crater edge on the south, as we can see that some outliers that do not agree with the CTX DTM or the HiRISE DTM have been eliminated (or largely reduced) in the CaSSIS SRR DTM. In order for the reader to examine the aforementioned details for the other areas, please refer to the full-resolution figures provided in the Supplementary Materials.
From Figure 10 (Area-B), we can observe increasing numbers of fine-scale surface features from the CTX DTM, CaSSIS DTM, and CaSSIS SRR DTM. The HiRISE DTM seemed to have a level of detail similar to that of the CaSSIS DTM, but less detail than the CaSSIS SRR DTM. For example, the linear peaks inside the central crater were observable from both the CaSSIS image and HiRISE image; they have been brought out in the CaSSIS SRR DTM, but were not clear in the CaSSIS DTM and were completely missing in the HiRISE DTM. The CaSSIS SRR DTM showed the most detail of the layered structures at the northeast corner. Additionally, the reconstruction quality of several of the small craters in the southwest region was better in the CaSSIS SRR DTM than in the HiRISE DTM.
On the other hand, we could observe some errors around the edges of the open channel of the central crater from the CaSSIS DTM. Although the errors were reduced in the CaSSIS SRR DTM, they were not fully removed. The other point is, for the northwest crater ridge, there seemed to be a three-layer structure as shown in the CaSSIS and HiRISE image, where either or both the CaSSIS SRR DTM and HiRISE DTM may have been wrong. For the HiRISE DTM, the large-scale structure looked more correct, but the inner layer was completely missing, whereas in the CaSSIS SRR DTM, the inner layer was strongly shown but seemed to be incorrectly higher than it appeared in the images.
For Figure 11 (Area-C), the four DTM results showed good agreement with each other at large scale. The CaSSIS SRR DTM showed better agreement with the CTX and HiRISE DTM for the northwest corner in comparison with the CaSSIS DTM. Similar issues that are shown in Figure 10 (Area-B) are also shown in Figure 11 (Area-C), regarding the crater ridges. The crater walls shown in the CaSSIS DTM and the CaSSIS SRR DTM were much steeper compared to the HiRISE DTM. This seemed to agree with visual inspection of the images, but which one was more accurate remained an outstanding question. In addition, the CaSSIS SRR DTM showed good detail of the surface including revealing some very small-sized craters and the reconstruction of small height variations over flat areas. Such small craters and height variations were mostly smoothed out in the HiRISE DTM.
This observation is also reflected in Figure 12 (Area-D). Over the mostly flat region, as shown in Figure 12 (Area-D), we could observe the different levels of detail shown in the CaSSIS SRR DTM and HiRISE DTM while displaying good agreement with each other in broad scale. In the next section, profile measurements are given for 4 profile lines for the CTX DTM, CaSSIS DTM, CaSSIS SRR DTM, and HiRISE DTM over each of the 4 areas, in order to provide a more intuitive view regarding the observed differences as discussed above.

DTM Assessment: Profile Measurements and Crater Counting
The four measured profile lines (A1, A2, A3, and A4) for Area-A are shown in Figure 13, with A1 and A2 crossing the crater ridges and the central dunes feature from north to south, and with A3 and A4 crossing the main crater from west to east. The overall agreements for the 4 DTMs were fairly good for Area-A. The largest difference between the CaSSIS SRR DTM and HiRISE DTM was about 10 m, which occurred at the north and south edges of the crater as well as for the central rippled dunes feature. We can observe that the CaSSIS SRR DTM that co-aligned with the CTX DTM was also better co-aligned with the HiRISE DTM. Similar to previous visual inspections, the CaSSIS SRR DTM has shown details of many features that were not shown on the HiRISE DTM, e.g., the peaks and valleys at the crater ridge and centre; however, their absolute accuracy remains a question.

showed the aforementioned effects of the 3D co-alignment processing, which eliminated regional errors (about 10 m) shown on the intermediate CaSSIS DTM. B3 measured the aforementioned "three-layer" crater structure at the northwest side of the crater ridge, which appeared to be the major difference between the CaSSIS SRR DTM and HiRISE DTM for Area-B. The differences were around 8 m. The centre part and the northwest crater ridges in B3 showed fairly good agreement in the 4 DTMs. B4 also showed good agreement in the 4 DTMs, with the largest difference between the CaSSIS SRR DTM and HiRISE DTM less than 6 m. These differences were mostly due to the reconstructed topographic variations of the CaSSIS SRR DTM, whereas they were mostly smoothed out on the HiRISE DTM.

The four measured profile lines (C1, C2, C3, and C4) for Area-C are shown in Figure 15, with C1 crossing the south ridge of the main crater in the north, with C2 and C3 measuring the area that showed the major difference between the CaSSIS DTM, CaSSIS SRR DTM, and HiRISE DTM in Area-C, and with C4 crossing the main crater on the south.
The main difference between the CaSSIS SRR DTM and HiRISE DTM, shown in C1, was the peak feature of the south ridge of the crater, whereas the HiRISE DTM showed a mostly flat profile. By visually inspecting the CaSSIS and HiRISE images, the peak, shown in the CaSSIS SRR DTM, seemed realistic; however, the 12 m maximum difference seemed too large to be real. C2 and C3 measured the area where the largest difference between CaSSIS SRR DTM and HiRISE DTM appeared in Area-C. The peaks of the two connected crater ridges had a maximum relative height difference between 15 m and 27 m for the CaSSIS SRR DTM, but the maximum relative height difference was between 6 m and 18 m for the HiRISE DTM. In C3, the joint ridge of the two connected craters, shown in the HiRISE DTM, was lower than the northwest crater ridge and a small peak on the southeast, whereas the joint ridge of the two connected craters, shown in the CaSSIS SRR DTM, was much higher than the northwest crater ridge. Visually comparing the CaSSIS and HiRISE images, the peaks shown in the HiRISE DTM seemed too low, but the peaks shown in the CaSSIS SRR DTM appeared too high. C4 showed a similar issue; the crater on the south seemed to have a much higher peak at the crater edge, as shown in the CaSSIS SRR DTM in comparison to the HiRISE DTM. This could be a smoothing issue with the HiRISE DTM or an over-structuring issue with the CaSSIS SRR DTM.
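The "largest difference" figures quoted for each profile line correspond to a simple comparison of two height profiles sampled at the same positions. A minimal sketch follows; the function name `max_abs_difference` and the height values are hypothetical and are not measurements from this work.

```python
def max_abs_difference(profile_a, profile_b):
    """Return (largest |a - b|, sample index) for two co-sampled height profiles."""
    if len(profile_a) != len(profile_b) or not profile_a:
        raise ValueError("profiles must be non-empty and sampled at the same points")
    diffs = [abs(a - b) for a, b in zip(profile_a, profile_b)]
    worst = max(range(len(diffs)), key=diffs.__getitem__)
    return diffs[worst], worst


# Hypothetical heights (m) along one profile line from two DTMs.
srr_heights = [-3040.0, -3036.5, -3028.0, -3035.0, -3041.0]
hirise_heights = [-3040.5, -3037.0, -3034.0, -3036.0, -3040.0]
largest, where = max_abs_difference(srr_heights, hirise_heights)
```

Such a comparison is only meaningful once both DTMs are co-aligned and resampled to common positions along the profile line.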
In the last example, Figure 16 shows four measured profile lines (D1, D2, D3, and D4) for Area-D, with D1 and D2 crossing from north to south for the west and east parts, respectively, and with D3 and D4 crossing from west to east for the north and south parts, respectively. Overall, the CaSSIS SRR DTM showed more detail and sharper peaks than the HiRISE DTM for this flatter terrain. The largest difference between the CaSSIS SRR DTM and the HiRISE DTM was less than 10 m for Area-D. This appeared on D1 and D3 when crossing the crater and peaks at the northwest region of Area-D. For flat areas, e.g., the middle part of D2 and D3, the differences were within 4 m. D4 clearly showed the local topographic variations that the CaSSIS SRR DTM brought out in comparison to the HiRISE DTM. These variations were mostly within 4 m and were highly realistic in visual assessments of the CaSSIS and HiRISE images.

In addition to the profile analysis, we also counted the numbers of craters revealed on the CTX DTM, CaSSIS DTM, CaSSIS SRR DTM, and HiRISE DTM for the four example areas (A, B, C, and D). The resolvable craters are marked and shown in Figure 17. The marked craters were cross-checked against the HiRISE ORI in order to exclude any false-positives. Their diameters were determined using the ORI information (the same craters could show different sizes in a DTM due to errors). The crater counts, in terms of small-sized craters (diameter less than 30 m), medium-sized craters (diameter between 30 m and 100 m), and large-sized craters (diameter larger than 100 m), are summarised in Table 2.

Crater counts and analysis of the smallest resolvable craters were intuitive indicators as to the quality and effective resolution of the DTM products. Strong smoothing of DTMs could increase their perceptual quality due to reduced visible artefacts; however, it would conversely lower the number of resolvable craters (especially small-sized craters).
From Table 2, we can observe that all four DTMs were able to resolve the large-sized craters, and there were increasing numbers of resolvable medium-sized and small-sized craters in the CaSSIS DTM, HiRISE DTM, and CaSSIS SRR DTM. For medium-sized and small-sized resolvable craters, the CaSSIS SRR DTM outperformed the HiRISE DTM in all four areas, therefore yielding a higher effective resolution.
It was noted that the differences in counts of medium-sized craters (between the CaSSIS SRR DTM and HiRISE DTM) were generally larger than the differences in counts of small-sized craters. We believe this was because the 4 m CaSSIS image had natively recorded fewer very-small-sized craters (e.g., <10 m), and so, consequently, had the CaSSIS SRR image (the MARSGAN SRR processing improved the resolution of existing information but did not create fake information that did not exist in the original input). Therefore, some of the successfully reconstructed tiny craters (e.g., <10 m) that had good contrast and disparities in the HiRISE stereo images contributed to the HiRISE DTM's total count for the small-sized class (<30 m) in Table 2, narrowing the gap in that class. However, it should be noted that the crater counting work described here should only be considered as indicative. A more thorough evaluation with a much larger area using an automated crater-counting method will be considered in future work.
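The three size classes used for the crater counts can be expressed as a short binning routine. The sketch below follows the class limits given above (less than 30 m, 30 m to 100 m, larger than 100 m); the function names and the diameters are hypothetical and are not the counts of Table 2.

```python
from collections import Counter


def size_class(diameter_m):
    """Classify a crater by diameter in metres."""
    if diameter_m <= 0:
        raise ValueError("diameter must be positive")
    if diameter_m < 30:
        return "small"
    if diameter_m <= 100:
        return "medium"
    return "large"


def count_by_class(diameters_m):
    """Count craters per size class, always reporting all three classes."""
    counts = Counter(size_class(d) for d in diameters_m)
    return {name: counts.get(name, 0) for name in ("small", "medium", "large")}


# Hypothetical diameters (m) measured on an ORI for craters resolved in one DTM.
counts = count_by_class([8, 25, 30, 99.5, 100, 150])
```

Running the same routine on the crater lists of each DTM gives directly comparable per-class totals.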

Discussion
From the experiments conducted to date, we observed successful 3D reconstructions from CaSSIS for very fine-scale details of the Martian surface. These included resolving small features, e.g., rippled dunes and small craters, restoring local topographic variations, and obtaining better clarity of large-scale features. Whilst more 3D details are resolved compared to the stereo-derived HiRISE DTM, the correctness of such details cannot yet be validated, given the lack of high-resolution ground-truth 3D information that the future rover will be able to provide. Validation of the differences between the CaSSIS SRR DTM and HiRISE DTM observed over some of the demonstrated areas remains an open question. However, the maximum difference between the CaSSIS SRR DTM and HiRISE DTM was 12 m, which occurred at the central peak of the two connected crater ridges in Area-C; for most of the other places, differences between the CaSSIS SRR DTM and HiRISE DTM were less than 5 m, which is acceptable, given their resolution differences. On the other hand, there was a pair of pre-chosen hyperparameters in the HDEM SfS process that controlled the weights of the integrability term and photogrammetric constraint. By tuning these weights, we could trade off the production of artefacts (like overshooting) against producing more detail. A default set of these weights, 10^-3 and 10^-7, respectively, is usually employed.
It should be noted that the radiometric calibration of the CaSSIS images may influence the contrast of the image and thus the performance of the SfS algorithm. We used the conversion factor from digital number (DN) to reflectance, given in the header file of version 1 of the archived CaSSIS image (1.86352e-005/DN).
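Applying the quoted conversion factor is a single multiplication per pixel. A minimal sketch, in which the function name and the DN values are hypothetical:

```python
DN_TO_REFLECTANCE = 1.86352e-5  # per DN, as quoted from the image header


def dn_to_reflectance(dn_values, factor=DN_TO_REFLECTANCE):
    """Convert raw digital numbers to reflectance using a linear scale factor."""
    return [dn * factor for dn in dn_values]


# Hypothetical DN samples from one image row.
reflectance = dn_to_reflectance([0, 10000, 20000])
```

Because the scaling is linear, a different factor changes the absolute reflectance level rather than the relative contrast between pixels.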
In this work, we demonstrated how a high-resolution DTM can be generated using a coarse stereo-derived 18 m/pixel CTX DTM as an initial input followed by the SfS process with a 4 m/pixel CaSSIS image, given the close spatial resolution between the CTX DTM and CaSSIS image. CTX has obtained global coverage of Mars to date and more than half of the surface has been covered with CTX stereo. This implies that it may be possible to produce semi-global DTMs at 4-6 m/pixel whenever CaSSIS data are available and satisfy the general SfS requirements (with incidence angles higher than 40° and phase angles higher than 30°). Additionally, CTX has a greater swath width compared to the CaSSIS image, which makes it an ideal input source for CaSSIS SfS DTM reconstruction. However, if a CTX stereo pair is not available for a CaSSIS image, we can always start from a HRSC stereo pair. In this case, an additional HDEM SfS process may be required to bring the stereo-derived HRSC DTM into a higher spatial resolution, e.g., from 50 m/pixel to 25 m/pixel, in order to be used as the initial DTM input. Figure 18 shows an example of the resultant CaSSIS SRR DTM produced using a 50 m/pixel HRSC level 5 DTM (HMC_11W20_DA5) as input to replace the CTX stereo-derived DTM. In this example, the HRSC level 5 DTM was first refined to 12.5 m/pixel using SfS given a co-registered HRSC level 4 ORI (hd619_0000.nd4.03). Then, the 12.5 m/pixel HRSC SfS DTM was further refined using the 1 m/pixel CaSSIS SRR image. Note that, given that the 12.5 m/pixel HRSC SfS DTM was closer in resolution to the CaSSIS SRR image compared to the 18 m/pixel CTX DTM previously used, we did not need to use the original CaSSIS image to produce an intermediate 4 m/pixel CaSSIS DTM but used the CaSSIS SRR image directly to refine the HRSC SfS DTM. Even though in theory we could go up to a factor of 15 for the difference in scale between each SfS step, in practice scale jumps of a factor of 4 produce the most robust result.
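The scale jumps of the two processing chains described in this work can be checked with simple arithmetic. The sketch below computes the ratio between consecutive grid spacings; the function name `scale_jumps` is hypothetical, while the resolutions and the theoretical limit of 15 are taken from the text.

```python
def scale_jumps(resolutions_m):
    """Ratios between consecutive grid spacings (m/pixel), ordered coarse to fine."""
    return [coarse / fine for coarse, fine in zip(resolutions_m, resolutions_m[1:])]


MAX_THEORETICAL_JUMP = 15  # upper limit per SfS step quoted in the text

ctx_chain = scale_jumps([18, 4, 1])      # CTX DTM -> CaSSIS DTM -> CaSSIS SRR DTM
hrsc_chain = scale_jumps([50, 12.5, 1])  # HRSC DTM -> HRSC SfS DTM -> CaSSIS SRR DTM
within_limit = all(j <= MAX_THEORETICAL_JUMP for j in ctx_chain + hrsc_chain)
```

The CTX-based chain stays close to the factor of 4 described as most robust (4.5 and 4), whereas the HRSC-based chain takes a final jump of 12.5, which is within the theoretical limit but well above that preferred factor.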
Finally, it should be noted that the proposed workflow could also be used to produce ultra-high-resolution HiRISE DTMs using HiRISE SRR and SfS. The current limitation is the processing speed of the HDEM SfS process. Processing of full-strip CaSSIS images should not be a problem using tiled processing in parallel, but larger areas, e.g., covering the whole of Oxia Planum, will require huge processing resources. GPU porting of the algorithm is also an option in the future. Another limitation is the need for spatially variant BRDF, which is not yet possible given the CRISM and OMEGA data currently available.
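As an illustration of the tiled processing mentioned above, the sketch below splits an image extent into overlapping windows that could each be handed to an independent worker. The tile size, the overlap, and the function names are assumptions for illustration and are not taken from the HDEM SfS implementation.

```python
def _starts(length, tile, step):
    """Start offsets along one axis so that the tiles cover [0, length)."""
    starts = [0]
    while starts[-1] + tile < length:
        starts.append(starts[-1] + step)
    return starts


def tile_windows(width, height, tile, overlap):
    """Return (x0, y0, x1, y1) pixel windows covering the image.

    Neighbouring windows share `overlap` pixels so that tile seams can be
    blended or cropped after each tile has been processed independently.
    """
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("need tile > 0 and 0 <= overlap < tile")
    step = tile - overlap
    return [
        (x0, y0, min(x0 + tile, width), min(y0 + tile, height))
        for y0 in _starts(height, tile, step)
        for x0 in _starts(width, tile, step)
    ]


if __name__ == "__main__":
    # Hypothetical image and tile sizes, for illustration only.
    windows = tile_windows(width=2048, height=4096, tile=1024, overlap=128)
    print(len(windows), "tiles; first:", windows[0], "last:", windows[-1])
```

Each window is independent of the others, so the list can be mapped over a process pool (for example `concurrent.futures.ProcessPoolExecutor`) or distributed across machines.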
Another issue that could be studied in the future is the potential "overshooting" problem for steep slopes. Without fine-tuning the HDEM SfS parameters, certain levels of uncertainty are inevitable. Here we compare the HDEM SfS result from the CaSSIS SRR image with the SfS result from the HiRISE image in Figure 19. By measuring a profile line that crosses the crater rim where we previously observed the overshooting issue in Figure 13, we observed that the differences between the HiRISE stereo-derived DTM and the HiRISE SfS refined DTM (using the HiRISE stereo-derived DTM and HiRISE ORI as inputs) were smaller, at a maximum of 4 m, whereas the differences between the HiRISE stereo-derived DTM and the CaSSIS SRR SfS DTM were larger, at a maximum of 12 m. In Profile-1, the maximum difference between the HiRISE stereo-derived DTM and the CaSSIS SRR SfS DTM was 8 m (this would be reduced to 5 m if comparing against the HiRISE SfS DTM for the same location), and at another location, the maximum difference between the HiRISE stereo-derived DTM and the CaSSIS SRR SfS DTM was 7 m (it would also be ~7 m if comparing against the HiRISE SfS DTM for the same location). In Profile-2, the HiRISE stereo-derived DTM and the HiRISE SfS refined DTM differed very little at the crater edge, whereas in the CaSSIS SRR SfS DTM, the overshooting appeared to be 8 m and the undershooting appeared to be 12 m at maximum. These differences mainly come from a poor photogrammetric constraint for the HDEM SfS process, as we started from a much lower resolution DTM to produce the CaSSIS SRR SfS DTM. We believe this issue could be reduced or eliminated in the future by using a higher-resolution input DTM, for example, a CaSSIS stereo-derived DTM or a CaSSIS DTM predicted using deep learning-based monocular DTM estimation techniques. In this work, we demonstrated the concept of using a series of automated processing techniques to obtain sub-pixel level DTMs of CaSSIS images.
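The overshoot and undershoot figures above are maximum elevation differences along a profile. A minimal sketch of such a measurement follows, assuming both DTMs have already been sampled at the same positions along the profile line and that overshoot means the test DTM lies above the reference (a sign convention chosen here for illustration). The function name and the sample elevations are hypothetical.

```python
def profile_extremes(reference_m, test_m):
    """Return (max overshoot, max undershoot) of test vs. reference, in metres.

    Both profiles must be sampled at the same positions along the line.
    Overshoot: test above reference. Undershoot: test below reference.
    """
    if len(reference_m) != len(test_m):
        raise ValueError("profiles must have the same number of samples")
    if not reference_m:
        raise ValueError("profiles must not be empty")
    diffs = [t - r for r, t in zip(reference_m, test_m)]
    overshoot = max(0.0, max(diffs))
    undershoot = max(0.0, -min(diffs))
    return overshoot, undershoot


if __name__ == "__main__":
    # Synthetic elevations across a crater rim, for illustration only.
    reference = [100.0, 102.0, 110.0, 104.0, 100.0]
    test = [100.0, 103.0, 113.0, 102.0, 100.0]
    over, under = profile_extremes(reference, test)
    print(f"max overshoot: {over:.1f} m, max undershoot: {under:.1f} m")
```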
As mentioned in Section 2.7.1, we used the default set of parameters (i.e., algorithm-related parameters were identical, but image-related parameters were different) throughout the processing chain, in order to demonstrate a prototype in the simplest possible manner. However, fine-tuning the algorithm-related processing parameters could potentially improve the results. For example, fine-tuning the image matching parameters could improve the initial photogrammetric results and, in turn, the photoclinometry results; improving the SRR network parameters could improve the image and DTM resolution; and fine-tuning the SfS parameters could result in better DTM quality. In addition, the selection of processing parameters could be either area- or surface-feature-specific; thus, the computational limitation of tiled processing could also become an advantage in such circumstances.

Conclusions
In this paper, we introduced a novel ultra-high-resolution DTM processing chain using a combination of photogrammetric 3D reconstruction, image co-registration, image SRR, SfS DTM refinement, and 3D co-alignment methods. Technical details were provided, and results were demonstrated using 4 m/pixel CaSSIS panchromatic images and an overlapping 6 m/pixel CTX stereo pair to produce 1 m/pixel CaSSIS SRR DTM for four small areas over Oxia Planum. Visual inspections and quantitative assessments were presented using the publicly available 1 m/pixel HiRISE DTM for intercomparison. Such assessments showed that the final resultant CaSSIS DTM from the proposed processing system achieved comparable and sometimes more detailed 3D information compared to a HiRISE DTM.