A Robust DEM Registration Method via Physically Consistent Image Rendering

Li, Yunchou; Jiao, Niangang; Wang, Feng; You, Hongjian

doi:10.3390/app16031238

¹ Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China

² Key Laboratory of Target Cognition and Application Technology (TCAT), Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100190, China

³ School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 101408, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2026, 16(3), 1238; https://doi.org/10.3390/app16031238

Submission received: 20 December 2025 / Revised: 16 January 2026 / Accepted: 23 January 2026 / Published: 26 January 2026

(This article belongs to the Section Earth Sciences)

Abstract

Digital elevation models (DEMs) play a critical role in geospatial analysis and surface modeling. However, due to differences in data collection payload, data processing methodology, and data reference baseline, DEMs acquired from various sources often exhibit systematic spatial offsets. This limitation substantially constrains their accuracy and reliability in multi-source joint analysis and fusion applications. Traditional registration methods such as the Least-Z Difference (LZD) method are sensitive to gross errors, while multimodal registration approaches overlook the importance of elevation information. To address these challenges, this paper proposes a DEM registration method based on physically consistent rendering and multimodal image matching. The approach converts DEMs into image data through irradiance-based models and parallax geometric models. Feature point pairs are extracted using template-based matching techniques and further refined through elevation consistency analysis. Reliable correspondences are selected by jointly considering elevation error distributions and geometric consistency constraints, enabling robust affine transformation estimation and elevation bias correction. The experimental results demonstrate that in typical terrains such as urban areas, glaciers, and plains, the proposed method outperforms classical DEM registration algorithms and state-of-the-art remote sensing image registration algorithms. The results indicate clear advantages in registration accuracy, robustness, and adaptability to diverse terrain conditions, highlighting the potential of the proposed framework as a universal DEM collaborative registration solution.

Keywords: registration; feature point matching; digital elevation models (DEMs); image rendering

1. Introduction

Digital elevation models (DEMs), as fundamental data that represent surface elevation variations, are widely applied in numerous fields such as glacier monitoring, landform analysis, and natural hazard surveillance [1]. However, due to sensor attitude variations, image geometric errors, and limitations in ground control accuracy, DEM data from different sources or time periods often exhibit spatial offsets [2]. This leads to significant issues when integrating multi-source DEMs for remote sensing applications. Particularly in change detection, DEM fusion, and error assessment, the absence of precise registration can result in error propagation, feature blurring, or even misinterpretation of results [3]. DEM registration is therefore a critical step in enabling collaborative analysis among multi-source DEMs. The objective of DEM registration is to establish an appropriate geometric transformation that aligns one or more DEMs with a reference dataset. Accurate registration not only improves the consistency and fusion quality of multi-source DEMs but also provides a reliable data foundation for high-precision geospatial analysis [4,5].

DEM registration methods can be categorized into two types: one is based on three-dimensional model registration, utilizing the characteristics of DEM in three-dimensional space; the other is based on image registration, where the three-dimensional models of DEMs are projected into two-dimensional images, with elevation values serving as pixel values, followed by registration using image matching methods.

Both approaches have demonstrated promising results in distinct scenarios. Methods based on three-dimensional models rely on elevation differences or slope fields as core driving variables, exhibiting strong dependence on terrain feature representations while failing to fully leverage the spatial image characteristics inherent in DEMs. Meanwhile, two-dimensional image-based matching methods, primarily developed for remote sensing imagery, overlook the 3D geometric characteristics and elevation error mechanisms inherent in DEMs. They exhibit significant limitations when applied to DEM datasets with weak texture, flat terrain, or high-frequency noise. Therefore, this paper integrates DEM terrain features with advanced image matching algorithms to propose a DEM registration method based on physically consistent rendering and image matching. This method first renders the DEM into specialized images, incorporating physical properties in irradiance-based and parallax geometric models and thereby enhancing their ability to express terrain features across diverse complex scenarios. Subsequently, image matching algorithms extract matching point pairs. Based on the elevation information corresponding to these pairs, preliminary outlier filtering is performed. Robust matching points are selected through elevation error distribution analysis to estimate an affine model. The elevation data associated with the matching point pairs is then utilized to correct the overall elevation offset of the DEM. The goal of this study is to enable robust global DEM registration for multi-source datasets under challenging conditions, including low texture, flat terrain, and elevated noise levels. The main contributions are summarized as follows:

1. A robust DEM registration algorithm is developed that performs effectively across varying topography and resolution.
2. By employing a physically consistent rendering method based on geometric modeling and radiative simulation, the digital elevation model is rendered as an image representation of illumination and radiative properties. Simultaneously, the elevation information from the DEM is utilized for matching the two-dimensional images. This approach fully leverages both the image information and elevation data of the DEM.
3. Comparative validation with classical DEM registration algorithms and state-of-the-art remote sensing image registration algorithms demonstrates the proposed algorithm’s adaptability and robustness across diverse terrains, including plains, urban areas, and glaciers.

2. Related Works

2.1. Three-Dimensional Registration Methods

Three-dimensional registration methods primarily fall into two categories: terrain attribute analysis-based methods and point cloud-based registration methods.

Terrain attribute analysis methods calculate systematic offsets by establishing functional relationships between elevation residuals and terrain parameters such as slope and aspect. Among these, the Nuth & Kääb (NK) algorithm [6] is currently the most widely used, particularly for DEM data lacking precise geographic registration information, such as products automatically generated from optical or radar imagery without external control point constraints. Streutker et al. [7] statistically estimate horizontal and vertical offsets between two DEMs by analyzing local slope and elevation differences within their overlapping area. The NK algorithm assumes systematic shifts in elevation differences along the aspect direction between unregistered DEMs and fits a sine function between aspect and elevation differences to estimate horizontal offsets. This method offers advantages such as simplicity, computational efficiency, and independence from feature recognition, and it performs well in high-slope areas such as glaciers and mountains. However, the NK method has limitations: it estimates only horizontal translation and cannot handle complex transformations such as rotation or scaling; its core fitting model is sensitive to noise and anomalous terrain (e.g., shadows, snow cover); and in low-slope regions, registration accuracy degrades significantly due to the lack of terrain attribute constraints.

Point cloud-based registration methods sample DEMs into discrete point clouds [8,9]. The Least-Z Difference (LZD) algorithm constructs a least-squares optimization objective based on elevation residuals to directly fit rigid or affine transformation parameters (including translation, rotation, and scaling), achieving flexible and high-precision registration [10,11]. It performs exceptionally well in scenarios with significant systematic errors or large DEM resolution differences. However, LZD relies on dense pixel interpolation calculations, resulting in relatively high computational demands and susceptibility to missing values and outliers in the DEM. Furthermore, its high accuracy depends on the initial registration being relatively close; otherwise, it is prone to converging to local optima. Pilgrim et al. [12] introduced the M-LZD algorithm by incorporating maximum likelihood estimation, enabling it to handle substantial surface deformations. Zhang et al. [13] combined Least Trimmed Squares (LTS) with LZD to control surface identification and transformation model construction, enhancing computational efficiency. Li et al. [14] compared the advantages and disadvantages of the NK and LZD algorithms, proposing a nonparametric method to eliminate complex systematic errors in DEM registration results. The Iterative Closest Point (ICP) algorithm, initially developed for 3D point cloud registration [15,16], has since been widely applied to geometric alignment of DEM data [17]. Its fundamental principle involves iteratively establishing closest-point correspondences between two datasets while minimizing Euclidean distance to estimate rigid or similarity transformation parameters. ICP’s advantages in DEM registration include handling rotation, scaling, and non-parallel offsets, and accommodating diverse data formats such as DEM-to-point cloud and DEM-to-DEM registrations. Many improved ICP variants have been widely used in point cloud registration.
However, its limitations are also evident: ICP is sensitive to initial positions, and excessive initial offsets may lead to convergence to local optima. Additionally, point-to-point or point-to-surface searches on dense grid DEMs incur high computational costs, necessitating acceleration structures (e.g., kd-trees) for assistance [18,19]. Overall, ICP is better suited for precise alignment between DEMs and external high-precision point clouds or terrain models, but it lacks efficiency for rapid registration of large-scale, low-resolution DEMs [20].

2.2. Two-Dimensional Registration Methods

Two-dimensional registration methods treat DEMs as ordinary remote sensing images for registration, typically drawing on processing techniques from multimodal image matching. Huang et al. [21] employed Scale-Invariant Feature Transform (SIFT) to align multi-temporal DEM data for glacier change detection, while Ravanbakhsh et al. [22] utilized Mutual Information (MI)-based DEM registration, revealing that region-based image matching methods achieve higher accuracy than 3D registration approaches. Some studies attempt to integrate 2D and 3D registration concepts, forming phased or hybrid DEM registration strategies. Wang et al. [23] proposed a two-stage DEM registration method without ground control points for landslide deformation monitoring. This method employs image matching techniques for coarse registration in the first stage to eliminate large-scale horizontal offsets, followed by ICP algorithm-based fine registration in the second stage to further enhance accuracy. Li et al. [24] proposed an automatic DEM registration method based on sub-basin centroids from a topographic–hydrological perspective. This approach treats sub-basins as stable physical units, using their centroids as control points in the registration process. Compared to traditional geometric or image features, this method offers greater physical significance and stability. Furthermore, since DEMs inherently represent terrain elevation in a regular grid format, when errors between DEMs primarily manifest as horizontal variations, the registration problem can be partially transformed into a matching problem within a two-dimensional image space. Consequently, relevant image registration methods also hold certain reference value [25,26,27,28,29,30,31,32,33,34,35,36]. In recent years, with the advancement of deep learning, some studies have attempted to utilize end-to-end networks to learn cross-modal feature representations for the automatic registration of multimodal remote sensing images. 
Lindenberger et al. [37] proposed the LightGlue method, which achieves fast and robust matching through efficient local feature matching and graph neural network optimization, providing a potential reference for the registration of large-scale high-resolution remote sensing data. Quan et al. [38] proposed a cross-modal registration method based on deep wavelet learning for heterogeneous remote sensing images such as optical and SAR images, improving matching robustness through multi-scale feature extraction. Zhao et al. [39] designed a dual-branch network of global and local features for SAR and optical image registration, which can balance global consistency and local detail alignment. In terms of applications, Jing et al. [40] proposed a framework that integrates registration into change detection to achieve automatic registration and change analysis of misaligned remote sensing images, emphasizing the direct value of registration methods in downstream tasks. Kang et al. [41] proposed an attention-based and context-aware deep learning framework (ACAMatch) for robust and efficient optical-SAR image registration. These works demonstrate the importance of deep learning enhancement methods in remote sensing registration and provide insights for elevation data (DEM) registration methods, especially when dealing with complex terrain, multi-source or heterogeneous resolution data. While these methods have achieved satisfactory results in two-dimensional image registration tasks such as optical-SAR, they typically rely on large amounts of training data, and their generalization capabilities and interpretability still require further validation [42].

3. Materials and Methods

3.1. Study Area

We selected five sets of experimental data from different regions to validate the methods proposed in this paper. The first three sets involved simulated experiments where real DEMs were artificially distorted to mimic real-world DEM registration challenges. The accuracy of the algorithms was then estimated based on known distortion parameters. The latter two sets utilized real multi-temporal, multi-source DEMs to compare registration results obtained through different methods, evaluating the alignment quality of the two DEMs before and after registration. These experimental data, including DEMs at different resolutions and with varying terrain conditions, help us explore the adaptability of the proposed method under different circumstances.

3.1.1. Simulation Data

The simulation data were obtained from Australia, Europe, and Antarctica. The Australian data, located at Christmas Island and Darwin, used the 5 m resolution LiDAR-derived digital elevation model (DEM) released by Geoscience Australia as reference data. For Antarctica, the Reference Elevation Model of Antarctica (REMA) DEM served as the reference DEM data. The European data, located in the Alps, came from the Shuttle Radar Topography Mission (SRTM). Detailed information on the data is provided in Table 1.

3.1.2. Actual Data

A typical urban area in the Middle East was selected as the registration experiment site. Located in the arid region of the Middle East, this area features a compact urban layout with high building density. Surface cover types primarily include residential areas, road networks, industrial facilities, and sparse vegetation with surrounding mountainous terrain. The study area is shown in Figure 1. Multi-temporal DEM data derived from the Gaofen-7 satellite were selected as the reference DEM and the DEM to be registered, featuring a spatial resolution of 1 m. These datasets exhibit anomalous points caused by occlusion, shadows, and cloud cover. The registered results serve as reference data for urban change detection, enabling near-real-time damage monitoring following war, earthquakes, or extreme weather events in urban areas.

The North China Plain was selected as the second study area. Located at approximately 36° N, 115° E, this region features a warm temperate semi-humid monsoon climate. Its terrain is predominantly alluvial plains with low-lying topography, ranging between 30 and 70 m in elevation. Agricultural land is extensively distributed, with surface cover dominated by farmland and scattered villages. The terrain exhibits minimal elevation variation, characterizing it as a typical low-slope area. The study area is shown in Figure 2. The experimental dataset comprises two publicly available DEMs:

1. Copernicus GLO-30 DEM (COP-30): A globally available open-access digital elevation model provided by the European Space Agency (ESA) and the European Commission. It features a spatial resolution of 30 m. This dataset offers consistent and seamless global coverage.
2. ALOS World 3D-30 m (AW3D30): A global open DEM product generated by the Japan Aerospace Exploration Agency (JAXA) based on ALOS satellite PRISM imagery, also featuring a spatial resolution of 30 m.

To better understand the sources of error in the registration results, it is necessary to explain the absolute accuracy and limitations of the reference DEMs used. The Gaofen-7 DEM has a spatial resolution of 1 m and high overall accuracy, but outliers may still exist in areas affected by building obstructions, shadows, or clouds. The COP-30 and AW3D30 DEMs have a resolution of 30 m and good accuracy in flat areas, but they may exhibit larger vertical errors in mountainous or undulating terrain. Therefore, the residuals in the registration results presented in this paper reflect both the errors inherent in the method itself and the limitations imposed by the absolute accuracy of the reference data.

3.2. Methodology

The overall workflow is illustrated in Figure 3. It comprises two main components: physically consistent rendering and a multimodal image matching algorithm that incorporates elevation information. The first component integrates irradiance-based models with parallax geometric models to render DEM data into images with visible textures, thereby enhancing the stability of subsequent feature matching. In the second component, candidate matching pairs are acquired in two steps: Harris corner detection followed by template matching. Robust matching points are then selected from the template-matched pairs based on the elevation error distribution and geometric consistency constraints and are used for affine model estimation. Once the transformation parameters are estimated, they are applied to the entire DEM together with a consistent elevation correction, generating the registered digital elevation model.

3.2.1. Physically Consistent Image Rendering

The core objective of DEM rendering is to convert raw elevation data into images with visible textural features through geometric modeling and radiative transfer simulation, thereby enhancing feature stability in subsequent image registration. The rendering process integrates solar irradiance geometry, terrain morphology information, and satellite observation angles, primarily incorporating two physical models: the irradiance-based model [43] and the parallax geometry model [44]. The irradiance-based model calculates illumination intensity based on solar elevation angle, azimuth angle, and slope aspect to enhance terrain shadows and brightness variations. The parallax geometry model simulates projection shifts under oblique imaging, making the rendered results more akin to real remote sensing imagery. This endows the DEM with rich geometric and radiometric characteristics, providing a stable foundation for subsequent feature point matching and high-precision registration.

Irradiance-Based Model: Assuming the surface is an ideal Lambertian reflector, its reflected radiant intensity follows the cosine law. That is, the reflected irradiance $L$ on a unit surface is proportional to the cosine of the angle $β$ between the incident ray and the surface normal:

$$L = E_{s} ρ \cos β \tag{1}$$

where $E_{s}$ is the solar radiation intensity, $ρ$ is the surface reflectance, and $β$ is the angle between the incident light direction and the surface normal. The cosine term is computed from the terrain and solar geometry as

$$\cos β = \cos θ_{s} \cos ϕ_{z} + \sin θ_{s} \sin ϕ_{z} \cos (ϕ_{a} - θ_{a}) \tag{2}$$

where $θ_{s}$ is the terrain slope, $ϕ_{z}$ is the solar zenith angle, $ϕ_{a}$ is the solar azimuth angle, and $θ_{a}$ is the terrain aspect. The slope and aspect can be calculated from the DEM:

$$θ_{s} = \arctan \left( \sqrt{\left( \frac{\partial z}{\partial x} \right)^{2} + \left( \frac{\partial z}{\partial y} \right)^{2}} \right) \tag{3}$$

$$θ_{a} = \arctan \left( \frac{\partial z / \partial y}{\partial z / \partial x} \right) \tag{4}$$

where $z$ represents the continuous elevation surface. Through the above model, the relative illumination intensity for each pixel can be obtained, rendering an image with light and dark textures to enhance terrain feature expression.
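As a rough illustration of Equations (1)–(4), the sketch below renders a relative brightness image with NumPy. The function name `render_irradiance`, the use of `np.gradient` for the partial derivatives, the quadrant-aware `arctan2` for the aspect, and the convention that aspect and solar azimuth are both measured from the x-axis are assumptions made for illustration; the paper does not state its discretization or angle conventions.

```python
import numpy as np

def render_irradiance(dem, cell_size, sun_azimuth_deg, sun_zenith_deg,
                      e_s=1.0, rho=1.0):
    """Relative Lambertian brightness per pixel (hypothetical helper)."""
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    dz_dy, dz_dx = np.gradient(dem.astype(float), cell_size)
    slope = np.arctan(np.hypot(dz_dx, dz_dy))      # Eq. (3)
    aspect = np.arctan2(dz_dy, dz_dx)              # Eq. (4), quadrant-aware variant
    phi_z = np.deg2rad(sun_zenith_deg)
    phi_a = np.deg2rad(sun_azimuth_deg)
    cos_beta = (np.cos(slope) * np.cos(phi_z)
                + np.sin(slope) * np.sin(phi_z) * np.cos(phi_a - aspect))  # Eq. (2)
    # Eq. (1); facets turned away from the sun are clipped to zero (assumption).
    return e_s * rho * np.clip(cos_beta, 0.0, None)
```

On a perfectly flat DEM the slope is zero everywhere, so the output reduces to the cosine of the solar zenith angle, which is a convenient sanity check.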

Parallax Geometric Model: To simulate terrain-induced parallax displacement in remote sensing imagery, spatial projection correction of the brightness map is required based on satellite observation geometry. The fundamental principle is as follows: when a satellite observes terrain at an oblique angle, elevation variations cause translational shifts in the projected positions of different terrain points:

$$x^{'} = x - Δ x, \quad Δ x = \frac{Δ h \sin φ_{a}}{r_{x} \tan φ_{z}} \tag{5}$$

$$y^{'} = y - Δ y, \quad Δ y = \frac{Δ h \cos φ_{a}}{r_{y} \tan φ_{z}} \tag{6}$$

where $φ_{a}$ and $φ_{z}$ represent the satellite’s azimuth angle and elevation angle, respectively; $Δ h$ denotes the difference between the pixel’s elevation and the average elevation of the projection plane; and $r_{x}$ and $r_{y}$ are the spatial resolutions of the DEM along the $x$ and $y$ directions.

This process simulates spatial misalignment phenomena in remote sensing imagery, such as mountain silhouettes and projection distortions. Consequently, the rendered image more closely resembles actual observed imagery, facilitating subsequent feature point identification and image registration.
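The displacement in Equations (5) and (6) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the mean DEM elevation is used as the projection plane, the angle in the tangent is treated as the satellite elevation angle, and the nearest-neighbour forward warp ignores holes and pixel collisions that a production renderer would need to handle.

```python
import numpy as np

def parallax_shift(dem, res_x, res_y, sat_azimuth_deg, sat_elevation_deg):
    """Per-pixel displacement (dx, dy) in pixels, following Eqs. (5)-(6)."""
    phi_a = np.deg2rad(sat_azimuth_deg)
    phi_e = np.deg2rad(sat_elevation_deg)
    dh = dem - dem.mean()                       # height above the mean plane
    dx = dh * np.sin(phi_a) / (res_x * np.tan(phi_e))
    dy = dh * np.cos(phi_a) / (res_y * np.tan(phi_e))
    return dx, dy

def warp_forward(image, dx, dy):
    """Move each pixel to (x - dx, y - dy) with nearest-neighbour rounding."""
    rows, cols = image.shape
    yy, xx = np.mgrid[0:rows, 0:cols]
    x_new = np.rint(xx - dx).astype(int)
    y_new = np.rint(yy - dy).astype(int)
    valid = (x_new >= 0) & (x_new < cols) & (y_new >= 0) & (y_new < rows)
    out = np.zeros_like(image)                  # unfilled pixels stay zero
    out[y_new[valid], x_new[valid]] = image[valid]
    return out
```

For a flat DEM the displacement field is zero and the warped brightness image equals the input, so any texture change introduced by this step comes purely from relief.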

Through the aforementioned brightness and parallax modeling, the original DEM is transformed into a grayscale image rich in shadow, texture, and projection features. This rendered image combines geometric structure with radiometric properties, offering excellent image matchability. It provides a stable, repeatable visual reference for subsequent feature point extraction or dense registration methods, demonstrating particularly significant effectiveness in areas with minor topographic variations or insufficient texture in the original DEM.

3.2.2. Multimodal Image Matching Algorithm Incorporating Elevation Information

The core concept addresses the issues of radiometric differences and geometric shifts between rendered images. By integrating elevation data from corresponding DEMs, we propose a multimodal image matching algorithm that incorporates elevation information: First, the Harris operator uniformly extracts corner points in the reference image. Then, sub-pixel template matching is achieved using fast Fourier transform (FFT) cross-correlation to obtain candidate point pairs. Outliers in the corresponding DEM elevation data for candidate point pairs are filtered to eliminate false matches, ultimately yielding a robust set of feature point pairs. This enables simultaneous estimation of the affine model for DEM registration and calibration of elevation values.

First, corner points are detected on the reference image using the Harris corner detector [45]. Following the normalized formulation proposed by Noble [46], the corner response is computed as

$$R = \frac{\det (M)}{\mathrm{trace} (M) + ε}, \tag{7}$$

where $M$ denotes the image gradient matrix (structure tensor) within a local window and $ε$ is a small constant introduced to prevent division by zero. Compared with the original Harris response, this normalized form reduces sensitivity to overall image contrast and improves numerical stability. Standard non-maximum suppression is then applied to identify salient corner candidates, consistent with the Harris–Stephens framework.

To further improve the spatial uniformity of the extracted features and avoid excessive clustering in highly textured regions, we introduce an additional block-based grid filtering strategy. The image is partitioned into regular grid cells, and a limited number of corner points with the strongest responses are retained within each cell.
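The corner response of Equation (7) and the block-based grid filter might be sketched as below. This is an assumption-laden illustration using only NumPy: the helper names, the smoothing-then-central-difference approximation of Gaussian derivatives, the cell size, and the number of points kept per cell are all hypothetical, and non-maximum suppression is omitted for brevity.

```python
import numpy as np

def _smooth(img, sigma):
    """Separable Gaussian blur with edge padding (hypothetical helper)."""
    radius = max(1, int(np.ceil(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    k /= k.sum()
    padded = np.pad(img, radius, mode="edge")
    tmp = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, tmp)

def noble_response(img, sigma_d=1.05, sigma_i=1.5, eps=1e-12):
    """det(M) / (trace(M) + eps), as in Eq. (7)."""
    iy, ix = np.gradient(_smooth(img.astype(float), sigma_d))
    a = _smooth(ix * ix, sigma_i)
    b = _smooth(iy * iy, sigma_i)
    c = _smooth(ix * iy, sigma_i)
    return (a * b - c * c) / (a + b + eps)

def grid_corners(response, cell=32, per_cell=1):
    """Keep the strongest positive responses inside each grid cell."""
    points = []
    rows, cols = response.shape
    for r0 in range(0, rows, cell):
        for c0 in range(0, cols, cell):
            block = response[r0:r0 + cell, c0:c0 + cell]
            order = np.argsort(block, axis=None)[::-1][:per_cell]
            for idx in order:
                r, c = np.unravel_index(idx, block.shape)
                if block[r, c] > 0:
                    points.append((r0 + int(r), c0 + int(c)))
    return points
```

Capping the number of points per cell trades a few strong corners in textured regions for coverage of weakly textured ones, which matters when a single global transformation is fitted afterwards.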

For each candidate corner, a template window of fixed size is defined, centered at the point [47]; the experiments use $201 \times 201$ pixels (Section 4.1). A target region of the same size is extracted from the corresponding position in the image to be matched, and the similarity between the two regions is assessed via frequency-domain cross-correlation. Upon successful matching, the coordinates of the matched point pair are recorded; invalid or mismatched pairs (e.g., those with no distinct correlation peak or with a displacement exceeding the allowed limit) are filtered out. The final output is a set of successfully paired feature control points from both images.
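The template matching step described above can be illustrated with a small FFT cross-correlation routine. It is a sketch under stated assumptions: the function name is hypothetical, windows are zero-padded to a power of two at least twice their size, the peak is refined with a one-dimensional quadratic fit per axis, and the Gaussian weighting window and peak-quality checks are left out.

```python
import numpy as np

def fft_cross_correlation_shift(template, target):
    """Estimate the (dy, dx) shift that moves template content onto target."""
    t = template - template.mean()
    g = target - target.mean()
    # Zero-pad so that the circular correlation behaves like a linear one.
    shape = [int(2 ** np.ceil(np.log2(2 * n))) for n in t.shape]
    spec_t = np.fft.rfft2(t, shape)
    spec_g = np.fft.rfft2(g, shape)
    corr = np.fft.irfft2(spec_g * np.conj(spec_t), shape)
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    shift = []
    for axis, p in enumerate(peak):
        n = corr.shape[axis]
        lo, hi = list(peak), list(peak)
        lo[axis] = (p - 1) % n
        hi[axis] = (p + 1) % n
        c_m, c_0, c_p = corr[tuple(lo)], corr[peak], corr[tuple(hi)]
        denom = c_m - 2.0 * c_0 + c_p
        offset = 0.5 * (c_m - c_p) / denom if denom != 0 else 0.0
        s = p + offset
        if s > n / 2:            # indices past the midpoint are negative shifts
            s -= n
        shift.append(float(s))
    return tuple(shift)
```

A caller would typically reject the match when the returned shift exceeds a displacement limit or when the peak is not clearly above the correlation background; those thresholds are not specified here.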

In the absence of real terrain changes, elevation differences are typically attributed to the combined effects of the following errors: sensor measurement error, imaging geometric error, DEM interpolation and resampling errors, and residual registration errors. These errors are typically regarded as independent random variables. Notably, we treat these errors as a single aggregated residual without attempting to distinguish their individual contributions. According to the central limit theorem, the sum of multiple independent random errors tends to follow a normal distribution. Therefore, for successfully matched feature control point sets, the elevation differences between corresponding points should also conform to a normal distribution, as shown in Figure 4. However, in practical scenarios, factors such as terrain variations [4] and shadow occlusion may introduce significant outliers between corresponding points. The elevation differences of these points are then better represented by a Gaussian Mixture Model (GMM). Consequently, this paper employs the expectation–maximization (EM) algorithm to fit the GMM. After fitting, the Gaussian component with a mean close to zero, minimal variance, and maximum weight is selected as the primary component, achieving statistically grounded outlier removal. Concurrently, by incorporating geometric consistency constraints, a robust set of feature point pairs is ultimately screened. These are utilized to calibrate elevation differences and fit an affine model. It should be noted that the proposed registration framework is formulated to estimate a global affine transformation between DEMs, aiming to correct systematic planar misalignments and global elevation biases. Local inconsistencies arising from non-rigid terrain deformation, temporal surface changes, or data artifacts are not explicitly modeled or corrected.
Instead, such effects are treated as outliers during the feature matching and robust estimation stages and are consequently down-weighted or removed to prevent bias in the global transformation. The elevation correction and model estimation procedures are shown in Equations (8) and (9):

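$$\bar{Δ h} = \frac{1}{M} \sum_{i = 1}^{M} (h_{i}^{t} - h_{i}^{r}) \tag{8}$$

$$\min_{A} \sum_{i = 1}^{M} {\lVert x_{i}^{r} - A x_{i}^{t} \rVert}^{2} \tag{9}$$

where $A$ represents the two-dimensional affine transformation model, $x$ denotes the coordinates of corresponding points, $h$ represents the elevation values of corresponding points, and $M$ here is the number of retained point pairs; the superscripts $r$ and $t$ refer to the reference DEM and the DEM to be registered, respectively.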
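Equations (8) and (9) amount to a mean offset and an ordinary least-squares fit. A minimal NumPy sketch follows, assuming the affine model is parameterized as a 2×3 matrix acting on homogeneous coordinates; the function names are hypothetical and no robust weighting is included here.

```python
import numpy as np

def estimate_affine(xy_t, xy_r):
    """Least-squares 2x3 affine matrix mapping target points to reference points."""
    xy_t = np.asarray(xy_t, dtype=float)
    xy_r = np.asarray(xy_r, dtype=float)
    design = np.hstack([xy_t, np.ones((len(xy_t), 1))])    # (M, 3) homogeneous
    params, *_ = np.linalg.lstsq(design, xy_r, rcond=None)  # (3, 2)
    return params.T                                         # (2, 3)

def mean_elevation_offset(h_t, h_r):
    """Average elevation bias between matched points, as in Eq. (8)."""
    return float(np.mean(np.asarray(h_t, float) - np.asarray(h_r, float)))
```

The resulting bias would be subtracted from the DEM to be registered after it has been resampled with the estimated affine matrix.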

3.3. Experimental Settings

This study selected several representative methods as comparative baselines to systematically evaluate the performance of the proposed method, covering both common DEM registration methods and recent state-of-the-art remote sensing image registration methods: LZD [10], NK [14], POS-GIFT [48], MSG [49], and WSSF [50]. In the simulation experiments, we also compared registration on raw DEM images with registration on rendered imagery to validate the necessity and effectiveness of physically consistent rendering in enhancing feature matching stability and registration accuracy.

In the simulation experiment design, a known two-dimensional translation was first applied to the original DEM to construct a global horizontal displacement field. Subsequently, a fixed-amplitude local deformation was introduced in the central region of the DEM, as shown in Figure 5, to simulate the complex registration scenarios of real multi-source DEMs affected by factors such as systematic errors and local distortions. The planar accuracy of the matching results was quantitatively evaluated using the root mean square error (RMSE) by comparing the residuals between the theoretical displacement of matching points and their actual estimated displacements.

$$\mathrm{RMSE}_{xy} = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} \left[ (Δ x_{i})^{2} + (Δ y_{i})^{2} \right]} \tag{10}$$

Here, $N$ is the number of matching points, and $Δ x_{i}$ and $Δ y_{i}$ are the displacement residuals on the plane.

In the real-data experiment, spatial alignment of multi-source DEMs was achieved using different registration methods, and elevation differences between DEMs before and after registration were calculated. By statistically analyzing the distribution characteristics of elevation residuals, robust statistical metrics including median, median absolute deviation (MAD), root mean square error (RMSE), LE68, and LE95 were employed to comprehensively evaluate the registration performance:

$$Δ h_{i} = h_{i}^{\mathrm{reg}} - h_{i}^{\mathrm{ref}} \tag{11}$$

$$\mathrm{RMSE}_{h} = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} (Δ h_{i})^{2}} \tag{12}$$

$$\mathrm{Median} = \mathrm{median} (Δ h) \tag{13}$$

$$\mathrm{MAD} = \mathrm{median} (| Δ h_{i} - \mathrm{median} (Δ h) |) \tag{14}$$

$$\mathrm{LE}_{p} = \mathrm{percentile} (| Δ h |, p) \tag{15}$$

where $N$ is the number of pixels in an image, $h_{i}^{\mathrm{ref}}$ represents the elevation value of the reference DEM, and $h_{i}^{\mathrm{reg}}$ is the elevation value of the registered DEM.
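The evaluation metrics in Equations (10)–(15) are straightforward to compute. The sketch below assumes NumPy arrays as input, drops non-finite residuals (for example NoData cells stored as NaN), and relies on NumPy's default linear-interpolation percentile; the function names are hypothetical.

```python
import numpy as np

def planar_rmse(dx, dy):
    """Planar RMSE of displacement residuals, Eq. (10)."""
    dx = np.asarray(dx, dtype=float)
    dy = np.asarray(dy, dtype=float)
    return float(np.sqrt(np.mean(dx ** 2 + dy ** 2)))

def elevation_metrics(h_reg, h_ref):
    """Median, MAD, RMSE, LE68, and LE95 of elevation residuals, Eqs. (11)-(15)."""
    dh = np.asarray(h_reg, dtype=float) - np.asarray(h_ref, dtype=float)
    dh = dh[np.isfinite(dh)]                    # ignore NaN / inf cells
    med = np.median(dh)
    return {
        "median": float(med),
        "mad": float(np.median(np.abs(dh - med))),
        "rmse": float(np.sqrt(np.mean(dh ** 2))),
        "le68": float(np.percentile(np.abs(dh), 68)),
        "le95": float(np.percentile(np.abs(dh), 95)),
    }
```

For example, residuals of −2, −1, 0, 1, and 2 m give a median of 0, a MAD of 1, and an RMSE of √2 ≈ 1.414 m.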

4. Results

4.1. Parameter Setting

The proposed method involves several key parameters that are critical for reproducibility and performance:

Harris Corner Detection: The corner response is defined by Equation (7). Image gradients $I_{x}$ and $I_{y}$ are computed using a Gaussian derivative kernel with standard deviation $σ_{D} = 0.7 σ$, where $σ = 1.5$ is the gradient scale. The auto-correlation matrix $M$ is smoothed with a Gaussian window of standard deviation $σ_{I} = 1.5$. A small constant $ε$ prevents division by zero. No sensitivity parameter $k$ is required in this formulation.
Template Matching: Templates of size $201 \times 201$ pixels (radius 100) are used. As discussed for many local matching methods [51,52], this size offers a practical balance between accuracy and computational efficiency. A Gaussian weighting window with standard deviation $σ_{w} = 50$ pixels emphasizes the central region. Both template and target regions are zero-padded to the next power of two for FFT efficiency. The sub-pixel peak location is refined using quadratic interpolation.
Outlier Filtering: The number of mixture components is automatically determined by minimizing the Bayesian Information Criterion (BIC), with the candidate component number limited to a maximum of three to ensure model stability. To identify the most reliable component representing consistent elevation correspondences, a scoring function is designed that jointly considers the proximity of the component mean to zero, the variance, and the mixture weight. The component with the highest score is selected as the inlier component. Matched points are then filtered based on their posterior probabilities belonging to this component, yielding a robust set of elevation-consistent correspondences for subsequent DEM registration and evaluation:

$\mathrm{Score}_{k} = \frac{π_{k}}{(|μ_{k}| + ε)(σ_{k} + ε)},$

(16)

where $π_{k}$ denotes the mixture weight of the k-th component, $μ_{k}$ and $σ_{k}$ represent the mean and standard deviation of the corresponding Gaussian distribution, respectively, and $ε$ is a small positive constant introduced to avoid numerical instability.
Rendering Geometry: Solar azimuth and zenith angles are $270°$ and $55.56°$, respectively. Satellite azimuth and elevation angles are $121.40°$ and $75.67°$, and they are kept fixed across all experiments.
RANSAC parameters: The inlier threshold for the RANSAC stage is set to 1.5.
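The component scoring in Equation (16) and the posterior-based filtering can be sketched as follows. This is a minimal illustration, assuming the mixture weights, means, and standard deviations have already been estimated (for example by EM with BIC-based model selection, which is not shown). The function names and the posterior cut-off `min_posterior` are hypothetical and are not taken from the paper.

```python
import numpy as np

def component_scores(weights, means, stds, eps=1e-6):
    """Score_k = pi_k / ((|mu_k| + eps) * (sigma_k + eps)), as in Equation (16)."""
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    return weights / ((np.abs(means) + eps) * (stds + eps))

def posteriors(dh, weights, means, stds):
    """Posterior probability of each 1-D Gaussian component for each elevation difference."""
    dh = np.asarray(dh, dtype=float)[:, None]          # shape (n, 1)
    weights = np.asarray(weights, dtype=float)         # shape (k,)
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    dens = weights * np.exp(-0.5 * ((dh - means) / stds) ** 2) / (stds * np.sqrt(2.0 * np.pi))
    total = np.maximum(dens.sum(axis=1, keepdims=True), np.finfo(float).tiny)
    return dens / total

def select_inliers(dh, weights, means, stds, min_posterior=0.5):
    """Pick the highest-scoring component and keep points likely to belong to it.

    min_posterior is an assumed cut-off, chosen here only for illustration.
    """
    k = int(np.argmax(component_scores(weights, means, stds)))
    keep = posteriors(dh, weights, means, stds)[:, k] >= min_posterior
    return k, keep

# Two components: a narrow one near zero and a broad one near the 50 m deformation.
k, keep = select_inliers([0.1, -0.5, 49.0, 52.0], [0.6, 0.4], [0.2, 50.0], [1.0, 5.0])
print(k, keep)
```

In this toy case the near-zero, low-variance component receives the highest score, so the two points with elevation differences near 50 m are rejected.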

For clarity, Table 2 summarizes the key parameters with the same symbols used in the text.
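The template-matching settings above (zero-padding to the next power of two, FFT-based correlation, and quadratic sub-pixel refinement) can be illustrated with a short sketch. It is an assumption-laden simplification: it uses plain cross-correlation without the Gaussian weighting window or any normalization, and the helper names are hypothetical rather than the authors' implementation.

```python
import numpy as np

def next_pow2(n):
    """Smallest power of two that is >= n (e.g. 201 -> 256)."""
    return 1 << (int(n) - 1).bit_length()

def quadratic_offset(left, center, right):
    """Offset of the vertex of the parabola through three equally spaced samples."""
    denom = left - 2.0 * center + right
    return 0.0 if denom == 0 else 0.5 * (left - right) / denom

def fft_shift_estimate(template, target):
    """Estimate (dy, dx) such that target is approximately template shifted by (dy, dx)."""
    template = np.asarray(template, dtype=float)
    target = np.asarray(target, dtype=float)
    p0 = next_pow2(max(template.shape[0], target.shape[0]))
    p1 = next_pow2(max(template.shape[1], target.shape[1]))
    # rfft2 with s=(p0, p1) zero-pads both inputs to the power-of-two size.
    t_hat = np.fft.rfft2(template, s=(p0, p1))
    g_hat = np.fft.rfft2(target, s=(p0, p1))
    corr = np.fft.irfft2(np.conj(t_hat) * g_hat, s=(p0, p1))
    iy, ix = np.unravel_index(np.argmax(corr), corr.shape)
    # Refine the integer peak along each axis with a three-point parabola.
    dy = quadratic_offset(corr[(iy - 1) % p0, ix], corr[iy, ix], corr[(iy + 1) % p0, ix])
    dx = quadratic_offset(corr[iy, (ix - 1) % p1], corr[iy, ix], corr[iy, (ix + 1) % p1])
    sy, sx = iy + dy, ix + dx
    # Indices past the midpoint correspond to negative shifts (circular correlation).
    if sy > p0 / 2:
        sy -= p0
    if sx > p1 / 2:
        sx -= p1
    return float(sy), float(sx)

# Synthetic check: a Gaussian blob displaced by a sub-pixel amount.
yy, xx = np.mgrid[0:50, 0:50]
blob = lambda cy, cx: np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * 4.0 ** 2))
print(fft_shift_estimate(blob(25, 25), blob(27.3, 22.6)))
```

Because the correlation is circular, the padded size should exceed the template size plus the largest expected shift; the sketch does not enforce this.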

4.2. Results of Simulation Experiment

To simulate spatial and elevation perturbations in the DEM, we applied a 10-pixel translation to the digital elevation model of the experimental area along both the x-axis and y-axis. Simultaneously, a local deformation with an elevation increment of 50 m was introduced to the central region. The corresponding simulation results are listed in Table 3; information on each experimental area is given in Table 1.
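A perturbation of this kind can be reproduced with a few lines of NumPy. The sketch below is illustrative only: the shift direction, the rectangular shape of the deformed block, and its extent (`center_frac`) are assumptions, since the text specifies only the 10-pixel translation and the 50 m increment in the central region.

```python
import numpy as np

def perturb_dem(dem, shift_px=10, bump_m=50.0, center_frac=0.25):
    """Shift a DEM by shift_px pixels along x and y, then raise a central block by bump_m.

    shift_px must be a positive integer. Cells with no source data after the
    shift are set to NaN. center_frac (fraction of each dimension covered by
    the deformed block) is a hypothetical parameter.
    """
    dem = np.asarray(dem, dtype=float)
    out = np.full_like(dem, np.nan)
    out[shift_px:, shift_px:] = dem[:-shift_px, :-shift_px]

    h, w = dem.shape
    half_h, half_w = int(h * center_frac / 2), int(w * center_frac / 2)
    cy, cx = h // 2, w // 2
    out[cy - half_h:cy + half_h, cx - half_w:cx + half_w] += bump_m
    return out

dem = np.arange(100 * 100, dtype=float).reshape(100, 100)
sim = perturb_dem(dem)
print(sim[20, 20] - dem[10, 10], sim[50, 50] - dem[40, 40])
```

Outside the central block the simulated DEM differs from the original only by the translation, while inside it the elevation difference also includes the 50 m offset that the outlier filtering is expected to reject.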

As shown in Table 3, the proposed method achieves successful matching across all datasets while maintaining low error levels (approximately 0.008–1.24), outperforming the common methods (LZD, NK, POS-GIFT, MSG, and WSSF). These baseline methods frequently exhibit matching failures or substantial error fluctuations in areas with elevation perturbations or extremely low texture density. Notably, WSSF failed to achieve successful matching in Antarctica and Australia. This failure stems from the method’s reliance on high-quality image texture information; under extreme terrain conditions or local deformations, its feature extraction and matching strategy struggles to generate valid corresponding points, leading to global matching failure. Overall, the experiments support the robustness of the proposed method and the benefit of rendering under the tested disturbance conditions.

Furthermore, to improve transparency and reproducibility, Figure 6 presents the elevation differences as histograms overlaid with the fitted Gaussian Mixture Model (GMM) curves, indicating each component, the selected main component, and the thresholds used to select points. Because of the additional deformation applied to the simulated data, the histograms contain pronounced outliers that deviate from 0. Table 4 summarizes, for each experiment, the number of initial candidate points, the number of points retained after GMM filtering, and the number of points ultimately used for evaluation (after the RANSAC algorithm). This clarifies the component selection and filtering strategy and quantifies the impact of GMM selection on the evaluation of each experiment. The low GMM retention rate in the simulation experiments is due to the additional deformation applied to the DEM, which places a large number of the template-matched points inside deformed regions. In some experiments, the number of matching points after GMM filtering equals the number after RANSAC elimination. This indicates that, in these regions, GMM screening based on the statistical distribution of elevation differences had already removed the outlier matches; the remaining point set is consistent under the geometric model constraints, so RANSAC detected no additional outliers. Overall, GMM screening effectively eliminated outlier matches in the simulation experiments, providing a stable set of matching points for subsequent transformation evaluation and supporting the reliability of the method under controlled conditions.
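The geometric stage that follows GMM filtering can be sketched as a basic RANSAC loop around a least-squares affine fit. This is a generic illustration rather than the authors' implementation: the residual definition (Euclidean distance in pixels), the iteration count, and the function names are assumptions; only the threshold of 1.5 comes from the parameter settings.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2-D affine model: dst ~ [src, 1] @ A, with A of shape (3, 2)."""
    design = np.hstack([src, np.ones((len(src), 1))])
    A, *_ = np.linalg.lstsq(design, dst, rcond=None)
    return A

def ransac_affine(src, dst, threshold=1.5, n_iter=200, seed=0):
    """Return (A, inlier_mask) for the affine model with the largest consensus set."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rng = np.random.default_rng(seed)
    design = np.hstack([src, np.ones((len(src), 1))])
    best = np.zeros(len(src), dtype=bool)
    for _ in range(n_iter):
        idx = rng.choice(len(src), size=3, replace=False)  # minimal sample for an affine model
        A = fit_affine(src[idx], dst[idx])
        resid = np.linalg.norm(design @ A - dst, axis=1)
        inliers = resid < threshold
        if inliers.sum() > best.sum():
            best = inliers
    if best.sum() < 3:
        raise ValueError("too few inliers to estimate an affine transform")
    # Refit on the full consensus set for the final estimate.
    return fit_affine(src[best], dst[best]), best

rng = np.random.default_rng(1)
src = rng.uniform(0, 500, size=(40, 2))
dst = src + np.array([10.0, 10.0])           # the simulated 10-pixel translation
dst[:8] += rng.uniform(20, 60, size=(8, 2))  # gross mismatches
A, inliers = ransac_affine(src, dst)
print(inliers.sum(), A[2])
```

When the input set is already free of gross mismatches, every point falls within the threshold and the loop removes nothing, which matches the cases in Table 4 where the point count is unchanged after RANSAC.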

4.3. Results of Actual Experiment

To evaluate the feasibility of the proposed algorithm in practical applications, we first describe the hardware environment and computational costs of the real-data experiments. The hardware configuration included an Intel Core i7-12700K CPU (12 cores, 20 threads, and base frequency of 3.6 GHz), an NVIDIA GeForce RTX 4090 GPU (equipped with 24 GB GDDR6X video memory), and 64 GB of system memory. Under this configuration, we recorded the running time for each algorithm to evaluate the trade-off between registration accuracy and computational cost.

4.3.1. Middle East Urban Experimental Area

Table 5 presents the runtime and error statistics of different registration methods on real experimental data, including median error (Median), root mean square error (RMSE), median absolute deviation (MAD), 68% confidence interval error (LE68), and 95% confidence interval error (LE95). To ensure fairness, all methods were evaluated using the same spatial mask and comparison area.
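For reference, the error statistics named above can be computed from a pair of co-registered DEMs as in the sketch below. The definitions are assumptions where the text does not spell them out: the difference is taken as registered minus reference, Median and MAD use the signed differences, and LE68 and LE95 are taken as the 68th and 95th percentiles of the absolute differences. The function name and the optional `mask` argument are hypothetical.

```python
import numpy as np

def error_stats(h_ref, h_reg, mask=None):
    """Median, RMSE, MAD, LE68, and LE95 of elevation differences (h_reg - h_ref)."""
    diff = np.asarray(h_reg, dtype=float) - np.asarray(h_ref, dtype=float)
    if mask is not None:
        diff = diff[np.asarray(mask, dtype=bool)]  # same spatial mask for every method
    diff = diff[np.isfinite(diff)]                 # drop no-data cells
    abs_diff = np.abs(diff)
    med = np.median(diff)
    return {
        "Median": float(med),
        "RMSE": float(np.sqrt(np.mean(diff ** 2))),
        "MAD": float(np.median(np.abs(diff - med))),
        "LE68": float(np.percentile(abs_diff, 68)),
        "LE95": float(np.percentile(abs_diff, 95)),
    }

stats = error_stats(np.zeros(5), np.array([1.0, -1.0, 2.0, -2.0, 0.0]))
print(stats)
```

Passing the same `mask` for every method is what makes the comparison fair, since all statistics are then evaluated over an identical set of pixels.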

The results reveal the following:

The registration error for raw data (Before) is substantial, with RMSE and LE95 reaching 4.3880 and 9.3984, respectively, indicating significant translation and local deformation in the initial DEM.
Traditional Methods Comparison: LZD and WSSF showed limited error reduction, with RMSE still exceeding 4 and LE95 surpassing 8, indicating unstable performance in areas with local elevation disturbances or low texture. POS-GIFT and NK showed more pronounced error reduction, with NK’s RMSE decreasing to 2.7571 and LE95 to 4.8840, demonstrating superior accuracy and stability. MSG also exhibited low overall error, performing similarly to NK.
The proposed method achieved the best performance across all metrics, with an RMSE of 2.7246 and LE95 of 4.7217, while also maintaining the lowest MAD and LE68 values. This fully demonstrates the proposed method’s superior robustness and matching accuracy in handling both global translation and local deformation. Furthermore, the running time of the proposed method is shorter than that of baseline methods, exhibiting good applicability.

Figure 7 illustrates the elevation error distribution in the Middle East urban experimental area for the various registration methods, including raw data (Figure 7a), traditional methods (Figure 7b–f), and our proposed method (Figure 7g). It is evident that the raw data exhibits significant global translation and local deformation, resulting in concentrated elevation errors with extensive distribution. Among common methods, LZD and WSSF exhibit instability in this region, while POS-GIFT and NK suppress errors, particularly NK, which demonstrates greater accuracy in handling local deformations. MSG approaches NK in overall error control. In contrast, our method achieves effective matching with significantly reduced and uniformly distributed elevation errors. Detailed examination reveals that our method correctly matches the mountainous area in the upper left, whereas other methods exhibit varying degrees of wrinkling. Effective registration improves the accuracy of downstream processes (such as urban change detection [5], terrain change detection [4], and data fusion [53]) and prevents error propagation.

4.3.2. North China Plain Experimental Area

Table 6 summarizes the runtime and error statistics of different registration methods on the experimental data, including the median error (Median), root mean square error (RMSE), median absolute deviation (MAD), 68% confidence interval error (LE68), and 95% confidence interval error (LE95). To ensure fairness, all methods were evaluated using the same spatial mask and comparison area.

The results reveal the following:

The registration error for the raw data (Before) is substantial, with an RMSE of 3.4682 and LE95 of 6.5962, indicating significant global shifts and local elevation discrepancies in the data.
Traditional Methods Comparison: LZD, MSG, and WSSF demonstrate better overall error control, with RMSEs of 2.4193, 2.3816, and 2.4650, respectively. Their LE95 values are also significantly lower than that of the original data, indicating improved registration accuracy. NK and POS-GIFT perform moderately, with slightly higher RMSE. In flat areas, NK’s accuracy is constrained by the lack of usable terrain slope information.
Our method achieved the best performance across all metrics, with an RMSE of 2.3410 and LE95 of 4.3433, while maintaining the lowest MAD and LE68 values. This demonstrates high robustness and accuracy in both global translation correction and local elevation disturbance handling. Furthermore, the proposed method’s runtime is second only to the NK algorithm, while offering a significant improvement in accuracy, showcasing a good balance between runtime and registration accuracy.

Figure 8 displays the elevation error distribution across the North China Plain for different registration methods, including raw data (Figure 8a), traditional methods (Figure 8b–f), and the method proposed in this paper (Figure 8g). It can be observed that the raw data exhibits significant global translation. Among common methods, NK and POS-GIFT exhibit unstable performance in this region. In flat terrain, the performance of the NK algorithm degrades significantly because flat areas lack significant elevation undulations, and local slope and normal information are almost parallel, making it difficult to distinguish the elevation difference signal from noise, hindering effective outlier removal and geometric correction. By comparison, LZD, WSSF, and MSG demonstrate some error suppression. Our method also shows excellent registration capability, validating its advantages in accuracy, robustness, and rendering enhancement effects.

Similarly to the simulation experiments, Figure 9 shows the elevation difference histograms and the fitted Gaussian components, with the thresholds for the selected main component indicated by dashed lines. Table 7 summarizes the number of points at each stage: the initial candidate points obtained by template matching, the points retained after GMM/EM filtering, and the final points used for evaluating the registration transformation. Overall, combining the two methods allows for effective outlier filtering while retaining enough points for reliable registration evaluation.

5. Discussion

5.1. Effect of Physically Consistent Rendering on DEM Registration

We conducted comparative experiments in our simulation studies, evaluating both the proposed method and a method without the rendering module. The results demonstrate that physically consistent terrain rendering plays a crucial role in DEM registration rather than merely serving as an auxiliary visualization tool. Without rendering processing, the method struggles to obtain sufficiently stable matching points and may even fail completely, with registration residuals approaching the theoretical translation value. This phenomenon is particularly pronounced in areas with insufficient texture information. During rendering, the implicit elevation variations within the DEM are explicitly transformed into discernible image texture features, significantly enhancing image matching capabilities. By comparing the registration results obtained using the same method before and after rendering in simulated experiments (see Table 8), the contribution of this workflow to matching stability and registration accuracy was quantitatively validated. This confirms that the rendering step is a critical component for improving DEM registration robustness.

5.2. Parameter Sensitivity Analysis

To systematically evaluate the stability and applicability of the proposed method under different imaging conditions and parameter configurations, we further conducted sensitivity analyses on solar imaging parameters, satellite observation parameters, and template matching scales. Using the controlled variable method, we examined the impact of variations in various parameters on matching accuracy, aiming to reveal the dominant roles of different parameters in image matching tasks and their interrelationships.

To address the issue of empirically selected solar and satellite parameters during the rendering stage, we introduced comparative experiments to quantify their impact on registration accuracy. We selected two typical terrain types—the Antarctic polar region and the European mountain region—from our simulation data for sensitivity analysis. The evaluation metric used was the planar accuracy from our simulation experiments. The specific settings and conclusions are as follows.

To analyze the impact of solar geometric parameters on matching accuracy, we conducted single-parameter scanning experiments on the solar zenith angle and solar azimuth angle under fixed satellite observation parameters.

The specific experimental settings are as follows: The solar zenith angle was fixed at 10°, 40°, and 70°, representing low, medium, and high zenith-angle conditions. Under each fixed solar zenith angle condition, the solar azimuth angle was scanned in steps of 10° within the range of 0–360°. Simultaneously, with the solar azimuth angle fixed at 0°, 90°, 180°, and 270°, the solar altitude angle was scanned in steps of 5° within the range of 0–90°.
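The two single-parameter scans can be enumerated as a list of rendering configurations, which makes the size of the experiment explicit. The sketch below is only illustrative: the dictionary keys are hypothetical, the 360° azimuth is treated as a duplicate of 0° and skipped, and both endpoints of the 0–90° altitude range are included; none of these choices is stated in the text.

```python
from itertools import product

def solar_sweep():
    """Enumerate (zenith, azimuth) pairs for the two solar scans, in degrees."""
    configs = []
    # Scan 1: fixed zenith angle, azimuth scanned in 10-degree steps.
    for zenith, azimuth in product((10, 40, 70), range(0, 360, 10)):
        configs.append({"scan": "azimuth", "zenith": zenith, "azimuth": azimuth})
    # Scan 2: fixed azimuth, altitude scanned in 5-degree steps (zenith = 90 - altitude).
    for azimuth, altitude in product((0, 90, 180, 270), range(0, 91, 5)):
        configs.append({"scan": "altitude", "zenith": 90 - altitude, "azimuth": azimuth})
    return configs

configs = solar_sweep()
print(len(configs))  # 3 * 36 + 4 * 19 = 184 configurations
```

Each configuration would then be passed to the rendering step with the satellite parameters held fixed, so that any change in planar accuracy can be attributed to the solar geometry alone.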

The results are shown in Figure 10. For solar parameters, the solar zenith angle is the main factor affecting registration accuracy. In regions with relatively flat terrain and high surface albedo, such as Antarctica, low to medium solar zenith angles result in a more stable brightness distribution, which is beneficial for repeated detection and matching of feature points. In mountainous and complex terrain regions, the shadow structure produced by medium to high solar zenith angles effectively enhances terrain contour features, providing richer geometric constraints for the matching algorithm, thus exhibiting better matching accuracy. Medium zenith angles show good matching accuracy under both terrain conditions, which is why we chose medium solar zenith angles as empirical parameters. The impact of the solar azimuth angle on registration accuracy is relatively weaker and more localized, mainly leading to a decrease in matching accuracy under high zenith angle configurations.

When analyzing satellite imaging geometry, the controlled variable method is also used. Under the premise of fixed solar parameters, the effects of satellite elevation angle and satellite azimuth angle on matching accuracy are studied separately.

The specific experimental setup was as follows: Satellite elevation angles were fixed at 20°, 60°, and 75°, representing low, medium, and high typical satellite elevation conditions. Under each fixed satellite elevation angle condition, the satellite azimuth angle was scanned in 10° increments within the range of 0–360°. Simultaneously, satellite azimuth angles were fixed at 0°, 90°, 180°, and 270°, and satellite elevation angles were scanned in 5° increments within the range of 0–90°.

The results are shown in Figure 11. We observed that the satellite elevation angle is the dominant geometric parameter affecting image matching accuracy. The experimental results consistently show that as the satellite elevation angle increases, the matching error decreases significantly. When the elevation angle reaches approximately 70° or higher, the RMSE converges rapidly and tends to stabilize, achieving optimal matching performance under near-frontal imaging conditions. In contrast, the influence of the satellite azimuth angle on matching accuracy is relatively weak; its effect is mainly indirectly reflected through its relative relationship with the terrain structure direction. In particular, when the satellite elevation angle is 75°, good registration is achieved, and no single optimal azimuth value is observed across different regions.

To investigate the impact of template size on DEM matching accuracy, we conducted experiments by systematically adjusting the template radius while keeping other conditions constant. The experimental results in Figure 12 show that when the template radius is too small, the matching accuracy is low because the template contains insufficient information, whereas when the template radius is too large, the additional content tends to introduce interference, which also decreases accuracy. Comprehensive analysis of the matching errors under different template radii reveals that a template radius of approximately 100 pixels achieves the best balance between matching accuracy and algorithm performance, which is in good agreement with the conclusions of the previous related literature [51,52].

5.3. Performance Analysis Across Different Terrain Types

The experimental results demonstrate that the registration methods differ markedly in their sensitivity to noise levels, terrain undulations, and texture information. The LZD method is highly sensitive to local anomalies under noisy data conditions, with errors prone to accumulate during iterations, leading to reduced overall registration accuracy. The NK method relies on slope and aspect information to establish matching relationships; however, its registration performance is significantly constrained in flat terrain areas due to insufficient feature discriminability. The WSSF method heavily relies on image texture features and struggles to obtain stable matching points in DEM scenarios with weak textures, resulting in matching failures across multiple experiments. MSG and POS-GIFT exhibit insufficient stability in areas with pronounced local deformation, showing a marked decline in registration accuracy when geometric consistency assumptions are violated. In contrast, the proposed method demonstrates superior robustness and stability across diverse terrain conditions and data quality through its combined strategy of physically consistent rendering and elevation-constrained matching. Dávila-Cisneros et al. [4] pointed out that robust DEM registration under complex terrain conditions will directly improve monitoring results. This echoes the high-precision matching and robust screening strategies emphasized in our approach, further illustrating the potential value of using physically consistent rendering and elevation-constrained matching in complex terrain areas for multi-source DEM fusion and subsequent applications.

5.4. Limitations and Future Work

Although the proposed method demonstrates good stability and accuracy across multiple experiments, certain limitations remain. First, regarding transformation models, the current method is designed under the assumption that inter-DEM discrepancies can be well approximated by rigid or affine transformations. This assumption is valid for most global and regional DEM products, where systematic offsets are primarily caused by sensor geometry, datum inconsistencies, and platform-dependent biases. In cases involving pronounced non-rigid deformations (e.g., glacier flow dynamics, landslides, or tectonically active regions), a single global affine model becomes insufficient for accurately representing spatially varying distortions. Addressing such scenarios would require more flexible deformation models (e.g., piecewise affine, spline-based, or physically constrained deformation fields). Second, the lighting parameters used in the rendering process are currently chosen empirically and are not yet adaptively optimized.

Future work will focus on two directions: first, introducing more flexible transformation models to accommodate complex terrain deformations, especially in areas involving significant non-rigid deformations (such as landslides and glacier flows); second, exploring adaptive estimation strategies for rendering parameters so that rendering-based registration remains effective under extreme conditions, potentially combined with learning-based feature extraction methods, to enhance registration robustness while maintaining physical interpretability.

6. Conclusions

This paper proposes a DEM registration method based on physically consistent rendering and elevation-constrained image matching. By converting DEMs into physically meaningful image representations, this approach effectively enhances feature discernibility and matching stability, alleviating the challenge of DEM registration in areas with low relief and sparse texture. Simultaneously, elevation-based constraint filtering ensures the validity and robustness of matched points, maintaining strong performance even in deformed regions.

Simulation experiments and tests on real DEM data from multiple regions demonstrate that the proposed method outperforms, in some cases significantly, several common registration methods (LZD, NK, POS-GIFT, MSG, and WSSF) across multiple error statistics, including RMSE, MAD, LE68, and LE95. These results indicate that the proposed approach can provide a reliable technical basis for multi-source DEM fusion, elevation change analysis [4,5], and related geoscience studies, while further evaluation in diverse application scenarios is warranted.

Author Contributions

Conceptualization, Y.L.; methodology, Y.L.; software, Y.L.; validation, Y.L.; formal analysis, Y.L.; investigation, Y.L.; resources, Y.L.; data curation, Y.L.; writing—original draft preparation, Y.L.; writing—review and editing, Y.L., N.J. and H.Y.; supervision and project administration, F.W. and H.Y.; funding acquisition, F.W., N.J. and H.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The LiDAR DEM dataset for Australia can be obtained from https://ecat.ga.gov.au/geonetwork/srv/eng/catalog.search#/metadata/89644 (accessed on 1 October 2025). The SRTM DEM dataset for the European mountains can be obtained from https://lpdaac.usgs.gov/products/srtmgl1v003/ (accessed on 1 October 2025). The REMA DEM dataset for Antarctica can be obtained from https://www.pgc.umn.edu/data/rema/ (accessed on 1 October 2025). The Copernicus DEM dataset for the North China Plain can be obtained from https://portal.opentopography.org/datasetMetadata?otCollectionID=OT.032021.4326.1 (accessed on 1 October 2025). The AW3D30 DEM dataset can be obtained from https://www.eorc.jaxa.jp/ALOS/en/index_e.htm (accessed on 1 October 2025). The Gaofen-7 satellite data have not yet been publicly released due to research needs, but they can be provided by the corresponding author upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

DEM	Digital elevation model;
LZD	Least-Z difference;
NK	Nuth & Kääb;
ICP	Iterative closest point;
SIFT	Scale-invariant feature transform;
FFT	Fast Fourier transform;
GMM	Gaussian mixture model;
EM	Expectation maximization;
RMSE	Root mean square error;
SRTM	Shuttle Radar Topography Mission;
RANSAC	Random sample consensus.

References

Wilson, J.P. Environmental Applications of Digital Terrain Modeling; John Wiley & Sons: Hoboken, NJ, USA, 2018.
Li, P.; Shi, C.; Li, Z.; Muller, J.P.; Drummond, J.; Li, X.; Li, T.; Li, Y.; Liu, J. Evaluation of ASTER GDEM using GPS benchmarks and SRTM in China. Int. J. Remote Sens. 2013, 34, 1744–1771.
Dai, X.; Khorram, S. The effects of image misregistration on the accuracy of remotely sensed change detection. IEEE Trans. Geosci. Remote Sens. 1998, 36, 1566–1577.
Dávila-Cisneros, S.; Castañeda-Miranda, A.G.; Bautista-Capetillo, C.F.; Mattos-Villarroel, E.D.; Rodríguez-Abdalá, V.I.; Robles Rovelo, C.O.; Pinedo-Torres, L.A.; Rodríguez-Trejo, A.; Ibarra-Delgado, S. Monitoring Landform Changes in a Mining Area in Mexico Using Geomatic Techniques. Geomatics 2025, 5, 63.
Hou, Z.; Qu, Y.; Zhang, L.; Liu, J.; Wang, F.; Yu, Q.; Zeng, A.; Chen, Z.; Zhao, Y.; Tang, H. War city profiles drawn from satellite images. Nat. Cities 2024, 1, 359–369.
Nuth, C.; Kääb, A. Co-registration and bias corrections of satellite elevation data sets for quantifying glacier thickness change. Cryosphere 2011, 5, 271–290.
Streutker, D.R.; Glenn, N.F.; Shrestha, R. A slope-based method for matching elevation surfaces. Photogramm. Eng. Remote Sens. 2011, 77, 743–750.
Zhang, J.; Huang, S.; Liu, J.; Zhu, X.; Xu, F. Pyrf-pcr: A robust three-stage 3d point cloud registration for outdoor scene. IEEE Trans. Intell. Veh. 2023, 9, 1270–1281.
Zhou, R.; Li, X.; Jiang, W. 3D surface matching by a voxel-based buffer-weighted binary descriptor. IEEE Access 2019, 7, 86635–86650.
Rosenholm, D.; Torlegård, K. Three-dimensional absolute orientation of stereo models using digital elevation models. Photogramm. Eng. Remote Sens. 1988, 54, 1385–1389.
Karras, G.E.; Petsa, E. DEM matching and detection of deformation in close-range photogrammetry without control. Photogramm. Eng. Remote Sens. 1993, 59, 1419–1424.
Pilgrim, L. Robust estimation applied to surface matching. ISPRS J. Photogramm. Remote Sens. 1996, 51, 243–257.
Zhang, T.; Cen, M. Robust DEM co-registration method for terrain changes assessment using least trimmed squares estimator. Adv. Space Res. 2008, 41, 1827–1835.
Li, T.; Hu, Y.; Liu, B.; Jiang, L.; Wang, H.; Shen, X. Co-registration and residual correction of digital elevation models: A comparative study. Cryosphere 2022, 17, 5299–5316.
Besl, P.J.; McKay, N.D. A method for registration of 3-D shapes. IEEE Trans. Pattern Anal. Mach. Intell. 1992, 14, 239–256.
Chen, Y.; Medioni, G. Object modelling by registration of multiple range images. Image Vis. Comput. 1992, 10, 145–155.
Li, Z.; Bethel, J. DEM registration, alignment and evaluation for SAR interferometry. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2008, 37, 11–116.
Han, Z.; Xie, X.; Tang, G.; Li, P.; Li, S. A K-Dimensional Tree–Iterative Closest Point Algorithm for Overbreak and Underbreak Assessment of Mountain Tunnels. Appl. Sci. 2025, 15, 566.
Yang, J.; Wang, C.; Luo, W.; Zhang, Y.; Chang, B.; Wu, M. Research on point cloud registering method of tunneling roadway based on 3D NDT-ICP algorithm. Sensors 2021, 21, 4448.
Ravanbakhsh, M.; Fraser, C. A comparative study of DEM registration approaches. J. Spat. Sci. 2013, 58, 79–89.
Huang, Y.; Hu, Q. Co-registration of multi-temporal DEM based on SIFT algorithm for change detection of glaciers. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2017, 42, 747–752.
Ravanbakhsh, M.; Fraser, C. DEM registration based on mutual information. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2012, 1, 187–191.
Wang, Y.; Li, J.; Duan, P.; Wang, R.; Yu, X. The DEM Registration Method Without Ground Control Points for Landslide Deformation Monitoring. Remote Sens. 2024, 16, 4236.
Li, H.; Deng, Q.; Wang, L. Automatic co-registration of digital elevation models based on centroids of subwatersheds. IEEE Trans. Geosci. Remote Sens. 2017, 55, 6639–6650.
Li, J.; Hu, Q.; Ai, M. RIFT: Multi-modal image matching based on radiation-variation insensitive feature transform. IEEE Trans. Image Process. 2019, 29, 3296–3310.
Jiang, L.; Wang, F.; Zhang, W.; Li, P.; You, H.; Xiang, Y. Rethinking the Key Factors for the Generalization of Remote Sensing Stereo Matching Networks. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2025, 18, 4936–4948.
Yao, Y.; Zhang, Y.; Wan, Y.; Liu, X.; Yan, X.; Li, J. Multi-modal remote sensing image matching considering co-occurrence filter. IEEE Trans. Image Process. 2022, 31, 2584–2597.
Xiang, Y.; Jiang, L.; Wang, F.; You, H.; Qiu, X.; Fu, K. Detector-Free Feature Matching for Optical and SAR Images Based on a Two-Step Strategy. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5214216.
Zhang, Y.; Yao, Y.; Wan, Y.; Liu, W.; Yang, W.; Zheng, Z.; Xiao, R. Histogram of the orientation of the weighted phase descriptor for multi-modal remote sensing image matching. ISPRS J. Photogramm. Remote Sens. 2023, 196, 1–15.
Wang, X.; Jiang, L.; Xiang, Y.; Jiao, N.; Yang, W.; Wang, F.; You, H. Enhancing Photogrammetric DSM Based on Multiscale and Domain-Invariant Semantic Feature Learning. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2025, 18, 28677–28694.
Ye, Y.; Tang, T.; Zhu, B.; Yang, C.; Li, B.; Hao, S. A multiscale framework with unsupervised learning for remote sensing image registration. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5622215.
Xiong, X.; Jin, G.; Xu, Q.; Zhang, H. Self-similarity features for multimodal remote sensing image matching. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 12440–12454.
Xiang, Y.; Wang, X.; Wang, F.; You, H.; Qiu, X.; Fu, K. A global-to-local algorithm for high-resolution optical and SAR image registration. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5215320.
Liu, X.; Teng, X.; Bian, Y.; Li, Z.; Yu, Q. Shape-Adaptive Modality Independent Region Descriptor for Multimodal Remote Sensing Image Matching. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 18139–18155.
Xiang, Y.; Jiao, N.; Liu, R.; Wang, F.; You, H.; Qiu, X.; Fu, K. A geometry-aware registration algorithm for multiview high-resolution SAR images. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5234818.
Xiang, Y.; Peng, L.; Wang, F.; Qiu, X. Fast registration of multiview slant-range SAR images. IEEE Geosci. Remote Sens. Lett. 2021, 19, 4007505.
Lindenberger, P.; Sarlin, P.E.; Pollefeys, M. Lightglue: Local feature matching at light speed. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 1–6 October 2023; pp. 17627–17638. [Google Scholar]
Quan, D.; Wei, H.; Wang, S.; Li, Y.; Chanussot, J.; Guo, Y.; Hou, B.; Jiao, L. Efficient and robust: A cross-modal registration deep wavelet learning method for remote sensing images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 4739–4754. [Google Scholar] [CrossRef]
Zhao, X.; Wu, Y.; Hu, X.; Li, Z.; Li, M. A novel dual-branch global and local feature extraction network for SAR and optical image registration. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 17637–17650. [Google Scholar] [CrossRef]
Jing, W.; Chi, K.; Li, Q.; Wang, Q. ChangeRD: A registration-integrated change detection framework for unaligned remote sensing images. ISPRS J. Photogramm. Remote Sens. 2025, 220, 64–74. [Google Scholar] [CrossRef]
Kang, Q.; Zhang, J.; Huang, G.; Liu, F. Robust optical and SAR image matching via attention-guided structural encoding and confidence-aware filtering. Remote Sens. 2025, 17, 2501. [Google Scholar] [CrossRef]
Wan, L.; Xiang, Y.; Kang, W.; Ma, L. A Self-Supervised Learning Pretraining Framework for Remote Sensing Image Change Detection. IEEE Trans. Geosci. Remote Sens. 2025, 63, 5630116. [Google Scholar] [CrossRef]
Chen, H.W.; Cheng, K.S. A Conceptual model of surface reflectance estimation for satellite remote sensing images using in situ reference data. Remote Sens. 2012, 4, 934–949. [Google Scholar] [CrossRef]
Yuan, W.; Tong, X.; Xiao, B. SGM-Based Disparity Estimation Under Radiometric Variations. In Image and Graphics Technologies and Applications, Proceedings of the 14th Conference on Image and Graphics Technologies and Applications, IGTA 2019, Beijing, China, 19–20 April 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 382–391. [Google Scholar] [CrossRef]
Harris, C.; Stephens, M. A combined corner and edge detector. In Proceedings of the Alvey Vision Conference, Manchester, UK, 31 August–2 September 1988; pp. 147–151.
Noble, J.A. Finding corners. Image Vis. Comput. 1988, 6, 121–128.
Foroosh, H.; Zerubia, J.; Berthod, M. Extension of phase correlation to subpixel registration. IEEE Trans. Image Process. 2002, 11, 188–200.
Hou, Z.; Liu, Y.; Zhang, L. POS-GIFT: A geometric and intensity-invariant feature transformation for multimodal images. Inf. Fusion 2024, 102, 102027.
Zheng, C.; Li, S.; Wang, C.; Zhang, B. MSG: Robust Multimodal Remote Sensing Image Matching Using Side Window Gaussian Space. IEEE Trans. Geosci. Remote Sens. 2025, 63, 4706223.
Wan, G.; Ye, Z.; Xu, Y.; Huang, R.; Zhou, Y.; Xie, H.; Tong, X. Multimodal Remote Sensing Image Matching Based on Weighted Structure Saliency Feature. IEEE Trans. Geosci. Remote Sens. 2024, 62, 4700816.
Ye, Y.; Bruzzone, L.; Shan, J.; Bovolo, F.; Zhu, Q. Fast and Robust Matching for Multimodal Remote Sensing Image Registration. IEEE Trans. Geosci. Remote Sens. 2019, 57, 9059–9070.
Xiang, Y.; Jiao, N.; Wang, F.; You, H. A robust two-stage registration algorithm for large optical and SAR images. IEEE Trans. Geosci. Remote Sens. 2021, 60, 5218615.
Okolie, C.J.; Smit, J.L. A systematic review and meta-analysis of Digital elevation model (DEM) fusion: Pre-processing, methods and applications. ISPRS J. Photogramm. Remote Sens. 2022, 188, 1–29.

Figure 1. Middle East urban experimental area.

Figure 2. Experimental area in the North China Plain.

Figure 3. The workflow of the proposed DEM registration method.

Figure 4. Elevation difference histogram between corresponding points.

Figure 5. Schematic diagram of DEM simulation data generation.

Figure 6. Elevation difference histograms with GMM fit for simulated experiments (Europe, Antarctica, and Australia). The Gaussian curves represent the fitted GMM components, and the vertical lines indicate the threshold for selecting points. Each subfigure shows the distribution of elevation differences and highlights the main GMM component used for point filtering.

Figure 7. High-elevation error maps of different registration methods. The color scale indicates elevation error in meters. Each panel shows the corresponding registration error distribution.

Figure 8. High-elevation error maps of different registration methods. The color scale indicates elevation error in meters. Each panel shows the corresponding registration error distribution.

Figure 9. GMM filtering results for real data experiments. The histograms show elevation differences of matched points, and the colored curves represent the GMM-fitted components. Red and blue dashed lines indicate the thresholds for the selected main component.

Figure 10. The relationship between matching error and solar zenith angle and azimuth angle in Antarctica and European mountain regions. The horizontal axis represents the corresponding angle, and the vertical axis represents the matching accuracy (unit: pixels).

Figure 11. The relationship between matching errors and the satellite elevation angle and azimuth angle in Antarctica and European mountain regions. The horizontal axis represents the corresponding angle, and the vertical axis represents the matching accuracy (unit: pixels).

Figure 12. Sensitivity analysis of matching accuracy as a function of template size for eight experimental regions. Each subfigure shows RMSE (in pixels) corresponding to different template sizes for the simulation area.

Table 1. Summary of DEM datasets used in simulation experiment.

Region	Area	Type	Resolution	Size
Australia	Darwin	LiDAR-derived DEM	5 m	$1464 \times 1422$
Australia	Christmas Island	LiDAR-derived DEM	5 m	$636 \times 678$
Antarctica	Area 1	REMA DEM	2 m	$1464 \times 1422$
Antarctica	Area 2	REMA DEM	2 m	$1464 \times 1422$
Antarctica	Area 3	REMA DEM	2 m	$1464 \times 1422$
Antarctica	Area 4	REMA DEM	2 m	$1464 \times 1422$
Europe	Area 1	SRTM DEM	90 m	$1686 \times 1798$
Europe	Area 2	SRTM DEM	90 m	$1686 \times 1798$

Table 2. Key parameters used in the proposed method.

Parameter	Description	Value
$σ$	Gradient Gaussian scale	1.5
$σ_{D}$	Gradient kernel std	$0.7 σ$
$σ_{I}$	Auto-correlation smoothing std	1.5
$ε$	Small constant to prevent division by zero	$10^{- 6}$
Template size	Side length of template window	$201 \times 201$ pixels
$σ_{w}$	Gaussian weighting std for template	50 pixels
Solar azimuth	Solar azimuth angle for DEM rendering	$270 °$
Solar zenith	Solar zenith angle for DEM rendering	$55.56 °$
Satellite azimuth	Satellite viewing azimuth	$121.40 °$
Satellite elevation	Satellite viewing elevation	$75.67 °$
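
The parameters in Table 2 can be gathered into a single configuration object. The following is a minimal sketch, not the authors' implementation: the class name `RegistrationParams`, its field names, and the helper `gaussian_template_weights` are hypothetical, and the isotropic Gaussian form of the template weighting is an assumption inferred only from the description of $σ_{w}$.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RegistrationParams:
    """Values copied from Table 2; the field names are illustrative."""
    sigma: float = 1.5                # gradient Gaussian scale
    sigma_i: float = 1.5              # auto-correlation smoothing std
    eps: float = 1e-6                 # guards against division by zero
    template_size: int = 201          # template side length, pixels
    sigma_w: float = 50.0             # template Gaussian weighting std, pixels
    solar_azimuth_deg: float = 270.0
    solar_zenith_deg: float = 55.56
    sat_azimuth_deg: float = 121.40
    sat_elevation_deg: float = 75.67

    @property
    def sigma_d(self) -> float:
        # Table 2 defines the gradient kernel std as 0.7 * sigma.
        return 0.7 * self.sigma


def gaussian_template_weights(size, sigma_w):
    """Assumed weighting: isotropic Gaussian centred on the template."""
    c = (size - 1) / 2.0
    return [
        [math.exp(-((r - c) ** 2 + (col - c) ** 2) / (2.0 * sigma_w ** 2))
         for col in range(size)]
        for r in range(size)
    ]


params = RegistrationParams()
weights = gaussian_template_weights(params.template_size, params.sigma_w)
```

Under this assumed form, a $201 \times 201$ template with $σ_{w} = 50$ gives the corner pixels a weight of $e^{-4} \approx 0.018$ relative to the centre, so the window edges contribute little to the match score.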

Table 3. Matching errors (in pixels) of different registration methods on simulated DEM datasets. The symbol “–” indicates matching failure, and “*” denotes extreme matching results.

Area	LZD	NK	POS-GIFT	MSG	WSSF	Our Method
Darwin	*	0.0031	14.1481	–	–	0.4828
Christmas Island	8.3340	1.0744	–	14.1421	–	1.2363
Antarctica 1	9.9854	8.5800	0.2770	4.2796	–	0.0491
Antarctica 2	*	*	0.0296	–	–	0.0207
Antarctica 3	*	*	0.3033	–	–	0.1321
Antarctica 4	9.1847	*	0.2161	13.2876	–	0.0079
Europe 1	0.0219	0.0046	0.0626	0.1604	0.2308	0.0003
Europe 2	0.0798	0.0025	0.0841	0.3672	0.4598	0.0012

Table 4. Simulation experiments: comparison of initial candidate points, GMM-selected points, and final points after RANSAC. Initial points are obtained by template matching, GMM-selected points are obtained after EM/GMM filtering, and final points are used for evaluating each transformation.

Area	Initial Points	GMM Selected Points	Final Points
Europe data1	4898	2536	2536
Europe data2	4558	2391	2391
Antarctica data1	3705	2021	1225
Antarctica data2	3513	1651	1529
Antarctica data3	517	268	245
Antarctica data4	939	486	414
Darwin	2338	966	862
Christmas Island	1742	325	325
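
The GMM selection stage summarized in Table 4 can be illustrated with a short sketch. This is not the authors' code: it assumes a two-component one-dimensional Gaussian mixture fitted by EM to the elevation differences of matched points, takes the component with the largest weight as the main component, and keeps points within two standard deviations of its mean. The component count, the threshold rule, the function names, and the synthetic data are all assumptions made for illustration.

```python
import math
import random
import statistics


def fit_inlier_outlier_gmm(x, iters=100):
    """EM for a 2-component 1-D Gaussian mixture (narrow + broad start)."""
    med = statistics.median(x)
    robust_sd = 1.4826 * statistics.median([abs(v - med) for v in x]) + 1e-6
    mu = [med, statistics.fmean(x)]
    var = [robust_sd ** 2, statistics.pvariance(x) + 1e-9]
    w = [0.5, 0.5]
    n = len(x)
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for v in x:
            p = [w[j] * math.exp(-0.5 * (v - mu[j]) ** 2 / var[j])
                 / math.sqrt(2.0 * math.pi * var[j]) for j in range(2)]
            s = sum(p) or 1e-300
            resp.append((p[0] / s, p[1] / s))
        # M-step: update weights, means, and variances.
        for j in range(2):
            nj = sum(r[j] for r in resp) + 1e-12
            w[j] = nj / n
            mu[j] = sum(r[j] * v for r, v in zip(resp, x)) / nj
            var[j] = sum(r[j] * (v - mu[j]) ** 2
                         for r, v in zip(resp, x)) / nj + 1e-9
    return w, mu, var


def select_main_component(dz, n_sigma=2.0):
    """Keep indices whose elevation difference lies near the main component."""
    w, mu, var = fit_inlier_outlier_gmm(dz)
    main = max(range(2), key=lambda j: w[j])   # assumed: largest weight wins
    half_width = n_sigma * math.sqrt(var[main])
    lo, hi = mu[main] - half_width, mu[main] + half_width
    keep = [i for i, v in enumerate(dz) if lo <= v <= hi]
    return keep, (lo, hi)


# Synthetic example: 800 consistent matches plus 200 gross errors (metres).
rng = random.Random(0)
dz = ([rng.gauss(1.5, 0.5) for _ in range(800)]
      + [rng.uniform(-30.0, 30.0) for _ in range(200)])
keep, (lo, hi) = select_main_component(dz)
print(f"kept {len(keep)} of {len(dz)} points in [{lo:.2f}, {hi:.2f}] m")
```

In this sketch the surviving indices would then be passed to a geometric consistency check such as RANSAC, which corresponds to the "Final Points" column.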

Table 5. Comparison of different DEM registration methods on experimental dataset 1. All statistical indicators are in meters.

Method	Median	RMSE	MAD	LE68	LE95	Time (s)
Before	−0.4854	4.3880	2.9652	3.0812	9.3984	-
LZD	−0.1488	4.1393	2.8724	2.9403	8.7381	258.30
NK	−0.0030	2.7571	2.2502	2.2311	4.8840	64.46
POS-GIFT	−0.2285	3.4188	2.5852	2.6105	6.8608	1356.98
MSG	−0.2098	2.8100	2.3180	2.3112	5.0320	953.48
WSSF	−0.1432	4.3269	2.9489	3.0337	9.2293	203.48
Our Method	−0.2126	2.7246	2.2472	2.2442	4.7217	62.21

Table 6. Comparison of different DEM registration methods on experimental dataset 2. All statistical indicators are in meters.

Method	Median	RMSE	MAD	LE68	LE95	Time (s)
Before	1.8426	3.4682	2.4702	3.2605	6.5962	-
LZD	0.1430	2.4193	1.9141	1.9353	4.5943	22.48
NK	0.0082	2.8673	2.3153	2.3764	5.5260	6.01
POS-GIFT	0.1132	2.8399	2.3260	2.3897	5.5405	295.81
MSG	0.0991	2.3816	1.8984	1.9120	4.5058	146.68
WSSF	0.0891	2.4650	1.9459	1.9647	4.6890	26.57
Our Method	0.1109	2.3410	1.8287	1.8396	4.3433	9.07
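
The indicators reported in Tables 5 and 6 can be computed from a list of per-pixel elevation differences, as in the sketch below. The definitions are assumptions, since they are not restated here: MAD is taken as the mean absolute error (the median absolute deviation is another common reading), and LE68 and LE95 are taken as the 68th and 95th percentiles of the absolute error. The function names are hypothetical.

```python
import math
import statistics


def percentile(sorted_vals, q):
    """Linear-interpolation percentile of pre-sorted values, q in [0, 100]."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def dem_error_stats(dz):
    """Summary statistics of elevation differences dz (metres)."""
    abs_sorted = sorted(abs(v) for v in dz)
    return {
        "median": statistics.median(dz),
        "rmse": math.sqrt(sum(v * v for v in dz) / len(dz)),
        "mad": sum(abs_sorted) / len(abs_sorted),  # assumed: mean absolute error
        "le68": percentile(abs_sorted, 68.0),      # assumed: 68th pct of |dz|
        "le95": percentile(abs_sorted, 95.0),      # assumed: 95th pct of |dz|
    }


stats = dem_error_stats([-2.0, -1.0, 0.0, 1.0, 2.0])
```

Applying the same function to the differences before and after registration gives rows comparable in form to "Before" and "Our Method", provided the definitions match those used in the paper.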

Table 7. Actual experiments: comparison of initial candidate points, GMM-selected points, and final points after RANSAC. Initial points are obtained by template matching, GMM-selected points are obtained after EM/GMM filtering, and final points are used for evaluating each transformation.

Area	Initial Points	GMM Selected Points	Final Points
North China Plain	4157	3900	3857
Middle East Urban Area	4493	3326	2715

Table 8. Comparison of planar matching accuracy (in pixels) before and after image rendering across six regions.

Method	Darwin	Christmas Island	Antarctica 1	Antarctica 2	Antarctica 3	Antarctica 4
Without Rendering	14.1489	13.9753	14.1239	14.1243	14.1408	14.1511
Our Method	0.4828	1.2363	0.0491	0.0207	0.1321	0.0079

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Li, Y.; Jiao, N.; Wang, F.; You, H. A Robust DEM Registration Method via Physically Consistent Image Rendering. Appl. Sci. 2026, 16, 1238. https://doi.org/10.3390/app16031238
