Article

Fast Conversion Algorithm of DSM Image Elevation Datum Based on MPI Parallel Technology

1 School of Geomatics, Liaoning Technical University, Fuxin 123000, China
2 Heilongjiang Longmei Geological Exploration Co., Ltd., Jiamusi 154002, China
* Author to whom correspondence should be addressed.
Electronics 2026, 15(10), 2127; https://doi.org/10.3390/electronics15102127
Submission received: 8 April 2026 / Revised: 12 May 2026 / Accepted: 13 May 2026 / Published: 15 May 2026
(This article belongs to the Special Issue Advanced Information Systems: Data-Driven and Geospatial Approaches)

Abstract

The elevation datum is a critical element in surveying and mapping, as variations in elevation systems can lead to discrepancies between Digital Surface Model (DSM) products generated from satellite imagery. To eliminate these differences and ensure high-precision data consistency, this study constructs an elevation datum conversion scheme for multi-source DSM products using the SGG-UGM-2 (2190 degree) global gravity field model to calculate elevation anomalies. While traditional serial algorithms suffer from significantly decreased efficiency as the volume of DSM image files increases, this paper proposes a novel HDC-MPI elevation datum conversion algorithm based on Message Passing Interface (MPI) parallel technology. By leveraging distributed memory parallel computing, the processing task is partitioned into multiple sub-tasks, substantially enhancing overall throughput. Experimental results demonstrate that: (1) the HDC-MPI algorithm improves conversion efficiency by approximately 8 times compared to the serial approach when processing 12 image scenes; (2) the algorithm’s efficiency is primarily governed by image memory usage rather than terrain complexity; and (3) the conversion accuracy of the HDC-MPI algorithm remains fully consistent with serial results, ensuring the reliability of the elevation datum transformation.

1. Introduction

With the continuing maturation of satellite remote sensing, airborne LiDAR, and aerial photogrammetry, Digital Elevation Model (DEM) and Digital Surface Model (DSM) products with higher precision and wider application ranges are available to users. However, because application scenarios and regional surveying and mapping standards differ, the elevation datum (or vertical datum) used in the final delivery or fusion of DEM/DSM products often differs as well. Converting the elevation datum of these images is therefore particularly important.
Ramdani et al. identified the necessity of local refinement for global models in regions with complex topography by evaluating the performance of various high-degree models in Indonesia [1]. This further underscores that when processing vertical datum transformations for high-resolution DSM data, the integration of ultra-high-degree models (such as SGG-UGM-2) with efficient parallel interpolation algorithms is of paramount importance for enhancing the efficiency and accuracy of regional height datum unification. Wang Yunpeng, Liu Xiaogang, and colleagues proposed improved iterative methods based on local and global integrals for constructing ultra-high-degree global gravity field models [2,3]. Using the EGM2008 and EIGEN-6C4 models as foundational inputs, they developed the DQM2022 ultra-high-degree gravity field model. Experimental results show that, compared with the original models over China, the DQM2022 model improves gravity field accuracy by 2.4–2.8 mGal and enhances elevation anomaly accuracy by 1.0–2.4 cm. The SGG-UGM-2 model, released by Liang et al. [4], integrates satellite altimetry with terrestrial gravity data and extends the spherical harmonic expansion to degree and order 2190. This advancement provides a critical mathematical and physical foundation for the unification of vertical datums at both global and regional scales.
Although ultra-high-order gravity field models offer a physical basis for elevation conversion and block processing techniques for large-scale remote sensing imagery continue to advance, current algorithms still encounter the following structural technical bottlenecks in handling vertical datum transformation of vast amounts of high-resolution DSM data:
  • Inefficiency and exponential time growth in serial processing: During the process of image elevation datum transformation (particularly in batch processing), traditional serial algorithms exhibit extremely low efficiency. With the escalation of image resolution and data volume, computation time increases exponentially, rendering them entirely incapable of satisfying the real-time or quasi-real-time processing demands of large-scale tasks [5].
  • Spatial Continuity and Interpolation Bottleneck in Block-wise Processing: Constrained by computational resources, large-volume remote sensing images are frequently partitioned into smaller blocks for processing. Although Yadav et al. [6] introduced a seamless stitching framework grounded in geometric correction and edge fusion to tackle the multi-scene seam problem, the utilization of ultra-high-order models for elevation transformation is hindered by the absence of an efficient parallel interpolation algorithm for collaborative processing. Consequently, this imposes substantial constraints on the overall computational efficiency while maintaining spatial and numerical continuity within the sub-blocks.
With the rapid advancement and democratization of high-performance computing (HPC) technologies, Message Passing Interface (MPI) parallelization has been extensively adopted across diverse domains, including climate modeling, artificial intelligence, physical simulation, and big data analytics. In the field of geomatics and remote sensing, MPI has demonstrated significant efficacy in gravity field recovery, satellite data processing, and large-scale imagery analysis. In research on parallelizing high-resolution raster data, the core challenge lies in implementing effective domain decomposition tailored to geospatial characteristics. The foundational task partitioning principles proposed by Wang and Armstrong provided a critical theoretical framework for subsequent MPI-based parallel geoprocessing architectures [7]. Cui et al. [8] utilized MPI technology to perform an efficiency analysis of spherical harmonic synthesis for the EGM2008 model at degree and order 2160. Their study implemented three distinct experimental schemes, scaling from 1 to 180 processes, and the maximum speedup reached 22.14. Addressing the limitations in the rapid tiling of large-scale raster imagery, Liu et al. proposed an MPI-based parallel algorithm named ParaTile [9]. Experimental results demonstrated that ParaTile exhibited significant advantages over traditional serial methods in both processing throughput and system stability across various data scales. However, a parallel computing framework specifically optimized for integrating ultra-high-order gravity field models with large-scale image elevation datum conversion is still lacking.
Addressing the limitations of existing algorithms in computational efficiency and processing scale, this study systematically integrates MPI technology into the domain of image elevation datum transformation, with the objective of developing a high-performance parallel transformation algorithm named HDC-MPI. The specific goals and anticipated performance advantages are as follows:
  • Overcoming computational bottlenecks and achieving substantial parallel acceleration: By optimizing both the domain decomposition strategy and the MPI parallel architecture, the proposed approach aims to fully mitigate the inefficiencies inherent in traditional serial batch processing. For large-scale datasets, the algorithm is expected to significantly reduce the exponentially increasing processing time, effectively doubling computational throughput and parallel acceleration ratios.
  • Ensuring data quality and spatial consistency: By combining ultra-high-order geodetic models (e.g., SGG-UGM-2) with efficient parallel interpolation schemes, the method enhances the efficiency of benchmark unification while rigorously preserving the spatial consistency and numerical continuity of the final output, even when processing massive imagery in parallel blocks.
  • Enhancing scalability and operational stability: The feasibility of applying MPI technology to image elevation benchmark conversion is systematically evaluated, demonstrating that the proposed algorithm can operate reliably on high-performance computing (HPC) clusters. This provides a robust solution to the challenge of reconciling the ever-increasing volume of remote sensing data with limited computational resources.

2. Theory and Method

2.1. The Principle of Image Elevation Datum Conversion

If the geodetic height of a point is known, the other heights follow from the elevation relationship formulas:
$$H = H_p + N$$
$$H = H_n + \zeta$$
In these equations, $H$ is the geodetic height, $H_p$ is the orthometric height, $H_n$ is the normal height, $N$ is the geoid height (geoid undulation), and $\zeta$ is the elevation anomaly. To show the relationship between the geodetic height, normal height, and orthometric height more intuitively, the elevation systems and their relationships are illustrated in Figure 1.
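As a minimal numeric illustration of the two relations, the sketch below assumes that a DSM pixel stores a geodetic height and that the elevation anomaly (or geoid height) at that pixel is already known. The function names and the sample values are hypothetical and are not taken from the experiments.

```python
def geodetic_to_normal(h_geodetic, zeta):
    """Normal height H_n = H - zeta, rearranged from H = H_n + zeta."""
    return h_geodetic - zeta


def geodetic_to_orthometric(h_geodetic, geoid_height):
    """Orthometric height H_p = H - N, rearranged from H = H_p + N."""
    return h_geodetic - geoid_height


# Hypothetical pixel: geodetic height 152.40 m, elevation anomaly -8.25 m.
h_pixel = 152.40
zeta_pixel = -8.25
print(f"normal height: {geodetic_to_normal(h_pixel, zeta_pixel):.2f} m")
```

A negative elevation anomaly therefore raises the normal height relative to the geodetic height, and a positive one lowers it.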
The calculation of elevation anomalies using a geoid model fundamentally relies on mathematically modeling the Earth’s gravity field, transforming gravity signals into the geometric undulations of the geoid, and subsequently establishing the conversion relationship between geodetic height and normal height. In this study, the geoid model is fitted based on the SGG-UGM-2 gravity field model. Compared with the widely adopted EGM2008 (complete to degree and order 2159, with additional coefficients up to degree 2190), SGG-UGM-2 substantially improves the reliability of short-wavelength components by integrating the latest GOCE satellite gravity gradient data, denser ground gravity observations, and refined terrain reduction techniques. This enhanced capability to capture high-frequency signals matters in two ways for the precise conversion of DSMs:
  • Reduction of omission errors: High-resolution DSMs contain abundant short-wavelength terrain variations. If the spectral resolution of the reference gravity field model is insufficient, a substantial portion of short-wavelength signals in elevation anomalies will be lost, resulting in omission errors and inducing systematic spatial aliasing in the datum transformation.
  • Improvement of local fitting numerical stability: The superior fidelity of SGG-UGM-2 in the short-wavelength band ensures that calculated elevation anomalies correlate more strongly with actual topographic relief within local regions. Consequently, during the subsequent local quadric fitting process, residuals between model predictions and observed values are smaller and more evenly distributed. Physically, this enhances the convergence of the least squares estimation, reduces the sensitivity of the fitted surface to anomalies, and guarantees that even in regions of complex topography, the elevation datum transformation preserves high spatial consistency and accuracy.
The global geoid can be fitted based on the disturbing potential. The fully normalized spherical harmonic function expression of the global potential coefficient model released in recent years is generally:
$$T(r, \theta, \lambda) = \frac{GM}{r} \sum_{n=2}^{\infty} \left(\frac{a}{r}\right)^{n} \sum_{m=0}^{n} \left(\bar{C}_{nm} \cos m\lambda + \bar{S}_{nm} \sin m\lambda\right) \bar{P}_{nm}(\cos\theta)$$
where $r$ is the distance from the calculated point to the center of the ellipsoid; $\lambda$ is the longitude of the calculated point; $\theta = \pi/2 - \varphi$ is the geocentric colatitude; $a$ is the semi-major axis of the ellipsoid; $GM$ is the geocentric gravitational constant; $\bar{C}_{nm}$, $\bar{S}_{nm}$ are the fully normalized Stokes coefficients (potential coefficients); $\bar{P}_{nm}$ is the fully normalized associated Legendre function; $n$ denotes the degree; and $m$ is the order in the spherical harmonic expansion. According to the Bruns formula, the geoid height or elevation anomaly can be expressed as [10]:
$$N = \frac{T}{\gamma}$$
where $\gamma$ is the normal gravity value. Substituting the disturbing potential into the Bruns formula, the spherical harmonic expansion of the elevation anomaly at any point can be obtained:
$$N = \frac{GM}{r\gamma} \sum_{n=2}^{\infty} \left(\frac{a}{r}\right)^{n} \sum_{m=0}^{n} \left(\bar{C}_{nm} \cos m\lambda + \bar{S}_{nm} \sin m\lambda\right) \bar{P}_{nm}(\cos\theta)$$
Based on the interpolation principle of the quadratic surface moving fitting method, an elevation datum conversion model can be constructed as follows:
  • According to the principle of interpolation, the plane coordinates ( X p , Y p ) of the point P to be determined are solved.
  • The point $P$ is taken as the center and $R$ as the search radius (Figure 2). Since the quadratic surface has 6 coefficients, at least 6 data points must be selected ($n \ge 6$). A point $i(X_i, Y_i)$ is selected if its distance from $P$, $d_i = \sqrt{(X_i - X_p)^2 + (Y_i - Y_p)^2}$, is less than $R$. If too few points are selected, $R$ is increased until $n \ge 6$.
  • After selecting a sufficient number of sampling points, the coordinate system is adjusted for computational convenience by translating the origin to the location of the point to be determined. Using the identified sampling points in combination with the quadratic surface fitting equation, the fitted surface can be expressed as:
    $$\zeta(x, y) = a_0 + a_1 x + a_2 y + a_3 x^2 + a_4 y^2 + a_5 x y$$
    In the formula, $[a_0, a_1, a_2, a_3, a_4, a_5]^T$ are the fitting coefficients for the undetermined point $P(X_p, Y_p)$. Compared to linear interpolation, the quadric surface can better capture the fine gravity field signals of the SGG-UGM-2 model in the medium and short wavelength bands, effectively reducing the omission error in the datum transformation. In the implementation, besides requiring a sufficient number of sampling points, $m > 6$ (to ensure redundant observations), the algorithm introduces a multi-quadrant spatial distribution check (as shown in Figure 2). The program divides the search area into four quadrants and requires sampling points in each quadrant. This constraint ensures that the unknown point $P$ remains within the geometric envelope of the control points, mitigating the risk of rank deficiency and leverage effects when solving the normal equations, and thereby ensuring the geometric robustness of the surface fitting.
    If there are $m$ control points in the measurement area, the error can be calculated using Formula (6):
    $$V = [v_1, v_2, \ldots, v_m]^T$$
    $$X = \begin{bmatrix} 1 & x_1 & y_1 & x_1^2 & y_1^2 & x_1 y_1 \\ 1 & x_2 & y_2 & x_2^2 & y_2^2 & x_2 y_2 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 1 & x_m & y_m & x_m^2 & y_m^2 & x_m y_m \end{bmatrix}$$
    $$A = [a_0, a_1, a_2, a_3, a_4, a_5]^T$$
    $$L = [\zeta_1, \zeta_2, \ldots, \zeta_m]^T$$
    $$V = XA - L$$
    In the formula, $V$ is the residual vector, $A$ is the fitting coefficient vector, $L$ is the observation vector of the known points, and $X$ is the $m \times 6$ design matrix, where each row holds the coordinates of one data point mapped to the terms of the quadratic polynomial.
  • In the calculation, the weight of each sampling point is very important. Here, the matrix $P$ holds the weights of the data points. Note that these weights do not directly reflect the observation accuracy of the neighboring points found by the search; instead, each weight is determined by the distance $d_i$ between the data point and the point to be determined. The commonly used weight forms are:
    $$P_i = \frac{1}{d_i^2}$$
    $$P_i = \left(\frac{R - d_i}{d_i}\right)^2$$
    where $R$ is the search radius used for point selection and $d_i$ is the distance from the point $P$ to the data point. In this paper, the point $P(X_p, Y_p)$ to be solved is set as the origin of the local coordinate system. For the $i$-th data point $i(X_i, Y_i)$ within the search radius $R$, the relative distance $d_i$ is defined as:
    $$d_i = \sqrt{(X_i - X_p)^2 + (Y_i - Y_p)^2}$$
    This study also conducted a parameter perturbation test. Because the quasi-geoid is an extremely smooth surface, the experiments show that when the search radius $R$ varies between 10 km and 30 km, or the exponent of the distance weight is switched between 1 and 2, the resulting change in the elevation anomalies stays at the millimeter level. This indicates that the algorithm is robust to parameter settings and that the conversion accuracy does not depend on specific parameter tuning.
  • Finally, according to the principle of least squares, the quadratic surface coefficients are obtained as:
    $$A = (X^T P X)^{-1} X^T P L$$
    and the value at the interpolation point is obtained. Because the origin has been translated to $P$, this value is the constant term $a_0$.
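The fitting steps above can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not the implementation used in the study: the function name `fit_anomaly_at`, the radius growth factor of 1.5, and the zero-distance guard are hypothetical choices, the weight form is the inverse-distance-squared option, and the four-quadrant distribution check is omitted for brevity.

```python
import numpy as np


def fit_anomaly_at(p_xy, pts_xy, zeta, radius, min_pts=6):
    """Interpolate the elevation anomaly at P by moving quadratic-surface fitting."""
    p_xy = np.asarray(p_xy, dtype=float)
    pts_xy = np.asarray(pts_xy, dtype=float)
    zeta = np.asarray(zeta, dtype=float)
    if radius <= 0 or len(zeta) < min_pts:
        raise ValueError("need a positive radius and at least min_pts data points")

    # Translate the origin to P, so the fitted value at P is simply a_0.
    rel = pts_xy - p_xy
    d = np.hypot(rel[:, 0], rel[:, 1])

    # Enlarge the search radius until enough points fall inside it
    # (the factor 1.5 is an arbitrary choice for this sketch).
    r = float(radius)
    while np.count_nonzero(d < r) < min_pts:
        r *= 1.5
    sel = d < r
    x, y, obs, d = rel[sel, 0], rel[sel, 1], zeta[sel], d[sel]

    # Design matrix rows: [1, x, y, x^2, y^2, xy].
    X = np.column_stack([np.ones_like(x), x, y, x**2, y**2, x * y])

    # Weights P_i = 1 / d_i^2, guarded against a data point coinciding with P.
    w = 1.0 / np.maximum(d, 1e-9) ** 2

    # Weighted least squares: A = (X^T P X)^{-1} X^T P L.
    XtP = X.T * w
    A = np.linalg.solve(XtP @ X, XtP @ obs)
    return A[0]


# Usage with synthetic control points on a 7 x 7 grid (made-up values).
grid = np.array([(i, j) for i in range(-3, 4) for j in range(-3, 4)], dtype=float)
anomalies = 1.0 + 0.1 * grid[:, 0] - 0.2 * grid[:, 1] + 0.01 * grid[:, 0] ** 2
print(fit_anomaly_at((0.3, -0.4), grid, anomalies, radius=2.5))
```

Solving the normal equations with `np.linalg.solve` avoids forming the explicit inverse; when the control points are themselves samples of a quadratic surface, the fit reproduces that surface at P up to rounding error.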

2.2. MPI Parallel Conversion Algorithm Design

This study proposes the HDC (High-Performance Benchmark Conversion)–MPI algorithm, a fast conversion method based on MPI parallel technology, designed to improve the efficiency of batch image elevation benchmark conversions and provide a scalable solution for large-scale image processing. The HDC-MPI (High Definition Datum Conversion–MPI) algorithm operates on a distributed memory model and leverages the Message Passing Interface (MPI) to enable highly concurrent processing of DSM images. Unlike shared-memory parallelization approaches such as OpenMP, MPI allows the algorithm to overcome the memory limitations of a single physical node, facilitating the processing of large-scale DSM datasets that exceed the capacity of individual machines. Given the substantial volume of high-resolution DSM imagery, which can be processed independently, MPI-level process parallelism inherently accelerates the computation, providing natural scalability and efficiency advantages for the HDC-MPI algorithm.
Traditional single-threaded methods often fail to complete such tasks within a reasonable time frame when dealing with large-scale image datasets. In this context, the application of MPI parallel technology significantly improves conversion efficiency, particularly when processing large volumes of multi-image data.
MPI parallel technology allows programmers to parallelize operations at every stage of the program, enabling efficient data exchange and task collaboration in a multi-process environment. In the case of elevation datum conversion for DSM images, each image can be processed independently, and this inherent parallelism makes MPI an ideal solution. By leveraging MPI fast conversion, the algorithm efficiently utilizes the parallel capabilities of modern computing systems, significantly accelerating the conversion process.

2.2.1. Communication Model Design

The Collective Communication model allows all processes within a communicator to participate in data transmission and coordination simultaneously, serving as a core paradigm for efficient data exchange in parallel computing. Compared with traditional point-to-point communication, collective communication leverages highly optimized collective operations to synchronize tasks among multiple processes, significantly reducing communication latency in large-scale parallel environments [11].
In the proposed algorithm, the standard MPI_Bcast function is employed to distribute the global gravity field model parameters and control information from the root process to all participating processes. Within this parallel architecture, the root process handles global data initialization and resource management, while the worker processes execute the specific DSM elevation transformation tasks in parallel.
By invoking MPI_Bcast, the root process delivers data to the entire process group through a single collective call, utilizing the tree-based topology optimized within the MPI implementation. This approach circumvents the $O(N)$ linear time complexity associated with sequential point-to-point transmissions. It should be noted that the correctness of this mechanism relies on a strictly consistent invocation sequence across all processes in the communicator. By ensuring consistent buffer management and communication ordering, MPI_Bcast provides implicit synchronization, thereby reducing the risk of logical errors that typically arise from manually managing complex point-to-point communication handles. During the data-loading phase of global gravity field models and large-scale DSM imagery, this mechanism ensures high consistency of data states throughout the parallel framework.

2.2.2. Load Balancing Design

Load balancing is commonly classified into two categories: static load balancing and dynamic load balancing. Static load balancing achieves a uniform distribution of workload by pre-dividing tasks and assigning them to computing processes prior to program execution. This approach is characterized by its implementation simplicity and minimal communication overhead, making it particularly well-suited for scenarios in which the task scale is fixed and the computational intensity is predictable [12].
In the context of processing large volumes of standardized remote sensing products—such as DSM tiles with consistent framing and uniform pixel sizes—the computational load associated with each scene is highly uniform. Under these conditions, static allocation can provide substantial parallel speedup while minimizing MPI communication overhead, thereby achieving efficient utilization of computational resources.
Given the variations in DSM imagery specifications across different task scenarios—such as the number of pixel rows/columns and file sizes for a single scene—as well as the total volume of images to be processed, the algorithm must exhibit high flexibility. To ensure that the HDC-MPI algorithm can adapt to production tasks of various scales, this study employs a static load balancing strategy based on data decomposition.
In a serial algorithm, images are typically processed in batches using a for loop, where images are read sequentially and then processed for elevation datum conversion. To improve efficiency, this paper proposes replacing the for loop of the serial algorithm with an MPI parallel algorithm. However, unlike the serial algorithm, which processes images sequentially without considering the overall task load, the parallel algorithm must account for the load balancing problem—specifically, how to distribute images effectively across different processes.
To address this, a modified MPI parallel algorithm is designed, suitable for handling the image processing tasks in parallel. This algorithm ensures that tasks are evenly distributed, taking into consideration the number and size of images. The algorithm can be expressed in the following formula:
Let the total number of image tasks to be processed be $N_{tasks}$ and the total number of MPI processes launched be $N_{procs}$. Define the rank of the process currently executing the task as $id$ (where $0 \le id < N_{procs}$). First, calculate the quotient $q$ and the remainder $r$ of the task division, and then the start and end indices of the block assigned to process $id$:
$$q = \left\lfloor \frac{N_{tasks}}{N_{procs}} \right\rfloor, \quad r = N_{tasks} \bmod N_{procs}$$
$$\mathrm{start}(id) = \begin{cases} id \cdot (q + 1), & r > 0 \text{ and } id < r \\ r \cdot (q + 1) + (id - r) \cdot q, & \text{otherwise} \end{cases}$$
$$\mathrm{end}(id) = \begin{cases} \mathrm{start}(id) + (q + 1), & r > 0 \text{ and } id < r \\ \mathrm{start}(id) + q, & \text{otherwise} \end{cases}$$
In the formula, $q$ represents the quotient of the total image task volume divided by the total number of processes, $r$ is the remainder, and $id$ is the rank of the currently executing process. Process IDs are 0-based, and process $id$ handles the tasks with indices from $\mathrm{start}(id)$ up to $\mathrm{end}(id) - 1$.
Two scenarios are considered for static load allocation based on the relationship between the total number of image tasks and the total number of processes.
Scenario 1: The total task volume is not an exact multiple of the number of processes. For example, when processing 10 image scenes using 4 processes, we have $q = 2$ and $r = 2$. For the first two processes, the conditions $r > 0$ and $id < r$ are satisfied, so each is assigned $q + 1 = 3$ tasks. For the remaining processes ($id = 2$ and $id = 3$), these conditions are not satisfied, so each is assigned $q = 2$ tasks. The resulting load distribution queue is $[3, 3, 2, 2]$.
Scenario 2: The total task volume is exactly divisible by the number of processes. For instance, when processing 10 image scenes using 5 processes, $q = 2$ and $r = 0$. Since no process satisfies both $r > 0$ and $id < r$, every process takes the "otherwise" branch of the allocation formulas. Consequently, each process is evenly assigned 2 tasks, resulting in a final load distribution queue of $[2, 2, 2, 2, 2]$.
This static allocation strategy ensures balanced workload distribution across processes, minimizing idle time and MPI communication overhead, even when the total task volume is not evenly divisible among the available processes. Figure 3 shows the image task allocation flowchart of HDC-MPI algorithm.
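The allocation formulas and the two scenarios can be checked with a few lines of Python. The function name `task_range` is a hypothetical label for this sketch; the returned end index is exclusive, matching the block sizes of q + 1 and q described above.

```python
def task_range(rank, n_tasks, n_procs):
    """Return the half-open task index range [start, end) for one process."""
    q, r = divmod(n_tasks, n_procs)
    if r > 0 and rank < r:
        # The first r processes each take one extra task.
        start = rank * (q + 1)
        end = start + (q + 1)
    else:
        # Remaining processes start after the r larger blocks.
        start = r * (q + 1) + (rank - r) * q
        end = start + q
    return start, end


for n_procs in (4, 5):
    ranges = [task_range(rank, 10, n_procs) for rank in range(n_procs)]
    loads = [end - start for start, end in ranges]
    print(n_procs, loads)
# prints:
# 4 [3, 3, 2, 2]
# 5 [2, 2, 2, 2, 2]
```

Because every process evaluates the same closed-form expression from its own rank, no messages need to be exchanged to agree on the partition.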

2.2.3. Core Algorithm Code

The specific implementation of the static task allocation strategy is detailed in Algorithm 1. The pseudo-code in Algorithm 1 illustrates the global image scanning and static task allocation process during the data preprocessing stage of the HDC-MPI algorithm. This process is structured into three core phases, designed to maximize parallel efficiency while minimizing idle time between processes:
Input and Broadcast Mechanism (Phase 1): To mitigate I/O bottlenecks and resource contention caused by multiple processes simultaneously accessing the underlying file system in large-scale parallel environments, the algorithm employs a single-point input strategy. Specifically, only the root process ($id = 0$) is responsible for receiving or reading the initial image file path, Dir_Path. This path is then broadcast to all worker processes within the MPI communicator using MPI’s collective communication mechanism (Broadcast). This ensures that all computing nodes hold a consistent data context and reduces redundant I/O operations.
Global File Analysis (Phase 2): Once the unified path is obtained, each process performs directory traversal in parallel, automatically filtering out invalid entries such as hidden or system files. During this step, a complete queue of pending image files, File_List, is dynamically constructed. The system also determines the total number of tasks, $M$, corresponding to the valid images that require elevation datum conversion.
Static Load Balancing and Task Partitioning (Phase 3): This phase is critical for achieving a high parallel speedup ratio. To map the $M$ processing tasks efficiently onto $N_{procs}$ MPI processes, the algorithm implements a refined static data decomposition strategy, ensuring that workload is balanced across processes while minimizing inter-process communication overhead.
This structured preprocessing workflow enables the HDC-MPI algorithm to handle large-scale imagery efficiently, providing a foundation for high-throughput parallel elevation datum conversion.
Algorithm 1: MPI-based Global Image Scanning and Static Task Allocation
[Pseudo-code listing for Algorithm 1 appears as an image in the original article.]
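The three phases described for Algorithm 1 can be sketched in Python as follows. This is a hedged illustration rather than the authors' code: it assumes the `mpi4py` package is available (falling back to a single process otherwise), assumes the DSM images are GeoTIFF files with a `.tif` or `.tiff` extension, and uses the hypothetical function name `scan_and_allocate`.

```python
import os

try:
    from mpi4py import MPI  # assumption: mpi4py is installed

    comm = MPI.COMM_WORLD
    rank, n_procs = comm.Get_rank(), comm.Get_size()
except ImportError:
    comm, rank, n_procs = None, 0, 1  # serial fallback for illustration


def scan_and_allocate(dir_path=None, exts=(".tif", ".tiff")):
    """Return the list of image files assigned to the calling process."""
    # Phase 1: only the root's path counts; it is broadcast to every process.
    if comm is not None:
        dir_path = comm.bcast(dir_path if rank == 0 else None, root=0)

    # Phase 2: every process builds the same sorted list of valid files,
    # skipping hidden entries and files with other extensions.
    file_list = sorted(
        name
        for name in os.listdir(dir_path)
        if not name.startswith(".") and name.lower().endswith(exts)
    )

    # Phase 3: static block partition; the first r ranks get one extra file.
    # rank * q + min(rank, r) is the same start index as the piecewise
    # start(id) formula, written without the branch.
    q, r = divmod(len(file_list), n_procs)
    start = rank * q + min(rank, r)
    end = start + q + (1 if rank < r else 0)
    return file_list[start:end]
```

Sorting the directory listing matters here: `os.listdir` returns entries in arbitrary order, and the partition is only consistent if all processes index the same ordered list. Under these assumptions the function would typically be called once per process in a script launched with something like `mpiexec -n 4 python script.py`.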

2.3. HDC-MPI Parallel Conversion Workflow

The HDC-MPI algorithm implements a high-performance parallel elevation datum conversion workflow for large-scale DSM images. Its core components include:
Global Image Scanning and Task Allocation: The root process scans the input directory, filters out invalid files, and broadcasts a list of valid DSM images to all worker processes. A static load balancing strategy is used to allocate tasks, which takes into account the total number of images and the number of processes (Formulas (16) and (17)). Each process receives a contiguous block of images, ensuring balanced workload distribution and minimal inter-process communication overhead.
Model Solving and Quadratic Surface Interpolation in Parallel Subtasks: In the independent parallel loop of each process, the algorithm first calculates the elevation anomalies of control points based on the SGG-UGM-2 global gravity field model (degree 2190). Subsequently, for each pixel in the DSM image, elevation anomaly interpolation is performed using a quadratic surface model fitted to adjacent control points.
I/O and Memory Optimization: To avoid file system contention, the initial global model parameters are read once by the root process and broadcast to the other processes. In the image processing stage, each process independently reads the image blocks it is responsible for based on the allocated file handle. By managing matrix operations through local memory pools and limiting redundant data copies, the algorithm maintains high throughput when processing high-resolution and large-volume DSM images.
Synchronization and Result Collection: The algorithm utilizes MPI’s collective communication operations to ensure that all worker processes synchronize their states during the computation cycle. To avoid the network communication and memory bottlenecks that would result from returning massive image data to the root process, each worker process independently writes its processed DSM image blocks to the designated storage location after completing the datum conversion, ensuring that the final dataset remains spatially and numerically consistent.

2.4. Technical Route

This study addresses the prominent issue of low conversion efficiency during the elevation datum transformation of batch DSM images. To solve this problem, an efficient batch DSM image elevation datum conversion method based on MPI technology, named HDC-MPI, is proposed. The primary objective is to significantly enhance the throughput and efficiency of batch processing. The overall workflow diagram is shown in Figure 4. The specific research contents are organized as follows:
(1) Study of fast batch image elevation datum conversion algorithm
A fast HDC-MPI image elevation datum conversion algorithm based on MPI is designed, focusing on the following aspects:
  • Analyze whether MPI parallel technology can improve the efficiency of batch image elevation datum conversion.
  • Investigate the impact of image quantity and image size on memory usage, and assess whether these factors influence MPI execution efficiency.
  • Analyze the effect of having identical versus different image sizes within an image set on conversion efficiency.
(2) Study on the impact of terrain on elevation datum conversion
Four terrain types are selected, including plain, hilly, mountainous, and high-mountain regions. For each terrain type, two groups of test examples with different image sizes are designed. The differences between these groups mainly lie in pixel count and memory consumption. The speedup ratio is adopted as the evaluation metric to analyze the influence of terrain factors on the performance of the MPI-based fast conversion algorithm.
(3) Study on the correctness of image conversion results
ArcGIS 10.8 software is employed to verify the correctness of the parallel conversion results:
  • Using the profile analysis function, the elevation results produced by the serial and parallel algorithms are compared along identical profiles to verify the correctness of the parallel conversion.
  • Contour lines are generated from both serial and parallel results, and contour maps are compared to further assess the accuracy and consistency of the parallel conversion results.

3. Example Analysis

In this study, the SGG-UGM-2 (2190 degree) gravity field model is utilized. The quasi-geoid grid data is computed using the ICGEM [13] platform’s data server interface, where the online tool calculates the geoid grid with a resolution of 5′ × 5′. This model does not directly merge global terrestrial gravity measurements. Instead, it is constructed by integrating GOCE satellite gravity gradients (SGG) with high-low satellite-to-satellite tracking (SST-HL) observations, the ITSG-Grace2018 normal equation system, oceanic gravity anomalies derived from satellite altimetry, and continental gravity data used in EGM2008 (Figure 5).
In terms of accuracy, it is comparable to internationally recognized top-level models such as EGM2008 and XGM2016. External validation against China’s independent ground gravity dataset has confirmed that the model offers very high accuracy and applicability within China. Table 1 summarizes the specific parameters generated in this study, and Figure 6 shows the distribution process of the gravity field model.
Based on the high-precision Digital Surface Model (DSM) data provided by the ZY-3 satellite, this study designs the HDC-MPI experiment from three key aspects: time and acceleration ratio analysis, terrain classification, and accuracy testing. The objective is to analyze the efficiency improvements achieved through the application of parallel algorithms in terrain data processing. The experiment was conducted in a single-node, multi-core environment, as shown in Table 2.
Limitation Statement: Although MPI was primarily designed for multi-node clusters, testing in a single-node, multi-core environment helps to verify the correctness of the algorithm’s logic and its process-level parallel efficiency. In addition, the HDC-MPI algorithm is not specifically designed for personal computers; rather, it is a general parallel algorithm developed for a distributed memory architecture. Due to limitations in laboratory computing resources, the current study validates the algorithm in a single-node, multi-core environment.
To demonstrate the optimization effect of the HDC-MPI fast conversion algorithm on the computational process more intuitively, the speedup ratio $S_p$ is introduced as a key performance metric:
$$S_p = \frac{T_1}{T_p}$$
In the formula, $p$ ($p \geq 1$) denotes the number of processes; $T_1$ is the serial execution time; and $T_p$ is the execution time when using $p$ processes.
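As a minimal sketch of how this metric is evaluated from measured run times, the following Python snippet computes the speedup ratio for several process counts. The timing values are hypothetical placeholders chosen for illustration and are not measurements from this study.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Return S_p = T_1 / T_p for one process count."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("execution times must be positive")
    return t_serial / t_parallel


# Hypothetical wall-clock times in seconds, keyed by process count p.
timings = {1: 60.0, 2: 31.0, 4: 16.0, 12: 8.0}

speedups = {p: speedup(timings[1], t) for p, t in timings.items()}
for p, s in sorted(speedups.items()):
    print(f"p={p:2d}  S_p={s:.3f}")
```

With these placeholder values, the 12-process run gives a speedup of 7.5, which is of the same order as the ratios reported in the experiments below.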

3.1. Time and Acceleration Ratio Analysis Experiment

In this experiment, six distinct groups of test cases are established. Each group utilizes DSM images with a resolution of 2118 × 2411 pixels. The specific details of each test case group are provided in Table 3. The primary objective is to maintain a consistent CPU utilization rate across all groups, with the only variable being the number of images in each group. This setup is designed to analyze the performance variations of the algorithm under different task loads, thereby facilitating an understanding of the relationship between task volume and algorithm performance.
Table 4 presents the execution times of the serial algorithm (run with a single process, here and in subsequent experiments) and the HDC-MPI algorithm for Example 1 through Example 6. According to the data, with 12 processes the conversion is approximately seven times faster than with the serial algorithm. This indicates that the parallel algorithm significantly enhances conversion efficiency and reduces computation time, achieving the intended goal of time optimization.
From Figure 7, it can be seen that, in Examples 1 through 6, the operational efficiency of each example group improves as the number of processes increases. The conversion efficiency of the HDC-MPI algorithm peaks when 12 processes are employed: compared to the single-process scenario, efficiency improves by a factor of roughly 6.5 to 7.5 across the six examples. These results demonstrate that the parallel algorithm significantly enhances operational efficiency, confirming its effectiveness in optimizing performance.
However, as observed from the data in Table 5 and Figure 8, the total task volume across Examples 1 to 6 varies, and consequently, the acceleration ratios for each example differ when compared to the serial algorithm using 12 processes. When employing 12 processes, the acceleration ratios for each example are 6.677, 7.360, 7.235, 7.502, 6.531, and 7.285, respectively. The acceleration ratios for the six examples range from 6.5 to 7.5, reflecting variability due to the differences in task volume across the examples.
Notably, the total task volumes for Example 2, Example 4, and Example 6 are exact multiples of the maximum number of processes, and these examples reach an average acceleration ratio of 7.382. In contrast, the average acceleration ratio for Example 1, Example 3, and Example 5 is 6.814. From this, it can be concluded that the parallel algorithm achieves higher conversion efficiency when the total task volume is a multiple of the maximum number of processes.
In summary, the parallel algorithm effectively enhances conversion efficiency, with the most significant improvements observed when the total task volume is a multiple of the maximum number of processes. This indicates that carefully selecting the total task volume can optimize computational performance and further improve the processing efficiency of the parallel algorithm.
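One way to see why multiples of the process count are favourable is to bound the speedup of a static, whole-image allocation. The sketch below assumes that every image takes the same time to convert and ignores I/O and communication; the function name and the image counts are hypothetical and are not taken from Table 3.

```python
import math


def ideal_static_speedup(n_images: int, n_procs: int) -> float:
    """Upper bound on speedup when whole images are assigned statically.

    Assumes equal cost per image and no I/O or communication overhead,
    so the busiest process handles ceil(n_images / n_procs) images.
    """
    if n_images < 1 or n_procs < 1:
        raise ValueError("counts must be positive")
    rounds = math.ceil(n_images / n_procs)
    return n_images / rounds


for n in (12, 18, 24, 30):  # hypothetical image counts
    print(n, ideal_static_speedup(n, 12))
```

Under these assumptions, 12 or 24 images on 12 processes give a bound of 12, whereas 18 and 30 images give bounds of only 9 and 10, because some processes are idle during the last round. Similarly, a 12-image set has the same bound of 6 for any process count from 6 to 11, since the busiest process still handles two images.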

3.2. Parallel Performance Analysis Under Different Data Loads

To further investigate the influence of data scale on the computational efficiency of the HDC-MPI algorithm, comparative experiments were conducted across four representative terrain types: plain, hill, mountain, and high mountain. For each terrain category, two test cases were selected, each comprising 12 scenes of imagery. To realistically reflect the heterogeneity inherent in multi-source surveying and mapping datasets, the test cases vary substantially in image resolution and storage size (memory footprint).
The primary objective of this experiment is threefold: (1) to assess the decisive impact of data volume on algorithmic conversion efficiency using these heterogeneous datasets spanning diverse geographical features; (2) to verify the computational stability of the HDC-MPI algorithm under complex terrain conditions; and (3) to explore the dynamic relationship between the number of MPI processes and the achieved parallel acceleration ratio under varying data loads.
Table 6 summarizes the test cases used for elevation datum conversion with the HDC-MPI algorithm across the four terrain types. It lists the image dimensions (rows and columns) and the storage size for each test case. This characterization enables a rigorous analysis of how data scale and terrain complexity jointly influence parallel processing efficiency.
Table 7 presents the elevation datum conversion times of the HDC-MPI algorithm for the different terrain examples, using 1 to 12 processes. The data cover the four terrain types (plain, hilly, mountainous, and high-mountain), with two examples per terrain type. The table illustrates the relationship between the number of processes and the conversion time for each terrain type.
Figure 9 displays a line chart of the elevation datum conversion times for the different terrain examples. With a single process, conversion times vary significantly across the terrains because the topographic images differ in size from one example to another. As the number of processes increases, the conversion times for all terrain types gradually decrease and the differences between the terrains become less pronounced, which highlights the improvement in conversion efficiency.
Table 8 compares the elevation datum conversion acceleration ratios of the HDC-MPI algorithm for the same terrain examples under 1 to 12 processes, showing how the algorithm’s efficiency improves with the number of processes under different terrain conditions.

3.2.1. Performance Under Plain Terrain Conditions

In this experiment, two test cases were established, each containing 12 images of plain terrain. To ensure the stability of the experiment, the images within each case were configured with identical resolution and spatial size, so that the CPU usage of each image remained consistent (see Table 6 for details). Table 7 and Figure 9 present the conversion time for the plain terrain examples under varying numbers of processes. The results show that, compared to the serial algorithm, the parallel algorithm improves conversion efficiency by approximately 8 times when multiple processes are used. Specifically, for Plain Example 1, the conversion time decreases markedly as the number of processes increases, reaching 7.520 s with 12 processes and exhibiting a sub-linear speedup. Plain Example 2 follows a similar trend, with the conversion time reduced to 1.157 s.
To visually demonstrate the conversion efficiency of the HDC-MPI algorithm, Table 8 and Figure 10 present the acceleration ratios for the two examples. Figure 10 shows that the speedup increases as the number of processes grows. However, with 12 processes the speedup for Plain Example 2 is significantly higher than for Plain Example 1. The reason is that the images of Plain 1 are larger (23 MB vs. 8 MB) and contain more grid points than those of Plain 2, which indicates that computational intensity and memory load affect parallel performance asymmetrically.
The speedup increases overall from 2 to 12 processes, but the growth is not linear: between 5 and 11 processes it fluctuates noticeably. The primary reason is that the 12 images cannot be divided evenly among most of these process counts, so the task allocation is unbalanced and some processes sit idle while others convert a second image. This lowers resource utilization and, together with inter-process communication costs, reduces overall efficiency.

3.2.2. Performance Under Hilly, Mountainous, and Alpine Terrain Conditions

To verify the universality of the aforementioned rule across a broader range of data dimensions and thoroughly eliminate the interference of geospatial features on computational performance, we conducted extended experiments on cross-scale heterogeneous datasets encompassing various complex terrain variations. Similar to the plain terrain experiment, two sets of examples were set up in each terrain category. Each set contains 12 images, where the resolution and spatial size of the images are consistent within the same example, but may vary between different examples. Table 7 and Figure 9 present the conversion time for the three terrain types.
The results demonstrate that the conversion efficiency of the HDC-MPI algorithm for the three terrain types is approximately 6 to 8 times higher than the serial algorithm. Notably, when 12 processes are used, the conversion time is significantly reduced, showing a trend similar to that observed in the plain terrain experiment.
To further confirm the efficiency of the HDC-MPI algorithm, Table 8 and Figure 10 display the speedup for the six sets of examples across the three terrain types. The acceleration ratios for the Hill 1 through High mountain 2 examples at 12 processes are 7.262, 7.725, 7.084, 6.387, 6.417, and 6.817, respectively. The experimental results show that conversion efficiency is driven by the scale of the image data, memory bandwidth, and computational density, and is largely independent of terrain type. Although image size and memory usage are strongly correlated, performance does not depend linearly on any single memory variable: when dealing with large-scale DSM images, data access patterns and cache hit rates also have a decisive impact on execution efficiency. As the number of processes increases, the execution time decreases significantly. Although the speedup grows sub-linearly due to communication overhead and synchronization bottlenecks, the HDC-MPI algorithm scales well in a multi-core parallel environment.
In summary, to improve conversion efficiency in parallel computing, the number of processes should be set as close as possible to the total number of tasks, so as to optimize task allocation, reduce inter-process communication overhead, and improve resource utilization. In addition, during batch image conversion, attention should be paid to the available memory of the computer, so that memory exhaustion does not degrade the efficiency of the HDC-MPI algorithm.

3.3. Serial and Parallel Consistency Verification Experiments

To verify the correctness of the parallel conversion in the HDC-MPI algorithm, this section introduces a dedicated set of experiments designed for systematic verification. Unlike the previous experiments, the images in these examples differ in resolution and spatial size from those used earlier. These variations are intended to ensure a comprehensive evaluation of the algorithm’s performance across different types of image data.
The details of the experimental setup, including the specific characteristics of the images (such as resolution and space size), are provided in Table 9.
Table 10 summarizes the conversion times and corresponding speedup ratios achieved by the serial algorithm and the HDC-MPI algorithm in this example. When executed with 12 MPI processes, the HDC-MPI implementation achieves a speedup of approximately 6.0× relative to the serial baseline, a substantial improvement in computational efficiency. Conversion times decrease consistently as the process count increases: 9.992 s (2 processes), 4.062 s (3 processes), 3.405 s (4 processes), 3.169 s (6 processes), and 2.970 s (12 processes). The reduction in execution time is non-linear across the tested configurations: most of the gain is obtained between 2 and 6 processes, with diminishing returns beyond 6 processes.
To rigorously verify the correctness of the image elevation conversion results, two testing methods are employed: Profile Analysis and Contour Line Analysis. These methods ensure that the parallel processing algorithm’s output is consistent with the expected results and are as follows:
  • Profile Analysis Method: The profile analysis function in ArcGIS Pro is used to randomly select a profile across the elevation map. The elevation profiles of the converted results from both the serial and parallel algorithms are compared under the same profile. The data for the profiles is exported as a CSV file, from which 30 random points are selected for a direct comparison of the elevation values. This method allows for a detailed point-by-point analysis of the conversion accuracy.
  • Contour Analysis Method: The contour extraction tool in ArcGIS Pro is used to generate contour lines from the elevation data of the serial and parallel results. Two images are randomly chosen from the serial and parallel results, and the contour spacing is set to 10 m for both. By comparing the contour lines of the two sets of images, we can assess whether there are any significant discrepancies in the terrain features, thereby evaluating the consistency and accuracy of the elevation conversion.
As shown in Figure 11, the conversion results for Image No. 2 and Image No. 12 are compared between the serial algorithm and the HDC-MPI algorithm:
  • Figure 11a,b: These show the elevation curves of the two images. The serial and HDC-MPI algorithms produce the same elevation profiles along the compared sections of the terrain, which indicates that the parallel algorithm does not introduce discrepancies in the vertical data.
  • Figure 11c,d: These are contour line comparisons of the two images. The black double-dotted line represents the contour lines generated by the HDC-MPI algorithm, while the solid color line shows the contour lines from the serial algorithm. The contour lines are nearly identical between the two methods, further confirming the accuracy of the HDC-MPI algorithm.
Based on these comparisons, it can be concluded that the HDC-MPI algorithm ensures the correctness of the conversion results, matching the serial method without introducing significant deviations in the elevation data or the contour mapping.
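The point-by-point profile comparison can also be scripted. The following is a minimal sketch, assuming the serial and parallel profiles have already been read from the exported CSV files into two equal-length lists; the function name and the synthetic profile values are illustrative and are not data from this study.

```python
import random


def max_abs_difference(serial, parallel, n_samples=30, seed=0):
    """Largest absolute elevation difference at randomly sampled positions."""
    if len(serial) != len(parallel) or not serial:
        raise ValueError("profiles must be non-empty and of equal length")
    rng = random.Random(seed)  # fixed seed keeps the check reproducible
    k = min(n_samples, len(serial))
    indices = rng.sample(range(len(serial)), k)
    return max(abs(serial[i] - parallel[i]) for i in indices)


# Synthetic profiles standing in for the exported CSV columns.
serial_profile = [100.0 + 0.25 * i for i in range(200)]
parallel_profile = list(serial_profile)

print(max_abs_difference(serial_profile, parallel_profile))
```

A result of zero at all sampled points corresponds to the identical profiles described above; any non-zero value would point to a discrepancy worth inspecting.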

3.4. Analysis of Parallel Performance Bottlenecks and Robustness in Production Environments

3.4.1. Scalability and Performance Bottleneck Analysis

After confirming the strict mathematical consistency between the serial and HDC-MPI parallel algorithms in Section 3.3, this section further investigates the computational scalability and system-level bottlenecks of the algorithm when handling large datasets.
The parallel acceleration ratio observed in this study—up to approximately 8× compared to serial algorithms—along with related load optimization conclusions (e.g., achieving optimal efficiency when the task volume is an integer multiple of the number of processes) are primarily applicable to the single-node, multi-core shared memory architecture employed in the experiments. Under this configuration, MPI inter-process communication relies predominantly on the underlying shared memory and high-speed bus, resulting in extremely low communication latency and thereby maximizing the parallel computing advantages of the algorithm.
It should be noted, however, that these results cannot be directly extrapolated to cross-node distributed cluster environments, where inter-node communication latency and network overhead may significantly affect scalability and overall parallel performance. Consequently, further evaluation on distributed HPC clusters is necessary to comprehensively assess the algorithm’s scalability and robustness in real-world production scenarios.
If the HDC-MPI algorithm is deployed on a multi-node high-performance computing (HPC) cluster, its performance is expected to face significant challenges in the following two areas:
  • Increased Cross-Node Communication Overhead: Network communication between nodes inherently exhibits higher latency and lower bandwidth compared to shared memory within a single machine. While the current strategy of uniformly broadcasting model parameters from the root process remains effective in the present experimental setup, future implementations involving dynamic load balancing or the aggregation of massive elevation data across nodes may encounter substantial network overhead. Such overhead could become a critical bottleneck, potentially limiting the overall parallel efficiency of the algorithm.
  • Concurrent I/O Bottlenecks in Distributed Environments: The current I/O approach has not been validated under real distributed storage systems, such as parallel file systems. In a multi-node cluster, when hundreds or thousands of processes simultaneously access large volumes of DSM image files on shared storage, lock contention and I/O blocking in the underlying file system are highly probable. Therefore, prior to deployment in large-scale distributed production environments, it is necessary to adopt advanced distributed parallel I/O strategies—for instance, MPI-IO—to mitigate storage pressure arising from massive concurrent read/write operations.
Addressing these challenges is essential for ensuring that HDC-MPI can achieve robust scalability and high efficiency in practical, industrial-scale distributed computing environments.
Based on the current experimental results, we further investigated whether continuously increasing the number of MPI processes in a single node environment would lead to further speed improvement, or cause system overload and performance degradation. In Case 1, 24 DEM images with a resolution of 5590 × 5915 were used. In Case 2, 24 DEM images with a resolution of 3601 × 3601 were used.
Table 11 presents the serial and parallel execution times, speedup ratios, and memory usage for Cases 1 and 2. From the table, it can be observed that the HDC-MPI algorithm significantly improves processing speed. In particular, when the number of processes is small, the speedup approaches linear scaling. However, as the number of processes increases further, the speedup exhibits a diminishing growth trend, while memory usage continues to increase.
Analysis of the data shows that the algorithm does not exhibit super-linear speedup (i.e., speedup greater than the number of processes), but instead demonstrates typical sub-linear speedup behavior. The primary reasons for the decrease in parallel efficiency with increasing core count (i.e., the non-linear growth of speedup) are analyzed as follows:
  • Memory Bandwidth Bottleneck: This experiment is based on a single-node environment, where all MPI processes share the same physical memory channel. When 12 processes simultaneously read image data and perform a large number of double-precision floating-point operations, the memory bus bandwidth quickly becomes saturated, leading to a significant increase in memory access latency for each process. This is the core physical bottleneck limiting the performance of high-concurrency computations on a single node.
  • Communication and Synchronization Overhead: Although MPI_Bcast optimized the initial data distribution, during the task loop, each process independently performs I/O operations. As the number of processes increases, file system concurrent access locks (lock contention) become a potential bottleneck.
  • Computation-to-Communication Ratio: This algorithm is computation-intensive (pixel-by-pixel interpolation), where the computational overhead far outweighs the communication overhead. However, in small data examples (e.g., Hilly 2, only 2.05 MB), the MPI initialization and process management overhead becomes relatively more significant, leading to less noticeable speedup compared to larger data examples.
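The sub-linear behaviour can also be expressed through the parallel efficiency $E_p$, a standard derived metric that is introduced here only for illustration. Applying it to the lowest and highest 12-process speedups quoted in Sections 3.1 and 3.2.2 (6.387 and 7.725) gives:

$$E_p = \frac{S_p}{p}, \qquad E_{12}^{\min} = \frac{6.387}{12} \approx 0.53, \qquad E_{12}^{\max} = \frac{7.725}{12} \approx 0.64$$

In other words, on these examples each of the 12 processes contributes roughly 53% to 64% of the throughput of the single serial process.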

3.4.2. Discussion on Load Localization and Dynamic Scheduling Mechanism

Although previous experiments have demonstrated that near-optimal efficiency is achieved when the number of tasks is an integer multiple of the number of processes, this result represents only an ideal scenario under static load balancing. In realistic production environments, the robustness of such a static allocation strategy is often challenged.
In practical engineering applications, the computational workload of images across different scenes is frequently heterogeneous and uneven. For instance, some images may contain extensive sea areas that do not require computation or include invalid fill values (NoData), whereas others are fully covered with land pixels. Additionally, I/O response times of storage nodes can fluctuate across different periods. Under these conditions, the existing static allocation strategy is susceptible to the Straggler Effect, whereby a small number of processes handling complex tasks or encountering I/O congestion delay the overall job completion time.
To fully adapt the HDC-MPI algorithm to complex industrial-level mapping tasks, future iterative deployments should transition towards dynamic load balancing. Key optimization strategies include:
  • Master–slave architecture and on-demand distribution: A dedicated master process maintains the global task queue. Upon completing the current task, each worker process actively requests a new task from the master. This asynchronous pull mechanism allows processes with lighter loads to automatically take on additional work, mitigating delays caused by data sparsity or I/O jitter.
  • Fine-Grained Tile Scheduling: Departing from coarse-grained scheduling based on full-scene imagery, large images are subdivided into smaller memory-resident tiles (e.g., 1024 × 1024 pixels). This finer scheduling granularity reduces tail waiting times for lagging processes and facilitates more efficient overlap between computation and I/O, improving pipeline utilization.
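As an illustration of the tile subdivision, the sketch below enumerates 1024 × 1024 windows over a raster, with smaller windows at the right and bottom edges. The function name is hypothetical; the dimensions are borrowed from Case 1 above, and which value is the row count is an assumption.

```python
def iter_tiles(n_rows: int, n_cols: int, tile: int = 1024):
    """Yield (row_start, col_start, height, width) windows covering a raster."""
    if n_rows < 1 or n_cols < 1 or tile < 1:
        raise ValueError("dimensions and tile size must be positive")
    for r in range(0, n_rows, tile):
        for c in range(0, n_cols, tile):
            # Edge tiles are clipped so that no window leaves the raster.
            yield r, c, min(tile, n_rows - r), min(tile, n_cols - c)


# 5590 x 5915 taken from Case 1; the row/column order is assumed.
tiles = list(iter_tiles(5590, 5915))
print(len(tiles), tiles[-1])
```

For this image size the scheme yields 36 tiles, so a single scene provides 36 schedulable units instead of one.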
By incorporating these dynamic scheduling mechanisms, the HDC-MPI algorithm can effectively eliminate the negative impact of uneven task distribution and hardware-induced fluctuations, thereby achieving near-theoretical optimal scalability in complex and dynamic production environments.
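The difference between static allocation and the pull-based mechanism can be illustrated with a small simulation. This is a sketch under simplifying assumptions (known per-task costs, no communication or I/O delay), not the HDC-MPI implementation; the cost values and function names are hypothetical.

```python
import heapq


def static_makespan(costs, n_procs):
    """Finish time when task i is pre-assigned to process i % n_procs."""
    loads = [0.0] * n_procs
    for i, cost in enumerate(costs):
        loads[i % n_procs] += cost
    return max(loads)


def dynamic_makespan(costs, n_procs):
    """Finish time when each idle process pulls the next task from a queue."""
    free_at = [0.0] * n_procs  # min-heap of times at which processes go idle
    heapq.heapify(free_at)
    for cost in costs:
        earliest = heapq.heappop(free_at)  # first process to become idle
        heapq.heappush(free_at, earliest + cost)
    return max(free_at)


# Hypothetical per-image costs in seconds: every third image is expensive.
costs = [9.0, 1.0, 1.0] * 4
print(static_makespan(costs, 3), dynamic_makespan(costs, 3))
```

In this constructed case the static assignment sends all expensive images to the same process and finishes after 36 s, while the pull-based scheme finishes after 19 s, which is the straggler effect described above in miniature.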

3.5. Accuracy Evaluation of Elevation Datum Conversion

Although the previous sections have verified the computational fidelity and parallel efficiency of the HDC-MPI algorithm, the absolute physical correctness of the converted elevation datum requires independent geodetic verification.
Elevation information is a key attribute of DSM, and its accuracy is critically important. The accuracy assessment of DSM is a vital step to ensure the quality of DSM data, aiming to evaluate the deviations between the DSM-derived elevation values and the true terrain elevations, and to determine whether they meet the accuracy requirements for intended applications. Accuracy assessment is typically carried out in terms of vertical accuracy and horizontal resolution, with particular focus on analyzing the distribution patterns and statistical characteristics of elevation errors. In vertical accuracy assessment, the Root Mean Square Error (RMSE) is the most commonly used metric, providing a quantitative measure of the overall level of elevation error [14].
To quantitatively assess the geodetic fidelity of the elevation datum conversion, the converted elevations were compared against the high-precision GCPs [15]. The statistical assessment employs three standard metrics, which are mathematically defined as follows. Mean Error (ME) detects potential systematic bias:
$$ME = \frac{1}{n} \sum_{i=1}^{n} \left( Z_i - Z_i^{*} \right)$$
Standard Deviation (STD) measures the dispersion of the residuals:
$$STD = \sqrt{\frac{\sum_{i=1}^{n} \left[ \left( Z_i - Z_i^{*} \right) - ME \right]^2}{n}}$$
Root Mean Square Error (RMSE) quantifies the overall absolute accuracy:
$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( Z_i - Z_i^{*} \right)^2}$$
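The three metrics can be computed directly from paired elevations. The sketch below assumes that $Z_i$ is the converted DSM elevation and $Z_i^{*}$ the GCP elevation at point $i$ (the symbols are not defined explicitly in the text), and it uses hypothetical values rather than the GCP data of this study.

```python
import math


def error_metrics(converted, reference):
    """Return (ME, STD, RMSE) of the residuals converted - reference."""
    if len(converted) != len(reference) or not converted:
        raise ValueError("inputs must be non-empty and of equal length")
    residuals = [z - z_ref for z, z_ref in zip(converted, reference)]
    n = len(residuals)
    me = sum(residuals) / n
    # Population form (divisor n), matching the STD definition above.
    std = math.sqrt(sum((r - me) ** 2 for r in residuals) / n)
    rmse = math.sqrt(sum(r * r for r in residuals) / n)
    return me, std, rmse


# Hypothetical elevations in metres (not GCP data from this study).
converted = [10.5, 20.0, 29.5, 40.5]
reference = [10.0, 20.0, 30.0, 40.0]
me, std, rmse = error_metrics(converted, reference)
print(me, std, rmse)
```

Because both STD and RMSE use the divisor $n$, the three values satisfy $RMSE^2 = ME^2 + STD^2$, which gives a quick consistency check on any reported triple.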
To validate the absolute geodetic accuracy of the HDC-MPI elevation conversion algorithm, a total of 47 high-precision ground control points (GCPs) were employed as validation data. Given that the accuracy of global gravity field models such as SGG-UGM-2 is fundamentally constrained by terrain complexity, a hierarchical validation approach was developed. Based on the elevation and topographic characteristics of these 47 GCPs, they were categorized into two groups: Table 12 shows complex terrain, while Table 13 shows simple terrain.
Complex Terrain Group (17 sites): Situated in mountainous and alpine regions characterized by high elevations and pronounced topographic variations.
Low-Altitude Group (30 sites): Predominantly located in plain areas with low elevations and relatively flat terrain; this is the simple terrain group.
The ME, STD, and RMSE computed for both groups are summarized in Table 14.
The statistical results presented in Table 14 clearly illustrate the conversion errors. For the simple terrain group, the algorithm achieved exceptionally high conversion accuracy, with a root mean square error (RMSE) of only 0.037 m (3.7 cm) and a mean error (ME) of 0.017 m. In contrast, for the complex terrain group, a positive mean error of 0.217 m was observed, with localized errors reaching up to 0.617 m, and the RMSE increased to 0.326 m. In regions with significant topographic variations, high-frequency gravitational fluctuations are inherently limited by the truncation of the global model’s spherical harmonic expansion (degree and order 2190), resulting in unavoidable localized omissions in elevation anomalies.
The overall RMSE across the 47 points remains below 0.2 m; however, the maximum localized error of approximately 0.6 m in complex terrains warrants consideration. This tolerance can be evaluated from two perspectives:
  • Inherent Uncertainty of Open-Source Digital Surface Models (DSMs): The proposed HDC-MPI conversion scheme is designed to efficiently harmonize global and regional medium-resolution DSM products (e.g., SRTM 30 m, AW3D30). According to official specifications, the inherent vertical RMSE of these original DSMs ranges from 4 to 10 m. Therefore, the observed maximum conversion error of 0.5–0.6 m is roughly an order of magnitude smaller than the inherent vertical error of the original data. The conversion process thus preserves the nominal accuracy of the DSM, with no significant degradation.
  • Application Standards: For large-scale applications, such as unified regional topographic mapping (e.g., at scales of 1:25,000 to 1:50,000), sub-meter vertical discrepancies are well within acceptable operational limits. Thus, the conversion effectively unifies the elevation datum while fully retaining the nominal accuracy of the original DSM.
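The overall figure can be cross-checked from the two group results quoted above (RMSE of 0.326 m over 17 points and 0.037 m over 30 points), since squared errors pool in proportion to group size under the RMSE definition used here:

$$RMSE_{\text{all}} = \sqrt{\frac{17 \times 0.326^2 + 30 \times 0.037^2}{47}} = \sqrt{\frac{1.8067 + 0.0411}{47}} \approx 0.198\ \text{m}$$

This is consistent with the statement that the overall RMSE across the 47 points remains below 0.2 m, and it shows that the complex terrain group accounts for almost all of the pooled error.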

3.6. RAM Analysis Experiment

In this group of experiments, four sets of examples were set up, each consisting of 12 images. The details of each set are shown in Table 15.
This set of experiments aims to analyze the impact of image memory usage on the efficiency of the parallel algorithm. Table 16 presents the maximum memory consumption for four sets of test cases under different numbers of processes. From the table, it can be observed that when only 1 process is used (i.e., serial processing), the memory usage for Case 1 to Case 4 is 10.95%, 12.00%, 13.20%, and 17.10%, respectively. This indicates a positive correlation between image size and memory consumption: as the size of the image increases, the memory usage also increases.
To show the impact of memory on the performance of the algorithm more intuitively, Figure 12 plots the memory consumption of Cases 1 to 4. In all four cases, the memory consumption of the HDC-MPI fast conversion algorithm increases significantly as the number of processes increases. This indicates that, for the same number of images, image size directly affects memory usage, and it can be further inferred that it also influences the computational time of the algorithm. Under conditions of sufficient memory, memory usage for Cases 1 through 4 shows a clear upward trend, with Case 4 exhibiting a more pronounced increase than the other three cases. Moreover, for all four cases, memory consumption demonstrates a strong linear relationship with the number of processes. This suggests that while increasing the number of processes results in higher memory demand, it simultaneously enables improved processing efficiency through parallel computation.

4. Conclusions

This study addresses the computational efficiency bottleneck in large-scale elevation datum transformation of high-resolution Digital Surface Models (DSMs) by proposing and implementing HDC-MPI, a fast transformation algorithm based on MPI parallel technology. The algorithm significantly improves the processing efficiency of elevation datum transformation for large-scale DSM imagery while preserving the reliability and accuracy of the results. Its performance was evaluated systematically through multiple experiments, and the key findings are summarized as follows:
  • Substantial improvement in computational performance: In a 12-core parallel environment, HDC-MPI achieved an acceleration ratio of approximately 6.5–8× across DSM image sets of varying sizes and quantities, substantially reducing the time required for batch processing tasks.
  • Clear load characteristics: The efficiency of the algorithm is driven primarily by the volume of image data (memory footprint) and shows minimal correlation with terrain complexity (e.g., plains versus mountainous regions). The algorithm can therefore be applied with confidence in heterogeneous geographical environments.
  • High reliability of results: The transformation accuracy after parallelization is fully consistent with that of the traditional serial algorithm, ensuring numerical continuity and spatial consistency when unifying elevation datums across multi-source spatial datasets.
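The acceleration ratio quoted above is the serial run time divided by the parallel run time. The short C++ sketch below, with illustrative function names that are not taken from the HDC-MPI code, computes it together with parallel efficiency (speedup divided by process count), a standard companion metric that is not reported in this section.

```cpp
#include <cassert>

// Speedup S_p = T_1 / T_p, where T_1 is the serial time and T_p the
// time measured with p processes (both in seconds).
double speedup(double serial_seconds, double parallel_seconds) {
    assert(serial_seconds > 0.0 && parallel_seconds > 0.0);
    return serial_seconds / parallel_seconds;
}

// Parallel efficiency E_p = S_p / p; 1.0 would be ideal linear scaling.
double efficiency(double serial_seconds, double parallel_seconds,
                  int processes) {
    assert(processes > 0);
    return speedup(serial_seconds, parallel_seconds) / processes;
}
```

For Example 1 in Table 4, `speedup(195.542, 29.284)` reproduces the 12-process value of 6.677 listed in Table 5, which corresponds to an efficiency of about 0.56.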
Although the HDC-MPI algorithm delivers significant acceleration in single-node, multi-core environments, it has several limitations when scaling to large-scale data processing, which stem primarily from the current testing conditions:
  • Single-node limitation of the testing environment (primary limitation): All experiments in this study were conducted on a single machine equipped with 12 cores. Consequently, the observed performance gains of the HDC-MPI algorithm rest solely on shared-memory communication and high-speed intra-node bus access, and have not yet been validated on cross-node distributed computing clusters, such as real high-performance computing (HPC) environments. The effects of network latency, limited inter-node bandwidth, and other overheads of cross-node communication on the overall acceleration ratio remain unclear.
  • Vulnerability of static load balancing: The current algorithm relies on a static task allocation mechanism based on the number of image files. In single-machine testing, if the data volume differs significantly between scene images, this mechanism can easily lead to uneven computing progress across cores.
  • Single-node resource bottleneck: The experiments show that memory consumption increases approximately linearly with the number of parallel processes. In a single-machine environment, the available physical memory therefore becomes a hard limit on any further increase in the degree of parallelism (number of processes).
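The load-balancing concern can be made concrete with a small C++ sketch. It assumes a contiguous block distribution of files across ranks, which is only one possible form of allocation by file count; the exact mapping used by HDC-MPI is not specified in this section. The names `static_block`, `imbalance`, and `kTable9SizesMb` are hypothetical. The file sizes are those of the twelve images in Table 9, and data volume stands in for processing cost.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Half-open range [begin, end) of image indices owned by one rank.
struct Range {
    std::size_t begin;
    std::size_t end;
};

// Static block partition by file count (assumed scheme):
// the first (n_files % n_ranks) ranks receive one extra file.
Range static_block(std::size_t n_files, std::size_t n_ranks,
                   std::size_t rank) {
    assert(n_ranks > 0 && rank < n_ranks);
    const std::size_t base = n_files / n_ranks;
    const std::size_t extra = n_files % n_ranks;
    const std::size_t begin = rank * base + std::min(rank, extra);
    const std::size_t count = base + (rank < extra ? 1u : 0u);
    return {begin, begin + count};
}

// Ratio of the heaviest rank's data volume to the mean volume per rank.
// 1.0 means perfectly balanced; larger values mean other ranks finish early.
double imbalance(const std::vector<double>& size_mb, std::size_t n_ranks) {
    assert(!size_mb.empty() && n_ranks > 0);
    double total = 0.0, max_load = 0.0;
    for (std::size_t rank = 0; rank < n_ranks; ++rank) {
        const Range r = static_block(size_mb.size(), n_ranks, rank);
        double load = 0.0;
        for (std::size_t i = r.begin; i < r.end; ++i) load += size_mb[i];
        total += load;
        max_load = std::max(max_load, load);
    }
    return max_load / (total / static_cast<double>(n_ranks));
}

// File sizes (MB) of experimental images 1 to 12 in Table 9.
const std::vector<double> kTable9SizesMb = {
    19.3, 19.0, 23.3, 7.73, 11.6, 2.05,
    3.71, 24.8, 6.32, 6.34, 6.18, 6.18};
```

With four ranks, `imbalance(kTable9SizesMb, 4)` is about 1.8: the rank holding the three largest leading files carries nearly twice the average data volume, whereas twelve equally sized files give exactly balanced blocks.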
Because the current experiments are confined to single-node, multi-core shared-memory environments, this study validates only the baseline performance of the HDC-MPI algorithm on medium- to large-scale DSM images. Future research will address the scalability bottlenecks and explore the potential of the algorithm in more complex, large-scale distributed computing architectures:
  • Transition from static mapping to dynamic scheduling: Future research will develop a dynamic load balancing strategy based on task pools. By allocating tasks according to the real-time status of each computing process, this approach is expected to eliminate the efficiency losses of the current static allocation mode when processing images of varying data volume.
  • Distributed I/O optimization: Introduce parallel I/O support (e.g., MPI-IO) to optimize the reading and writing of image data and to reduce disk access contention during batch processing of massive numbers of small files.
  • Performance validation from single machine to cross-node clusters: The next core objective is to overcome the limitations imposed by single-node hardware and to conduct deployment experiments on distributed high-performance computing (HPC) clusters. Cross-node performance benchmarking will be used to rigorously evaluate the scalability and fault tolerance of the HDC-MPI algorithm under network latency, limited bandwidth, and distributed cluster topologies.
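The expected benefit of task-pool scheduling can be illustrated with a small simulation. The C++ sketch below is not an MPI implementation and not the authors' planned design: it only models idle workers pulling the next task from a shared pool, with the hypothetical function `pool_makespan` and with task cost assumed proportional to file size.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Simulated completion time (makespan) when each worker takes the next
// task from a shared pool as soon as it becomes idle.
// cost[i] is the processing cost of task i, in arbitrary time units.
double pool_makespan(const std::vector<double>& cost, int n_workers) {
    assert(n_workers > 0);
    // Min-heap of the times at which each worker becomes free.
    std::priority_queue<double, std::vector<double>, std::greater<double>>
        free_at;
    for (int w = 0; w < n_workers; ++w) free_at.push(0.0);

    double makespan = 0.0;
    for (double c : cost) {
        const double start = free_at.top();  // earliest idle worker
        free_at.pop();
        const double finish = start + c;
        free_at.push(finish);
        if (finish > makespan) makespan = finish;
    }
    return makespan;
}
```

Taking the twelve file sizes of Table 9 (in MB) as task costs with four workers, the simulated makespan is about 44 cost units, compared with 61.6 for the heaviest block when the same files are split statically into four contiguous groups of three. A real deployment would also have to account for the communication needed to hand out tasks, which this sketch ignores.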

Author Contributions

H.Z. contributed to the Conceptualization, Methodology, Data Curation, and Supervision of the study. C.H. was responsible for Formal Analysis, Software development, and Writing—Original Draft preparation. X.F. contributed to Investigation, Validation, and Writing—Review and Editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original data presented in the study are openly available at https://icgem.gfz.de/home (accessed on 5 May 2026) and https://www.gscloud.cn/home (accessed on 5 May 2026).

Acknowledgments

We would like to acknowledge the ICGEM, Geospatial Data Cloud, and China’s National Digital Elevation Model for their valuable contributions to the data used in this research.

Conflicts of Interest

Author Xinhao Fan was employed by the company Heilongjiang Longmei Geological Exploration Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
DSM | Digital Surface Model
DEM | Digital Elevation Model
MPI | Message Passing Interface
HDC-MPI | Height Datum Conversion–MPI
SGG-UGM-2 | Specific Global Gravity Model, Ultra-High Degree 2
CPU | Central Processing Unit
CSV | Comma-Separated Values

References

  1. Ramdani, D.; Pahlevi, A.; Harahap, M.R. Optimal Global Gravity Field Model for Calculation of Local Gravity and Geoid in Indonesia. IOP Conf. Ser. Earth Environ. Sci. 2024, 1418, 012021.
  2. Wang, Y.; Liu, X.; Li, Q.; Li, R.; Fang, L. DQM2022 series ultra-high-order earth gravity field model construction and its accuracy evaluation. J. Surv. Mapp. 2024, 53, 1505–1516.
  3. Xiaogang, L.; Xiaoping, W. Construction of Earth’s gravitational field model from CHAMP, GRACE and GOCE data. Geod. Geodyn. 2015, 6, 292–298.
  4. Liang, W.; Li, J.; Xu, X.; Zhang, S.; Zhao, Y. A high-resolution Earth’s gravity field model SGG-UGM-2 from GOCE, GRACE, satellite altimetry, and EGM2008. Engineering 2020, 6, 860–878.
  5. Nie, P.; Cui, Z.; Wan, Y. A Rapid Parallel Mosaicking Algorithm for Massive Remote Sensing Images Utilizing Read Filtering. Remote Sens. 2023, 15, 4863.
  6. Datla, R.; Krishna Mohan, C. A Novel Framework for Seamless Mosaic of Cartosat-1 DEM Scenes. Comput. Geosci. 2021, 146, 104619.
  7. Wang, S.; Armstrong, M.P.; Ni, J.; Liu, Y. GISolve: A grid-based problem solving environment for computationally intensive geographic information analysis. In Proceedings of the Challenges of Large Applications in Distributed Environments (CLADE 2005), Research Triangle Park, NC, USA, 24 July 2005; IEEE: New York, NY, USA, 2005; pp. 3–12.
  8. Cui, J.; Zhou, B.; Luo, Z.; Zhang, X. Efficiency Analysis of Spherical Harmonic Synthesis Implemented with MPI Parallel Algorithm. J. Wuhan Univ. (Inf. Sci. Ed.) 2019, 44, 1802–1807.
  9. Liu, S.; Chen, L.; Xiong, W.; Wu, Y.; Li, J. MPI-based Parallel Tiling Algorithm for Large-scale Raster Images. Comput. Eng. Appl. 2018, 54, 48–53+111.
  10. Dang, Y.; Jiang, T.; Chen, J. Research Progress of Global Elevation Datum. J. Wuhan Univ. (Inf. Sci. Ed.) 2022, 47, 1576–1586.
  11. Gropp, W.; Lusk, E.; Skjellum, A. Using MPI: Portable Parallel Programming with the Message-Passing Interface; MIT Press: Cambridge, MA, USA, 1999.
  12. Ge, X.; Liu, Y.; Chen, Z.; Xiao, N. MPI-based Parallel Large Dataset Generator. Comput. Eng. Sci. 2022, 44, 1152–1161.
  13. Ince, E.S.; Barthelmes, F.; Reißland, S.; Elger, K.; Förste, C.; Flechtner, F.; Schuh, H. ICGEM—15 years of successful collection and distribution of global gravitational models, associated services and future plans. Earth Syst. Sci. Data 2019, 11, 647–674.
  14. Lee, J.; Kwon, J.H. Precision Evaluation of Recent Global Geopotential Models Based on GNSS/Leveling Data on Unified Control Points. J. Korean Soc. Surv. Geod. Photogramm. Cartogr. 2020, 38, 153–163.
  15. Zhang, P.; Bao, L.; Guo, D.; Li, Q. Estimation of the Height Datum Geopotential Value of Hong Kong Using the Combined Global Geopotential Models and GNSS/Levelling Data. Surv. Rev. 2022, 54, 106–116.
Figure 1. Elevation system and its relationship.
Figure 2. Quadratic surface interpolation scheme.
Figure 3. Image task allocation flowchart based on the HDC-MPI algorithm.
Figure 4. Overall workflow diagram of the HDC-MPI implementation and evaluation.
Figure 5. Quasi-geoid grid schematic diagram.
Figure 6. Gravity field model distribution flow chart.
Figure 7. Comparison chart of time analysis experiment.
Figure 8. Acceleration ratio comparison diagram.
Figure 9. Time comparison diagram of each terrain example.
Figure 10. Comparison of acceleration ratio of each terrain example.
Figure 11. Comparison of image conversion results: (a) No. 2 image conversion result profile. (b) No. 12 image conversion result profile. (c) No. 2 image serial and parallel results contour comparison map. (d) No. 12 image serial and parallel results contour comparison map.
Figure 12. Comparison of memory consumption for each case in the elevation datum conversion memory analysis experiment.
Table 1. Quasi-geoid grid parameters of the SGG-UGM-2 gravity field model.
Category | Related Parameter
Functional | Height_anomaly
Model | SGG-UGM-2
Grid Range | Longitude: 101.632° to 116.924°; Latitude: 30.922° to 38.816°
Grid Step | 0.0833° × 0.0833°
Total Grid Points | 17,760
Reference System | WGS84
Min/Max | −49.808 m / −4.257 m
Mean/Stdev | −27.709 m / 10.596 m
Table 2. Experimental environment of image elevation datum conversion.
Environment Configuration | Experimental Details
Processor | Intel® Core™ i5-12500H @ 2.50 GHz
Operating system | Kylin Linux Advanced Server V10
Core number | 12 logical processors
Memory | 16 GB
Compiler and language | Qt 5.12.8, C++
Parallel environment | MPI 3.3.2
Table 3. Image elevation datum conversion time and speedup ratio analysis experimental example details.
Name | Quantity (Scenes) | Image Size (Pixels) | Space Usage (MB)
Numerical example 1 | 100 | 2118 × 2411 | 19.3
Numerical example 2 | 84 | 2118 × 2411 | 19.3
Numerical example 3 | 70 | 2118 × 2411 | 19.3
Numerical example 4 | 60 | 2118 × 2411 | 19.3
Numerical example 5 | 30 | 2118 × 2411 | 19.3
Numerical example 6 | 12 | 2118 × 2411 | 19.3
Table 4. Image elevation datum conversion time analysis experiment (unit: s).
Process | Example 1 | Example 2 | Example 3 | Example 4 | Example 5 | Example 6
1 | 195.542 | 167.959 | 136.487 | 116.640 | 58.144 | 25.117
2 | 99.748 | 83.665 | 69.730 | 59.407 | 29.654 | 12.881
3 | 70.540 | 58.336 | 49.907 | 41.695 | 21.080 | 8.778
4 | 52.340 | 44.282 | 37.781 | 31.453 | 16.858 | 6.741
5 | 45.124 | 38.634 | 31.261 | 27.176 | 13.794 | 7.181
6 | 40.419 | 33.993 | 27.899 | 24.653 | 11.834 | 5.059
7 | 38.234 | 30.126 | 24.627 | 23.006 | 11.135 | 5.402
8 | 34.044 | 28.736 | 22.867 | 21.933 | 11.140 | 5.848
9 | 33.539 | 28.250 | 21.100 | 19.875 | 11.575 | 6.077
10 | 28.166 | 25.782 | 18.077 | 17.472 | 8.372 | 6.469
11 | 29.851 | 23.944 | 18.461 | 19.224 | 9.360 | 6.320
12 | 29.284 | 22.819 | 18.865 | 15.549 | 8.903 | 3.448
Table 5. Image elevation datum conversion acceleration ratio analysis experiment.
Process | Example 1 | Example 2 | Example 3 | Example 4 | Example 5 | Example 6
1 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
2 | 1.960 | 2.008 | 1.957 | 1.963 | 1.961 | 1.950
3 | 2.772 | 2.879 | 2.735 | 2.797 | 2.758 | 2.861
4 | 3.736 | 3.793 | 3.613 | 3.708 | 3.449 | 3.726
5 | 4.333 | 4.347 | 4.366 | 4.292 | 4.215 | 3.498
6 | 4.838 | 4.941 | 4.892 | 4.731 | 4.913 | 4.964
7 | 5.114 | 5.575 | 5.542 | 5.070 | 4.536 | 4.650
8 | 5.684 | 5.845 | 5.969 | 5.318 | 5.219 | 4.295
9 | 5.830 | 5.945 | 6.469 | 5.869 | 5.023 | 5.133
10 | 6.494 | 6.515 | 7.550 | 6.676 | 6.945 | 5.883
11 | 6.551 | 7.015 | 7.393 | 6.067 | 6.247 | 5.974
12 | 6.677 | 7.360 | 7.235 | 7.502 | 6.531 | 7.285
Table 6. Details of experimental examples of terrain elevation datum conversion based on HDC-MPI algorithm.
Name | Quantity (Scenes) | Image Size | Space Usage (MB)
Plain example 1 | 12 | 3287 × 3255 | 23.3
Plain example 2 | 12 | 1283 × 1551 | 7.73
Hill example 1 | 12 | 1778 × 1948 | 11.6
Hill example 2 | 12 | 1280 × 1444 | 2.05
Mountain example 1 | 12 | 985 × 1015 | 3.71
Mountain example 2 | 12 | 3601 × 3601 | 24.8
High mountain example 1 | 12 | 2118 × 2411 | 19.3
High mountain example 2 | 12 | 2212 × 2290 | 19.0
Table 7. HDC-MPI algorithm execution time for each terrain example (unit: s).
Process | Plain 1 | Plain 2 | Hill 1 | Hill 2 | Mountain 1 | Mountain 2 | High Mountain 1 | High Mountain 2
1 | 49.460 | 9.139 | 15.946 | 5.669 | 4.531 | 37.213 | 23.626 | 23.424
2 | 25.309 | 4.806 | 5.088 | 2.983 | 2.416 | 19.100 | 11.777 | 11.989
3 | 17.625 | 3.270 | 5.087 | 2.043 | 1.638 | 13.637 | 8.450 | 8.353
4 | 13.494 | 2.467 | 4.531 | 1.589 | 1.259 | 10.358 | 6.502 | 5.549
5 | 10.159 | 3.634 | 3.279 | 1.286 | 1.091 | 8.362 | 5.478 | 4.873
6 | 9.126 | 3.019 | 2.870 | 1.214 | 0.993 | 7.294 | 5.388 | 5.544
7 | 10.545 | 2.014 | 3.641 | 1.380 | 1.128 | 9.099 | 5.388 | 5.544
8 | 11.595 | 2.042 | 3.843 | 1.262 | 1.085 | 9.457 | 5.937 | 5.541
9 | 11.799 | 2.919 | 3.697 | 1.415 | 1.116 | 9.647 | 5.604 | 5.861
10 | 12.631 | 2.131 | 3.448 | 1.330 | 1.212 | 10.461 | 5.946 | 5.883
11 | 9.851 | 3.944 | 3.222 | 1.107 | 0.987 | 9.649 | 5.874 | 5.974
12 | 7.520 | 1.157 | 2.196 | 0.734 | 0.640 | 5.826 | 3.682 | 3.436
Table 8. HDC-MPI algorithm acceleration ratio for each terrain example.
Process | Plain 1 | Plain 2 | Hill 1 | Hill 2 | Mountain 1 | Mountain 2 | High Mountain 1 | High Mountain 2
1 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
2 | 1.954 | 1.901 | 1.972 | 1.901 | 1.875 | 1.947 | 2.006 | 1.954
3 | 2.806 | 2.794 | 2.777 | 2.775 | 2.767 | 2.729 | 2.796 | 2.804
4 | 3.665 | 3.704 | 3.682 | 3.568 | 3.599 | 3.593 | 3.634 | 3.577
5 | 3.237 | 3.473 | 3.465 | 3.443 | 3.404 | 3.325 | 3.611 | 3.295
6 | 4.869 | 4.799 | 4.862 | 4.409 | 4.531 | 4.594 | 4.313 | 4.807
7 | 4.690 | 4.539 | 4.380 | 4.107 | 4.017 | 4.432 | 4.385 | 4.225
8 | 4.265 | 4.475 | 4.149 | 4.491 | 4.178 | 4.090 | 4.309 | 4.352
9 | 4.192 | 4.762 | 3.314 | 4.005 | 4.060 | 3.935 | 3.979 | 4.227
10 | 3.916 | 4.289 | 3.856 | 3.689 | 3.672 | 3.658 | 3.996 | 3.996
11 | 3.777 | 3.803 | 3.585 | 3.594 | 3.740 | 3.657 | 3.608 | 3.944
12 | 6.577 | 7.899 | 7.262 | 7.725 | 7.084 | 6.387 | 6.417 | 6.818
Table 9. Correctness of image elevation datum conversion results: Details of experimental examples.
Name | Quantity | Image Size | Committed Memory (MB)
Experimental image 1 | 1 | 2118 × 2411 | 19.3
Experimental image 2 | 1 | 2212 × 2290 | 19.0
Experimental image 3 | 1 | 3287 × 3255 | 23.3
Experimental image 4 | 1 | 1283 × 1551 | 7.73
Experimental image 5 | 1 | 1778 × 1948 | 11.6
Experimental image 6 | 1 | 1280 × 1444 | 2.05
Experimental image 7 | 1 | 985 × 1015 | 3.71
Experimental image 8 | 1 | 3601 × 3601 | 24.8
Experimental image 9 | 1 | 1316 × 1185 | 6.32
Experimental image 10 | 1 | 1316 × 1185 | 6.34
Experimental image 11 | 1 | 1316 × 1185 | 6.18
Experimental image 12 | 1 | 1316 × 1185 | 6.18
Table 10. Experimental example of the correctness of image elevation datum transformation—Transformation time and speed-up ratio.
Number of Processes | Conversion Time (s) | Acceleration Ratio
1 | 16.333 | 1.000
2 | 9.992 | 1.635
3 | 4.062 | 4.021
4 | 3.405 | 4.797
5 | 3.644 | 4.482
6 | 3.169 | 5.155
7 | 3.461 | 4.719
8 | 3.240 | 5.041
9 | 3.125 | 5.226
10 | 3.455 | 4.728
11 | 3.395 | 4.810
12 | 2.970 | 5.500
Table 11. Extended Experiment Detailed Data for Case 1 and Case 2.
Process | Case 1 Time (s) | Case 1 Speedup | Case 1 Memory Usage | Case 2 Time (s) | Case 2 Speedup | Case 2 Memory Usage
1 | 178.509 | 1.000 | 7.00% | 66.5884 | 1.000 | 5%
2 | 91.901 | 1.942 | 10.40% | 33.6456 | 1.973 | 6.50%
3 | 61.869 | 2.885 | 14.30% | 23.2613 | 2.850 | 8.60%
4 | 47.415 | 3.765 | 17.10% | 17.8335 | 3.714 | 9.60%
5 | 39.943 | 4.469 | 20.60% | 15.2818 | 4.378 | 10.60%
6 | 32.362 | 5.516 | 23.80% | 12.3243 | 5.350 | 12.40%
7 | 32.749 | 5.451 | 28.30% | 12.4549 | 5.316 | 13.70%
8 | 25.097 | 7.113 | 30.80% | 9.8414 | 6.718 | 15.20%
9 | 25.100 | 7.112 | 34.10% | 9.8186 | 6.639 | 16.50%
10 | 26.501 | 6.736 | 37.70% | 10.3034 | 6.614 | 17.60%
11 | 25.815 | 6.915 | 40.60% | 9.9789 | 6.705 | 18.30%
12 | 18.331 | 9.738 | 44.30% | 6.8479 | 9.332 | 19.90%
16 | 17.2928 | 10.349 | 56.60% | 7.1671 | 6.936 | 24.90%
20 | 19.3829 | 7.390 | 68.50% | 10.0316 | 6.726 | 29.20%
22 | 27.2495 | 6.642 | 75.70% | 11.3097 | 5.847 | 32.50%
24 | 26.8836 | 6.822 | 85.30% | 10.1003 | 6.592 | 36.40%
Table 12. Error Analysis Table for Elevation Conversion Outcomes in Complex Terrain via the HDC-MPI Algorithm (Unit: m).
Serial Number | Check Point Elevation | Conversion Point Elevation | Elevation Error
check point 1 | 1367.237 | 1367.623 | −0.386
check point 2 | 1321.917 | 1322.496 | −0.579
check point 3 | 1315.431 | 1316.048 | −0.617
check point 4 | 1320.291 | 1320.440 | −0.149
check point 5 | 1287.171 | 1287.801 | −0.630
check point 6 | 1291.811 | 1291.738 | 0.073
check point 7 | 1344.991 | 1345.278 | −0.287
check point 8 | 1307.075 | 1307.175 | −0.100
check point 9 | 1328.053 | 1328.542 | −0.489
check point 10 | 1286.372 | 1286.761 | −0.389
check point 11 | 1299.746 | 1299.941 | −0.195
check point 12 | 58.121 | 58.119 | 0.002
check point 13 | 25.150 | 25.139 | 0.011
check point 14 | 3.757 | 3.773 | −0.016
check point 15 | 17.402 | 17.388 | 0.014
check point 16 | 103.333 | 103.309 | 0.024
check point 17 | 15.893 | 15.872 | 0.021
Table 13. Error Analysis Table of Elevation Conversion Results for Simple Terrain Using the HDC-MPI Algorithm (Unit: m).
Serial Number | Check Point Elevation | Conversion Point Elevation | Elevation Error
check point 1 | 101.325 | 101.279 | 0.046
check point 2 | 398.191 | 398.180 | 0.011
check point 3 | 23.884 | 23.929 | −0.045
check point 4 | 7.335 | 7.317 | 0.018
check point 5 | 6.732 | 6.721 | 0.011
check point 6 | 8.918 | 8.917 | 0.001
check point 7 | 11.009 | 11.003 | 0.006
check point 8 | 19.608 | 19.668 | −0.060
check point 9 | 123.411 | 123.416 | −0.005
check point 10 | 53.314 | 53.353 | −0.039
check point 11 | 19.228 | 19.203 | 0.025
check point 12 | 37.940 | 37.974 | −0.034
check point 13 | 10.624 | 10.580 | 0.044
check point 14 | 302.313 | 302.334 | −0.021
check point 15 | 115.138 | 115.139 | −0.001
check point 16 | 171.280 | 171.273 | 0.007
check point 17 | 45.537 | 45.603 | −0.066
check point 18 | 25.990 | 25.991 | −0.001
check point 19 | 25.130 | 25.150 | −0.020
check point 20 | 13.048 | 13.000 | 0.048
check point 21 | 124.317 | 124.373 | −0.056
check point 22 | 25.623 | 25.658 | −0.047
check point 23 | 4.386 | 4.438 | −0.052
check point 24 | 103.309 | 103.343 | −0.034
check point 25 | 4.004 | 4.062 | −0.058
check point 26 | 8.330 | 8.362 | −0.032
check point 27 | 233.890 | 233.928 | −0.035
check point 28 | 72.928 | 72.940 | −0.012
check point 29 | 58.120 | 58.170 | −0.050
check point 30 | 3.774 | 3.834 | −0.060
Table 14. Overall Error Analysis Table (Unit: m).
Terrain Group | Points (N) | ME | STD | RMSE
complex terrain | 17 | 0.217 | 0.251 | 0.326
simple terrain | 30 | 0.017 | 0.034 | 0.037
Overall Dataset | 47 | 0.089 | 0.179 | 0.199
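The statistics in Table 14 can be recomputed from the per-point errors. The C++ sketch below assumes that ME is the mean error, STD is the sample standard deviation (denominator n − 1), and RMSE is the root-mean-square error; these definitions are not stated in this section, and the names `ErrorStats`, `error_stats`, and `kComplexErrors` are illustrative. The input is the Elevation Error column of Table 12.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct ErrorStats {
    double mean;   // signed mean error (m)
    double stdev;  // sample standard deviation, n - 1 denominator (m)
    double rmse;   // root-mean-square error (m)
};

// Summary statistics of a list of elevation errors.
ErrorStats error_stats(const std::vector<double>& err) {
    assert(err.size() >= 2);
    const double n = static_cast<double>(err.size());
    double sum = 0.0, sum_sq = 0.0;
    for (double e : err) {
        sum += e;
        sum_sq += e * e;
    }
    const double mean = sum / n;
    double ss = 0.0;  // sum of squared deviations from the mean
    for (double e : err) ss += (e - mean) * (e - mean);
    return {mean, std::sqrt(ss / (n - 1.0)), std::sqrt(sum_sq / n)};
}

// Elevation errors (m) of the 17 complex-terrain check points in Table 12.
const std::vector<double> kComplexErrors = {
    -0.386, -0.579, -0.617, -0.149, -0.630, 0.073, -0.287, -0.100, -0.489,
    -0.389, -0.195, 0.002, 0.011, -0.016, 0.014, 0.024, 0.021};
```

Under these assumptions, `error_stats(kComplexErrors)` matches the complex-terrain row to three decimals, with one caveat: the signed mean is negative (about −0.217 m), so the table appears to report its magnitude.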
Table 15. Details of Experimental Cases for Image Elevation Standard Conversion Memory Analysis.
Name | Number of Images (Scenes) | Image Size | Memory Usage (MB) | Image Source
Case 1 | 12 | 1283 × 1551 | 7.72 | Resource 3
Case 2 | 12 | 1778 × 1948 | 11.6 | High-Resolution Satellite
Case 3 | 12 | 2118 × 2411 | 19.3 | High-Resolution Satellite
Case 4 | 12 | 3601 × 3601 | 24.8 | Resource 3
Table 16. Memory consumption of each case in the image elevation datum conversion memory analysis experiment.
Number of Processes | Case 1 | Case 2 | Case 3 | Case 4
1 | 10.95% | 12.00% | 13.20% | 17.10%
2 | 12.25% | 14.10% | 16.45% | 23.60%
3 | 14.00% | 17.00% | 19.40% | 30.00%
4 | 15.60% | 19.10% | 22.00% | 35.50%
5 | 14.55% | 21.20% | 24.75% | 42.25%
6 | 18.70% | 22.45% | 27.55% | 47.90%
7 | 17.90% | 24.30% | 30.35% | 51.45%
8 | 19.45% | 26.10% | 32.85% | 59.30%
9 | 20.20% | 28.35% | 35.90% | 64.90%
10 | 20.30% | 29.85% | 39.40% | 71.10%
11 | 22.50% | 32.40% | 41.65% | 77.40%
12 | 24.30% | 35.50% | 45.90% | 83.05%
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zhang, H.; Huang, C.; Fan, X. Fast Conversion Algorithm of DSM Image Elevation Datum Based on MPI Parallel Technology. Electronics 2026, 15, 2127. https://doi.org/10.3390/electronics15102127