Research on Two-Dimensional Digital Map Modeling Method Based on UAV Aerial Images

Wang, Han; Zong, Kai; Gao, Dongfeng; Xu, Xuerui; Wang, Yanwei

doi:10.3390/app15073818

Open AccessArticle

Research on Two-Dimensional Digital Map Modeling Method Based on UAV Aerial Images

by

Han Wang

¹

,

Kai Zong

^1,*

,

Dongfeng Gao

²,

Xuerui Xu

¹ and

Yanwei Wang

³

¹

China Academy of Safety Science and Technology, Beijing 100012, China

²

CHN ENERGY, Beijing 100011, China

³

School of Mechanical Engineering, Heilongjiang University of Science & Technology, Harbin 150022, China

^*

Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(7), 3818; https://doi.org/10.3390/app15073818

Submission received: 14 January 2025 / Revised: 24 February 2025 / Accepted: 24 February 2025 / Published: 31 March 2025

Download

Browse Figures

Versions Notes

Abstract

Accurate acquisition of two-dimensional digital maps of disaster sites is crucial for rapid and effective emergency response. The construction of two-dimensional digital maps using unmanned aerial vehicle (UAV) aerial images is not affected by factors such as signal interference, terrain, or complex building structures, which are common issues with methods like single-soldier image transmission or satellite imagery. Therefore, this paper investigates a method for modeling two-dimensional digital maps based on UAV aerial images. The proposed Canny edge-enhanced Speeded-Up Robust Features (C-SURF) algorithm in this method is designed to enhance the number of feature extractions and the accuracy of image registration. Compared to the SIFT and SURF algorithms, the number of feature points increased by approximately 44%, and the registration accuracy improved by about 16%, laying a solid foundation for feature-based image stitching. Additionally, a novel image stitching method based on the novel energy function is introduced, effectively addressing issues such as color discrepancies, ghosting, and misalignment in the fused image sequences. Experimental results demonstrate that the signal-to-noise ratio (SNR) of the fused images based on the novel energy function can reach an average of 36 dB.

Keywords:

digital maps; feature extraction and registration; image stitching; UAV image

1. Introduction

Digital maps are discrete data sets that record and store geographical features in digital form within a defined coordinate system, including specific locations, attributes, relational indicators, and names [1,2]. Traditional paper maps, which transform data into graphical representations, are time-consuming, labor-intensive, and often result in lower data precision [3]. In recent years, geological disasters have occurred frequently, with an increasing trend in the occurrence of disasters such as landslides, mudslides, floods, and earthquakes [4,5]. The acquisition of the latest geographical information of disaster-stricken areas has become a critical aspect and an important guarantee for rapid emergency rescue after disasters, directly affecting the efficiency of operations during the golden rescue period and the level of protection for people’s lives and property [1,4,6]. Therefore, the method of two-dimensional digital map modeling method based on UAV aerial images plays a significant role in enhancing the response speed of emergency rescue operations and mitigating the impact of disasters [7].

Traditional survey methods, such as proximity inquiries and visual estimations, are increasingly inadequate due to the complexity of disaster situations and terrain. Methods like individual soldier image transmission and satellite imagery are also limited by signal interference, terrain challenges, and complex building structures, affecting their effectiveness. Advances in drone technology have significantly enhanced emergency response by capturing multi-angle photographs of disaster sites and transmitting real-time images, improving the timeliness of rescue operations [8].

The principle of UAV mapping involves mounting a camera on the drone to capture images of the Earth’s surface, utilizing the drone’s flexibility. Subsequently, the acquired images are processed to produce high-precision digital maps [9]. Due to their low cost, flexibility, and high resolution, UAVs have been increasingly applied in various important fields such as environmental monitoring, remote sensing, and target tracking [10,11]. Therefore, this study selects UAV imagery as the source of the images.

Image registration, a critical step in digital map production, aims to correct misalignments caused by variations in lighting, scale, displacement, and rotation across different modalities [12]. Ramli et al. proposed the CURVE feature extraction technique, which combines retinal vascular and noise features to enhance fundus image registration accuracy, achieving 86% accuracy, significantly outperforming existing methods [13]. Liang et al. developed a binary fast image registration method based on fused information, improving FAST feature detection and weighted angular diffusion radial sampling to achieve rapid and accurate UAV image registration [14]. Gu et al. introduced the AC-SURF algorithm with an angle compensation strategy for damage detection in rotating blades, enhancing the efficiency and accuracy of wind turbine blade operation monitoring and damage detection through digital image correlation techniques [15].

To obtain images with a wider field of view, image stitching is a necessary preprocessing step in UAV remote sensing applications [16]. Therefore, after image registration, it is necessary to select an appropriate method to stitch multiple images together. This involves projecting the overlapping images onto the same reference plane through spatial transformation, aligning the overlapping regions in a single step to create a panoramic image [17]. Due to factors such as the capture time, location, and lens distortion of the UAV, the overlap between adjacent UAV images is irregular. Consequently, traditional stitching methods often exhibit significant geometric and tonal discrepancies at the stitching boundaries, resulting in noticeable gaps in the stitched map [18]. Currently, there is extensive research focused on improving the quality of image stitching. Jia et al. proposed a multi-feature extraction algorithm based on grayscale, complementary wavelet (CW) chrominance, sharpness, and natural scene statistics (NSS) for image stitching tampering detection, significantly improving detection accuracy [19]. Zhang et al. presented an improved scale-invariant feature transform (SIFT) method for underwater image stitching, enhancing feature extraction and precise matching, which notably increased stitching quality and robustness, effectively reducing ghosting and distortion [20]. Li et al. proposed an improved RANSAC-SURF algorithm for vehicle seat type detection and spring hook missing recognition, enhancing detection accuracy and robustness [21]. Liu et al. introduced a merge-sorting-based method for multi-scene image stitching, significantly reducing computational time and improving the efficiency of image registration and stitching, minimizing distortion in the stitched images [22]. Chen et al. proposed a UAV image stitching method based on diffusion models, effectively eliminating irregular boundaries and seams in image stitching, and enhancing the perceptual quality of the stitched images [16]. However, there is still a need for further research on image stitching techniques that involve moving objects.

Current algorithms struggle with unstructured roads and frequently changing geographical environments, leading to instability in feature extraction and uneven exposure, which do not meet the requirements for high-precision digital maps. To address these issues, this paper proposes a two-dimensional digital map modeling method based on UAV aerial images, enhancing map accuracy. The innovations include:

To address the incomplete feature extraction of sequential images using SIFT and SURF algorithms, we propose a C-SURF algorithm to enhance feature detection efficiency. Moreover, we improve feature matching accuracy by optimizing the dimensionality of feature descriptors, thereby providing reliable data for sequential image stitching;
To mitigate ghosting and color artifacts in image stitching caused by moving objects, we propose the novel energy function that integrates pixel texture features. This approach aims to improve the visual quality of stitched sequential images by reducing these artifacts.

2. Feature Extraction and Registration Based on the C-SURF Algorithm

2.1. The C-SURF Algorithm

Feature-based image registration is a core component of image stitching, and the quality of feature point extraction significantly influences feature matching and image stitching. The production of digital maps imposes strict requirements on image quality. Commonly used feature point extraction algorithms, such as SIFT and speeded-up robust features (SURF), encounter challenges in detecting feature points in images that contain dynamic scenes, leading to the potential omission of certain elements.

The traditional SURF algorithm is an enhancement of the SIFT algorithm, resulting in similar operational processes. Initially, the SURF algorithm performs Hessian matrix analysis, followed by non-maximum suppression, while employing integral images to accelerate feature extraction. In generating feature descriptors, the SURF algorithm computes a 64-dimensional feature vector, which diminishes the accuracy of feature matching compared to the 128-dimensional feature vector utilized in the SIFT algorithm. Consequently, while the SURF algorithm retains the stability, robustness, and rotation invariance of the SIFT algorithm and enhances the real-time performance of feature extraction, it sacrifices the precision of feature registration.

To address this limitation, this paper proposes an algorithm that combines an improved Canny edge detection algorithm with an enhanced SURF algorithm, referred to as the C-SURF algorithm. The C-SURF algorithm not only improves the quality of image feature extraction but also increases the accuracy of feature matching. The specific implementation steps of the algorithm are as follows:

(1): The source image undergoes high-contrast denoising preprocessing to eliminate noise, ensuring that edge information in the source image is preserved and enhancing the stability of feature extraction and contour information;
(2): The improved Canny edge detection algorithm is employed to extract edge information from the source image;
(3): An improved SURF feature extraction algorithm based on logarithmic polar coordinates is utilized to perform feature detection on the image obtained in step two, ultimately resulting in a set of features from the image.

2.2. Improved Canny Edge Detection Algorithm in the C-SURF Algorithm

The Canny edge detection algorithm is an efficient and straightforward process with a small data footprint. However, when combined with Gaussian filtering during its application, it can weaken the edges, become overly sensitive to noise, and is prone to generating spurious edges. Therefore, before integrating it with the SURF algorithm, it is necessary to modify it by employing high-contrast filtering in place of Gaussian filtering. This modification aims to achieve noise reduction while effectively preserving edge information. The enhanced algorithm is divided into four steps: high-contrast image denoising, gradient calculation, feature extraction, and delayed edge tracking.

(1): High-Contrast Image Denoising

The expression of high-contrast filtering is shown as Equation (1):

h (x, y) = f (x, y) - \frac{\sum_{i, j} f (i, j, x, y) ω (i, j, x, y)}{\sum_{i, j} ω (i, j, x, y)}

(1)

where

h (x, y)

represents the pixel value of the pixel point after H-B filtering,

f (x, y)

represents the pixel value of the pixel at coordinate

(x, y)

in the source image,

ω (i, j, x, y)

represents the weight coefficient of the mask in the filtering process, and

ω

is shown as Equation (2):

\begin{array}{l} ω (i, j, x, y) \\ = e^{- [\frac{{(x - i)}^{2} + {(y - j)}^{2}}{2 σ_{d}^{2}} + \frac{| | f (x, y) - f (i, j) | |}{2 σ_{r}^{2}}]} \end{array}

(2)

(2): Gradient calculation

By iteratively calculating the pixel gradient magnitude G and the gradient direction

θ

, the extrema of the target function are identified, as shown in Equation (3):

{\begin{cases} G = \sqrt{G_{x}^{2} + G_{y}^{2}} \\ θ = \tan^{- 1} (\frac{G_{y}}{G_{x}}) \end{cases}

(3)

where

G_{x}

and

G_{y}

represent the first-order partial derivatives of the pixel gradient in the x and y directions, respectively.

(3): Feature Extraction

The feature extraction process commences with non-maximum suppression, which involves calculating the magnitude of the target pixel in the direction of the gradient to suppress non-maximum values. This is followed by a comparison with pixels within the local neighborhood; if the magnitude of the target pixel exceeds that of its surrounding pixels, it is retained, otherwise, it is discarded. By employing a double-threshold method for image segmentation, all pixel points within the target image are categorized into three classes based on the magnitude of their gradient values: strong edge points with gradient values exceeding the high threshold, weak edges with gradient values between the two thresholds, and pixel points with gradient values below the low threshold. The weak edges require further processing, as they encompass not only genuine edges but also spurious edges induced by noise and grayscale variations. Conversely, the suppressed pixel points no longer appear in the target image.

(4): Hysteresis Edge Tracking

During the process of non-maximum suppression, weak edges are identified. To filter out the true edges, it is considered that genuine edges are typically connected to strong edges. Therefore, a 3 × 3 rectangular window is employed to filter these weak edges. If a strong edge point is detected within this window, the pixel is retained as a true edge; otherwise, it is discarded as a spurious point. Continuous iterative tracking is utilized to confirm the true edges.

2.3. Improved SURF Feature Extraction in the C-SURF Algorithm

The traditional SURF algorithm frequently employs the concept of integral images and applies appropriate simplifications and approximations to the integrals within the Hessian matrix [13]. However, the traditional SURF algorithm constructs descriptors by statistically extracting histograms and determining the principal orientation of features in the neighboring region. It then sums the responses in two directions along the principal orientation and the absolute sum of the responses within a 4 × 4 sub-region, ultimately describing the corresponding feature points with a 64-dimensional feature vector. Consequently, compared to the SIFT feature extraction algorithm, although the reduction in dimensionality of the SURF algorithm’s feature descriptors alleviates data redundancy, it also results in a loss of accuracy in feature registration.

In the C-SURF algorithm, the improved SURF algorithm primarily focuses on optimizing the descriptor. The descriptor in this method is designed to extract logarithmic polar coordinates from the neighborhoods of key points obtained at different scales, thereby enhancing the accuracy of image registration. More specifically, the newly obtained neighborhoods in logarithmic polar coordinates are first divided into regions, segmenting the neighborhood into three blocks based on radial distances of 6, 9, and 15, while excluding the coordinates of the central pixel. Additionally, the area is divided into eight regions, resulting in a total of 17 sub-regions. Each sub-region calculates the gradient histogram in eight directions, ultimately yielding a 136-dimensional feature description. The specific construction process is illustrated in Figure 1.

To achieve rotation invariance, the logarithmic polar coordinates must capture the direction of the keypoints. After rotation, the new coordinates of the sampling points in the neighborhood of the central pixel point are as shown in Equation (4):

{\begin{cases} x ’ = x \cos (α) - y \sin (α) \\ y ’ = y \sin (α) + y \cos (α) \end{cases}

(4)

In terms of assigning pixel weights around keypoints, both the SIFT feature extraction algorithm and the traditional SURF feature extraction algorithm employ a Gaussian weighting function. The Gaussian weighting function assigns larger weights to pixels closer to the center pixel and smaller weights to those farther away, thereby reducing the influence of distant pixels on the central pixel. However, during nearest-neighbor matching, Gaussian weighting tends to blur the distinction between the nearest and the second-nearest neighbors, which can easily lead to mismatches and significantly increase the computational burden for improving registration accuracy. In contrast, distance-based weighting amplifies the difference between the nearest and the second-nearest neighbors during nearest neighbor matching, which is beneficial for enhancing the registration accuracy of feature points. To address the deficiencies associated with both Gaussian and distance-based weighting schemes in registration, this paper adopts a method that combines Gaussian and distance weights to improve the accuracy of subsequent feature point registration. The new weight calculation formula is presented in Equation (5):

w = m (x, y) e^{- \frac{{(x ’)}^{2} + {(y ’)}^{2}}{2 σ^{2}}} d

(5)

where

d = \sqrt{{(x - a)}^{2} + {(x - b)}^{2}}

represents the Euclidean distance between a neighboring pixel and the central pixel,

(x, y)

denotes the coordinates of the neighboring pixel, and

(a, b)

signifies the coordinates of the central pixel.

3. Image Stitching Based on the Novel Energy Function

Feature matching using the improved C-SURF algorithm yields relatively ideal results. For multiple images, image stitching is also required. Conventional image stitching methods abound, such as those based on trigonometric function weights, pixel-weighted averaging, and direct averaging methods. However, these methods often fall short, primarily because they struggle to achieve seamless stitching of color images, and when there are moving objects in the images to be stitched, the resulting effect is prone to ghosting and other issues [14]. Therefore, an optimal seam-finding method is needed to address discontinuities and ghosting in the orthoimage sequences obtained by UAVs.

This paper transforms the problem of finding the best seam into a pixel labeling problem. By constructing an energy function based on the energy spectrum, smoothness terms, and problem-specific factors, a novel minimum cut and maximum flow connected graph is derived, ultimately generating the optimal seam. Under the consideration of the energy spectrum and the smoothness term, the texture features of neighboring pixels with the same label are introduced as an expansion factor, and the novel energy function formula is shown in Equation (6).

E (x, y) = E_{d a t a} (x, y) + λ E_{s m o o t h n e s s} (x, y) + E_{t e x t u r e} (x, y)

(6)

A crucial step in the minimum cut and maximum flow algorithm is the identification of a connected graph. As depicted in Figure 2, assuming the source node is S and the sink node is T (hereinafter referred to as the S-T connected graph), the process involves the following: Pixels within the overlapping region correspond to nodes in the connected graph, with the nodes comprising a set of vertices V and a set of edges E. The network flow between the source and sink nodes traverses through various nodes within the overlapping region, assigning different weights based on distinct energy spectra. In the graph, the weights of the connections for network t are derived from the data term

E_{d a t a}

, for network n from the smoothness term

E_{s m o o t h n e s s}

, and for network m from the texture feature

E_{t e x e t u r e}

. Within the graph, the thickness of the network lines represents the magnitude of the weights; it is assumed that the closer the pixel values of two adjacent pixels p and q are to each other, the greater the weight assigned, indicating that pixel points p and q are likely from the same image. Conversely, if the pixel values are dissimilar, the two pixels may originate from two different source images. Thus, the S-T connected graph based on the overlapping region’s pixel points between different images is essentially formed.

The calculation formula for texture features is shown in Equation (7).

E_{t e x t u r e} (x, y) = \frac{\sum_{i = x - 1}^{x + 1} \sum_{j = y - 1}^{y + 1} (I_{1} (i, j) - {\bar{I}}_{1} (i, j)) (I_{2} (i, j) - {\bar{I}}_{2} (i, j))}{\sqrt{\sum_{i = x - 1}^{x + 1} \sum_{j = y - 1}^{y + 1} {(I_{1} (i, j) - {\bar{I}}_{1} (i, j))}^{2}} \sqrt{\sum_{i = x - 1}^{x + 1} \sum_{j = y - 1}^{y + 1} {(I_{2} (i, j) - {\bar{I}}_{2} (i, j))}^{2}}}

(7)

Here, λ represents the adjustment parameter, which is set to a default value of 1 in the experiments;

{\bar{I}}_{i} (i, j)

denotes the pixel points within the overlapping region of the two images; and

I_{i} (i, j)

represents the feature spectrum within the central 3 × 3 neighborhood.

The result of reassigning labels is obtained by optimizing the S-T connected graph using the min-cut and max-flow algorithms. Set

E_{d a t a}

to a specified value, that is, divide the pixel labels into three categories: pixel points in the overlapping area are in the source image I₁ but not in the source image I₂; pixel points in the overlapping area are not in the source image I₁ but are in the source image I₂; pixel points in the overlapping area are in both the source image I₁ and the source image I₂. The expression is as follows in Equation (8):

\begin{array}{l} E_{d a t a} (x, y) = {\begin{cases} \infty, f (x, y) = 0 \\ 0, f (x, y) = 1 \end{cases}, (x, y) \in I_{1} \\ E_{d a t a} (x, y) = {\begin{cases} \infty, f (x, y) = 1 \\ 0, f (x, y) = 0 \end{cases}, (x, y) \in I_{2} \\ E_{d a t a} (x, y) = 0, (x, y) \in R \end{array}

(8)

In this context,

f (x, y)

denotes the label of the pixel, and

(x, y)

represents the coordinates of the pixel.

The

E_{s m o o t h n e s s}

smoothness term, which defines the discontinuity between adjacent pixels within a four-neighborhood N, plays a crucial role in the entire energy function definition. It represents the difference between adjacent pixels in the overlapping region and directly influences the quality of the entire image after stitching. However, in traditional energy functions, to optimize the regularization term of the objective function and enhance the model’s generalization ability, as well as to simplify the fitting scenario, the smoothness term is often based on the L₂ norm. The expression is as follows in Equation (9):

f (x) = | | x | |_{2}

(9)

A well-designed seam should ideally traverse through the moving objects within the overlapping region to minimize the creation of artifacts, yet the smoothness term based on the L₂ norm fails to adequately distinguish between misaligned areas within the overlap. This is because the L₂ norm-based smoothness term does not provide an appropriate penalty when dealing with moving objects within the overlap. To mitigate this issue, this paper reconsiders the choice of norm for the smoothness term.

In order for the novel energy function to more effectively differentiate between aligned and misaligned regions of moving objects within the overlap, it is necessary to find a suitable function that allows the seam to better avoid these objects. The goal is to maximize the penalty of the smoothness term in misaligned regions and minimize it in aligned regions, thereby distinguishing between them. Assuming that pixels with the same label in source images I₁ and I₂ are denoted as x, the aim is to identify differences in the same moving objects across the overlapping regions of the two source images as much as possible to effectively avoid them. In this chapter, a new function is defined based on the L₁ norm, as shown in Equation (10):

f (x) = {(| | x | |_{1})}^{5 / 2}

(10)

The expression for the new smoothness term function is as follows in Equation (11):

\begin{array}{l} E_{s m o o t h n e s s} ((x_{1}, y_{1}), (x_{2}, y_{2}), f (x_{1}, y_{1}), f (x_{2}, y_{2})) \\ = | f (x_{1}, y_{1}) - f (x_{2}, y_{2}) | \\ \cdot [{‖ I_{f (x_{1}, y_{1})} (x_{1}, y_{1}) - I_{f (x_{2}, y_{2})} (x_{1}, y_{1}) ‖}^{5 / 2} \\ + {‖ I_{f (x_{1}, y_{1})} (x_{2}, y_{2}) - I_{f (x_{2}, y_{2})} (x_{2}, y_{2}) ‖}^{5 / 2}] \end{array}

(11)

The specific procedure of the algorithm is detailed in Algorithm 1. Initially, feature extraction and matching are performed on the images to calculate a global perspective transformation matrix H. Based on matrix H, the overlapping region R and boundaries of the two source images are identified. Subsequently, the data term, smoothness term, and texture term of the novel energy function are computed. The minimum cut and maximum flow algorithm is then applied to assign labels to pixels within the overlapping region, thereby identifying the optimal seam. Finally, Poisson fusion is used for color correction of the stitched image, resulting in an ideal panorama. When seeking the optimal seam, the search begins with pixels at the first row of each column within the overlapping region’s boundary. The energy function is calculated for each starting point, and then the search is expanded downward according to the energy formula, computing the energy values of the three adjacent pixels in the next row within the same column and summing them with the energy values of the starting expansion point from the previous row. This process continues until the last row is reached. Ultimately, the line formed by connecting the pixels with the minimum energy values is selected as the optimal stitching line.

Algorithm 1: Image Stitching Algorithm Based on the Novel Energy Function

Input: Source Image I₁ and Source Image I₂.

Output: Panoramic Image I.

1: Perform feature extraction on the source images using the C-SURF algorithm.

2: Conduct a coarse matching of the extracted features using a brute-force approach and refine the matches using the RANSAC algorithm.

3: Perform bundle adjustment on the images and calculate the global perspective transformation matrix H.

4: Determine the overlapping region R based on matrix H and identify the boundaries of region R.

5:

Compute the data term E_{d a t a}

according to the Equation (8).

6:

Compute the smoothness term E_{s m o o t h n e s s}

according to the Equation (11).

7:

Compute the texture term E_{t e x t u r e}

according to the Equation (7).

8:

Merge the data term E_{d a t a}

, smoothness term E_{s m o o t h n e s s}

, and texture term E_{t e x e t u r e}

into the novel energy function for computation, as shown in the Equation (6).

9: Solve the energy function equation using the minimum cut and maximum flow algorithm, and assign labels to pixels within the overlapping region.

10: Obtain the stitched panoramic image using Poisson fusion.

In summary, the specific process of the proposed two-dimensional digital map modeling method based on UAV aerial images is illustrated in Figure 3.

4. Experimental Validation

4.1. Data Collection Platform Setup

In the process of image acquisition, to ensure that the captured images are clear, the flight altitude of the drone is typically kept below 200 m. Wind speed can significantly impact the stability of drone photography. Given the presence of tall buildings in urban environments, low-altitude winds are generally classified as level 4 or lower, with speeds under 7.9 m/s. Consequently, when selecting a drone model, its wind resistance capability should be at least 10 m/s. During image capture, it is essential to control the drone’s flight speed to ensure that its endurance time is sufficient to cover the entire scene intended for detection. To meet this requirement, the drone’s flight endurance must be no less than 30 min. The sequence of images captured by the drone is intended for creating panoramic images through image processing techniques; therefore, the resolution of the drone’s onboard detection system must be at least 10 million pixels.

Taking all the aforementioned factors into account, the experimental platform utilized in this study is the DJI Mavic Air 2 drone equipped with a wide-angle camera. This platform boasts a maximum flight range of 5 km, and the wide-angle lens has a maximum tilt angle of 84°, which fully satisfies the requirements for the experiments conducted in this paper. Furthermore, the lens captures images with a resolution of up to 48 million pixels, and even during continuous shooting, each image retains a resolution of 12 million pixels. The lens has an equivalent focal length of 24 mm, and the available shooting modes include spherical, wide-angle, and vertical. During the actual imaging process, the flight altitude of the drone significantly influences the size of the captured image area. To obtain sequential images conducive to target extraction, the final flight altitude was determined to be between 140 and 145 m after several test flights. All image acquisitions for this experiment were performed under normal lighting conditions in the morning. To facilitate the storage of captured data, the platform is equipped with 8GB of onboard memory. Using the experimental platform for aerial photography, a total of 30 images were captured for this scene, and some of the obtained orthophoto source images are shown in Figure 4. In addition, the computer used for image processing in this article has an Intel (R) Core (TM) i5-6300HQ @ 2.30 GHz CPU with 8GB of RAM.

4.2. Image Preprocessing

Denoising the collected sequential images to obtain high SNR images is fundamental for image stitching based on the overlapping regions of sequential images. The quality of the stitched image depends on the choice of image acquisition equipment and algorithms during the image processing phase. During image capture, factors such as the drone’s flight altitude, orientation, and weather conditions inevitably lead to issues like blurriness, uneven lighting, and occlusions. Denoising is therefore necessary to enhance image quality, which in turn improves the accuracy of feature extraction and feature matching in sequential images.

An appropriate filtering algorithm not only can highly restore the information in an image but also maintain the clarity of the image’s edge contours, thereby ensuring the accuracy of edge extraction. In this study, the H-B filtering algorithm, based on Gaussian bilateral filtering, was employed for image denoising. After applying the H-B filter, SNR improved to approximately 40.85 dB, exceeding 38 dB, which qualifies as a high SNR for sequential images. The denoised images achieved noise reduction while preserving edge information from the original images, and also enhanced the clarity of details within the images, as shown in Figure 5. Following the image denoising process, high-quality materials suitable for panoramic image stitching were obtained.

4.3. Feature Extraction and Registration

To perform point feature-based image registration on high signal-to-noise ratio images, feature extraction is the first step. Two images captured by a drone were randomly selected, and the same preprocessing method was applied for noise reduction. With the threshold for the number of features set to 3000, the results of feature extraction using the SIFT algorithm, SURF algorithm, and C-SURF algorithm are shown in Figure 6, Figure 7 and Figure 8.

After applying the SIFT algorithm for feature extraction on the drone-captured images, instances of feature point omission were observed at the location of the crosswalk in Figure 6a and at the edge of the flowerbed in Figure 6b.

According to the detection results shown in Figure 7, the feature extraction results of the SURF algorithm also exhibit the same issue. Instances of feature point omission are observed at the location of the crosswalk in Figure 7a and at the edge of the flowerbed in Figure 7b.

Based on the principles of the C-SURF algorithm, feature extraction was performed on the collected images, and the experimental results are shown in Figure 8. The results indicate that the issue of feature point omission has been effectively addressed.

Additionally, Table 1 compares the feature extraction results of the SIFT algorithm, SURF algorithm, and C-SURF algorithm across four dimensions: set detection quantity, actual detection quantity, effective detection quantity (the actual number of detections quantity divided by the set number of detections quantity), and time consumption. From Table 1, it can be observed that the C-SURF algorithm demonstrates a low dependency of the number of detected effective features on the complexity of the scene within the images. When the threshold for the number of feature detections is set to 3000, the feature detection rate can reach over 90% of the threshold. Therefore, the C-SURF algorithm exhibits a greater advantage in detecting feature points compared to the SIFT and SURF algorithms.

During the feature registration phase, the feature descriptors of the C-SURF algorithm have an increased dimensionality of 136 dimensions, compared to 128 dimensions for the SIFT algorithm and 64 dimensions for the SURF algorithm, resulting in higher registration accuracy. The feature registration pairs of the three algorithms are shown in Table 2, where the registration accuracy rate is obtained by dividing the fine matching point pair/group by the coarse matching point pair/group. In the feature registration phase, the C-SURF algorithm employs a brute-force matching method for coarse matching, followed by the use of the Random Sample Consensus algorithm for filtering. The feature matching results after this process show an improvement of nearly 17% compared to the feature matching results based on the SIFT algorithm and an improvement of nearly 15% compared to the matching results of the SURF feature extraction algorithm. Additionally, the specific feature matching results are illustrated in Figure 9.

4.4. Image Stitching

The source images captured by drones contain moving objects, such as individuals riding electric bicycles and cars. Representative images were selected for stitching comparison, with Figure 9 illustrating the results of the image stitching process. Specifically, Figure 10a–c depicts the results of image stitching using traditional energy functions, while Figure 10d–f shows the results obtained using the improved the novel energy function. It is evident from Figure 10a–c that the results of image stitching with traditional energy functions exhibit noticeable illumination variations and ghosting artifacts at their edges. In contrast, the use of the improved the novel energy function for image stitching effectively addresses the illumination, exposure, and ghosting issues that traditional algorithms could not resolve. Subsequently, SNR analysis was conducted on the image stitching results, revealing SNR values of 13.1504, 12.7692, and 11.9154 for Figure 10a–c, and SNR values of 36.2802, 35.9015, and 36.3165 for Figure 10d–f. The average SNR of the stitched images based on the novel energy function exceeds 36, indicating excellent image quality and laying a solid foundation for target extraction tasks based on panoramic images.

Therefore, the image stitching method proposed in this paper, which is based on the improved the novel energy function, not only resolves the ghosting issues present in the stitching process of sequential images captured by drones but also addresses exposure problems. This provides a strong foundation for the subsequent production of two-dimensional digital maps.

5. Conclusions

This paper presents a method for creating digital maps from urban building sequence images captured by unmanned aerial vehicles (UAVs), which can be used for emergency rescue, accident handling, and other issues. During the image processing, the following problems were addressed and certain achievements were made:

(1): In the domain of image feature extraction, the C-SURF algorithm is proposed, which demonstrates a more comprehensive feature detection capability compared to the SIFT and SURF algorithms. When detecting features in the same image, the number of detected feature points using the C-SURF algorithm exceeds that of the SIFT algorithm by nearly 40% and that of the SURF algorithm by nearly 48%. This advancement mitigates the issue of certain elements within the image being undetectable as feature points and increases the number of correctly matched point pairs within the same image, thereby enhancing the effectiveness of feature registration;
(2): To address the issue of defects in stitched images caused by the presence of moving objects in the analyzed scene, a novel image stitching method based on the novel energy function is proposed. This method not only prevents the occurrence of uneven exposure when stitching two or more images but also significantly improves the issues of color discrepancies and ghosting artifacts that arise after image fusion. The average signal-to-noise ratio (SNR) of the stitched images increased from the original 12.6617 dB to 36.1661 dB, indicating a marked improvement in image quality.

Furthermore, during the research process of obtaining panoramic images and target extraction, there remain many areas for in-depth study. For instance, optimization of drone aerial image processing techniques under varying lighting conditions and different terrain complexities could be considered. The maps generated in this study do not take geographic coordinate information into account; thus, adding latitude, longitude, and altitude to the current maps to achieve more precise mapping is one of the directions for future research. Additionally, exploring the integration of the generated maps with GIS environments presents another potential avenue for development.

Author Contributions

H.W.: Conceptualization, methodology, software, data curation, writing—original draft. K.Z.: Conceptualization, writing—review and editing, Supervision. D.G.: validation, writing—review and editing. X.X.: Validation, writing—review and editing. Y.W.: Validation, writing—review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by STUDY ON SAFETY PRODUCTION MANAGEMENT SYSTEM OF CHN ENERGY, grant number GJNY-23-1.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

Author Dongfeng Gao was employed by the company CHN ENERGY. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

Popovaitė, V. Assembling Collaboration Space: Maps in Practice During Search and Rescue Efforts In Northern Norway. Saf. Sci. 2023, 164, 106186. [Google Scholar] [CrossRef]
Opach, T.; Rød, J.K. A User-Centric Optimization of Emergency Map Symbols to Facilitate Common Operational Picture. Cartogr. Geogr. Inf. Sci. 2022, 49, 134–153. [Google Scholar] [CrossRef]
Esteves Henriques, B.; Baglioni, M.; Jamshidnejad, A. Camera-Based Mapping in Search-and-Rescue via Flying and Ground Robot Teams. Mach. Vis. Appl. 2024, 35, 117. [Google Scholar] [CrossRef]
Zhao, Z.-H.; Sun, H.; Zhang, N.-X.; Xing, T.-H.; Cui, G.-H.; Lai, J.-X.; Liu, T.; Bai, Y.-B.; He, H.-J. Application of Unmanned Aerial Vehicle Tilt Photography Technology in Geological Hazard Investigation in China. Nat. Hazards 2024, 120, 11547–11578. [Google Scholar] [CrossRef]
Cao, X.; Liu, Z.; Hu, C.; Song, X.; Quaye, J.A.; Lu, N. Three-Dimensional Geological Modelling in Earth Science Research: An In-Depth Review and Perspective Analysis. Minerals 2024, 14, 686. [Google Scholar] [CrossRef]
Baile, A.; Jha, M.; Jain, N.; Tignath, S.; Kinattinkara, R. Integrated Landslide Analysis Using Petrological Investigation and Drone Based High Resolution Geospatial Assessment: Irshalwadi Landslide—A Case Study from Western Ghats of India. Nat. Hazards 2024, 121, 405–421. [Google Scholar] [CrossRef]
Khan, A.; Gupta, S.; Gupta, S.K. Emerging UAV Technology for Disaster Detection, Mitigation, Response, and Preparedness. J. Field Robot. 2022, 39, 905–955. [Google Scholar] [CrossRef]
Mugnai, F.; Masiero, A.; Angelini, R.; Cortesi, I. High-Resolution Monitoring of Landslides with UAS Photogrammetry and Digital Image Correlation. Eur. J. Remote Sens. 2023, 56, 2216361. [Google Scholar] [CrossRef]
Yang, K.; Li, W.; Yang, X.; Zhang, L. Improving Landslide Recognition on UAV Data through Transfer Learning. Appl. Sci. 2022, 12, 10121. [Google Scholar] [CrossRef]
Wei, Z.; Lan, C.; Xu, Q.; Wang, L.; Gao, T.; Yao, F.; Hou, H. SatellStitch: Satellite Imagery-Assisted UAV Image Seamless Stitching for Emergency Response without GCP and GNSS. Remote Sens. 2024, 16, 309. [Google Scholar] [CrossRef]
Pan, W.; Li, A.; Liu, X.; Deng, Z. Unmanned Aerial Vehicle Image Stitching Based on Multi-region Segmentation. IET Image Process. 2024, 18, 4607–4622. [Google Scholar] [CrossRef]
Luo, X.; Wei, Z.; Jin, Y.; Wang, X.; Lin, P.; Wei, X.; Zhou, W. Fast Automatic Registration of UAV Images via Bidirectional Matching. Sensors 2023, 23, 8566. [Google Scholar] [CrossRef] [PubMed]
Ramli, R.; Hasikin, K.; Idris, M.Y.I.; Karim, N.K.A.; Wahab, A.W.A. Fundus Image Registration Technique Based on Local Feature of Retinal Vessels. Appl. Sci. 2021, 11, 11201. [Google Scholar] [CrossRef]
Liang, H.; Liu, C.; Li, X.; Wang, L. A Binary Fast Image Registration Method Based on Fusion Information. Electronics 2023, 12, 4475. [Google Scholar] [CrossRef]
Gu, J.; Liu, G.; Li, M. Damage Detection for Rotating Blades Using Digital Image Correlation with an AC-SURF Matching Algorithm. Sensors 2022, 22, 8110. [Google Scholar] [CrossRef]
Chen, J.; Luo, Y.; Wang, J.; Tang, H.; Tang, Y.; Li, J. Elimination of Irregular Boundaries and Seams for UAV Image Stitching with a Diffusion Model. Remote Sens. 2024, 16, 1483. [Google Scholar] [CrossRef]
He, L.; Li, X.; He, X.; Li, J.; Song, S.; Plaza, A. VSP-Based Warping for Stitching Many UAV Images. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5624717. [Google Scholar] [CrossRef]
Zhou, L.; Zhao, S.; Wan, Z.; Liu, Y.; Wang, Y.; Zuo, X. MFEFNet: A Multi-Scale Feature Information Extraction and Fusion Network for Multi-Scale Object Detection in UAV Aerial Images. Drones 2024, 8, 186. [Google Scholar] [CrossRef]
Jia, R.; Nahli, A.; Li, D.; Zhang, J. A Multi-Feature Extraction-Based Algorithm for Stitching Tampered/Untampered Image Classification. Appl. Sci. 2022, 12, 2337. [Google Scholar] [CrossRef]
Zhang, H.; Zheng, R.; Zhang, W.; Shao, J.; Miao, J. An Improved SIFT Underwater Image Stitching Method. Appl. Sci. 2023, 13, 12251. [Google Scholar] [CrossRef]
Li, X.; Zhu, J.; Ruan, Y. Vehicle Seat Detection Based on Improved RANSAC-SURF Algorithm. Int. J. Pattern Recogn. Artif. Intell. 2021, 35, 2155004. [Google Scholar] [CrossRef]
Liu, W.; Zhang, K.; Zhang, Y.; He, J.; Sun, B. Utilization of Merge-Sorting Method to Improve Stitching Efficiency in Multi-Scene Image Stitching. Appl. Sci. 2023, 13, 2791. [Google Scholar] [CrossRef]

Figure 1. Construction of feature descriptor vector in the C-SURF algorithm.

Figure 2. S-T flow chart.

Figure 3. Flowchart of the two-dimensional digital map modeling method based on UAV aerial images.

Figure 4. The original UAV aerial photo display. (a–c) are the aerial photographs of streets collected from different locations.

Figure 5. The result of image denoising.

Figure 6. The results of SIFT feature extraction: (a) extraction result 1, (b) extraction result 2.

Figure 7. The results of SURF feature extraction: (a) extraction result 1, (b) extraction result 2.

Figure 8. The results of C-SURF feature extraction: (a) extraction result 1, (b) extraction result 2.

Figure 9. The results of the feature matching: (a) SIFT feature registration results, (b) SURF feature registration results, (c) C-SURF feature registration results.

Figure 10. Image stitching results: (a–c) the results of image stitching using traditional energy functions; (d–f) the results obtained using the improved novel energy function; (g) the overall fusion result graph based on the new improved energy function.

Table 1. Comparison of feature extraction results detected by the three algorithms.

Feature Extraction Algorithm	Set Detection Quantity	Actual Detection Quantity	Effective Detection Quantity (%)	Time Consumption (Seconds)
SIFT feature extraction result 1	3000	1842	61.4%	2.235 s
SIFT feature extraction result 2	3000	2101	70.3%	2.597 s
SURF feature extraction result 1	3000	1421	47.4%	1.876 s
SURF feature extraction result 2	3000	1813	60.4%	2.148 s
C-SURF feature extraction result 1	3000	2735	94.2%	2.014 s
C-SURF feature extraction result 2	3000	2836	94.5%	2.185 s

Table 2. Comparison of feature registration results among the three algorithms.

Feature Extraction Algorithm	Coarse Matching Point /Pairs	Fine Matching Point /Pairs	Time Consumption (Seconds)	Registration Accuracy Rate (%)
SIFT	403	236	4.861 s	58.6%
SURF	617	378	4.282 s	61.3%
C-SURF	1382	1092	7.153 s	79.0%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Wang, H.; Zong, K.; Gao, D.; Xu, X.; Wang, Y. Research on Two-Dimensional Digital Map Modeling Method Based on UAV Aerial Images. Appl. Sci. 2025, 15, 3818. https://doi.org/10.3390/app15073818

AMA Style

Wang H, Zong K, Gao D, Xu X, Wang Y. Research on Two-Dimensional Digital Map Modeling Method Based on UAV Aerial Images. Applied Sciences. 2025; 15(7):3818. https://doi.org/10.3390/app15073818

Chicago/Turabian Style

Wang, Han, Kai Zong, Dongfeng Gao, Xuerui Xu, and Yanwei Wang. 2025. "Research on Two-Dimensional Digital Map Modeling Method Based on UAV Aerial Images" Applied Sciences 15, no. 7: 3818. https://doi.org/10.3390/app15073818

APA Style

Wang, H., Zong, K., Gao, D., Xu, X., & Wang, Y. (2025). Research on Two-Dimensional Digital Map Modeling Method Based on UAV Aerial Images. Applied Sciences, 15(7), 3818. https://doi.org/10.3390/app15073818

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Research on Two-Dimensional Digital Map Modeling Method Based on UAV Aerial Images

Abstract

1. Introduction

2. Feature Extraction and Registration Based on the C-SURF Algorithm

2.1. The C-SURF Algorithm

2.2. Improved Canny Edge Detection Algorithm in the C-SURF Algorithm

2.3. Improved SURF Feature Extraction in the C-SURF Algorithm

3. Image Stitching Based on the Novel Energy Function

4. Experimental Validation

4.1. Data Collection Platform Setup

4.2. Image Preprocessing

4.3. Feature Extraction and Registration

4.4. Image Stitching

5. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI