Article

Research on GNSS Spoofing Detection and Autonomous Positioning Technology for Drones

1 College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China
2 National Key Laboratory of Equipment State Sensing and Smart Support, National University of Defense Technology, Changsha 410073, China
* Author to whom correspondence should be addressed.
Electronics 2025, 14(15), 3147; https://doi.org/10.3390/electronics14153147
Submission received: 21 July 2025 / Revised: 3 August 2025 / Accepted: 5 August 2025 / Published: 7 August 2025

Abstract

With the rapid development of the low-altitude economy, drones are being applied ever more widely in both military and civilian fields. The safety and accuracy of their positioning and navigation have become critical factors in ensuring successful mission execution. GNSS spoofing attack techniques are becoming increasingly sophisticated, posing a serious threat to the reliability of drone positioning. This paper proposes a GNSS spoofing detection and autonomous positioning method for drones operating in mission mode, which is based on visual sensors and does not rely on additional hardware. First, during the spoofing detection phase, a ResNet50-SE twin network extracts and matches features between real-time aerial images from the drone's camera and satellite images retrieved via GNSS positioning, thereby identifying positioning anomalies. Second, once spoofing is detected, during the positioning recovery phase, the system uses the SuperGlue network to match real-time aerial images with satellite image features within a constrained area, enabling absolute positioning of the drone. Finally, experimental validation on open-source datasets demonstrates that the method achieves a GNSS spoofing detection accuracy of 89.5%, with 89.7% of absolute positioning errors kept within 13.9 m. This study provides a comprehensive solution for the safe operation and stable mission execution of drones in complex electromagnetic environments.

1. Introduction

The low-altitude economy is booming, and drones, with their flexible, efficient, and cost-effective advantages, are increasingly being applied in both military and civilian fields [1,2]. However, challenges related to the safety and accuracy of their positioning and navigation are becoming increasingly prominent. The loss of control of drones due to electronic warfare interference during the Russia–Ukraine conflict [3], as well as Amazon’s reliance on precise positioning for its delivery services, both highlight that high-reliability and high-precision navigation are prerequisites for the safe and efficient operation of drones. Currently, the global navigation satellite system (GNSS), the primary positioning method, faces the serious threat of spoofing interference [4]. The U.S. Department of Transportation warned as early as 2001 about the impact of global positioning system (GPS) deception on transportation [5]. In 2008, the Humphreys team developed a portable GPS signal deception device for USD 1500 [6], and the widespread adoption of software-defined radio (SDR) technology significantly lowered the barrier to constructing deception devices. In 2011 and 2012, the Iranian military successfully captured U.S. military drones using deception technology on two occasions [7,8], highlighting the real-world threat posed by this technology. Although public reports have decreased in recent years, possibly due to enhanced secrecy resulting from military confidentiality and escalating technological competition, the threat of navigation deception remains significant.
GNSS deception induces receivers to output incorrect positioning by forging signals. The current theoretical framework for drone navigation deception centers on state estimation and control (SEC) and particle hypothesis planning (PHP) [9]. The SEC model focuses on the drone's navigation and control modules, using state estimators and innovation-based detection techniques to achieve covert deception without triggering the receiver's anomaly detection; an example is the covert deception algorithm proposed by Guo et al. for GPS/inertial navigation system (INS) integrated navigation, which evades detection by matching the acceleration of the forged signal to the drone's actual motion parameters [10]. The PHP model approaches the problem from a guidance control perspective, employing trajectory tracking and trajectory planning methods to establish a relationship model between the drone's heading angle and cross-track error (CTE), identifying the drone's guidance strategy and thereby achieving mapped control of the drone's trajectory. Ma et al. construct a modeling framework based on the point-mass assumption and design parameter identification and deception strategies for classical trajectory tracking algorithms [11].
As an important means of countering GNSS deception attacks, deception detection technology is primarily divided into two categories: signal-level and information-level. The former focuses on the physical or propagation characteristics of GNSS signals, whereas the latter relies on external sensors to provide auxiliary information.
Signal-level methods primarily include three categories: signal processing, encrypted signals, and signal spatial geometric relationships. Signal processing methods achieve detection by analyzing differences in characteristics such as the carrier-to-noise ratio (C/N0) and signal power. For example, Jafarnia-Jahromi et al. found that spoofing signals cause sudden changes in C/N0 and designed a monitoring algorithm based on this finding [12]; Akos used automatic gain control (AGC) of the radio frequency front end to monitor power anomalies and identify spoofing [13]. Encryption-based methods are primarily applied to military signals (e.g., GPS P(Y) codes), whereas civilian signals utilize techniques such as navigation message authentication (NMA) and the timed efficient streaming loss-tolerant authentication (TESLA) for anti-spoofing. Among these, NMA has been studied and applied in the Galileo system [14,15,16]. Signal-space geometric relationship-based methods exploit the difference between the multi-directional propagation of genuine signals and the single-directional transmission of spoofing signals. They estimate the direction of arrival (DOA) using multi-antenna arrays. For example, Hu et al. normalize the fractional part of the double-difference carrier phase and combine it with baseline vectors and DOA estimation to detect spoofing [17]. Table 1 summarizes the advantages and limitations of various signal-level detection methods.
Compared to signal-level detection, information-level detection integrates auxiliary information such as inertial measurements and visual data to independently verify navigation results [18], overcoming the limitations of signal physical-layer analysis. Among information-level methods, the inertial measurement unit (IMU) provides relative motion information and detects deception that induces abnormally aggressive motion through internal consistency checks (such as monitoring the innovation sequences of a tightly coupled GNSS/INS integration [19]). However, the IMU cannot sense the external environment and has difficulty verifying the correctness of absolute positioning. Visual information is highly correlated with the environmental scene and possesses rich spatial perception capabilities: it can identify environmental features and, when GNSS positioning deviates, expose the deception by comparison. For example, Varshosaz compared motion-window sub-trajectories with satellite navigation trajectories to make judgments [20]; Xue used neural networks to match aerial photographs with satellite images to verify the plausibility of GNSS positioning [21]; and Davidovich compared the inter-frame correlation of drone videos with GPS coordinates to achieve real-time GPS deception detection [22].
Building on existing research, this paper aims to further enhance the GNSS deception detection capabilities of drones and enable autonomous positioning after deception attacks, thereby forming a defensive closed-loop system. Taking drones operating in mission mode as the research object, this paper proposes a GNSS deception detection and autonomous positioning method that does not require additional hardware support. The detailed contributions are summarized as follows:
  • Propose a GNSS spoofing detection method based on deep visual features. By matching features between real-time aerial images and satellite images of the GNSS-reported area, we identify position anomalies. Given the differences in lighting, season, and resolution between aerial and satellite images, traditional image matching methods struggle to meet robustness requirements [23]. This paper therefore introduces a deep network that combines ResNet50 with the SE (Squeeze-and-Excitation) attention mechanism for feature extraction: the former stabilizes the training process [24], while the latter focuses on key features [25].
  • Propose an efficient visual autonomous localization method for drones. After spoofing is detected, the system switches to the autonomous localization phase and corrects the drone's absolute position by matching the aerial image with the satellite image. A region constraint strategy exploits a priori geographic information to narrow the matching search range and improve computational efficiency, and the SuperGlue graph neural network is introduced to realize end-to-end keypoint matching and simplify the matching process [26].
This paper is organized as follows. Section 2 describes the overall architecture and two-stage workflow of the GNSS spoofing detection and autonomous localization system. Section 3 details the GNSS spoofing detection method, from image registration and feature extraction through to image matching, focusing on the improved ResNet50-SE network and the twin network structure. Section 4 describes the visual autonomous localization method for drones under spoofing attack, including the determination of the maximum flight area, the SuperGlue matching process, and the post-processing strategy. Section 5 presents the experimental results and analysis. Section 6 concludes the paper.

2. System Design

With the goal of achieving GNSS spoofing detection and autonomous positioning of drones, this research constructs a complete technical system. The overall architecture of the system is shown in Figure 1, which is mainly divided into two phases: the deception detection phase and the autonomous positioning phase, forming a closed-loop defense process from attack identification to absolute positioning.
In the spoofing detection phase, the system first combines the drone's real-time aerial images with the GNSS positioning and adopts an alignment method based on the central projection model to accurately match the aerial images with the satellite reference images in geospatial location. After image registration, the system extracts features from the image pairs. The images are first preprocessed with grayscale normalization and histogram equalization to reduce interference such as lighting variation and to enhance texture details. An improved ResNet50-SE network is then used for feature extraction; by introducing the SE module and the group convolution technique, this network improves the extraction of key information from remote sensing images and the associated computational efficiency. The image matching module, the core of spoofing detection, adopts a twin network architecture that maps the extracted aerial and satellite image features into a unified metric space and quantifies image similarity through Euclidean distance. Network training is optimized with a contrastive loss function; when the image similarity falls below a preset threshold, the system determines that a GNSS spoofing attack has occurred and triggers the defense mechanism.
In the autonomous positioning phase, the system calculates the possible maximum flight area of the drone based on the preset route, the drone guidance strategy, and the maximum flight speed, and crops satellite images within the area as a reference. The SuperGlue-based algorithm is used for feature matching between the aerial image and the reference satellite image to achieve vision-based absolute positioning.
This paper considers only GNSS spoofing attacks in the two-dimensional horizontal plane. Since the drone's flight altitude is determined jointly by the barometer and the satellite navigation system, and no effective spoofing attacks against barometers are known, altitude spoofing is not discussed here.

3. GNSS Spoofing Detection

This section introduces the GNSS spoofing detection method based on heterogeneous image matching, whose core idea is to transform the verification of spatial positioning reliability into a visual feature consistency verification problem. The system achieves binary classification of spoofing signals by comparing the drone's aerial image with the satellite reference image corresponding to the GNSS positioning coordinates, using a deep feature similarity metric. A deception alarm is triggered when the image similarity falls below a preset threshold.

3.1. Image Registration

Image registration here refers to using position and orientation system (POS) data and the camera's intrinsic parameters recorded during drone aerial photography to accurately match aerial images with satellite reference images in coverage and geospatial location. In this study, an alignment method based on the central projection model is used to achieve geospatial alignment of the heterogeneous images and to provide aligned data for the subsequent deep learning-based spoofing detection.
Central projection, a projection model used in aerial photography [27], describes the mapping of three-dimensional spatial points onto a two-dimensional imaging plane through a single projection center. In drone remote sensing imaging, the projection center is the camera's perspective center, and ground points are projected onto the image plane along straight lines through it, as shown in Figure 2. To simplify the model, the aerial image is assumed to satisfy the orthographic projection condition and the projected ground is restricted to flat terrain; undulating terrain may introduce bias.
Although satellite imagery is acquired from orbital altitudes of hundreds of kilometers, its imaging process still follows the central projection law. Based on this geometric commonality, the target-area satellite image is extracted by solving for the geographic coverage of the aerial image. As shown in Figure 3, when the drone carries a camera with focal length $F$ operating at an altitude $h$ above the ground, combined with the physical dimensions $W_0 \times H_0$ of the image sensor, we obtain the field-of-view angles $\alpha$ and $\beta$:
$$\alpha = \arctan\frac{W_0}{2F}, \qquad \beta = \arctan\frac{H_0}{2F} \quad (1)$$
Based on (1) and the central projection model, the actual ground coverage of the aerial image ( W , H ) can be expressed as follows:
$$W = 2h\tan\alpha, \qquad H = 2h\tan\beta \quad (2)$$
An ENU coordinate system (a local Cartesian coordinate system with the east, north, and up directions as the x-, y-, and z-axes, respectively) is established with its origin at the ground point $O(x, y)$ corresponding to the center of the aerial image. The coordinates of the four corners of the aerial image footprint ($A(x_1, y_1)$, $B(x_2, y_2)$, $C(x_3, y_3)$, $D(x_4, y_4)$, as shown in Figure 4) are then given by:
$$\begin{aligned}
x_1 &= x + H\sin\psi - W\cos\psi, & y_1 &= y + H\cos\psi + W\sin\psi \\
x_2 &= x + H\sin\psi + W\cos\psi, & y_2 &= y + H\cos\psi - W\sin\psi \\
x_3 &= x - H\sin\psi + W\cos\psi, & y_3 &= y - H\cos\psi - W\sin\psi \\
x_4 &= x - H\sin\psi - W\cos\psi, & y_4 &= y - H\cos\psi + W\sin\psi
\end{aligned} \quad (3)$$
where $\psi$ denotes the drone heading angle, defined as the angle of counterclockwise rotation from due north to the drone heading ($0° \le \psi < 360°$). With the POS providing the latitude and longitude of O in the WGS84 coordinate system, the geometric boundary of the area covered by the aerial image can be delineated on the satellite base map by computing the geographic coordinates of the image's corner points.
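To make the registration geometry concrete, the following minimal Python sketch implements Equations (1)–(3) as printed and converts the resulting ENU corner offsets to latitude/longitude with a simple equirectangular approximation. The function and variable names are hypothetical, and flat terrain with a nadir-pointing camera is assumed, as in the model above.

```python
import math

def footprint_corners(lat_o, lon_o, h, F, W0, H0, psi_deg):
    """Sketch of Equations (1)-(3): corner coordinates of an aerial image's
    ground footprint. h: height above ground (m); F: focal length; W0, H0:
    sensor physical dimensions (same units as F); psi_deg: heading angle,
    measured counterclockwise from due north."""
    # Equation (1): field-of-view angles
    alpha = math.atan(W0 / (2 * F))
    beta = math.atan(H0 / (2 * F))
    # Equation (2): ground coverage of the image
    W = 2 * h * math.tan(alpha)
    H = 2 * h * math.tan(beta)
    # Equation (3), as printed: corner offsets in the local ENU frame
    psi = math.radians(psi_deg)
    s, c = math.sin(psi), math.cos(psi)
    corners_enu = [
        (+H * s - W * c, +H * c + W * s),  # A
        (+H * s + W * c, +H * c - W * s),  # B
        (-H * s + W * c, -H * c - W * s),  # C
        (-H * s - W * c, -H * c + W * s),  # D
    ]
    # Equirectangular approximation: ENU meters -> degrees around (lat_o, lon_o)
    m_per_deg_lat = 111_320.0
    m_per_deg_lon = m_per_deg_lat * math.cos(math.radians(lat_o))
    return [(lat_o + dy / m_per_deg_lat, lon_o + dx / m_per_deg_lon)
            for dx, dy in corners_enu]
```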

3.2. Feature Extraction

Based on the alignment results of real-time drone aerial images with satellite reference images, this section focuses on the construction of feature representations of heterologous images, aiming to establish mathematical descriptions for subsequent similarity metrics.

3.2.1. Data Pre-Processing

In computer vision, image features include color, texture, shape, and so on; most existing image annotation and retrieval systems are built on such features [28]. In GNSS spoofing detection, aerial and satellite images are susceptible to interference from lighting conditions, seasonal variations, and resolution differences, so stable features should be preferred for matching analysis. Compared with color features, which are easily affected by the environment, shape and texture features such as road edges, building contours, river courses, and roof lines have stronger anti-interference ability and correlate more significantly with the true geographic location.
To this end, this paper adopts a two-step preprocessing operation: grayscale normalization unifies the pixel luminance range to eliminate lighting interference, and histogram equalization enhances local contrast to sharpen shapes and textures. Figures 5 and 6 show the aerial and satellite images before and after preprocessing.
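The two-step preprocessing can be expressed in a few lines of OpenCV; min-max normalization is one plausible reading of "grayscale normalization", since the paper does not specify the exact operation:

```python
import cv2

def preprocess(path: str):
    """Grayscale normalization followed by histogram equalization (Section 3.2.1)."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)               # discard color
    img = cv2.normalize(img, None, 0, 255, cv2.NORM_MINMAX)    # unify luminance range
    return cv2.equalizeHist(img)                               # enhance local contrast
```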

3.2.2. ResNet50-SE Network

To enhance the characterization of remote sensing image features, this paper proposes an improved ResNet50 network incorporating a channel attention mechanism (CAM). ResNet50 effectively mitigates the gradient degradation problem of deep networks through residual learning. Considering the drone's airborne constraints, the SE module is computationally light and has little impact on model complexity, making it suitable for embedding into lightweight network structures [25]. In this paper, the SE module is introduced after the output of the fourth residual stage (stage 4) of the original ResNet50; the resulting network is named ResNet50-SE (its structure is detailed in Table 2). The SE module adaptively adjusts the feature channel weights and strengthens the network's response to key information in remote sensing images, improving the quality of the feature representation by explicitly modeling the interdependencies between convolutional feature channels [25]. Since stage 4 serves as the high-level feature extraction layer of ResNet50, its output feature map already contains rich semantic information; adding the attention mechanism here fully exploits the guidance of high-level semantic features while avoiding excessive computational overhead in the shallow layers of the network. The implementation is as follows: a dual-branch parallel structure of global average pooling and global maximum pooling captures the global statistics of the channel dimension, and through feature fusion and weight scaling operations the network adaptively adjusts the response weight of each channel.
In addition, to further optimize the computational efficiency, the group convolution technique is used in the channel enhancement module to divide the input channels into 64 groups for parallel computation. This approach reduces the number of model parameters while enhancing the network’s ability to model the local association of multi-channel features.
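A minimal PyTorch sketch of such a channel-attention block is given below; the reduction ratio of 16 is inferred from the fc [128, 2048] entry in Table 2, while applying the 64-group convolution inside the excitation step is an assumption based on the description above:

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Sketch of the paper's SE variant: parallel global average and max
    pooling, feature fusion, and channel re-weighting (Section 3.2.2)."""
    def __init__(self, channels: int = 2048, reduction: int = 16, groups: int = 64):
        super().__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)   # global statistics per channel
        self.max_pool = nn.AdaptiveMaxPool2d(1)
        self.excite = nn.Sequential(              # grouped 1x1 convs act as the FC layers
            nn.Conv2d(channels, channels // reduction, 1, groups=groups, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, groups=groups, bias=False),
        )
        self.gate = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # fuse the two global descriptors, then rescale each channel's response
        s = self.excite(self.avg_pool(x)) + self.excite(self.max_pool(x))
        return x * self.gate(s)
```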

3.2.3. Image Matching

1.
Network Infrastructure
The core of GNSS spoofing detection is to verify whether the geographic location of the aerial image is consistent with that of the satellite image. This paper uses a twin network architecture to match the heterogeneous images, as shown in Figure 7. The model adopts a weight-sharing ResNet50-SE two-branch structure to process aerial and satellite images in parallel; the ResNet50 backbone is loaded with ImageNet pre-trained weights and is not further fine-tuned. By sharing convolutional kernel parameters between the branches, the network maps the features of the heterogeneous images into a unified metric space. After the feature vectors are compressed by a fully connected layer, the Euclidean distance quantifies the feature similarity between the aerial and satellite images. When the distance is below a preset threshold, the images are judged to be a matched pair, i.e., no GNSS spoofing attack has occurred; otherwise, an alarm is triggered.
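The following PyTorch sketch illustrates this twin structure under the stated assumptions (frozen ImageNet-initialized backbone, a trainable fully connected compression layer, Euclidean distance output); the 128-dimensional embedding follows Table 2, and the SE block from Section 3.2.2 is omitted here for brevity:

```python
import torch
import torch.nn as nn
from torchvision import models

class TwinNet(nn.Module):
    """Sketch of the weight-sharing twin network in Figure 7."""
    def __init__(self, embed_dim: int = 128):
        super().__init__()
        backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
        backbone.fc = nn.Identity()               # expose the 2048-d pooled features
        for p in backbone.parameters():           # pre-trained weights are not fine-tuned
            p.requires_grad = False
        self.backbone = backbone
        self.head = nn.Linear(2048, embed_dim)    # trainable compression layer

    def embed(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.backbone(x))

    def forward(self, aerial: torch.Tensor, satellite: torch.Tensor) -> torch.Tensor:
        # both branches share the same weights; output is the per-pair distance
        fa, fs = self.embed(aerial), self.embed(satellite)
        return torch.norm(fa - fs, p=2, dim=1)
```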
2.
Loss Function
This paper uses a contrastive loss function to optimize network training. Targeting the difference between positive samples (genuine matched pairs) and negative samples (non-matching pairs caused by spoofing) in GNSS spoofing detection, this loss enables the network to differentiate geographic consistency from spoofing scenarios by constraining the feature distances of the samples. For an input image pair $(I_a, I_s)$, feature extraction yields the corresponding feature vectors $(f_a, f_s)$ (the ResNet50-SE network's encoding of the images' semantic content), and the loss function is defined as follows [29]:
$$L = \frac{1}{2}\left[(1 - y)\,d^2 + y \cdot \max(\mathrm{margin} - d,\ 0)^2\right] \quad (4)$$
where $d = \lVert f_a - f_s \rVert_2$ is the Euclidean distance between the feature vectors, $y \in \{0, 1\}$ indicates whether the pair is matched ($y = 0$ denotes a true matched pair without spoofing; $y = 1$ denotes a non-matching pair caused by spoofing), and margin is a boundary parameter controlling the minimum separation between the features of negative samples. Through validation-set tuning, $\mathrm{margin} = 4$ is selected to enhance the discrimination of spoofed samples.
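Equation (4) translates directly into a few lines of PyTorch; this is a generic contrastive loss in the sense of [29], not the authors' exact training code:

```python
import torch

def contrastive_loss(d: torch.Tensor, y: torch.Tensor, margin: float = 4.0) -> torch.Tensor:
    """Equation (4). d: Euclidean feature distances; y: 0 for genuine matched
    pairs, 1 for non-matching (spoofed) pairs; margin = 4 as tuned in the paper."""
    pos = (1 - y) * d.pow(2)                          # pull matched pairs together
    neg = y * torch.clamp(margin - d, min=0).pow(2)   # push spoofed pairs apart
    return 0.5 * (pos + neg).mean()
```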

4. Autonomous Positioning of Drones

When a GNSS spoofing attack is detected, the system switches to autonomous localization mode and uses visual information to achieve absolute drone localization.

4.1. Maximum Flight Area Determination

When a GNSS spoofing attack is recognized while the drone is flying a mission, the mission route is known, so the drone's cross-track error (the perpendicular distance between the drone's current position and the preset route) can be calculated from the false GNSS position at the current moment. By analyzing the drone's guidance strategy to judge its possible motion after deviating from the route, and combining this with the drone's maximum flight speed, the maximum area the drone may reach can be determined, which bounds the search area for matching aerial images with satellite images during subsequent absolute positioning.
The drone adopts the L1 guidance algorithm, whose core idea is that after the drone deviates from the route, a point on the route at a constant distance $L_1$ from the current position is selected as the target point to guide the drone back to the route. As shown in Figure 8, in the horizontal plane the intended route is a straight line L. Assume that a GNSS spoofing attack is detected at moment t; the drone's false GNSS position at this moment is $P'(x_0, y_0)$, while its actual position is $P(x, y)$. Affected by the spoofing, the drone believes it is off course and, under the guidance algorithm, steers toward the false target point O′, whereas the actual target point is O. At this moment, the drone's heading angle is $\theta$. The route equation can be expressed as follows:
$$Ax + By + C = 0 \quad (5)$$
where A, B, and C are the route parameters. The cross-track error of the drone's current false position $P'(x_0, y_0)$ with respect to the route can be calculated by the point-to-line distance formula:
$$d = \frac{|Ax_0 + By_0 + C|}{\sqrt{A^2 + B^2}} \quad (6)$$
Let the maximum flight speed of the drone be $v_{max}$. The maximum area the drone may reach during the relevant flight period (i.e., from moment t, when the spoofing attack is detected, to moment t′, when the aerial photograph for absolute localization is acquired) is a rectangular area with the current position $P'(x_0, y_0)$ as a vertex and diagonal length $v_{max} \times (t' - t)$ (shown by the red dashed box in Figure 8).
In the local coordinate system with the current position P ( x , y ) as the origin, the course direction as the y-axis, and the perpendicular course direction as the x-axis, the coordinates of the remaining three vertices of the rectangular region ( A ( x 1 , y 1 ) , B ( x 2 , y 2 ) , C ( x 3 , y 3 ) ) can be expressed as follows:
$$\begin{aligned}
x_1 &= x + v_{max}(t' - t)\sin\theta, & y_1 &= y \\
x_2 &= x + v_{max}(t' - t)\sin\theta, & y_2 &= y + v_{max}(t' - t)\cos\theta \\
x_3 &= x, & y_3 &= y + v_{max}(t' - t)\cos\theta
\end{aligned} \quad (7)$$
where the heading angle $\theta$ is defined as the angle between the drone's heading and the preset course ($0° \le \theta \le 90°$). The maximum flight area thus determined is used for subsequent satellite image cropping and serves as the input for absolute localization.
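A small sketch of Equations (6) and (7), with hypothetical names; it returns the cross-track error and the remaining rectangle vertices in the route-aligned local frame described above:

```python
import math

def cross_track(x0: float, y0: float, A: float, B: float, C: float) -> float:
    """Equation (6): perpendicular distance from the false GNSS fix to the route."""
    return abs(A * x0 + B * y0 + C) / math.hypot(A, B)

def flight_area_vertices(x: float, y: float, theta_deg: float,
                         v_max: float, dt: float):
    """Equation (7): remaining vertices of the maximum flight rectangle.
    dt = t' - t is the time elapsed between spoofing detection and the
    acquisition of the aerial image used for absolute localization."""
    r = v_max * dt
    s, c = math.sin(math.radians(theta_deg)), math.cos(math.radians(theta_deg))
    return [(x + r * s, y),             # A
            (x + r * s, y + r * c),     # B
            (x, y + r * c)]             # C
```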

4.2. Absolute Positioning

Using the local satellite image cropped from the maximum flight area, we match aerial images with satellite images via deep learning methods to achieve vision-based absolute positioning in a GNSS-denied environment.

4.2.1. SuperGlue-Based Feature Matching

SuperGlue is a feature-matching method based on graph neural networks (GNNs). Compared with traditional feature-matching methods such as SIFT (Scale-Invariant Feature Transform) and ORB (Oriented FAST and Rotated BRIEF), SuperGlue considers the global geometric relationships in the image on top of local feature matching, improving matching accuracy [26].
In this paper, SuperGlue is applied to feature matching between aerial and satellite images. First, the SuperPoint network extracts keypoints and descriptors from the two types of images separately. The extracted descriptors are then fed into the SuperGlue network, which globally optimizes the matching relationships between feature points via a graph neural network to find the optimal feature-matching pairs and achieve accurate image matching. This method can effectively handle challenges such as rotation, scale changes, lighting differences, and local occlusion, ensuring more accurate and reliable matching between aerial and satellite images. From the image correspondences obtained by SuperGlue matching, combined with the geographic coordinate information of the satellite image, the position of the aerial image is computed to realize absolute positioning, as shown in Figure 9.
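As an illustration, the matching step could be driven with the SuperGlue authors' open-source reference implementation [26]; the `Matching` module and configuration keys below come from that repository, while the image preprocessing and the chosen parameter values are assumptions:

```python
import torch
from models.matching import Matching  # from the SuperGlue authors' repository

device = 'cuda' if torch.cuda.is_available() else 'cpu'
matching = Matching({
    'superpoint': {'max_keypoints': 1024},
    'superglue': {'weights': 'outdoor'},  # outdoor weights suit aerial/satellite scenes
}).eval().to(device)

def match_pair(aerial: torch.Tensor, satellite: torch.Tensor):
    """aerial/satellite: grayscale tensors in [0, 1], shape 1 x 1 x H x W."""
    with torch.no_grad():
        pred = matching({'image0': aerial.to(device), 'image1': satellite.to(device)})
    kpts0 = pred['keypoints0'][0]   # SuperPoint keypoints in the aerial image
    kpts1 = pred['keypoints1'][0]   # SuperPoint keypoints in the satellite image
    matches = pred['matches0'][0]   # index into kpts1; -1 means unmatched
    valid = matches > -1
    return kpts0[valid], kpts1[matches[valid]]
```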

4.2.2. Post-Processing of Matching Results

Based on SuperGlue's feature-matching results, the absolute position of the drone is obtained by the following post-processing. First, the aerial image is matched against the satellite image at multiple rotation angles, and the valid matching result with the largest number of feature points is selected. For the verified matching pairs, pixel coordinates are converted to geographic coordinates by linear projection, using the relative position $(u, v)$ of the matching points within the satellite image (normalized coordinates satisfying $0 \le u, v \le 1$) and the latitude/longitude bounds of the satellite image. Let the satellite image cover latitudes $\varphi \in [\varphi_{min}, \varphi_{max}]$ and longitudes $\lambda \in [\lambda_{min}, \lambda_{max}]$; the drone's position coordinates $(\varphi_0, \lambda_0)$ are then calculated as follows:
$$\varphi_0 = \varphi_{min} + v \times (\varphi_{max} - \varphi_{min}), \qquad \lambda_0 = \lambda_{min} + u \times (\lambda_{max} - \lambda_{min}) \quad (8)$$
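The conversion in Equation (8) is a simple linear interpolation over the tile bounds; the sketch below assumes v is measured from the southern edge of the tile, and the example coordinates are illustrative only:

```python
def pixel_to_geo(u: float, v: float,
                 lat_min: float, lat_max: float,
                 lon_min: float, lon_max: float):
    """Equation (8): map normalized tile coordinates (u, v) in [0, 1] to
    geographic coordinates; v is assumed to grow from the southern edge."""
    lat = lat_min + v * (lat_max - lat_min)
    lon = lon_min + u * (lon_max - lon_min)
    return lat, lon

# Example: a match at the tile center returns the tile's central coordinates.
print(pixel_to_geo(0.5, 0.5, 28.10, 28.12, 112.90, 112.94))  # (28.11, 112.92)
```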

5. Experimental Results and Analysis

All experiments in this study were carried out on a computing platform equipped with an NVIDIA RTX 4090D GPU (16 GB graphics memory), with the CUDA 12.4 and cuDNN 9.1 acceleration libraries forming the deep learning computing stack. The experimental environment is a 64-bit Ubuntu 22.04 LTS operating system, using a Python 3 interpreter and the PyTorch 2.5.1 deep learning framework.

5.1. GNSS Spoofing Detection

5.1.1. Dataset

The dataset used to train the GNSS spoofing detection model is derived from the open-source data published in [21]. It contains geographically aligned aerial-satellite image pairs covering a wide range of terrain types such as cities, towns, and forests. To verify the model's generalization ability, the dataset is randomly divided into training and testing sets at a ratio of 7:3.

5.1.2. Assessment of Indicators

To comprehensively evaluate the effectiveness of the image similarity metric and its discriminative ability in the binary spoofing detection task, this paper uses the receiver operating characteristic (ROC) curve and its area under the curve (AUC) as core evaluation metrics. Assume the test set contains N samples, with P positive samples (matching image pairs) and N − P negative samples (non-matching image pairs). Given an image similarity threshold $\tau$, the model's outputs can be divided into the following four classes:
  • True Positive (TP): The number of positive samples with similarity higher than τ ;
  • False Positive (FP): The number of negative samples with similarity higher than τ ;
  • True Negative (TN): The number of negative samples with similarity lower than τ ;
  • False Negative (FN): The number of positive samples with similarity lower than τ .
Based on the above definitions, the formulas for calculating the true positive rate (TPR) and false positive rate (FPR) are as follows:
$$TPR(\tau) = \frac{TP(\tau)}{TP(\tau) + FN(\tau)} = \frac{TP(\tau)}{P} \quad (9)$$
$$FPR(\tau) = \frac{FP(\tau)}{FP(\tau) + TN(\tau)} = \frac{FP(\tau)}{N - P} \quad (10)$$
The ROC uses FPR as the x-axis and TPR as the y-axis. As the decision threshold $\tau$ changes, the corresponding $(FPR(\tau), TPR(\tau))$ coordinate points are calculated and plotted on a two-dimensional plane to form a curve. This curve reflects how the model's ability to correctly identify matching image pairs trades off against the proportion of non-matching pairs incorrectly identified as matching, as the decision threshold varies; increasing TPR often comes at the cost of a rising FPR. In scenarios where GNSS spoofing detection imposes strict constraints of a low false positive rate and a high detection rate, the ROC effectively guides the selection of the optimal threshold.
AUC is defined as the area under the ROC bounded by the FPR axis:
$$AUC = \int_0^1 TPR\left(FPR^{-1}(x)\right)\,dx \quad (11)$$
where $FPR^{-1}$ denotes the inverse function of FPR. The AUC ranges over [0, 1]; the closer the value is to 1, the better the model distinguishes between positive and negative samples, which helps complete the heterogeneous image matching task more accurately. AUC summarizes the model's overall ability to distinguish matched from unmatched image pairs across all possible decision thresholds, making it convenient for comparing different model architectures.
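In practice the ROC and AUC can be computed with scikit-learn's `roc_curve` and `auc`; the scores and labels below are illustrative placeholders, not experimental data:

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

# labels: 1 = matching image pair, 0 = non-matching; scores: model similarity
labels = np.array([1, 1, 0, 1, 0, 0, 1, 0])
scores = np.array([0.90, 0.80, 0.35, 0.70, 0.60, 0.20, 0.85, 0.40])

fpr, tpr, thresholds = roc_curve(labels, scores)  # sweeps the threshold tau
print(f"AUC = {auc(fpr, tpr):.3f}")
```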

5.1.3. Ablation Experiment

This paper conducts ablation experiments on the twin network of the ResNet50 architecture, focusing on investigating the impact of introducing the channel attention mechanism (SE module) at different network stages of ResNet50 on image matching performance.
This study trained and evaluated four network models on the same dataset: the ResNet50 base network, and networks that introduce the SE module at the 2nd, 3rd, and 4th residual stages of ResNet50 (ResNet50-SE-stage2, ResNet50-SE-stage3, ResNet50-SE-stage4). Figure 10 shows the classification accuracy curves of these four network models on the test set as a function of training epochs. Except for ResNet50-SE-stage2, the networks converge stably, and test set accuracy continues to increase with training iterations. Compared to the base ResNet50 (optimal test accuracy of 90.6%), ResNet50-SE-stage2 and ResNet50-SE-stage3 show limited performance improvements after introducing the SE module in the lower and middle layers, with optimal test accuracies of only 85.4% and 89.5%, respectively. When the SE module is applied to the deeper fourth stage (ResNet50-SE-stage4), the network achieves the highest optimal test accuracy (93.6%), an improvement of approximately 3.0 percentage points over the base structure. The experimental results indicate that ResNet50-SE-stage4 exhibits the best performance trend during training. For each network, we saved the model parameters at their optimal accuracy and evaluated these optimal models on an independent dataset (distinct from the aforementioned training dataset). The performance comparison of the four models is shown in Table 3.
As shown in Table 3, the network (ResNet50-SE-stage4) that introduced the SE module in stage 4 achieved the highest accuracy, improving by approximately 4.6% compared to the baseline ResNet50 network. Introducing the SE module at shallower stages (stage 2, stage 3) did not yield performance gains in this task and even resulted in lower performance than the baseline network. This indicates that introducing the channel attention mechanism at the deeper stage (stage 4) is the most effective approach for improving the performance of the aerial-satellite image matching task in this study.
To further quantify the model performance, Figure 11 compares the ROCs and corresponding AUC values of the four network models on the test set. The results also show that the ResNet50-SE-stage4 network performs the best, with an AUC value of 0.965, an improvement of 3.3% over the basic ResNet50 architecture (AUC: 0.932). Specifically, when selecting a threshold of 1.76 on the ROC (corresponding to FPR = 8% and TPR = 92%), this network achieves a high true positive rate while maintaining a low false positive rate, validating the effectiveness of the method proposed in Section 3.2.2 for heterogeneous image matching tasks.
To systematically evaluate the performance differences among various attention mechanisms, this study selected three typical modules for comparison: The SE module obtains channel-level attention weights through global average/maximum pooling and uses fully connected layers to establish channel dependencies; CBAM (Convolutional Block Attention Module) employs serial channel and spatial attention, combining max/average pooling with convolution operations to achieve local spatial perception; ECA (Efficient Channel Attention) implements lightweight channel attention via one-dimensional convolution, avoiding the parameter overhead associated with fully connected layers. These three modules represent design approaches focused on channel-specific attention, spatial-channel hybrid attention, and lightweight channel attention, respectively.
To validate the unique advantages of the SE module, we conducted comparative experiments by replacing the CBAM and ECA modules in the same ResNet50-SE-stage4 architecture. The experimental settings were kept consistent, and the model performance comparison results are shown in Table 4.
Experimental results show that the SE module outperforms the CBAM module by 4.07%, 14.89%, and 7.97% in terms of accuracy, recall, and F1 score, respectively, and improves upon the ECA module by 7.56%, 20.20%, and 11.19%, respectively, validating its significant advantages in heterogeneous image matching tasks. The SE module employs a global channel attention mechanism to more effectively model deep semantic associations between heterogeneous images, and its explicitly constructed channel dependencies help mitigate interference caused by differences in imaging resolution. In contrast, CBAM’s spatial-channel hybrid attention may introduce noise due to imaging geometric differences when processing heterogeneous images. ECA’s lightweight design improves computational efficiency but sacrifices the depth of representation of heterogeneous image features.
Based on the above ablation experiment results, this paper selects the optimal network model (ResNet50-SE-stage4) that introduces the SE module in stage 4 and compares its performance with current mainstream deep learning networks in aerial-satellite image matching tasks. For fairness, all comparison networks strictly adopt the original authors’ proposed network structures and load their publicly released pre-trained parameters. The comparison networks include the following: ConvNeXt, DenseNet, EfficientNet, MobileNet, and Swin Transformer. Figure 12 shows the accuracy curves of each network on the test set as a function of training cycles.
As shown in Figure 12, different networks exhibit varying performance in the aerial-satellite image matching task. Among them, the ConvNeXt and Swin Transformer networks experienced performance degradation during training. Therefore, we saved each network's model at its optimal accuracy and evaluated these optimal models on an independent dataset (distinct from the aforementioned training dataset). The performance comparison of the evaluation results is shown in Table 5.
Based on these data, ResNet50-SE-stage4 performs best across the three core metrics of accuracy, recall, and F1 score. In contrast, despite having over 330 million parameters, the large ConvNeXt and Swin Transformer models perform significantly worse than the others, indicating that simply increasing model complexity is insufficient to address the imaging condition differences between aerial and satellite images. The lightweight EfficientNet and MobileNet models achieve good accuracy and high precision with smaller model sizes, but their recall rates are significantly lower, indicating that they tend toward 'conservative' predictions, sacrificing the ability to recall more true matching pairs. DenseNet performs close to optimal with a relatively small model size, but its recall (0.7162) is significantly lower than that of our proposed model, indicating room for improvement in capturing all potential matching pairs. At 98.3 M, the ResNet50-SE-stage4 model maintains reasonable storage overhead at high accuracy compared to the lightweight models. This offers a significant advantage in resource-constrained scenarios, demonstrating a good balance between accuracy and computational efficiency.
In summary, the ResNet50-SE-stage4 network effectively enhances the learning ability of cross-view discriminative features by introducing the SE module, not only improving recall but also maintaining high accuracy and reasonable model complexity, thereby validating its superiority in aerial and satellite image matching tasks.

5.2. Autonomous Positioning of Drones

5.2.1. Dataset

For the drone autonomous positioning task, the experimental data sources are as follows:
  • Satellite images: satellite images of the mission area were acquired from LocaSpace Viewer (LSV) 4.5.3, a Chinese open-source GIS software package.
  • Aerial images: acquired by DJI drones flying in the mission area.

5.2.2. Assessment of Indicators

To evaluate the accuracy of the positioning system, this paper adopts the Haversine formula to calculate the spherical distance error between the original coordinates and the estimated coordinates. The Haversine formula computes the shortest path between two points in the geographic coordinate system while accounting for the curvature of the Earth, making it suitable for evaluating the system's accuracy. Assuming that the estimated coordinates output by the absolute positioning system are $(\varphi_0, \lambda_0)$ and the original coordinates are $(\varphi, \lambda)$, the spherical distance error d between the two can be expressed as follows:
$$d = 2R \times \arcsin\sqrt{\sin^2\left(\frac{\varphi_0 - \varphi}{2}\right) + \cos\varphi_0 \times \cos\varphi \times \sin^2\left(\frac{\lambda_0 - \lambda}{2}\right)} \quad (12)$$
where R is the radius of the Earth. The accuracy of the absolute positioning system can be quantified by computing the error between the two coordinates: the smaller the Haversine distance, the more accurate the system's positioning. To further evaluate the positioning performance, this paper defines 'positioning accuracy' as the proportion of samples localized within a given error threshold, calculated as follows:
$$acc = \frac{1}{N}\sum_{i=1}^{N} I(d_i \le \delta) \quad (13)$$
where N is the total number of test samples and $I(\cdot)$ is the indicator function: $I(d_i \le \delta) = 1$ when the spherical distance error $d_i$ of the i-th sample is less than or equal to the threshold $\delta$, indicating accurate localization; otherwise it is 0, indicating a localization failure.
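Both metrics are straightforward to implement; a minimal sketch, with the mean Earth radius as an assumption:

```python
import math

def haversine(lat1: float, lon1: float, lat2: float, lon2: float,
              R: float = 6_371_000.0) -> float:
    """Equation (12): spherical distance in meters (R = mean Earth radius)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi, dlmb = p2 - p1, math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * R * math.asin(math.sqrt(a))

def localization_accuracy(errors, delta: float) -> float:
    """Equation (13): fraction of samples with error within threshold delta."""
    return sum(e <= delta for e in errors) / len(errors)
```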

5.2.3. Matching Results

To validate the effectiveness of the maximum flight area restriction strategy, satellite images were obtained based on the constraints outlined in Section 4.1 and combined with real-time aerial images as network inputs to achieve absolute positioning. Compared to existing methods [30] that directly use large-scale satellite images, our strategy reduces the search space, thereby improving positioning efficiency and accuracy. Figure 13 shows a set of test data, with aerial images on the left and satellite images on the right. The figure labels the estimated and actual latitude and longitude coordinates.
Table 6 provides the relevant acquisition parameters for the example image shown in Figure 13:
Through the analysis of a large number of test samples, the statistical results of the spherical distance error of the system’s positioning are shown in Figure 14. This figure clearly shows the distribution range of the error. Table 7 summarizes the positioning accuracy rate (i.e., the proportion of samples with errors less than or equal to the threshold) at different distance thresholds. The results show that even under a strict 12 m threshold, the system still achieves an accuracy rate of 84.62%; when the threshold is relaxed to 15 m, the accuracy rate improves to 89.74%.
We conducted a rigorous performance comparison with an existing advanced method [30], and the results are shown in Table 8. Under the same 45 m threshold, the localization accuracy of the method in [30] was 88.0%, while our method achieved 100%, significantly improving the robustness of the system. The average positioning error of the method in [30] is 13.45 m; in contrast, our method reduces the average error to 8.46 m, a relative reduction of approximately 37%.
Based on the above experimental results, the superiority of the maximum flight area restriction strategy in improving the visual positioning performance of drones has been fully verified.

6. Conclusions

To address the threat of GNSS spoofing attacks faced by drones, this paper proposes a closed-loop method for spoofing detection and autonomous positioning based on visual image matching. The core elements of the method are as follows:
  • Deception Detection: real-time aerial images of the drone are matched against satellite images corresponding to the GNSS positioning to detect deception. Experimental results show that the method's deception detection accuracy reaches 89.5%.
  • Autonomous Positioning: after detecting a spoofing attack, the system switches to autonomous positioning mode. Absolute positioning is achieved by matching aerial images with satellite images of a specific region. Experimental results show that 89.7% of positioning errors are kept within 13.9 m.
This study provides a comprehensive solution for the safe operation of drones in environments where GNSS signals are unreliable. However, this method still faces challenges when applied to real-world complex applications.
Firstly, computational efficiency is the primary bottleneck. The significant computational overhead associated with high-precision image feature extraction and large-scale feature matching makes it difficult to meet real-time processing requirements on resource-constrained embedded drone platforms, potentially leading to system response delays and thereby impacting the timeliness of flight control and decision-making. Secondly, scene robustness requires improvement. In environments with scarce texture or structural features (such as large bodies of water, deserts, homogeneous farmland, or newly developed urban areas), the reliability of feature extraction and matching by the algorithm significantly decreases. Extreme lighting conditions (such as strong reflections or low light at night) and adverse weather conditions (such as dense fog) further degrade image quality, exacerbating feature instability and matching failure rates. Additionally, dynamic scene interference cannot be ignored. The time lag between satellite images and real-time aerial images causes dynamic targets such as road vehicles, pedestrians, temporary construction sites, or vegetation growth to become continuous matching noise sources, reducing detection and positioning accuracy.
To overcome these limitations, promote the practical application of this method, and enhance its robustness, efficiency, and practicality in complex real-world environments, future research will focus on the following directions.
  • Lightweight models and online incremental learning: Develop computationally efficient lightweight feature extraction and matching networks suitable for embedded platforms; introduce an online incremental learning mechanism to enable the system to continuously learn new scene features encountered during flight (e.g., temporary structures, seasonal changes), dynamically update the local feature database or matching model, reduce absolute reliance on static reference images, and enhance scene adaptability.
  • Scene Adaptation and Robust Matching Strategies: Design scene-aware adaptive feature selection and matching strategies. For example, use high-dimensional features in texture-rich areas to ensure accuracy and switch to texture-insensitive features (such as edges or contours) or combine region segmentation information for matching in low-texture areas.
  • Hardware acceleration and system optimization: Explore hardware-level optimizations based on FPGAs or dedicated AI acceleration chips for core visual computation modules (feature extraction, matching) to achieve a balance between performance and power consumption; optimize system architecture, such as adopting key frame selection strategies to reduce redundant computations.
In summary, the visual image matching method proposed in this paper provides an effective technical approach for addressing drone GNSS spoofing detection and positioning recovery issues, demonstrating good performance under specific conditions. Gaining a deep understanding of and addressing its computational overhead and scene robustness limitations are key to advancing this method toward practical application. Future research will focus on integrating multiple technologies, including lightweight models, online learning, adaptive strategies, and hardware acceleration, to develop a more robust and adaptable anti-deception and autonomous positioning system for drones capable of withstanding complex environmental challenges.

Author Contributions

Conceptualization, J.Z. and C.M.; methodology, J.Z. and C.M.; software, J.Z. and M.H.; validation, C.Z. and Z.L.; formal analysis, J.Z. and Z.L.; investigation, C.M.; resources, M.H. and C.Z.; data curation, J.Z. and Z.L.; writing—original draft preparation, J.Z.; writing—review and editing, J.Z., M.H. and C.M.; visualization, J.Z.; supervision, M.H. and C.M.; project administration, C.Z.; funding acquisition, C.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (no. 62403481).

Data Availability Statement

Data are contained within the article. The current research is limited to the field of UAV navigation security, which is beneficial for enhancing the reliability of civilian drone operations and does not pose a threat to public health or national security. The authors acknowledge the dual-use potential of research involving GNSS spoofing detection and drone repositioning techniques and confirm that all necessary precautions have been taken to prevent potential misuse. As an ethical responsibility, the authors strictly adhere to relevant national and international laws regarding dual-use research of concern (DURC). The authors advocate for responsible deployment, ethical considerations, regulatory compliance, and transparent reporting to mitigate misuse risks and foster beneficial outcomes.

Acknowledgments

The authors would like to thank the Editor, Associate Editor, and anonymous reviewers for their helpful comments and suggestions for improving this paper.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Chan, K.W.; Nirmal, U.; Cheaw, W.G. Progress on Drone Technology and Their Applications: A Comprehensive Review. AIP Conf. Proc. 2018, 2030, 020308. [Google Scholar] [CrossRef]
  2. Maddikunta, P.K.R.; Hakak, S.; Alazab, M.; Bhattacharya, S.; Gadekallu, T.R.; Khan, W.Z.; Pham, Q.-V. Unmanned Aerial Vehicles in Smart Agriculture: Applications, Requirements, and Challenges. IEEE Sens. J. 2021, 21, 17608–17619. [Google Scholar] [CrossRef]
  3. Toroi, G.-I. The Persistent Effectiveness of Decoys in Land Forces’ Operations—Lessons from The Russian-Ukrainian Conflict–. Rom. Mil. Think. 2024, 2024, 26–49. [Google Scholar] [CrossRef]
  4. Khan, S.Z.; Mohsin, M.; Iqbal, W. On GPS Spoofing of Aerial Platforms: A Review of Threats, Challenges, Methodologies, and Future Research Directions. Peerj Comput. Sci. 2021, 7, e507. [Google Scholar] [CrossRef] [PubMed]
  5. Volpe, J.A. Vulnerability Assessment of the Transportation Infrastructure Relying on the Global Positioning System; National Transportation Systems Center: Cambridge, MA, USA, 2001. [Google Scholar]
  6. Humphreys, T.E.; Ledvina, B.M.; Psiaki, M.L.; O’Hanlon, B.W.; Kintner, P.M., Jr. Assessing the Spoofing Threat: Development of a Portable GPS Civilian Spoofer. In Proceedings of the 2008 ION GNSS Conference, Savannah, GA, USA, 16–19 September 2008; pp. 2314–2325. [Google Scholar]
  7. He, L.; Li, W.; Guo, C.; Niu, R. Civilian Unmanned Aerial Vehicle Vulnerability to GPS Spoofing Attacks. In Proceedings of the 2014 Seventh International Symposium on Computational Intelligence and Design, Washington, DC, USA, 13–14 December 2014; Volume 2, pp. 212–215. [Google Scholar]
  8. Hartmann, K.; Steup, C. The Vulnerability of UAVs to Cyber Attacks—An Approach to the Risk Assessment. In Proceedings of the 2013 5th International Conference on Cyber Conflict (CYCON 2013), Tallinn, Estonia, 4–7 June 2013; pp. 1–23. [Google Scholar]
  9. Ma, C.; Qu, Z.; Li, X.; Liu, Z.; Zhou, C. Machine Learning and UAV Path Following Identification Algorithm Based on Navigation Spoofing. Meas. Sci. Technol. 2023, 34, 125034. [Google Scholar] [CrossRef]
  10. Guo, Y.; Wu, M.; Tang, K.; Tie, J.; Li, X. Covert Spoofing Algorithm of UAV Based on GPS/INS-Integrated Navigation. IEEE Trans. Veh. Technol. 2019, 68, 6557–6564. [Google Scholar] [CrossRef]
  11. Ma, C.; Yang, J.; Chen, J.; Zhou, C. Path Following Identification of Unmanned Aerial Vehicles for Navigation Spoofing and Its Application. ISA Trans. 2021, 108, 393–405. [Google Scholar] [CrossRef] [PubMed]
  12. Jafarnia-Jahromi, A.; Broumandan, A.; Nielsen, J.; Lachapelle, G. GPS Vulnerability to Spoofing Threats and a Review of Antispoofing Techniques. Int. J. Navig. Obs. 2012, 2012, 127072. [Google Scholar] [CrossRef]
  13. Akos, D.M. Who’s Afraid of the Spoofer? GPS/GNSS Spoofing Detection via Automatic Gain Control (AGC). Navigation 2012, 59, 281–290. [Google Scholar] [CrossRef]
  14. Maier, D.; Frankl, K.; Blum, R.; Eissfeller, B.; Pany, T. Preliminary Assessment on the Vulnerability of NMA-Based Galileo Signals for a Special Class of Record & Replay Spoofing Attacks. In Proceedings of the 2018 IEEE/ION Position, Location and Navigation Symposium (PLANS), Monterey, CA, USA, 23–26 April 2018; pp. 63–71. [Google Scholar]
  15. Fernández-Hernández, I.; Rijmen, V.; Seco-Granados, G.; Simon, J.; Rodríguez, I.; Calle, J.D. A Navigation Message Authentication Proposal for the Galileo Open Service. Navigation 2016, 63, 85–102. [Google Scholar] [CrossRef]
  16. Fernández-Hernández, I.; Seco-Granados, G. Galileo NMA Signal Unpredictability and Anti-Replay Protection. In Proceedings of the 2016 International Conference on Localization and GNSS (ICL-GNSS), Barcelona, Spain, 28–30 June 2016; pp. 1–5. [Google Scholar]
  17. Hu, Y.; Bian, S.; Ji, B.; Li, J. GNSS Spoofing Detection Technique Using Fraction Parts of Double-Difference Carrier Phases. J. Navig. 2018, 71, 1111–1129. [Google Scholar] [CrossRef]
  18. Ye, X.; Song, F.; Zhang, Z.; Zeng, Q. A Review of Small UAV Navigation System Based on Multisource Sensor Fusion. IEEE Sens. J. 2023, 23, 18926–18948. [Google Scholar] [CrossRef]
  19. Tanıl, Ç.; Khanafseh, S.; Joerger, M.; Pervan, B. Kalman Filter-Based INS Monitor to Detect GNSS Spoofers Capable of Tracking Aircraft Position. In Proceedings of the 2016 IEEE/ION Position, Location and Navigation Symposium (PLANS), Savannah, GA, USA, 11–14 April 2016; pp. 1027–1034. [Google Scholar]
  20. Varshosaz, M.; Afary, A.; Mojaradi, B.; Saadatseresht, M.; Parmehr, E.D. Spoofing Detection of Civilian UAVs Using Visual Odometry. ISPRS Int. J. Geo-Inf. 2020, 9, 6. [Google Scholar] [CrossRef]
  21. Xue, N.; Niu, L.; Hong, X.; Li, Z.; Hoffaeller, L.; Pöpper, C. DeepSIM: GPS Spoofing Detection on UAVs Using Satellite Imagery Matching. In Proceedings of the 36th Annual Computer Security Applications Conference; Association for Computing Machinery, New York, NY, USA, 7–11 December 2020; pp. 304–319. [Google Scholar]
  22. Davidovich, B.; Nassi, B.; Elovici, Y. Towards the Detection of GPS Spoofing Attacks against Drones by Analyzing Camera’s Video Stream. Sensors 2022, 22, 2608. [Google Scholar] [CrossRef] [PubMed]
  23. Ma, J.; Jiang, X.; Fan, A.; Jiang, J.; Yan, J. Image Matching from Handcrafted to Deep Features: A Survey. Int. J. Comput. Vis. 2021, 129, 23–79. [Google Scholar] [CrossRef]
  24. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; IEEE: New York, NY, USA, 2016; pp. 770–778. [Google Scholar]
  25. Hu, J.; Shen, L.; Albanie, S.; Sun, G.; Wu, E. Squeeze-and-Excitation Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 2011–2023. [Google Scholar] [CrossRef] [PubMed]
  26. Sarlin, P.-E.; DeTone, D.; Malisiewicz, T.; Rabinovich, A. SuperGlue: Learning Feature Matching with Graph Neural Networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 4938–4947. [Google Scholar]
  27. Paine, D.P.; Kiser, J.D. Aerial Photography and Image Interpretation; John Wiley & Sons: Hoboken, NJ, USA, 2012; ISBN 978-0-470-87938-2. [Google Scholar]
  28. Tian, D. A Review on Image Feature Extraction and Representation Techniques. Int. J. Multimed. Ubiquitous Eng. 2013, 8, 385–396. [Google Scholar]
  29. Hadsell, R.; Chopra, S.; LeCun, Y. Dimensionality Reduction by Learning an Invariant Mapping. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), New York, NY, USA, 17–22 June 2006; Volume 2, pp. 1735–1742. [Google Scholar]
  30. Liu, R.; Liu, H.; Meng, X.; Li, T.; Hancock, C.M. Detecting GNSS Spoofing and Re-Localization on UAV Based on Imagery Matching. Meas. Sci. Technol. 2024, 36, 016320. [Google Scholar] [CrossRef]
Figure 1. Overall system architecture.
Figure 2. Aerial imaging schematic.
Figure 3. Vertical section of aerial projection.
Figure 4. Aerial view diagram.
Figure 5. Aerial image before and after pre-processing.
Figure 6. Satellite image before and after pre-processing.
Figure 7. Twin network.
Figure 8. Determination of the maximum flight area.
Figure 9. Absolute positioning.
Figure 10. Accuracy curves of four network models as a function of training cycle.
Figure 11. Comparison of ROC and AUC scores for four network models.
Figure 12. Accuracy curves of different network models as a function of training cycle.
Figure 13. Absolute positioning results.
Figure 14. Positioning error.
Table 1. Comparison of signal-level spoofing detection methods.

| Category | Advantages | Limitations |
|---|---|---|
| Signal processing | Easy to deploy, low latency | Effective only in the transition phase; easily evaded by adaptive spoofing |
| Encrypted signal | High theoretical reliability | Requires modification of the signaling system; difficult to promote for civilian use |
| Signal spatial geometric relationships | Strong resistance to synchronization spoofing | High hardware cost, strict motion constraints |
Table 2. ResNet50-SE network model.

| Layer Name | ResNet50-SE | Output Size |
|---|---|---|
| input |  | 720 × 960 |
| conv1 | 7 × 7, 64, stride 2 | 360 × 480 |
| conv2_x | 3 × 3 max pool, stride 2; [1 × 1, 64; 3 × 3, 64; 1 × 1, 256] × 3 | 180 × 240 |
| conv3_x | [1 × 1, 128; 3 × 3, 128; 1 × 1, 512] × 4 | 90 × 120 |
| conv4_x | [1 × 1, 256; 3 × 3, 256; 1 × 1, 1024] × 6 | 45 × 60 |
| conv5_x | [1 × 1, 512; 3 × 3, 512; 1 × 1, 2048] × 3; SE: fc, [128, 2048] | 23 × 30 |
Table 3. Comparison of the performance of four network models.

| Model | Accuracy | Precision | Recall | F1 Score | Model Size |
|---|---|---|---|---|---|
| ResNet50 | 0.8488 | 0.8526 | 0.8709 | 0.8617 | 110.1 M |
| ResNet50-SE-stage2 | 0.7965 | 0.7431 | 0.9204 | 0.8223 | 94.7 M |
| ResNet50-SE-stage3 | 0.8488 | 0.8750 | 0.8750 | 0.8750 | 95.7 M |
| ResNet50-SE-stage4 (ours) | 0.8953 | 0.9473 | 0.8372 | 0.8888 | 98.3 M |
Table 4. Comparison of the performance of three network models.

| Model | Accuracy | Precision | Recall | F1 Score | Model Size |
|---|---|---|---|---|---|
| ResNet50-CBAM-stage4 | 0.8546 | 0.9814 | 0.6883 | 0.8091 | 96.0 M |
| ResNet50-ECA-stage4 | 0.8197 | 1.0000 | 0.6352 | 0.7769 | 96.3 M |
| ResNet50-SE-stage4 (ours) | 0.8953 | 0.9473 | 0.8372 | 0.8888 | 98.3 M |
Table 5. Comparison of different network model performance.

| Model | Accuracy | Precision | Recall | F1 Score | Model Size |
|---|---|---|---|---|---|
| ConvNeXt | 0.4941 | 0.5000 | 0.3103 | 0.3829 | 337.0 M |
| DenseNet | 0.8720 | 0.9814 | 0.7162 | 0.8281 | 54.5 M |
| EfficientNet | 0.8139 | 0.9629 | 0.6341 | 0.7647 | 19.0 M |
| MobileNet | 0.7790 | 0.9629 | 0.5909 | 0.7323 | 14.4 M |
| Swin Transformer | 0.6162 | 0.6481 | 0.4268 | 0.5147 | 334.0 M |
| ResNet50-SE-stage4 (ours) | 0.8953 | 0.9473 | 0.8372 | 0.8888 | 98.3 M |
Table 6. Aerial photography and satellite image parameter information.

| Category | Parameter | Value |
|---|---|---|
| Aerial Image | Image Resolution (px) | 5472 × 3078 |
| | Bands | RGB |
| | Flight Heading Angle (°) ¹ | −0.600 |
| | Flight Altitude (m) | 114.178 |
| | Camera FOV (°) | 84.000 |
| | Camera Pitch Angle (°) ² | −90.000 |
| | Preprocessing | Grayscale Normalization |
| Satellite Image | Spatial Resolution (m/px) | 0.474 |
| | Bands | RGB |
| | Acquisition Date | 2022-03 |
| | Preprocessing | Grayscale Normalization |

¹ The flight heading angle is 0° when facing true north, with clockwise being positive. ² The camera pitch angle is 0° when horizontal and −90° when pointing vertically downward.
Table 7. Localization accuracy at different thresholds.

| Threshold (m) | ≤8 | ≤10 | ≤12 | ≤15 |
|---|---|---|---|---|
| Localization Accuracy | 48.72% | 64.10% | 84.62% | 89.74% |
Table 8. Comparison with the positioning performance of the reference method [30].

| Method | Threshold (m) | Localization Accuracy | Mean Error (m) |
|---|---|---|---|
| Baseline [30] | ≤45 | 88.00% | 13.45 |
| Ours | ≤45 | 100.00% | 8.46 |
