Article

Tightly-Coupled Air-Ground Collaborative System for Autonomous UGV Navigation in GPS-Denied Environments

1 School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
2 Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China, Huzhou 313001, China
* Author to whom correspondence should be addressed.
Drones 2025, 9(9), 614; https://doi.org/10.3390/drones9090614
Submission received: 26 July 2025 / Revised: 23 August 2025 / Accepted: 28 August 2025 / Published: 31 August 2025
(This article belongs to the Section Artificial Intelligence in Drones (AID))

Highlights

What are the main findings?
  • A stateful, ID-based mapping mechanism prevents data redundancy from video streams by assigning a unique ID to each detected obstacle, ensuring it is recorded on the global map only once.
  • An obstacle inflation algorithm translates an abstract obstacle map into a robust, navigation-ready map. Artificially expanding the footprint of each obstacle based on the UGV’s physical dimensions ensures that any planned path is collision-free.
What are the implications of the main findings?
  • The ID-based filtering guarantees a clean and non-redundant map, which is crucial for effective path planning. This prevents the system from becoming confused by repeatedly mapping static objects, which would otherwise render the map unusable.
  • The obstacle inflation process ensures that the theoretically optimal path found by the A* algorithm is also physically safe for the UGV. This is a critical step that bridges the gap between abstract planning and real-world execution by preventing the UGV’s body from colliding with obstacles.

Abstract

Autonomous navigation for unmanned vehicles in complex, unstructured environments remains challenging, especially in GPS-denied or obstacle-dense scenarios, limiting their practical deployment in logistics, inspection, and emergency response applications. To overcome these limitations, this paper presents a tightly integrated air-ground collaborative system comprising three key components: (1) an aerial perception module employing a YOLOv8-based vision system onboard the UAV to generate real-time global obstacle maps; (2) a low-latency communication module utilizing FAST DDS middleware for reliable air-ground data transmission; and (3) a ground navigation module implementing an A* algorithm for optimal path planning coupled with closed-loop control for precise trajectory execution. The complete system was physically implemented using cost-effective hardware and experimentally validated in cluttered environments. Results demonstrated successful UGV autonomous navigation and obstacle avoidance relying exclusively on UAV-provided environmental data. The proposed framework offers a practical, economical solution for enabling robust UGV operations in challenging real-world conditions, with significant potential for diverse industrial applications.

1. Introduction

The rapid proliferation of autonomous systems—particularly unmanned aerial vehicles (UAVs) and unmanned ground vehicles (UGVs)—is driving a paradigm shift across industries toward significantly enhanced operational efficiency and safety [1,2]. However, the effectiveness of individual autonomous agents remains constrained by inherent limitations in perception range and computational capacity. To overcome these challenges, research has increasingly focused on multi-agent cooperation, with air-ground collaborative systems emerging as a critical frontier for complex applications such as large-scale logistics, environmental monitoring, and disaster response. By integrating the global situational awareness of UAVs with the localized operational capabilities of UGVs, such systems enable a more holistic approach to task execution. Thus, the development of robust cooperative path planning frameworks for these heterogeneous teams is essential to fully realize their potential in complex, unstructured environments [1,2,3].
Effective path planning critically depends on accurate and comprehensive environmental perception. For ground-based systems such as UGVs, navigation in unknown or GPS-denied environments is significantly hindered by their limited field of view, which obstructs the generation of globally optimal paths. Cooperative perception overcomes this limitation by fusing multi-modal sensory data from diverse vantage points. In recent years, vision-based navigation has emerged as a prominent solution, where UAVs utilize onboard cameras to capture aerial perspectives of the environment [4,5]. These systems increasingly rely on deep learning models—such as You Only Look Once (YOLO)—for real-time object detection and semantic segmentation, converting raw visual inputs into structured environmental maps [6,7]. By enabling UAVs to serve as “eyes in the sky” for UGVs, this shared intelligence delivers the global situational awareness necessary for safe and efficient ground navigation [8,9].
Research on cooperative path planning has advanced considerably, with methodologies primarily based on graph-search algorithms and heuristic functions. Among these, the A* algorithm remains a cornerstone due to its optimality and efficiency in computing least-cost paths within given map representations [10,11,12]. In cooperative scenarios, the challenge expands from single-agent pathfinding to encompass multi-agent task allocation and trajectory deconfliction. State-of-the-art approaches typically employ a hierarchical framework: a high-level planner—often centralized on a UAV or ground station—processes global information to assign tasks and define key waypoints, while low-level controllers onboard each agent handle local navigation [12,13]. This division of responsibilities is particularly crucial in air-ground collaboration, where UAVs generate global paths while UGVs ensure precise trajectory execution and local obstacle avoidance.
Despite considerable advancements in individual components such as perception and planning algorithms, significant research gaps remain in their systemic integration and practical implementation. Current frameworks are predominantly validated through simulations, with few low-cost, physically deployed systems demonstrating a complete end-to-end workflow—from perception to execution—in real-world environments [14,15]. This disparity between theoretical algorithms and practical deployment frequently neglects mission-critical challenges, including communication latency and hardware constraints. To bridge this gap, this paper presents an integrated air-ground collaborative navigation system with the following key contributions:
  • A tightly-coupled, end-to-end system architecture that seamlessly integrates an aerial perception module, a low-latency communication module, and a ground navigation module into a single, automated workflow.
  • A real-time stateful mapping framework employing a custom-trained YOLOv8 model with unique ID-based filtering to generate dynamic global obstacle maps, ensuring efficient and non-redundant environmental representation for UGV navigation.
  • A validated low-cost prototype demonstrating system feasibility through physical implementation using accessible hardware (Raspberry Pi) and robust middleware (FAST DDS), specifically designed for deployment in resource-constrained environments.
The remainder of this paper is organized as follows: Section 2 describes the problem of air-ground collaborative navigation and presents the system modeling. Section 3 elaborates on our proposed methodology, detailing the stateful mapping algorithm and the A*-based path planner, supplemented with flowcharts for clarity. Section 4 presents the experimental setup, performance metrics, and a comparative analysis of our results against several baseline methods. Section 5 reports the end-to-end system operation results, and Section 6 concludes the paper by summarizing our findings and discussing directions for future work.
The primary novelty of this work is not the development of a singular component, but rather the successful systemic integration of perception, communication, and navigation. We present a low-cost, end-to-end framework that has been validated through physical implementation. In contrast to frameworks that remain in simulation or rely on high-cost hardware, our system demonstrates practical feasibility using accessible components. Furthermore, we introduce key algorithmic enhancements to solve real-world challenges: (1) a stateful ID-based mapping technique to ensure a non-redundant environmental map from a live video feed, and (2) an obstacle inflation algorithm that guarantees the physical safety of the planned path. This holistic approach bridges the critical gap between theoretical algorithms and practical, deployable autonomous systems.

2. Modeling and Problem Description

This section provides a formal problem formulation for the air-ground collaborative autonomous navigation framework investigated in this work. We first present a high-level architectural overview of the proposed system, followed by rigorous mathematical modeling of the operational environment. The section concludes with a precise problem statement and explicit declaration of our fundamental assumptions.

2.1. System Overview

The proposed system consists of a heterogeneous robotic team, composed of UAVs and UGVs, designed to operate cooperatively in unknown environments. The system employs a complementary task allocation framework where each platform’s capabilities are strategically leveraged: the UAV provides aerial perception while the UGVs execute ground-level navigation tasks (as illustrated in Figure 1). The air-ground coordination system consists of an aerial perception and mapping (APM) module, ground navigation and control (GNC) module, and real-time air-ground communication (RTAGC) module.
In the APM module, the UAV serves as the system’s visual perception platform. Its primary role is to fly at a high altitude, survey the operational area using an onboard camera, and perform real-time environmental perception. It is responsible for detecting obstacles and constructing a global 2D map of the environment, which is then transmitted to the UGV.
In the GNC module, the UGV serves as the system’s mobility platform. It is a ground-based mobile robot with the primary objective of navigating from a starting point to a designated goal. Crucially, it relies exclusively on the map data provided by the UAV to plan and execute its path, as it is assumed to have limited or no onboard long-range perception capabilities.
The RTAGC module establishes a low-latency, high-reliability wireless link between the UAV and UGV, enabling real-time bidirectional transmission of map data and control signals.

2.2. Problem Description

We model the operational environment $E$ as a bounded two-dimensional space containing a set of static, impassable obstacles $O = \{o_1, o_2, \ldots, o_n\}$, $o_i \subset \mathbb{R}^2$. The precise locations and geometries of these obstacles are unknown beforehand. For computational tractability in path planning, this continuous space is discretized into a 2D grid map $M$. This map consists of a grid of cells, where each cell $c(i,j) \in M$ can be in one of three states: $S(c) = \mathrm{FREE}$: the cell is known to be traversable; $S(c) = \mathrm{OCCUPIED}$: the cell is known to be part of an obstacle and is not traversable; or $S(c) = \mathrm{UNKNOWN}$: the state of the cell has not yet been determined.
Initially, all cells in $M$ are set to $\mathrm{UNKNOWN}$. As the UAV perceives the environment, it updates the state of the corresponding cells to either $\mathrm{FREE}$ or $\mathrm{OCCUPIED}$.
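As a concrete illustration, the grid map $M$ and its three cell states can be represented with a small NumPy sketch; the 100 × 100 grid size and the integer state codes are assumptions for illustration, not values prescribed by the paper.

import numpy as np

# Integer codes for the three cell states of the grid map M (assumed encoding).
UNKNOWN, FREE, OCCUPIED = -1, 0, 1

# The bounded environment E discretized into an (assumed) 100 x 100 grid,
# with every cell initialized to UNKNOWN.
grid = np.full((100, 100), UNKNOWN, dtype=np.int8)

def update_cell(grid, i, j, occupied):
    """Update cell c(i, j) once the UAV has observed it."""
    grid[i, j] = OCCUPIED if occupied else FREE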
Given the system architecture and environment model, the collaborative autonomous navigation problem can be formally stated as follows:
Let the UGV’s configuration (state) in the environment $E$ be defined by its position and orientation, $q = (x, y, \theta)$, with a starting configuration $q_{start}$ and a target configuration $q_{goal}$. A path, denoted $\gamma$, is a continuous sequence of configurations $q(t)$ starting at $q_{start}$ and ending at $q_{goal}$, where $t$ is a parameter representing progression along the path. The primary objective is to find the optimal path $\gamma^*$, the path that minimizes the total geometric length. To clarify the relationship between a path and its length: $\gamma$ denotes the entire continuous path, like a complete road from point A to point B; $ds$ denotes an infinitesimally small segment along $\gamma$; and $L = \int_{\gamma} ds$ is the total length of $\gamma$. The optimization problem is therefore to find the specific path with the minimum possible length among all valid paths, formally stated as:
$\gamma^* = \arg\min_{\gamma} \int_{\gamma} ds$
where $\arg\min_{\gamma}$ denotes the argument of the minimum, i.e., the path among all feasible paths that yields the minimum value of the length integral $\int_{\gamma} ds$; the resulting optimal path is $\gamma^*$.
This optimization problem is subject to the following constraints:
(1) Collision Avoidance: The path must be entirely collision-free. For any configuration $q(t)$ along the path $\gamma$, the region occupied by the UGV, denoted $\mathcal{A}(q(t))$, must not intersect any known obstacle region. Formally:
$\mathcal{A}(q(t)) \cap O = \emptyset, \quad \forall t \in [0, T].$
This constraint necessitates planning in the UGV’s configuration space, which is practically achieved by inflating the obstacles on the 2D grid map M to account for the UGV’s physical dimensions.
(2) Information Dependency: The UGV’s path planner, $P$, generates the path $\gamma$ based solely on the map $M_{UAV}$ constructed and provided by the UAV. The UGV does not perform its own environmental mapping.
To maintain a clear scope for this study, the following key assumptions are made:
  • Static Environment: All obstacles within the environment are stationary. The proposed system is not designed to handle dynamic or moving obstacles.
  • Planar Terrain: The UGV operates on a flat, 2D plane, simplifying the navigation problem from 3D to 2D path planning.
  • Reliable Communication: The wireless link between the UAV and UGV is assumed to be stable and sufficiently low-latency within the operational range, ensuring timely map updates.
  • Accurate Perception and Localization: The UAV’s perception system is assumed to accurately detect obstacles, and the geometric transformation from the camera’s image plane to the world frame is assumed to be pre-calibrated and precise. Furthermore, both the UAV and UGV are assumed to have reliable access to their own localization data (i.e., their position and orientation in the environment).
By clearly defining these constraints and assumptions, this study isolates the core scientific challenge: the design and validation of a tightly-coupled, end-to-end workflow for air-ground collaboration. This approach is significant because it establishes a foundational and reproducible framework using low-cost, accessible technologies. It proves the feasibility of the fundamental concept—a UGV navigation system relying solely on UAV-provided intelligence—without the confounding variables of dynamic environments or imperfect localization. This work therefore serves as a critical baseline and a modular platform, enabling future research to independently address the challenges of dynamic obstacle avoidance, sensor fusion, or operations in more complex terrains, building upon the integrated system validated here.

3. The Proposed Air-Ground Collaborative Method

This section presents the core technical design of our proposed air-ground collaborative navigation system. We detail the methodology in three main parts: the aerial mapping (APM) module responsible for real-time environment perception and map construction, the ground navigation (GNC) module that utilizes this map for safe and efficient path planning and execution, and real-time air-ground communication (RTAGC).

3.1. Aerial Perception and Mapping Module

This module functions as the “eyes” of the system. Its primary task is to use a vision system mounted on UAVs to identify ground obstacles in real-time and convert this information into a machine-readable 2D coordinate map. Figure 2 shows a photograph of the custom-built UAV used in our experiments.
To rapidly and accurately identify ground objects from an aerial perspective, the system employs the YOLOv8 deep learning model [6,7]. Instead of using a generic pre-trained model, we created a dedicated dataset for our specific scenario, labeling objects such as trees as “obstacles” for model training. The trained model efficiently processes a live video stream from the UAV to accurately detect the position and class of key targets.
Images captured by the UAV camera are subject to perspective distortion and do not directly reflect the true ground positions of objects. To address this, we introduced perspective transformation technology [4,5]. This technique uses a calibrated transformation matrix to correct the tilted raw image into a unified, distortion-free bird’s-eye view. The image coordinates of all detected obstacles are subsequently converted into real-world coordinates. The coordinate transformation equation is as follows:
$\begin{bmatrix} x_h \\ y_h \\ w_h \end{bmatrix} = M \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}.$
where $[x_h, y_h, w_h]^T$ is the homogeneous coordinate on the target plane after transformation by $M$, $M$ is a $3 \times 3$ perspective transformation matrix, and $[x, y, 1]^T$ is the homogeneous coordinate of a point in the original image.
The result of the matrix multiplication is a homogeneous coordinate vector, and the coordinates $(X, Y)$ mapped to the real world can be obtained by the following equation:
$X = \frac{x_h}{w_h} = \frac{m_{11}x + m_{12}y + m_{13}}{m_{31}x + m_{32}y + m_{33}}, \quad Y = \frac{y_h}{w_h} = \frac{m_{21}x + m_{22}y + m_{23}}{m_{31}x + m_{32}y + m_{33}}.$
The coordinates $(X, Y)$ calculated by Equation (4) represent the object’s position on the local bird’s-eye-view map, which is relative to the UAV’s current field of view. However, for the path planning module to function correctly, it requires coordinates relative to a single, fixed origin (the “start” marker) for the entire mission. Therefore, the system must convert these local, perspective-relative coordinates into globally consistent, origin-relative coordinates. The system first identifies the absolute pixel position of the “start” marker on a persistent global map, defining this point as the origin $(0, 0)$. Concurrently, it calculates the absolute pixel position of each detected obstacle on this same global map and takes the vector difference between the obstacle’s position and the origin, denoted $(P_x, P_y)$. The final coordinates $(X, Y)$ are then obtained by applying the conversion scale $S_{scale}$, defined in pixels per centimeter:
$X = \frac{P_x}{S_{scale}}, \quad Y = \frac{P_y}{S_{scale}}.$
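A minimal OpenCV sketch of Equations (3)–(5) is given below. The four point correspondences, the “start”-marker origin, and the pixels-per-centimeter scale are illustrative assumptions, not calibration values from the paper.

import cv2
import numpy as np

# Four image points of the ground plane and their bird's-eye-view targets
# (placeholder values; in practice these come from camera calibration).
src_pts = np.float32([[120, 80], [520, 90], [600, 460], [60, 450]])
dst_pts = np.float32([[0, 0], [400, 0], [400, 400], [0, 400]])
M = cv2.getPerspectiveTransform(src_pts, dst_pts)  # the 3x3 matrix in Equation (3)

def to_birdseye(px, py):
    """Apply the perspective transform of Equations (3)-(4) to a single pixel."""
    pt = np.float32([[[px, py]]])
    return cv2.perspectiveTransform(pt, M)[0, 0]   # homogeneous divide handled internally

def to_world(px, py, origin_px, s_scale=2.0):
    """Origin-relative world coordinates in centimeters, Equation (5).
    origin_px: bird's-eye pixel position of the 'start' marker (the global origin);
    s_scale: conversion scale in pixels per centimeter (assumed value)."""
    bx, by = to_birdseye(px, py)
    return (bx - origin_px[0]) / s_scale, (by - origin_px[1]) / s_scale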
The core innovation of this module is its stateful, ID-based mapping mechanism, which solves the critical problem of data redundancy in dynamic video streams. A naive approach of plotting detections from every frame would repeatedly map the same static obstacles, rendering the final map unusable.
To prevent this, the system assigns a persistent and unique ID to each object it tracks across consecutive frames [16,17]. The system maintains a set of all object IDs that have already been recorded and plotted on the global map. When an obstacle is detected, the system checks whether its unique ID is already in this set. An object’s coordinates are calculated and added to the global map only on the first instance of its detection (i.e., if its ID is not yet in the set). Once mapped, the ID is added to the set, and any subsequent detections of the same object are ignored for mapping purposes.
This stateful, ID-based filtering guarantees that each physical obstacle is represented only once, ensuring the final map is a concise and accurate model of the environment, suitable for reliable path planning. The overall workflow of the aerial perception and mapping module is illustrated in the flowchart in Figure 3. The specific implementation process is shown in Algorithm 1.
Algorithm 1. Simplified Logic for the Aerial Perception and Mapping Module.
Input: UAV video stream, pre-trained YOLOv8 perception model, coordinate transformation parameters
Output: A list of unique obstacle coordinates relative to a start marker
 1: procedure MapEnvironmentFromVideo
      //Load the pre-trained YOLOv8 model and prepare the video source.
 2:   Initialize_System(yolo_model, video_source);
 3:   mapped_ids ← new Set();   //Unique IDs of obstacles already plotted on the global map.
 4:   global_map ← new List();  //Final coordinates of all unique objects.
 5:   origin_pos ← null;        //Set when the “start” marker is first detected.
 6:   while video is playing:
 7:     tracked_objects ← yolo_model.track(current_frame);  //Detect and track all objects in the current frame.
 8:     for each obj in tracked_objects:  //Iterate through each object detected in the frame.
 9:       if obj.ID is not in mapped_ids:
10:         if obj.class is “start” and origin_pos is null:           //Case A: the “start” marker, origin not yet set.
11:           origin_pos ← Get_Global_Position(obj);                  //According to Equations (3) and (4).
12:           mapped_ids.add(obj.ID);
13:           global_map.add(“start”, (0, 0));
14:         else if obj.class is “tree” and origin_pos is not null:   //Case B: an obstacle (“tree”), origin already established.
15:           birdseye_coords ← Perspective_Transform(obj.coords);    //According to Equations (3) and (4).
16:           relative_coords ← Calculate_Coords_Relative_To(obj, origin_pos);  //According to Equation (5).
17:           mapped_ids.add(obj.ID);                                 //Prevent the obstacle from being mapped again.
18:           global_map.add(“obstacle”, relative_coords);
19:   return global_map;  //Return the completed map containing only unique obstacles.
20: end procedure;
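For reference, the same stateful filtering can be expressed compactly on top of the Ultralytics tracking API. This is a hedged sketch: the weights path, video source, and class names are assumptions, and it reuses the to_birdseye()/to_world() helpers from the perspective-transformation sketch above.

from ultralytics import YOLO

model = YOLO("custom_yolov8_obstacles.pt")   # assumed custom-trained weights
mapped_ids, global_map, origin_px = set(), [], None

# persist=True keeps track IDs consistent across frames; stream=True yields
# results frame by frame instead of buffering the whole video.
for result in model.track(source="uav_stream.mp4", persist=True, stream=True):
    if result.boxes.id is None:              # tracker may emit no IDs on some frames
        continue
    for box, track_id, cls in zip(result.boxes.xywh, result.boxes.id, result.boxes.cls):
        tid = int(track_id)
        if tid in mapped_ids:                # already on the global map: ignore
            continue
        cx, cy = float(box[0]), float(box[1])
        label = model.names[int(cls)]
        if label == "start" and origin_px is None:
            origin_px = to_birdseye(cx, cy)  # global origin (0, 0)
            global_map.append(("start", (0.0, 0.0)))
            mapped_ids.add(tid)
        elif label == "tree" and origin_px is not None:
            global_map.append(("obstacle", to_world(cx, cy, origin_px)))
            mapped_ids.add(tid)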

3.2. Ground Navigation and Control Module

This module represents the final execution stage of the air-ground collaborative system, acting as the “legs” responsible for translating the abstract environmental map into precise and safe physical movement. Figure 4 shows the physical prototype of the UGV used for ground navigation. The process begins when the UGV receives the coordinate file (log.txt) from the UAV. The system first converts these discrete coordinates into a structured grid map, which serves as its internal representation of the environment.
The core innovation of our approach lies in the next step: an obstacle inflation algorithm designed to bridge the gap between abstract path planning and real-world physical navigation. Because the UGV has physical dimensions and cannot be treated as a theoretical single point, this algorithm addresses the critical risk of collision.
To ensure the generated path is physically navigable, the algorithm must account for the UGV’s physical footprint. We achieve this by transforming the grid map into a configuration space (C-space) through obstacle inflation. This process treats the UGV as a single point and expands the boundaries of all obstacles to compensate for the vehicle’s dimensions. Specifically, we model the UGV’s rectangular body with a bounding circle, which represents the maximum radius the vehicle occupies. The inflation radius ($R_{inf}$) is calculated as the distance from the UGV’s center to its furthest corner, plus an additional safety buffer ($R_{safe}$). This is formalized by the following equation:
$R_{inf} = \sqrt{\left(\frac{width}{2}\right)^2 + \left(\frac{length}{2}\right)^2} + R_{safe}.$
where $width$ and $length$ are the physical dimensions of the UGV. By marking all grid cells within this inflation radius around each obstacle as impassable, we guarantee that any path planned for the point-robot will be collision-free for the actual vehicle.
Standard path planning on a non-inflated map could generate a route that, while valid for a single point, would cause the UGV’s body to clip or collide with obstacles. To prevent this, the algorithm artificially expands the footprint of each detected obstacle on the grid map. Based on the UGV’s physical dimensions, a specific “inflation radius” is set. The system then marks all grid cells within this radius around each obstacle as impassable [12]. This creates a “safety buffer” around all potential hazards, ensuring that any planned path remains at a safe distance from actual collisions, as shown in Figure 5. The numbers on the horizontal and vertical axes in Figure 5 represent the number of cells. This inflation step is fundamentally important to the project’s success, as it transforms a simple obstacle map into a robust, navigation-ready C-space map that guarantees safe maneuvering [18].
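A minimal sketch of the inflation step is shown below; it assumes the grid and the OCCUPIED cell-state code from the earlier grid-map sketch, and the cell size and safety buffer values are illustrative.

import numpy as np

def inflate_obstacles(grid, cell_size_cm, width_cm, length_cm, r_safe_cm=5.0):
    """Mark every cell within R_inf (Equation (6)) of an occupied cell as OCCUPIED."""
    r_inf = np.hypot(width_cm / 2.0, length_cm / 2.0) + r_safe_cm   # Equation (6)
    r_cells = int(np.ceil(r_inf / cell_size_cm))
    inflated = grid.copy()
    rows, cols = np.where(grid == OCCUPIED)
    for i, j in zip(rows, cols):
        # Stamp a disk of radius r_cells around each occupied cell.
        for ii in range(max(0, i - r_cells), min(grid.shape[0], i + r_cells + 1)):
            for jj in range(max(0, j - r_cells), min(grid.shape[1], j + r_cells + 1)):
                if (ii - i) ** 2 + (jj - j) ** 2 <= r_cells ** 2:
                    inflated[ii, jj] = OCCUPIED
    return inflated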
Once this safe map is constructed, the system requires an intelligent method to chart a course through the open areas. For this task, the system employs the A* algorithm [10,11,12]. The core of the A* algorithm’s effectiveness lies in its cost evaluation function, $f(n)$, which it uses to assess the “promise” of each node it considers expanding. Here, $n$ represents a specific node on the search graph, corresponding to a grid cell on the map. The function balances the cost of the path already traveled with an estimate of the remaining cost to the goal:
$f(n) = g(n) + h(n).$
where $g(n)$ is the past path cost, representing the actual, known cost of the path traveled from the starting node to the current node, and $h(n)$ is the heuristic function, an admissible (i.e., never overestimating) estimate of the cost from the current node to the goal, calculated as:
$h(n) = \sqrt{(x_n - x_{goal})^2 + (y_n - y_{goal})^2}.$
where $(x_n, y_n)$ are the coordinates of the current node $n$ and $(x_{goal}, y_{goal})$ are the coordinates of the target position.
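The sketch below shows A* with the Euclidean heuristic of Equation (8) on the inflated occupancy grid; the 8-connected neighborhood and the reuse of the OCCUPIED code from the earlier sketches are assumptions for illustration.

import heapq, math

def a_star(grid, start, goal):
    """Minimal A* (Equations (7)-(8)) on a 2D grid; start/goal are (row, col) tuples."""
    def h(n):                                       # Equation (8): Euclidean distance to goal
        return math.hypot(n[0] - goal[0], n[1] - goal[1])
    open_set = [(h(start), start)]                  # priority queue ordered by f(n) = g(n) + h(n)
    g_cost = {start: 0.0}
    came_from = {start: None}
    closed = set()
    while open_set:
        _, node = heapq.heappop(open_set)
        if node in closed:
            continue
        closed.add(node)
        if node == goal:                            # reconstruct the path back to the start
            path = []
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1]
        for di in (-1, 0, 1):                       # 8-connected moves
            for dj in (-1, 0, 1):
                if di == 0 and dj == 0:
                    continue
                nxt = (node[0] + di, node[1] + dj)
                if not (0 <= nxt[0] < grid.shape[0] and 0 <= nxt[1] < grid.shape[1]):
                    continue
                if grid[nxt] == OCCUPIED or nxt in closed:
                    continue
                ng = g_cost[node] + math.hypot(di, dj)   # g(n): accumulated path cost
                if ng < g_cost.get(nxt, float("inf")):
                    g_cost[nxt] = ng
                    came_from[nxt] = node
                    heapq.heappush(open_set, (ng + h(nxt), nxt))
    return None                                     # no collision-free path exists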
To accurately execute the path generated by the A* algorithm, the UGV utilizes a closed-loop motion control system, the architecture of which is illustrated in Figure 6. This system continuously adjusts the UGV’s movement based on real-time feedback from an onboard MPU6050 gyroscope [19,20]. As shown in the flowchart, the controller compares the target value (desired speed and angle) from the path planner against the feedback value (actual motion) from the MPU6050 gyroscope. It then computes a correction signal for DC motors, ensuring the UGV’s actual output precisely matches the planned trajectory. This control logic is applied in two primary scenarios: maintaining a straight trajectory and performing accurate turns.
To accurately execute the path between waypoints, a closed-loop controller is essential for correcting deviations from the planned trajectory caused by physical imperfections. For the specific task of straight-line motion, the primary control objective is to maintain a constant heading by regulating the UGV’s yaw rate.
We model the UGV’s orientation as a simplified dynamic system. The state variable is the yaw angle $\theta$, and its derivative, the yaw rate $\omega(t) = d\theta/dt$, is the system’s output. This output is directly measured by the onboard MPU6050 gyroscope, providing real-time feedback. The control input to this system, denoted $u(t)$, is a corrective signal that generates a speed difference between the left and right wheels, thereby manipulating the yaw rate.
For straight-line motion between waypoints, the target value is a yaw rate of zero, $\omega_{target} = 0$. However, physical imperfections can cause the UGV to drift. The MPU6050 gyroscope measures the actual yaw rate, $\omega_{actual}$, and the controller calculates the error $e(t) = \omega_{target} - \omega_{actual} = -\omega_{actual}$. To correct this error, a proportional-integral (PI) controller is implemented. The controller generates the control input $u(t)$ based on the current error and the accumulation of past errors:
$u(t) = K_p e(t) + K_i \int_0^t e(\tau)\, d\tau.$
where $K_p$ and $K_i$ are the proportional and integral gains, respectively. The control signal $u(t)$ is then used to adjust the individual speeds of the left and right wheels. Let $V_{base}$ be the UGV’s nominal forward speed, and let $V_L$ and $V_R$ be the final target speeds for the left and right wheels, respectively. These speeds are calculated by applying the corrective adjustment $u(t)$ to the base speed as follows:
$V_L = V_{base} - u(t), \quad V_R = V_{base} + u(t).$
This dynamic adjustment counteracts any detected yaw, ensuring the UGV maintains a stable and straight trajectory. When a turn to a specific heading is required, the target value is the desired final angle, $\theta_{target}$, and the system must ensure the UGV rotates precisely to this angle. The controller first estimates the current angle turned, $\theta_{actual}$, by performing discrete integration of the gyroscope’s yaw-rate measurements over time:
$\theta_{actual}(t) = \theta_{actual}(t - \Delta t) + \omega_{actual}(t)\, \Delta t.$
where $\omega_{actual}(t)$ is the yaw rate measured by the gyroscope at time $t$ and $\Delta t$ is the discrete time step between measurements. This is the digital equivalent of the integral $\theta = \int \omega\, dt$. The controller then calculates the remaining angular error $e_\theta(t) = \theta_{target} - \theta_{actual}(t)$. A proportional controller is then used to generate a turning speed, $Turn\_Speed$, that decreases as the UGV approaches the target angle, ensuring a smooth stop without overshooting:
$Turn\_Speed = K_{p,turn}\, e_\theta(t).$
In this control law, $K_{p,turn}$ is the proportional gain specifically tuned for the turning motion. The motors are driven in opposite directions at this speed until the error $e_\theta(t)$ approaches zero. These two MPU6050-based control loops provide the precision required to reliably translate the A* path’s high-level waypoints into accurate physical movements.
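The two control loops can be sketched as follows. The gains, base speed, time step, and the read_yaw_rate()/set_wheel_speeds() hardware hooks are hypothetical placeholders, not the tuned values used on the UGV.

class StraightLinePI:
    """PI yaw-rate loop of Equations (9)-(10); the target yaw rate is zero."""
    def __init__(self, kp=1.2, ki=0.4, v_base=0.3):
        self.kp, self.ki, self.v_base = kp, ki, v_base
        self.integral = 0.0

    def update(self, omega_actual, dt):
        error = 0.0 - omega_actual                 # e(t) = omega_target - omega_actual
        self.integral += error * dt                # accumulate the integral term
        u = self.kp * error + self.ki * self.integral   # Equation (9)
        return self.v_base - u, self.v_base + u    # (V_L, V_R), Equation (10)

def turn_to_heading(theta_target, read_yaw_rate, set_wheel_speeds,
                    kp_turn=0.8, dt=0.02, tol=0.01):
    """Proportional turn loop of Equations (11)-(12); hardware hooks are hypothetical."""
    theta_actual = 0.0
    while abs(theta_target - theta_actual) > tol:
        theta_actual += read_yaw_rate() * dt       # Equation (11): discrete gyro integration
        turn_speed = kp_turn * (theta_target - theta_actual)   # Equation (12)
        set_wheel_speeds(-turn_speed, +turn_speed) # drive the wheels in opposite directions
    set_wheel_speeds(0.0, 0.0)                     # stop once within tolerance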
The complete workflow of the ground navigation and control module, from map reception to path execution, is detailed in the flowchart in Figure 6. The specific implementation process is shown in Algorithm 2.
Algorithm 2. Simplified Logic for the Ground Navigation and Control Module.
Input: obstacle_coords (from Algorithm 1), start_point, goal_point.
Output: Completed physical navigation of the UGV along a collision-free path.
1: procedure UGV_Navigation_Process(obstacle_coords, start_point, goal_point);
2:  navigable_map ← Create_Map_And_Inflate_Obstacles(obstacle_coords); //Step 1: Map Generation and Preparation for Safe Navigation.
  //Create a grid map from the obstacle coordinates and apply the obstacle inflation algorithm.
3:  optimal_path ← Find_Path_A_Star(navigable_map, start_point, goal_point); //Step 2: Optimal Path Planning
  //Run the A* algorithm on the inflated (safe) map to find the most efficient path.
4:   //Based on the A* cost function defined in Equations (7) and (8).
5:  if optimal_path is found: //Step 3: Physical Path Execution
6:    Follow_Path_With_Gyro_Feedback(optimal_path);
7:     //Implements motion control loops using Equations (9)–(12).
8: end procedure;

3.3. Real-Time Air-Ground Communication Module

This module serves as the “central nervous system” of the system, responsible for establishing a stable, low-latency data link between the aerial perception unit and the ground execution unit. A schematic diagram of the real-time air-ground communication link is shown in Figure 7.
The timely and reliable flow of information is critical for ensuring that the ground vehicle’s navigation decisions are based on the most current environmental map provided by the UAV. To achieve this, both the UAV and the UGV are equipped with a Raspberry Pi 3B+ as their onboard computing and communication core. All communication occurs over a private, self-built Wi-Fi local area network (LAN) connection, a crucial design choice that eliminates any dependency on external internet or pre-existing network infrastructure, guaranteeing stable operation even in remote or post-disaster scenarios. At the software level, we adopted the high-performance FAST DDS middleware, which operates on an efficient “publish/subscribe” model well-suited for real-time robotic systems.
In this model, the Raspberry Pi on the UAV acts as the “Publisher,” broadcasting the obstacle coordinate data (formatted as a .txt file) to a specific network “topic.” Concurrently, the Raspberry Pi on the UGV acts as the “Subscriber,” listening to this topic to instantly receive the data file as soon as it is sent. This data then becomes the direct input for the UGV’s path planning module. This decoupled architecture not only ensures a seamless real-time data flow but also makes the system highly flexible and scalable, laying the groundwork for future multi-robot swarm operations.
The flowchart in Figure 8 details the process of the real-time air-ground communication module; its logic is given in Algorithm 3.
Algorithm 3. Logic for the Real-time Air-Ground Communication Module.
Input: map_data, mission_goal.
Output: The successful transmission of the map data from the UAV to the UGV, triggering the start of the UGV’s autonomous navigation task.
   //On the UAV (Publisher) Side.
1: procedure Publish_Map_Data(map_data); //The UAV acts as the “Publisher” in the Fast DDS framework, broadcasting to a specific topic named “MAP_DATA”.
2:  Publish(topic: “MAP_DATA”, data: map_data);
3: end procedure;
   //On the UGV (Subscriber) Side.
4: procedure Receive_And_Trigger_Navigation(); //The UGV acts as the “Subscriber,” continuously listening to the “MAP_DATA” topic.
5:    map_data ← Subscribe_And_Wait_For_Data(topic: “MAP_DATA”);
6:    Execute_Ground_Navigation(map_data);//Initiates the process detailed in Algorithm 2
   //As soon as the map data is received, it triggers the ground navigation process.
   //The received “map_data” becomes the direct input for the UGV’s path planner.
7: end procedure
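To make the data flow concrete, the publish/subscribe pattern of Algorithm 3 is sketched below with a plain in-process queue standing in for the Fast DDS middleware. This is a conceptual stand-in only, not the Fast DDS API; the topic name “MAP_DATA” follows the paper.

import queue

_topics = {}  # topic name -> queue; an in-process stand-in for the DDS data space

def _topic(name):
    return _topics.setdefault(name, queue.Queue())

def publish_map_data(map_text):
    """UAV (Publisher) side: broadcast the obstacle coordinate file contents."""
    _topic("MAP_DATA").put(map_text)

def receive_and_trigger_navigation(run_navigation):
    """UGV (Subscriber) side: block until map data arrives, then start Algorithm 2."""
    map_text = _topic("MAP_DATA").get()            # blocks until the publisher sends a sample
    run_navigation(map_text)                       # hands the map to the ground navigation module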

4. Experiments and Analysis

This section presents a comprehensive evaluation of the proposed air-ground collaborative navigation system. We first describe the experimental setup, including the hardware platform and software environment. We then define the performance metrics used for evaluation and introduce the baseline methods for comparison. Finally, we present and analyze the results of the perception, path planning, and end-to-end system performance.

4.1. Experimental Setup

The hardware experimental platform consists of a custom-built UAV and UGV: The UAV (Figure 2) is a quadcopter equipped with a high-resolution camera for environmental perception. The UGV (Figure 4) is a four-wheeled mobile robot responsible for ground navigation. Both the UAV and UGV are equipped with a Raspberry Pi 3B+ serving as the onboard computing and communication core. Communication is established over a self-contained Wi-Fi LAN connection. The UGV’s motion control system utilizes feedback from an onboard MPU6050 gyroscope to ensure precise path tracking.
The core objective of this research was to validate the proposed framework on an accessible, low-cost hardware platform. This approach demonstrates that robust autonomous navigation can be achieved without relying on expensive, industrial-grade sensors and platforms, thereby lowering the barrier to entry for research and practical deployment. To underscore this economic advantage, Table 1 provides a comparative cost analysis between our system and a typical high-cost research platform commonly used for similar tasks. While high-end systems often utilize expensive components such as LiDAR and powerful GPU-based computers for maximum performance, our prototype achieves its objectives using affordable, off-the-shelf hardware. The bill of materials for our system is detailed in Table 1.
The software experimental platform is built upon a publish/subscribe architecture using FAST DDS middleware for robust, real-time data transmission between the UAV and UGV. The perception module employs a YOLOv8 model that was trained on a custom-built dataset. This dataset was specifically created for our experimental scenario, with objects such as trees and other obstructions manually labeled as “obstacles” to ensure high detection accuracy within the target environment. This dataset comprises 1498 images containing 3241 annotated instances. The dataset was partitioned into standard training, validation, and test splits (80%/10%/10%) to evaluate the model’s performance robustly.
To evaluate the system’s robustness, experiments were conducted under varying conditions. Firstly, the obstacle configuration was intentionally altered across different trials to test the system’s ability to adapt to unseen environments. The different layouts presented in the results qualitatively demonstrate this variability. Secondly, tests were performed under different ambient indoor lighting conditions. The system demonstrated consistent performance, largely due to the high visual contrast between the obstacles and the background, which facilitated reliable detection by the YOLOv8 model. However, testing under more challenging real-world conditions remains a focus for future work.

4.2. Performance Metrics

To quantitatively evaluate system performance, key metrics for latency and reliability are defined. System responsiveness is primarily measured by the end-to-end perception latency, $T_{e2e}$:
$T_{e2e} = T_{perception} + T_{mapping} + T_{communication}.$
where $T_{perception}$ is the perception time, $T_{mapping}$ is the mapping time, and $T_{communication}$ is the communication time. Following data reception, the computational efficiency of the path planner is assessed by the path planning computation time, $T_{planning}$. System reliability is evaluated at two levels. The stability of the air-to-ground link is measured by the communication reliability, $R_{comm}$:
$R_{comm} = \frac{N_{received}}{N_{sent}} \times 100\%.$
where $N_{sent}$ is the total number of map data packets sent by the UAV (Publisher) and $N_{received}$ is the total number of data packets successfully received by the UGV (Subscriber). The overall operational effectiveness is determined by the task success rate, $R_{task}$, which represents the system’s ability to complete missions without failure. It is calculated as:
$R_{task} = \frac{N_{success}}{N_{total}} \times 100\%.$
where $N_{total}$ is the total number of attempts to perform the navigation task and $N_{success}$ is the number of times the UGV successfully navigated from the start to the goal point without collision or human intervention.
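For completeness, the three metrics reduce to the following one-line computations (a trivial sketch; variable names are illustrative).

def end_to_end_latency(t_perception, t_mapping, t_communication):
    """End-to-end perception latency T_e2e."""
    return t_perception + t_mapping + t_communication

def communication_reliability(n_received, n_sent):
    """Communication reliability R_comm, in percent."""
    return 100.0 * n_received / n_sent

def task_success_rate(n_success, n_total):
    """Task success rate R_task, in percent."""
    return 100.0 * n_success / n_total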

4.2.1. Evaluation of Aerial Perception and Mapping Module

Extensive research has focused on the individual components of autonomous robotic systems. For environmental perception from a UAV, older systems often used traditional computer vision methods that relied on simple features like color and shape. These methods struggled in complex scenes or with changing light. Our system uses a modern deep learning model called YOLOv8, which is a state-of-the-art approach for real-time object detection [6,7]. The advantage of YOLOv8 is that it learns to recognize objects holistically, making it much faster and more accurate than older techniques, which is essential for processing a live video feed. Table 2 provides a comparison of our YOLOv8-based approach with traditional computer vision algorithms, while Table 3 details the evolution of YOLO algorithm versions, highlighting the features and performance impact of YOLOv8. For the YOLOv8 model, we use standard object detection metrics: Precision (P), Recall (R), and mean Average Precision (mAP@.5 and mAP@.5-.95) to evaluate the accuracy of the YOLOv8-based obstacle detection module.

4.2.2. Evaluation of Ground Navigation and Control Module

The fidelity of the environmental map is crucial for subsequent safe path planning by the ground robot. Path-planning algorithms are commonly compared by the computation time required to generate a path and the final path length (cost). Dijkstra’s algorithm, for example, finds the shortest path by exploring every possible route, which can be slow [10]. We chose the A* algorithm because it is more efficient. The main advantage of A* is that it uses a “heuristic function,” or an educated guess, to estimate the distance to the goal. Table 4 provides a detailed comparison of the A* algorithm with two common path planning methods, Dijkstra’s algorithm and Greedy Best-First Search. The “educated guess” approach allows A* to prioritize more promising routes and find the optimal path much faster than methods that do not employ a heuristic function [12]. After comparing the three path planning methods, we ultimately chose the more efficient A* algorithm for implementation.
However, even a perfect path is dangerous if the robot’s physical size isn’t considered. This led to our second key innovation: obstacle inflation.
Our system understands that the UGV is not a single point. It creates a “safety buffer” on the map by artificially expanding the size of each obstacle. This ensures that the path calculated by the A* algorithm will always keep the physical robot a safe distance from any real-world hazard.

4.2.3. Evaluation of Real-Time Air-Ground Communication Module

For all these components to work together, the UAV and UGV need a fast and reliable communication system. Many teams use a popular framework called ROS (Robot Operating System), but its standard communication can sometimes be too slow for time-critical tasks [21,22]. We use a different industry standard called DDS (Data Distribution Service) because it was specifically designed for the kind of high-performance, real-time data sharing needed in mission-critical applications [23]. The advantages of DDS over other frameworks like ROS are summarized in Table 5. The main problem we identified in existing research is that while these individual technologies have been studied separately, few have integrated them into a complete, low-cost system that has been physically tested [8,14,15]. We saw a clear need for a practical framework that tightly connects real-time aerial perception with autonomous ground navigation using accessible hardware and robust communication.

5. System Operation Result

The experimental results validate the feasibility and effectiveness of the proposed air-ground collaborative system. The successful integration of three distinct modules—aerial perception, real-time communication, and ground navigation—demonstrates a complete, end-to-end solution for autonomous navigation in unknown environments. The aerial perception module, leveraging a custom-trained YOLOv8 model, proved highly effective at identifying obstacles and establishing a global map from a high-altitude perspective.
Beyond simple feasibility, quantitative analysis against baseline systems reveals the superiority of the proposed integrated framework. The selection of the A* algorithm resulted in a significantly lower $T_{planning}$ compared to a Dijkstra-based approach, especially in complex environments. This computational efficiency, combined with the low $T_{e2e}$ ensured by the FAST DDS middleware, directly contributed to a higher overall $R_{task}$. By operating on more current environmental data, the UGV could navigate more reliably and efficiently, consistently producing shorter and more optimal path lengths. To analyze the system’s performance variability and scalability under increasing environmental complexity, Table 6 shows a performance comparison of different system configurations in a high-density scenario, including our A* + FAST DDS approach, Dijkstra + FAST DDS, and A* + ROS. Figure 9 provides a comparative analysis of our system’s latency metrics ($T_{planning}$ and $T_{e2e}$) against obstacle density for our proposed system and two baseline configurations. These metrics collectively demonstrate that the system is not only functional, but also robust and highly efficient for real-world autonomous navigation challenges.
To validate the effectiveness of the perception module, the custom-trained YOLOv8 model was rigorously evaluated on a dedicated test set partitioned from our custom dataset. We assessed the model’s performance using standard object detection metrics: Precision (P), Recall (R), and mean Average Precision (mAP). Precision measures the accuracy of the model’s detections, while Recall measures its ability to find all relevant objects. The mAP score provides a single, comprehensive measure of the model’s overall performance by considering both precision and recall across various confidence levels.
The quantitative results of the evaluation are summarized in Table 7: Precision of 0.846, Recall of 0.774, mAP50 of 0.845, and mAP50-95 of 0.639. This high mAP score confirms the model’s reliability in accurately identifying obstacles, providing a high-quality data source for the downstream mapping and navigation modules.
The employed YOLOv8 model demonstrates high-precision obstacle identification, providing reliable initial environmental perception data for the system. In a simulated environment with fixed obstacles, a critical challenge arises from data redundancy caused by the continuous detection of the same object. To address this, we introduce a key innovation: a stateful, ID-based deduplication technique. This technique assigns a persistent and unique ID to each tracked obstacle. The system checks if this ID has already been recorded and only proceeds with coordinate transformation and mapping when a new ID is detected for the first time, ensuring each physical obstacle is represented only once on the map. For an obstacle confirmed as new, its pixel coordinates in the image first undergo a perspective transformation to correct for distortion and generate an undistorted bird’s-eye view. Subsequently, the system converts these bird’s-eye view coordinates into global coordinates relative to a “start” marker, which serves as the global origin (0, 0). Finally, these filtered and transformed coordinates are formatted and saved into a .txt file, with the results shown in Table 8, providing concise and accurate data input for the subsequent path planning module.
Figure 10 provides a qualitative validation of the aerial mapping module’s end-to-end performance. The figure juxtaposes the UAV’s real-time camera feed (left panel) against the dynamically generated global obstacle map (right panel). A direct visual comparison confirms the high fidelity between the physical layout of obstacles observed in the video and their corresponding coordinates plotted on the map. This demonstrates that the entire mapping pipeline—from the initial YOLOv8 detection to the perspective transformation and stateful coordinate filtering—is functioning cohesively to accurately model the environment. Ultimately, the module successfully translates a complex visual scene into a precise, machine-readable map, thereby fulfilling its primary objective within the air-ground collaborative system.
The seamless integration of these modules culminated in a final, end-to-end system demonstration, qualitatively validated in Figure 11. This experiment serves as definitive proof that the framework can overcome the perceptual limitations inherent in a single ground-based agent. The figure illustrates the complete, real-time operational workflow. In the top-left panel, the UAV’s camera captures the scene where the YOLOv8 model performs visual reconnaissance. This perception data is then used to construct a global map and plan an optimal, collision-free path using the A* algorithm, as visualized in the top-right panel.
After this critical coordinate data is transmitted to the UGV in a .txt file via the robust FAST DDS middleware, the vehicle commences its navigation task. The central part of the figure captures the successful outcome: the UGV, guided by its closed-loop motion control system, physically executes the planned trajectory and successfully maneuvers through the cluttered field of obstacles. The high consistency between the planned route and the vehicle’s actual movement provides definitive validation for the entire collaborative system, showcasing its effectiveness as a feasible, integrated solution.

6. Conclusions

This research successfully designed, implemented, and validated an integrated air-ground collaborative system for autonomous navigation. By tightly coupling aerial perception using YOLOv8, real-time communication with FAST DDS, and ground navigation using the A* algorithm, we have created a feasible and low-cost framework that addresses the significant challenges of operating UGVs in complex, GPS-denied environments. The system automates the workflow from environmental reconnaissance to path execution, directly addressing critical industry challenges such as inefficient air-ground collaboration and delayed autonomous navigation. The successful physical prototype validation demonstrates the system’s potential for real-world applications in logistics, inspection, and emergency response.
While this research successfully demonstrated the core concept of our air-ground collaborative system, we recognize its current limitations. The primary limitation is that all tests were performed in a controlled sandbox environment. This setting used static, high-contrast obstacles on a uniform background, which is much simpler than a real-world disaster scene filled with varied debris, complex textures, and challenging lighting. The current system was therefore only validated with stationary obstacles, and its ability to handle dynamic elements like moving people or other vehicles remains untested. These controlled conditions mean that while the prototype is successful, its robustness in more chaotic, real-world scenarios still needs to be proven.
Based on these limitations, our future work will focus on two main goals in order to move the system from a prototype to a field-ready solution. First, we plan to enhance the system’s robustness in dynamic environments. This involves upgrading the perception and planning algorithms to not only detect but also predict the movement of dynamic objects, allowing the UGV to safely navigate among them. Second, we will work on expanding the framework to support multi-robot swarm operations. The current FAST DDS communication architecture already provides a solid foundation for this. The next step is to develop coordination strategies that would allow multiple UAVs and UGVs to share information and collaborate on complex tasks, such as mapping a larger area more quickly or performing multiple tasks simultaneously. Addressing these areas will be key to unlocking the full potential of our collaborative framework.

Author Contributions

Conceptualization, J.D., J.L. and J.H.; methodology, J.D., and J.L.; software, J.D.; validation, J.D. and J.L.; formal analysis, J.D., J.L., and J.H.; investigation, J.D. and J.L.; resources, J.D.; data curation, J.D., J.L. and J.H.; writing—original draft preparation, J.D.; writing—review and editing, J.D., J.L. and J.H.; visualization, J.D.; supervision, J.H.; project administration, J.H.; funding acquisition, J.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Key Research and Development Program of China under Grant No. 2022YFE0133100, and in part by the Sichuan Science and Technology Program under Grant 2024NSFSC0528.

Acknowledgments

The authors thank Bijoy Kumar Ghosh from Texas Tech University for paper polishing.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Munasinghe, I.; Perera, A.; Deo, R.C. A Comprehensive Review of UAV-UGV Collaboration: Advancements and Challenges. J. Sens. Actuator Netw. 2024, 13, 81. [Google Scholar] [CrossRef]
  2. Ding, Y.; Xin, B.; Chen, J. A Review of Recent Advances in Coordination Between Unmanned Aerial and Ground Vehicles. Unmanned Syst. 2020, 9, 97–117. [Google Scholar] [CrossRef]
  3. Arafat, M.Y.; Alam, M.M.; Moh, S. Vision-Based Navigation Techniques for Unmanned Aerial Vehicles: Review and Challenges. Drones 2023, 7, 89. [Google Scholar] [CrossRef]
  4. Kaniewski, P.; Grzywacz, W. Visual-based navigation system for unmanned aerial vehicles. In Proceedings of the 2017 Signal Processing Symposium (SPSympo), Jachranka, Poland, 12–14 September 2017; pp. 1–6. [Google Scholar]
  5. Kanellakis, C.; Nikolakopoulos, G. Survey on computer vision for uavs: Current developments and trends. J. Intell. Robot. Syst. 2017, 87, 141–168. [Google Scholar] [CrossRef]
  6. Liu, L.; Ouyang, W.; Wang, X.; Fieguth, P.; Chen, J.; Liu, X.; Pietikäinen, M. Deep Learning for Generic Object Detection: A Survey. Int. J. Comput. Vis. 2019, 128, 261–318. [Google Scholar] [CrossRef]
  7. Ge, Z.; Liu, S.; Wang, F.; Li, Z.; Sun, J. YOLOX: Exceeding YOLO Series in 2021. arXiv 2021, arXiv:2107.08430. [Google Scholar]
  8. Lu, Y.; Xue, Z.; Xia, G.-S.; Zhang, L. A survey on vision-based UAV navigation. Geo-Spat. Inf. Sci. 2018, 21, 21–32. [Google Scholar] [CrossRef]
  9. Liu, H.; Long, Q.; Yi, B.; Jiang, W. A survey of sensors based autonomous unmanned aerial vehicle (UAV) localization techniques. Complex Intell. Syst. 2025, 11, 371–392. [Google Scholar] [CrossRef]
  10. Shen, Y.; Liu, J.; Luo, Y. Review of Path Planning Algorithms for Unmanned Vehicles. In Proceedings of the 2021 IEEE 2nd International Conference on Information Technology, Big Data and Artificial Intelligence (ICIBA), Chongqing, China, 26–28 December 2021; pp. 400–405. [Google Scholar]
  11. Lashin, M.; El-mashad, S.Y.; Elgammal, A.T. Real-time path planning in dynamic environments using LSTM-augmented A* search. Results Eng. 2025, 27, 106324. [Google Scholar] [CrossRef]
  12. Zhou, C.; Huang, B.; Fränti, P. A review of motion planning algorithms for intelligent robots. J. Intell. Manuf. 2022, 33, 387–424. [Google Scholar] [CrossRef]
  13. Wang, J.; Zhang, T.; Ma, N.; Meng, M.Q.-H. A Survey of Learning-Based Robot Motion Planning. IET Cyber-Syst. Robot. 2021, 3, 302–314. [Google Scholar] [CrossRef]
  14. Yue, P.; Xin, J.; Huang, Y.; Zhao, J.; Zhang, C.; Chen, W.; Shan, M. UAV Autonomous Navigation System Based on Air–Ground Collaboration in GPS-Denied Environments. Drones 2025, 9, 442. [Google Scholar] [CrossRef]
  15. Mittal, M.; Mohan, R.; Burgard, W.; Valada, A. Vision-Based Autonomous UAV Navigation and Landing for Urban Search and Rescue. In Robotics Research; Springer International Publishing: Cham, Switzerland, 2022; pp. 575–592. [Google Scholar]
  16. Liu, M.; Wang, X.; Zhou, A.; Fu, X.; Ma, Y.; Piao, C. UAV-YOLO: Small Object Detection on Unmanned Aerial Vehicle Perspective. Sensors 2020, 20, 2238. [Google Scholar] [CrossRef] [PubMed]
  17. Zhang, Z.; Liu, Y.; Liu, T.; Lin, Z.; Wang, S. DAGN: A Real-Time UAV Remote Sensing Image Vehicle Detection Framework. IEEE Geosci. Remote Sens. Lett. 2020, 17, 1884–1888. [Google Scholar] [CrossRef]
  18. Katona, K.; Neamah, H.A.; Korondi, P. Obstacle Avoidance and Path Planning Methods for Autonomous Navigation of Mobile Robot. Sensors 2024, 24, 3573. [Google Scholar] [CrossRef] [PubMed]
  19. Kidane, N.; Avran, M.N.; Pavos, J.M.; Simone, D.W. Trajectory Tracking and Control of Differential Drive Mobile Robots Using Feedback Linearization. In Proceedings of the 2025 7th International Congress on Human-Computer Interaction, Optimization and Robotic Applications (ICHORA), Ankara, Türkiye, 25–27 June 2025; pp. 1–9. [Google Scholar]
  20. Tran, Q.-K.; Ryoo, Y.-J. Multi-Sensor Fusion Framework for Reliable Localization and Trajectory Tracking of Mobile Robot by Integrating UWB, Odometry, and AHRS. Biomimetics 2025, 10, 478. [Google Scholar] [CrossRef] [PubMed]
  21. Albonico, M.; Đorđević, M.; Hamer, E.; Malavolta, I. Software engineering research on the Robot Operating System: A systematic mapping study. J. Syst. Softw. 2023, 197, 111574. [Google Scholar] [CrossRef]
  22. Paul, S.; Lephuoc, D.; Hauswirth, M. Performance Evaluation of ROS2-DDS middleware implementations facilitating Cooperative Driving in Autonomous Vehicle. arXiv 2024, arXiv:2412.07485. [Google Scholar] [CrossRef]
  23. Gambo, M.L.; Danasabe, A.; Almadani, B.; Aliyu, F.; Aliyu, A.; Al-Nahari, E. A Systematic Literature Review of DDS Middleware in Robotic Systems. Robotics 2025, 14, 63. [Google Scholar] [CrossRef]
Figure 1. Schematic diagram of the air-ground coordination system.
Figure 2. UAV.
Figure 3. Flowchart of the aerial perception and mapping (APM) module.
Figure 4. UGV.
Figure 5. Simulation of the obstacle inflation algorithm.
Figure 6. Flowchart of the ground navigation and control (GNC) module.
Figure 7. A demonstration diagram of real-time air-ground communication.
Figure 8. Flowchart of the real-time air-ground communication (RTAGC) module.
Figure 9. Comparative analysis of latency performance. (a) $T_{planning}$ and (b) $T_{e2e}$ are plotted against the number of obstacles for the three system configurations.
Figure 10. Recognition results of the UAV in simulated environments. (Top left): Real-time scene capture and obstacle detection by the UAV. (Right): Generated global map; the yellow dots represent the identified obstacles. (Bottom left): Coordinates of the identified obstacles. (a) Operation process 1; (b) Operation process 2.
Figure 11. End-to-end demonstration of the complete air-ground collaborative autonomous navigation system based on deep learning. (Top left): Real-time scene capture and obstacle detection by the UAV. (Top right): Generated global map with the planned A* path; the yellow dots represent the identified obstacles. (Center): The UGV physically executing the planned trajectory through the obstacle field. (a) Operation process 1; (b) Operation process 2.
Table 1. Comparative cost analysis of core hardware components.

| Component | Air-Ground Collaborative System | Typical High-Cost Research Platform |
|---|---|---|
| Onboard Computer | $35 × 2 = $70 (Raspberry Pi 3B+) | $1500 (NVIDIA Jetson AGX Orin) |
| Primary Perception Sensor | $20 (High-Resolution USB Camera) | $500 (Velodyne Puck LiDAR) |
| UGV | $25 | $500 |
| UAV | $50 | $1500 |
| Inertial Measurement | $5 (MPU6050) | $1500 (VectorNav VN-100 IMU/AHRS) |
| Power Module | $25 | $200 |
| Estimated Total Cost | $220 | $5700 |
Table 2. Comparison of object detection algorithms.

| Feature | Traditional CV Algorithms [4,5] | YOLO [6,7] |
|---|---|---|
| Detection Principle | Rule-based (e.g., color, shape). | Learning-based (neural network). |
| Accuracy | Lower; fragile. | Higher; robust to visual changes. |
| Speed | Slow for complex scenes. | Real-time. |
| Flexibility/Adaptability | Poor: requires manual recoding for new objects. | High: adapts to new objects via retraining. |
Table 3. Comparison of YOLO algorithm versions [6,7].

| Version | Feature | Impact on Performance |
|---|---|---|
| YOLOv3 | Multi-scale features (FPN). | Improved small object detection. |
| YOLOv4 | Speed/accuracy optimizations (BoF/BoS). | Superior speed-accuracy balance. |
| YOLOv5 | Usability focus (PyTorch). | Simplified training and deployment. |
| YOLOv7 | Advanced architecture (E-ELAN). | New speed and accuracy benchmark. |
| YOLOv8 | Anchor-free unified design. | Enhanced flexibility and performance. |
Table 4. Comparison of path planning algorithms.

| Feature | Dijkstra's Algorithm [10,12] | Greedy Best-First Search [10,11,12] | A* Algorithm [10,11,12] |
|---|---|---|---|
| Principle | Considers only the past cost $g(n)$. | Considers only the estimated future cost $h(n)$. | Balances past and future cost, $f(n) = g(n) + h(n)$. |
| Optimality | Optimal; finds the best path. | Not optimal; finds a path, but often not the best one. | Optimal (with an admissible heuristic); finds the best path. |
| Efficiency | Slow; explores in all directions. | Fast, but can make poor choices. | Efficient; the heuristic guides the search toward the goal. |
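To make the cost terms in Table 4 concrete, the following is a minimal sketch of A* on a 4-connected grid, using Manhattan distance as an admissible heuristic. It is illustrative only and not the implementation used in this work; the grid encoding (0 = free, 1 = obstacle) and unit step costs are assumptions.

```python
# Minimal A* on a 4-connected grid (illustrative sketch, not the paper's code).
# f(n) = g(n) + h(n): g is the cost travelled so far, h is a Manhattan-distance estimate.
import heapq

def astar(grid, start, goal):
    """grid: 2D list with 0 = free and 1 = obstacle; start/goal: (row, col) tuples."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # admissible heuristic
    open_set = [(h(start), 0, start, None)]                  # entries: (f, g, node, parent)
    came_from, g_best = {}, {start: 0}
    while open_set:
        f, g, node, parent = heapq.heappop(open_set)
        if node in came_from:                                # already expanded
            continue
        came_from[node] = parent
        if node == goal:                                     # reconstruct path back to start
            path = []
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1]
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1                                   # unit cost per grid step
                if ng < g_best.get((nr, nc), float("inf")):
                    g_best[(nr, nc)] = ng
                    heapq.heappush(open_set, (ng + h((nr, nc)), ng, (nr, nc), node))
    return None  # no path found

# Example: plan around a single blocked cell in a 3x3 grid.
print(astar([[0, 0, 0], [0, 1, 0], [0, 0, 0]], (0, 0), (2, 2)))
```

Dropping the $h(n)$ term in this sketch reduces it to Dijkstra's algorithm, while keeping only $h(n)$ yields greedy best-first search, which is the relationship Table 4 summarizes.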
Table 5. Comparison of two communication frameworks.

| Feature | ROS [21,22] | DDS [22,23] |
|---|---|---|
| Design Philosophy | General-purpose robotics framework. | Designed for high-performance, critical tasks. |
| Real-time Performance | Can be too slow for real-time tasks. | Optimized for real-time data sharing. |
| Primary Use Case | General robotics R&D. | Mission-critical systems. |
Table 6. Performance comparison of system configurations in a high-density scenario.

| System Configuration | $T_{planning}$ | $T_{e2e}$ | $R_{task}$ |
|---|---|---|---|
| A* + FAST DDS | 110 ms | 95 ms | 92% |
| Dijkstra + FAST DDS | 250 ms | 96 ms | 91% |
| A* + ROS | 111 ms | 165 ms | 80% |
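The two latency columns in Table 6 correspond to separate measurements: the planner's own computation time ($T_{planning}$) and the delay from aerial detection to the resulting ground command ($T_{e2e}$). The sketch below shows one way such timings might be collected; `plan_path` and the timestamp arguments are hypothetical placeholders, not the system's actual interfaces.

```python
# Illustrative timing harness for the two latency metrics in Table 6
# (function names are hypothetical placeholders, not the system's real interfaces).
import time

def measure_planning_latency(plan_path, grid, start, goal, runs=50):
    """Average wall-clock time of the planner alone, in milliseconds (T_planning)."""
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        plan_path(grid, start, goal)              # e.g., the A* sketch shown earlier
        samples.append(time.perf_counter() - t0)
    return 1000.0 * sum(samples) / len(samples)

def end_to_end_latency_ms(t_detect, t_command):
    """T_e2e: delay between the UAV time-stamping a detection and the UGV issuing the
    corresponding motion command, assuming both hosts share a synchronized clock
    (e.g., via NTP). Timestamps are in seconds (time.time() style)."""
    return 1000.0 * (t_command - t_detect)
```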
Table 7. Performance metrics of the custom-trained YOLOv8 model on the test dataset, showing Precision (P), Recall (R), and mAP values.

| Class | Images | Instances | Box (P) | R | mAP50 | mAP50-95 |
|---|---|---|---|---|---|---|
| all | 1498 | 3241 | 0.846 | 0.774 | 0.845 | 0.639 |
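The values in Table 7 follow the standard Ultralytics validation summary for a detection model. A minimal sketch of how such metrics can be queried is shown below; the weight file and dataset YAML names are placeholders rather than the authors' artifacts.

```python
# Querying Table 7-style metrics with the Ultralytics API (sketch; "best.pt" and
# "obstacles.yaml" are placeholder names, not the authors' files).
from ultralytics import YOLO

model = YOLO("best.pt")                            # custom-trained YOLOv8 weights
metrics = model.val(data="obstacles.yaml", split="test")

print(f"Precision (P): {metrics.box.mp:.3f}")      # mean precision over classes
print(f"Recall (R):    {metrics.box.mr:.3f}")      # mean recall over classes
print(f"mAP50:         {metrics.box.map50:.3f}")
print(f"mAP50-95:      {metrics.box.map:.3f}")
```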
Table 8. The result of identifying obstacles in a simulated environment.

| ID | Relative Coordinates |
|---|---|
| Start (ID: 1) | (X: 0 cm, Y: 0 cm) |
| Obstacle (ID: 2) | (X: 79.0 cm, Y: 43.0 cm) |
| Obstacle (ID: 3) | (X: −86.0 cm, Y: 261.5 cm) |
| Obstacle (ID: 4) | (X: 49.0 cm, Y: 167.0 cm) |
| Obstacle (ID: 5) | (X: −4.5 cm, Y: 244.5 cm) |
| Obstacle (ID: 6) | (X: −95.0 cm, Y: 433.5 cm) |
| Obstacle (ID: 7) | (X: 153.5 cm, Y: 402.0 cm) |
| Obstacle (ID: 8) | (X: −10.0 cm, Y: 449.5 cm) |
| Obstacle (ID: 9) | (X: 70.0 cm, Y: 497.0 cm) |
| Obstacle (ID: 10) | (X: 80.5 cm, Y: 443.5 cm) |
| Obstacle (ID: 11) | (X: 64.5 cm, Y: 467.5 cm) |
| Obstacle (ID: 12) | (X: 73.0 cm, Y: 497.5 cm) |
| Obstacle (ID: 13) | (X: 151.0 cm, Y: 477.5 cm) |
| Obstacle (ID: 14) | (X: 153.5 cm, Y: 403.0 cm) |
| Obstacle (ID: 15) | (X: −46.0 cm, Y: 657.0 cm) |
| Obstacle (ID: 16) | (X: 54.0 cm, Y: 665.5 cm) |
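Before path planning, obstacle coordinates such as those in Table 8 must be rasterized onto a grid and inflated so that the UGV's physical footprint is respected. The following sketch illustrates one way to do this; the 10 cm cell size, 20 cm inflation radius, map extents, and x-offset are assumptions chosen for illustration, not the system's actual parameters.

```python
# Rasterizing obstacle coordinates (cm) into an inflated occupancy grid
# (sketch; cell size, inflation radius, and map extents are illustrative assumptions).
import math

def build_inflated_grid(obstacles_cm, width_cm=400, height_cm=700,
                        cell_cm=10.0, inflate_cm=20.0, x_offset_cm=200):
    """obstacles_cm: list of (x, y) positions in cm relative to the start point.
    x_offset_cm shifts negative x values into the grid's index range."""
    rows = int(height_cm // cell_cm)
    cols = int(width_cm // cell_cm)
    grid = [[0] * cols for _ in range(rows)]
    r_cells = int(math.ceil(inflate_cm / cell_cm))     # inflation radius in cells
    for x, y in obstacles_cm:
        col = int((x + x_offset_cm) // cell_cm)
        row = int(y // cell_cm)
        # Mark the obstacle cell and every cell within the inflation radius.
        for dr in range(-r_cells, r_cells + 1):
            for dc in range(-r_cells, r_cells + 1):
                rr, cc = row + dr, col + dc
                if 0 <= rr < rows and 0 <= cc < cols and math.hypot(dr, dc) <= r_cells:
                    grid[rr][cc] = 1
    return grid

# Example with the first two obstacles from Table 8; the result can feed the A* sketch above.
grid = build_inflated_grid([(79.0, 43.0), (-86.0, 261.5)])
```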