Article

A Novel Amphibious Terrestrial–Aerial UAV Based on Separation Cage Structure for Search and Rescue Missions

1 2011 College, Nanjing Tech University, Nanjing 211816, China
2 College of Electrical Engineering and Control Science, Nanjing Tech University, Nanjing 211816, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(16), 8792; https://doi.org/10.3390/app15168792
Submission received: 15 July 2025 / Revised: 4 August 2025 / Accepted: 6 August 2025 / Published: 8 August 2025

Featured Application

The proposed UAV aims to enhance rescue personnel’s operational efficacy in search-and-rescue missions, enabling a more efficient localization of trapped individuals and the delivery of critical emergency supplies to reduce casualties, while concurrently safeguarding responders’ safety.

Abstract

In response to the challenges faced by unmanned aerial vehicles (UAVs) in cluttered environments such as forests, ruins, and pipelines, this study introduces a ground–air amphibious UAV specifically designed for personnel search and rescue in complex environments. Through the innovative design and application of a separation cage structure, the UAV’s capabilities for ground movement and aerial flight are enhanced, effectively overcoming the limitations of traditional single-mode robots in narrow or obstacle-dense areas. This design addresses the occlusion of sensing components in traditional caged UAVs while maintaining protection for both the UAV itself and the surrounding environment. Additionally, the innovative H-shaped quadcopter frame skeleton enables steady-state aerial flight, better accommodates the separation cage structure, reduces energy consumption, and significantly improves operational capability in complex environments. The experimental results demonstrate that the UAV prototype, weighing 1.2 kg with a 1 kg payload capacity, achieves a 40 min maximum endurance under full payload at the endurance speed of 10 m/s while performing real-time object detection. The system reliably executes multimodal operations, including stable takeoff, landing, aerial hovering, directional maneuvering, and terrestrial locomotion with coordinated steering control.

1. Introduction

With the advancement of science and technology, UAVs have been widely applied in both military and civilian fields, such as communication relay, disaster monitoring, border patrol, and more [1,2,3,4,5,6,7], where they play a crucial role. However, in spatially constrained environments, aerial drones still need to further improve their maneuverability and adaptability. To enhance the adaptability of UAVs to diverse environments, amphibious drones capable of both ground and aerial operations have gradually become a focal point of research and application [8,9,10,11,12,13,14].
Complex environments such as mines, pipelines, and post-earthquake debris are characterized by unknown conditions and high risks, making manual operations extremely challenging; reckless human entry can lead to significant losses. UAVs [15], with their compact size, high agility, and suitability for navigating narrow and hazardous spaces, can be effectively deployed for tasks such as environmental reconnaissance and the transportation of small-scale rescue supplies.
Currently, in the field of UAVs, some widely applied solutions to this problem include the following: Wenliang Yang introduced a small four-wheeled land–air amphibious UAV [16], and Juntong Qi proposed a search-and-rescue rotor UAV [17].
The quad-wheeled amphibious drone proposed by Wenliang Yang can traverse land using its four-wheeled mechanism, and its compact design allows it to adapt well to complex environments. However, it is prone to tipping over on uneven terrain, and its electronic components lack adequate protection, rendering the drone vulnerable to damage. Juntong Qi’s SR-RUAV system incorporates a low-altitude statistical image processing method for collapsed-building detection, enabling the assessment of survivor presence. Despite its ability to efficiently cover large areas, such UAVs lack precision in close-range detection and cannot perform targeted searches for individuals.
Ahmed Borik proposed a caged quadrotor drone for the inspection of central HVAC ducts [18], and [19,20,21,22,23] are all recent papers on caged drones. These cage-shaped drones have provided valuable insights for this study. The infrared search and rescue methodology presented in Reference [24] has significantly informed the experimental approach in this study.
However, nearly all previous caged UAV designs face an unavoidable issue: their cameras and LiDAR sensors are inevitably obstructed by the cage structure. Furthermore, the single-cage structure inherently prevents the integration of ground locomotion capabilities.
Based on the above considerations, this paper proposes a novel aerial–terrestrial search-and-rescue UAV. The innovative separation cage structure endows the drone with ground mobility while ensuring a stable traversal over rugged terrain through its fullerene-inspired geometric design, effectively minimizing rollover risks. The framework primarily employs 2 mm-diameter carbon rods that exhibit diamond-level hardness, high flexibility with elastic deformation capability, and a density merely one-sixth that of steel, achieving a simultaneous weight reduction and resulting in a final drone weight of 1.2 kg. The UAV’s gross takeoff weight under full payload conditions is 2.2 kg. This separation cage structure not only significantly increases the hollow space to boost propulsive efficiency (achieving 40 min of endurance under full payload at the endurance speed of 10 m/s) but also decisively addresses the critical limitation of internal camera occlusion inherent in traditional cage designs. The H-shaped quadrotor frame improves aerial stability and obstacle navigation, enabling operations in deep, narrow spaces through real-time environmental mapping via front-mounted LED arrays and a depth camera. This structural configuration also enhances payload capacity, supporting up to 1 kg while maintaining flight stability.
This design directly addresses the critical shortcomings of previous work: it eliminates the sensor occlusion caused by monolithic cages, resolves terrain adaptability issues in wheeled amphibious UAVs, and overcomes close-range detection deficiencies in systems like SR-RUAV. Furthermore, it effectively safeguards personnel and the surrounding environment from harm during the UAV’s terrestrial locomotion mode or when it enters an uncontrolled state due to physical damage or system failure. The prototype demonstrates a 40 min endurance with a full payload of 1 kg at the endurance speed of 10 m/s, positioning it as a versatile solution for missions demanding robust multimodal mobility and precise close-range exploration in hazardous environments.
The Introduction frames the problem, motivation, and contributions; the Related Works section surveys prior art and contrasts it with our approach; and the Limitations section explicitly states the constraints, trade-offs, and non-addressed aspects.

2. Structural Design

2.1. General Functional Requirements

  • The ground–aerial amphibious UAV must possess a certain payload capacity, with a specified payload of M = 1 kg.
  • To adapt to narrow and confined spaces, the UAV should feature a compact design, with overall operational dimensions limited to Length × Width × Height = 150 × 70 × 70 cm or smaller. To ensure the UAV maintains stable flight with a 1 kg payload, the propeller dimensions must exceed a minimum size threshold. Consequently, the final airframe configuration measures Length × Width × Height = 77 × 50 × 50 cm.
  • The UAV must be capable of operating in dark, enclosed environments, enabling an immediate takeoff and stable flight. It should also exhibit resilience to collisions with obstacles, allowing for a quick recovery to steady-state operation.
The 3D image of the final UAV design is depicted in Figure 1.

2.2. Main Structural Design

The total weight of the UAV, including the battery, is 1.2 kg. As shown in Figure 2, the proposed ground–aerial amphibious UAV consists of the following key components:
  • Separation cage structure;
  • Bearing connectors;
  • Power-driven components;
  • Flight control carrier board;
  • H-shaped quadrotor frame;
  • LED lighting system.

2.2.1. Design of the Separation Cage Structure

As documented in [25], radar and photoelectric sensors are subject to environmental interference. Concurrently, traditional integrated caged drones experience sensor performance issues during ground locomotion due to disturbances generated by their rotating protective cages. Therefore, the proposed separation cage structure effectively addresses the problem of sensor occlusion encountered in traditional monolithic cage designs. By separating the two cage modules, the design ensures that sensors such as cameras and LiDAR remain unobstructed during ground operation, thereby improving measurement accuracy.
Furthermore, this design enables the UAV to achieve stable terrestrial locomotion even under adverse environmental conditions, such as high winds. In this mode, the UAV can operate for extended periods and pass through narrow apertures more easily, thereby efficiently transporting emergency supplies and executing search-and-rescue missions.
The separation cage structure provides robust protection for the UAV’s critical components, particularly the motors and propellers, while also preventing potential environmental hazards caused by rotor operation. This design significantly improves the safety of the UAV, effectively protecting people in densely populated areas from being injured by high-speed rotating propellers.
Figure 3 shows a single-cage module. Each single-cage structural module adopts a design based on the molecular geometry of fullerene, characterized by a pentagon surrounded by five hexagons on both sides. Compared to the conventional radial circular cage structure, this design is more stable and robust, reducing the risk of tipping over on uneven terrain. The main frame consists of 2 mm carbon rods connected by nylon joints. This construction reduces the weight of the body while increasing the hollow area, minimizing the motor thrust loss caused by the cage structure. Consequently, motor efficiency is maximized, significantly extending flight duration. Additionally, the hollow design of the cage structure cushions impact forces during operation.
Figure 4 shows the structural diagram of the fullerene. The fullerene-inspired structure can be derived from a regular icosahedron by truncating all twelve vertices, where the truncation ratio λ is defined as follows:
$$\lambda = \frac{\text{Truncated edge length}}{\text{Original edge length}}$$
The λ parameter of the regular icosahedron significantly influences the global stress distribution in the fullerene-inspired structure, warranting a focused analysis [26]. This structure can be regarded as a rigid frame structure, and the maximum equivalent stress is calculated via the stiffness matrix. Each node of a spatial frame element possesses six degrees of freedom ($u_x$, $u_y$, $u_z$, $\theta_x$, $\theta_y$, and $\theta_z$); since the fullerene-like structure has 60 nodes, the global stiffness matrix $K$ is a 360 × 360 matrix [27].
A global coordinate system (x, y, z) and a local coordinate system ($\bar{x}$, $\bar{y}$, $\bar{z}$) are established, where $\bar{x}$ aligns with the rod axis, and $\bar{y}$ and $\bar{z}$ are orthogonal to $\bar{x}$. The elemental stiffness matrix for rod elements is expressed as follows:
$$\bar{k}^{ij} = \begin{bmatrix} \bar{k}^{ij}_{11} & \bar{k}^{ij}_{12} \\ \bar{k}^{ij}_{21} & \bar{k}^{ij}_{22} \end{bmatrix}_{12 \times 12}$$
The coordinate transformation matrix facilitates the conversion of elemental stiffness matrices from local to global coordinate systems:
$$K^{ij} = (R^{ij})^{\mathrm{T}} \, \bar{k}^{ij} R^{ij} = \begin{bmatrix} k^{ij}_{11} & k^{ij}_{12} \\ k^{ij}_{21} & k^{ij}_{22} \end{bmatrix}_{12 \times 12}$$
The coordinate transformation matrix is defined as follows:
$$R^{ij} = \begin{bmatrix} r & 0 & 0 & 0 \\ 0 & r & 0 & 0 \\ 0 & 0 & r & 0 \\ 0 & 0 & 0 & r \end{bmatrix}_{12 \times 12}$$
where the directional cosine submatrix r satisfies the following:
$$r = \begin{bmatrix} \cos\theta_{x\bar{x}} & \cos\theta_{x\bar{y}} & \cos\theta_{x\bar{z}} \\ \cos\theta_{y\bar{x}} & \cos\theta_{y\bar{y}} & \cos\theta_{y\bar{z}} \\ \cos\theta_{z\bar{x}} & \cos\theta_{z\bar{y}} & \cos\theta_{z\bar{z}} \end{bmatrix}$$
Here, $\theta_{x\bar{x}}$ denotes the angle between the global x-axis and the local $\bar{x}$-axis; the remaining angles are defined analogously.
The expanded elemental stiffness matrices $(K^{ij})^{p}_{360 \times 360}$ are superimposed to construct the global stiffness matrix:
$$K_{360 \times 360} = \sum_{p} (K^{ij})^{p}$$
The fundamental equilibrium equation governing nodal displacements and forces is expressed as follows:
$$K_{360 \times 360} \, u_{360 \times 1} = P_{360 \times 1}$$
where $u_{360 \times 1}$ denotes the nodal displacement vector and $P_{360 \times 1}$ the external force vector. Member internal forces are derived through the following:
$$\bar{P}^{ij} = R^{ij} K^{ij} u^{ij} = \bar{k}^{ij} R^{ij} u^{ij}$$
According to the Mises criterion, assuming that all member units in the simulated fullerene frame structure are ideal units, when the bending and torsion effects of the member units are ignored, the stress at the node is determined as follows:
$$\sigma_i = \sqrt{\sigma^2 + 3\tau^2}$$
where σ and τ represent the axial and shear stresses, respectively. Multiplying Equation (9) by the cross-sectional area yields equivalent nodal forces:
$$F_i = \sqrt{F_x^2 + 3\left(F_y^2 + F_z^2\right)}$$
A MATLAB (2022b) implementation incorporates the material parameters E = 50 GPa, G = 19 GPa, μ = 0.3, and ρ = 1.5 g/cm³, with a circumsphere radius R = 30 cm, a circular cross-section radius r = 1 mm, and centripetal loading. The simulation results illustrate the structural performance.
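To make the Mises-criterion check above concrete, the following Python sketch (our illustration, not the authors' MATLAB implementation) evaluates the equivalent nodal force and stress for a rod of the stated 1 mm cross-section radius; the force components used in any call are hypothetical.

```python
import math

# Rod cross-section from the text: circular, radius r = 1 mm
AREA = math.pi * (1e-3) ** 2  # m^2

def mises_equivalent_force(fx, fy, fz):
    """Equivalent nodal force from the axial (fx) and shear (fy, fz) force
    components, with bending and torsion of the members ignored."""
    return math.sqrt(fx ** 2 + 3.0 * (fy ** 2 + fz ** 2))

def mises_equivalent_stress(fx, fy, fz, area=AREA):
    """Nodal equivalent stress per the Mises criterion: divide the force
    components by the cross-sectional area to obtain axial and shear stresses."""
    sigma = fx / area                          # axial stress
    tau = math.sqrt(fy ** 2 + fz ** 2) / area  # resultant shear stress
    return math.sqrt(sigma ** 2 + 3.0 * tau ** 2)
```

By construction, multiplying the equivalent stress by the cross-sectional area recovers the equivalent nodal force, mirroring the relationship between the two equations above.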
As shown in Figure 5, the fullerene-inspired structure attains its optimal load-bearing capacity at a truncation ratio of λ = 0.375, albeit with a relatively low structural stiffness. The structure shows excellent load-bearing and stress-distribution characteristics: the tension-form configuration ensures a uniform stress distribution and omnidirectional load-bearing capacity, resulting in better structural integrity and resistance to instability.

2.2.2. H-Shaped Frame Design for Quadrotor

Unlike the conventional X-frame architecture, which is typically used in quadrotor UAVs, the present design innovatively employs an H-shaped quadcopter frame skeleton structure, as shown in Figure 6. Compared to traditional X-frame structures, the H-frame features an elongated geometry that increases the lateral separation between adjacent rotors. This configuration not only enables the integration of the separation cage structure but also simultaneously improves flight stability through enhanced aerodynamic decoupling.

3. Overall Algorithm Design

3.1. Controller Design

The Proportional-Integral-Derivative (PID) controller is a fundamental closed-loop feedback control mechanism prevalent in engineering systems. It generates a corrective output signal by computing a weighted sum of three distinct terms derived from the current error (e(t) = Setpoint (SP) − Process Variable (PV)): the Proportional term (responding to the present magnitude of the error), the Integral term (responding to the accumulation of past errors), and the Derivative term (responding to the predicted future trend of the error based on its rate of change). This composite control action aims to minimize the error over time, driving the system output towards the desired setpoint dynamically while ensuring stability, accuracy, and responsive performance. Due to its robustness, straightforward structure, and adaptability across diverse applications, the PID controller remains one of the most widely implemented control strategies. The PID control equation is defined as follows:
$$u = k_p e(t) + k_i \int_0^{T} e(\xi)\, d\xi + k_d \frac{d}{dt} e(t)$$
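For readers unfamiliar with the discrete form of this law, a minimal Python sketch is given below; the gains and the first-order plant are illustrative values, not the UAV's tuned parameters.

```python
def pid_step(kp, ki, kd, error, state, dt):
    """One update of a discrete PID controller.

    `state` is an (integral, previous_error) tuple carried between calls;
    returns (control_output, new_state).
    """
    integral, prev_error = state
    integral += error * dt                  # I term: accumulated past error
    derivative = (error - prev_error) / dt  # D term: error rate of change
    u = kp * error + ki * integral + kd * derivative
    return u, (integral, error)

# Toy example: drive a first-order plant x' = u toward the setpoint 1.0
x, state, dt = 0.0, (0.0, 0.0), 0.01
for _ in range(3000):
    u, state = pid_step(kp=2.0, ki=0.5, kd=0.1, error=1.0 - x, state=state, dt=dt)
    x += u * dt
```

After the simulated 30 s, the plant output settles near the setpoint, illustrating how the three terms jointly drive the error toward zero.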
In classical PID control methods, the controller gain parameters are predetermined. Although these methods can achieve satisfactory control accuracy, they lack the ability to adapt to sudden changes. This limitation renders classical PID methods inadequate for quadrotor UAVs with nonlinear dynamics and high-precision control requirements.
The Linear Quadratic Regulator (LQR) necessitates linear or precisely linearizable system dynamics. However, UAV attitude dynamics—characterized by inherently strong nonlinearities such as aerodynamic coupling and rotor interference—defy accurate linearization. Linear approximations introduce model discrepancies that degrade the actual control performance. Model Predictive Control (MPC) requires online rolling optimization, where computational complexity grows exponentially with the prediction horizon length. Given the computational constraints of UAV embedded platforms, satisfying millisecond-level real-time control demands becomes infeasible. Backstepping control entails the recursive construction of Lyapunov functions and virtual control variables, resulting in protracted derivation processes. Moreover, strong cross-coupling among UAV degrees of freedom (e.g., roll–pitch interaction) significantly increases design complexity.
This paper proposes a parameter self-adjusting fuzzy adaptive PID algorithm that integrates fuzzy control with conventional PID control. During operation, the algorithm adaptively tunes the PID parameters to maintain the stable flight of quadrotor UAVs in the event of actuator failures.
In the fuzzy adaptive PID framework, the first step involves establishing fuzzy relationships between the error (e), error rate (ec), and the PID parameters ($k_p$, $k_i$, and $k_d$) [28]. Through fuzzy inference, the incremental adjustments to the PID parameters ($\Delta k_p$, $\Delta k_i$, and $\Delta k_d$) are dynamically calculated based on real-time e and ec values. The parameter tuning of the fuzzy adaptive PID controller is then achieved using the following formulation:
$$k_\delta = k_{\delta 0} + \Delta k_\delta$$
Here, $\delta \in \{P_i, I_i, D_i\}$, where $k_{\delta 0}$ denotes the initial parameter values of the fuzzy adaptive PID controller, and $\Delta k_\delta$ represents the real-time adjustment to $k_{\delta 0}$.
As depicted in Figure 7 of the quadrotor UAV control system, the tracking position error e and error rate ec are utilized as controller inputs. These inputs undergo three sequential stages:
Fuzzification: Convert crisp inputs e and ec into fuzzy linguistic variables.
Fuzzy Inference: Apply a predefined rule base to derive the adjustments $\Delta k_p$, $\Delta k_i$, and $\Delta k_d$.
Defuzzification: Transform fuzzy outputs into precise incremental parameter values for real-time tuning.
Due to the strong dynamic coupling inherent in quadrotor systems, the control architecture is decoupled into two subsystems:
Position Subsystem: Governs trajectory tracking via an outer-loop controller.
Attitude Subsystem: Stabilizes roll, pitch, and yaw angles via an inner-loop controller.
A dual-loop control strategy is implemented, where the outer position loop generates attitude references for the inner loop, ensuring coordinated motion control under actuator disturbances.
In the fuzzification process, the controller inputs e and ec, along with the outputs $\Delta k_p$, $\Delta k_i$, and $\Delta k_d$, are defined as follows:
Fuzzy sets for all variables: {Negative Big (NB), Negative Medium (NM), Negative Small (NS), Zero (Z), Positive Small (PS), Positive Medium (PM), and Positive Big (PB)}.
Fuzzy input universes of discourse:
  • e: [−3, 3];
  • ec: [−3, 3].
Fuzzy output universes of discourse:
  • $\Delta k_p$: [−2.5, 2.5];
  • $\Delta k_i$: [−0.5, 0.5];
  • $\Delta k_d$: [−5, 5].
Membership functions: Both input and output variables use linear membership functions.
The fuzzy controller operates in a two-input-three-output configuration, where the fuzzy inputs e and ec are continuously evaluated for their membership degrees. Through fuzzy inference rules, the membership degrees of the fuzzy outputs $\Delta k_p$, $\Delta k_i$, and $\Delta k_d$ are determined, thereby enabling real-time adjustments to the PID parameters.
During the PID parameter adjustment, the impacts of parameter variations on system performance must be carefully considered:
  • $k_p$ predominantly governs the system’s response speed;
  • $k_i$ directly influences the steady-state error;
  • $k_d$ critically affects the dynamic performance.
Different values of these parameters result in distinct effects on system error and performance. Therefore, parameter adjustments are prioritized based on the magnitude of the error e. When e is large, $k_p$ and $k_d$ are adjusted to enhance the transient response and stabilize the dynamics; when e is small, $k_p$ and $k_i$ are tuned to minimize steady-state deviations.
Based on combinations of e and ec, 49 fuzzy rules are formulated to define the corresponding adjustments $\Delta k_p$, $\Delta k_i$, and $\Delta k_d$, ensuring systematic parameter adaptation.
Fuzzy control rules are critical in fuzzy control systems [29], typically derived from expert knowledge and empirical data synthesis [30]. The rule design heuristics are established as follows [31]: when |e| is substantially large, selecting a larger $k_p$ enhances system responsiveness to rapidly mitigate tracking errors and restore stability, while concurrently adopting a smaller $k_d$ attenuates the potential overshoot; consequently, when both e and ec register as PB, $k_p$ should correspondingly be PB. For moderate |e| values, medium $k_p$ magnitudes are prescribed to preclude a significant overshoot while preserving an adequate dynamic response capability; thus, when e and ec are PM, $k_p$ is designated PM. When |e| is comparatively small, reduced $k_p$ values are mandated to suppress steady-state oscillations, yielding the assignment $k_p$ = PS for e and ec in the PS domain. By applying these formulation principles, we have refined and optimized the conventional rule base. The improved fuzzy control rule matrix is presented in Table 1.
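The fuzzify–infer–defuzzify pipeline described above can be sketched in Python. The sketch below compresses the scheme to three linguistic sets and a 9-rule table for the single output $\Delta k_p$ (the paper uses seven sets and 49 rules per output); the membership supports and rule consequents are illustrative values following the stated heuristic, not the entries of Table 1.

```python
def tri(x, a, b, c):
    """Triangular membership function peaking at b over the support [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzify(x):
    """Membership degrees of x (universe [-3, 3]) in three linguistic sets.

    Shoulder sets N and P saturate at 1 beyond the universe edges.
    """
    return {
        "N": 1.0 if x <= -3 else tri(x, -6.0, -3.0, 0.0),
        "Z": tri(x, -3.0, 0.0, 3.0),
        "P": 1.0 if x >= 3 else tri(x, 0.0, 3.0, 6.0),
    }

# Illustrative rule table (NOT the paper's Table 1): large |e| -> larger kp,
# small |e| -> slightly reduced kp to suppress steady-state oscillation.
RULES = {("N", "N"): 2.5, ("N", "Z"): 1.5, ("N", "P"): 0.5,
         ("Z", "N"): 0.5, ("Z", "Z"): -0.5, ("Z", "P"): 0.5,
         ("P", "N"): 0.5, ("P", "Z"): 1.5, ("P", "P"): 2.5}

def delta_kp(e, ec):
    """Mamdani-style inference: min for rule AND, weighted-centroid
    defuzzification over singleton rule consequents on [-2.5, 2.5]."""
    mu_e, mu_ec = fuzzify(e), fuzzify(ec)
    num = den = 0.0
    for (label_e, label_ec), out in RULES.items():
        w = min(mu_e[label_e], mu_ec[label_ec])  # firing strength
        num += w * out
        den += w
    return num / den if den else 0.0
```

At each control cycle, the resulting increment would be added to the current gain, as in the parameter-tuning formulation above.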

3.2. Control Scheme

The control flow of the entire UAV motion process is illustrated in Figure 8. Upon system power-up, initialization is performed first, including the configuration of the processor’s external interfaces. Subsequently, the sensor self-test and calibration routines are executed. Finally, the program proceeds in an orderly manner to execute the following algorithmic modules: mode determination, Kalman filtering, sensor data fusion and processing, ground control algorithms, and in-flight control algorithms.
The operational state achieved by rotating the drone 90° to align the cage parallel to the ground is termed the Vertical Operation Mode.
When encountering a passage that is impassable in its standard orientation, the drone utilizes its depth camera to measure the width of the passage. If the passage can accommodate transit in the Vertical Operation Mode, the system further verifies whether the passage floor constitutes solid ground. Upon confirmation of suitable terrain, the drone activates the Vertical Operation Mode: it adjusts its attitude to achieve the required orientation, reorients its propellers, and traverses the narrow passage. If the floor is identified as unsuitable ground, traversal is aborted and alternative routes are pursued. This protocol ensures safe navigation while utilizing aerodynamic reconfiguration for constrained environments.

3.3. Ground Control Station System

The Ground Control Station (GCS) serves as the core control hub of a UAV system, enabling both within visual line-of-sight (VLOS) and beyond visual line-of-sight (BVLOS) operations. Throughout all mission phases—from the initial preparation and critical task execution to final data processing and dissemination—the GCS plays a pivotal role. Its mission is to monitor the aircraft’s flight status and payload operational conditions, empowering ground operators to effectively command both the airframe and mission equipment. Core functions encompass mission planning, flight monitoring and control, imagery display and payload management, system monitoring, intelligence dissemination, and data recording.
The GCS system deployed in this research enables the real-time display of situational imagery and continuous monitoring of UAV kinematic status. This system further supports the control of flight modes and trajectory planning. Wireless communication between the GCS and UAV is established via the MAVLink protocol. Additionally, operators can remotely pilot the UAV using the Yun Zhuo H16 All-in-One Remote Controller, while concurrently monitoring real-time environmental imagery data. The workflow diagram is illustrated in Figure 9.

3.4. Visual Recognition Algorithm

Using the YOLOv5s-Ghost network model [32], the UAV relies on the Intel RealSense D435 depth camera installed at the front to perceive the surrounding environment. This setup ensures safe navigation within a range of 0–20 m.
The YOLO (You Only Look Once) series of detection algorithms are widely favored by researchers due to their high efficiency and fast detection speed [33]. Among them, YOLOv5 is the most widely adopted, capable of efficiently extracting target feature information, yet its network structure is overly complex [34]. YOLOv5s is the most lightweight version, with significantly reduced parameters and minimal floating-point operations, balancing model complexity and accuracy [35].
Despite its relatively fast detection speed, it still falls short of meeting the rapid detection requirements of UAVs. The proposed YOLOv5s-Ghost model integrates Ghost-BottleNeck modules with DarkNet-53, achieving a lighter network architecture and accelerated detection. Ghost-BottleNeck combines Ghost feature generation with bottleneck structures, primarily aiming to reduce parameters and enhance detection speed through feature redundancy assumptions and efficient feature generation.
As depicted in Figure 10, the Ghost module generates more feature maps with fewer parameters, primarily by dividing the convolution operations: half of the channels undergo standard convolution, while the other half use a 3 × 3 depthwise separable convolution (cheap transformation operations). Finally, these dual-path outputs are concatenated to form the final feature representation [36].
In the computation of FLOPs (floating-point operations) for standard convolutions, h denotes the output height, w the output width, n the output dimension (equivalent to the number of filters), c the number of input channels, and k the spatial dimensions (height and width) of convolutional kernels. This study introduces an improvement using Ghost modules, where the n filters in standard convolutions are partitioned into s groups. The theoretical speed-up ratio of upgrading the ordinary convolution with the Ghost module is
$$r_s = \frac{n \cdot h \cdot w \cdot c \cdot k \cdot k}{\frac{n}{s} \cdot h \cdot w \cdot c \cdot k \cdot k + (s-1) \cdot \frac{n}{s} \cdot h \cdot w \cdot d \cdot d} = \frac{c \cdot k \cdot k}{\frac{1}{s} \cdot c \cdot k \cdot k + \frac{s-1}{s} \cdot d \cdot d} \approx \frac{s \cdot c}{s + c - 1} \approx s$$
where $d \times d$ has a magnitude similar to that of $k \times k$, and $s \ll c$. Similarly, the compression ratio can be calculated as
$$r_c = \frac{n \cdot c \cdot k \cdot k}{\frac{n}{s} \cdot c \cdot k \cdot k + (s-1) \cdot \frac{n}{s} \cdot d \cdot d} \approx \frac{s \cdot c}{s + c - 1} \approx s$$
For implementation, the filters are divided into two groups (s = 2), resulting in the Ghost module achieving a 50% reduction in both the parameter count and computational cost compared to standard convolutions.
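The speed-up ratio above can be checked numerically. The sketch below (an illustration under assumed layer dimensions, not a measurement of the actual network) counts multiply-accumulate operations for a standard convolution and for a Ghost module with s = 2.

```python
def conv_flops(h, w, n, c, k):
    """Multiply-accumulate count of a standard k x k convolution producing an
    h x w x n output from c input channels (bias and stride ignored)."""
    return h * w * n * c * k * k

def ghost_flops(h, w, n, c, k, s, d):
    """Ghost module cost: n/s intrinsic maps from a standard convolution plus
    (s - 1) * n/s ghost maps from cheap d x d single-channel operations."""
    intrinsic = h * w * (n // s) * c * k * k
    cheap = h * w * (s - 1) * (n // s) * d * d
    return intrinsic + cheap

# Hypothetical layer: 56x56 output, 128 filters, 64 input channels, 3x3 kernels
h, w, n, c, k, s, d = 56, 56, 128, 64, 3, 2, 3
r_s = conv_flops(h, w, n, c, k) / ghost_flops(h, w, n, c, k, s, d)
# r_s equals s*c / (s + c - 1) = 128/65 here, i.e. just under s = 2
```

This matches the approximation $r_s \approx s$: for s = 2 the Ghost module roughly halves the computation, consistent with the 50% reduction stated above.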
Leveraging the lightweight characteristics of Ghost modules, this work employs a Ghost-BottleNeck architecture. As depicted in Figure 11, the module consists of two cascaded Ghost modules with integrated Batch Normalization (BN) layers to accelerate network convergence and suppress overfitting. The first segment incorporates Leaky ReLU activation functions to prevent neuron deactivation during training, while the latter segment intentionally omits activation functions to maintain a consistent feature distribution across network layers, enhancing the model’s convergence efficiency.
The architecture of the YOLOv5s-Ghost network is shown in Figure 12. YOLOv5s-Ghost only uses one CSP_X structure, which significantly reduces the model’s complexity. To maintain detection accuracy, gradient change information is fully passed to the feature map, thereby optimizing the network’s feature fusion capability.
The loss function L comprises two components: the localization loss $L_{\mathrm{loc}}$ and the classification loss $L_{\mathrm{cla}}$. The composite objective function is defined as
$$L = L_{\mathrm{loc}} + L_{\mathrm{cla}}$$
The localization loss Lloc employs the Complete Intersection over Union (CIoU) loss function [37], defined as
$$L_{\mathrm{CIoU}} = 1 - \mathrm{CIoU} = 1 - \mathrm{IoU} + \frac{D_2^2}{D_1^2} + \frac{\zeta^2}{\left(1 - \mathrm{IoU}\right) + \zeta}$$
where
  • Intersection over Union (IoU) quantifies the overlap ratio between predicted and ground-truth bounding boxes;
  • D1: Diagonal distance of the minimum enclosing rectangle of the two boxes;
  • D2: Euclidean distance between their centroids;
  • $\zeta$: Aspect-ratio consistency parameter, calculated as follows.
$$\zeta = \frac{4}{\pi^2} \left( \tan^{-1} \frac{w_{gt}}{h_{gt}} - \tan^{-1} \frac{w_p}{h_p} \right)^2$$
Compared to the traditional IoU loss, the CIoU formula addresses non-overlapping cases and the aspect ratio of bounding boxes, thereby improving regression accuracy and convergence speed. When combined with non-maximum suppression (NMS), the CIoU-based method outperforms traditional NMS methods in both repeated detection suppression and detection accuracy.
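A minimal Python rendering of the CIoU loss defined above, for axis-aligned boxes in (x1, y1, x2, y2) form, might look as follows (a sketch of the formula, not the YOLOv5s-Ghost training code; a guard handles the 0/0 case when $\zeta = 0$ and IoU = 1):

```python
import math

def ciou_loss(box_p, box_gt):
    """CIoU loss for axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1p, y1p, x2p, y2p = box_p
    x1g, y1g, x2g, y2g = box_gt

    # IoU: overlap ratio of the two boxes
    iw = max(0.0, min(x2p, x2g) - max(x1p, x1g))
    ih = max(0.0, min(y2p, y2g) - max(y1p, y1g))
    inter = iw * ih
    union = (x2p - x1p) * (y2p - y1p) + (x2g - x1g) * (y2g - y1g) - inter
    iou = inter / union

    # D2^2: squared centre distance; D1^2: squared enclosing-box diagonal
    d2_sq = ((x1p + x2p) - (x1g + x2g)) ** 2 / 4 + \
            ((y1p + y2p) - (y1g + y2g)) ** 2 / 4
    d1_sq = (max(x2p, x2g) - min(x1p, x1g)) ** 2 + \
            (max(y2p, y2g) - min(y1p, y1g)) ** 2

    # zeta: aspect-ratio consistency between ground truth and prediction
    wp, hp = x2p - x1p, y2p - y1p
    wg, hg = x2g - x1g, y2g - y1g
    zeta = 4 / math.pi ** 2 * (math.atan(wg / hg) - math.atan(wp / hp)) ** 2

    aspect_term = zeta ** 2 / (1 - iou + zeta) if zeta > 0 else 0.0
    return 1 - iou + d2_sq / d1_sq + aspect_term
```

Perfectly overlapping boxes yield a loss of zero, while any centre offset or aspect-ratio mismatch adds a penalty even when the boxes do not overlap, which is what gives CIoU its improved convergence over plain IoU.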

4. Results and Discussion

4.1. Comparative Evaluation with Relevant UAVs

As demonstrated in Table 2, the proposed novel amphibious inspection UAV exhibits significant advantages over comparable small-scale UAV systems in both maximum payload capacity and endurance at maximum takeoff weight (MTOW). Moreover, the designed UAV achieves a commendable maximum demonstrated speed, with its high maneuverability proving sufficient for rapid deployment in emergency scenarios requiring expedited flight operations.
As detailed in Table 3, the comparative analysis of cage-equipped UAVs reveals distinct operational characteristics. References [18,19] present systems designed for central HVAC duct inspection, capable of both terrestrial locomotion (sliding/rolling) and aerial flight, incorporating thermal cameras. Reference [20] introduces a UAV with a compliant-circuit hybrid 3D-printed safety cage, exclusively flight-capable and requiring manual control, equipped with a 2D camera. Similarly, reference [22] describes a safety-focused cage-equipped UAV limited to flight operations under human supervision, also utilizing a 2D camera. Reference [23] deploys a cage-equipped UAV for an antenna reliability assessment, restricted to flight with manual control, and employing printed antennas. Critically, all the aforementioned cage configurations induce sensor obstruction. In contrast, our mechanically optimized design structurally circumvents this limitation through a depth camera enabling dual-mode infrared and visible-light detection, supporting terrestrial–aerial multimodal operation.

4.2. YOLOv5s-Ghost Network Training and Experimental Results

In this paper, the datasets employed consist of visible-light and infrared subsets. An Intel RealSense D435 depth camera is utilized as the image acquisition device. The visible-light dataset utilizes the MS COCO dataset, while the infrared dataset comprises 9119 pedestrian images captured by an infrared camera. Additionally, 508 images were specifically collected for a dataset focusing on cluttered rescue-like environments, which encompasses scenarios prone to disaster and potentially requiring search-and-rescue support, including but not limited to forested areas, mining shafts, and analogous high-risk terrains. All images are in JPG format with a resolution of 1920 × 1080 pixels.
To improve the generalization ability of the detection and localization algorithm, data augmentation operations such as brightness adjustment and noise injection are applied to the images. The images are manually annotated using the LabelImg tool in PASCAL VOC format and saved as XML files, then randomly divided into training, validation, and test sets in a 9:1:1 ratio.
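The split and augmentation steps above can be sketched as follows; `split_dataset` and `adjust_brightness` are illustrative helpers written for this explanation (a real pipeline would operate on image files with OpenCV or PIL), not the authors' code:

```python
import random

def split_dataset(filenames, ratios=(9, 1, 1), seed=42):
    """Randomly partition files into train/val/test at the stated 9:1:1 ratio."""
    rng = random.Random(seed)
    files = list(filenames)
    rng.shuffle(files)
    total = sum(ratios)
    n_train = len(files) * ratios[0] // total
    n_val = len(files) * ratios[1] // total
    train = files[:n_train]
    val = files[n_train:n_train + n_val]
    test = files[n_train + n_val:]          # remainder goes to the test set
    return train, val, test

def adjust_brightness(pixels, factor):
    """Brightness augmentation on a row of 8-bit pixel values (stand-in for
    the image-level operation described in the text)."""
    return [min(255, max(0, round(p * factor))) for p in pixels]
```

With 110 images, this split yields 90 training, 10 validation, and 10 test images.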
The experimental environment is a Windows 11 operating system with an NVIDIA GeForce RTX 3060 GPU (NVIDIA Corp., Santa Clara, CA, USA) with 8 GB of memory; the CPU is an AMD Ryzen 7 5800H with Radeon Graphics @ 3.20 GHz (Advanced Micro Devices, Inc., Santa Clara, CA, USA). The software stack comprises cuDNN 8.9, PyTorch 2.1.1, and Python 3.10.0.
As clearly evidenced by Table 4, the YOLOv5s-Ghost architecture achieves a 25% parameter reduction and 22% inference acceleration compared to the baseline YOLOv5s model, while maintaining a high mean average precision (mAP) of 77.8%. The lightweight improvements significantly enhance the overall performance of the network.
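The parameter savings behind this reduction can be illustrated analytically by comparing an ordinary convolution against a Ghost module. The formulas follow the GhostNet design [36] with an assumed ratio s = 2 and 3 × 3 cheap depthwise kernels; these are illustrative defaults, not necessarily the exact YOLOv5s-Ghost configuration:

```python
def conv_params(c_in, c_out, k):
    """Parameters of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def ghost_params(c_in, c_out, k, s=2, d=3):
    """Ghost module: a primary conv producing c_out/s intrinsic feature maps,
    then cheap d x d depthwise operations generating the remaining maps."""
    m = c_out // s                      # intrinsic feature maps
    primary = c_in * m * k * k          # ordinary convolution part
    cheap = (s - 1) * m * d * d         # depthwise "ghost" generation
    return primary + cheap
```

For a 128-to-256-channel 3 × 3 layer, the standard convolution needs 294,912 parameters while the Ghost module needs 148,608, roughly halving that layer's cost, which is consistent in spirit with the network-level reduction reported in Table 4.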

4.3. Vision-Based Detection Experiment

The UAV employs different detection modes based on its operational states. During terrestrial locomotion, visible-light-based visual recognition and ranging are utilized for the precise detection of target position and distance. In the aerial flight mode, the system activates infrared vision-based detection for target presence verification. This dual-mode sensing strategy arises from operational requirements: terrestrial locomotion necessitates a precise proximity detection of target position and range, while aerial operations require only target presence identification. Such adaptive sensing architecture significantly enhances the efficiency of search and rescue operations.
A ranging accuracy validation experiment was conducted under the visible-light recognition mode, with test subjects positioned at 300 discrete locations relative to the depth camera for distance measurement. Representative data are summarized in Table 5, where α denotes the target distance measured by the proposed stereo vision-based ranging method, β represents the ground truth distance obtained via a laser rangefinder, and the relative error (e) is calculated as
e = |α − β|/β × 100%
The proposed method exhibits a maximum error of 9.02% and a mean error of 5.67% within the 0–20 m range; beyond 20 m, the ranging error exceeds 10%. Figure 13 illustrates the UAV’s object detection and ranging during terrestrial locomotion at 2 m/s. The experimental results validate that reliable ranging within 0–20 m is sufficient to ensure safe operation.
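The error metric can be checked directly against the entries of Table 5 with a one-line helper (a sketch for verification, not the evaluation code used in the experiments):

```python
def relative_error(alpha, beta):
    """Ranging relative error e = |alpha - beta| / beta * 100, where alpha is
    the stereo-vision distance estimate and beta the laser-rangefinder truth."""
    return abs(alpha - beta) / beta * 100.0
```

For example, the Table 5 row with α = 2.14 m and β = 2.08 m gives e ≈ 2.88%, matching the tabulated value.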
Figure 14 shows infrared-based target recognition during aerial flight at the endurance speed of 10 m/s. Experimental results indicate that recognition accuracy degrades when the UAV operates at its maximum demonstrated speed of 18 m/s; nevertheless, accurate recognition remains achievable within 100 m even under these conditions.

4.4. Drop Tests

To verify the structural stability of the designed separation cage structure, drop resistance testing is conducted. The isolated separation cage assembly is mass-loaded to match the UAV’s gross takeoff weight of 2.2 kg. Drop tests commence from a 1 m height with 0.2 m incremental elevation steps, performing ten trials per height level on concrete substrates. At the 10 m free-fall height, joint loosening occurs between carbon fiber rods and nylon connectors. Subsequent full-prototype drop testing replicates this protocol, incorporating both free-fall and motor-termination scenarios during hovering. The prototype exhibits no structural compromise or component loosening up to an 8 m impact height. These results demonstrate significant structural robustness in the fullerene-inspired separation cage design.
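As a back-of-the-envelope check on the severity of these tests, the drag-free impact speed and energy at a given drop height follow from v = √(2gh) and E = mgh. This is a simplified estimate that neglects air resistance and cage deformation, included only to put the 8 m survival threshold in physical terms:

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def impact_velocity(height_m):
    """Free-fall impact speed with drag neglected: v = sqrt(2 g h)."""
    return math.sqrt(2 * G * height_m)

def impact_energy(mass_kg, height_m):
    """Kinetic energy to be absorbed at impact: E = m g h."""
    return mass_kg * G * height_m
```

For the 2.2 kg prototype, an 8 m drop corresponds to an impact speed of about 12.5 m/s and roughly 173 J of energy absorbed by the cage and frame.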

4.5. Prototype Flight Experiment

A physical prototype has been developed according to the design requirements and tested via remote control for terrestrial movement and confined-space flight. All experimental validations are conducted under full payload conditions, at a gross takeoff weight of 2.2 kg.
Figure 15a shows terrestrial locomotion experiments conducted on Nanjing Funiu Mountain. The prototype commences terrestrial locomotion tests on the ground from a stationary level position. The prototype maintains a fixed pitch angle through the differential thrust between front and rear rotors, propelling forward via rotor thrust and ground friction with its cage structure. Steering is achieved by adjusting the differential speeds of diagonal rotor pairs. The system demonstrates stable movement on rugged terrain with smooth forward and backward motion and agile steering.
The prototype maintains stable ground locomotion even when tilted to a 30° roll angle relative to the horizontal plane. The terrestrial maneuvering mode demonstrates enhanced stability, sustaining normal operation outdoors in Beaufort Force 5 winds. Nevertheless, the walking velocity decreases significantly under such conditions, while power consumption rises with the intensity of the wind disturbance.
Figure 15b documents the prototype’s flight test conducted in the forest. The prototype commences flight testing from a stationary horizontal position on the ground. Like conventional quadrotor drones, this prototype achieves lift through the four rotors’ thrust and maneuvers (forward/backward motion and directional turns) via differential motor speed control. The dual LED arrays flanking the front-mounted camera enable visual exploration and obstacle avoidance in dark, enclosed spaces through pure vision-based navigation. The separation cage structure demonstrates critical protective functionality: when colliding with obstacles, it safeguards rotors from damage while allowing for rapid stabilization recovery. During flight operations, the prototype can regain flight stability within one second when subjected to collisions at its endurance-optimized cruising velocity of 10 m/s. At the maximum demonstrated speed of 18 m/s, stability recovery requires no more than two seconds under equivalent impact conditions. Experimental results confirm the prototype’s successful execution of all designed flight operations, with performance metrics aligning with predetermined specifications.
The supplemental experiments entail positioning the prototype in a deliberately inverted 90-degree orientation, with the circular cross-sections of its dual cage structures parallel to the ground, as the predefined initial state. Under these conditions, the prototype autonomously self-recovers to the standard operating posture in unobstructed environments, and when confined between parallel walls, rotation of the cage structure enables rolling locomotion along the constraining surfaces; however, in scenarios where the walls immobilize the cage assembly, the prototype loses mobility entirely.
When the drone is in operational status and encounters a passage impassable in its default attitude, it uses a depth camera to measure the channel width. If the channel accommodates traversal in vertical flight mode, the system further verifies whether the channel floor constitutes a navigable terrain. Upon confirmation, the drone activates vertical flight mode, adjusts its attitude to the vertical orientation, reconfigures propeller orientation, and subsequently traverses the narrow passage. If the channel floor is identified as non-navigable terrain, the drone aborts the traversal and reroutes via alternative paths.
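The traversal decision described above can be sketched as a simple rule. The function name, clearance margin, and width/height parameters below are illustrative assumptions for exposition, not the actual flight software:

```python
def plan_narrow_passage(channel_width_m, drone_width_m, drone_height_m,
                        floor_navigable):
    """Narrow-passage decision sketch: measure the channel with the depth
    camera, then choose normal traversal, vertical-attitude traversal, or
    rerouting. Returns 'traverse_normal', 'traverse_vertical', or 'reroute'."""
    margin = 0.10  # assumed clearance margin in metres
    if channel_width_m >= drone_width_m + margin:
        return "traverse_normal"       # passable in the default attitude
    if channel_width_m >= drone_height_m + margin and floor_navigable:
        return "traverse_vertical"     # rotate to vertical flight mode
    return "reroute"                   # abort and take an alternative path
```

A channel narrower than the drone's width but wider than its (smaller) height triggers vertical flight mode only when the channel floor is confirmed navigable, mirroring the check described in the text.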
For nocturnal operations, activating the front-mounted LED illumination arrays or switching to the infrared vision detection mode both prove effective for environmental perception and obstacle recognition in low-visibility conditions.

5. Conclusions

This research has successfully designed, prototyped, and validated a novel ground–air amphibious UAV tailored for search-and-rescue operations in complex, cluttered environments. The core innovation lies in the implementation of a separation cage structure, fundamentally overcoming the critical limitation of sensor occlusion inherent in traditional monolithic cage designs. This architecture, inspired by the fullerene molecular geometry (optimized at λ = 0.375), provides robust omnidirectional protection for the UAV and its surroundings during terrestrial locomotion or collision events, while simultaneously ensuring unobstructed sensor fields of view for cameras and LiDAR—a significant advancement in cage-type UAV design.
Complementing this, the H-shaped quadrotor frame skeleton enhances aerial stability and aerodynamic efficiency, facilitating precise control within confined spaces. The system demonstrates an exceptional operational capability, achieving a 40 min flight endurance under full payload (1 kg) at 10 m/s while performing real-time object detection via the optimized YOLOv5s-Ghost visual recognition system. The integrated fuzzy adaptive PID controller ensures robust multimodal operation—stable aerial flight, terrestrial locomotion over rugged terrain, and controlled transitions—even under adverse conditions.
The prototype validation confirms that this integrated design effectively addresses the key limitations of prior systems: eliminating sensor blockage in caged UAVs, mitigating terrain instability and component vulnerability in wheeled amphibious designs, and enabling the precise close-range detection absent in systems like SR-RUAV. Furthermore, the protective cage structure demonstrably enhances safety for both personnel and the operational environment.

6. Limitations

The current design focuses exclusively on conventional onboard sensors due to project scope limitations. However, as discussed in [40,41,42], RF-based sensors could potentially be implemented, offering significant improvements in environmental perception accuracy and personnel search capabilities for drones.
Budgetary and timeline constraints necessitated abbreviated impact-resistance and visual-recognition testing protocols. Consequently, a comprehensive assessment of the structure’s ultimate load-bearing capacity remains incomplete, as does the evaluation of recognition accuracy at high velocities. These unresolved aspects constitute a primary focus of our subsequent research phase.

Author Contributions

Conceptualization, C.J. and Y.X.; methodology, C.J. and Y.X.; software, Y.X.; validation, C.J., Y.X. and Z.L.; formal analysis, C.J.; investigation, Z.L.; resources, X.G.; data curation, Y.X.; writing—original draft preparation, C.J.; writing—review and editing, C.J.; visualization, C.J.; supervision, X.G.; project administration, X.G.; funding acquisition, X.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Jiangsu Students’ platform for innovation and entrepreneurship training program, grant number 202410291193Y and The APC was funded by Nanjing Tech University.

Data Availability Statement

The data are contained within the article.

Acknowledgments

The authors are grateful to each reviewer for their valuable comments and suggestions which improved the quality of this article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
UAV: Unmanned Aerial Vehicle
RTK: Real-Time Kinematic
LiDAR: Light Detection and Ranging
GCS: Ground Control Station
MAVLink: Micro Air Vehicle Link
YOLO: You Only Look Once
FLOPs: Floating-Point Operations
BN: Batch Normalization
CIoU: Complete Intersection over Union
IoU: Intersection over Union
NMS: Non-Maximum Suppression
MTOW: Maximum Take-Off Weight
mAP: Mean Average Precision

References

  1. Boiteau, S.; Vanegas, F.; Galvez-Serna, J.; Gonzalez, F. Model-Based RL Decision-Making for UAVs Operating in GNSS-Denied, Degraded Visibility Conditions with Limited Sensor Capabilities. Drones 2025, 9, 410. [Google Scholar] [CrossRef]
  2. Liu, Y.; Xie, S.; Zhang, Y. Cooperative Offloading and Resource Management for UAV-Enabled Mobile Edge Computing in Power IoT System. IEEE Trans. Veh. Technol. 2020, 69, 12229–12239. [Google Scholar] [CrossRef]
  3. Panigrahi, S.S.; Singh, K.D.; Balasubramanian, P.; Wang, H.; Natarajan, M.; Ravichandran, P. UAV-Based LiDAR and Multispectral Imaging for Estimating Dry Bean Plant Height, Lodging and Seed Yield. Sensors 2025, 25, 3535. [Google Scholar] [CrossRef]
  4. Xu, J.; Panagopoulos, D.; Perrusquía, A.; Guo, W.; Tsourdos, A. Generalising Rescue Operations in Disaster Scenarios Using Drones: A Lifelong Reinforcement Learning Approach. Drones 2025, 9, 409. [Google Scholar] [CrossRef]
  5. Tsachouridis, S.; Pavloudakis, F.; Sachpazis, C.; Tsioukas, V. Monitoring Slope Stability: A Comprehensive Review of UAV Applications in Open-Pit Mining. Land 2025, 14, 1193. [Google Scholar] [CrossRef]
  6. Feng, C.; Fan, J.; Liu, Z.; Jin, G.; Chen, S. Unmanned Aerial Vehicle Anomaly Detection Based on Causality-Enhanced Graph Neural Networks. Drones 2025, 9, 408. [Google Scholar] [CrossRef]
  7. Hong, M.; Wang, J.; Zhu, M.; Cao, S.; Nie, H.; Xu, X. Detection-Driven Gaussian Mixture Probability Hypothesis Density Multi-Target Tracker for Airborne Infrared Platforms. Sensors 2025, 25, 3491. [Google Scholar] [CrossRef]
  8. Cagnazzo, C.; Angelini, S. Vertical Temperature Profile Test by Means of Using UAV: An Experimental Methodology in a Karst Sinkhole of the Apulia Region (Italy). Meteorology 2025, 4, 15. [Google Scholar] [CrossRef]
  9. Al-Nabhan, N.; Alturkestani, R.; Belghith, A.; AlAloula, N. A Conflict Resolution Approach for Multiple Unmanned Aerial Vehicles. Electronics 2025, 14, 2247. [Google Scholar] [CrossRef]
  10. Alotaibi, T.; Jambi, K.; Khemakhem, M.; Eassa, F.; Bourennani, F. Outdoor Dataset for Flying a UAV an Appropriate Altitude. Drones 2025, 9, 406. [Google Scholar] [CrossRef]
  11. Wang, M.; Zhang, Z.; Gao, R.; Zhang, J.; Feng, W. Unmanned Aerial Vehicle (UAV) Imagery for Plant Communities: Optimizing Visible Light Vegetation Index to Extract Multi-Species Coverage. Plants 2025, 14, 1677. [Google Scholar] [CrossRef]
  12. Zhang, Y.; Wei, L.; Zhou, Y.; Kou, W.; Fauzi, S.S.M. Integrating UAV-RGB Spectral Indices by Deep Learning Model Enables High-Precision Olive Tree Segmentation Under Small Sample. Forests 2025, 16, 924. [Google Scholar] [CrossRef]
  13. Do-Duy, T.; Nguyen, L.D.; Duong, T.Q.; Khosravirad, S.R.; Claussen, H. Joint Optimisation of Real-Time Deployment and Resource Allocation for UAV-Aided Disaster Emergency Communications. IEEE J. Sel. Areas Commun. 2021, 39, 3411–3424. [Google Scholar] [CrossRef]
  14. Zhao, J.; Fan, S.; Zhang, B.; Wang, A.; Zhang, L.; Zhu, Q. Research Status and Development Trends of Deep Reinforcement Learning in the Intelligent Transformation of Agricultural Machinery. Agriculture 2025, 15, 1223. [Google Scholar] [CrossRef]
  15. Wang, Z.; Yang, K.; Wang, Y.; Zhu, Z.; Liang, X. Embrace the Era of Drones: A New Practical Design Approach to Emergency Rescue Drones. Appl. Sci. 2025, 15, 135. [Google Scholar] [CrossRef]
  16. Yang, W.; Han, Y.; Xu, Z.; Dai, Y.; Yu, W. Structural design and test of small land-air amphibious UAV. Dev. Innov. Mach. Electr. Prod. 2019, 32, 57–60. [Google Scholar] [CrossRef]
  17. Qi, J.; Song, D.; Shang, H.; Wang, N.; Hua, C.; Wu, Q.; Qi, X.; Han, J. Search and Rescue Rotary-Wing UAV and Its Application to the Lushan Ms 7.0 Earthquake. J. Field Robot. 2016, 33, 290–321. [Google Scholar] [CrossRef]
  18. Borik, A.; Kallangodan, A.; Farhat, W.; Abougharib, A.; Jaradat, M.A.; Mukhopadhyay, S. Caged Quadrotor Drone for Inspection of Central HVAC Ducts. In Proceedings of the 2019 Advances in Science and Engineering Technology International Conferences (ASET), Dubai, United Arab Emirates, 26 March–10 April 2019. [Google Scholar] [CrossRef]
  19. Khalil, A.; Jaradat, M.A.; Mukhopadhyay, S.; Abdel-Hafez, M.F. Autonomous Control of a Hybrid Rolling and Flying Caged Drone for Leak Detection in HVAC Ducts. IEEE/ASME Trans. Mechatron. 2024, 29, 366–378. [Google Scholar] [CrossRef]
  20. Guo, L.G.; Vishwesh, D.; Rahul, K.; Zhen, K.P.; Wei, Y.L.; Guo, D.G.; Wai, Y.Y. Fabrication of design-optimized multifunctional safety cage with conformal circuits for drone using hybrid 3D printing technology. Int. J. Adv. Manuf. Technol. 2022, 120, 2573–2586. [Google Scholar] [CrossRef]
  21. Eichhorn, C.; Jadid, A.; Plecher, D.A.; Weber, S.; Klinker, G.; Itoh, Y. Catching the Drone—A Tangible Augmented Reality Game in Superhuman Sports. In Proceedings of the 2020 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct), Recife, Brazil, 9–13 November 2020. [Google Scholar] [CrossRef]
  22. Kakuya, I.; Hiroyuki, A.; Kohei, O.; Akiya, K. Risk-benefit optimization of drone cages responsible for the safety and quality of drone services. In Proceedings of the 13th Conference on Transdisciplinary Science and Technology, Tokyo, Japan, 18 December 2022. [Google Scholar] [CrossRef]
  23. Triviño, I.V.; Skrivervik, A.K. Small antennas with broad beamwidth integrated on a drone enclosed in a protective structure. In Proceedings of the 2023 17th European Conference on Antennas and Propagation (EuCAP), Florence, Italy, 26–31 March 2023. [Google Scholar] [CrossRef]
  24. Xu, J. Application of Multispectral and Thermal Imaging Technologies in Drone Search and Rescue Missions. Trait. Signal 2024, 41, 2317–2326. [Google Scholar] [CrossRef]
  25. Semenyuk, V.; Kurmashev, I.; Lupidi, A.; Alyoshin, D.; Kurmasheva, L.; Cantelli-Forti, A. Advances in UAV detection: Integrating multi-sensor systems and AI for enhanced accuracy and efficiency. Int. J. Crit. Infrastruct. Prot. 2025, 49, 100744. [Google Scholar] [CrossRef]
  26. Dong, Y.; Zuo, K.; Han, S.; Zhang, Z. Structure design of fullerene-like wheel hub two wheeled throwing robot. Mach. Electron. 2020, 38, 66–71. [Google Scholar]
  27. Sun, J.; Zhang, S.; Kong, F. Design and Analysis for Umbrella-type Deployable Mechanisms Based on Spider Web Structures. China Mech. Eng. 2019, 30, 1613–1620. [Google Scholar] [CrossRef]
  28. Wang, J.; Li, A. Adaptive fault-tolerant control of multi-quadcopter UAV formation. J. Lanzhou Univ. Technol. 2024, 50, 69–76. [Google Scholar]
  29. Shi, P.C.; Xu, Z.W.; Wang, S.; Xiao, P. Study on adaptive fuzzy PID active suspension control in variable theory domain. Mech. Sci. Technol. Aerosp. Eng. 2019, 38, 713–720. [Google Scholar] [CrossRef]
  30. Fernando, T.; Chandiramani, J.; Lee, T.; Gutierrez, H. Robust adaptive geometric tracking controls on SO(3) with an application to the attitude dynamics of a quadrotor UAV. In Proceedings of the 2011 50th IEEE Conference on Decision and Control and European Control Conference, Orlando, FL, USA, 12–15 December 2011. [Google Scholar] [CrossRef]
  31. Chen, S.; Wang, C.W.; Zhang, Z.Y.; Ji, X.H.; Zhao, Z.K. Improved fuzzy PID method and its application in electro-hydraulic servo control. J. Mech. Electr. Eng. 2021, 38, 559–565. [Google Scholar] [CrossRef]
  32. Jia, Y.; Cao, T.; Bai, Y. Improved YOLOv5 lightweight binocular vision UAV obstacle avoidance algorithm based on Ghost module. Chin. J. Liq. Cryst. Disp. 2024, 39, 111–119. [Google Scholar] [CrossRef]
  33. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016. [Google Scholar] [CrossRef]
  34. Yu, Z.; Lei, Y.; Shen, F.; Zhou, S. Application of Improved YOLOv5 Algorithm in Lightweight Transmission Line Small Target Defect Detection. Electronics 2024, 13, 305. [Google Scholar] [CrossRef]
  35. Zhang, T.; Zhang, Y.; Xin, M.; Liao, J.; Xie, Q. A Light-Weight Network for Small Insulator and Defect Detection Using UAV Imaging Based on Improved YOLOv5. Sensors 2023, 23, 5249. [Google Scholar] [CrossRef] [PubMed]
  36. Han, K.; Wang, Y.; Tian, Q.; Guo, J.; Xu, C.; Xu, C. GhostNet: More Features from Cheap Operations. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020. [Google Scholar] [CrossRef]
  37. Zheng, Z.; Wang, P.; Liu, W.; Li, J.; Ye, R.; Ren, D. Distance-IoU loss: Faster and better learning for bounding box regression. AAAI Conf. Artif. Intell. 2020, 34, 12993–13000. [Google Scholar] [CrossRef]
  38. Liu, B.; Ma, X.; Wang, H.; Zhou, K. Design analysis methodology for electric-powered mini-UAV. J. Northwestern Polytech. Univ. 2005, 3, 396–400. [Google Scholar]
  39. Grasmeyer, J.; Keennon, M. Development of the Black Widow Micro Air Vehicle. In Proceedings of the 39th Aerospace Sciences Meeting and Exhibit, Reno, NV, USA, 8–11 January 2001. [Google Scholar] [CrossRef]
  40. Longo, G.; Cantelli-Forti, A.; Russo, E.; Lupia, F.; Strohmeier, M.; Pugliese, A. Collective victim counting in post-disaster response: A distributed, power-efficient algorithm via BLE spontaneous networks. Pervasive Mob. Comput. 2025, 106, 101997. [Google Scholar] [CrossRef]
  41. Pervez, F.; Qadir, J.; Khalil, M.; Yaqoob, T.; Ashraf, U.; Younis, S. Wireless Technologies for Emergency Response: A Comprehensive Review and Some Guidelines. IEEE Access 2018, 6, 71814–71838. [Google Scholar] [CrossRef]
  42. Muñoz-Castañer, J.; Counago Soto, P.; Gil-Castineira, F.; González-Castaño, F.J.; Ballesteros, I.; di Giovanni, A. Your Phone as a Personal Emergency Beacon: A Portable GSM Base Station to Locate Lost Persons. IEEE Ind. Electron. Mag. 2015, 9, 49–57. [Google Scholar] [CrossRef]
Figure 1. Three-dimensional image of the proposed UAV structure.
Figure 2. Overall structure diagram of the UAV.
Figure 3. A single isolated cage structure.
Figure 4. Structure diagram of the fullerene.
Figure 5. The influence of λ on the maximum equivalent force (Fₘₐₓ) in the overall structural system.
Figure 6. H-shaped quadcopter skeleton.
Figure 7. Structure diagram of UAV control based on fuzzy adaptive PID.
Figure 8. Control flowchart for UAV algorithms.
Figure 9. Ground station system flowchart.
Figure 10. Ghost module.
Figure 11. Ghost-BottleNeck.
Figure 12. Network structure of YOLOv5s-Ghost.
Figure 13. Experimental result of visible-light-based visual recognition and ranging.
Figure 14. Experimental result of infrared-based visual recognition.
Figure 15. Land walking experiment (a); flight experiment (b).
Table 1. The improved fuzzy control rule. Each cell lists Δkp/Δki/Δkd for the corresponding error E (columns) and error rate EC (rows).

| EC \ E | NB | NM | NS | ZO | PS | PM | PB |
|---|---|---|---|---|---|---|---|
| NB | PB/NB/PS | PB/NB/PS | PM/NB/ZO | PM/NM/ZO | PS/NM/ZO | PM/ZO/PB | PB/ZO/PB |
| NM | PB/NB/PS | PB/NB/NS | PM/NM/NS | PM/NM/NS | PS/NS/ZO | PM/ZO/PS | PB/ZO/PM |
| NS | PB/NB/ZO | PM/NM/NB | PM/NS/NM | PS/NS/NS | ZO/ZO/ZO | PS/PS/PS | PB/PS/PM |
| ZO | PM/NM/NB | PM/NS/NM | PS/NS/NM | ZO/ZO/NS | PS/PS/ZO | PS/PS/PS | PM/PM/PM |
| PS | PB/NS/NB | PM/NS/NM | PS/ZO/NS | PS/PS/NS | PS/PS/ZO | PM/PM/PS | PM/PM/PM |
| PM | PB/ZO/NM | PM/ZO/NS | PS/PS/NS | PM/PM/NS | PM/PM/ZO | PM/PB/PS | PB/PB/PS |
| PB | PB/ZO/PS | PS/ZO/ZO | PS/PS/ZO | PM/PM/ZO | PM/PB/ZO | PB/PB/PS | PB/PB/PB |
Table 2. Comparative study with analogous small unmanned aerial vehicles.

| Metric | Novel Amphibious Inspection UAV | Small Electric UAV [38] | Black Widow Micro Air Vehicle [39] |
|---|---|---|---|
| Max Payload (g) | 1000 | 800 | 80 |
| Endurance Speed (m/s) | 10 | 15 | 5 |
| Endurance at MTOW (min) | 40 | 10 | 30 |
| Maximum Demonstrated Speed (m/s) | 18 | 20 | 8 |
Table 3. Comparative study on relevant caged unmanned aerial vehicles.

| Caged UAV | Locomotion | Task | Autonomous | Main Sensor for Identification | Whether Cage Obstructs Sensors |
|---|---|---|---|---|---|
| Caged Drone for Central HVAC Ducts Inspection [18] | Slide and Fly | Inspection | Autonomous | Thermal camera | Yes |
| Hybrid Rolling and Flying Caged Drone [19] | Roll and Fly | Inspection | Autonomous | Thermal camera | Yes |
| Drone with Hybrid 3D-Printed Multifunctional Safety Cage Featuring Conformal Circuits [20] | Fly | - | Manual | 2D camera | Yes |
| Drone with Safety Cage Ensuring Service Reliability [22] | Fly | Inspection | Manual | 2D camera | Yes |
| Drone with Broad-Beamwidth Small Antennas and Protective Enclosure [23] | Fly | Evaluate the reliability of antennas | Manual | Printed antennas | Yes |
| Proposed Caged Drone | Roll and Fly | Search and rescue | Autonomous | Depth camera | No |
Table 4. Performance comparison of different models in detection.

| Model | mAP/% | Average Inference Time/ms | Parameters/M | FLOPs/G |
|---|---|---|---|---|
| YOLOv5s-Ghost | 77.8 | 0.7 | 5.4 | 11.5 |
| YOLOv5s | 78.7 | 0.9 | 7.2 | 16.5 |
| YOLOv5n | 74.4 | 1.3 | 1.9 | 4.5 |
| YOLOv5m | 80.8 | 6.7 | 20.8 | 48.2 |
| YOLOv5l | 81.4 | 11.1 | 46.1 | 107.9 |
| YOLOv7-tiny | 80.0 | 4.0 | 6.0 | 13.1 |
Table 5. Distance measurement results.

| α/m | β/m | e/% |
|---|---|---|
| 0.78 | 0.78 | 0 |
| 0.85 | 0.86 | 1.16 |
| 1.31 | 1.36 | 3.67 |
| 2.14 | 2.08 | 2.88 |
| 2.58 | 2.73 | 5.49 |
| 3.22 | 3.58 | 7.26 |
| 4.22 | 4.13 | 2.18 |
| 4.91 | 4.99 | 1.60 |
| 5.39 | 5.61 | 3.92 |
| 6.32 | 6.47 | 2.32 |
| 7.05 | 6.97 | 1.15 |
| 8.11 | 8.44 | 3.91 |
| 9.52 | 9.83 | 3.15 |
| 11.15 | 10.48 | 6.39 |
| 12.52 | 12.91 | 3.02 |
| 14.61 | 13.71 | 6.56 |
| 15.32 | 14.68 | 4.36 |
| 16.45 | 15.47 | 6.33 |
| 17.27 | 16.26 | 6.21 |
| 17.72 | 16.89 | 4.91 |
| 18.40 | 17.85 | 3.08 |
| 19.83 | 18.78 | 5.59 |
| 21.03 | 19.29 | 9.02 |
| 23.45 | 21.15 | 10.87 |
| 24.58 | 22.53 | 9.10 |
| 25.92 | 23.52 | 10.20 |
| 26.01 | 23.01 | 13.04 |
| 28.83 | 25.02 | 15.23 |
| 28.37 | 25.13 | 12.89 |