1. Introduction
Unmanned surface vehicles (USVs) and unmanned aerial vehicles (UAVs) offer intelligence, automation, and strong environmental adaptability, and are widely used in the monitoring and development of the marine hydrological environment. Conventional USVs can carry out long-duration autonomous navigation missions. However, their limited speed hinders highly maneuverable fixed-point monitoring. Furthermore, USVs have limited perception of dynamic marine environments and comparatively weak obstacle-avoidance and path-planning capabilities. These limitations pose navigational safety risks when USVs operate independently. UAVs, benefiting from high speed and excellent maneuverability, can rapidly conduct wide-area observations of the water surface, capturing high-definition images and video [1]. Although the autonomy and intelligence of UAVs continue to improve, their effectiveness in hydrological monitoring is significantly constrained by inherent limits on endurance (flight time) and payload capacity [2]. Coupling these two heterogeneous systems, with the USV serving as a docking platform for the UAV and the UAV as an observation platform for the USV, can effectively overcome the functional deficiencies of each individual system and significantly enhance overall functionality.
Several studies have addressed collaboration between UAVs and USVs. Shao designed a novel collaborative platform for a UAV-USV coupled system [3]. This platform employs a multi-ultrasonic joint dynamic positioning algorithm to resolve the localization challenges inherent in coupled unmanned reconnaissance systems, and uses a hierarchical waypoint-generation algorithm to guide UAV landings on the USV (shown in Figure 1a,b). Young proposed a UAV-USV-assisted measurement method [4] (shown in Figure 1c). This method combines high-resolution images captured by visual sensors mounted on the UAV with bathymetric readings acquired by sonar sensors deployed on the USV, enabling effective UAV-USV coordination. Sanchez Lopez used a Kalman filter to obtain robust UAV state estimation [5] (shown in Figure 1d), thereby enabling vision-based autonomous landing. Xu developed a third-order visual detection method to estimate the relative pose between the UAV and the USV [6]. This estimate was subsequently used to control the UAV’s landing on the USV, and the approach was validated through lake-based experiments.
Current research on collaborative applications of UAVs and USVs remains primarily focused on surface and aerial observations. However, the complexity of the marine hydrological environment manifests not only at the surface but also within the water column, which exhibits pronounced vertical gradients [7,8]. For instance, the vertical distributions of key oceanographic parameters (such as temperature, salinity, dissolved oxygen, and chlorophyll) are crucial for marine pollution monitoring, early warning systems, and climate change research. Consequently, existing UAV-USV collaborative paradigms exhibit a distinct functional gap in the monitoring of subsurface hydrological environments [9]. Hybrid aerial underwater vehicles (HAUVs), however, show potential for closing this gap through cross-domain monitoring.
A hybrid aerial underwater vehicle is an unmanned vehicle capable of operating both underwater and in the air. It possesses the ability to repeatedly cross the air–water interface using its own power and can perform aquatic and aerial tasks when equipped with appropriate sensors [10,11]. This vehicle combines the maneuverability typical of underwater robots with the high-speed advantage inherent in aerial vehicles. Multirotor amphibious UAVs achieve locomotion control through the operation of multiple rotors. They feature vertical take-off and landing as well as hovering capabilities, offering high flexibility and controllability. When optimized for oceanographic profile observation applications, they present an ideal platform for subsurface vertical column monitoring.
To achieve precise monitoring of marine hydrological parameters, this paper proposes a deeply integrated co-design of a multirotor hybrid aerial underwater vehicle with cross-domain capabilities and an unmanned surface vehicle [12,13]. This approach constructs an air–sea heterogeneous monitoring platform that leverages the complementary advantages of both systems (shown in Figure 2).
In expansive marine monitoring tasks, the USV provides basic platform support for the HAUV through its long endurance and stable surface platform. It assumes a core role in large-area cruising and in the final integration and transmission of data. Sensors deployed on the USV primarily collect fundamental surface environmental parameters, such as sea surface temperature and salinity, and perform preliminary large-scale environmental scans.
When the task shifts to monitoring the oceanographic vertical profile (for example, detecting vertical temperature gradients, haloclines, or chlorophyll concentration profiles within the target sea area), the advantages of the collaborative system become evident [14]. Under commands from either the USV or a remote control center, the multirotor HAUV carried by the USV, equipped with sensors tailored to the requirements, autonomously takes off and proceeds to the designated task area or coordinates [15]. Leveraging its high-speed aerial maneuverability, it rapidly reaches the airspace above the predetermined monitoring point. It then executes the entire cross-domain hydrological profile monitoring mission, encompassing targeted submergence, underwater observation, surfacing, data transmission, and docking back onto the USV.
When continuous monitoring of underwater profiles at multiple distinct locations is required, the USV acts as a mobile base, transporting the HAUV to the vicinity of the next target area, where the monitoring process described above is repeated [16]. This “Air–Sea–Air” cyclic operation mode is expected to address the gap in existing UAV-USV systems regarding the monitoring of marine hydrological vertical profiles, enabling comprehensive environmental perception from the water surface down into the water column.
It is crucial to emphasize that the system’s unique advantage over traditional monitoring methods lies in its capability for dynamic, large-scale, and multiparameter vertical profile monitoring, such as studying ocean stratification phenomena or pollutant dispersion. In such scenarios, conventional approaches prove inefficient. Therefore, the proposed HAUV-USV system does not aim to replace traditional methods but serves as a complementary solution, working alongside them to establish a “fixed-mobile” collaborative monitoring network. This integration combines the high temporal resolution of fixed buoys with the superior spatial and vertical resolution of mobile HAUV-USV systems, thereby achieving comprehensive and precise marine hydrological monitoring.
The HAUV-USV collaborative system faces new challenges, primarily in collaborative visual tracking and landing control. Addressing the complexities inherent in cross-domain collaboration, such as high-speed relative motion, strong surface reflection interference, and wave disturbances, is crucial. This paper presents a series of innovative contributions.
The remainder of this paper is structured as follows. Section 2 introduces the system architecture and cooperative control methods of the HAUV-USV heterogeneous platform. Section 3 details the dynamic modeling of the HAUV-USV heterogeneous platform, the relative motion model of the cooperative system, and key technologies such as collaborative tracking and cooperative landing. The experimental system, key experimental techniques, and validation procedures are presented in Section 4. Results and analyses are provided in Section 5, followed by conclusions and further discussions in Section 6.
2. HAUV-USV Collaborative System
The HAUV-USV platform constitutes a highly coupled heterogeneous autonomous system. As the unmanned surface vehicle and the hybrid aerial underwater vehicle represent distinct types of platforms, significant differences exist in their respective structural compositions, kinematic and dynamic models, and control methodologies [17,18]. Consequently, research into the structural configuration of both the USV and the HAUV, along with the system architecture and collaborative control technologies of this heterogeneous platform, is essential.
This section first introduces the collaborative system architecture of the HAUV-USV platform. The subsequent subsections describe the structural composition of the HAUV and of the USV individually. Finally, the collaborative control methodology employed by the heterogeneous system is described in detail.
2.1. Architecture of the HAUV-USV Collaborative System
The HAUV-USV collaborative system adopts a distributed architecture to achieve full-process coordination of sea–air cross-domain monitoring missions. The system employs the unmanned surface vehicle as a mobile base and data hub, and the hybrid aerial underwater vehicle as a front-end detection unit. By establishing heterogeneous interconnection, it forms a dynamic platform–mobile sensor cooperative framework [19]. As illustrated in the accompanying figure, the overall architecture adheres to core principles of modularity and scalability, supporting adaptive reconfiguration for different mission scenarios.
To address the computational demands of the Model Predictive Control (MPC) algorithm, particularly intensive optimization steps such as receding horizon optimization and constraint solving, the collaborative control system adopts a distributed computing architecture. The USV, as a mobile base with sufficient payload capacity and a stable power supply, is equipped with a high-performance embedded processor (NVIDIA Jetson) to handle these computationally heavy tasks. This processor, integrated within the USV’s central control compartment, executes the MPC optimization in real time, generating optimal control increments that account for marine disturbances.
The communication protocol between the NVIDIA Jetson and the Pixhawk flight controller supports bidirectional data exchange: the Jetson sends optimized control instructions to the Pixhawk, while the Pixhawk feeds back real-time state data of the HAUV to the high-level planner, forming a closed-loop control chain (shown in Figure 3).
2.2. Structural Composition of the HAUV Platform
2.2.1. Primary Structural Framework
The multirotor hybrid aerial underwater vehicle employs a modular composite structure. Its primary frame combines carbon fiber-reinforced resin matrix composites with an acrylic-sealed main compartment, balancing light weight with corrosion resistance. The upper section houses the flight control compartment, integrating the flight control system and power module. The lower section constitutes the underwater operational compartment, featuring a pressure-resistant spherical enclosure with a streamlined aerodynamic profile to minimize flight drag. This compartment withstands hydrostatic pressure equivalent to a depth of 100 m and is coated with a hydrophobic surface material to reduce hydrodynamic resistance during water entry. Detachable buoyancy modules, fabricated from closed-cell foam, are mounted at the ends of the rotor arms to provide buoyancy for ascent after submerged operations. A servo-actuated release mechanism installed beneath the main frame jettisons ballast weights upon reaching the predetermined operating depth, enabling controlled resurfacing.
To further validate the structural design and cross-domain mobility, we conducted field tests in a coastal shallow water area. During the tests, the HAUV successfully completed a full cycle of operations: taking off from the shore, flying to the designated offshore area, submerging into the water (reaching a maximum depth of 5 m), hovering underwater for 3 min, surfacing, and returning to the starting point. These tests directly demonstrated the reliability of the hardware integration—including the pressure-resistant compartment, buoyancy adjustment system, and cross-domain propulsion unit—and verified the basic feasibility of its air–water–air motion transitions in a near-shore marine environment.
The HAUV control system primarily adopts a distributed architecture, comprising the following components: a central control unit, data acquisition devices, communication modules, power management systems, and propulsion subsystems (shown in Figure 4). The ground station utilizes a remote controller to transmit real-time commands to the HAUV’s central unit via wireless radio transmission protocols. Prior to take-off, the amphibious HAUV’s operational depth profile is preconfigured; alternatively, it autonomously ascends upon detecting a predefined clearance from the seabed during missions [20]. The master control computer executes overall mobility control and manages measurement/storage of marine sensor data. By processing real-time navigation aid metrics (position, attitude), it dynamically identifies the current media environment (air/water) and operational phase of the mobility profile. This enables autonomous transitions between surface/submerged locomotion modes, as well as water–air/air–water media-crossing sequences.
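The medium identification described above can be illustrated with a small state machine. This is only a sketch under stated assumptions: it uses a single height-above-surface signal (negative when submerged) and two hysteresis thresholds, all of which are hypothetical and not taken from the paper, whose controller may combine several navigation metrics.

```python
from enum import Enum


class Medium(Enum):
    AIR = "air"
    WATER = "water"


# Hypothetical thresholds in metres relative to the water surface (assumed values).
ENTER_WATER_BELOW_M = -0.3  # switch to WATER once this far below the surface
ENTER_AIR_ABOVE_M = 0.3     # switch to AIR once this far above the surface


def update_medium(current: Medium, height_m: float) -> Medium:
    """Return the medium after one position sample, using hysteresis.

    The gap between the two thresholds keeps the estimate from chattering
    while the vehicle sits in the splash zone around the surface.
    """
    if current is Medium.AIR and height_m < ENTER_WATER_BELOW_M:
        return Medium.WATER
    if current is Medium.WATER and height_m > ENTER_AIR_ABOVE_M:
        return Medium.AIR
    return current


def classify_track(heights_m, start: Medium = Medium.AIR):
    """Classify every sample of a height track, carrying the state forward."""
    state = start
    result = []
    for height_m in heights_m:
        state = update_medium(state, height_m)
        result.append(state)
    return result
```

For a track that descends through the surface and climbs back out, the estimate changes only after the vehicle has clearly crossed the interface in either direction.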
This study employs the Pixhawk open-source control board—noted for its operational stability and ease of secondary development—as the core of the surface mobility control module for the amphibious vehicle. The board exhibits robust data processing capabilities, integrates multiple essential motion sensor modules, and provides extensive peripheral interfaces for expandability. It further enables bidirectional communication with ground stations via the MAVLink protocol. The ground segment centers on a computer running QGroundControl (QGC), an open-source ground station software. This platform facilitates mission trajectory planning, real-time monitoring, and—during indoor testing—direct real-time motion command input through connected joysticks. For field operations, multiple communication modalities exist between the onboard Pixhawk and ground station, including WiFi modules and telemetry radios. The system prioritizes telemetry radios for their greater range, enhanced stability, and superior field adaptability in amphibious vehicle deployments.
2.2.2. Cross-Domain Propulsion System
To achieve stable cross-domain transitions (air–water–air) for the HAUV, this study employs a staged transition methodology. During water entry, the vehicle’s attitude and velocity are precisely controlled to ensure an optimal angle of attack and contact speed with the water surface, thereby preventing propeller impact at the air–water interface. Externally mounted buoyancy modules then stabilize surface flotation, followed by rapid submersion via ballast weight deployment. For water-to-air transitions, the buoyancy control system enables finely tuned adjustments, allowing the vehicle to exit the water at a stabilized velocity and attitude before switching seamlessly to flight mode. To guarantee transition continuity and stability, the propulsion and control systems incorporate failover mechanisms that maintain consistent power output and control authority during media shifts. Detailed designs of the cross-domain control system are presented in Section 3.1.2.
2.2.3. Sensing Mechanism
The HAUV incorporates a modular payload bay at its base, enabling flexible deployment of diverse marine monitoring sensors (shown in Figure 5). This sensor compartment utilizes an acrylic spherical pressure-resistant housing rated for pressures up to 1.5 MPa (sufficient for the 100 m operating depth), supporting simultaneous operation of optical and acoustic sensors. The payload bay features an RS-485 standard communication interface at its apex and integrates a 24 V power bus and a USB 3.0 data port to ensure real-time data exchange between sensors and the vehicle’s central control system.
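As a rough check on the depth rating quoted above, the hydrostatic gauge pressure at a given depth follows from p = ρgh. The short sketch below assumes a typical seawater density of 1025 kg/m³ and g = 9.81 m/s²; these constants and the function names are illustrative assumptions, not values from the paper.

```python
RHO_SEAWATER_KG_M3 = 1025.0  # assumed typical seawater density
G_M_S2 = 9.81                # assumed gravitational acceleration


def gauge_pressure_mpa(depth_m: float) -> float:
    """Hydrostatic gauge pressure (MPa) at a depth below the surface."""
    return RHO_SEAWATER_KG_M3 * G_M_S2 * depth_m / 1e6


def rating_margin(rated_mpa: float, depth_m: float) -> float:
    """Ratio of the housing's rated pressure to the gauge pressure at depth."""
    if depth_m <= 0:
        raise ValueError("depth_m must be positive")
    return rated_mpa / gauge_pressure_mpa(depth_m)
```

Under these assumptions the gauge pressure at 100 m is about 1.0 MPa, so a 1.5 MPa rating would correspond to a margin of roughly 1.5 over the stated depth.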
Sensors implement an adaptive sampling strategy that dynamically adjusts acquisition frequency based on marine environmental gradients. When the vehicle submerges to predetermined depth intervals, the sampling rate increases to achieve high-density collection of vertical gradient data. Measurements from the HAUV’s embedded high-precision inertial measurement unit (IMU) are used by the control algorithms to compensate for deviations in sampling position induced by vertical motion dynamics.
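One way to picture the adaptive sampling rule is a function that returns a dense rate inside the predetermined depth intervals, or wherever the measured vertical gradient is steep, and a base rate elsewhere. The sketch below is a minimal illustration; the rates, the gradient threshold, and the function names are hypothetical and are not specified in the paper.

```python
def vertical_gradient(prev_depth_m, prev_value, depth_m, value):
    """Finite-difference gradient of a measured quantity with respect to depth."""
    dz = depth_m - prev_depth_m
    if dz == 0:
        return 0.0
    return (value - prev_value) / dz


def sampling_rate_hz(depth_m, gradient, dense_intervals,
                     base_hz=1.0, dense_hz=10.0, gradient_threshold=0.2):
    """Choose an acquisition rate for the current depth.

    dense_intervals is a list of (top_m, bottom_m) depth ranges in which
    high-density sampling is requested. All default values are assumptions.
    """
    in_dense_interval = any(top <= depth_m <= bottom
                            for top, bottom in dense_intervals)
    steep = abs(gradient) >= gradient_threshold
    return dense_hz if (in_dense_interval or steep) else base_hz
```

For example, with a single dense interval of 10 m to 20 m, a reading at 15 m is sampled at the dense rate regardless of the gradient, while a reading at 30 m is sampled densely only if the local gradient exceeds the threshold.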
2.3. Structural Composition of the USV Platform
The unmanned surface vehicle platform serves as the foundational infrastructure for the heterogeneous collaborative monitoring system. Its structural composition is engineered to fulfill requirements for extended endurance, stable operations, and cross-domain coordination (shown in Figure 6).
The primary structure employs a monolithic pressure-resistant hull integrated with an electric propulsion system to sustain prolonged surface cruising capabilities [21]. Within the hull, a central control compartment houses a data processing server and satellite communication module, enabling real-time environmental data integration and remote transmission. The sub-hull section accommodates a configurable standardized sensor array, including instruments such as sea surface temperature/salinity detectors and multiparameter water quality probes for acquiring fundamental surface environmental parameters [22]. Concurrently, a sonar system facilitates shallow subsurface environmental scanning. The modular architecture of the USV is illustrated in the accompanying figure.
The overall design prioritizes a low-center-of-gravity configuration and optimized weight distribution, conferring multiple operational advantages [23]. On one hand, the low-center-of-gravity structure significantly enhances hull stability in complex sea states. Even when subjected to wave impacts or strong wind disturbances, it effectively mitigates roll and pitch amplitudes, ensuring stable operation of both the deck-mounted sensor array and the HAUV docking platform. On the other hand, this design reduces acoustic interference caused by hull heave motion on sonar systems, thereby improving the accuracy of acoustic data acquisition during shallow subsurface environmental scans. Simultaneously, it provides a more stable reference plane for HAUV take-off and landing operations, enhancing the reliability of cross-domain collaborative missions.
The pivotal innovation resides in the HAUV docking platform at the vessel’s deck center. This platform incorporates a landing pad imprinted with fiducial markers for visual guidance. Leveraging a landing guidance algorithm, it enables HAUV trajectory pre-planning through visual recognition, providing multirotor amphibious drones with a cross-domain docking surface that ensures both stabilized positioning and decimeter accuracy.
Furthermore, the design allocates space for an integrated charging dock and modular payload bay. This facilitates rapid post-recovery energy replenishment and sensor reconfiguration—for example, swapping dissolved oxygen probes for underwater cameras—establishing the foundation for a unified monitoring framework where USV and HAUV systems operate synergistically, complementing each other’s capabilities.
2.4. HAUV-USV Collaborative Control System
To achieve collaborative integration of the HAUV and USV systems while ensuring proximity tracking accuracy and landing recognition precision, a relative motion model of the cooperative system must be established [24]. The architecture of the collaborative control system is illustrated in Figure 7.
The HAUV’s underwater operation mode follows a “predefined task + autonomous execution” mechanism, eliminating the need for real-time communication with the USV while submerged. During surface operations, the USV sends predefined task instructions to the HAUV, specifying target monitoring points, depth ranges, and sampling parameters. After deployment, the HAUV autonomously completes vertical profile monitoring (including temperature and salinity gradient data collection) according to pre-programmed procedures, with all data temporarily stored in onboard storage modules. Upon completing its underwater tasks and surfacing, the HAUV wirelessly transmits the data back to the USV and updates its operational commands.
First, coordinate system unification is accomplished through transformations: Position data from the USV’s local coordinate system is converted into the HAUV’s navigation frame via a Direction Cosine Matrix (DCM) using Euler angle rotation matrices. Leveraging the Ar Pose visual marker library within the Robot Operating System (ROS) framework, camera and body-fixed coordinate systems are aligned to achieve position offset compensation across heterogeneous platforms.
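The frame conversion described above can be sketched with a direction cosine matrix built from Euler angles. The example assumes the common Z-Y-X (yaw, pitch, roll) rotation sequence and a shared navigation frame for both vehicles; the paper does not state its convention here, and all function names are hypothetical.

```python
import math


def dcm_body_to_nav(roll, pitch, yaw):
    """DCM for an assumed Z-Y-X (yaw, pitch, roll) Euler sequence, in radians.

    Maps a vector expressed in a body-fixed frame into the navigation frame.
    """
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return (
        (cp * cy, sr * sp * cy - cr * sy, cr * sp * cy + sr * sy),
        (cp * sy, sr * sp * sy + cr * cy, cr * sp * sy - sr * cy),
        (-sp, sr * cp, cr * cp),
    )


def rotate(dcm, vector):
    """Multiply a 3x3 matrix by a 3-vector."""
    return tuple(sum(dcm[i][j] * vector[j] for j in range(3)) for i in range(3))


def usv_point_in_nav(point_usv, usv_origin_nav, usv_euler):
    """Express a point given in the USV's local frame in the navigation frame."""
    rotated = rotate(dcm_body_to_nav(*usv_euler), point_usv)
    return tuple(r + o for r, o in zip(rotated, usv_origin_nav))


def relative_offset(point_usv, usv_origin_nav, usv_euler, hauv_position_nav):
    """Offset from the HAUV to a USV-fixed point, in the navigation frame."""
    target = usv_point_in_nav(point_usv, usv_origin_nav, usv_euler)
    return tuple(t - h for t, h in zip(target, hauv_position_nav))
```

With a landing pad at the USV origin, `relative_offset` gives the position error the HAUV would need to null; a camera-to-body alignment step would add one more rotation of the same form.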
Subsequently, for collaborative tracking control, a Model Predictive Control (MPC) algorithm addresses multi-constrained challenges under marine wind–wave–current disturbances [25,26]. This approach formulates the linearized HAUV mathematical model into a discrete state-space representation, solving for optimal control increments through receding horizon optimization to enable real-time HAUV trajectory tracking relative to the USV’s reference path.
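To make the receding horizon idea concrete, the following is a minimal sketch of incremental-form linear MPC for a generic discrete model x[k+1] = A x[k] + B u[k], y[k] = C x[k]. It is not the paper's controller: the HAUV model, weights, and constraint set are not reproduced, the optimization is solved as unconstrained least squares, and the optional clipping of the first increment is a crude stand-in for proper constraint handling. All names and default weights are assumptions.

```python
import numpy as np


def build_prediction(A, B, C, horizon):
    """Prediction matrices for the state augmented with the previous input."""
    n, m = B.shape
    p = C.shape[0]
    A_aug = np.block([[A, B], [np.zeros((m, n)), np.eye(m)]])
    B_aug = np.vstack([B, np.eye(m)])
    C_aug = np.hstack([C, np.zeros((p, m))])
    Phi = np.zeros((horizon * p, n + m))
    Theta = np.zeros((horizon * p, horizon * m))
    A_pow = np.eye(n + m)
    impulse = []  # impulse[i] = C_aug @ A_aug^i @ B_aug
    for i in range(horizon):
        impulse.append(C_aug @ A_pow @ B_aug)
        A_pow = A_aug @ A_pow
        Phi[i * p:(i + 1) * p, :] = C_aug @ A_pow
    for i in range(horizon):
        for j in range(i + 1):
            Theta[i * p:(i + 1) * p, j * m:(j + 1) * m] = impulse[i - j]
    return Phi, Theta


def mpc_step(A, B, C, x, u_prev, reference, q=1.0, r=0.1, du_max=None):
    """Return the next input; only the first optimal increment is applied.

    reference has shape (horizon, p): the desired outputs over the horizon.
    """
    horizon = reference.shape[0]
    Phi, Theta = build_prediction(A, B, C, horizon)
    xi = np.concatenate([x, u_prev])
    error = reference.reshape(-1) - Phi @ xi  # reference minus free response
    H = q * Theta.T @ Theta + r * np.eye(Theta.shape[1])
    dU = np.linalg.solve(H, q * Theta.T @ error)
    du = dU[:B.shape[1]]
    if du_max is not None:
        du = np.clip(du, -du_max, du_max)  # simplification, not a constrained QP
    return u_prev + du
```

At each control cycle the function would be called again with the newest state estimate and a reference window sampled from the USV's path, which is what makes the horizon recede. A constrained formulation would replace the linear solve with a quadratic program.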
Finally, during collaborative landing control, computer vision positioning ensures continuous alignment above the landing marker’s centroid. This integrates with onboard attitude predictors estimating the USV’s orientation angles [27]. A phase-split control strategy dynamically adjusts landing maneuvers to counteract wave-induced vessel motions, achieving pinpoint autonomous docking despite hydrodynamic perturbations. Detailed system control methodologies are presented in Section 3.
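A phase-split landing strategy of the kind outlined above can be pictured as a supervisor that selects a phase from the visual offset to the marker, the height above the deck, and the predicted deck tilt. The phases, thresholds, and names below are hypothetical placeholders chosen for illustration; the paper's actual phase definitions are given in its Section 3.

```python
from enum import Enum, auto


class LandingPhase(Enum):
    APPROACH = auto()   # marker far away: fly toward the USV
    ALIGN = auto()      # hold height and centre over the marker
    DESCEND = auto()    # centred: reduce height above the deck
    TOUCHDOWN = auto()  # close to the deck and deck tilt acceptable


def select_phase(offset_m, height_m, predicted_tilt_deg, *,
                 marker_range_m=5.0, align_radius_m=0.3,
                 touchdown_height_m=0.4, max_tilt_deg=8.0):
    """Pick a landing phase from the current relative state.

    offset_m: horizontal distance to the marker centroid.
    height_m: height above the landing pad.
    predicted_tilt_deg: predicted deck tilt at the expected touchdown time.
    All threshold defaults are assumed values.
    """
    if offset_m > marker_range_m:
        return LandingPhase.APPROACH
    if offset_m > align_radius_m:
        return LandingPhase.ALIGN
    if height_m > touchdown_height_m:
        return LandingPhase.DESCEND
    if abs(predicted_tilt_deg) <= max_tilt_deg:
        return LandingPhase.TOUCHDOWN
    # Deck predicted to be too steep: hold position and wait for a calmer moment.
    return LandingPhase.ALIGN
```

The last branch reflects the role of the attitude predictor: touchdown is committed only when the deck is expected to be near level, otherwise the vehicle holds above the pad.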
6. Conclusions
This study establishes a deeply integrated heterogeneous monitoring platform through the co-design of a hybrid aerial underwater vehicle and an unmanned surface vehicle, effectively bridging the functional gap in marine hydrological vertical profile monitoring. The proposed system leverages the complementary capabilities of both platforms: the USV serves as a long-endurance mobile base for surface operations and data integration, while the HAUV executes rapid cross-domain transitions for high-resolution vertical gradient sensing across air–water interfaces. A distributed “Air–Sea–Air” cyclic operational architecture enables comprehensive environmental perception from surface to subsurface layers, supporting repeated missions such as temperature/salinity profiling and chlorophyll concentration mapping.
Key innovations include the development of a coupled HAUV-USV dynamic model that incorporates aerodynamic/hydrodynamic interactions during media transitions, alongside an MPC-based collaborative tracking algorithm that maintains real-time trajectory pursuit under marine disturbances. Experimental validation confirmed that the receding horizon optimization strategy effectively constrained tracking errors while balancing control stability and precision. Furthermore, a vision-guided synchronous landing methodology integrating Ar Pose marker localization and USV attitude prediction achieved decimeter-level docking accuracy under dynamic sea conditions.
While simulation results verify the system’s robustness in trajectory tracking and wave-rejection capabilities, future work will address system reliability in corrosive marine environments, add lightweight acoustic modules to demonstrate the potential advantages of underwater communication for system synergy, and extend the framework to multi-HAUV coordination scenarios. Field validation in open-ocean environments remains essential to evaluate performance under extreme hydrodynamic conditions. This work ultimately provides a foundational paradigm for autonomous, scalable oceanographic observation systems capable of capturing critical vertical gradient parameters previously inaccessible to conventional monitoring platforms.