Article

Robust Tightly-Coupled Multi-Source Navigation Using Acoustic-Geometric Constraints for Underwater Vehicles in Tunnels

1
State Key Laboratory of Robotics and Intelligent Systems, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
2
Key Laboratory of Marine Robotics of Liaoning Province, Shenyang 110169, China
3
China Yangtze Power Co., Ltd., Yichang 443000, China
*
Author to whom correspondence should be addressed.
J. Mar. Sci. Eng. 2026, 14(12), 1097; https://doi.org/10.3390/jmse14121097 (registering DOI)
Submission received: 12 May 2026 / Revised: 11 June 2026 / Accepted: 12 June 2026 / Published: 13 June 2026
(This article belongs to the Section Ocean Engineering)

Abstract

Utilizing underwater vehicles for hydropower infrastructure inspection is increasingly vital. However, these GNSS-denied and confined environments pose significant navigation challenges: Inertial Navigation Systems (INSs) suffer cumulative drift, Doppler Velocity Logs (DVLs) face acoustic blind zones near walls, and visual navigation frequently fails in highly turbid waters. To address these issues, this paper proposes a tightly coupled multi-source (INS/acoustic/optical/vision) navigation algorithm leveraging prior wall geometry constraints. Developed within an Error-State Kalman Filter (ESKF) framework, the model seamlessly accommodates sensor spatiotemporal heterogeneity. To overcome optical failures, a structural surface constraint model is innovatively constructed using single-beam sonar ranging. The core contribution involves transforming sonar ranging data into 6-DOF spatial pose constraints based on the dam’s planar characteristics, effectively bounding the localization drift perpendicular to the surface. Field experiments at the hydropower station dam demonstrate that under extreme conditions with total visual failure, the proposed algorithm effectively constrains critical motion degrees of freedom. By maintaining the wall-tracking error within 0.08 m (Root Mean Square Error, RMSE)—which effectively represents the relative localization error given the known absolute position of the structural wall—this method significantly enhances the operational robustness and precision of close-wall inspections in extreme underwater environments.

1. Introduction

1.1. Research Background and Engineering Significance

1.1.1. Challenges in the Inspection of Large-Scale Hydropower Infrastructure

Over the past few decades, China’s water conservancy and hydropower sector has achieved significant structural engineering advancements, leading to the deployment of large-scale hydropower facilities. As the service life of these mega-projects increases, their operational and maintenance focus has gradually shifted from large-scale construction to refined management and maintenance. Serving as critical assets for flood control, power generation, and water resource allocation, large-scale hydropower infrastructures feature underwater foundation structural surfaces that endure immense hydrostatic pressure from hundred-meter-level water heads over prolonged periods. Furthermore, they are continuously exposed to sediment abrasion, flow shear stress, chemical dissolution, and the alternating wet–dry conditions caused by periodic fluctuations in reservoir water levels.
These complex, multi-physics coupled environments render the concrete surfaces of the dams highly susceptible to structural defects, such as micro-cracks, spalling, leakage, and reinforcement corrosion. If these defects are not precisely identified and repaired in their nascent stages, micro-cracks can rapidly propagate into penetrating structural damages under the high-pressure driving force of the “water wedging” effect. Ultimately, this poses a severe threat to the overall structural integrity and long-term service performance of the dams.

1.1.2. Limitations of Traditional Inspection Methods

Traditional underwater structural inspection primarily relies on commercial divers. However, the operating environments, such as diversion tunnels and the upstream face of hydropower dams, are extremely harsh: depths far exceed the limits of conventional air diving; water temperatures persistently remain low; and immense pressure gradients along with unpredictable local turbulence are prevalent. Not only do diver operations face extremely high risks of decompression sickness and physical exertion limits but their operational windows are also short and their coverage is limited. Furthermore, inspection results largely rely on subjective tactile sensations and blurred vision, making it difficult to establish standardized digital archives for subsequent comparative analysis. Therefore, utilizing Remotely Operated Vehicles (ROVs) or autonomous underwater vehicles (AUVs) equipped with high-definition optical cameras, sonars, and laser scanners to replace human labor for all-weather, full-coverage, close-wall refined inspections has become an urgent demand and an inevitable technological trend in the hydropower industry.

1.1.3. High-Precision Navigation: A Critical Technological Challenge for Intelligent Inspection

To achieve precise quantitative inspection of millimeter-level cracks by the vehicle, a prerequisite is that the robot must possess centimeter-level, high-precision navigation and localization capabilities. Only by acquiring the precise pose of the vehicle on the dam surface can the detected defect data be mapped onto the 3D digital model of the dam, enabling targeted tracking and life-cycle management of the defects. Within these specific “confined spaces”, conventional underwater localization technologies face severe “failure” challenges:
  • GNSS denial: Satellite signals are completely absorbed within a few centimeters of entering the water, resulting in a complete denial of absolute spatial positioning data.
  • Severe magnetic interference: The dense reinforcing steel mesh inside the underwater structures and the giant generator units generate strong magnetic field interference, rendering the magnetic compass unable to provide reliable heading references.
  • Acoustic multipath interference: The dam, acting as a massive acoustic reflector, combined with the water surface, bottom, and gate slot structures, forms a high-reverberation environment. This severely afflicts Ultra-Short Baseline (USBL) localization systems with multipath effects, causing severe localization outliers.
  • Limited optical visibility: The water in hydropower stations exhibits high turbidity. The strong scattering effect of suspended sediment particles on light (backscattering forms a “light curtain,” while forward scattering causes blurring) makes it difficult for visual SLAM algorithms based on natural texture features to operate stably, making them highly susceptible to tracking loss.
In light of this, exploring a high-precision autonomous navigation and localization method that can adapt to the turbid, highly interfered environments of diversion tunnels without relying on external base stations serves as the critical technological breakthrough point for realizing intelligent underwater structural inspection.

1.2. Analysis of the Current State of Research

Traditional “loosely-coupled” schemes integrate sensor data only at the position or velocity level, making it difficult to handle scenarios where a single sensor fails. In recent years, “tightly-coupled” technologies based on factor graph optimization and the Error-State Kalman Filter (ESKF) have become the mainstream. Xu et al. proposed the AQUA-SLAM system, which achieves tight coupling of DVL beam velocities, IMU pre-integration, and binocular visual features within a factor graph framework [1]. Zhang et al. introduced VIA-SLAM, which directly incorporates raw DVL beam observations into the visual feature tracking front-end, significantly enhancing the scale observability of the system in texture-less regions [2].
To address long-term drift, Song et al. proposed Acoustic-VINS, which introduces Long Baseline (LBL) acoustic ranging information as a constraint into a Visual–Inertial System (VINS) to achieve drift-free global localization [3]. Yang et al. developed a SINS/DVL/PS fusion algorithm based on a Robust Interactive Multiple Model (RIMM); by adaptively adjusting the observation noise covariance, they effectively suppressed the impact of DVL blind zones or outliers on navigation solutions [4]. Furthermore, to tackle motion blur in visual sensors during high-speed maneuvers, research in 2025 has begun exploring tightly coupled schemes involving event cameras, acoustics, and inertial navigation, utilizing the high dynamic range of event streams to compensate for the motion blur inherent in traditional imagery [5].
Zheng et al. developed an underwater image restoration network incorporating attention mechanisms, which significantly improved the contrast and feature extraction success rate for turbid images [6]. Ou et al. proposed the PL-VAP framework, which utilizes line features combined with acoustic depth information to partially resolve the tracking difficulties of point features on texture-less man-made structural surfaces [7]. Huang et al. achieved tight coupling of vision and acoustics on a bionic robotic shark, using acoustic features to compensate for the absence of visual data [8]. For hydropower dam inspection robots, we designed a robust control system based on an environmental disturbance observer, indirectly improving localization stability in high-flow, high-turbidity environments through robust algorithms [9]. In research on ship hull cleaning robots, Starbuck et al. proposed a Manifold Invariant Extended Kalman Filter [10]. This method utilizes known CAD models of the hull surface to construct manifold constraints, restricting state estimation to a specific curved surface and effectively eliminating normal drift. Similarly, recent studies have extended the Unscented Kalman Filter (UKF) to manifold spaces (Manifold-UKF), addressing the non-linear attitude estimation problem for AUVs during large maneuvers [11]. Ding et al. pointed out in a review that future underwater navigation will evolve toward “semantic–geometric” dual constraints; utilizing structural features of the environment (such as dam planes) to correct inertial navigation drift is a significant trend for low-cost, high-precision navigation [12]. While the concepts of observability-constrained EKFs (OC-EKF) and manifold assumptions have been explored in previous terrestrial and aerial SLAM literature, their application has been predominantly restricted to gravity-aligned ground planes utilized to bound vertical depth or altitude drift via simple Cartesian projections. 
Furthermore, existing sonar-based wall-following applications for autonomous underwater vehicles traditionally treat acoustic range measurements merely as external reactive inputs for lower-level PID distance control loops, offering no corrective feedback to the underlying dead-reckoning navigation solution.
The SVIn2 system, developed by Rahman et al., fuses sonar, vision, and inertial information, utilizing local geometric maps constructed by sonar to correct visual drift, representing a significant methodological framework in underwater multi-sensor geometric-constraint SLAM [13]. Hong et al. proposed the Inspection-NeRF algorithm, which uses Neural Radiance Fields (NeRF) to reconstruct various local images of dam surfaces, addressing the lack of detailed texture in traditional 3D reconstruction and providing rich priors for appearance-based relocalization [14]. Shaukat et al. introduced an ESKF fusion algorithm enhanced by Radial Basis Function (RBF) neural networks, achieving superior localization accuracy compared to traditional ESKF in highly nonlinear and uncertain underwater environments [15]. Moreover, recent advancements continue to emphasize the critical importance of geometric and acoustic constraints under severe sensor limitations. For instance, Cohen and Klein proposed a seamless underwater navigation framework designed to maintain system stability when DVL measurements are severely limited or unavailable due to complex bottom topographies [16]. Similarly, Li et al. introduced a dynamic stochastic model optimization method based on singular value decomposition (SVD) for underwater acoustic navigation, demonstrating that coupling geometric constraints (such as absolute depth priors) with stochastic optimization effectively suppresses acoustic error propagation [17]. Compared to these recent approaches, which primarily focus on maintaining horizontal velocity tracking or bounding 1D vertical drift, our proposed method further advances the field by analytically deriving the exact Jacobian that maps a 1D scalar acoustic range onto a structural planar prior. Chen et al. proposed a SLAM scheme for terrestrial environments based on road network topology and reflectivity enhancement; although the application scenarios differ, the strategy of utilizing structural features to constrain multi-source fusion offers valuable insights for navigation in structured underwater environments [18]. For low-cost miniaturized platforms, Xu et al. designed a low-cost visual–inertial odometry system, verifying the feasibility of implementing tightly coupled navigation on resource-constrained platforms [19]. Khan et al. systematically reviewed deep learning-based underwater object detection, highlighting challenges in low-light, high-turbidity environments [20]. Zhou et al. proposed a real-time target detection technique for complex underwater environments capable of identifying key landmarks (e.g., gate slots, cracks) in low signal-to-noise ratio images, providing semantic support for vision-aided navigation [21].
Despite significant progress in multi-sensor fusion, several gaps remain in extreme engineering environments:
  • Excessive reliance on vision: The current literature focuses predominantly on extracting geometric features via vision. When extreme turbidity leads to total visual failure, the robustness of these systems is severely challenged.
  • Insufficient acoustic geometric modeling: Systematic research is still lacking on how to construct high-precision wall manifold constraints using only single-beam sonar in visual-blind conditions and how to derive the corresponding analytical Jacobian matrices for the full 6-DOF state.

1.3. Main Work and Innovations

To address the critical vulnerabilities of existing tightly coupled acoustic–visual-inertial SLAM frameworks—which predominantly rely on feature-rich environments and fail catastrophically under simultaneous visual denial and acoustic bottom-tracking loss—this paper proposes a novel manifold-constrained Error-State Kalman Filter. The core theoretical innovation lies in the analytical derivation of a spatial constraint model that maps single-beam sonar scalar ranging data onto a predefined structural planar prior. Unlike traditional loosely coupled depth or altitude bounding methods, the proposed formulation derives the exact analytical Jacobian linking the 1D acoustic distance measurement to the full 6-DOF state of the vehicle. This mathematically rigorous coupling ensures that the vehicle’s translational drift perpendicular to the structural wall, as well as its rotational drift (pitch and roll), are deterministically bounded. Consequently, the proposed method guarantees stable state estimation even in totally featureless, highly turbid confined spaces where conventional visual–acoustic SLAM systems diverge. This study focuses on resolving perceptual robustness in engineering environments. The core contributions are as follows:
  • Construction of a deep fusion framework for heterogeneous sensors adapted to confined spaces: Considering the actual dynamic characteristics of the underwater vehicle, a multi-source information fusion model encompassing the IMU, DVL, depth sensor, vision, and ranging sonar was established. This framework primarily resolves the synchronization and alignment issues among disparate sensors concerning sampling frequencies (1 Hz–200 Hz), signal transmission delays, and physical installation deviations, thereby ensuring system stability under severe water flow disturbances.
  • Proposal of a strong acoustic constraint algorithm based on wall geometric priors: Utilizing the engineering characteristics of the regular geometric planes inherent in hydropower infrastructures and diversion tunnels, the ranging data from the single-beam sonar are transformed into spatial position constraints. By establishing a direct mathematical correlation between the sonar observations and the 6-DOF pose of the vehicle, this algorithm effectively corrects the localization drift perpendicular to the structural surface, preventing the vehicle from colliding with or deviating from the inspection area during operations.
  • Verification of the operational robustness of “acoustic-relaying” vision in highly turbid environments: Close-wall inspection experiments were conducted at the dam of a certain hydropower station under a high turbidity of approximately 400 NTU. The results demonstrate that under extreme conditions where the visual system completely fails due to the “light curtain effect,” the acoustic geometric constraints can take over the localization correction task in real time. This achieves a seamless transition from optical-vision-based navigation to acoustic-geometry-based navigation, ensuring the continuity of the inspection tasks.

2. System Modeling

2.1. Coordinate Frame Definitions

To achieve precise localization and operations of the vehicle in the complex environments of hydropower stations, a customized “intelligent underwater defect inspection robot for large-scale hydropower dams” was utilized as the experimental platform (Figure 1). This platform features a 6-DOF open-frame structure with vectorial propulsion capabilities, enabling it to execute complex maneuvers such as lateral translation, constant depth maintenance, constant altitude maintenance, and rotation around axes. Its core navigation sensor suite includes an Inertial Measurement Unit, a Doppler Velocity Log, a visual system (comprising a wide-angle camera and artificial lighting), a single-beam ranging sonar, and a depth sensor.
To accurately describe the motion state of the vehicle and its sensor observations, the following coordinate frames and their transformation relationships are defined:
  • Earth-Centered, Earth-Fixed frame (e-frame, ECEF): The origin is located at the center of mass of the Earth. It is used to describe the absolute geographical position of the dam on Earth.
  • Navigation frame (n-frame): A fixed water-entry point in the inspection area of the dam is selected as the origin, adhering to the North–East–Down (NED) convention. In this frame, the gravity vector is denoted as $g^n$. The real-time position $p^n$, velocity $v^n$, and attitude quaternion $q_b^n$ of the vehicle are all expressed in this frame.
  • Body frame (b-frame): The origin coincides with the measurement center of the IMU. The X-axis points to the front of the vehicle (longitudinal axis), the Y-axis points to the right (transverse axis), and the Z-axis points downward (vertical axis).
  • Sensor frames:
    • Camera frame (c-frame): The origin is located at the optical center of the camera, with the Z-axis pointing forward along the optical axis.
    • Sonar frame (s-frame): The origin is located at the center of the transducer surface, with the Z-axis pointing along the direction of acoustic wave emission.
    • The mounting positions (translation vectors) and angles (rotation matrices) of each sensor relative to the body frame (b-frame) were obtained through rigorous offline calibration and are used to compensate for the “lever-arm effect” caused by sensor installation deviations.

2.2. Strapdown Inertial Navigation System (SINS) Kinematics Equations

Serving as the core of the entire low-level navigation framework, the Strapdown Inertial Navigation System utilizes high-frequency measurements from the accelerometer and gyroscope for continuous integration and dead reckoning, providing the vehicle with high-update-rate ego-motion states. Let the nominal state vector of the underwater vehicle in the navigation frame be denoted as $x_{\mathrm{nom}} = [p^n, v^n, q_b^n]^T$. Considering that the operating speed of the ROV during dam surface inspections is relatively low (typically less than 0.5 m/s) and its motion range is relatively restricted, the Coriolis effect induced by Earth’s rotation is reasonably neglected in the continuous-time nominal kinematic model herein to reduce computational complexity. The kinematic equations can be expressed as
$$
\begin{aligned}
\dot{p}^n &= v^n \\
\dot{v}^n &= R_b^n\,(a_m - b_a) + g^n \\
\dot{q}_b^n &= \tfrac{1}{2}\, q_b^n \otimes [0,\ \omega_m - b_g]^T
\end{aligned}
$$
where $a_m$ and $\omega_m$ are the raw measurements of the three-axis accelerometer and gyroscope, respectively; $b_a$ and $b_g$ are the real-time biases of the accelerometer and gyroscope, respectively, which drift slowly over time and constitute the primary error sources leading to localization divergence during long-endurance underwater operations; $R_b^n$ is the Direction Cosine Matrix (DCM, or rotation matrix) converted from the current attitude quaternion $q_b^n$; and the symbol $\otimes$ denotes the quaternion multiplication operator.
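As a rough illustration of the nominal kinematics above, the following sketch performs one first-order (Euler) integration step in plain Python. The function names, the scalar-first Hamilton quaternion convention, and the Euler scheme are assumptions made for this example; the paper does not state which integration scheme or quaternion convention its implementation uses.

```python
import math

def quat_mul(q, r):
    """Hamilton product of two scalar-first quaternions (w, x, y, z)."""
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = r
    return (
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    )

def quat_rotate(q, v):
    """Rotate a body-frame vector into the navigation frame (R_b^n v)."""
    q_conj = (q[0], -q[1], -q[2], -q[3])
    _, x, y, z = quat_mul(quat_mul(q, (0.0, v[0], v[1], v[2])), q_conj)
    return (x, y, z)

def propagate_nominal(p, v, q, a_m, w_m, b_a, b_g, g_n, dt):
    """One Euler step of the nominal state (assumed scheme, not the paper's)."""
    f_b = tuple(a - b for a, b in zip(a_m, b_a))      # bias-corrected specific force
    w_b = tuple(w - b for w, b in zip(w_m, b_g))      # bias-corrected angular rate
    acc_n = tuple(f + g for f, g in zip(quat_rotate(q, f_b), g_n))
    p_new = tuple(pi + vi * dt for pi, vi in zip(p, v))
    v_new = tuple(vi + ai * dt for vi, ai in zip(v, acc_n))
    # q_dot = 0.5 * q (x) [0, w_b]
    dq = quat_mul(q, (0.0, w_b[0], w_b[1], w_b[2]))
    q_new = tuple(qi + 0.5 * dqi * dt for qi, dqi in zip(q, dq))
    norm = math.sqrt(sum(c * c for c in q_new))       # re-normalize to unit length
    return p_new, v_new, tuple(c / norm for c in q_new)
```

For a level, stationary vehicle in the NED frame, the accelerometer reads approximately $(0, 0, -g)$, so the rotated specific force cancels $g^n$ and the velocity stays at zero, which is a convenient sanity check for the sign conventions.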

2.3. Error-State Dynamics Modeling

In highly disturbed underwater environments, sensors inevitably exhibit noise and biases. If the Extended Kalman Filter is applied directly to the nonlinear nominal states above, the computational load is heavy and the nonlinearity of the attitude easily leads to singularity in the covariance matrix, which in turn causes filter divergence. Although modern sliding-window factor graph optimization (FGO) provides superior global consistency by jointly optimizing historical trajectories and iteratively minimizing sensor residuals, the Error-State Kalman Filter (ESKF) architecture was deliberately selected as the core estimation framework for this highly dynamic underwater application. The rationale is driven by the specific constraints of the operational environment. First, FGO architectures rely heavily on the continuous extraction of distinct environmental landmarks; under the simultaneous visual denial and loss of acoustic features modeled in this study, nonlinear optimization backends are starved of features, leading to severe ill-conditioning and divergence. Second, the nonlinear continuous-time kinematics of the vehicle require deterministic, low-latency state feedback to maintain stable hydrodynamic control within turbulent boundary layers. The recursive structure of the ESKF provides this deterministic execution without the computational overhead of historical marginalization, ensuring the stable state output that safe close-wall operations require. Its core engineering philosophy lies in decomposing the true physical state $x_{\mathrm{true}}$ into a nominal state $x_{\mathrm{nom}}$ obtained through high-frequency integration and a low-frequency error state $\delta x$ to be estimated.
The 15-dimensional error state vector of the underwater vehicle is defined as
$$
\delta x = [\delta p^n,\ \delta v^n,\ \delta\theta,\ \delta b_a,\ \delta b_g]^T
$$
where the position error $\delta p^n$ and velocity error $\delta v^n$ are represented by standard three-dimensional vectors; the attitude error $\delta\theta$ is represented by a three-dimensional rotation vector, avoiding the over-parameterization problem associated with quaternions; and $\delta b_a$ and $\delta b_g$ denote the random walk errors of the sensor biases.
By performing a first-order perturbation analysis on the nominal kinematics, the continuous-time error-state differential equations are rigorously defined as $\dot{\delta x} = F\,\delta x + G\,w$. The $15 \times 15$ continuous-time state transition matrix $F$ is analytically constructed to explicitly capture the complex cross-coupling between attitude misalignment and specific force projections:
$$
F = \begin{bmatrix}
0_{3\times3} & I_{3\times3} & 0_{3\times3} & 0_{3\times3} & 0_{3\times3} \\
0_{3\times3} & 0_{3\times3} & -[R_b^n f^b]_\times & -R_b^n & 0_{3\times3} \\
0_{3\times3} & 0_{3\times3} & -[\omega_{in}^n]_\times & 0_{3\times3} & -R_b^n \\
0_{3\times3} & 0_{3\times3} & 0_{3\times3} & 0_{3\times3} & 0_{3\times3} \\
0_{3\times3} & 0_{3\times3} & 0_{3\times3} & 0_{3\times3} & 0_{3\times3}
\end{bmatrix}
$$
where $R_b^n$ represents the direction cosine matrix from the body frame to the navigation frame, $f^b$ denotes the specific force measured by the accelerometer, $\omega_{in}^n$ is the angular rate of the navigation frame, and $[\cdot]_\times$ denotes the skew-symmetric matrix operator. The process noise vector $w = [n_a, n_g, n_{b_a}, n_{b_g}]^T$ is mapped to the state derivative via the noise driving matrix $G$.
For digital implementation at the IMU sampling interval $\Delta t$, the continuous-time system is discretized. The discrete-time state transition matrix $\Phi_k$ is approximated via the first-order Taylor expansion $\Phi_k \approx I_{15\times15} + F\,\Delta t$. The covariance prediction is recursively updated as
$$
P_{k|k-1} = \Phi_{k-1}\, P_{k-1|k-1}\, \Phi_{k-1}^T + G_{k-1}\, Q\, G_{k-1}^T\, \Delta t
$$
where $Q$ represents the diagonal process noise covariance matrix derived directly from the IMU’s intrinsic Allan variance parameters.
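The prediction step described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The block signs in `build_F` assume a navigation-frame attitude error and bias errors defined as true minus nominal. The layout of `build_G` is also an assumption, since the paper does not write out $G$. All helper names are hypothetical.

```python
def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]

def identity(n):
    return [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]

def matmul(A, B):
    cols_B = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols_B] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def skew(v):
    x, y, z = v
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def set_block(M, i, j, B, sign=1.0):
    """Write sign * B into the (i, j)-th 3x3 block of M."""
    for r in range(3):
        for c in range(3):
            M[3 * i + r][3 * j + c] = sign * B[r][c]

def build_F(R_bn, f_b, w_in_n):
    """15x15 error-state matrix, state order [dp, dv, dtheta, dba, dbg]."""
    f_n = [sum(R_bn[r][c] * f_b[c] for c in range(3)) for r in range(3)]
    F = zeros(15, 15)
    set_block(F, 0, 1, identity(3))            # d(dp)/dt = dv
    set_block(F, 1, 2, skew(f_n), -1.0)        # -[R f^b]x (assumed sign)
    set_block(F, 1, 3, R_bn, -1.0)             # accelerometer bias coupling
    set_block(F, 2, 2, skew(w_in_n), -1.0)     # navigation-frame rate coupling
    set_block(F, 2, 4, R_bn, -1.0)             # gyroscope bias coupling
    return F

def build_G(R_bn):
    """15x12 noise matrix for w = [n_a, n_g, n_ba, n_bg] (assumed layout)."""
    G = zeros(15, 12)
    set_block(G, 1, 0, R_bn, -1.0)
    set_block(G, 2, 1, R_bn, -1.0)
    set_block(G, 3, 2, identity(3))
    set_block(G, 4, 3, identity(3))
    return G

def predict_covariance(P, F, G, Q, dt):
    """P_pred = Phi P Phi^T + G Q G^T dt, with Phi = I + F dt."""
    n = len(P)
    Phi = [[(1.0 if r == c else 0.0) + F[r][c] * dt for c in range(n)] for r in range(n)]
    A = matmul(matmul(Phi, P), transpose(Phi))
    B = matmul(matmul(G, Q), transpose(G))
    return [[A[r][c] + B[r][c] * dt for c in range(n)] for r in range(n)]
```

Because no external observation enters this step, repeated calls only grow the covariance, which is the behavior the text attributes to pure IMU dead reckoning.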
During the observation phase, once the geometric residual $r_k$ and the analytical observation Jacobian $H_{\mathrm{geo}}$ are established, the standard ESKF correction is computed via the Kalman gain $K_k$:
$$
\begin{aligned}
K_k &= P_{k|k-1} H_{\mathrm{geo}}^T \left( H_{\mathrm{geo}} P_{k|k-1} H_{\mathrm{geo}}^T + R \right)^{-1} \\
\delta x_k &= K_k\, r_k \\
P_{k|k} &= \left( I - K_k H_{\mathrm{geo}} \right) P_{k|k-1}
\end{aligned}
$$
where $R$ defines the scalar sonar measurement noise covariance. Upon completion of the update step, the estimated error state $\delta x_k$ is injected to correct the nominal state, and the error state is subsequently reset to zero to preserve the validity of the small-error linearization assumption.
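Because the sonar range is a scalar, the innovation covariance in the update above is a single number and no matrix inversion is needed. The sketch below illustrates this special case in plain Python. The function name `eskf_scalar_update` and the representation of $H_{\mathrm{geo}}$ as a flat list are assumptions for this example. Injection of the attitude component depends on the rotation-error convention, which the text does not specify, so it is left out.

```python
def eskf_scalar_update(P, H, r, R_var):
    """Kalman update for one scalar measurement.

    P     : n x n covariance (list of lists)
    H     : length-n Jacobian row (1 x n)
    r     : scalar residual (measured minus predicted)
    R_var : scalar measurement noise variance
    Returns the error-state estimate dx and the updated covariance.
    """
    n = len(P)
    PHt = [sum(P[i][j] * H[j] for j in range(n)) for i in range(n)]   # P H^T
    S = sum(H[i] * PHt[i] for i in range(n)) + R_var                  # H P H^T + R
    K = [x / S for x in PHt]                                          # Kalman gain
    dx = [k * r for k in K]                                           # dx = K r
    HP = [sum(H[i] * P[i][j] for i in range(n)) for j in range(n)]    # H P
    P_new = [[P[i][j] - K[i] * HP[j] for j in range(n)] for i in range(n)]
    return dx, P_new
```

After the call, the additive components of `dx` (position, velocity, biases) would be added to the nominal state and `dx` discarded, which corresponds to the reset to zero described in the text.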
This phase fully exploits the high-frequency characteristics of the IMU to maintain the continuity and smoothness of the vehicle’s state under water flow disturbances. However, pure IMU dead reckoning causes the covariance P to expand continuously over time; thus, it is imperative that subsequent external observations (sonar, vision) are relied on for feedback correction to constrain the divergence.
Furthermore, the computational efficiency of the proposed architecture was profiled to confirm its suitability for resource-constrained autonomous underwater platforms. During the field deployments, the complete tightly coupled ESKF algorithm was executed onboard the embedded processor. The profiling data indicate that the algorithm completes a full state prediction and multi-sensor observation update cycle in an average execution time of 1.5 ms, consuming less than 8% of a single CPU core’s capacity. This low computational footprint sustains a stable, synchronous state estimation output at the IMU frequency. Such low-latency, deterministic real-time performance is a critical prerequisite for the high-bandwidth feedback control systems required to stabilize the vehicle during aggressive close-wall maneuvers in turbulent boundary layers.

2.4. Spatiotemporal Alignment Mechanism for Heterogeneous Data

Prior to conducting the observation update, the “spatiotemporal heterogeneity” issue between the IMU and the acoustic/optical sensors must be resolved.
  • Time synchronization: The IMU sampling rate is 100 Hz, whereas vision and sonar typically operate at 1 Hz–30 Hz, accompanied by transmission and processing delays. A combined hardware–software synchronization mechanism that takes the IMU time axis as the primary reference is adopted in this paper. At the hardware level, unified trigger signals are transmitted via an FPGA. At the software level, to address transmission delays, linear interpolation is utilized to align the low-frequency sensor data with the nearest IMU timestamp. Let the visual observation timestamp be $t_{\mathrm{obs}}$, which falls between two IMU timestamps $t_k$ and $t_{k+1}$. The state at $t_{\mathrm{obs}}$ is predicted using the state at $t_k$ to calculate the residual, and the resulting correction is then carried forward to the current time step.
  • Spatial alignment: All observation data must be uniformly transformed into the body frame or the navigation frame for residual calculation via extrinsic parameter matrices (e.g., $R_c^b$, $p_c^b$), comprehensively accounting for the influence of the lever-arm effect on velocity and position observations.
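One possible reading of the time-synchronization step is sketched below in plain Python. It locates the IMU interval that brackets an observation timestamp, predicts the position at that timestamp from the state at $t_k$, and linearly interpolates two low-rate sensor samples onto an IMU timestamp. The function names, the constant-velocity prediction over the short gap, and the buffer layout are assumptions for illustration only.

```python
from bisect import bisect_right

def bracket_index(imu_times, t_obs):
    """Return k such that imu_times[k] <= t_obs < imu_times[k + 1]."""
    k = bisect_right(imu_times, t_obs) - 1
    if k < 0 or k >= len(imu_times) - 1:
        raise ValueError("t_obs lies outside the buffered IMU window")
    return k

def predict_position_at(imu_times, positions, velocities, t_obs):
    """Predict the position at t_obs from the state at t_k (constant velocity assumed)."""
    k = bracket_index(imu_times, t_obs)
    tau = t_obs - imu_times[k]
    return [p + v * tau for p, v in zip(positions[k], velocities[k])]

def interpolate_sample(t0, z0, t1, z1, t_imu):
    """Linearly interpolate two low-rate scalar samples onto an IMU timestamp."""
    alpha = (t_imu - t0) / (t1 - t0)
    return (1.0 - alpha) * z0 + alpha * z1
```

Linear interpolation of this kind is only appropriate for vector or scalar quantities such as range or position; attitude quaternions would need a spherical interpolation instead.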

3. Strong Acoustic Constraint Algorithm Based on Wall Geometric Priors

This constitutes the core algorithm proposed in this study to address the critical pain point of visual failure induced by highly turbid waters. Its fundamental principle is to utilize the relatively regular geometric structures of hydropower infrastructures as “never-disappearing” navigation landmarks.

3.1. Geometric Priors and Physical Constraint Mechanisms

At a micro-local scale during the inspection of diversion tunnel walls, the structural surface can be modeled with extreme precision as a geometric plane. This geometric feature serves as objective prior information, entirely unaffected by lighting conditions or water turbidity.
Let the equation of this plane in the navigation frame (n-frame) be defined as
$$
\pi:\ n^T p^n + D = 0
$$
where $n \in \mathbb{R}^3$ is the unit normal vector of the plane (pointing toward the water side; for an inclined structural surface, its Z-component is non-zero), and $D \in \mathbb{R}$ is the distance parameter from the origin to the plane. In practical field conditions, the absolute position of the infrastructure wall is known. Thus, these macroscopic planar parameters are acquired directly in advance from the highly accurate as-built CAD design drawings of the hydropower infrastructure. Any minor local parameter deviations or structural imperfections encountered during operations are inherently mitigated by the system’s robust outlier rejection mechanism, rendering the navigation filter largely insensitive to minor planar parameter errors.
During operations, the vehicle utilizes a single-beam ranging sonar to measure the perpendicular distance $\rho_{\mathrm{meas}}$ from the vehicle’s body center to the structural surface in real time. This seemingly simple scalar observation actually imposes a strongly coupled, high-dimensional geometric constraint on the position vector $(p_x^n, p_y^n, p_z^n)$ and attitude angles $(\phi, \theta, \psi)$ of the vehicle: the motion trajectory of the vehicle must be restricted to a manifold located at a specific distance from the structural surface. This implies that the position component of the vehicle perpendicular to the structural surface is effectively constrained and the variations in its pitch and roll angles are strictly restricted by the distance observation.

3.2. Construction of the Acoustic Geometric Constraint Model

Let the position of the vehicle’s IMU center in the navigation frame be $\mathbf{p}_b^n$, and the attitude rotation matrix be $\mathbf{R}_b^n$. The sonar is mounted on the vehicle body; its mounting position (lever arm) in the body frame ($b$-frame) is $\mathbf{t}_s^b$, and the acoustic wave emission direction in the sonar frame ($s$-frame) is $\mathbf{v}_s = [0, 0, 1]^T$ (assuming emission along the Z-axis). The mounting rotation matrix of the sonar frame relative to the $b$-frame is $\mathbf{R}_s^b$.
Assuming the distance measured by the sonar is $\rho$, the hit point $\mathbf{p}_{hit}^n$ of the acoustic wave on the structural surface can be expressed as
$$\mathbf{p}_{hit}^n = \mathbf{p}_b^n + \mathbf{R}_b^n \left( \mathbf{t}_s^b + \mathbf{R}_s^b \mathbf{v}_s \rho \right)$$
According to the geometric priors, the hit point $\mathbf{p}_{hit}^n$ must lie on the plane $\pi$, thus satisfying the plane equation:
$$\mathbf{n}^T \mathbf{p}_{hit}^n + D = 0$$
Substituting the hit point coordinates into the equation yields the nonlinear observation equation:
$$h_{geo}(\delta\mathbf{x}) = \mathbf{n}^T \left[ \mathbf{p}_b^n + \mathbf{R}_b^n \left( \mathbf{t}_s^b + \mathbf{R}_s^b \mathbf{v}_s \rho \right) \right] + D = 0$$
To integrate this geometric constraint into the Kalman filter, the residual $r_{geo}$ must be constructed. Here, the sonar measurement $\rho$ is treated as the observation $z$, and the residual is formulated through inverse derivation, representing the difference between the “actual measured distance” and the “theoretical distance inferred from the current state estimate”:
$$r_{geo} = \rho_{meas} - \hat{\rho}(\mathbf{x}_{nom})$$
where $\hat{\rho}$ is the theoretical sonar distance calculated based on the current nominal position $\hat{\mathbf{p}}_b^n$ and nominal attitude $\hat{\mathbf{R}}_b^n$. According to the geometric relationship of ray-plane intersection, $\hat{\rho}$ can be analytically computed by the following equation:
$$\hat{\rho} = -\frac{\mathbf{n}^T \hat{\mathbf{p}}_b^n + D + \mathbf{n}^T \hat{\mathbf{R}}_b^n \mathbf{t}_s^b}{\mathbf{n}^T \hat{\mathbf{R}}_b^n \mathbf{R}_s^b \mathbf{v}_s}$$
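The ray-plane intersection above can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the authors' implementation: the function name `predicted_range`, the wall placement, the lever arm, and the sonar mounting rotation are all assumed values chosen so that the expected range is easy to check by hand.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mat_vec(A, v):
    return [dot(row, v) for row in A]

def predicted_range(n, D, p_hat, R_bn, t_sb, R_sb, v_s=(0.0, 0.0, 1.0)):
    """Theoretical sonar range from the nominal state.

    rho_hat = -(n.p + D + n.(R_bn t_sb)) / (n.(R_bn R_sb v_s))
    """
    beam_n = mat_vec(R_bn, mat_vec(R_sb, v_s))   # beam direction in the n-frame
    lever_n = mat_vec(R_bn, t_sb)                # lever arm in the n-frame
    K = dot(n, beam_n)
    if abs(K) < 1e-6:
        raise ValueError("beam is nearly parallel to the plane")
    return -(dot(n, p_hat) + D + dot(n, lever_n)) / K

# Assumed geometry: wall at x = 0, normal along +X (water side), level vehicle.
n, D = [1.0, 0.0, 0.0], 0.0
p_hat = [2.0, 0.0, -5.0]
R_bn = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Assumed mounting: rotation about Y by -90 deg, so the sonar Z-axis maps to body -X.
R_sb = [[0.0, 0.0, -1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
t_sb = [-0.3, 0.0, 0.0]                          # sonar 0.3 m closer to the wall

rho_hat = predicted_range(n, D, p_hat, R_bn, t_sb, R_sb)
r_geo = 1.75 - rho_hat                           # residual for a made-up reading of 1.75 m
```

With these numbers the sonar sits 1.7 m from the wall and the beam is perpendicular to it, so the predicted range is 1.7 m and the residual for the made-up reading is 0.05 m.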

3.3. Derivation and Linearization of the Analytical Jacobian Matrix

To update the ESKF, it is necessary to compute the Jacobian matrix of the residual $r_{geo}$ with respect to the error state $\delta\mathbf{x}$, denoted as $H_{geo} = \partial r_{geo} / \partial \delta\mathbf{x}$. This is a critical step that directly determines whether the constraint can correctly amend the errors.

3.3.1. Partial Derivative with Respect to the Position Error $\delta\mathbf{p}$

Since the true position is formulated as $\mathbf{p}_b^n = \hat{\mathbf{p}}_b^n + \delta\mathbf{p}$, the theoretical observation $\hat{\rho}$ evaluated at the nominal state $\hat{\mathbf{p}}_b^n$ exhibits a linear relationship with the position error perturbation; we have
$$\frac{\partial \hat{\rho}}{\partial \delta\mathbf{p}} = -\frac{\mathbf{n}^T}{\mathbf{n}^T \hat{\mathbf{R}}_b^n \mathbf{R}_s^b \mathbf{v}_s}$$
The denominator term $\mathbf{n}^T \hat{\mathbf{R}}_b^n \mathbf{R}_s^b \mathbf{v}_s$ is the dot product of two unit vectors, namely the acoustic beam direction and the normal vector of the structural surface, and therefore equals the cosine of the angle between them. When the acoustic beam is emitted approximately perpendicularly to the structural surface, this value approaches −1. In this case, $\partial \hat{\rho} / \partial \delta\mathbf{p} \approx \mathbf{n}^T$. This intuitively indicates that the sonar constraint primarily observes and corrects the position component along the normal direction of the structural surface.

3.3.2. Partial Derivative with Respect to the Attitude Error $\delta\boldsymbol{\theta}$

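The attitude error $\delta\boldsymbol{\theta}$ affects $\mathbf{R}_b^n$. Utilizing $\mathbf{R} \approx \hat{\mathbf{R}} \left( \mathbf{I} + [\delta\boldsymbol{\theta}]_\times \right)$, the derivative of $\hat{\rho}$ is obtained via the chain rule.
Let the denominator be $K = \mathbf{n}^T \hat{\mathbf{R}}_b^n \mathbf{R}_s^b \mathbf{v}_s$ and the numerator be $M = -\left( \mathbf{n}^T \hat{\mathbf{p}}_b^n + D + \mathbf{n}^T \hat{\mathbf{R}}_b^n \mathbf{t}_s^b \right)$.
Thus, $\hat{\rho} = M / K$.
Taking the derivative of the numerator,
$$\frac{\partial M}{\partial \delta\boldsymbol{\theta}} = \mathbf{n}^T \hat{\mathbf{R}}_b^n [\mathbf{t}_s^b]_\times$$
Taking the derivative of the denominator,
$$\frac{\partial K}{\partial \delta\boldsymbol{\theta}} = -\mathbf{n}^T \hat{\mathbf{R}}_b^n [\mathbf{R}_s^b \mathbf{v}_s]_\times$$
Consolidating these using the quotient rule,
$$\frac{\partial \hat{\rho}}{\partial \delta\boldsymbol{\theta}} = \frac{K \frac{\partial M}{\partial \delta\boldsymbol{\theta}} - M \frac{\partial K}{\partial \delta\boldsymbol{\theta}}}{K^2} = \frac{1}{K} \frac{\partial M}{\partial \delta\boldsymbol{\theta}} - \frac{M}{K^2} \frac{\partial K}{\partial \delta\boldsymbol{\theta}} = \frac{1}{K} \frac{\partial M}{\partial \delta\boldsymbol{\theta}} - \frac{\hat{\rho}}{K} \frac{\partial K}{\partial \delta\boldsymbol{\theta}}$$
Substituting and rearranging yields
$$\frac{\partial \hat{\rho}}{\partial \delta\boldsymbol{\theta}} = \frac{\mathbf{n}^T \hat{\mathbf{R}}_b^n \left[ \mathbf{t}_s^b + \hat{\rho}\, \mathbf{R}_s^b \mathbf{v}_s \right]_\times}{\mathbf{n}^T \hat{\mathbf{R}}_b^n \mathbf{R}_s^b \mathbf{v}_s}$$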
Analysis of physical significance and observability: The derived Jacobian explicitly reveals the local observability properties and the geometric conditions of the system under this planar constraint. In the position block, any translational perturbation parallel to the structural plane yields a zero dot product with the normal vector, making along-track and depth translations unobservable solely from this single-beam sonar constraint. In the attitude block, governed by the cross-product term, a rotational perturbation parallel to the normal vector (analogous to the yaw degree of freedom relative to the wall) results in a null projection, placing yaw strictly in the unobservable subspace. Conversely, translations perpendicular to the plane (cross-track distance) and rotations orthogonal to the normal vector (pitch and roll) generate full-rank, non-zero projections in the measurement space. When the vehicle undergoes a slight tilt in pitch or roll, the measured distance $\rho$ changes significantly due to the amplification effect of the acoustic path length. By observing $\Delta\rho$, the filter can inversely correct these specific attitude errors. This concise observability analysis mathematically guarantees that the geometric constraint deterministically bounds the cross-track drift and suppresses pitch and roll divergence during long-endurance missions. The final observation matrix $H_{geo}$ is
$$H_{geo} = \left[ \frac{\partial \hat{\rho}}{\partial \delta\mathbf{p}},\ \mathbf{0}_{1 \times 3},\ \frac{\partial \hat{\rho}}{\partial \delta\boldsymbol{\theta}},\ \mathbf{0}_{1 \times 6} \right]$$
where the observation residual is $z = \rho_{meas} - \hat{\rho}$, and the observation model is $z = H_{geo}\, \delta\mathbf{x} + \nu$.
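The analytical Jacobian can be cross-checked numerically. The sketch below is an assumption-laden illustration, not the authors' code: it assumes a 15-dimensional error state ordered as position, velocity, attitude, and two bias blocks (as the zero blocks in $H_{geo}$ suggest), a right-multiplied attitude perturbation, and made-up geometry. The helper names and `d_b` (standing for $\mathbf{R}_s^b \mathbf{v}_s$) are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def mat_vec(A, v):
    return [dot(row, v) for row in A]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def predicted_range(n, D, p, R_bn, t_sb, d_b):
    """rho_hat = -(n.p + D + n.(R t)) / (n.(R d)), where d_b = R_s^b v_s."""
    K = dot(n, mat_vec(R_bn, d_b))
    return -(dot(n, p) + D + dot(n, mat_vec(R_bn, t_sb))) / K

def geo_jacobian(n, D, p, R_bn, t_sb, d_b):
    """Analytical 1x15 row [d rho/d dp, 0_3, d rho/d dtheta, 0_6]."""
    n_b = mat_vec(transpose(R_bn), n)      # n^T R_bn written as a body-frame vector
    K = dot(n_b, d_b)
    rho = predicted_range(n, D, p, R_bn, t_sb, d_b)
    d_dp = [-ni / K for ni in n]
    arm = [t + rho * d for t, d in zip(t_sb, d_b)]   # t_s^b + rho_hat R_s^b v_s
    # The row n_b^T [arm]_x equals (n_b x arm)^T by the scalar triple product.
    d_dtheta = [c / K for c in cross(n_b, arm)]
    return d_dp + [0.0] * 3 + d_dtheta + [0.0] * 6

def perturbed(R_bn, dtheta):
    """First-order right perturbation R_hat (I + [dtheta]_x)."""
    x, y, z = dtheta
    return mat_mul(R_bn, [[1.0, -z, y], [z, 1.0, -x], [-y, x, 1.0]])

def numeric_jacobian(n, D, p, R_bn, t_sb, d_b, eps=1e-6):
    """Central differences over position, then attitude perturbations."""
    row = []
    for i in range(3):
        step = [eps if j == i else 0.0 for j in range(3)]
        plus = predicted_range(n, D, [a + s for a, s in zip(p, step)], R_bn, t_sb, d_b)
        minus = predicted_range(n, D, [a - s for a, s in zip(p, step)], R_bn, t_sb, d_b)
        row.append((plus - minus) / (2 * eps))
    for i in range(3):
        step = [eps if j == i else 0.0 for j in range(3)]
        plus = predicted_range(n, D, p, perturbed(R_bn, step), t_sb, d_b)
        minus = predicted_range(n, D, p, perturbed(R_bn, [-s for s in step]), t_sb, d_b)
        row.append((plus - minus) / (2 * eps))
    return row

# Illustrative geometry; every value below is an assumption.
yaw, pitch = math.radians(10.0), math.radians(5.0)
Rz = [[math.cos(yaw), -math.sin(yaw), 0.0], [math.sin(yaw), math.cos(yaw), 0.0], [0.0, 0.0, 1.0]]
Ry = [[math.cos(pitch), 0.0, math.sin(pitch)], [0.0, 1.0, 0.0], [-math.sin(pitch), 0.0, math.cos(pitch)]]
R_bn = mat_mul(Rz, Ry)
n, D = [1.0, 0.0, 0.0], 0.0
p = [2.0, 1.0, -5.0]
t_sb = [-0.3, 0.1, 0.05]
d_b = [-1.0, 0.0, 0.0]                     # beam direction R_s^b v_s, pointing at the wall

H = geo_jacobian(n, D, p, R_bn, t_sb, d_b)
H_num = numeric_jacobian(n, D, p, R_bn, t_sb, d_b)
max_diff = max(abs(H[i] - H_num[k]) for k, i in enumerate([0, 1, 2, 6, 7, 8]))
```

Projecting the attitude block `H[6:9]` onto the body-frame normal gives zero up to rounding, which matches the statement that rotation about the wall normal is unobservable from this constraint alone.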

3.4. Adaptive Robust Mechanism for Strong Constraints

In actual engineering environments, structural surfaces are rarely perfectly ideal planes. Micro-irregularities, construction joints, and local structural defects will inherently cause instantaneous non-Gaussian noise or sudden spikes in the single-beam sonar ranging data. If forcibly applied through the coupled Jacobian matrix, these irregular distance measurements would artificially perturb the vehicle’s attitude estimation. To mitigate the influence of these local geometric anomalies, an adaptive robust mechanism based on the Mahalanobis distance was designed. Measurements falling within structural defects trigger the threshold and are subsequently down-weighted via covariance inflation or rejected entirely. This allows the continuous-time IMU prediction model to smoothly bridge the defect gap without compromising the integrity of the state estimation.
Calculate the test statistic:
$$\gamma = r_{geo}^T \left( H_{geo} P H_{geo}^T + R_{sonar} \right)^{-1} r_{geo}$$
where $R_{sonar}$ is the measurement noise covariance of the sonar, and $P$ is the current error covariance matrix.
Set a threshold $\chi_\alpha^2$ (based on the chi-square distribution with 1 degree of freedom and a 95% confidence level).
  • If $\gamma < \chi_\alpha^2$: The measurement is considered valid, and a standard Kalman update is executed.
  • If $\gamma \geq \chi_\alpha^2$: The measurement is identified as a geometric outlier. At this point, rather than directly discarding the data, a Huber kernel function or covariance inflation strategy is employed to scale up $R_{sonar}$ by a factor of $\kappa$, substantially reducing the weight of this observation. This approach not only prevents outliers from biasing the trajectory but also retains partial information to maintain the stability of the filter.
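The gating rule above can be sketched for the scalar sonar measurement as follows. This is a minimal sketch under stated assumptions: it implements only the covariance-inflation branch (not a Huber kernel), the value `kappa=100.0` is hypothetical because the source does not give one, and 3.841 is the 95% quantile of the chi-square distribution with 1 degree of freedom. The function name `gated_sonar_update` is illustrative.

```python
CHI2_95_1DOF = 3.841  # 95% quantile of the chi-square distribution, 1 degree of freedom

def gated_sonar_update(P, H, r, R_sonar, kappa=100.0, threshold=CHI2_95_1DOF):
    """Scalar ESKF update with Mahalanobis gating and covariance inflation.

    P: n x n symmetric error covariance (list of lists)
    H: measurement row of length n
    r: scalar residual rho_meas - rho_hat
    Returns (dx, P_new, gamma, inflated).
    """
    n = len(H)
    PHt = [sum(P[i][j] * H[j] for j in range(n)) for i in range(n)]   # P H^T
    HPHt = sum(H[i] * PHt[i] for i in range(n))                       # H P H^T
    gamma = r * r / (HPHt + R_sonar)                                  # test statistic
    inflated = gamma >= threshold
    R_eff = kappa * R_sonar if inflated else R_sonar                  # down-weight outliers
    S = HPHt + R_eff
    K = [x / S for x in PHt]                                          # Kalman gain
    dx = [k * r for k in K]                                           # error-state correction
    # P_new = (I - K H) P; for symmetric P, the row H P equals (P H^T)^T.
    P_new = [[P[i][j] - K[i] * PHt[j] for j in range(n)] for i in range(n)]
    return dx, P_new, gamma, inflated

# Toy example: 15-state filter, measurement observing only the first state.
P0 = [[0.01 if i == j else 0.0 for j in range(15)] for i in range(15)]
H0 = [1.0] + [0.0] * 14
R0 = 0.05 ** 2                                   # sonar noise std of 0.05 m (Table 1)

dx_ok, P_ok, gamma_ok, inflated_ok = gated_sonar_update(P0, H0, 0.1, R0)
dx_bad, P_bad, gamma_bad, inflated_bad = gated_sonar_update(P0, H0, 1.0, R0)
```

In this toy case the 0.1 m residual passes the gate and is applied with the nominal noise, while the 1.0 m residual exceeds the threshold and is applied with the inflated noise, so it moves the state far less than a standard update would.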

4. Experimental Verification

4.1. Simulation Experiments and Result Analysis

To verify the effectiveness and robustness of the proposed tightly coupled navigation algorithm—which is based on the strong acoustic constraints of wall geometric priors—under high-turbidity and strong flow field environments, a high-fidelity underwater vehicle navigation simulation platform was constructed. By comparing the localization performance of conventional dead reckoning (DR), standard visual/inertial/acoustic tightly coupled algorithms, and the proposed algorithm under simulated harsh conditions, the performance advantages of the proposed method were quantitatively evaluated.

4.1.1. Simulation Platform and Environmental Modeling

The simulation system was developed in Python 3.7 and propagated the rigid-body kinematics and dynamics of the underwater vehicle in discrete time at an update rate of 100 Hz.
Specifically targeting the close-wall inspection scenarios, the following virtual environment was constructed:
  • Structural Surface Model: A vertical plane of 100 m × 100 m was constructed in the simulation space as an ideal structural surface, simulating the close-wall operating condition wherein the vehicle maintains a distance of approximately 2 m from the infrastructure.
  • Flow Field Model: To simulate the complex fluid environment adjacent to the infrastructure, a superimposed model comprising constant laminar flow and random turbulence was established. It is important to note that, in accordance with the strict operational safety regulations of major hydropower stations in China, actual underwater vehicle inspections are exclusively permitted during non-flood seasons in relatively calm water environments. Deployment under high flow velocity conditions is strictly prohibited by station authorities. Consequently, to accurately reflect these practical engineering constraints, a constant flow velocity of $v_{current}$ = 0.1 m/s along the tangential direction (Y-axis) of the structural surface was set, which represents the realistic maximum flow disturbance encountered during actual permissible field operations. This was superimposed with Gaussian white noise with a mean of 0 and a variance of 0.005 (m/s)² to act as local disturbances.
  • Visual Failure Zone: To simulate the “light curtain” effect induced by highly turbid waters, the simulation time interval t ∈ [100 s, 200 s] was designated as the “visual-denied interval”. During this period, the observation noise covariance matrix $R_{VO}$ of the Visual Odometry (VO) was forcibly increased to 100², simulating the localization divergence caused by feature tracking loss.
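The flow disturbance and the visual-denied schedule described in the list above can be sketched as follows. The constants come from the text and Table 1; the function names, the use of Python's `random` module, and the seed are assumptions made for illustration.

```python
import math
import random

V_CURRENT = 0.1                  # m/s, constant tangential flow along the Y-axis
TURBULENCE_VAR = 0.005           # variance of the zero-mean Gaussian disturbance
VO_STD_NORMAL = 0.1              # m, VO noise when vision is valid (Table 1)
VO_STD_FAILURE = 100.0           # m, VO noise in the visual-denied interval (Table 1)
VISUAL_DENIED = (100.0, 200.0)   # s, start and end of the visual-denied interval

def flow_velocity_y(rng):
    """One sample of the tangential flow: constant current plus Gaussian turbulence."""
    return V_CURRENT + rng.gauss(0.0, math.sqrt(TURBULENCE_VAR))

def vo_noise_variance(t):
    """VO measurement noise variance at simulation time t (seconds)."""
    start, end = VISUAL_DENIED
    std = VO_STD_FAILURE if start <= t <= end else VO_STD_NORMAL
    return std * std

rng = random.Random(0)           # fixed seed so the sketch is repeatable
samples = [flow_velocity_y(rng) for _ in range(20000)]
mean_flow = sum(samples) / len(samples)
```

Inflating the VO variance in this way lets the filter keep processing VO measurements while giving them almost no weight during the denied interval.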
The simulation object was an underactuated 4-DOF (x, y, z, yaw) underwater vehicle, equipped with the sensor configuration detailed in Table 1. To approximate actual engineering applications, the sensor data generation models incorporated biases, random walks, and Gaussian white noise.
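A sensor data generation model of the kind described (bias, random walk, and Gaussian white noise) might look like the sketch below. The class name `NoisySensor` and the bias random-walk intensity are hypothetical, since the source lists only the white-noise standard deviations in Table 1.

```python
import random

class NoisySensor:
    """Scalar sensor model: measurement = truth + bias + white noise.

    The bias follows a discrete random walk. bias_walk_std is a hypothetical
    parameter; the source does not list random-walk intensities.
    """

    def __init__(self, noise_std, bias_walk_std=0.0, initial_bias=0.0, seed=None):
        self.noise_std = noise_std
        self.bias_walk_std = bias_walk_std
        self.bias = initial_bias
        self._rng = random.Random(seed)

    def measure(self, truth, dt):
        # The random-walk increment scales with the square root of the time step.
        self.bias += self._rng.gauss(0.0, self.bias_walk_std) * dt ** 0.5
        return truth + self.bias + self._rng.gauss(0.0, self.noise_std)

# White-noise levels from Table 1; the gyro bias walk of 1e-5 is an assumed value.
gyro = NoisySensor(noise_std=0.001, bias_walk_std=1e-5, seed=1)   # rad/s at 100 Hz
sonar = NoisySensor(noise_std=0.05, seed=2)                       # m at 10 Hz

gyro_samples = [gyro.measure(0.0, 0.01) for _ in range(1000)]
sonar_samples = [sonar.measure(2.0, 0.1) for _ in range(1000)]
sonar_mean = sum(sonar_samples) / len(sonar_samples)
```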

4.1.2. Experimental Setup

Three comparative algorithms were designed to verify the contribution of the proposed algorithm in heterogeneous multi-source fusion. While the chosen baseline methods are fundamental, they were specifically selected to conduct an ablation verification—isolating and demonstrating the direct impact of the proposed acoustic–geometric constraint on the system’s stability:
  • Method 1: High-precision dead reckoning (DR baseline). This method utilizes a SINS/DVL/Depth/Compass combination, fusing velocity and depth information via EKF. This is the standard configuration for industrial ROVs, operating independently of visual information but suffering from cumulative errors.
  • Method 2: Standard tightly coupled ESKF (standard ESKF). This approach introduces VO observations based on the DR baseline. It maintains high precision when vision is normal but lacks external position correction during periods of visual failure.
  • Method 3: The proposed method (Proposed Manifold-ESKF). Building upon Method 2, this method introduces a manifold constraint for the structural surface based on single-beam sonar. By constructing the observation equation $h_{geo}(\delta\mathbf{x}) = 0$, sonar ranging is transformed into a strong position constraint along the direction perpendicular to the structural surface (X-axis).
The total simulation duration was 300 s. The vehicle executed a standard “lawnmower” trajectory, encompassing typical maneuvers such as constant-depth lateral translation, depth-changing descent, and attitude adjustment.
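A wall-parallel “lawnmower” path of this kind can be generated with a short helper. This is a sketch under assumed conventions (X is the stand-off from the wall, Y runs along the wall, and z is depth, positive downward); the function name and all numeric values are illustrative and are not the trajectory used in the paper.

```python
def lawnmower_waypoints(standoff, y_min, y_max, z_start, z_end, layer_spacing):
    """Waypoints (x, y, z) for a wall-parallel lawnmower scan.

    x is the constant stand-off from the wall. Each layer is a lateral pass
    along Y at constant depth z, followed by a descent to the next layer.
    """
    if layer_spacing <= 0:
        raise ValueError("layer_spacing must be positive")
    waypoints = []
    z = z_start
    left_to_right = True
    while z <= z_end + 1e-9:
        y_a, y_b = (y_min, y_max) if left_to_right else (y_max, y_min)
        waypoints.append((standoff, y_a, z))   # start of the lateral pass
        waypoints.append((standoff, y_b, z))   # end of the lateral pass
        left_to_right = not left_to_right      # reverse direction on the next layer
        z += layer_spacing                     # descend to the next layer
    return waypoints

# Illustrative values only: 2 m stand-off, 20 m wide passes, depths 2 m to 11 m.
path = lawnmower_waypoints(2.0, 0.0, 20.0, 2.0, 11.0, 3.0)
```

Consecutive waypoints alternate between constant-depth lateral translation and depth-changing descent, the two maneuvers named in the text.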

4.1.3. Comparative Analysis of Simulation Results

Figure 2 illustrates the 3D spatial trajectory comparison of the three algorithms under a flow disturbance of 0.1 m/s.
  • DR method (green dashed line): Due to the absence of absolute position observations, although the DVL and compass possess high precision, the vehicle’s position estimation still exhibited a slow but continuous drift over time. This cumulative error became particularly significant when overcoming flow resistance during lateral scanning.
  • Standard ESKF Method (blue dashed–dotted line): During the initial 100 s normal visual phase, the trajectory highly aligned with the ground truth. However, upon entering the visual-denied zone at 100 s–200 s (indicated by the orange shaded area), the filter rapidly degraded to a pure prediction mode due to the loss of the sole position correction source, resulting in severe trajectory divergence within a short period.
  • The proposed method (red solid line): During the visual failure period, the algorithm automatically adjusted weights and utilized the geometric constraints constructed by the sonar to effectively constrain the degree of freedom perpendicular to the structural surface. The trajectory indicates that even when VO was highly unreliable, the vehicle still moved closely along the predefined manifold surface without divergence, achieving a smooth transition.
To quantitatively evaluate precision, the RMSE of position for each algorithm was calculated, with the results shown in Figure 3.
  • Overall localization precision: The localization error of the proposed method consistently converged within 0.1 m. In contrast, the error of the DR method grew linearly over time to exceed 0.25 m.
  • Robustness analysis: After the visual interruption at t = 100 s, the slope of the error curve for the standard ESKF (blue) increased sharply, indicating a loss of system observability. Conversely, the error curve of the proposed method (red) exhibited only minor fluctuations, demonstrating that the sonar constraint effectively compensated for the uncertainty induced by the absence of vision.
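The position RMSE used in this comparison can be computed as sketched below. The exact evaluation script is not given in the source, so the function names and the toy trajectory are assumptions; `running_rmse` is one plausible way to obtain a curve of error over time.

```python
import math

def position_rmse(estimates, truths):
    """RMSE of the 3D position error over a whole trajectory."""
    if len(estimates) != len(truths) or not estimates:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared = [sum((e - t) ** 2 for e, t in zip(est, tru))
               for est, tru in zip(estimates, truths)]
    return math.sqrt(sum(squared) / len(squared))

def running_rmse(estimates, truths):
    """RMSE accumulated up to each sample, useful for plotting error growth."""
    curve, total = [], 0.0
    for k, (est, tru) in enumerate(zip(estimates, truths), start=1):
        total += sum((e - t) ** 2 for e, t in zip(est, tru))
        curve.append(math.sqrt(total / k))
    return curve

# Toy data: every estimate is off by 0.1 m along a single axis.
truths = [(0.0, 0.0, 0.0)] * 4
estimates = [(0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (0.0, 0.0, 0.1), (0.1, 0.0, 0.0)]
rmse = position_rmse(estimates, truths)
curve = running_rmse(estimates, truths)
```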
The core contribution of this study lies in utilizing geometric priors to restrict drift. Figure 4 specifically demonstrates the estimation error perpendicular to the structural surface (X-axis).
  • In the visual failure interval, the X-axis error of the standard method exhibited random walk characteristics, with a maximum deviation exceeding 1.0 m. This implies a high risk of the vehicle colliding with the infrastructure or deviating from the inspection distance in actual operations.
  • The X-axis error of the proposed method was consistently restricted within the coupling range of the sonar measurement noise and the DVL integration error, reducing the standard deviation by over 85%. This strongly proves that the manifold constraint model successfully compressed the vehicle’s state space onto a manifold cluster parallel to the structural surface, ensuring the safety of close-wall operations.

4.1.4. Simulation Analysis

The simulation results indicate that under a typical flow disturbance of $v_{current}$ = 0.1 m/s and a total visual denial condition lasting up to 100 s, the multi-source tightly coupled algorithm proposed in this paper exhibited superior performance:
  • Precision enhancement: Compared to the traditional DR method, the overall localization RMSE was reduced by approximately 60%.
  • Safety improvement: By introducing the sonar manifold constraint, the drift risk perpendicular to the structural surface was effectively eliminated, resolving the localization divergence issue caused by visual SLAM “tracking loss” in turbid waters.
  • Smooth transition: The algorithm seamlessly switched between states of visual validity and failure, ensuring the continuity of prolonged and large-scale inspection tasks on structural surfaces.

4.2. Field Experiment Validation

4.2.1. Experimental Environment

To comprehensively verify the performance of the proposed integrated navigation algorithm based on strong acoustic constraints in real-world confined environments, large-scale field experiments were conducted at the downstream structural surface of a hydropower station in October 2025 (Figure 5).
  • Turbidity: The water body adjacent to the infrastructure was highly turbid, exhibiting strong backscattering and extremely poor visibility.
  • Visual environment: With the artificial lighting fully activated, the effective visual distance was merely 0.3–0.5 m.
  • Experimental task: The ROV executed a “lawnmower” close-wall scanning inspection, with the target distance to the structural surface set at 1.0 m and a navigation speed of 0.2 m/s. It is important to emphasize that the localization error evaluated in these field tests refers specifically to the relative tracking error with respect to the known structural wall. The tracking performance metrics for the vehicle’s heading, depth, and cross-track distance to the wall are directly calculated by employing the respective target control commands as the absolute reference baseline (ground truth).

4.2.2. Analysis of Experimental Results

Figure 6 and Figure 7 illustrate the depth tracking performance. The current depth closely followed the target depth. Statistical analysis indicates that the Mean Absolute Error (MAE) was only 0.032 m, and the RMSE was 0.062 m. During step changes in the target depth, the system responded rapidly without overshoot. Despite vertical flow velocity disturbances, the error was consistently restricted to the centimeter level, proving the high stability of the vertical control.
Figure 8, Figure 9 and Figure 10 display the 3D navigation and localization trajectory, verifying the core role of the “manifold constraints”.
  • Front view (Y-Z plane): Figure 8 demonstrates a standard lawnmower scanning path covering water depths from 2 m to 13 m. The uniform layer spacing indicates that even in the event of visual failure, the fusion algorithm provided continuous and smooth position feedback.
  • Side view (Distance-Z plane): Figure 9 strongly proves the effectiveness of the proposed algorithm. The X-axis represents the distance to the structural surface, with data densely concentrated at depths of 7 m, 10 m, and 13 m. This corresponds to the vehicle conducting surface inspections at these respective depths, as illustrated in Figure 8. The close-wall tracking distance consistently converges around the target distance. This demonstrates that the sonar geometric manifold constraint successfully constrained the degree of freedom perpendicular to the structural surface. Compared to traditional schemes, the proposed algorithm utilizes planar geometric priors to restrict the localization error strictly within the sensor noise level.
  • Three-dimensional trajectory reconstruction: Figure 10 illustrates the spatial morphology of the close-wall scan, reflecting the algorithm’s decoupling capability between “in-plane motion” and “out-of-plane constraint”.
Figure 11 and Figure 12 quantitatively analyze the control errors under strong flow fields.
  • Heading stability: The vehicle consistently maintained heading stability (target value of approximately 135°). Although the water flow generated yaw moments, the fused localization provided accurate attitude estimation, with an overall heading MAE of approximately 2.56°.
  • Close-wall distance maintenance: The deviation between the measured close-wall distance and the target fluctuated within ±0.15 m for the vast majority of the time, achieving a Root Mean Square Error (RMSE) of 0.08 m (as further illustrated in Figure 12). It is important to note that because the absolute global position of the wall is known, the vehicle’s position is continuously corrected by the distance inferred from this wall tracking. Consequently, this wall-tracking error can represent the relative localization error to a large extent, thereby avoiding the ambiguity of claiming an absolute global localization error without an external independent ground-truth system. This effectively validates the localization precision along the cross-track axis and demonstrates that the navigation information fulfills the rigorous requirements of close-wall precision operations.
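The tracking metrics above use the target commands as the reference. A minimal sketch of how such metrics could be computed is given below; the sample values are invented for illustration, and the wrap-around handling for heading is an assumption about how angular errors would normally be treated rather than a detail stated in the source.

```python
import math

def wrap_angle_deg(angle):
    """Wrap an angle difference to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

def heading_mae(measured_deg, target_deg):
    """Mean absolute heading error with wrap-around handling."""
    errors = [abs(wrap_angle_deg(m - target_deg)) for m in measured_deg]
    return sum(errors) / len(errors)

def tracking_rmse(measured, target):
    """RMSE of a scalar signal (depth or wall distance) about a constant target."""
    return math.sqrt(sum((m - target) ** 2 for m in measured) / len(measured))

# Invented samples: -225 deg is the same heading as 135 deg.
headings = [133.0, 137.0, 135.0, -225.0]
distances = [1.05, 0.95, 1.10, 0.90]

mae_heading = heading_mae(headings, 135.0)
rmse_distance = tracking_rmse(distances, 1.0)
```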

4.2.3. Summary

Field experiments indicate that under the highly turbid and strong flow field conditions of hydropower stations, the proposed multi-source tightly coupled navigation algorithm exhibited excellent performance:
  • Effectiveness: Long-endurance, large-scale lawnmower scanning was successfully achieved with a smooth trajectory.
  • Robustness: By introducing manifold constraints, the limitations of visual failure were overcome, restricting the localization error perpendicular to the structural surface to the centimeter level.
  • Engineering practicability: The algorithm operates stably with good real-time performance, satisfying the demands of normalized intelligent inspection for hydropower infrastructures.

5. Conclusions

To address the localization challenges of underwater vehicles in highly turbid and confined waters, this paper proposes and validates a heterogeneous multi-source tightly coupled navigation algorithm based on strong acoustic constraints derived from structural surface geometric priors. Research indicates that the acoustic observation model, constructed utilizing planar geometric characteristics, effectively transforms single-point ranging into manifold constraints within the high-dimensional state space via analytical Jacobian matrices. Field experiments demonstrate that this mechanism successfully overcomes the limitations of visual failure, restricting the localization error perpendicular to the structural surface to within 0.08 m and significantly suppressing attitude divergence. Even under extreme conditions of simultaneous visual and acoustic bottom-tracking failures, the system maintains stable and reliable blind-navigation state estimation, substantially enhancing the safety of close-wall operations.
Although the algorithm exhibits exceptional robustness on continuous planar surfaces, it possesses theoretical limitations when navigating through tunnels with complex macroscopic geometries, sharp transitional corners, or heavily non-planar topological variations. When the vehicle traverses a structural corner, the single global planar assumption immediately becomes invalid, which could precipitate filter instability. To generalize this method to arbitrary complex tunnel topologies, the foundational architecture must transition from a static global plane to a dynamic, piecewise planar constraint model. Furthermore, establishing a comprehensive benchmarking framework to compare the proposed algorithm with recent state-of-the-art underwater localization methods, such as tightly coupled factor graph optimizations and sliding-window approaches, will be a primary objective in our subsequent studies. Finally, integrating deep learning to achieve environment-driven multi-modal adaptive weight adjustment is a key evolutionary direction for realizing smarter fusion strategies. This study not only fulfills the urgent demands for the refined inspection of hydropower infrastructures but also provides a highly valuable navigation paradigm for the intelligent operation and maintenance of various large-scale underwater structures with similar geometric features in marine engineering.

Author Contributions

Conceptualization, X.W. and B.Z.; methodology, X.W.; software, M.Y.; validation, L.L.; investigation, X.L.; resources, X.L.; data curation, T.M.; writing—original draft preparation, X.W.; writing—review and editing, all authors. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by State Key Laboratory of Robotics and Intelligent Systems, grant number 2025-Z13, and Natural Science Foundation of Liaoning Province, grant number 2024-BSBA-52.

Data Availability Statement

The datasets presented in this article are not readily available because they are subject to strict confidentiality agreements mandated by the hydropower station authorities. Requests to access the datasets should be directed to the corresponding author.

Acknowledgments

During the preparation of this manuscript, the authors used DeepSeek-v3 for the purposes of translating the original Chinese draft into academic English and polishing the language for clarity and grammatical correctness. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

Author Xinyu Li is employed by China Yangtze Power Co., Ltd.; the rest of the authors declare no conflicts of interest.

References

  1. Xu, S.; Zhang, K.; Wang, S. AQUA-SLAM: Tightly-Coupled Underwater Acoustic-Visual-Inertial SLAM With Sensor Calibration. IEEE Trans. Robot. 2025, 41, 2785–2803.
  2. Zhang, J.; Han, F.; Han, D.; Yang, J.; Zhao, W.; Li, H. Integration of Sonar and Visual–Inertial Systems for SLAM in Underwater Environments. IEEE Sens. J. 2024, 24, 16792–16804.
  3. Song, J.; Li, W.; Zhu, X. Acoustic-VINS: Tightly Coupled Acoustic-Visual-Inertial Navigation System for Autonomous Underwater Vehicles. IEEE Robot. Autom. Lett. 2024, 9, 1620–1627.
  4. Yang, H.; Gao, X.; Huang, H.; Li, B.; Jiang, J. A Tightly Integrated Navigation Method of SINS, DVL, and PS Based on RIMM in the Complex Underwater Environment. Sensors 2022, 22, 9479.
  5. Fan, J.; Liu, X.; Ou, Y.; Zhang, P.; Zhou, C.; Hou, Z. Underwater Robot Self-Localization Method Using Tightly Coupled Events, Images, Inertial, and Acoustic Fusion. IEEE Trans. Ind. Electron. 2025, 72, 5126–5135.
  6. Zheng, J.; Zhao, R.; Yang, G.; Liu, S.; Zhang, Z.; Fu, Y.; Lu, J. An Underwater Image Restoration Deep Learning Network Combining Attention Mechanism and Brightness Adjustment. J. Mar. Sci. Eng. 2024, 12, 7.
  7. Ou, Y.; Xia, S.; Fan, J.; Zhou, C.; Huang, Y.; Shuai, P.; Xue, Y.; Hou, Z. PL-VAP: A Tightly Coupled Self-Localization Framework for Underwater Robots Using Point-Line Features and Visual-Acoustic-Pressure Sensor Fusion. IEEE/ASME Trans. Mechatron. 2025, 30, 4116–4128.
  8. Huang, Y.; Li, P.; Yan, S.; Tan, M.; Yu, J.; Wu, Z. Self-Localization of a Biomimetic Robotic Shark Using Tightly Coupled Visual-Acoustic Fusion. IEEE Trans. Ind. Electron. 2024, 71, 12581–12591.
  9. Zhao, B.; Li, S.; Wang, X.; Yang, M.; Yu, X.; Meng, Z.; Wan, G. Design of Control System for Underwater Inspection Robot in Hydropower Dam Structures. J. Mar. Sci. Eng. 2025, 13, 1656.
  10. Starbuck, B.; Fornasier, A.; Weiss, S.; Pradalier, C. Consistent State Estimation on Manifolds for Autonomous Metal Structure Inspection. In 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China; IEEE: New York, NY, USA, 2021; pp. 10250–10256.
  11. Wang, Y.; Xie, C.; Liu, Y.; Zhu, J.; Qin, J. A Multi-Sensor Fusion Underwater Localization Method Based on Unscented Kalman Filter on Manifolds. Sensors 2024, 24, 6299.
  12. Ding, S.; Zhang, T.; Lei, M.; Chai, H.; Jia, F. Robust visual-based localization and mapping for underwater vehicles: A survey. Ocean Eng. 2024, 312, 119274.
  13. Rahman, S.; Li, A.Q.; Rekleitis, I. SVIn2: A multi-sensor fusion-based underwater SLAM system. Int. J. Rob. Res. 2022, 41, 1022–1042.
  14. Hong, K.; Wang, H.; Yuan, B. Inspection-Nerf: Rendering Multi-Type Local Images for Dam Surface Inspection Task Using Climbing Robot and Neural Radiance Field. Buildings 2023, 13, 213.
  15. Shaukat, N.; Ali, A.; Iqbal, M.J.; Moinuddin, M.; Otero, P. Multi-Sensor Fusion for Underwater Vehicle Localization by Augmentation of RBF Neural Network and Error-State Kalman Filter. Sensors 2021, 21, 1149.
  16. Cohen, N.; Klein, I. Seamless Underwater Navigation with Limited Doppler Velocity Log Measurements. IEEE Trans. Intell. Veh. 2024.
  17. Li, J.; Wang, J.; Xu, T.; Shu, J.; Liu, Y.; Ma, Y.; Xu, Y. Dynamic Stochastic Model Optimization for Underwater Acoustic Navigation via Singular Value Decomposition. J. Mar. Sci. Eng. 2025, 13, 1329.
  18. Chen, Z.; Zhu, H.; Yu, B.; Fu, X.; Jiang, C.; Zhang, S. Robust Multi-Sensor Fusion SLAM Based on Road Network and Reflectivity Enhancement. In 2024 7th International Conference on Advanced Algorithms and Control Engineering (ICAACE), Shanghai, China; IEEE: New York, NY, USA, 2024; pp. 1434–1439.
  19. Xu, Z.; Haroutunian, M.; Murphy, A.J.; Neasham, J.; Norman, R. A Low-Cost Visual Inertial Odometry System for Underwater Vehicles. In 2021 4th International Conference on Mechatronics, Robotics and Automation (ICMRA), Zhanjiang, China; IEEE: New York, NY, USA, 2021; pp. 139–143.
  20. Khan, A.; Fouda, M.M.; Do, D.T.; Almaleh, A.; Alqahtani, A.M.; Rahman, A.U. Underwater Target Detection Using Deep Learning: Methodologies, Challenges, Applications, and Future Evolution. IEEE Access 2024, 12, 12618–12635.
  21. Zhou, H.; Kong, M.; Yuan, H.; Pan, Y.; Wang, X.; Chen, R.; Lu, W.; Wang, R.; Yang, Q. Real-time underwater object detection technology for complex underwater environments based on deep learning. Ecol. Inform. 2024, 82, 102680.
Figure 1. Intelligent underwater defect inspection robot for large-scale hydropower dams.
Figure 2. Comparison of 3D trajectory tracking results for three algorithms.
Figure 3. RMSE in position tracking for the three algorithms.
Figure 4. Estimated error perpendicular to the dam face (yellow area indicates visual failure interval).
Figure 5. Field experiments at a hydropower station.
Figure 6. Depth tracking data.
Figure 7. Depth tracking error.
Figure 8. Front view: navigation trajectory (Y-Z).
Figure 9. Side view: wall-following (Distance-Z).
Figure 10. Three-dimensional navigation trajectory.
Figure 11. Heading control data.
Figure 12. Close-wall distance maintenance data.
Table 1. Sensor parameter configuration for simulation.
| Sensor | Update Rate (Hz) | Noise Standard Deviation (σ) | Remarks |
| --- | --- | --- | --- |
| IMU | 100 | Acc: 0.02 m/s²; Gyro: 0.001 rad/s | Contains bias random walk |
| DVL | 10 | 0.02 m/s | Simulates bottom-tracking mode |
| Depth Sensor | 10 | 0.02 m | Absolute depth observation |
| Magnetic Compass | 10 | 0.05 rad | Highly susceptible to magnetic interference |
| VO | 10 | Normal: 0.1 m; Failure: 100 m | Simulates pose observation |
| Single-Beam Sonar | 10 | 0.05 m | Core of the proposed constraint |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Citation: Wang, X.; Yang, M.; Zhao, B.; Ma, T.; Liu, L.; Li, X. Robust Tightly-Coupled Multi-Source Navigation Using Acoustic-Geometric Constraints for Underwater Vehicles in Tunnels. J. Mar. Sci. Eng. 2026, 14, 1097. https://doi.org/10.3390/jmse14121097