MDPI - Publisher of Open Access Journals

15 pages, 4497 KB

Open Access Article

Sea Bottom Line Tracking in Side-Scan Sonar Images Using WTMM-Based Edge Detection

by Jisheng Ding, Fengbiao Jiang, Fangqi Wang and Long Yang

J. Mar. Sci. Eng. 2026, 14(11), 1002; https://doi.org/10.3390/jmse14111002 - 28 May 2026

Viewed by 200

The topographic features of the seafloor can be observed clearly via high-resolution side-scan sonar imagery. However, the faithful interpretation of a sonar image depends strongly on the accuracy with which the location of the sea bottom line can be tracked within the image, and current tracking methods function poorly under high sonar signal noise or suffer from high complexity. The present work addresses this issue by applying the characteristics of simple sonar waterfall maps in conjunction with robust edge detection and multi-scale analysis based on wavelet transform modulus maxima. The proposed tracking method is demonstrated to provide superior effectiveness and accuracy in comparison with existing baseline methods based on the results of experiments conducted with a representative side-scan sonar image with and without applied speckle noise. This superiority can be attributed to the good localization characteristics and multi-scale detection features of wavelet transform analysis, which can suppress the impact of noise in the sonar image on the accurate extraction of edge information. Full article
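The multi-scale idea described above can be illustrated with a minimal sketch. It is not the authors' WTMM implementation: it assumes a single ping is a list of intensities (quiet water column followed by the bright bottom return), uses Haar-like mean differences as a stand-in for the wavelet transform, and multiplies the positive responses across scales so that isolated noise spikes, which respond only at fine scales, are suppressed. The function names and the default scales are hypothetical.

```python
def haar_response(signal, scale):
    """Mean of the next `scale` samples minus mean of the previous `scale` samples."""
    n = len(signal)
    out = [0.0] * n
    for i in range(scale, n - scale + 1):
        right = sum(signal[i:i + scale]) / scale
        left = sum(signal[i - scale:i]) / scale
        out[i] = right - left
    return out


def bottom_index(ping, scales=(2, 4, 8)):
    """Index of the strongest rising edge that persists across all scales."""
    product = [1.0] * len(ping)
    for s in scales:
        resp = haar_response(ping, s)
        # Keep only rising edges (water column -> seabed) and combine scales.
        product = [p * max(r, 0.0) for p, r in zip(product, resp)]
    return max(range(len(ping)), key=product.__getitem__)


# Synthetic ping: quiet water column, one noise spike, then the bottom return.
ping = [0.1] * 20 + [1.0] * 20
ping[5] = 0.9
print(bottom_index(ping))
```

Applying `bottom_index` to every row of a waterfall image gives one bottom-line sample per ping; a real tracker would also smooth the result between pings.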

(This article belongs to the Section Physical Oceanography)

22 pages, 14706 KB

Open Access Article

Ultra-Fast Object Detection for Side-Scan Sonar Images via Target Presence Awareness

by Guoqing Xie, Guang Pan, Ju He, Hu Xu and Yang Yu

Remote Sens. 2026, 18(11), 1679; https://doi.org/10.3390/rs18111679 - 22 May 2026

Viewed by 417

Abstract

Side-scan sonar (SSS) imaging plays a critical role in underwater perception for autonomous underwater vehicles (AUVs). However, the spatial sparsity of targets and the limited computational resources remain challenging for real-time object detection. Existing methods typically adopt dense inference strategies, leading to substantial computational redundancy and limited deployment feasibility. In this work, we propose a lightweight and ultra-fast SSS object detection framework based on target presence awareness. The proposed framework follows a coarse-to-fine inference paradigm, in which a target presence analysis (TPA) module is first employed to rapidly filter out target-absent image patches, and only target-positive patches are forwarded to an Object Forward Detection (OFD) module for fine-grained detection. The TPA module integrates spatial–frequency convolution to efficiently capture both local structural cues and global contextual information with minimal computational overhead. Furthermore, an AttnConv-enhanced detection module is introduced in the OFD stage to strengthen high-frequency target features and improve fine-grained detection performance. Extensive experiments on public SSS datasets demonstrate that the proposed method achieves an mAP of 74.63% on the AI4Shipwrecks dataset and 63.02% on the SSS-Mine dataset. Notably, the framework delivers an ultra-fast inference speed of 174.74 FPS on embedded hardware, representing a 5.2× speedup over conventional dense-processing detection methods. Full article
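The coarse-to-fine paradigm described above can be sketched as a simple gate in front of an expensive detector. This is an illustration only: the intensity-variance score stands in for the learned presence module, and `patch_score`, `coarse_to_fine`, the `detector` callable and the threshold value are all hypothetical.

```python
def patch_score(patch):
    """Cheap presence proxy: intensity variance of the patch (an assumption,
    standing in for a learned target-presence classifier)."""
    flat = [v for row in patch for v in row]
    mean = sum(flat) / len(flat)
    return sum((v - mean) ** 2 for v in flat) / len(flat)


def coarse_to_fine(patches, detector, threshold):
    """Run `detector` only on patches whose presence score passes the gate."""
    results = {}
    for idx, patch in enumerate(patches):
        if patch_score(patch) >= threshold:
            results[idx] = detector(patch)
    return results


calls = []

def toy_detector(patch):
    # Placeholder for the fine-grained detection stage.
    calls.append(patch)
    return "target"


empty = [[0.2, 0.2], [0.2, 0.2]]   # flat seabed, variance 0
busy = [[0.0, 1.0], [1.0, 0.0]]    # strong contrast, variance 0.25
detections = coarse_to_fine([empty, busy], toy_detector, threshold=0.05)
print(detections, len(calls))
```

Because targets are spatially sparse, most patches stop at the gate, which is where the saving over dense inference comes from.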

(This article belongs to the Section Ocean Remote Sensing)

18 pages, 21503 KB

Open Access Article

GhostVision: Democratizing Derelict Gear Detection Using Low-Cost Sonar and Artificial Intelligence

by Cameron S. Bodine, Kleio Baxevani, Naveed Abbasi, Jared Wierzbicki, Ophelia Christoph, Catherine Hughes, Onur Bagoren, Olivia Hines, Julia Greco and Arthur Trembanis

J. Mar. Sci. Eng. 2026, 14(10), 951; https://doi.org/10.3390/jmse14100951 - 20 May 2026

Viewed by 418

Abstract

Derelict crab pots (“ghost pots”) cause bycatch mortality, habitat degradation, and lost harvest in shallow coastal ecosystems. Existing detection and recovery programs rely on expert operators and high-cost sonar, limiting coverage and reproducibility. Here, we present GhostVision, an open-source framework that integrates low-cost consumer side-scan sonar with modern object-detection models to enable scalable, rapid post-processing and mapping of derelict gear. Mobile Mapping Units (MMUs) equipped with off-the-shelf fishfinders surveyed more than 1500 acres in Delaware’s Inland Bays between 2020 and 2022. Three architectures (YOLOv12, YOLOv26, RF-DETR) were trained on 3110 manually annotated sonar images and evaluated with both dataset-centric metrics and full pipeline implementation. YOLOv12 showed the strongest untuned operational performance (F1 = 0.512; recall = 0.922), while post-processing optimization produced comparable performance across all three models (F1 ≈ 0.71–0.73). Across 11 complete test recordings, end-to-end processing required only 8.87–9.79% of survey time (approximately 10–11× faster than real-time), supporting same-day analysis and recovery workflows. GhostVision can foster community engagement in derelict crab-pot removal by pairing low-cost sonar with AI to aid recovery efforts at management-relevant scales. By lowering financial and technical barriers, GhostVision provides a reproducible pathway for large-scale stewardship and supports future extensions to multi-class detection and autonomous platforms. Full article
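The reported figures can be cross-checked with a little arithmetic. The sketch below assumes that the untuned F1 and recall refer to the same operating point, in which case the standard identity F1 = 2PR / (P + R) fixes the precision, and it converts the processing-time fraction into a speed-up factor. The function names are hypothetical.

```python
def precision_from_f1(f1, recall):
    """Solve F1 = 2PR / (P + R) for the precision P."""
    return f1 * recall / (2 * recall - f1)


def speedup(fraction_of_survey_time):
    """Speed-up relative to real time when processing takes this fraction of it."""
    return 1.0 / fraction_of_survey_time


p = precision_from_f1(0.512, 0.922)
print(round(p, 3))                 # about 0.354
print(round(speedup(0.0979), 1))   # about 10.2
print(round(speedup(0.0887), 1))   # about 11.3
```

Under that assumption the untuned YOLOv12 trades precision (roughly 0.35) for high recall, and the 8.87–9.79% processing fraction matches the stated 10–11× speed-up.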

(This article belongs to the Section Ocean Engineering)

33 pages, 40054 KB

Open Access Article

MVDCNN: A Multi-View Deep Convolutional Network with Feature Fusion for Robust Sonar Image Target Recognition

by Yue Fan, Cheng Peng, Peng Zhang, Zhisheng Zhang, Guoping Zhang and Jinsong Tang

Remote Sens. 2026, 18(1), 76; https://doi.org/10.3390/rs18010076 - 25 Dec 2025

Cited by 1 | Viewed by 1056

Abstract

Automatic Target Recognition (ATR) in single-view sonar imagery is severely hampered by geometric distortions, acoustic shadows, and incomplete target information due to occlusions and the slant-range imaging geometry, which frequently give rise to misclassification and hinder practical underwater detection applications. To address these critical limitations, this paper proposes a Multi-View Deep Convolutional Neural Network (MVDCNN) based on feature-level fusion for robust sonar image target recognition. The MVDCNN adopts a highly modular and extensible architecture consisting of four interconnected modules: an input reshaping module that adapts multi-view images to match the input format of pre-trained backbone networks via dimension merging and channel replication; a shared-weight feature extraction module that leverages Convolutional Neural Network (CNN) or Transformer backbones (e.g., ResNet, Swin Transformer, Vision Transformer) to extract discriminative features from each view, ensuring parameter efficiency and cross-view feature consistency; a feature fusion module that aggregates complementary features (e.g., target texture and shape) across views using max-pooling to retain the most salient characteristics and suppress noisy or occluded view interference; and a lightweight classification module that maps the fused feature representations to target categories. Additionally, to mitigate the data scarcity bottleneck in sonar ATR, we design a multi-view sample augmentation method based on sonar imaging geometric principles: this method systematically combines single-view samples of the same target via the combination formula and screens valid samples within a predefined azimuth range, constructing high-quality multi-view training datasets without relying on complex generative models or massive initial labeled data. 
Comprehensive evaluations on the Custom Side-Scan Sonar Image Dataset (CSSID) and Nankai Sonar Image Dataset (NKSID) demonstrate the superiority of our framework over single-view baselines. Specifically, the two-view MVDCNN achieves average classification accuracies of 94.72% (CSSID) and 97.24% (NKSID), with relative improvements of 7.93% and 5.05%, respectively; the three-view MVDCNN further boosts the average accuracies to 96.60% and 98.28%. Moreover, MVDCNN substantially elevates the precision and recall of small-sample categories (e.g., Fishing net and Small propeller in NKSID), effectively alleviating the class imbalance challenge. Mechanism validation via t-Distributed Stochastic Neighbor Embedding (t-SNE) feature visualization and prediction confidence distribution analysis confirms that MVDCNN yields more separable feature representations and more confident category predictions, with stronger intra-class compactness and inter-class discrimination in the feature space. The proposed MVDCNN framework provides a robust and interpretable solution for advancing sonar ATR and offers a technical paradigm for multi-view acoustic image understanding in complex underwater environments. Full article
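Two of the ingredients above, combining single-view samples into multi-view samples and fusing per-view features by max-pooling, can be sketched in a few lines. The sketch makes assumptions the abstract does not confirm: each view is an `(azimuth_deg, feature_vector)` pair, a combination is valid when its azimuth spread stays within `max_azimuth_span`, and azimuth wrap-around at 360° is ignored. All names are hypothetical.

```python
from itertools import combinations


def multi_view_samples(views, n_views, max_azimuth_span):
    """Build n-view samples for one target from its single-view samples.

    views: list of (azimuth_deg, feature_vector) tuples.
    Keeps only combinations whose azimuth spread is within the span
    (an assumed reading of the screening step).
    """
    samples = []
    for combo in combinations(views, n_views):
        azimuths = [a for a, _ in combo]
        if max(azimuths) - min(azimuths) <= max_azimuth_span:
            samples.append(combo)
    return samples


def max_fuse(feature_vectors):
    """Element-wise max over views: keeps the most salient response per feature."""
    return [max(values) for values in zip(*feature_vectors)]


views = [(0, [1, 5]), (30, [4, 2]), (90, [0, 9])]
pairs = multi_view_samples(views, n_views=2, max_azimuth_span=45)
print(len(pairs))
print(max_fuse([features for _, features in pairs[0]]))
```

With N single-view samples of a target, the combination formula yields up to C(N, k) k-view samples before screening, which is how the augmentation enlarges a small dataset without a generative model.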

(This article belongs to the Special Issue Underwater Remote Sensing: Status, New Challenges and Opportunities)

31 pages, 15645 KB

Open Access Article

RCF-YOLOv8: A Multi-Scale Attention and Adaptive Feature Fusion Method for Object Detection in Forward-Looking Sonar Images

by Xiaoxue Li, Yuhan Chen, Xueqin Liu, Zhiliang Qin, Jiaxin Wan and Qingyun Yan

Remote Sens. 2025, 17(19), 3288; https://doi.org/10.3390/rs17193288 - 25 Sep 2025

Cited by 3 | Viewed by 2803

Abstract

Acoustic imaging systems are essential for underwater target recognition and localization, but forward-looking sonar (FLS) imagery faces challenges due to seabed variability, resulting in low resolution, blurred images, and sparse targets. To address these issues, we introduce RCF-YOLOv8, an enhanced detection framework based on YOLOv8, designed to improve FLS image analysis. Key innovations include the use of CoordConv modules to better encode spatial information, improving feature extraction and reducing misdetection rates. Additionally, an efficient multi-scale attention (EMA) mechanism addresses sparse target distributions, optimizing feature fusion and improving the network’s ability to identify key areas. Lastly, the C2f module with high-quality feature fusion (C2f-Fusion) optimizes feature extraction from noisy backgrounds. RCF-YOLOv8 achieved a 98.8% mAP@50 and a 67.6% mAP@50-95 on the URPC2021 dataset, outperforming baseline models with a 2.4% increase in single-threshold accuracy and a 10.4% increase in multi-threshold precision, demonstrating its robustness for underwater detection. Full article
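The CoordConv idea mentioned above is to append coordinate channels to a feature map before convolving, so that filters can see where they are in the image. A minimal sketch, assuming a feature map stored as a list of H x W channels and coordinates normalized to [-1, 1]; the function name is hypothetical and the layout is not taken from RCF-YOLOv8 itself.

```python
def add_coord_channels(image):
    """Append normalized x and y coordinate channels to a C x H x W nested list."""
    h = len(image[0])
    w = len(image[0][0])

    def norm(i, n):
        # Map index 0..n-1 onto [-1, 1]; a single row or column maps to 0.
        return 2.0 * i / (n - 1) - 1.0 if n > 1 else 0.0

    x_map = [[norm(c, w) for c in range(w)] for _ in range(h)]
    y_map = [[norm(r, h) for _ in range(w)] for r in range(h)]
    return image + [x_map, y_map]


feature_map = [[[0.0] * 3 for _ in range(2)]]   # 1 channel, 2 rows, 3 columns
augmented = add_coord_channels(feature_map)
print(len(augmented))     # 3 channels: original, x, y
print(augmented[1][0])    # x coordinates along a row
```

An ordinary convolution applied to the augmented tensor then has access to position, which is what helps with sparsely distributed targets.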

(This article belongs to the Special Issue Efficient Object Detection Based on Remote Sensing Images)

23 pages, 17670 KB

Open Access Article

UWS-YOLO: Advancing Underwater Sonar Object Detection via Transfer Learning and Orthogonal-Snake Convolution Mechanisms

by Liang Zhao, Xu Ren, Lulu Fu, Qing Yun and Jiarun Yang

J. Mar. Sci. Eng. 2025, 13(10), 1847; https://doi.org/10.3390/jmse13101847 - 24 Sep 2025

Cited by 7 | Viewed by 2838

Abstract

Accurate and efficient detection of underwater targets in sonar imagery is critical for applications such as marine exploration, infrastructure inspection, and autonomous navigation. However, sonar-based object detection remains challenging due to low resolution, high noise, cluttered backgrounds, and the scarcity of annotated data. To address these issues, we propose UWS-YOLO, a novel detection framework specifically designed for underwater sonar images. The model integrates three key innovations: (1) a C2F-Ortho module that enhances multi-scale feature representation through orthogonal channel attention, improving sensitivity to small and low-contrast targets; (2) a DySnConv module that employs Dynamic Snake Convolution to adaptively capture elongated and irregular structures such as pipelines and cables; and (3) a cross-modal transfer learning strategy that pre-trains on large-scale optical underwater imagery before fine-tuning on sonar data, effectively mitigating overfitting and bridging the modality gap. Extensive evaluations on real-world sonar datasets demonstrate that UWS-YOLO achieves a mAP@0.5 of 87.1%, outperforming the YOLOv8n baseline by 3.5% and seven state-of-the-art detectors in accuracy while maintaining real-time performance at 158 FPS with only 8.8 GFLOPs. The framework exhibits strong generalization across datasets, robustness to noise, and computational efficiency on embedded devices, confirming its suitability for deployment in resource-constrained underwater environments. Full article

(This article belongs to the Section Ocean Engineering)

24 pages, 5149 KB

Open Access Article

Impact of Input Image Resolution on Deep Learning Performance for Side-Scan Sonar Classification: An Accuracy–Efficiency Analysis

by Xing Du, Yongfu Sun, Yupeng Song, Wanqing Chi, Lifeng Dong and Xiaolong Zhao

Remote Sens. 2025, 17(14), 2431; https://doi.org/10.3390/rs17142431 - 13 Jul 2025

Cited by 6 | Viewed by 4241

Abstract

Side-scan sonar (SSS) image classification is crucial for underwater applications, but the trade-off between the accuracy afforded by high-resolution images and the associated computational cost challenges deployment, particularly on resource-constrained platforms like AUVs. This study systematically investigates and quantifies this accuracy–efficiency trade-off in SSS image classification by varying input resolution. Using two distinct SSS datasets and a resolution-adaptive deep learning strategy employing MobileNetV2 and ResNet variants across six resolutions, we evaluated classification accuracy and computational metrics. Results demonstrate a clear inverse relationship: decreasing resolution significantly reduces computational load and processing times but lowers classification accuracy, with the degradation being more pronounced for the more complex four-class dataset. Notably, model test accuracy did not necessarily increase monotonically with resolution. Importantly, acceptable accuracy levels above 90% or 80% could be maintained at significantly lower resolutions, offering substantial efficiency gains. In conclusion, strategically reducing SSS image resolution based on application-specific accuracy requirements is a viable approach for optimizing computational resources. This work provides a quantitative framework for navigating this trade-off and underscores the need for developing SSS-specific architectures for future advancements. Full article

(This article belongs to the Special Issue Advancements in Deep Learning for Object Detection and Segmentation in Remote Sensing Imagery)

14 pages, 1438 KB

Open Access Article

CDBA-GAN: A Conditional Dual-Branch Attention Generative Adversarial Network for Robust Sonar Image Generation

by Wanzeng Kong, Han Yang, Mingyang Jia and Zhe Chen

Appl. Sci. 2025, 15(13), 7212; https://doi.org/10.3390/app15137212 - 26 Jun 2025

Cited by 2 | Viewed by 1258

Abstract

The acquisition of real-world sonar data necessitates substantial investments of manpower, material resources, and financial capital, rendering it challenging to obtain sufficient authentic samples for sonar-related research tasks. Consequently, sonar image simulation technology has become increasingly vital in the field of sonar data analysis. Traditional sonar simulation methods predominantly focus on low-level physical modeling, which often suffers from limited image controllability and diminished fidelity in multi-category and multi-background scenarios. To address these limitations, this paper proposes a Conditional Dual-Branch Attention Generative Adversarial Network (CDBA-GAN). The framework comprises three key innovations: a conditional information fusion module, a dual-branch attention feature fusion mechanism, and cross-layer feature reuse. By integrating encoded conditional information with the original input data of the generative adversarial network, the fusion module enables precise control over the generation of sonar images under specific conditions. A hierarchical attention mechanism is implemented, sequentially performing channel-level and pixel-level attention operations. This establishes distinct weight matrices at both granularities, thereby enhancing the correlation between corresponding elements. The dual-branch attention features are fused via a skip-connection architecture, facilitating efficient feature reuse across network layers. The experimental results demonstrate that the proposed CDBA-GAN generates condition-specific sonar images with a significantly lower Fréchet inception distance (FID) than existing methods. Notably, the framework exhibits robust imaging performance under noisy interference and outperforms state-of-the-art models (e.g., DCGAN, WGAN, SAGAN) in fidelity across four categorical conditions, as quantified by FID metrics. Full article
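For reference, the FID used as the fidelity metric above has a standard closed form. The symbols below are the usual ones and are not taken from the paper: the means and covariances are those of Inception-network features computed on real (r) and generated (g) images.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

A lower value means the two feature distributions are closer, and FID is zero when the means and covariances coincide.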

22 pages, 3096 KB

Open Access Article

SDA-Mask R-CNN: An Advanced Seabed Feature Extraction Network for UUV

by Yao Xiao, Dongchen Dai, Hongjian Wang, Chengfeng Li and Shaozheng Song

J. Mar. Sci. Eng. 2025, 13(5), 863; https://doi.org/10.3390/jmse13050863 - 25 Apr 2025

Cited by 2 | Viewed by 1254

Abstract

This paper proposes a novel SDA-Mask R-CNN framework for precise seabed terrain edge feature extraction from Side-Scan Sonar (SSS) images to enhance Unmanned Underwater Vehicle (UUV) perception and navigation. The developed architecture addresses critical challenges in underwater image analysis, including low segmentation accuracy and ambiguous edge delineation, through three principal innovations. First, we introduce a Structural Synergistic Group-Attention Residual Network (SSGAR-Net) that integrates group convolution with an enhanced convolutional block attention mechanism, complemented by a layer-skipping architecture for optimized information flow and redundancy verification for computational efficiency. Second, a Depth-Weighted Hierarchical Fusion Network (DWHF-Net) incorporates depthwise separable convolution to minimize computational complexity while preserving model performance, which is particularly effective for high-resolution SSS image processing. This module further employs a weighted pyramid architecture to achieve multi-scale feature fusion, significantly improving adaptability to diverse object scales in dynamic underwater environments. Third, an Adaptive Synergistic Mask Optimization (ASMO) strategy systematically enhances mask generation through classification head refinement, adaptive post-processing, and progressive training protocols. Comprehensive experiments demonstrate that our method achieves 0.695 (IoU) segmentation accuracy and 1.0 (AP) edge localization accuracy. The proposed framework shows notable superiority in preserving topological consistency of seabed features, offering a reliable technical framework for underwater navigation and seabed mapping in marine engineering applications. Full article
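The saving from depthwise separable convolution mentioned above can be seen from a parameter count. The sketch uses the standard formulas with biases ignored; the layer sizes in the example are arbitrary and are not taken from DWHF-Net.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a k x k convolution mapping c_in channels to c_out channels."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """One k x k depthwise filter per input channel, then a 1 x 1 pointwise mix."""
    return k * k * c_in + c_in * c_out


full = standard_conv_params(3, 64, 128)
separable = depthwise_separable_params(3, 64, 128)
print(full, separable, round(full / separable, 1))
```

For this 3 x 3 layer the separable form needs 8,768 weights instead of 73,728, roughly an 8.4-fold reduction, and the multiply-accumulate count per output pixel shrinks by the same factor.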

(This article belongs to the Section Ocean Engineering)

20 pages, 31492 KB

Open Access Article

The Bright Feature Transform for Prominent Point Scatterer Detection and Tone Mapping

by Gregory D. Vetaw and Suren Jayasuriya

Remote Sens. 2025, 17(6), 1037; https://doi.org/10.3390/rs17061037 - 15 Mar 2025

Cited by 2 | Viewed by 1233

Abstract

Detecting bright point scatterers plays an important role in assessing the quality of many sonar, radar, and medical ultrasound imaging systems, especially for characterizing the resolution. Traditionally, prominent scatterers, also known as coherent scatterers, are usually detected by employing thresholding techniques alongside statistical measures in the detection processing chain. However, these methods can perform poorly in detecting point-like scatterers in relatively high levels of speckle background and can distort the structure of the scatterer when visualized. This paper introduces a fast image-processing method to visually identify and detect point scatterers in synthetic aperture imagery using the bright feature transform (BFT). The BFT is analytic, computationally inexpensive, and requires no thresholding or parameter tuning. We derive this method by analyzing an ideal point scatterer’s response with respect to pixel intensity and contrast around neighboring pixels and non-adjacent pixels. We show that this method preserves the general structure and the width of the bright scatterer while performing tone mapping, which can then be used for downstream image characterization and analysis. We then modify the BFT to present a difference of trigonometric functions to mitigate speckle scatterers and other random noise sources found in the imagery. We evaluate the performance of our methods on simulated and real synthetic aperture sonar and radar images, and show qualitative results on how the methods perform tone mapping on reconstructed input imagery in such a way to highlight the bright scatterer, which is insensitive to seafloor textures and high speckle noise levels. Full article

16 pages, 3921 KB

Open Access Article

Effect of Seabed Type on Image Segmentation of an Underwater Object Obtained from a Side Scan Sonar Using a Deep Learning Approach

by Jungyong Park and Ho Seuk Bae

J. Mar. Sci. Eng. 2025, 13(2), 242; https://doi.org/10.3390/jmse13020242 - 26 Jan 2025

Cited by 1 | Viewed by 1617

Abstract

This study examines the impact of seabed conditions on image segmentation for seabed target images acquired via side-scan sonar during sea experiments. The dataset comprised images of cylindrical targets lying on two seabed types, mud and sand, and was categorized accordingly. A deep learning algorithm (U-Net) was used for image segmentation. The analysis focused on two key factors influencing segmentation performance: the weighting method of the cross-entropy loss function and the combination of datasets categorized by seabed type for training, validation, and testing. The results revealed three key findings. First, applying equal weights to the loss function yielded better segmentation performance than pixel-frequency-based weighting, as indicated by the Intersection over Union (IoU) for the highlight class in dataset 2 (0.41 compared to 0.37). Second, images from the mud area were easier to segment than those from the sand area because of the clearer intensity contrast between the target highlight and the background; the IoU for the highlight class was 0.63 compared to 0.41. Finally, a network trained on a combined dataset from both seabed types improved segmentation performance in challenging conditions, such as sand areas, whereas a network trained on a single-seabed dataset performed worse. The IoU values for the highlight class in sand area images are as follows: 0.34 for training on mud, 0.41 for training on sand, and 0.45 for training on both. Full article
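The two quantities compared above, per-class IoU and the class weights of the cross-entropy loss, can be sketched as follows. The inverse-frequency formula is one common form of pixel-frequency-based weighting and is an assumption here; the abstract does not state the exact scheme. Label masks are nested lists of integer class ids, every class is assumed to occur at least once, and the function names are hypothetical.

```python
def iou(pred, truth, cls):
    """Intersection over Union of one class between two label masks."""
    inter = 0
    union = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            inter += int(p == cls and t == cls)
            union += int(p == cls or t == cls)
    return inter / union if union else float("nan")


def inverse_frequency_weights(truth, classes):
    """Weight each class by total / (n_classes * pixel_count), so rare classes weigh more."""
    flat = [t for row in truth for t in row]
    return {c: len(flat) / (len(classes) * flat.count(c)) for c in classes}


truth = [[0, 0, 1], [0, 1, 1]]
pred = [[0, 1, 1], [0, 0, 1]]
print(iou(pred, truth, cls=1))

rare = [[0, 0, 0], [0, 0, 1]]   # highlight class 1 covers one pixel in six
print(inverse_frequency_weights(rare, classes=[0, 1]))
```

Equal weighting corresponds to setting every class weight to 1, which is the setting the study found to work better for the highlight class.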

(This article belongs to the Section Ocean Engineering)

20 pages, 8476 KB

Open Access Article

AquaPile-YOLO: Pioneering Underwater Pile Foundation Detection with Forward-Looking Sonar Image Processing

by Zhongwei Xu, Rui Wang, Tianyu Cao, Wenbo Guo, Bo Shi and Qiqi Ge

Remote Sens. 2025, 17(3), 360; https://doi.org/10.3390/rs17030360 - 22 Jan 2025

Cited by 8 | Viewed by 2028

Abstract

Underwater pile foundation detection is crucial for environmental monitoring and marine engineering. Traditional methods for detecting underwater pile foundations are labor-intensive and inefficient. Deep learning-based image processing has revolutionized detection, enabling identification through sonar imagery analysis. This study proposes an innovative methodology, named the AquaPile-YOLO algorithm, for underwater pile foundation detection. Our approach significantly enhances detection accuracy and robustness by integrating multi-scale feature fusion, improved attention mechanisms, and advanced data augmentation techniques. Trained on 4000 sonar images, the model excels in delineating pile structures and effectively identifying underwater targets. Experimental data show that the model can achieve good target identification results in similar experimental scenarios, with a 96.89% accuracy rate for underwater target recognition. Full article

(This article belongs to the Special Issue Artificial Intelligence for Ocean Remote Sensing)

18 pages, 62968 KB

Open Access Article

Improving ICP-Based Scanning Sonar Image Matching Performance Through Height Estimation of Feature Point Using Shaded Area

by Gwonsoo Lee, Sukmin Yoon, Yeongjun Lee and Jihong Lee

J. Mar. Sci. Eng. 2025, 13(1), 150; https://doi.org/10.3390/jmse13010150 - 16 Jan 2025

Cited by 2 | Viewed by 1901

Abstract

This study presents an innovative method for estimating the height of feature points through shaded area analysis, to enhance the performance of iterative closest point (ICP)-based algorithms for matching scanning sonar images. Unlike other sensors, such as forward looking sonar (FLS) or BlueView, scanning sonar has an extended data acquisition period, complicating data collection while in motion. Additionally, existing ICP-based matching algorithms that rely on two-dimensional scanning sonar data suffer from matching errors due to ambiguities in the nearest-point matching process, typically arising when the feature points demonstrate similarities in size and spatial arrangement, leading to numerous potential connections between them. To mitigate these matching ambiguities, we restrict the matching areas in the two images that need to be aligned. We propose two strategies to limit the matching area: the first utilizes the position and orientation information derived from the navigation algorithm, while the second involves estimating the overlapping region between the two images through height assessments of the feature points, facilitated by shaded area analysis. This latter strategy emphasizes preferential matching based on the height information obtained. We propose integrating these two approaches and validate the proposed algorithm through simulations, experimental basin tests, and real-world data collection, demonstrating its effectiveness. Full article
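Estimating height from a shaded (acoustic shadow) area rests on similar triangles. The sketch below is the textbook flat-seabed relation, not necessarily the formulation used in the paper: a sonar at altitude `sonar_altitude` above the seabed, an object at horizontal distance `ground_range`, and a shadow extending `shadow_length` further along the ground. All names are hypothetical.

```python
def height_from_shadow(sonar_altitude, ground_range, shadow_length):
    """Object height from shadow length on a flat seabed.

    The ray from the sonar that grazes the object's top reaches the seabed at
    ground_range + shadow_length, so
    height / shadow_length = sonar_altitude / (ground_range + shadow_length).
    """
    return sonar_altitude * shadow_length / (ground_range + shadow_length)


# Sonar 10 m above the seabed, object 30 m away, shadow 10 m long.
print(height_from_shadow(10.0, 30.0, 10.0))
```

Feature points with similar estimated heights can then be matched preferentially, which is the role height plays in narrowing the ICP correspondence search.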

(This article belongs to the Special Issue Unmanned Marine Vehicles: Navigation, Control and Sensing)

32 pages, 6380 KB

Open Access Article

Application and Analysis of the MFF-YOLOv7 Model in Underwater Sonar Image Target Detection

by Kun Zheng, Haoshan Liang, Hongwei Zhao, Zhe Chen, Guohao Xie, Liguo Li, Jinghua Lu and Zhangda Long

J. Mar. Sci. Eng. 2024, 12(12), 2326; https://doi.org/10.3390/jmse12122326 - 18 Dec 2024

Cited by 7 | Viewed by 2524

Abstract

The need for precise identification of underwater sonar image targets is growing in areas such as marine resource exploitation, subsea construction, and ocean ecosystem surveillance. Nevertheless, conventional image recognition algorithms encounter several obstacles, including intricate underwater settings, poor-quality sonar image data, and limited sample quantities, which hinder accurate identification. This study seeks to improve underwater sonar image target recognition capabilities by employing deep learning techniques and developing the Multi-Gradient Feature Fusion YOLOv7 model (MFF-YOLOv7) to address these challenges. This model incorporates the Multi-Scale Information Fusion Module (MIFM) as a replacement for YOLOv7’s SPPCSPC, substitutes the Conv of CBS following ELAN with RFAConv, and integrates the SCSA mechanism at three junctions where the backbone links to the head, enhancing target recognition accuracy. Trials were conducted using datasets like URPC, SCTD, and UATD, encompassing comparative studies of attention mechanisms, ablation tests, and evaluations against other leading algorithms. The findings indicate that the MFF-YOLOv7 model substantially surpasses other models across various metrics, demonstrates superior underwater target detection capabilities, exhibits enhanced generalization potential, and offers a more dependable and precise solution for underwater target identification. Full article

(This article belongs to the Special Issue Application of Deep Learning in Underwater Image Processing)

27 pages, 6983 KB

Open AccessArticle

DA-YOLOv7: A Deep Learning-Driven High-Performance Underwater Sonar Image Target Recognition Model

by Zhe Chen, Guohao Xie, Xiaofang Deng, Jie Peng and Hongbing Qiu

J. Mar. Sci. Eng. 2024, 12(9), 1606; https://doi.org/10.3390/jmse12091606 - 10 Sep 2024

Cited by 12 | Viewed by 4137

Abstract

Affected by the complex underwater environment and the limitations of low-resolution sonar image data and small sample sizes, traditional image recognition algorithms have difficulty achieving accurate sonar image recognition. This research builds on YOLOv7 and devises an innovative fast recognition model designed explicitly for sonar images, namely the Dual Attention Mechanism YOLOv7 model (DA-YOLOv7), to tackle these challenges. New modules such as the Omni-Directional Convolution Channel Prior Convolutional Attention Efficient Layer Aggregation Network (OA-ELAN), Spatial Pyramid Pooling Channel Shuffling and Pixel-level Convolution Bilateral-branch Transformer (SPPCSPCBiFormer), and Ghost-Shuffle Convolution Enhanced Layer Aggregation Network-High performance (G-ELAN-H) are central to its design; they reduce the computational burden and enhance accuracy in detecting small targets and capturing local features and crucial information. The study adopts transfer learning to deal with the lack of sonar image samples. DA-YOLOv7 obtains its initial weights by pre-training on the large-scale Underwater Acoustic Target Detection Dataset (UATD dataset) and is then fine-tuned on the smaller Common Sonar Target Detection Dataset (SCTD dataset), thereby reducing the risk of overfitting that is commonly encountered with small datasets. The experimental results on the UATD, the Underwater Optical Target Detection Intelligent Algorithm Competition 2021 Dataset (URPC), and SCTD datasets show that DA-YOLOv7 exhibits outstanding performance, with mAP@0.5 scores reaching 89.4%, 89.9%, and 99.15%, respectively. In addition, the model maintains real-time speed while having superior accuracy and recall rates compared to existing mainstream target recognition models. These findings establish the superiority of DA-YOLOv7 in sonar image analysis tasks.

(This article belongs to the Topic Applications and Development of Underwater Robotics and Underwater Vision Technology)

Search Results (46)
