MDPI - Publisher of Open Access Journals

19 pages, 3473 KB

Open Access Article

Enhancing Instance Segmentation in High-Resolution Images Using Slicing-Aided Hyper Inference and Spatial Mask Merging Optimized via R-Tree Indexing

by Marko Mihajlovic and Marina Marjanovic

Mathematics 2025, 13(19), 3079; https://doi.org/10.3390/math13193079 - 25 Sep 2025

Viewed by 658

Abstract

Instance segmentation in high-resolution images is essential for applications such as remote sensing, medical imaging, and precision agriculture, yet remains challenging due to factors such as small object sizes, irregular shapes, and occlusions. Tiling-based approaches, such as Slicing-Aided Hyper Inference (SAHI), alleviate some of these challenges by processing smaller patches but introduce border artifacts and increased computational cost. Overlapping tiles can mitigate certain boundary effects but often result in duplicate detections and boundary inconsistencies, particularly along patch edges. Conventional deduplication techniques, including Non-Maximum Suppression (NMS) and Non-Maximum Merging (NMM), rely on Intersection over Union (IoU) thresholds and frequently fail to merge fragmented or adjacent masks with low mutual IoU that nonetheless correspond to the same object. To address deduplication and mask fragmentation, Spatial Mask Merging (SMM) is proposed as a graph clustering approach that integrates pixel-level overlap and boundary distance metrics while using R-tree indexing for efficient candidate retrieval. SMM was evaluated on the iSAID benchmark using standard segmentation metrics, with tile overlap configurations systematically examined to determine the optimal setting for segmentation accuracy. The method achieved a nearly 7% increase in precision, with consistent gains in F1 score and Panoptic Quality over existing approaches. The integration of R-tree indexing facilitated faster candidate retrieval, enabling computational performance improvements over standard merging algorithms alongside the observed accuracy gains. Full article

(This article belongs to the Special Issue Mathematics Applications of Artificial Intelligence and Computer Vision)
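
The abstract above notes that IoU thresholds can miss fragments of one object split at a tile border. A minimal sketch of why, using axis-aligned boxes in place of masks: the helper names (`box_iou`, `box_gap`) and the coordinates are illustrative assumptions, not the SMM implementation described in the paper.

```python
def box_iou(a, b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def box_gap(a, b):
    """Distance between the closest edges of two boxes (0 if they touch or overlap)."""
    dx = max(0.0, max(a[0], b[0]) - min(a[2], b[2]))
    dy = max(0.0, max(a[1], b[1]) - min(a[3], b[3]))
    return (dx * dx + dy * dy) ** 0.5


# Hypothetical fragments of one object cut by a tile border at x = 100.
left_fragment = (90, 40, 100, 60)
right_fragment = (100, 40, 112, 60)

iou = box_iou(left_fragment, right_fragment)   # no shared area
gap = box_gap(left_fragment, right_fragment)   # fragments touch
```

Here the two fragments share no area, so any positive IoU threshold leaves them unmerged, while the boundary distance shows that they are adjacent. A merging rule that considers both signals can catch this case.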

16 pages, 11231 KB

Open Access Article

Aerial Vehicle Detection Using Ground-Based LiDAR

by John Kirschler and Jay Wilhelm

Aerospace 2025, 12(9), 756; https://doi.org/10.3390/aerospace12090756 - 22 Aug 2025

Viewed by 856

Abstract

Ground-based LiDAR sensing offers a promising approach for delivering short-range landing feedback to aerial vehicles operating near vertiports and in GNSS-degraded environments. This work introduces a detection system capable of classifying aerial vehicles and estimating their 3D positions with sub-meter accuracy. Using a simulated Gazebo environment, multiple LiDAR sensors and five vehicle classes, ranging from hobbyist drones to air taxis, were modeled to evaluate detection performance. RGB-encoded point clouds were processed using a modified YOLOv6 neural network with Slicing-Aided Hyper Inference (SAHI) to preserve high-resolution object features. Classification accuracy and position error were analyzed using mean Average Precision (mAP) and Mean Absolute Error (MAE) across varied sensor parameters, vehicle sizes, and distances. Within 40 m, the system consistently achieved over 95% classification accuracy and average position errors below 0.5 m. Results support the viability of high-density LiDAR as a complementary method for precision landing guidance in advanced air mobility applications. Full article

(This article belongs to the Section Aeronautics)

24 pages, 94333 KB

Open Access Article

Medical Segmentation of Kidney Whole Slide Images Using Slicing Aided Hyper Inference and Enhanced Syncretic Mask Merging Optimized by Particle Swarm Metaheuristics

by Marko Mihajlovic and Marina Marjanovic

BioMedInformatics 2025, 5(3), 44; https://doi.org/10.3390/biomedinformatics5030044 - 11 Aug 2025

Viewed by 767

Abstract

Accurate segmentation of kidney microstructures in whole slide images (WSIs) is essential for the diagnosis and monitoring of renal diseases. In this study, an end-to-end instance segmentation pipeline was developed for the detection of glomeruli and blood vessels in hematoxylin and eosin (H&E) stained kidney tissue. A tiling-based strategy was employed using Slicing Aided Hyper Inference (SAHI) to manage the resolution and scale of WSIs and the performance of two segmentation models, YOLOv11 and YOLOv12, was comparatively evaluated. The influence of tile overlap ratios on segmentation quality and inference efficiency was assessed, with configurations identified that balance object continuity and computational cost. To address object fragmentation at tile boundaries, an Enhanced Syncretic Mask Merging algorithm was introduced, incorporating morphological and spatial constraints. The algorithm’s hyperparameters were optimized using Particle Swarm Optimization (PSO), with vessel and glomerulus-specific performance targets. The optimization process revealed key parameters affecting segmentation quality, particularly for vessel structures with fine, elongated morphology. When compared with a baseline without postprocessing, improvements in segmentation precision were observed, notably a 48% average increase for glomeruli and up to 17% for blood vessels. The proposed framework demonstrates a balance between accuracy and efficiency, supporting scalable histopathology analysis and contributing to the Vasculature Common Coordinate Framework (VCCF) and Human Reference Atlas (HRA). Full article
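
The tile overlap ratio mentioned above controls how many tiles are produced and how much context neighbouring tiles share. A minimal standalone sketch of overlap-based tiling follows; the function names and the 1000 × 1000 image with 512-pixel tiles are assumptions for illustration, and this is not the slicing API of the SAHI library.

```python
def tile_origins(length, tile, overlap_ratio):
    """Start offsets of tiles covering [0, length) along one axis."""
    if not 0 <= overlap_ratio < 1:
        raise ValueError("overlap_ratio must be in [0, 1)")
    if length <= tile:
        return [0]
    stride = max(1, int(tile * (1 - overlap_ratio)))
    origins = list(range(0, length - tile, stride))
    origins.append(length - tile)  # last tile sits flush with the image edge
    return origins


def tile_windows(width, height, tile, overlap_ratio):
    """All tile windows as (x_min, y_min, x_max, y_max) in pixel coordinates."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_origins(height, tile, overlap_ratio)
        for x in tile_origins(width, tile, overlap_ratio)
    ]


windows = tile_windows(1000, 1000, tile=512, overlap_ratio=0.25)
```

A larger overlap ratio shrinks the stride, so more tiles are inferred per image. That is the continuity versus computational cost trade-off the abstract refers to.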

30 pages, 225854 KB

Open Access Article

LGWheatNet: A Lightweight Wheat Spike Detection Model Based on Multi-Scale Information Fusion

by Zhaomei Qiu, Fei Wang, Tingting Li, Chongjun Liu, Xin Jin, Shunhao Qing, Yi Shi, Yuntao Wu and Congbin Liu

Plants 2025, 14(7), 1098; https://doi.org/10.3390/plants14071098 - 2 Apr 2025

Cited by 4 | Viewed by 1194

Abstract

Wheat spike detection holds significant importance for agricultural production as it enhances the efficiency of crop management and the precision of operations. This study aims to improve the accuracy and efficiency of wheat spike detection, enabling efficient crop monitoring under resource-constrained conditions. To this end, a wheat spike dataset encompassing multiple growth stages was constructed, leveraging the advantages of MobileNet and ShuffleNet to design a novel network module, SeCUIB. Building on this foundation, a new wheat spike detection network, LGWheatNet, was proposed by integrating a lightweight downsampling module (DWDown), spatial pyramid pooling (SPPF), and a lightweight detection head (LightDetect). The experimental results demonstrate that LGWheatNet excels in key performance metrics, including Precision, Recall, and Mean Average Precision (mAP50 and mAP50-95). Specifically, the model achieved a Precision of 0.956, a Recall of 0.921, an mAP50 of 0.967, and an mAP50-95 of 0.747, surpassing several YOLO models as well as EfficientDet and RetinaNet. Furthermore, LGWheatNet demonstrated superior resource efficiency with a parameter count of only 1,698,529 and GFLOPs of 5.0, significantly lower than those of competing models. Additionally, when combined with the Slicing Aided Hyper Inference strategy, LGWheatNet further improved the detection accuracy of wheat spikes, especially for small-scale targets and edge regions, when processing large-scale high-resolution images. This strategy significantly enhanced both inference efficiency and accuracy, making it particularly suitable for image analysis from drone-captured data. In wheat spike counting experiments, LGWheatNet also delivered exceptional performance, particularly in predictions during the filling and maturity stages, outperforming other models by a substantial margin. 
This study not only provides an efficient and reliable solution for wheat spike detection but also introduces innovative methods for lightweight object detection tasks in resource-constrained environments. Full article

(This article belongs to the Topic Advances in Smart Agriculture with Remote Sensing as the Core and Its Applications in Crops Field)

27 pages, 51227 KB

Open Access Article

Improved Detection and Location of Small Crop Organs by Fusing UAV Orthophoto Maps and Raw Images

by Huaiyang Liu, Huibin Li, Haozhou Wang, Chuanghai Liu, Jianping Qian, Zhanbiao Wang and Changxing Geng

Remote Sens. 2025, 17(5), 906; https://doi.org/10.3390/rs17050906 - 4 Mar 2025

Cited by 2 | Viewed by 1130

Abstract

Extracting the quantity and geolocation data of small objects at the organ level via large-scale aerial drone monitoring is both essential and challenging for precision agriculture. The quality of reconstructed digital orthophoto maps (DOMs) often suffers from seamline distortion and ghost effects, making it difficult to meet the requirements for organ-level detection. While raw images do not exhibit these issues, they pose challenges in accurately obtaining the geolocation data of detected small objects. The detection of small objects was improved in this study through the fusion of orthophoto maps with raw images using the EasyIDP tool, thereby establishing a mapping relationship from the raw images to geolocation data. Small object detection was conducted by using the Slicing-Aided Hyper Inference (SAHI) framework and YOLOv10n on raw images to accelerate the inferencing speed for large-scale farmland. As a result, compared with detection directly using a DOM, the speed of detection was accelerated and the accuracy was improved. The proposed SAHI-YOLOv10n achieved precision and mean average precision (mAP) scores of 0.825 and 0.864, respectively. It also achieved a processing latency of 1.84 milliseconds on 640 × 640 resolution frames for large-scale application. Subsequently, a novel crop canopy organ-level object detection dataset (CCOD-Dataset) was created via interactive annotation with SAHI-YOLOv10n, featuring 3986 images and 410,910 annotated boxes. The proposed fusion method demonstrated feasibility for detecting small objects at the organ level in three large-scale in-field farmlands, potentially benefiting future wide-range applications. Full article

(This article belongs to the Special Issue Proximal and Remote Sensing for Precision Crop Management II)

15 pages, 9146 KB

Open Access Article

Research on Intelligent Recognition Method of Ground Penetrating Radar Images Based on SAHI

by Ruimin Chen, Ligang Cao, Congde Lu and Lei Liu

Appl. Sci. 2024, 14(18), 8470; https://doi.org/10.3390/app14188470 - 20 Sep 2024

Cited by 2 | Viewed by 1508

Abstract

Deep learning techniques have flourished in recent years and have shown great potential in ground-penetrating radar (GPR) data interpretation. However, obtaining sufficient training data is a great challenge. This paper proposes an intelligent recognition method based on slicing-aided hyper inference (SAHI) for GPR images. Firstly, for the problem of insufficient samples of GPR images with structural loose distresses, data augmentation is carried out based on deep convolutional generative adversarial networks (DCGAN). Since distress features occupy fewer pixels on the original image, to allow the model to pay greater attention to the distress features, it is necessary to crop the original images centered on the distress labeling boxes first, and then input the cropped images into the model for training. Then, the YOLOv5 model is used for distress detection and the SAHI framework is used in the training and inference stages. The experimental results show that the detection accuracy is improved by 5.3% after adding the DCGAN-generated images, which verifies the effectiveness of the DCGAN-generated images. The detection accuracy is improved by 10.8% after using the SAHI framework in the training and inference stages, which indicates that SAHI is a key part of improving detection performance, as it significantly improves the ability to recognize distress. Full article

(This article belongs to the Special Issue Ground Penetrating Radar (GPR): Theory, Methods and Applications)

14 pages, 4199 KB

Open Access Article

Detection Method for Inter-Turn Short Circuit Faults in Dry-Type Transformers Based on an Improved YOLOv8 Infrared Image Slicing-Aided Hyper-Inference Algorithm

by Zhaochuang Zhang, Jianhua Xia, Yuchuan Wen, Liting Weng, Zuofu Ma, Hekai Yang, Haobo Yang, Jinyao Dou, Jingang Wang and Pengcheng Zhao

Energies 2024, 17(18), 4559; https://doi.org/10.3390/en17184559 - 12 Sep 2024

Cited by 4 | Viewed by 1736

Abstract

Inter-Turn Short Circuit (ITSC) faults do not necessarily produce high temperatures but exhibit distinct heat distribution and characteristics. This paper proposes a novel fault diagnosis and identification scheme utilizing an improved You Only Look Once version 8 (YOLOv8) algorithm, enhanced with an infrared image slicing-aided hyper-inference (SAHI) technique, to automatically detect ITSC fault trajectories in dry-type transformers. The infrared image acquisition system gathers data on ITSC fault trajectories and captures images with varying contrast to enhance the robustness of the recognition model. Given that the fault trajectory constitutes a small portion of the overall infrared image and is subject to significant background interference, traditional recognition algorithms often misjudge or omit faults. To address this, a YOLOv8-based visual detection method incorporating Dynamic Snake Convolution (DSConv) and the Slicing-Aided Hyper-Inference algorithm is proposed. This method aims to improve recognition precision and accuracy for small targets in complex backgrounds, facilitating accurate detection of ITSC faults in dry-type transformers. Comparative tests with the YOLOv8 model, Fast Region-based Convolutional Neural Networks (Fast-RCNNs), and Residual Neural Networks (Retina-Nets) demonstrate that the enhancements significantly improve model convergence speed and fault trajectory detection accuracy. The approach offers valuable insights for advancing infrared image diagnostic technology in electrical power equipment. Full article

(This article belongs to the Section F: Electrical Engineering)

16 pages, 19548 KB

Open Access Article

Using YOLOv5, SAHI, and GIS with Drone Mapping to Detect Giant Clams on the Great Barrier Reef

by Olivier Decitre and Karen E. Joyce

Drones 2024, 8(9), 458; https://doi.org/10.3390/drones8090458 - 3 Sep 2024

Cited by 3 | Viewed by 4091

Abstract

Despite the ecological importance of giant clams (Tridacninae), their effective management and conservation is challenging due to their widespread distribution and labour-intensive monitoring methods. In this study, we present an alternative approach to detecting and mapping clam density at Pioneer Bay on Goolboddi (Orpheus) Island on the Great Barrier Reef using drone data with a combination of deep learning tools and a geographic information system (GIS). We trained and evaluated 11 models using YOLOv5 (You Only Look Once, version 5) with varying numbers of input image tiles and augmentations (mean average precision—mAP: 63–83%). We incorporated the Slicing Aided Hyper Inference (SAHI) library to detect clams across orthomosaics, eliminating duplicate counts of clams straddling multiple tiles, and further, applied our models in three other geographic locations on the Great Barrier Reef, demonstrating transferability. Finally, by linking detections with their original geographic coordinates, we illustrate the workflow required to quantify animal densities, mapping up to seven clams per square meter in Pioneer Bay. Our workflow brings together several otherwise disparate steps to create an end-to-end approach for detecting and mapping animals with aerial drones. This provides ecologists and conservationists with actionable and clear quantitative and visual insights from drone mapping data. Full article

(This article belongs to the Section Drones in Ecology)
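
Linking detections to geographic coordinates and turning counts into densities, as the abstract above describes, can be sketched as follows. The sketch assumes a six-coefficient affine geotransform in the GDAL-style ordering; the coefficient values, pixel size, and counts are hypothetical and are not taken from the study.

```python
def pixel_to_map(col, row, geotransform):
    """Map a pixel (col, row) to map coordinates with a six-coefficient
    affine geotransform: (x_origin, x_per_col, x_per_row,
                          y_origin, y_per_col, y_per_row)."""
    x = geotransform[0] + col * geotransform[1] + row * geotransform[2]
    y = geotransform[3] + col * geotransform[4] + row * geotransform[5]
    return x, y


def density_per_m2(count, pixel_width_m, pixel_height_m, cols, rows):
    """Detections per square metre inside a cols x rows pixel window."""
    area_m2 = abs(pixel_width_m * pixel_height_m) * cols * rows
    return count / area_m2


# Hypothetical north-up orthomosaic with 1 cm pixels (y decreases with row).
gt = (500000.0, 0.01, 0.0, 7940000.0, 0.0, -0.01)

x, y = pixel_to_map(100, 200, gt)                  # centre of a detection, in pixels
density = density_per_m2(7, 0.01, -0.01, 100, 100)  # 7 detections in a 1 m x 1 m window
```

In practice the geotransform would be read from the orthomosaic's metadata rather than written by hand.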

24 pages, 7433 KB

Open Access Editor’s Choice Article

Improved YOLOv8 and SAHI Model for the Collaborative Detection of Small Targets at the Micro Scale: A Case Study of Pest Detection in Tea

by Rong Ye, Quan Gao, Ye Qian, Jihong Sun and Tong Li

Agronomy 2024, 14(5), 1034; https://doi.org/10.3390/agronomy14051034 - 13 May 2024

Cited by 34 | Viewed by 5764

Abstract

Pest target identification in agricultural production environments is challenging due to the dense distribution and small size of pests. Additionally, changeable environmental lighting and complex backgrounds further complicate the detection process. This study focuses on enhancing the recognition performance of tea pests by introducing a lightweight pest image recognition model based on the improved YOLOv8 architecture. First, slicing-aided fine-tuning and slicing-aided hyper inference (SAHI) are proposed to partition input images for enhanced model performance on low-resolution images and small-target detection. Then, based on an ELAN, a generalized efficient layer aggregation network (GELAN) is designed to replace the C2f module in the backbone network, enhance its feature extraction ability, and construct a lightweight model. Additionally, the MS structure is integrated into the neck network of YOLOv8 for feature fusion, enhancing the extraction of fine-grained and coarse-grained semantic information. Furthermore, the BiFormer attention mechanism, based on the Transformer architecture, is introduced to amplify target characteristics of tea pests. Finally, the inner-MPDIoU, based on auxiliary borders, is utilized as a replacement for the original loss function to enhance its learning capacity for complex pest samples. Our experimental results demonstrate that the enhanced YOLOv8 model achieves a precision of 96.32% and a recall of 97.95%, surpassing those of the original YOLOv8 model. Moreover, it attains an mAP@50 score of 98.17%. Compared to Faster R-CNN, SSD, YOLOv5, YOLOv7, and YOLOv8, its average accuracy is 17.04, 11.23, 5.78, 3.75, and 2.71 percentage points higher, respectively. Overall, the improved YOLOv8 model outperforms current mainstream detection models, with a detection speed of 95 FPS. This model effectively balances lightweight design with high accuracy and speed in detecting small targets such as tea pests. It can serve as a valuable reference for the identification and classification of various insect pests in tea gardens within complex production environments, effectively addressing practical application needs and offering guidance for the future monitoring and scientific control of tea insect pests. Full article

(This article belongs to the Special Issue Innovation of Intelligent Detection and Pesticide Application Technology for Horticultural Crops)

25 pages, 20130 KB

Open Access Article

Improved YOLOv5 Based on Multi-Strategy Integration for Multi-Category Wind Turbine Surface Defect Detection

by Mingwei Lei, Xingfen Wang, Meihua Wang and Yitao Cheng

Energies 2024, 17(8), 1796; https://doi.org/10.3390/en17081796 - 9 Apr 2024

Cited by 7 | Viewed by 2043

Abstract

Wind energy is a renewable resource with abundant reserves, and its sustainable development and utilization are crucial. The components of wind turbines, particularly the blades and various surfaces, require meticulous defect detection and maintenance due to their significance. The operational status of wind turbine generators directly impacts the efficiency and safe operation of wind farms. Traditional surface defect detection methods for wind turbines often involve manual operations, which suffer from issues such as high subjectivity, elevated risks, low accuracy, and inefficiency. The emergence of computer vision technologies based on deep learning has provided a novel approach to surface defect detection in wind turbines. However, existing datasets designed for wind turbine surface defects exhibit overall category scarcity and an imbalance in samples between categories. The algorithms designed face challenges, with low detection rates for small samples. Hence, this study first constructs a benchmark dataset for wind turbine surface defects comprising seven categories that encompass all common surface defects. Simultaneously, a wind turbine surface defect detection algorithm based on improved YOLOv5 is designed. Initially, a multi-scale copy-paste data augmentation method is proposed, introducing scale factors to randomly resize the bounding boxes before copy-pasting. This alleviates sample imbalances and significantly enhances the algorithm’s detection capabilities for targets of different sizes. Subsequently, a dynamic label assignment strategy based on the Hungarian algorithm is introduced that calculates the matching costs by weighing different losses, enhancing the network’s ability to learn positive and negative samples. To address overfitting and misrecognition resulting from strong data augmentation, a two-stage progressive training method is proposed, aiding the model’s natural convergence and improving generalization performance. 
Furthermore, a multi-scenario negative-sample-guided learning method is introduced that involves incorporating unlabeled background images from various scenarios into training, guiding the model to learn negative samples and reducing misrecognition. Finally, slicing-aided hyper inference is introduced, facilitating large-scale inference for wind turbine surface defects in actual industrial scenarios. The improved algorithm demonstrates a 3.1% increase in the mean average precision (mAP) on the custom dataset, achieving 95.7% mAP_50 (the mAP computed at an IoU threshold of 0.5). Notably, the mAPs for small, medium, and large targets increase by 18.6%, 16.4%, and 6.8%, respectively. The experimental results indicate that the enhanced algorithm exhibits high detection accuracy, providing a new and more efficient solution for the field of wind turbine surface defect detection. Full article

(This article belongs to the Section A3: Wind, Wave and Tidal Energy)

23 pages, 4882 KB

Open Access Article

USES-Net: An Infrared Dim and Small Target Detection Network with Embedded Knowledge Priors

by Lingxiao Li, Linlin Liu, Yunan He and Zhuqiang Zhong

Electronics 2024, 13(7), 1400; https://doi.org/10.3390/electronics13071400 - 8 Apr 2024

Cited by 4 | Viewed by 2199

Abstract

Detecting and identifying small infrared targets has always been a crucial technology for many applications. To address the low accuracy, high false-alarm rate, and poor environmental adaptability that commonly exist in infrared target detection methods, this paper proposes a composite infrared dim and small target detection model called USES-Net, which combines the target prior knowledge and conventional data-driven deep learning networks to make use of both labeled data and the domain knowledge. Based on the typical encoder–decoder structure, USES-Net firstly introduces the self-attention mechanism of Swin Transformer to replace the universal convolution kernel at the encoder end. This helps to extract potential features related to dim, small targets in a larger receptive field. In addition, USES-Net includes an embedded patch-based contrast learning module (EPCLM) to integrate the spatial distribution of the target as a knowledge prior in the training network model. This guides the training process of the constrained network model with clear physical interpretability. Finally, USES-Net also designs a bottom-up cross-layer feature fusion module (AFM) as the decoder of the network, and a data-slicing-aided enhancement and inference method based on Slicing Aided Hyper Inference (SAHI) is utilized to further improve the model’s detection accuracy. An experimental comparative analysis shows that USES-Net achieves the best results on three typical infrared weak-target datasets: NUAA-SIRST, NUDT-SIRST, and IRSTD-1K. The results of the target segmentation are complete and sufficient, which demonstrates the validity and practicality of the proposed method in comparison to others. Full article

(This article belongs to the Special Issue Object Detection, Segmentation and Categorization in Artificial Intelligence)

12 pages, 8148 KB

Open Access Article

Multi-Module Fusion Model for Submarine Pipeline Identification Based on YOLOv5

by Bochen Duan, Shengping Wang, Changlong Luo and Zhigao Chen

J. Mar. Sci. Eng. 2024, 12(3), 451; https://doi.org/10.3390/jmse12030451 - 3 Mar 2024

Cited by 6 | Viewed by 2259

Abstract

In recent years, the surge in marine activities has increased the frequency of submarine pipeline failures. Detecting and identifying the buried conditions of submarine pipelines has become critical. Sub-bottom profilers (SBPs) are widely employed for pipeline detection, yet manual data interpretation hampers efficiency. The present study proposes an automated detection method for submarine pipelines using deep learning models. The approach enhances the YOLOv5s model by integrating Squeeze and Excitation Networks (SE-Net) and S2-MLPv2 attention modules into the backbone network structure. The Slicing Aided Hyper Inference (SAHI) module is subsequently introduced to recognize original large-image data. Experimental results conducted in the Yellow Sea region demonstrate that the refined model achieves a precision of 82.5%, recall of 99.2%, and harmonic mean (F1 score) of 90.0% on actual submarine pipeline data detected using an SBP. These results demonstrate the efficiency of the proposed method and applicability in real-world scenarios. Full article

(This article belongs to the Special Issue Morphological Processes and Evolution of Marine Geomorphology: Observations, Modeling and Applications)
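
The F1 score quoted above is the harmonic mean of precision and recall, F1 = 2PR / (P + R). A short sketch that recomputes it from the rounded figures in the abstract (the function name is an illustrative choice):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values reported in the abstract: precision 82.5%, recall 99.2%.
f1 = f1_score(0.825, 0.992)
print(f"F1 = {f1:.3f}")
```

The result is approximately 0.90, in line with the reported 90.0% given that the published precision and recall are themselves rounded.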

17 pages, 4718 KB

Open Access Article

Ensemble Deep Learning for Automated Damage Detection of Trailers at Intermodal Terminals

by Pavel Cimili, Jana Voegl, Patrick Hirsch and Manfred Gronalt

Sustainability 2024, 16(3), 1218; https://doi.org/10.3390/su16031218 - 31 Jan 2024

Cited by 1 | Viewed by 2985

Abstract

Efficient damage detection of trailers is essential for improving processes at inland intermodal terminals. This paper presents an automated damage detection (ADD) algorithm for trailers utilizing ensemble learning based on YOLOv8 and RetinaNet networks. The algorithm achieves 88.33% accuracy and an 81.08% F1-score on the real-life trailer damage dataset by leveraging the strengths of each object detection model. YOLOv8 is trained explicitly for detecting belt damage, while RetinaNet handles detecting other damage types and is used for cropping trailers from images. These one-stage detectors outperformed the two-stage Faster R-CNN in all tested tasks within this research. Furthermore, the algorithm incorporates slicing-aided hyper inference, which significantly contributes to the efficient processing of high-resolution trailer images. Integrating the proposed ADD solution into terminal operating systems allows a substantial workload reduction at the ingate of intermodal terminals and therefore supports more sustainable transportation solutions. Full article

(This article belongs to the Special Issue Sustainable Supply Chain Optimization and Risk Management)

20 pages, 6428 KB

Open Access Article

Automatic Detection of Feral Pigeons in Urban Environments Using Deep Learning

by Zhaojin Guo, Zheng He, Li Lyu, Axiu Mao, Endai Huang and Kai Liu

Animals 2024, 14(1), 159; https://doi.org/10.3390/ani14010159 - 3 Jan 2024

Cited by 2 | Viewed by 3282

Abstract

The overpopulation of feral pigeons in Hong Kong has significantly disrupted the urban ecosystem, highlighting the urgent need for effective strategies to control their population. In general, control measures should be implemented and re-evaluated periodically following accurate estimations of the feral pigeon population in the concerned regions, which, however, is very difficult in urban environments due to the concealment and mobility of pigeons within complex building structures. With the advances in deep learning, computer vision can be a promising tool for pigeon monitoring and population estimation but has not been well investigated so far. Therefore, we propose an improved deep learning model (Swin-Mask R-CNN with SAHI) for feral pigeon detection. Our model consists of three parts. Firstly, the Swin Transformer network (STN) extracts deep feature information. Secondly, the Feature Pyramid Network (FPN) fuses multi-scale features to learn at different scales. Lastly, the model’s three head branches are responsible for classification, best bounding box prediction, and segmentation. During the prediction phase, we utilize a Slicing-Aided Hyper Inference (SAHI) tool to focus on the feature information of small feral pigeon targets. Experiments were conducted on a feral pigeon dataset to evaluate model performance. The results reveal that our model achieves excellent recognition performance for feral pigeons.

(This article belongs to the Special Issue Sensors-Assisted Observation of Wildlife)

19 pages, 22358 KB

Open Access Article

SIPNet & SAHI: Multiscale Sunspot Extraction for High-Resolution Full Solar Images

by Dongxin Fan, Yunfei Yang, Song Feng, Wei Dai, Bo Liang and Jianping Xiong

Appl. Sci. 2024, 14(1), 7; https://doi.org/10.3390/app14010007 - 19 Dec 2023

Cited by 4 | Viewed by 1780

Abstract

Photospheric magnetic fields are manifested as sunspots, which cover various sizes over high-resolution, full-disk, solar continuum images. This paper proposes a novel deep learning method named SIPNet, which is designed to extract and segment multiscale sunspots. It presents a new Switchable Atrous Spatial Pyramid Pooling (SASPP) module based on ASPP, employs an IoU-aware dense object detector, and incorporates a prototype mask generation technique. Furthermore, an open-source framework known as Slicing Aided Hyper Inference (SAHI) is integrated on top of the trained SIPNet model. A comprehensive sunspot dataset is built, containing more than 27,000 sunspots. The precision, recall, and average precision metrics of the SIPNet & SAHI method were measured as 95.7%, 90.2%, and 96.1%, respectively. The results indicate that the SIPNet & SAHI method has good performance in detecting and segmenting large-scale sunspots, particularly in small and ultra-small sunspots. The method also provides a new solution for solving similar problems.

(This article belongs to the Special Issue Advanced Image Analysis and Processing Technologies and Applications)

Search Results (25)
