Search Results (720)

Search Parameters:
Keywords = SEG

22 pages, 6482 KiB  
Article
Surface Damage Detection in Hydraulic Structures from UAV Images Using Lightweight Neural Networks
by Feng Han and Chongshi Gu
Remote Sens. 2025, 17(15), 2668; https://doi.org/10.3390/rs17152668 (registering DOI) - 1 Aug 2025
Abstract
Timely and accurate identification of surface damage in hydraulic structures is essential for maintaining structural integrity and ensuring operational safety. Traditional manual inspections are time-consuming, labor-intensive, and prone to subjectivity, especially for large-scale or inaccessible infrastructure. Leveraging advancements in aerial imaging, unmanned aerial vehicles (UAVs) enable efficient acquisition of high-resolution visual data across expansive hydraulic environments. However, existing deep learning (DL) models often lack architectural adaptations for the visual complexities of UAV imagery, including low-texture contrast, noise interference, and irregular crack patterns. To address these challenges, this study proposes a lightweight, robust, and high-precision segmentation framework, called LFPA-EAM-Fast-SCNN, specifically designed for pixel-level damage detection in UAV-captured images of hydraulic concrete surfaces. The developed DL-based model integrates an enhanced Fast-SCNN backbone for efficient feature extraction, a Lightweight Feature Pyramid Attention (LFPA) module for multi-scale context enhancement, and an Edge Attention Module (EAM) for refined boundary localization. The experimental results on a custom UAV-based dataset show that the proposed damage detection method achieves superior performance, with a precision of 0.949, a recall of 0.892, an F1 score of 0.906, and an IoU of 87.92%, outperforming U-Net, Attention U-Net, SegNet, DeepLab v3+, I-ST-UNet, and SegFormer. Additionally, it reaches a real-time inference speed of 56.31 FPS, significantly surpassing other models. The experimental results demonstrate the proposed framework’s strong generalization capability and robustness under varying noise levels and damage scenarios, underscoring its suitability for scalable, automated surface damage assessment in UAV-based remote sensing of civil infrastructure. Full article
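The precision, recall, F1, and IoU figures reported here (and in several entries below) are standard pixel-level quantities derived from confusion counts between a predicted mask and the ground truth. A minimal NumPy sketch, with an illustrative function name not taken from the paper:

```python
import numpy as np

def segmentation_metrics(pred, gt):
    """Pixel-level metrics for binary masks (1 = damage, 0 = background)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()       # predicted damage that is real
    fp = np.logical_and(pred, ~gt).sum()      # false alarms
    fn = np.logical_and(~pred, gt).sum()      # missed damage
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return precision, recall, f1, iou
```

Note that IoU is stricter than F1 on the same masks, which is why papers typically report both.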

40 pages, 18911 KiB  
Article
Twin-AI: Intelligent Barrier Eddy Current Separator with Digital Twin and AI Integration
by Shohreh Kia, Johannes B. Mayer, Erik Westphal and Benjamin Leiding
Sensors 2025, 25(15), 4731; https://doi.org/10.3390/s25154731 (registering DOI) - 31 Jul 2025
Abstract
The current paper presents a comprehensive intelligent system designed to optimize the performance of a barrier eddy current separator (BECS), comprising a conveyor belt, a vibration feeder, and a magnetic drum. This system was trained and validated on real-world industrial data gathered directly from the working separator under 81 different operational scenarios. The intelligent models were used to recommend optimal settings for drum speed, belt speed, vibration intensity, and drum angle, thereby maximizing separation quality and minimizing energy consumption. The smart separation module utilizes YOLOv11n-seg and achieves a mean average precision (mAP) of 0.838 across 7163 industrial instances from aluminum, copper, and plastic materials. For shape classification (sharp vs. smooth), the model reached 91.8% accuracy across 1105 annotated samples. Furthermore, the thermal monitoring unit can detect iron contamination by analyzing temperature anomalies. Scenarios with iron showed a maximum temperature increase of over 20 °C compared to clean materials, with a detection response time of under 2.5 s. The architecture integrates a Digital Twin using Azure Digital Twins to virtually mirror the system, enabling real-time tracking, behavior simulation, and remote updates. A full connection with the PLC has been implemented, allowing the AI-driven system to adjust physical parameters autonomously. This combination of AI, IoT, and digital twin technologies delivers a reliable and scalable solution for enhanced separation quality, improved operational safety, and predictive maintenance in industrial recycling environments. Full article
(This article belongs to the Special Issue Sensors and IoT Technologies for the Smart Industry)
19 pages, 3130 KiB  
Article
Deep Learning-Based Instance Segmentation of Galloping High-Speed Railway Overhead Contact System Conductors in Video Images
by Xiaotong Yao, Huayu Yuan, Shanpeng Zhao, Wei Tian, Dongzhao Han, Xiaoping Li, Feng Wang and Sihua Wang
Sensors 2025, 25(15), 4714; https://doi.org/10.3390/s25154714 (registering DOI) - 30 Jul 2025
Abstract
The conductors of high-speed railway OCSs (Overhead Contact Systems) are susceptible to conductor galloping due to the impact of natural elements such as strong winds, rain, and snow, resulting in conductor fatigue damage and significantly compromising train operational safety. Consequently, monitoring the galloping status of conductors is crucial, and instance segmentation techniques, by delineating the pixel-level contours of each conductor, can significantly aid in the identification and study of galloping phenomena. This work expands upon the YOLO11-seg model and introduces an instance segmentation approach for galloping video and image sensor data of OCS conductors. The algorithm, designed for the stripe-like distribution of OCS conductors in the data, employs four-direction Sobel filters to extract edge features in horizontal, vertical, and diagonal orientations. These features are subsequently integrated with the original convolutional branch to form the FDSE (Four Direction Sobel Enhancement) module. It integrates the ECA (Efficient Channel Attention) mechanism for the adaptive augmentation of conductor characteristics and utilizes the FL (Focal Loss) function to mitigate the class-imbalance issue between positive and negative samples, hence enhancing the model’s sensitivity to conductors. Consequently, segmentation outcomes from neighboring frames are utilized, and mask-difference analysis is performed to autonomously detect conductor galloping locations, emphasizing their contours for the clear depiction of galloping characteristics. Experimental results demonstrate that the enhanced YOLO11-seg model achieves 85.38% precision, 77.30% recall, 84.25% AP@0.5, 81.14% F1-score, and a real-time processing speed of 44.78 FPS. When combined with the galloping visualization module, it can issue real-time alerts of conductor galloping anomalies, providing robust technical support for railway OCS safety monitoring. Full article
(This article belongs to the Section Industrial Sensors)
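The mask-difference step described above — comparing segmentation masks from neighboring frames to flag galloping conductors — can be sketched with plain NumPy. The function name and the 5% change threshold are illustrative assumptions, not values from the paper:

```python
import numpy as np

def detect_galloping(mask_prev, mask_curr, thresh=0.05):
    """Flag galloping when consecutive-frame conductor masks disagree.

    Returns the per-pixel difference mask and a boolean alarm.
    """
    prev, curr = mask_prev.astype(bool), mask_curr.astype(bool)
    diff = np.logical_xor(prev, curr)            # pixels that changed between frames
    denom = max(int(prev.sum() + curr.sum()), 1) # total conductor area, guarded
    change_ratio = 2.0 * diff.sum() / denom      # change relative to conductor area
    return diff, bool(change_ratio > thresh)
```

A stationary conductor yields an empty difference mask, while a displaced one lights up both its old and new positions, which is what lets the contour of the galloping region be highlighted.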

25 pages, 2518 KiB  
Article
An Efficient Semantic Segmentation Framework with Attention-Driven Context Enhancement and Dynamic Fusion for Autonomous Driving
by Jia Tian, Peizeng Xin, Xinlu Bai, Zhiguo Xiao and Nianfeng Li
Appl. Sci. 2025, 15(15), 8373; https://doi.org/10.3390/app15158373 - 28 Jul 2025
Abstract
In recent years, a growing number of real-time semantic segmentation networks have been developed to improve segmentation accuracy. However, these advancements often come at the cost of increased computational complexity, which limits their inference efficiency, particularly in scenarios such as autonomous driving, where strict real-time performance is essential. Achieving an effective balance between speed and accuracy has thus become a central challenge in this field. To address this issue, we present a lightweight semantic segmentation model tailored for the perception requirements of autonomous vehicles. The architecture follows an encoder–decoder paradigm, which not only preserves the capability for deep feature extraction but also facilitates multi-scale information integration. The encoder leverages a high-efficiency backbone, while the decoder introduces a dynamic fusion mechanism designed to enhance information interaction between different feature branches. Recognizing the limitations of convolutional networks in modeling long-range dependencies and capturing global semantic context, the model incorporates an attention-based feature extraction component. This is further augmented by positional encoding, enabling better awareness of spatial structures and local details. The dynamic fusion mechanism employs an adaptive weighting strategy, adjusting the contribution of each feature channel to reduce redundancy and improve representation quality. To validate the effectiveness of the proposed network, experiments were conducted on a single RTX 3090 GPU. The Dynamic Real-time Integrated Vision Encoder–Segmenter Network (DriveSegNet) achieved a mean Intersection over Union (mIoU) of 76.9% and an inference speed of 70.5 FPS on the Cityscapes test dataset, 74.6% mIoU and 139.8 FPS on the CamVid test dataset, and 35.8% mIoU with 108.4 FPS on the ADE20K dataset. The experimental results demonstrate that the proposed method achieves an excellent balance between inference speed, segmentation accuracy, and model size. Full article
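The mIoU metric used in benchmarks like Cityscapes and ADE20K averages per-class intersection-over-union over the classes actually present. A brief NumPy sketch, illustrative rather than the authors' evaluation code:

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """mIoU from integer label maps, skipping classes absent from both."""
    ious = []
    for c in range(num_classes):
        p, g = pred == c, gt == c
        union = np.logical_or(p, g).sum()
        if union == 0:
            continue  # class absent everywhere: skip rather than count as 1.0
        ious.append(np.logical_and(p, g).sum() / union)
    return float(np.mean(ious))
```

Benchmark implementations differ in how they treat absent classes and ignore labels, so reported numbers are only comparable when the same evaluation protocol is used.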

27 pages, 11177 KiB  
Article
Robust Segmentation of Lung Proton and Hyperpolarized Gas MRI with Vision Transformers and CNNs: A Comparative Analysis of Performance Under Artificial Noise
by Ramtin Babaeipour, Matthew S. Fox, Grace Parraga and Alexei Ouriadov
Bioengineering 2025, 12(8), 808; https://doi.org/10.3390/bioengineering12080808 - 28 Jul 2025
Abstract
Accurate segmentation in medical imaging is essential for disease diagnosis and monitoring, particularly in lung imaging using proton and hyperpolarized gas MRI. However, image degradation due to noise and artifacts—especially in hyperpolarized gas MRI, where scans are acquired during breath-holds—poses challenges for conventional segmentation algorithms. This study evaluates the robustness of deep learning segmentation models under varying Gaussian noise levels, comparing traditional convolutional neural networks (CNNs) with modern Vision Transformer (ViT)-based models. Using a dataset of proton and hyperpolarized gas MRI slices from 56 participants, we trained and tested Feature Pyramid Network (FPN) and U-Net architectures with both CNN (VGG16, VGG19, ResNet152) and ViT (MiT-B0, B3, B5) backbones. Results showed that ViT-based models, particularly those using the SegFormer backbone, consistently outperformed CNN-based counterparts across all metrics and noise levels. The performance gap was especially pronounced in high-noise conditions, where transformer models retained higher Dice scores and lower boundary errors. These findings highlight the potential of ViT-based architectures for deployment in clinically realistic, low-SNR environments such as hyperpolarized gas MRI, where segmentation reliability is critical. Full article
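The robustness protocol described above pairs a Dice overlap score with controlled Gaussian corruption of the input. A minimal NumPy sketch of both pieces; the function names and the fixed seed are illustrative assumptions:

```python
import numpy as np

def dice(pred, gt, eps=1e-7):
    """Dice coefficient between two binary masks (1.0 = perfect overlap)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    return (2 * np.logical_and(pred, gt).sum() + eps) / (pred.sum() + gt.sum() + eps)

def add_gaussian_noise(img, sigma, rng=None):
    """Corrupt a [0, 1] image at a chosen noise level, clipping back to range."""
    rng = np.random.default_rng(0) if rng is None else rng
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)
```

Sweeping `sigma` and re-running inference on the noisy copies is the usual way to produce the noise-level curves such comparisons rely on.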

30 pages, 92065 KiB  
Article
A Picking Point Localization Method for Table Grapes Based on PGSS-YOLOv11s and Morphological Strategies
by Jin Lu, Zhongji Cao, Jin Wang, Zhao Wang, Jia Zhao and Minjie Zhang
Agriculture 2025, 15(15), 1622; https://doi.org/10.3390/agriculture15151622 - 26 Jul 2025
Abstract
During the automated picking of table grapes, the automatic recognition and segmentation of grape pedicels, along with the positioning of picking points, are vital components for all subsequent operations of the harvesting robot. In the actual scene of a grape plantation, however, it is extremely difficult to accurately and efficiently identify and segment grape pedicels and then reliably locate the picking points. This is attributable to the low distinguishability between grape pedicels and the surrounding environment, such as branches, as well as the impacts of other conditions like weather, lighting, and occlusion, coupled with the requirements for model deployment on edge devices with limited computing resources. To address these issues, this study proposes a novel picking point localization method for table grapes based on an instance segmentation network called Progressive Global-Local Structure-Sensitive Segmentation (PGSS-YOLOv11s) and a simple combination strategy of morphological operators. More specifically, PGSS-YOLOv11s combines the original YOLOv11s-seg backbone with a spatial feature aggregation module (SFAM), an adaptive feature fusion module (AFFM), and a detail-enhanced convolutional shared detection head (DE-SCSH). The network was trained on a new grape segmentation dataset called Grape-⊥, which includes 4455 grape pixel-level instances annotated with ⊥-shaped regions. After PGSS-YOLOv11s segments the ⊥-shaped regions of grapes, morphological operations such as erosion, dilation, and skeletonization are combined to extract grape pedicels and locate picking points. Finally, several experiments were conducted to confirm the validity, effectiveness, and superiority of the proposed method. Compared with other state-of-the-art models, PGSS-YOLOv11s reached an F1 score of 94.6% and a mask mAP@0.5 of 95.2% on the Grape-⊥ dataset, as well as 85.4% and 90.0% on the Winegrape dataset. Multi-scenario tests indicated that the success rate of positioning the picking points reached up to 89.44%. In orchards, real-time tests on the edge device demonstrated the practical performance of our method. Nevertheless, for grapes with short or occluded pedicels, the designed morphological algorithm sometimes failed to compute picking points. In future work, we will enrich the grape dataset by collecting images under different lighting conditions, from various shooting angles, and with more grape varieties to improve the method’s generalization performance. Full article
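The morphological post-processing described above — cleaning the segmented ⊥-region before locating a picking point — can be sketched in pure NumPy. This is a toy stand-in for the paper's pipeline: skeletonization is omitted, edge handling in the erosion is approximate, and the centroid rule is an illustrative substitute for the authors' picking-point logic:

```python
import numpy as np

def binary_dilate(m):
    """3x3 binary dilation built from shifted ORs (pure NumPy sketch)."""
    p = np.pad(m.astype(bool), 1)
    out = np.zeros_like(p)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out |= np.roll(np.roll(p, dy, axis=0), dx, axis=1)
    return out[1:-1, 1:-1]

def binary_erode(m):
    """Erosion as the dual of dilation (borders are treated as foreground)."""
    return ~binary_dilate(~m.astype(bool))

def picking_point(region_mask):
    """Morphological opening to drop thin noise, then a centroid picking point."""
    m = binary_dilate(binary_erode(region_mask))  # opening = erode then dilate
    ys, xs = np.nonzero(m)
    if ys.size == 0:
        return None
    return int(round(ys.mean())), int(round(xs.mean()))
```

Opening removes isolated speckles and hair-thin false positives while largely restoring the solid pedicel region, which is why it precedes skeleton-based localization in pipelines like this one.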
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)

21 pages, 3463 KiB  
Article
Apple Rootstock Cutting Drought-Stress-Monitoring Model Based on IMYOLOv11n-Seg
by Xu Wang, Hongjie Liu, Pengfei Wang, Long Gao and Xin Yang
Agriculture 2025, 15(15), 1598; https://doi.org/10.3390/agriculture15151598 - 24 Jul 2025
Abstract
To ensure the normal water status of apple rootstock softwood cuttings during the initial stage of cutting, a drought stress monitoring model was designed. The model is optimized from the YOLOv11n-seg instance segmentation model, using the leaf curl degree of cuttings as the classification basis for drought-stress grades. The backbone structure of the IMYOLOv11n-seg model is improved by the C3K2_CMUNeXt module and the multi-head self-attention (MHSA) mechanism module. The neck part is optimized by the KFHA module (Kalman filter and Hungarian algorithm model), and the head part enhances post-processing through HIoU-SD (hierarchical IoU–spatial distance filtering algorithm). The IMYOLOv11n-seg model achieves an average inference speed of 33.53 FPS (frames per second) and a mean intersection over union (MIoU) of 0.927. The average recognition accuracies for cuttings under normal water status, mild drought stress, moderate drought stress, and severe drought stress are 94.39%, 93.27%, 94.31%, and 94.71%, respectively. The IMYOLOv11n-seg model demonstrates the best comprehensive performance in ablation and comparative experiments. The automatic humidification system equipped with the IMYOLOv11n-seg model saves 6.14% more water than manual operation. This study provides a design approach for an automatic humidification system in protected agriculture during apple rootstock cutting propagation. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)

21 pages, 3825 KiB  
Article
Light Propagation and Multi-Scale Enhanced DeepLabV3+ for Underwater Crack Detection
by Wenji Ai, Jiaxuan Zou, Zongchao Liu, Shaodi Wang and Shuai Teng
Algorithms 2025, 18(8), 462; https://doi.org/10.3390/a18080462 - 24 Jul 2025
Abstract
Achieving state-of-the-art performance (82.5% IoU, 85.6% F1), this paper proposes an enhanced DeepLabV3+ model for robust underwater crack detection through three integrated innovations: a physics-based light propagation correction model for illumination distortion, multi-scale feature extraction for variable crack dimensions, and curvature flow-guided loss for boundary precision. Our approach significantly outperforms DeepLabV3+, SCTNet, and LarvSeg by 10.6–13.4% IoU, demonstrating particular strength in detecting small cracks (78.1% IoU) under challenging low-light/high-turbidity conditions. The solution provides a practical framework for automated underwater infrastructure inspection. Full article
(This article belongs to the Special Issue Machine Learning for Pattern Recognition (3rd Edition))

18 pages, 3368 KiB  
Article
Segmentation-Assisted Fusion-Based Classification for Automated CXR Image Analysis
by Shilu Kang, Dongfang Li, Jiaxin Xu, Aokun Mei and Hua Huo
Sensors 2025, 25(15), 4580; https://doi.org/10.3390/s25154580 - 24 Jul 2025
Abstract
Accurate classification of chest X-ray (CXR) images is crucial for diagnosing lung diseases in medical imaging. Existing deep learning models for CXR image classification face challenges in distinguishing non-lung features. In this work, we propose a new segmentation-assisted fusion-based classification method. The method involves two stages: first, we use a lightweight segmentation model, the Partial Convolutional Segmentation Network (PCSNet), designed on an encoder–decoder architecture, to accurately obtain lung masks from CXR images. Then, a fusion of the masked CXR image with the original image enables classification using the improved lightweight ShuffleNetV2 model. The proposed method is trained and evaluated on segmentation datasets including the Montgomery County Dataset (MC) and Shenzhen Hospital Dataset (SH), and classification datasets such as Chest X-Ray Images for Pneumonia (CXIP) and COVIDx. Compared with seven segmentation models (U-Net, Attention-Net, SegNet, FPNNet, DANet, DMNet, and SETR), five classification models (ResNet34, ResNet50, DenseNet121, Swin Transformer, and ShuffleNetV2), and state-of-the-art methods, our PCSNet model achieved high segmentation performance on CXR images. Compared to the state-of-the-art Attention-Net model, the accuracy of PCSNet increased by 0.19% (98.94% vs. 98.75%), and the boundary accuracy improved by 0.3% (97.86% vs. 97.56%), while requiring 62% fewer parameters. For pneumonia classification using the CXIP dataset, the proposed strategy outperforms the current best model by 0.14% in accuracy (98.55% vs. 98.41%). For COVID-19 classification with the COVIDx dataset, the model reached an accuracy of 97.50%, an absolute improvement of 0.1% over CovXNet, and the clinical metrics show more significant gains: specificity increased from 94.7% to 99.5%. These results highlight the model’s effectiveness in medical image analysis, demonstrating clinically meaningful improvements over state-of-the-art approaches. Full article
(This article belongs to the Special Issue Vision- and Image-Based Biomedical Diagnostics—2nd Edition)

16 pages, 10372 KiB  
Article
PRONOBIS: A Robotic System for Automated Ultrasound-Based Prostate Reconstruction and Biopsy Planning
by Matija Markulin, Luka Matijević, Janko Jurdana, Luka Šiktar, Branimir Ćaran, Toni Zekulić, Filip Šuligoj, Bojan Šekoranja, Tvrtko Hudolin, Tomislav Kuliš, Bojan Jerbić and Marko Švaco
Robotics 2025, 14(8), 100; https://doi.org/10.3390/robotics14080100 - 22 Jul 2025
Abstract
This paper presents the PRONOBIS project, an ultrasound-only, robotically assisted, deep learning-based system for prostate scanning and biopsy treatment planning. The proposed system addresses the challenges of precise prostate segmentation, reconstruction, and inter-operator variability by performing fully automated prostate scanning, real-time CNN-transformer-based image processing, 3D prostate reconstruction, and biopsy needle position planning. Fully automated prostate scanning is achieved by using a robotic arm equipped with an ultrasound system. Real-time ultrasound image processing utilizes state-of-the-art deep learning algorithms with intelligent post-processing techniques for precise prostate segmentation. To create a high-quality prostate segmentation dataset, this paper proposes a deep learning-based medical annotation platform, MedAP. For precise segmentation of the entire prostate sweep, DAF3D and MicroSegNet models are evaluated, and additional image post-processing methods are proposed. Three-dimensional visualization and prostate reconstruction are performed by utilizing the segmentation results and robotic positional data, enabling robust, user-friendly biopsy treatment planning. The real-time sweep scanning and segmentation operate at 30 Hz, enabling a complete scan in 15 to 20 s, depending on the size of the prostate. The system was evaluated on prostate phantoms by reconstructing the sweep and performing dimensional analysis, which indicates 92% and 98% volumetric accuracy on the tested phantoms. Three-dimensional prostate reconstruction takes approximately 3 s and enables fast, detailed insight for precise biopsy needle position planning. Full article
(This article belongs to the Section Sensors and Control in Robotics)

26 pages, 78396 KiB  
Article
SWRD–YOLO: A Lightweight Instance Segmentation Model for Estimating Rice Lodging Degree in UAV Remote Sensing Images with Real-Time Edge Deployment
by Chunyou Guo and Feng Tan
Agriculture 2025, 15(15), 1570; https://doi.org/10.3390/agriculture15151570 - 22 Jul 2025
Abstract
Rice lodging severely affects crop growth, yield, and mechanized harvesting efficiency. The accurate detection and quantification of lodging areas are crucial for precision agriculture and timely field management. However, Unmanned Aerial Vehicle (UAV)-based lodging detection faces challenges such as complex backgrounds, variable lighting, and irregular lodging patterns. To address these issues, this study proposes SWRD–YOLO, a lightweight instance segmentation model that enhances feature extraction and fusion using advanced convolution and attention mechanisms. The model employs an optimized loss function to improve localization accuracy, achieving precise lodging area segmentation. Additionally, a grid-based lodging ratio estimation method is introduced, dividing images into fixed-size grids to calculate local lodging proportions and aggregate them for robust overall severity assessment. Evaluated on a self-built rice lodging dataset, the model achieves 94.8% precision, 88.2% recall, 93.3% mAP@0.5, and 91.4% F1 score, with real-time inference at 16.15 FPS on an embedded NVIDIA Jetson Orin NX device. Compared to the baseline YOLOv8n-seg, precision, recall, mAP@0.5, and F1 score improved by 8.2%, 16.5%, 12.8%, and 12.8%, respectively. These results confirm the model’s effectiveness and potential for deployment in intelligent crop monitoring and sustainable agriculture. Full article
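The grid-based lodging ratio estimation described above — dividing the image into fixed-size cells, computing local lodging proportions, then aggregating — can be sketched in a few lines of NumPy. The function name, the cell size default, and the plain-average aggregation are illustrative assumptions, not details from the paper:

```python
import numpy as np

def lodging_severity(lodge_mask, cell=64):
    """Split a binary lodging mask into fixed-size cells and aggregate ratios."""
    h, w = lodge_mask.shape
    h2, w2 = h - h % cell, w - w % cell          # drop partial border cells
    grid = lodge_mask[:h2, :w2].astype(float)
    cells = grid.reshape(h2 // cell, cell, w2 // cell, cell)
    ratios = cells.mean(axis=(1, 3))             # per-cell lodging proportion
    return ratios, float(ratios.mean())          # local map + overall severity
```

The per-cell map localizes severity within the field, while the aggregate gives the single robust figure used for severity grading.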

17 pages, 3069 KiB  
Article
Enhanced Segmentation of Glioma Subregions via Modality-Aware Encoding and Channel-Wise Attention in Multimodal MRI
by Annachiara Cariola, Elena Sibilano, Antonio Brunetti, Domenico Buongiorno, Andrea Guerriero and Vitoantonio Bevilacqua
Appl. Sci. 2025, 15(14), 8061; https://doi.org/10.3390/app15148061 - 20 Jul 2025
Abstract
Accurate segmentation of key tumor subregions in adult gliomas from Magnetic Resonance Imaging (MRI) is of critical importance for brain tumor diagnosis, treatment planning, and prognosis. However, this task remains poorly investigated and highly challenging due to the considerable variability in shape and appearance of these areas across patients. This study proposes a novel Deep Learning architecture leveraging modality-specific encoding and attention-based refinement for the segmentation of glioma subregions, including peritumoral edema (ED), necrotic core (NCR), and enhancing tissue (ET). The model is trained and validated on the Brain Tumor Segmentation (BraTS) 2023 challenge dataset and benchmarked against a state-of-the-art transformer-based approach. Our architecture achieves promising results, with Dice scores of 0.78, 0.86, and 0.88 for NCR, ED, and ET, respectively, outperforming SegFormer3D while maintaining comparable model complexity. To ensure a comprehensive evaluation, performance was also assessed on standard composite tumor regions, i.e., tumor core (TC) and whole tumor (WT). The statistically significant improvements obtained on all regions highlight the effectiveness of integrating complementary modality-specific information and applying channel-wise feature recalibration in the proposed model. Full article
(This article belongs to the Special Issue The Role of Artificial Intelligence Technologies in Health)

18 pages, 2930 KiB  
Article
Eye in the Sky for Sub-Tidal Seagrass Mapping: Leveraging Unsupervised Domain Adaptation with SegFormer for Multi-Source and Multi-Resolution Aerial Imagery
by Satish Pawar, Aris Thomasberger, Stefan Hein Bengtson, Malte Pedersen and Karen Timmermann
Remote Sens. 2025, 17(14), 2518; https://doi.org/10.3390/rs17142518 - 19 Jul 2025
Abstract
The accurate and large-scale mapping of seagrass meadows is essential, as these meadows form primary habitats for marine organisms and large sinks for blue carbon. Image data available for mapping these habitats are often scarce or are acquired through multiple surveys and instruments, resulting in images of varying spatial and spectral characteristics. This study presents an unsupervised domain adaptation (UDA) strategy that combines histogram-matching with the transformer-based SegFormer model to address these challenges. Unoccupied aerial vehicle (UAV)-derived imagery (3-cm resolution) was used for training, while orthophotos from airplane surveys (12.5-cm resolution) served as the target domain. The method was evaluated across three Danish estuaries (Horsens Fjord, Skive Fjord, and Lovns Broad) using one-to-one, leave-one-out, and all-to-one histogram matching strategies. The highest performance was observed at Skive Fjord, achieving an F1-score/IoU = 0.52/0.48 for the leave-one-out test, corresponding to 68% of the benchmark model that was trained on both domains. These results demonstrate the potential of this lightweight UDA approach to generalization across spatial, temporal, and resolution domains, enabling the cost-effective and scalable mapping of submerged vegetation in data-scarce environments. This study also sheds light on contrast as a significant property of target domains that impacts image segmentation. Full article
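Histogram matching, the core of the UDA strategy above, maps source-domain pixel intensities through the reference domain's empirical CDF so the two distributions align. A single-channel NumPy sketch (the study likely applies this per spectral band; the function name is illustrative):

```python
import numpy as np

def match_histogram(source, reference):
    """Remap source pixels so their empirical CDF matches the reference's."""
    s_vals, s_idx, s_cnt = np.unique(
        source.ravel(), return_inverse=True, return_counts=True)
    r_vals, r_cnt = np.unique(reference.ravel(), return_counts=True)
    s_cdf = np.cumsum(s_cnt) / source.size       # CDF at each source value
    r_cdf = np.cumsum(r_cnt) / reference.size    # CDF at each reference value
    mapped = np.interp(s_cdf, r_cdf, r_vals)     # quantile-to-quantile lookup
    return mapped[s_idx].reshape(source.shape)
```

Because the transform is a monotone intensity remapping, it changes contrast and brightness without moving object boundaries, which is what makes it safe to apply before segmentation.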
(This article belongs to the Special Issue High-Resolution Remote Sensing Image Processing and Applications)
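The histogram-matching step at the core of this UDA strategy reduces to mapping each source-domain pixel level to the reference-domain level with the nearest cumulative frequency. The sketch below is illustrative only (pure Python, single channel, integer levels), not the authors' implementation; in practice a library routine such as scikit-image's `match_histograms` would be used per band.

```python
from bisect import bisect_left

def cdf(values, levels=256):
    """Cumulative distribution of integer pixel values in [0, levels)."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    running, out = 0, []
    for h in hist:
        running += h
        out.append(running / total)
    return out

def match_histogram(source, reference, levels=256):
    """Remap source pixels so their histogram approximates the reference's."""
    src_cdf = cdf(source, levels)
    ref_cdf = cdf(reference, levels)
    lut = []
    for s in range(levels):
        # smallest reference level whose CDF reaches the source CDF
        r = bisect_left(ref_cdf, src_cdf[s])
        lut.append(min(r, levels - 1))
    return [lut[v] for v in source]
```

Applied band-by-band, this shifts the UAV training imagery toward the tonal statistics of the airplane orthophotos before segmentation.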

20 pages, 4148 KiB  
Article
Automated Discrimination of Appearance Quality Grade of Mushroom (Stropharia rugoso-annulata) Using Computer Vision-Based Air-Blown System
by Meng Lv, Lei Kong, Qi-Yuan Zhang and Wen-Hao Su
Sensors 2025, 25(14), 4482; https://doi.org/10.3390/s25144482 - 18 Jul 2025
Abstract
The mushroom Stropharia rugoso-annulata is one of the most popular varieties in the international market because it is highly nutritious and has a delicious flavor. However, grading is still performed manually, leading to inconsistent grading standards and low efficiency. In this study, deep learning and computer vision techniques were used to develop an automated air-blown grading system for classifying this mushroom into three quality grades. The system consisted of a classification module and a grading module. In the classification module, the cap and stalk regions were extracted using the YOLOv8-seg algorithm, then post-processed using OpenCV based on quantitative grading indexes, forming the proposed SegGrade algorithm. In the grading module, an air-blown grading system with an automatic feeding unit was developed in combination with the SegGrade algorithm. The experimental results show that for 150 randomly selected mushrooms, the trained YOLOv8-seg algorithm achieved an accuracy of 99.5% in segmenting the cap and stalk regions, while the SegGrade algorithm achieved an accuracy of 94.67%. Furthermore, the system ultimately achieved an average grading accuracy of 80.66% and maintained the integrity of the mushrooms. This system can be further expanded according to production needs, improving sorting efficiency and meeting market demands.
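A grading module of this kind typically maps the segmented cap and stalk measurements to a grade through simple threshold rules. The function below is a hypothetical sketch: the measurement names and thresholds are placeholders chosen for illustration, not the paper's actual grading indexes.

```python
def grade_mushroom(cap_diameter_mm, stalk_length_mm, stalk_diameter_mm):
    """Assign one of three quality grades from cap/stalk measurements.

    The thresholds below are illustrative placeholders, not the
    grading indexes used by the SegGrade algorithm.
    """
    ratio = stalk_length_mm / stalk_diameter_mm  # stalk slenderness
    if cap_diameter_mm >= 40 and ratio <= 4.0:
        return "Grade 1"
    if cap_diameter_mm >= 25 and ratio <= 6.0:
        return "Grade 2"
    return "Grade 3"
```

In the actual pipeline, the inputs would come from the pixel dimensions of the YOLOv8-seg cap and stalk masks, converted to physical units by a calibrated scale factor.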

37 pages, 6001 KiB  
Article
Deep Learning-Based Crack Detection on Cultural Heritage Surfaces
by Wei-Che Huang, Yi-Shan Luo, Wen-Cheng Liu and Hong-Ming Liu
Appl. Sci. 2025, 15(14), 7898; https://doi.org/10.3390/app15147898 - 15 Jul 2025
Abstract
This study employs a deep learning-based object detection model, GoogleNet, to identify cracks in cultural heritage images. Subsequently, a semantic segmentation model, SegNet, is utilized to determine the location and extent of the cracks. To establish a scale ratio between image pixels and real-world dimensions, a parallel laser-based measurement approach is applied, enabling precise crack length calculations. The results indicate that the percentage error between crack lengths estimated using deep learning and those measured with a caliper is approximately 3%, demonstrating the feasibility and reliability of the proposed method. Additionally, the study examines the impact of iteration count, image quantity, and image category on the performance of GoogleNet and SegNet. While increasing the number of iterations significantly improves the models’ learning performance in the early stages, excessive iterations lead to overfitting. The optimal performance for GoogleNet was achieved at 75 iterations, whereas SegNet reached its best performance after 45,000 iterations. Similarly, while expanding the training dataset enhances model generalization, an excessive number of images may also contribute to overfitting. GoogleNet exhibited optimal performance with a training set of 66 images, while SegNet achieved the best segmentation accuracy when trained with 300 images. Furthermore, the study investigates the effect of different crack image categories by classifying datasets into four groups: general cracks, plain wall cracks, mottled wall cracks, and brick wall cracks. The findings reveal that training GoogleNet and SegNet with general crack images yielded the highest model performance, whereas training with a single crack category substantially reduced generalization capability.
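The parallel-laser scale ratio described above reduces to simple arithmetic: two laser dots with a known real-world spacing appear a measurable number of pixels apart in the image, yielding a millimetres-per-pixel factor that converts any segmented crack length to physical units. A minimal sketch (the 50 mm spacing is an assumed example value, not the paper's rig):

```python
def pixels_to_mm(crack_length_px, laser_px_distance, laser_mm_spacing=50.0):
    """Convert a crack length from pixels to millimetres.

    Two parallel laser dots with a known real-world spacing
    (laser_mm_spacing, an assumed example here) appear
    laser_px_distance pixels apart, giving the scale factor.
    """
    mm_per_px = laser_mm_spacing / laser_px_distance
    return crack_length_px * mm_per_px
```

Because both laser beams are parallel and roughly perpendicular to the surface, the dot spacing stays constant regardless of camera distance, which is what makes the single ratio sufficient.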
