MDPI - Publisher of Open Access Journals

16 pages, 4587 KiB

Open AccessArticle

FAMNet: A Lightweight Stereo Matching Network for Real-Time Depth Estimation in Autonomous Driving

by Jingyuan Zhang, Qiang Tong, Na Yan and Xiulei Liu

Symmetry 2025, 17(8), 1214; https://doi.org/10.3390/sym17081214 - 1 Aug 2025

Accurate and efficient stereo matching is fundamental to real-time depth estimation from symmetric stereo cameras in autonomous driving systems. However, existing high-accuracy stereo matching networks typically rely on computationally expensive 3D convolutions, which limit their practicality in real-world environments. In contrast, real-time methods often sacrifice accuracy or generalization capability. To address these challenges, we propose FAMNet (Fusion Attention Multi-Scale Network), a lightweight and generalizable stereo matching framework tailored for real-time depth estimation in autonomous driving applications. FAMNet consists of two novel modules: Fusion Attention-based Cost Volume (FACV) and Multi-scale Attention Aggregation (MAA). FACV constructs a compact yet expressive cost volume by integrating multi-scale correlation, attention-guided feature fusion, and channel reweighting, thereby reducing reliance on heavy 3D convolutions. MAA further enhances disparity estimation by fusing multi-scale contextual cues through pyramid-based aggregation and dual-path attention mechanisms. Extensive experiments on the KITTI 2012 and KITTI 2015 benchmarks demonstrate that FAMNet achieves a favorable trade-off between accuracy, efficiency, and generalization. On KITTI 2015, with the incorporation of FACV and MAA, the prediction accuracy of the baseline model is improved by 37% and 38%, respectively, and a total improvement of 42% is achieved by our final model. These results highlight FAMNet’s potential for practical deployment in resource-constrained autonomous driving systems requiring real-time and reliable depth perception. Full article

(This article belongs to the Special Issue Computer Vision, Pattern Recognition, Machine Learning, and Symmetry, 2nd Edition)

18 pages, 74537 KiB

Open AccessArticle

SDA-YOLO: Multi-Scale Dynamic Branching and Attention Fusion for Self-Explosion Defect Detection in Insulators

by Zhonghao Yang, Wangping Xu, Nanxing Chen, Yifu Chen, Kaijun Wu, Min Xie, Hong Xu and Enhui Zheng

Electronics 2025, 14(15), 3070; https://doi.org/10.3390/electronics14153070 (registering DOI) - 31 Jul 2025

Viewed by 28

Abstract

To enhance the performance of UAVs in detecting insulator self-explosion defects during power inspections, this paper proposes an insulator self-explosion defect recognition algorithm, SDA-YOLO, based on an improved YOLOv11s network. First, the SODL is added to YOLOv11 to fuse shallow features with deeper features, thereby improving the model’s focus on small-sized self-explosion defect features. The OBB is also employed to reduce interference from the complex background. Second, the DBB module is incorporated into the C3k2 module in the backbone to extract target features through a multi-branch parallel convolutional structure. Finally, the AIFI module replaces the C2PSA module, effectively directing and aggregating information between channels to improve detection accuracy and inference speed. The experimental results show that the average accuracy of SDA-YOLO reaches 96.0%, which is 6.6% higher than that of the YOLOv11s baseline model. While maintaining high accuracy, SDA-YOLO reaches an inference speed of 93.6 frames/s, which is sufficient for real-time detection of insulator faults. Full article

20 pages, 19642 KiB

Open AccessArticle

SIRI-MOGA-UNet: A Synergistic Framework for Subsurface Latent Damage Detection in ‘Korla’ Pears via Structured-Illumination Reflectance Imaging and Multi-Order Gated Attention

by Baishao Zhan, Jiawei Liao, Hailiang Zhang, Wei Luo, Shizhao Wang, Qiangqiang Zeng and Yongxian Lai

Spectrosc. J. 2025, 3(3), 22; https://doi.org/10.3390/spectroscj3030022 - 29 Jul 2025

Viewed by 125

Abstract

Bruising in ‘Korla’ pears is a prevalent phenomenon that leads to progressive fruit decay and substantial economic losses. Detecting early-stage bruising is challenging because it has no visible external characteristics, and existing deep learning models are limited in extracting weak features under complex optical interference. To address the challenges of postharvest latent damage detection in ‘Korla’ pears, this study proposes a collaborative detection framework integrating structured-illumination reflectance imaging (SIRI) with multi-order gated attention mechanisms. First, an SIRI optical system was constructed, employing 150 cycles·m⁻¹ spatial frequency modulation and a three-phase demodulation algorithm to extract subtle interference signal variations, thereby generating RT (Relative Transmission) images with significantly enhanced contrast in subsurface damage regions. To improve the detection accuracy of latent damage areas, the MOGA-UNet model was developed with three key innovations: (1) a lightweight VGG16 encoder structure is integrated into the feature extraction network to improve computational efficiency while retaining details; (2) a multi-order gated aggregation module is added at the end of the encoder to fuse features at different scales through a special convolution method; and (3) a channel attention mechanism is embedded in the decoding stage to dynamically increase the weight of damage-related feature channels. Experimental results demonstrate that the proposed model achieves 94.38% mean Intersection over Union (mIoU) and 97.02% Dice coefficient on RT images, outperforming the baseline UNet model by 2.80% and showing superior segmentation accuracy and boundary localization compared with mainstream models. This approach provides an efficient and reliable technical solution for intelligent postharvest agricultural product sorting. Full article
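
The three-phase demodulation step named in the abstract can be illustrated with the textbook structured-illumination formulas. The sketch below assumes three sinusoidal patterns with phase offsets of 0, 2π/3, and 4π/3 and recovers the direct (DC) and amplitude (AC) components for one pixel. The function names and the synthetic pixel values are illustrative assumptions; the abstract does not say that the authors use exactly this form, nor how the RT images are derived from these components.

```python
import math

# Assumed phase offsets of the three illumination patterns (textbook choice).
PHASES = (0.0, 2.0 * math.pi / 3.0, 4.0 * math.pi / 3.0)


def demodulate_three_phase(i1, i2, i3):
    """Return (dc, ac) for one pixel from its three phase-shifted intensities.

    Textbook three-phase formulas, assumed here rather than taken from the paper.
    """
    dc = (i1 + i2 + i3) / 3.0
    ac = (math.sqrt(2.0) / 3.0) * math.sqrt(
        (i1 - i2) ** 2 + (i2 - i3) ** 2 + (i3 - i1) ** 2
    )
    return dc, ac


def synth_samples(dc, ac, x_m, cycles_per_m=150.0):
    """Simulate the three intensities seen at position x_m (in metres).

    Hypothetical helper: a pure sinusoid at the spatial frequency quoted
    in the abstract, with no noise.
    """
    theta = 2.0 * math.pi * cycles_per_m * x_m
    return tuple(dc + ac * math.cos(theta + phase) for phase in PHASES)


if __name__ == "__main__":
    samples = synth_samples(dc=0.6, ac=0.25, x_m=0.0123)
    print(demodulate_three_phase(*samples))
```

Because the three phase offsets are equally spaced, the cosine terms cancel in the DC average, and the pairwise differences depend only on the modulation amplitude, so the recovered values do not depend on where the pixel sits along the pattern.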

22 pages, 16984 KiB

Open AccessArticle

Small Ship Detection Based on Improved Neural Network Algorithm and SAR Images

by Jiaqi Li, Hongyuan Huo, Li Guo, De Zhang, Wei Feng, Yi Lian and Long He

Remote Sens. 2025, 17(15), 2586; https://doi.org/10.3390/rs17152586 - 24 Jul 2025

Viewed by 248

Abstract

Synthetic aperture radar (SAR) images can be used for ship target detection. However, ship outlines in SAR images are unclear, and noise and land background make detection, especially of small ships, more difficult and less accurate. Therefore, this paper improves the backbone network and feature fusion network of the YOLOv5s model to raise ship detection accuracy. First, the LSKModule is used to improve the backbone network of YOLOv5s: it adaptively aggregates the features extracted by large convolution kernels to capture context information fully, while enhancing key features and suppressing noise interference. Second, multiple depthwise separable convolution layers are added to the SPPF (Spatial Pyramid Pooling-Fast) structure; although this introduces a small number of additional parameters and computations, features with different receptive fields can be extracted. Third, the feature fusion network of YOLOv5s is improved based on BiFPN, and the shallow feature map is used to improve small-target detection performance. Finally, the CoordConv module is added before the detection head of YOLOv5s, adding two coordinate channels during the convolution operation to further improve detection accuracy. The method reached an mAP50 of 97.6% on the SSDD dataset and 91.7% on the HRSID dataset, and it was compared with a variety of advanced target detection models. The results show that its detection accuracy is higher than that of other similar target detection algorithms. Full article
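
The "two coordinate channels" that CoordConv adds can be pictured with a small sketch. It assumes a feature map stored as nested lists of shape [channels][height][width] and coordinates normalised to [-1, 1]; the function name `add_coord_channels` is hypothetical, and a real detector would do this on tensors inside the network rather than on Python lists.

```python
def add_coord_channels(feature_map):
    """Append an x-coordinate and a y-coordinate channel to a feature map.

    feature_map: nested lists with shape [channels][height][width].
    Coordinates are scaled to [-1, 1] (an assumed normalisation).
    """
    height = len(feature_map[0])
    width = len(feature_map[0][0])

    def normalise(index, size):
        # A single row or column has no extent, so place it at the centre.
        return 0.0 if size == 1 else 2.0 * index / (size - 1) - 1.0

    # x varies along the width; y varies along the height.
    x_channel = [[normalise(col, width) for col in range(width)]
                 for _ in range(height)]
    y_channel = [[normalise(row, height) for _ in range(width)]
                 for row in range(height)]
    return feature_map + [x_channel, y_channel]


if __name__ == "__main__":
    features = [[[0.0] * 3 for _ in range(2)]]  # 1 channel, 2 rows, 3 columns
    for channel in add_coord_channels(features):
        print(channel)
```

The convolution that follows then sees each pixel's position as ordinary input features, which is what lets it learn position-dependent filters.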

18 pages, 4203 KiB

Open AccessArticle

SRW-YOLO: A Detection Model for Environmental Risk Factors During the Grid Construction Phase

by Yu Zhao, Fei Liu, Qiang He, Fang Liu, Xiaohu Sun and Jiyong Zhang

Remote Sens. 2025, 17(15), 2576; https://doi.org/10.3390/rs17152576 - 24 Jul 2025

Viewed by 250

Abstract

With the rapid advancement of UAV-based remote sensing and image recognition techniques, identifying environmental risk factors from aerial imagery has emerged as a focal point in intelligent inspection during the construction phase of power transmission and distribution projects. The uneven spatial distribution of risk factors on construction sites, their weak texture signatures, and the inherently multi-scale nature of UAV imagery pose significant detection challenges. To address these issues, we propose a one-stage SRW-YOLO algorithm built upon the YOLOv11 framework. First, a P2-scale shallow feature detection layer is added to capture high-resolution fine details of small targets. Second, we integrate a one-shot aggregation module built on reparameterized convolution based on channel shuffle (RCS-OSA) into the shallow layers of the backbone and neck, enhancing feature extraction while significantly reducing inference latency. Finally, the WIoU v3 loss function, which uses a dynamic non-monotonic focusing mechanism, is employed to reweight low-quality annotations, thereby improving small-object localization accuracy. Experimental results demonstrate that SRW-YOLO achieves an overall precision of 80.6% and mAP of 79.1% on the State Grid dataset and exhibits similarly superior performance on the VisDrone2019 dataset. Compared with other one-stage detectors, SRW-YOLO delivers markedly higher detection accuracy, offering critical technical support for multi-scale, heterogeneous environmental risk monitoring during the construction phase of power transmission and distribution projects and establishing a theoretical foundation for rapid and accurate inspection using UAV-based intelligent imaging. Full article

22 pages, 4611 KiB

Open AccessArticle

MMC-YOLO: A Lightweight Model for Real-Time Detection of Geometric Symmetry-Breaking Defects in Wind Turbine Blades

by Caiye Liu, Chao Zhang, Xinyu Ge, Xunmeng An and Nan Xue

Symmetry 2025, 17(8), 1183; https://doi.org/10.3390/sym17081183 - 24 Jul 2025

Viewed by 288

Abstract

Performance degradation of wind turbine blades often stems from geometric asymmetry induced by damage. Existing methods for assessing damage face challenges in balancing accuracy and efficiency due to their limited ability to capture fine-grained geometric asymmetries associated with multi-scale damage under complex background interference. To address this, based on the high-speed detection model YOLOv10-N, this paper proposes a novel detection model named MMC-YOLO. First, the Multi-Scale Perception Gated Convolution (MSGConv) Module was designed, which constructs a full-scale receptive field through multi-branch fusion and channel rearrangement to enhance the extraction of geometric asymmetry features. Second, the Multi-Scale Enhanced Feature Pyramid Network (MSEFPN) was developed, integrating dynamic path aggregation and an SENetv2 attention mechanism to suppress background interference and amplify damage response. Finally, the Channel-Compensated Filtering (CCF) module was constructed to preserve critical channel information using a dynamic buffering mechanism. Evaluated on a dataset of 4818 wind turbine blade damage images, MMC-YOLO achieves an 82.4% mAP [0.5:0.95], representing a 4.4% improvement over the baseline YOLOv10-N model, and a 91.1% recall rate, an 8.7% increase, while maintaining a lightweight parameter count of 4.2 million. This framework significantly enhances geometric asymmetry defect detection accuracy while ensuring real-time performance, meeting engineering requirements for high efficiency and precision. Full article

(This article belongs to the Special Issue Symmetry and Its Applications in Image Processing)

36 pages, 25361 KiB

Open AccessArticle

Remote Sensing Image Compression via Wavelet-Guided Local Structure Decoupling and Channel–Spatial State Modeling

by Jiahui Liu, Lili Zhang and Xianjun Wang

Remote Sens. 2025, 17(14), 2419; https://doi.org/10.3390/rs17142419 - 12 Jul 2025

Viewed by 447

Abstract

As the resolution and data volume of remote sensing imagery continue to grow, achieving efficient compression without sacrificing reconstruction quality remains a major challenge. Traditional handcrafted codecs often fail to balance rate-distortion performance and computational complexity, while deep learning-based approaches offer superior representational capacity but still struggle to balance fine-detail adaptation and computational efficiency. Mamba, a state–space model (SSM)-based architecture, offers linear-time complexity and excels at capturing long-range dependencies in sequences. It has been adopted in remote sensing compression tasks to model long-distance dependencies between pixels. However, despite its effectiveness in global context aggregation, Mamba’s uniform bidirectional scanning is insufficient for capturing high-frequency structures such as edges and textures. Moreover, existing visual state–space (VSS) models built upon Mamba typically treat all channels equally and lack mechanisms to dynamically focus on semantically salient spatial regions. To address these issues, we present an innovative architecture for remote sensing image compression, called the Multi-scale Channel Global Mamba Network (MGMNet). MGMNet integrates a spatial–channel dynamic weighting mechanism into the Mamba architecture, enhancing global semantic modeling while selectively emphasizing informative features. It comprises two key modules. The Wavelet Transform-guided Local Structure Decoupling (WTLS) module applies multi-scale wavelet decomposition to disentangle and separately encode low- and high-frequency components, enabling efficient parallel modeling of global contours and local textures. The Channel–Global Information Modeling (CGIM) module enhances conventional VSS by introducing a dual-path attention strategy that reweights spatial and channel information, improving the modeling of long-range dependencies and edge structures. We conducted extensive evaluations on three distinct remote sensing datasets to assess MGMNet. The results show that MGMNet outperforms current state-of-the-art (SOTA) models across various performance metrics. Full article

(This article belongs to the Special Issue New Insights in Remote Sensing Image Interpretation with Deep Learning)

21 pages, 24495 KiB

Open AccessArticle

UAMS: An Unsupervised Anomaly Detection Method Integrating MSAA and SSPCAB

by Zhe Li, Wenhui Chen and Weijie Wang

Symmetry 2025, 17(7), 1119; https://doi.org/10.3390/sym17071119 - 12 Jul 2025

Viewed by 310

Abstract

Anomaly detection methods play a crucial role in automated quality control within modern manufacturing systems. In this context, unsupervised methods are increasingly favored due to their independence from large-scale labeled datasets. However, existing methods have limited multi-scale feature extraction ability and may fail to capture subtle anomalies effectively. To address these challenges, we propose UAMS, a pyramid-structured normalizing flow framework that leverages the symmetry in feature recombination to harmonize multi-scale interactions. The proposed framework integrates a Multi-Scale Attention Aggregation (MSAA) module for cross-scale dynamic fusion, as well as a Self-Supervised Predictive Convolutional Attention Block (SSPCAB) for spatial channel attention and masked prediction learning. Experiments on the MVTecAD dataset show that UAMS substantially outperforms state-of-the-art unsupervised methods in detection and localization accuracy while maintaining high inference efficiency. For example, when comparing UAMS against the baseline model on the carpet category, the AUROC is improved from 90.8% to 94.5%, and AUPRO is improved from 91.0% to 92.9%. These findings validate the potential of the proposed method for use in real industrial inspection scenarios. Full article

(This article belongs to the Section Computer)

21 pages, 4010 KiB

Open AccessArticle

PCES-YOLO: High-Precision PCB Detection via Pre-Convolution Receptive Field Enhancement and Geometry-Perception Feature Fusion

by Heqi Yang, Junming Dong, Cancan Wang, Zhida Lian and Hui Chang

Appl. Sci. 2025, 15(13), 7588; https://doi.org/10.3390/app15137588 - 7 Jul 2025

Viewed by 356

Abstract

Printed circuit board (PCB) defect detection faces challenges like small target feature loss and severe background interference. To address these issues, this paper proposes PCES-YOLO, an enhanced YOLOv11-based model. First, a developed Pre-convolution Receptive Field Enhancement (PRFE) module replaces C3k in the C3k2 module. The ConvNeXtBlock with inverted bottleneck is introduced in the P4 layer, greatly improving small-target feature capture and semantic understanding. The second key innovation lies in the creation of the Efficient Feature Fusion and Aggregation Network (EFAN), which integrates a lightweight Spatial-Channel Decoupled Downsampling (SCDown) module and three innovative fusion pathways. This achieves substantial parameter reduction while effectively integrating shallow detail features with deep semantic features, preserving critical defect information across different feature levels. Finally, the Shape-IoU loss function is incorporated, focusing on bounding box shape and scale for more accurate regression and enhanced defect localization precision. Experiments on the enhanced Peking University PCB defect dataset show that PCES-YOLO achieves a mAP50 of 97.3% and a mAP50–95 of 77.2%. Compared to YOLOv11n, it shows improvements of 3.6% in mAP50 and 15.2% in mAP50–95. When compared to YOLOv11s, it increases mAP50 by 1.0% and mAP50–95 by 5.6% while also significantly reducing the model parameters. The performance of PCES-YOLO is also evaluated against mainstream object detection algorithms, including Faster R-CNN, SSD, YOLOv8n, etc. These results indicate that PCES-YOLO outperforms these algorithms in terms of detection accuracy and efficiency, making it a promising high-precision and efficient solution for PCB defect detection in industrial settings. Full article

(This article belongs to the Section Computing and Artificial Intelligence)

24 pages, 1307 KiB

Open AccessArticle

A Self-Supervised Specific Emitter Identification Method Based on Contrastive Asymmetric Masked Learning

by Dong Wang, Yonghui Huang, Tianshu Cui and Yan Zhu

Sensors 2025, 25(13), 4023; https://doi.org/10.3390/s25134023 - 27 Jun 2025

Viewed by 294

Abstract

Specific emitter identification (SEI) is a core technology for wireless device security that plays a crucial role in protecting wireless communication systems from various security threats. However, current deep learning-based SEI methods heavily rely on large amounts of labeled data for supervised training, facing challenges in non-cooperative communication scenarios. To address these issues, this paper proposes a novel contrastive asymmetric masked learning-based SEI (CAML-SEI) method, effectively solving the problem of SEI under scarce labeled samples. The proposed method constructs an asymmetric auto-encoder architecture, comprising an encoder network based on channel squeeze-and-excitation residual blocks to capture radio frequency fingerprint (RFF) features embedded in signals, while employing a lightweight single-layer convolutional decoder for masked signal reconstruction. This design promotes the learning of fine-grained local feature representations. To further enhance feature discriminability, a learnable non-linear mapping is introduced to compress high-dimensional encoded features into a compact low-dimensional space, accompanied by a contrastive loss function that simultaneously achieves feature aggregation of positive samples and feature separation of negative samples. Finally, the network is jointly optimized by combining signal reconstruction and feature contrast tasks. Experiments conducted on real-world ADS-B and Wi-Fi datasets demonstrate that the proposed method effectively learns generalized RFF features, and the results show superior performance compared with other SEI methods. Full article

(This article belongs to the Section Communications)

20 pages, 4391 KiB

Open AccessArticle

GDS-YOLOv7: A High-Performance Model for Water-Surface Obstacle Detection Using Optimized Receptive Field and Attention Mechanisms

by Xu Yang, Lei Huang, Fuyang Ke, Chao Liu, Ruixue Yang and Shicheng Xie

ISPRS Int. J. Geo-Inf. 2025, 14(7), 238; https://doi.org/10.3390/ijgi14070238 - 23 Jun 2025

Viewed by 314

Abstract

Unmanned ships, equipped with self-navigation and image processing capabilities, are progressively expanding their applications in fields such as mining, fisheries, and marine environments. Along with this development, issues concerning waterborne traffic safety are gradually emerging. To address the challenges of navigation and obstacle detection on the water’s surface, this paper presents GDS-YOLOv7, an enhanced obstacle-detection framework for aquatic environments, architecturally evolved from YOLOv7. The proposed system implements three key innovations: (1) architectural optimization through replacement of the Spatial Pyramid Pooling Cross Stage Partial Connections (SPPCSPC) module with GhostSPPCSPC for expanded receptive field representation; (2) integration of a parameter-free attention mechanism (SimAM) with refined pooling configurations to boost multi-scale detection sensitivity; and (3) strategic deployment of depthwise separable convolutions (DSC) to reduce computational complexity while maintaining detection fidelity. Furthermore, we develop a Spatial–Channel Synergetic Attention (SCSA) mechanism to counteract feature degradation in convolutional operations, embedding this module within the Extended Efficient Layer Aggregation Network (E-ELAN) to enhance contextual awareness. Experimental results reveal the model’s superiority over baseline YOLOv7, achieving +4.9% mean average precision@0.5 (mAP@0.5), +4.3% precision (P), and +6.9% recall (R) alongside a 22.8% reduction in Giga Floating-point Operations Per Second (GFLOPS). Full article
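
Why depthwise separable convolutions reduce computational cost can be shown with simple weight counting. The sketch below compares a standard k×k convolution with its depthwise-plus-pointwise replacement, ignoring biases; the function names and the 256-channel layer are illustrative assumptions and are unrelated to the 22.8% figure reported for the whole model.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution plus a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    c_in, c_out, k = 256, 256, 3  # hypothetical layer, not from the paper
    standard = standard_conv_params(c_in, c_out, k)
    separable = depthwise_separable_params(c_in, c_out, k)
    print(standard, separable, separable / standard)
```

For this layer the ratio equals 1/c_out + 1/k², roughly 0.115, so the separable form needs about one ninth of the weights. The same ratio applies to multiply-accumulate operations at a fixed output resolution.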

(This article belongs to the Topic State-of-the-Art Object Detection, Tracking, and Recognition Techniques)

19 pages, 25047 KiB

Open AccessArticle

Hash-Guided Adaptive Matching and Progressive Multi-Scale Aggregation for Reference-Based Image Super-Resolution

by Lin Wang, Jiaqi Zhang, Huan Kang, Haonan Su and Minghua Zhao

Appl. Sci. 2025, 15(12), 6821; https://doi.org/10.3390/app15126821 - 17 Jun 2025

Viewed by 305

Abstract

Reference-based super-resolution (RefSR) enhances the detail restoration capability of low-resolution (LR) images by utilizing the details and texture information of external reference (Ref) images. This study proposes a RefSR method based on hash adaptive matching and progressive multi-scale dynamic aggregation to improve the super-resolution reconstruction capability. First, to address the issue of feature matching, this study proposes a hash adaptive matching module. In addition to the traditional similarity calculation between LR images and Ref images, self-similarity information of LR images is added to assist in super-resolution reconstruction. By dividing the feature space into multiple hash buckets through spherical hashing, the matching range is narrowed down from global search to local neighborhoods, enabling efficient matching in more informative regions. This not only retains global modeling capabilities but also significantly reduces computational costs. In addition, a learnable similarity scoring function is designed to adaptively optimize the similarity score between LR images and Ref images, improving matching accuracy. Second, for feature transfer, this study proposes a progressive multi-scale dynamic aggregation module. This module utilizes dynamic decoupling filters to simultaneously perceive texture information in both spatial and channel domains, extracting key information more accurately and effectively suppressing irrelevant texture interference. In addition, this module enhances the robustness of the model to large-scale biases by gradually adjusting features at different scales, ensuring the accuracy of texture transfer. The experimental results show that this method achieves superior super-resolution reconstruction performance on multiple benchmark datasets. Full article

35 pages, 4507 KiB

Open AccessArticle

Liver Semantic Segmentation Method Based on Multi-Channel Feature Extraction and Cross Fusion

by Chenghao Zhang, Lingfei Wang, Chunyu Zhang, Yu Zhang, Peng Wang and Jin Li

Bioengineering 2025, 12(6), 636; https://doi.org/10.3390/bioengineering12060636 - 11 Jun 2025

Viewed by 535

Abstract

Semantic segmentation plays a critical role in medical image analysis, offering indispensable information for the diagnosis and treatment planning of liver diseases. However, due to the complex anatomical structure of the liver and significant inter-patient variability, the current methods exhibit notable limitations in feature extraction and fusion, which pose a major challenge to achieving accurate liver segmentation. To address these challenges, this study proposes an improved U-Net-based liver semantic segmentation method that enhances segmentation performance through optimized feature extraction and fusion mechanisms. Firstly, a multi-scale input strategy is employed to account for the variability in liver features at different scales. A multi-scale convolutional attention (MSCA) mechanism is integrated into the encoder to aggregate multi-scale information and improve feature representation. Secondly, an atrous spatial pyramid pooling (ASPP) module is incorporated into the bottleneck layer to capture features at various receptive fields using dilated convolutions, while global pooling is applied to enhance the acquisition of contextual information and ensure efficient feature transmission. Furthermore, a Channel Transformer module replaces the traditional skip connections to strengthen the interaction and fusion between encoder and decoder features, thereby reducing the semantic gap. The effectiveness of this method was validated on integrated public datasets, achieving an Intersection over Union (IoU) of 0.9315 for liver segmentation tasks, outperforming other mainstream approaches. This provides a novel solution for precise liver image segmentation and holds significant clinical value for liver disease diagnosis and treatment. Full article
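
For readers unfamiliar with the reported metric, Intersection over Union (IoU) and the closely related Dice coefficient can be computed from a pair of binary masks as sketched below. The function name `iou_and_dice` and the toy masks are illustrative assumptions; the paper's evaluation pipeline is not described in the abstract.

```python
def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two equal-length sequences of 0/1 labels."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - intersection
    if union == 0:
        # Both masks are empty: treat this as perfect agreement (a convention).
        return 1.0, 1.0
    iou = intersection / union
    dice = 2.0 * intersection / (pred_area + target_area)
    return iou, dice


if __name__ == "__main__":
    predicted = [1, 1, 1, 0, 0, 0]     # flattened toy prediction mask
    ground_truth = [0, 1, 1, 1, 0, 0]  # flattened toy ground-truth mask
    print(iou_and_dice(predicted, ground_truth))
```

For a single pair of masks the two metrics are tied by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU. The identity does not carry over exactly to scores averaged across images or classes.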

(This article belongs to the Special Issue Machine Learning and Deep Learning Applications in Healthcare)

25 pages, 3432 KiB

Open AccessReview

Appraising the Sonic Environment: A Conceptual Framework for Perceptual, Computational, and Cognitive Requirements

by Tjeerd C. Andringa

Behav. Sci. 2025, 15(6), 797; https://doi.org/10.3390/bs15060797 - 10 Jun 2025

Viewed by 421

Abstract

This paper provides a conceptual framework for soundscape appraisal as a key outcome of the hearing process. Sound appraisal involves auditory sense-making and produces the soundscape as the perceived and understood acoustic environment. The soundscape exists in the experiential domain and involves meaning-giving. Soundscape research has reached a consensus about the relevance of two experiential dimensions, pleasure and eventfulness, which give rise to four appraisal quadrants: calm, lively/vibrant, chaotic, and boring/monotonous. Requirements for and constraints on the hearing and appraisal processes follow from the demands of living in a complex world, the specific properties of source and transmission physics, and the need for auditory events and streams of single-source information. These lead to several core features and functions of the hearing process, such as prioritizing the auditory channel (loudness), forming auditory streams (audibility, primitive auditory scene analysis), prioritizing auditory streams (audible safety, noise sensitivity), and initial meaning-giving (auditory gist and perceptual layers). Combined, these lead to a model of soundscape appraisal yielding the ISO quadrant structure. Long-term aggregated appraisals lead to a sonic climate that allows for an insightful comparison of different locations. The resulting system needs additional validation and optimization to comply in detail with human appraisal and evaluation.

(This article belongs to the Special Issue Music Listening as Exploratory Behavior)

23 pages, 4896 KiB

Open AccessArticle

Insulator Surface Defect Detection Method Based on Graph Feature Diffusion Distillation

by Shucai Li, Na Zhang, Gang Yang, Yannong Hou and Xingzhong Zhang

J. Imaging 2025, 11(6), 190; https://doi.org/10.3390/jimaging11060190 - 10 Jun 2025

Viewed by 1194

Abstract

To address the scarcity of defect samples on power insulator surfaces, their irregular morphology, and insufficient pixel-level localization accuracy, this paper proposes GFDD, a defect detection method based on graph feature diffusion distillation. A dual-division teacher architecture with graph feature consistency constraints alleviates the feature bias problem, while a cross-layer feature fusion module dynamically aggregates multi-scale information to reduce redundancy. A diffusion distillation mechanism overcomes the limitation of traditional single-layer feature transfer, and channel attention fuses deep semantics with shallow details to enhance global context modeling. On a self-built dataset, GFDD achieves 96.6% Pi.AUROC, 97.7% Im.AUROC, and 95.1% F1-score, which is 2.4–3.2% higher than the best existing methods; it also maintains strong generalization and robustness in tests on multiple public datasets. The method provides a high-precision solution for automated inspection of insulator surface defects and is of practical engineering value.

(This article belongs to the Special Issue Self-Supervised Learning for Image Processing and Analysis)

Search Results (316)
