Search Results (218)

Search Parameters:
Keywords = channel-wise attention

25 pages, 6934 KiB  
Article
Feature Constraints Map Generation Models Integrating Generative Adversarial and Diffusion Denoising
by Chenxing Sun, Xixi Fan, Xiechun Lu, Laner Zhou, Junli Zhao, Yuxuan Dong and Zhanlong Chen
Remote Sens. 2025, 17(15), 2683; https://doi.org/10.3390/rs17152683 - 3 Aug 2025
Viewed by 67
Abstract
The accelerated evolution of remote sensing technology has intensified the demand for real-time tile map generation, highlighting the limitations of conventional mapping approaches that rely on manual cartography and field surveys. To address the critical need for rapid cartographic updates, this study presents a novel multi-stage generative framework that synergistically integrates Generative Adversarial Networks (GANs) with Diffusion Denoising Models (DMs) for high-fidelity map generation from remote sensing imagery. Specifically, our proposed architecture first employs GANs for rapid preliminary map generation, followed by a cascaded diffusion process that progressively refines topological details and spatial accuracy through iterative denoising. Furthermore, we propose a hybrid attention mechanism that strategically combines channel-wise feature recalibration with coordinate-aware spatial modulation, enabling the enhanced discrimination of geographic features under challenging conditions involving edge ambiguity and environmental noise. Quantitative evaluations demonstrate that our method significantly surpasses established baselines in both structural consistency and geometric fidelity. This framework establishes an operational paradigm for automated, rapid-response cartography, demonstrating a particular utility in time-sensitive applications including disaster impact assessment, unmapped terrain documentation, and dynamic environmental surveillance. Full article

24 pages, 6041 KiB  
Article
Attention-Guided Residual Spatiotemporal Network with Label Regularization for Fault Diagnosis with Small Samples
by Yanlong Xu, Liming Zhang, Ling Chen, Tian Tan, Xiaolong Wang and Hongguang Xiao
Sensors 2025, 25(15), 4772; https://doi.org/10.3390/s25154772 - 3 Aug 2025
Viewed by 61
Abstract
Fault diagnosis is of great significance for the maintenance of rotating machinery. Deep learning is an intelligent diagnostic technique that is receiving increasing attention. To address the issues of industrial data with small samples and varying working conditions, a residual convolutional neural network based on the attention mechanism is put forward for the fault diagnosis of rotating machinery. The method incorporates channel attention and spatial attention simultaneously, implementing channel-wise recalibration for frequency-dependent feature adjustment and performing spatial context aggregation across receptive fields. Subsequently, a residual module is introduced to address the vanishing gradient problem of the model in deep network structures. In addition, LSTM is used to realize spatiotemporal feature fusion. Finally, label smoothing regularization (LSR) is proposed to balance the distributional disparities among labeled samples. The effectiveness of the method is evaluated by its application to the vibration signal data from the safe injection pump and the Case Western Reserve University (CWRU). The results show that the method has superb diagnostic accuracy and strong robustness. Full article
(This article belongs to the Section Fault Diagnosis & Sensors)
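The abstract above names label smoothing regularization (LSR) without giving its form. A minimal sketch of the standard formulation follows, in which a one-hot target is blended with a uniform distribution over the classes. The function names and the value `epsilon=0.1` are assumptions chosen for illustration and are not taken from the article.

```python
import math


def smooth_labels(one_hot, epsilon=0.1):
    """Blend a one-hot target with the uniform distribution over classes."""
    num_classes = len(one_hot)
    return [(1.0 - epsilon) * y + epsilon / num_classes for y in one_hot]


def cross_entropy(target, probs):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0.0)


# Hypothetical 4-class example: the true class is index 1.
smoothed = smooth_labels([0.0, 1.0, 0.0, 0.0], epsilon=0.1)
loss = cross_entropy(smoothed, [0.1, 0.7, 0.1, 0.1])
```

With smoothing, the target keeps most of its mass on the true class while every other class receives a small share, so the loss also penalizes overconfident predictions.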

29 pages, 15488 KiB  
Article
GOFENet: A Hybrid Transformer–CNN Network Integrating GEOBIA-Based Object Priors for Semantic Segmentation of Remote Sensing Images
by Tao He, Jianyu Chen and Delu Pan
Remote Sens. 2025, 17(15), 2652; https://doi.org/10.3390/rs17152652 - 31 Jul 2025
Viewed by 310
Abstract
Geographic object-based image analysis (GEOBIA) has demonstrated substantial utility in remote sensing tasks. However, its integration with deep learning remains largely confined to image-level classification. This is primarily due to the irregular shapes and fragmented boundaries of segmented objects, which limit its applicability in semantic segmentation. While convolutional neural networks (CNNs) excel at local feature extraction, they inherently struggle to capture long-range dependencies. In contrast, Transformer-based models are well suited for global context modeling but often lack fine-grained local detail. To overcome these limitations, we propose GOFENet (Geo-Object Feature Enhanced Network)—a hybrid semantic segmentation architecture that effectively fuses object-level priors into deep feature representations. GOFENet employs a dual-encoder design combining CNN and Swin Transformer architectures, enabling multi-scale feature fusion through skip connections to preserve both local and global semantics. An auxiliary branch incorporating cascaded atrous convolutions is introduced to inject information of segmented objects into the learning process. Furthermore, we develop a cross-channel selection module (CSM) for refined channel-wise attention, a feature enhancement module (FEM) to merge global and local representations, and a shallow–deep feature fusion module (SDFM) to integrate pixel- and object-level cues across scales. Experimental results on the GID and LoveDA datasets demonstrate that GOFENet achieves superior segmentation performance, with 66.02% mIoU and 51.92% mIoU, respectively. The model exhibits strong capability in delineating large-scale land cover features, producing sharper object boundaries and reducing classification noise, while preserving the integrity and discriminability of land cover categories. Full article

29 pages, 36251 KiB  
Article
CCDR: Combining Channel-Wise Convolutional Local Perception, Detachable Self-Attention, and a Residual Feedforward Network for PolSAR Image Classification
by Jianlong Wang, Bingjie Zhang, Zhaozhao Xu, Haifeng Sima and Junding Sun
Remote Sens. 2025, 17(15), 2620; https://doi.org/10.3390/rs17152620 - 28 Jul 2025
Viewed by 220
Abstract
In the task of PolSAR image classification, effectively utilizing convolutional neural networks and vision transformer models with limited labeled data poses a critical challenge. This article proposes a novel method for PolSAR image classification that combines channel-wise convolutional local perception, detachable self-attention, and a residual feedforward network. Specifically, the proposed method comprises several key modules. In the channel-wise convolutional local perception module, channel-wise convolution operations enable accurate extraction of local features from different channels of PolSAR images. The local residual connections further enhance these extracted features, providing more discriminative information for subsequent processing. Additionally, the detachable self-attention mechanism plays a pivotal role: it facilitates effective interaction between local and global information, enabling the model to comprehensively perceive features across different scales, thereby improving classification accuracy and robustness. Subsequently, replacing the conventional feedforward network with a residual feedforward network that incorporates residual structures aids the model in better representing local features, further enhances the capability of cross-layer gradient propagation, and effectively alleviates the problem of vanishing gradients during the training of deep networks. In the final classification stage, two fully connected layers with dropout prevent overfitting, while softmax generates predictions. The proposed method was validated on the AIRSAR Flevoland, RADARSAT-2 San Francisco, and RADARSAT-2 Xi’an datasets. The experimental results demonstrate that the proposed method can attain a high level of classification performance even with a limited amount of labeled data, and the model is relatively stable. Furthermore, the proposed method has lower computational costs than comparative methods. Full article
(This article belongs to the Section Remote Sensing Image Processing)

23 pages, 19710 KiB  
Article
Hybrid EEG Feature Learning Method for Cross-Session Human Mental Attention State Classification
by Xu Chen, Xingtong Bao, Kailun Jitian, Ruihan Li, Li Zhu and Wanzeng Kong
Brain Sci. 2025, 15(8), 805; https://doi.org/10.3390/brainsci15080805 - 28 Jul 2025
Viewed by 258
Abstract
Background: Decoding mental attention states from electroencephalogram (EEG) signals is crucial for numerous applications such as cognitive monitoring, adaptive human–computer interaction, and brain–computer interfaces (BCIs). However, conventional EEG-based approaches often focus on channel-wise processing and are limited to intra-session or subject-specific scenarios, lacking robustness in cross-session or inter-subject conditions. Methods: In this study, we propose a hybrid feature learning framework for robust classification of mental attention states, including focused, unfocused, and drowsy conditions, across both sessions and individuals. Our method integrates preprocessing, feature extraction, feature selection, and classification in a unified pipeline. We extract channel-wise spectral features using short-time Fourier transform (STFT) and further incorporate both functional and structural connectivity features to capture inter-regional interactions in the brain. A two-stage feature selection strategy, combining correlation-based filtering and random forest ranking, is adopted to enhance feature relevance and reduce dimensionality. Support vector machine (SVM) is employed for final classification due to its efficiency and generalization capability. Results: Experimental results on two cross-session and inter-subject EEG datasets demonstrate that our approach achieves classification accuracy of 86.27% and 94.01%, respectively, significantly outperforming traditional methods. Conclusions: These findings suggest that integrating connectivity-aware features with spectral analysis can enhance the generalizability of attention decoding models. The proposed framework provides a promising foundation for the development of practical EEG-based systems for continuous mental state monitoring and adaptive BCIs in real-world environments. Full article

17 pages, 2072 KiB  
Article
Barefoot Footprint Detection Algorithm Based on YOLOv8-StarNet
by Yujie Shen, Xuemei Jiang, Yabin Zhao and Wenxin Xie
Sensors 2025, 25(15), 4578; https://doi.org/10.3390/s25154578 - 24 Jul 2025
Viewed by 293
Abstract
This study proposes an optimized footprint recognition model based on an enhanced StarNet architecture for biometric identification in the security, medical, and criminal investigation fields. Conventional image recognition algorithms exhibit limitations in processing barefoot footprint images characterized by concentrated feature distributions and rich texture patterns. To address this, our framework integrates an improved StarNet into the backbone of YOLOv8 architecture. Leveraging the unique advantages of element-wise multiplication, the redesigned backbone efficiently maps inputs to a high-dimensional nonlinear feature space without increasing channel dimensions, achieving enhanced representational capacity with low computational latency. Subsequently, an Encoder layer facilitates feature interaction within the backbone through multi-scale feature fusion and attention mechanisms, effectively extracting rich semantic information while maintaining computational efficiency. In the feature fusion part, a feature modulation block processes multi-scale features by synergistically combining global and local information, thereby reducing redundant computations and decreasing both parameter count and computational complexity to achieve model lightweighting. Experimental evaluations on a proprietary barefoot footprint dataset demonstrate that the proposed model exhibits significant advantages in terms of parameter efficiency, recognition accuracy, and computational complexity. The number of parameters has been reduced by 0.73 million, further improving the model’s speed. GFLOPs have been reduced by 1.5, lowering the performance requirements for computational hardware during model deployment. Recognition accuracy has reached 99.5%, with further improvements in model precision. Future research will explore how to capture shoeprint images with complex backgrounds from shoes worn at crime scenes, aiming to further enhance the model’s recognition capabilities in more forensic scenarios. Full article
(This article belongs to the Special Issue Transformer Applications in Target Tracking)

20 pages, 3978 KiB  
Article
Cotton-YOLO: A Lightweight Detection Model for Falled Cotton Impurities Based on Yolov8
by Jie Li, Zhoufan Zhong, Youran Han and Xinhou Wang
Symmetry 2025, 17(8), 1185; https://doi.org/10.3390/sym17081185 - 24 Jul 2025
Viewed by 248
Abstract
As an important pillar of the global economic system, the cotton industry faces critical challenges from non-fibrous impurities (e.g., leaves and debris) during processing, which severely degrade product quality, inflate costs, and reduce efficiency. Traditional detection methods suffer from insufficient accuracy and low efficiency, failing to meet practical production needs. While deep learning models excel in general object detection, their massive parameter counts render them ill-suited for real-time industrial applications. To address these issues, this study proposes Cotton-YOLO, an optimized YOLOv8 model. By leveraging principles of symmetry in model design and system setup, the study integrates the CBAM attention module, with its inherent dual-path (channel-spatial) symmetry, to enhance feature capture for tiny impurities and mitigate insufficient focus on key areas. The C2f_DSConv module, exploiting functional equivalence via quantization and shift operations, reduces model complexity by 12% (to 2.71 million parameters) without sacrificing accuracy. Considering angle and shape variations in complex scenarios, the loss function is upgraded to Wise-IoU for more accurate bounding box regression. Experimental results show that Cotton-YOLO achieves 86.5% precision, 80.7% recall, 89.6% mAP50, 50.1% mAP50–95, and 50.51 fps detection speed, representing a 3.5% speed increase over the original YOLOv8. This work demonstrates the effective application of symmetry concepts (in algorithmic structure and performance balance) to create a model that balances lightweight design and high efficiency, providing a practical solution for industrial impurity detection and key technical support for automated cotton sorting systems. Full article
(This article belongs to the Section Computer)

19 pages, 1711 KiB  
Article
TSDCA-BA: An Ultra-Lightweight Speech Enhancement Model for Real-Time Hearing Aids with Multi-Scale STFT Fusion
by Zujie Fan, Zikun Guo, Yanxing Lai and Jaesoo Kim
Appl. Sci. 2025, 15(15), 8183; https://doi.org/10.3390/app15158183 - 23 Jul 2025
Viewed by 263
Abstract
Lightweight speech denoising models have made remarkable progress in improving both speech quality and computational efficiency. However, most models rely on long temporal windows as input, limiting their applicability in low-latency, real-time scenarios on edge devices. To address this challenge, we propose a lightweight hybrid module, Temporal Statistics Enhancement, Squeeze-and-Excitation-based Dual Convolutional Attention, and Band-wise Attention (TSE, SDCA, BA) Module. The TSE module enhances single-frame spectral features by concatenating statistical descriptors—mean, standard deviation, maximum, and minimum—thereby capturing richer local information without relying on temporal context. The SDCA and BA module integrates a simplified residual structure and channel attention, while the BA component further strengthens the representation of critical frequency bands through band-wise partitioning and differentiated weighting. The proposed model requires only 0.22 million multiply–accumulate operations (MMACs) and contains a total of 112.3 K parameters, making it well suited for low-latency, real-time speech enhancement applications. Experimental results demonstrate that among lightweight models with fewer than 200K parameters, the proposed approach outperforms most existing methods in both denoising performance and computational efficiency, significantly reducing processing overhead. Furthermore, real-device deployment on an improved hearing aid confirms an inference latency as low as 2 milliseconds, validating its practical potential for real-time edge applications. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
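The TSE step described above, which concatenates mean, standard deviation, maximum, and minimum onto a single spectral frame, might be sketched as follows. This is a rough illustration only: the function name, the use of the population standard deviation, and the toy frame values are assumptions, and the article's actual implementation may differ.

```python
import statistics


def add_frame_statistics(frame):
    """Append mean, standard deviation, max, and min of one spectral frame."""
    descriptors = [
        statistics.fmean(frame),   # average magnitude across frequency bins
        statistics.pstdev(frame),  # population standard deviation (an assumption)
        max(frame),
        min(frame),
    ]
    return list(frame) + descriptors


# Hypothetical 4-bin magnitude spectrum for a single frame.
frame = [0.2, 0.8, 0.5, 0.1]
enhanced = add_frame_statistics(frame)
```

The enhanced vector has four more entries than the input frame and depends on no neighbouring frames, which matches the abstract's point about avoiding temporal context.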

17 pages, 3069 KiB  
Article
Enhanced Segmentation of Glioma Subregions via Modality-Aware Encoding and Channel-Wise Attention in Multimodal MRI
by Annachiara Cariola, Elena Sibilano, Antonio Brunetti, Domenico Buongiorno, Andrea Guerriero and Vitoantonio Bevilacqua
Appl. Sci. 2025, 15(14), 8061; https://doi.org/10.3390/app15148061 - 20 Jul 2025
Viewed by 414
Abstract
Accurate segmentation of key tumor subregions in adult gliomas from Magnetic Resonance Imaging (MRI) is of critical importance for brain tumor diagnosis, treatment planning, and prognosis. However, this task remains poorly investigated and highly challenging due to the considerable variability in shape and appearance of these areas across patients. This study proposes a novel Deep Learning architecture leveraging modality-specific encoding and attention-based refinement for the segmentation of glioma subregions, including peritumoral edema (ED), necrotic core (NCR), and enhancing tissue (ET). The model is trained and validated on the Brain Tumor Segmentation (BraTS) 2023 challenge dataset and benchmarked against a state-of-the-art transformer-based approach. Our architecture achieves promising results, with Dice scores of 0.78, 0.86, and 0.88 for NCR, ED, and ET, respectively, outperforming SegFormer3D while maintaining comparable model complexity. To ensure a comprehensive evaluation, performance was also assessed on standard composite tumor regions, i.e., tumor core (TC) and whole tumor (WT). The statistically significant improvements obtained on all regions highlight the effectiveness of integrating complementary modality-specific information and applying channel-wise feature recalibration in the proposed model. Full article
(This article belongs to the Special Issue The Role of Artificial Intelligence Technologies in Health)
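For readers unfamiliar with channel-wise feature recalibration, the sketch below shows one common form, a squeeze-and-excitation style gate, in plain Python. It is not the architecture from the article, whose attention block is not specified in the abstract. The helper names, the tiny layer sizes, and the hand-picked weights are all assumptions for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(vector, weights, biases):
    """Fully connected layer: weights holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


def channel_attention(feature_map, w1, b1, w2, b2):
    """Squeeze-and-excitation style recalibration of a C x H x W feature map."""
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in feature_map]
    # Excitation: a bottleneck MLP (ReLU, then sigmoid) yields a gate in (0, 1).
    hidden = [max(0.0, h) for h in dense(squeezed, w1, b1)]
    gates = [sigmoid(g) for g in dense(hidden, w2, b2)]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[value * gate for value in row] for row in ch]
              for ch, gate in zip(feature_map, gates)]
    return scaled, gates


# Toy input: 2 channels of 2 x 2, with hand-picked (not learned) weights.
features = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
w1, b1 = [[0.5, -0.5]], [0.0]          # 2 channels -> 1 hidden unit
w2, b2 = [[1.0], [-1.0]], [0.0, 0.0]   # 1 hidden unit -> 2 channels
scaled, gates = channel_attention(features, w1, b1, w2, b2)
```

In a trained network the two weight matrices are learned, so the gates emphasize informative channels and suppress less useful ones.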

22 pages, 32971 KiB  
Article
Spatial-Channel Multiscale Transformer Network for Hyperspectral Unmixing
by Haixin Sun, Qiuguang Cao, Fanlei Meng, Jingwen Xu and Mengdi Cheng
Sensors 2025, 25(14), 4493; https://doi.org/10.3390/s25144493 - 19 Jul 2025
Viewed by 347
Abstract
In recent years, deep learning (DL) has demonstrated remarkable capabilities in hyperspectral unmixing (HU) due to its powerful feature representation ability. Convolutional neural networks (CNNs) are effective in capturing local spatial information, but limited in modeling long-range dependencies. In contrast, transformer architectures extract global contextual features via multi-head self-attention (MHSA) mechanisms. However, most existing transformer-based HU methods focus only on spatial or spectral modeling at a single scale, lacking a unified mechanism to jointly explore spatial and channel-wise dependencies. This limitation is particularly critical for multiscale contextual representation in complex scenes. To address these issues, this article proposes a novel Spatial-Channel Multiscale Transformer Network (SCMT-Net) for HU. Specifically, a compact feature projection (CFP) module is first used to extract shallow discriminative features. Then, a spatial multiscale transformer (SMT) and a channel multiscale transformer (CMT) are sequentially applied to model contextual relations across spatial dimensions and long-range dependencies among spectral channels. In addition, a multiscale multi-head self-attention (MMSA) module is designed to extract rich multiscale global contextual and channel information, enabling a balance between accuracy and efficiency. An efficient feed-forward network (E-FFN) is further introduced to enhance inter-channel information flow and fusion. Experiments conducted on three real hyperspectral datasets (Samson, Jasper and Apex) and one synthetic dataset showed that SCMT-Net consistently outperformed existing approaches in both abundance estimation and endmember extraction, demonstrating superior accuracy and robustness. Full article
(This article belongs to the Section Sensor Networks)

28 pages, 19790 KiB  
Article
HSF-DETR: A Special Vehicle Detection Algorithm Based on Hypergraph Spatial Features and Bipolar Attention
by Kaipeng Wang, Guanglin He and Xinmin Li
Sensors 2025, 25(14), 4381; https://doi.org/10.3390/s25144381 - 13 Jul 2025
Viewed by 469
Abstract
Special vehicle detection in intelligent surveillance, emergency rescue, and reconnaissance faces significant challenges in accuracy and robustness under complex environments, necessitating advanced detection algorithms for critical applications. This paper proposes HSF-DETR (Hypergraph Spatial Feature DETR), integrating four innovative modules: a Cascaded Spatial Feature Network (CSFNet) backbone with Cross-Efficient Convolutional Gating (CECG) for enhanced long-range detection through hybrid state-space modeling; a Hypergraph-Enhanced Spatial Feature Modulation (HyperSFM) network utilizing hypergraph structures for high-order feature correlations and adaptive multi-scale fusion; a Dual-Domain Feature Encoder (DDFE) combining Bipolar Efficient Attention (BEA) and Frequency-Enhanced Feed-Forward Network (FEFFN) for precise feature weight allocation; and a Spatial-Channel Fusion Upsampling Block (SCFUB) improving feature fidelity through depth-wise separable convolution and channel shift mixing. Experiments conducted on a self-built special vehicle dataset containing 2388 images demonstrate that HSF-DETR achieves mAP50 and mAP50-95 of 96.6% and 70.6%, respectively, representing improvements of 3.1% and 4.6% over baseline RT-DETR while maintaining computational efficiency at 59.7 GFLOPs and 18.07 M parameters. Cross-domain validation on VisDrone2019 and BDD100K datasets confirms the method’s generalization capability and robustness across diverse scenarios, establishing HSF-DETR as an effective solution for special vehicle detection in complex environments. Full article
(This article belongs to the Section Sensing and Imaging)

19 pages, 14033 KiB  
Article
SCCA-YOLO: Spatial Channel Fusion and Context-Aware YOLO for Lunar Crater Detection
by Jiahao Tang, Boyuan Gu, Tianyou Li and Ying-Bo Lu
Remote Sens. 2025, 17(14), 2380; https://doi.org/10.3390/rs17142380 - 10 Jul 2025
Viewed by 392
Abstract
Lunar crater detection plays a crucial role in geological analysis and the advancement of lunar exploration. Accurate identification of craters is also essential for constructing high-resolution topographic maps and supporting mission planning in future lunar exploration efforts. However, lunar craters often suffer from insufficient feature representation due to their small size and blurred boundaries. In addition, the visual similarity between craters and surrounding terrain further exacerbates background confusion. These challenges significantly hinder detection performance in remote sensing imagery and underscore the necessity of enhancing both local feature representation and global semantic reasoning. In this paper, we propose a novel Spatial Channel Fusion and Context-Aware YOLO (SCCA-YOLO) model built upon the YOLO11 framework. Specifically, the Context-Aware Module (CAM) employs a multi-branch dilated convolutional structure to enhance feature richness and expand the local receptive field, thereby strengthening the feature extraction capability. The Joint Spatial and Channel Fusion Module (SCFM) is utilized to fuse spatial and channel information to model the global relationships between craters and the background, effectively suppressing background noise and reinforcing feature discrimination. In addition, the improved Channel Attention Concatenation (CAC) strategy adaptively learns channel-wise importance weights during feature concatenation, further optimizing multi-scale semantic feature fusion and enhancing the model’s sensitivity to critical crater features. The proposed method is validated on a self-constructed Chang’e 6 dataset, covering the landing site and its surrounding areas. Experimental results demonstrate that our model achieves an mAP0.5 of 96.5% and an mAP0.5:0.95 of 81.5%, outperforming other mainstream detection models including the YOLO family of algorithms. These findings highlight the potential of SCCA-YOLO for high-precision lunar crater detection and provide valuable insights into future lunar surface analysis. Full article

21 pages, 3079 KiB  
Article
A Lightweight Multi-Angle Feature Fusion CNN for Bearing Fault Diagnosis
by Huanli Li, Guoqiang Wang, Nianfeng Shi, Yingying Li, Wenlu Hao and Chongwen Pang
Electronics 2025, 14(14), 2774; https://doi.org/10.3390/electronics14142774 - 10 Jul 2025
Viewed by 308
Abstract
To address the issues of high model complexity and weak noise resistance in convolutional neural networks for bearing fault diagnosis, this paper proposes a novel lightweight multi-angle feature fusion convolutional neural network (LMAFCNN). First, the original signal was preprocessed using a wide-kernel convolutional layer to achieve data dimensionality reduction and feature channel expansion. Second, a lightweight multi-angle feature fusion module was designed as the core feature extraction unit. The main branch fused multidimensional features through pointwise convolution and large-kernel channel-wise expansion convolution, whereas the auxiliary branch introduced an efficient channel attention (ECA) mechanism to achieve channel-adaptive weighting. Feature enhancement was achieved through the addition of branches. Finally, global average pooling and fully connected layers were used to complete end-to-end fault diagnosis. The experimental results showed that the proposed method achieved an accuracy of 99.5% on the Paderborn University (PU) artificial damage dataset, with a computational complexity of only 14.8 million floating-point operations (MFLOPs) and 55.2 K parameters. Compared with existing mainstream methods, the proposed method significantly reduces model complexity while maintaining high accuracy, demonstrating excellent diagnostic performance and application potential. Full article
(This article belongs to the Section Industrial Electronics)
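The efficient channel attention (ECA) mechanism mentioned above is usually described as a small 1D convolution over per-channel averages followed by a sigmoid, with no dimensionality reduction. The sketch below illustrates that idea in plain Python. The function name, the fixed kernel values, and the zero padding are assumptions for demonstration, and in practice the kernel is learned.

```python
import math


def eca_gates(channel_means, kernel):
    """Efficient-channel-attention style gates from per-channel averages.

    Assumes an odd-length kernel so the output has one gate per channel.
    """
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(channel_means) + [0.0] * pad
    gates = []
    for i in range(len(channel_means)):
        # 1D convolution across neighbouring channels, no dimensionality reduction.
        window = padded[i:i + k]
        response = sum(w * x for w, x in zip(kernel, window))
        gates.append(1.0 / (1.0 + math.exp(-response)))
    return gates


# Hypothetical pooled descriptors for 4 channels and a fixed 3-tap kernel.
means = [0.5, 1.0, -0.5, 2.0]
gates = eca_gates(means, kernel=[0.25, 0.5, 0.25])
```

Each gate depends only on a channel and its immediate neighbours, which keeps the parameter count at the kernel length and suits the lightweight design goal described in the abstract.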

21 pages, 5895 KiB  
Article
Improved YOLO-Based Pulmonary Nodule Detection with Spatial-SE Attention and an Aspect Ratio Penalty
by Xinhang Song, Haoran Xie, Tianding Gao, Nuo Cheng and Jianping Gou
Sensors 2025, 25(14), 4245; https://doi.org/10.3390/s25144245 - 8 Jul 2025
Viewed by 420
Abstract
The accurate identification of pulmonary nodules is critical for the early diagnosis of lung diseases; however, this task remains challenging due to inadequate feature representation and limited localization sensitivity. Current methodologies often utilize channel attention mechanisms and intersection over union (IoU)-based loss functions. Yet, they frequently overlook spatial context and struggle to capture subtle variations in aspect ratios, which hinders their ability to detect small objects. In this study, we introduce an improved YOLOv11 framework that addresses these limitations through two primary components: a spatial squeeze-and-excitation (SSE) module that concurrently models channel-wise and spatial attention to enhance the discriminative features pertinent to nodules, and an explicit aspect ratio penalty IoU (EAPIoU) loss that imposes a direct penalty on the squared differences in aspect ratios to refine the bounding box regression process. Comprehensive experiments conducted on the LUNA16, LungCT, and Node21 datasets reveal that our approach achieves superior precision, recall, and mean average precision (mAP) across various IoU thresholds, surpassing previous state-of-the-art methods while maintaining computational efficiency. Specifically, the proposed SSE module achieves a precision of 0.781 on LUNA16, while the EAPIoU loss boosts mAP@50 to 92.4% on LungCT, outperforming mainstream attention mechanisms and IoU-based loss functions. These findings underscore the effectiveness of integrating spatially aware attention mechanisms with aspect ratio-sensitive loss functions for robust nodule detection.
(This article belongs to the Section Biomedical Sensors)
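
The idea of penalizing squared aspect-ratio differences on top of an IoU loss, as the abstract above describes for EAPIoU, can be sketched as follows. The abstract does not give the exact formula, so the form used here (`1 - IoU + weight * (ratio difference)^2`), the `weight` parameter, and the function names are assumptions for illustration, not the published loss.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def aspect_penalty_iou_loss(pred, target, weight=1.0, eps=1e-9):
    """IoU loss plus a squared aspect-ratio difference term (assumed form)."""
    # Aspect ratio = width / height; eps guards against zero height.
    ratio_pred = (pred[2] - pred[0]) / (pred[3] - pred[1] + eps)
    ratio_target = (target[2] - target[0]) / (target[3] - target[1] + eps)
    penalty = (ratio_pred - ratio_target) ** 2
    return 1.0 - iou(pred, target) + weight * penalty

# Toy boxes: a square target and a wide, flat prediction of equal area.
target = (0.0, 0.0, 2.0, 2.0)
wide_pred = (0.0, 0.0, 4.0, 1.0)
loss_same = aspect_penalty_iou_loss(target, target)
loss_wide = aspect_penalty_iou_loss(wide_pred, target)
```

In this form a prediction whose shape differs from the target is penalized even when its overlap is moderate, which is the behaviour the abstract attributes to the aspect ratio term.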

26 pages, 32088 KiB  
Article
Fall Detection Algorithm Using Enhanced HRNet Combined with YOLO
by Huan Shi, Xiaopeng Wang and Jia Shi
Sensors 2025, 25(13), 4128; https://doi.org/10.3390/s25134128 - 2 Jul 2025
Viewed by 496
Abstract
To address the issues of insufficient feature extraction, reliance on a single fall-judgment criterion, and poor real-time performance of traditional fall detection algorithms in occluded scenes, a top-down fall detection algorithm based on improved YOLOv8 combined with BAM-HRNet is proposed. First, the ShuffleNetV2 network is used to make the backbone of YOLOv8 lightweight, and a mixed attention mechanism network is connected stage-wise at the neck to enable the network to better obtain human body position information. Second, the HRNet network integrated with the channel attention mechanism can effectively extract the position information of key points. Then, by analyzing the position information of skeletal key points, the descent speed of the center of mass, the angular velocity between the trunk and the ground, and the human body height-to-width ratio are jointly used as the discriminant basis for identifying fall behaviors. In addition, when a suspected fall is detected, the system automatically activates a voice inquiry mechanism to improve the accuracy of fall judgment. The results show that the accuracy of the object detection module on the COCO and Pascal VOC datasets is 64.1% and 61.7%, respectively. The accuracy of the key point detection module on the COCO and OCHuman datasets reaches 73.49% and 70.11%, respectively. On the fall detection datasets, the accuracy of the proposed algorithm exceeds 95% and the frame rate reaches 18.1 fps. Compared with traditional algorithms, it demonstrates superior ability to distinguish between normal and fall behaviors.
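
The three cues named in the abstract above (descent speed of the center of mass, angular velocity between trunk and ground, and height-to-width ratio) can be combined as in the following sketch. The abstract does not state thresholds or how the cues are combined, so the threshold values, the rule that all three must fire, the frame dictionary layout, and the function names are hypothetical.

```python
import math

def trunk_angle(shoulder, hip):
    """Angle in degrees between the trunk (hip to shoulder) and the ground."""
    dx = shoulder[0] - hip[0]
    dy = shoulder[1] - hip[1]
    return math.degrees(math.atan2(abs(dy), abs(dx)))

def fall_features(prev, curr, dt):
    """Return (descent speed, trunk angular velocity, height/width ratio).

    Each frame is a dict with 'center', 'shoulder', 'hip' as (x, y) pixel
    coordinates (y grows downward) and 'bbox' as (width, height).
    """
    descent_speed = (curr["center"][1] - prev["center"][1]) / dt
    angle_prev = trunk_angle(prev["shoulder"], prev["hip"])
    angle_curr = trunk_angle(curr["shoulder"], curr["hip"])
    angular_velocity = abs(angle_curr - angle_prev) / dt
    width, height = curr["bbox"]
    return descent_speed, angular_velocity, height / width

def is_suspected_fall(prev, curr, dt,
                      min_speed=50.0, min_ang_vel=60.0, max_ratio=1.0):
    """Flag a fall only when all three cues agree (assumed rule and thresholds)."""
    speed, ang_vel, ratio = fall_features(prev, curr, dt)
    return speed > min_speed and ang_vel > min_ang_vel and ratio < max_ratio

# Toy frames: a standing pose followed by a lying pose half a second later.
standing = {"center": (100, 100), "shoulder": (100, 50),
            "hip": (100, 120), "bbox": (40, 160)}
lying = {"center": (110, 160), "shoulder": (60, 175),
         "hip": (120, 180), "bbox": (150, 50)}
fell = is_suspected_fall(standing, lying, dt=0.5)
still_standing = is_suspected_fall(standing, standing, dt=0.5)
```

Requiring agreement between the cues is one way to avoid flagging a person who simply bends or sits down quickly; the follow-up voice inquiry mentioned in the abstract would sit after such a check.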