Search Results (295)

Search Parameters:
Keywords = pyramidal representation

37 pages, 10380 KB  
Article
FEWheat-YOLO: A Lightweight Improved Algorithm for Wheat Spike Detection
by Hongxin Wu, Weimo Wu, Yufen Huang, Shaohua Liu, Yanlong Liu, Nannan Zhang, Xiao Zhang and Jie Chen
Plants 2025, 14(19), 3058; https://doi.org/10.3390/plants14193058 - 3 Oct 2025
Abstract
Accurate detection and counting of wheat spikes are crucial for yield estimation and variety selection in precision agriculture. However, challenges such as complex field environments, morphological variations, and small target sizes hinder the performance of existing models in real-world applications. This study proposes FEWheat-YOLO, a lightweight and efficient detection framework optimized for deployment on agricultural edge devices. The architecture integrates four key modules: (1) FEMANet, a mixed aggregation feature enhancement network with Efficient Multi-scale Attention (EMA) for improved small-target representation; (2) BiAFA-FPN, a bidirectional asymmetric feature pyramid network for efficient multi-scale feature fusion; (3) ADown, an adaptive downsampling module that preserves structural details during resolution reduction; and (4) GSCDHead, a grouped shared convolution detection head for reduced parameters and computational cost. Evaluated on a hybrid dataset combining GWHD2021 and a self-collected field dataset, FEWheat-YOLO achieved a COCO-style AP of 51.11%, AP@50 of 89.8%, and AP scores of 18.1%, 50.5%, and 61.2% for small, medium, and large targets, respectively, with an average recall (AR) of 58.1%. In wheat spike counting tasks, the model achieved an R2 of 0.941, MAE of 3.46, and RMSE of 6.25, demonstrating high counting accuracy and robustness. The proposed model requires only 0.67 M parameters, 5.3 GFLOPs, and 1.6 MB of storage, while achieving an inference speed of 54 FPS. Compared to YOLOv11n, FEWheat-YOLO improved AP@50, AP_s, AP_m, AP_l, and AR by 0.53%, 0.7%, 0.7%, 0.4%, and 0.3%, respectively, while reducing parameters by 74%, computation by 15.9%, and model size by 69.2%. These results indicate that FEWheat-YOLO provides an effective balance between detection accuracy, counting performance, and model efficiency, offering strong potential for real-time agricultural applications on resource-limited platforms. Full article
(This article belongs to the Special Issue Advances in Artificial Intelligence for Plant Research)
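The reported 74% parameter reduction comes largely from the grouped, shared detection head. Below is a minimal sketch of that general idea only; the abstract does not specify GSCDHead's internals, so the channel sizes, group count, and head layout are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Sketch (not the authors' GSCDHead): grouped convolution plus one head shared
# across pyramid levels, the two ideas credited for the parameter savings.
class SharedGroupedHead(nn.Module):
    def __init__(self, channels=64, num_outputs=5, groups=4):
        super().__init__()
        # groups=4 cuts the 3x3 conv's weights to roughly 1/4 of a dense conv
        self.conv = nn.Conv2d(channels, channels, 3, padding=1, groups=groups)
        self.act = nn.SiLU()
        self.pred = nn.Conv2d(channels, num_outputs, 1)

    def forward(self, features):
        # The same weight set is applied to every pyramid level (P3, P4, P5, ...)
        return [self.pred(self.act(self.conv(f))) for f in features]

dense = nn.Conv2d(64, 64, 3, padding=1)
grouped = nn.Conv2d(64, 64, 3, padding=1, groups=4)
print(sum(p.numel() for p in dense.parameters()))    # 36928
print(sum(p.numel() for p in grouped.parameters()))  # 9280
```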
12 pages, 768 KB  
Article
ECG Waveform Segmentation via Dual-Stream Network with Selective Context Fusion
by Yongpeng Niu, Nan Lin, Yuchen Tian, Kaipeng Tang and Baoxiang Liu
Electronics 2025, 14(19), 3925; https://doi.org/10.3390/electronics14193925 - 2 Oct 2025
Abstract
Electrocardiogram (ECG) waveform delineation is fundamental to cardiac disease diagnosis. This task requires precise localization of key fiducial points, specifically the onset, peak, and offset positions of P-waves, QRS complexes, and T-waves. Current methods exhibit significant performance degradation in noisy clinical environments (baseline drift, electromyographic interference, powerline interference, etc.), compromising diagnostic reliability. To address this limitation, we introduce ECG-SCFNet, a novel dual-stream architecture employing selective context fusion, further strengthened by a consistency training paradigm that maintains robust waveform delineation accuracy under challenging noise conditions. The network comprises three components: (1) a temporal stream captures dynamic rhythmic features through sequential multi-branch convolution and temporal attention mechanisms; (2) a morphology stream combines parallel multi-scale convolution with feature pyramid integration to extract multi-scale waveform structural features through morphological attention; (3) the Selective Context Fusion (SCF) module adaptively integrates features from the temporal and morphology streams using a dual attention mechanism, which operates across both channel and spatial dimensions to selectively emphasize informative features from each stream, thereby enhancing representation learning for accurate ECG segmentation. On the LUDB and QT datasets, ECG-SCFNet achieves high performance, with F1-scores of 97.83% and 97.80%, respectively. Crucially, it maintains robust performance under challenging noise conditions on these datasets, with F1-scores of 88.49% and 86.25%, significantly improving noise robustness over other methods and demonstrating precise boundary localization for clinical ECG analysis. Full article
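A sketch of the dual channel-and-spatial attention fusion the abstract describes, written for 1D ECG feature maps of shape (batch, channels, time). The layer choices are assumptions for illustration, not the published SCF design.

```python
import torch
import torch.nn as nn

# Illustrative sketch, not the paper's SCF module: fuse a temporal stream and
# a morphology stream with channel attention, then spatial (time-axis) attention.
class SelectiveContextFusion(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool1d(1),
            nn.Conv1d(2 * channels, 2 * channels, 1),
            nn.Sigmoid(),
        )
        self.spatial_gate = nn.Sequential(
            nn.Conv1d(2 * channels, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )
        self.project = nn.Conv1d(2 * channels, channels, 1)

    def forward(self, temporal, morphology):
        x = torch.cat([temporal, morphology], dim=1)   # (B, 2C, T)
        x = x * self.channel_gate(x)                   # re-weight channels
        x = x * self.spatial_gate(x)                   # re-weight time steps
        return self.project(x)                         # back to C channels

fused = SelectiveContextFusion(64)(torch.randn(2, 64, 500), torch.randn(2, 64, 500))
print(fused.shape)  # torch.Size([2, 64, 500])
```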

19 pages, 6027 KB  
Article
An Improved HRNetV2-Based Semantic Segmentation Algorithm for Pipe Corrosion Detection in Smart City Drainage Networks
by Liang Gao, Xinxin Huang, Wanling Si, Feng Yang, Xu Qiao, Yaru Zhu, Tingyang Fu and Jianshe Zhao
J. Imaging 2025, 11(10), 325; https://doi.org/10.3390/jimaging11100325 - 23 Sep 2025
Abstract
Urban drainage pipelines are essential components of smart city infrastructure, supporting the safe and sustainable operation of underground systems. However, internal corrosion in pipelines poses significant risks to structural stability and public safety. In this study, we propose an enhanced semantic segmentation framework based on High-Resolution Network Version 2 (HRNetV2) to accurately identify corroded regions in closed-circuit television (CCTV) inspection images. The proposed method integrates a Convolutional Block Attention Module (CBAM) to strengthen the feature representation of corrosion patterns and introduces a Lightweight Pyramid Pooling Module (LitePPM) to improve multi-scale context modeling. By preserving high-resolution details through HRNetV2’s parallel architecture, the model achieves precise and robust segmentation performance. Experiments on a real-world corrosion dataset show that our approach attains a mean Intersection over Union (mIoU) of 95.92 ± 0.03%, Recall of 97.01 ± 0.02%, and an overall Accuracy of 98.54%. These results demonstrate the method’s effectiveness in supporting intelligent infrastructure inspection and provide technical insights for advancing automated maintenance systems in smart cities. Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)
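LitePPM is described as a lightweight take on pyramid pooling. For reference, the standard PSPNet-style pyramid pooling module it derives from looks like the sketch below; the bin sizes are the usual defaults, not values from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Standard pyramid pooling (PSPNet): pool at several grid sizes, project,
# upsample, and concatenate with the input to add multi-scale context.
class PyramidPooling(nn.Module):
    def __init__(self, in_ch=256, bins=(1, 2, 3, 6)):
        super().__init__()
        self.stages = nn.ModuleList(
            nn.Sequential(
                nn.AdaptiveAvgPool2d(b),
                nn.Conv2d(in_ch, in_ch // len(bins), 1, bias=False),
                nn.ReLU(inplace=True),
            )
            for b in bins
        )
        self.fuse = nn.Conv2d(in_ch * 2, in_ch, 3, padding=1, bias=False)

    def forward(self, x):
        h, w = x.shape[2:]
        pooled = [
            F.interpolate(s(x), size=(h, w), mode="bilinear", align_corners=False)
            for s in self.stages
        ]
        return self.fuse(torch.cat([x, *pooled], dim=1))

print(PyramidPooling()(torch.randn(1, 256, 32, 32)).shape)  # (1, 256, 32, 32)
```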

29 pages, 34222 KB  
Article
BFRDNet: A UAV Image Object Detection Method Based on a Backbone Feature Reuse Detection Network
by Liming Zhou, Jiakang Yang, Yuanfei Xie, Guochong Zhang, Cheng Liu and Yang Liu
ISPRS Int. J. Geo-Inf. 2025, 14(9), 365; https://doi.org/10.3390/ijgi14090365 - 21 Sep 2025
Abstract
Unmanned aerial vehicle (UAV) image object detection has become an increasingly important research area in computer vision. However, variable target shapes and complex environments make it difficult for models to fully exploit image features. To solve this problem, we propose a UAV image object detection method based on a backbone feature reuse detection network, named BFRDNet. First, we design a backbone feature reuse pyramid network (BFRPN), which takes the backbone’s characteristics as its starting point and more fully utilizes the multi-scale features of the backbone network to improve performance in complex environments. Second, we propose a feature extraction module based on multi-kernel convolution (MKConv) to deeply mine features under different receptive fields, helping the model accurately recognize targets of different sizes and shapes. Finally, we design a detection head preprocessing module (PDetect) to enhance the feature representation fed to the detection head and effectively suppress interference from background information. We validate the performance of BFRDNet primarily on the VisDrone dataset. The experimental results demonstrate that BFRDNet achieves a significant improvement in detection performance, with the mAP increasing by 7.5%. To further evaluate the model’s generalization capacity, we extend the experiments to the UAVDT and COCO datasets. Full article
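A hypothetical reading of the MKConv idea: parallel convolutions with different kernel sizes probe different receptive fields before their outputs are merged. The branch choices below (depthwise branches, 1x1 merge) are assumptions; the paper's exact design may differ.

```python
import torch
import torch.nn as nn

# Sketch of a multi-kernel convolution block: each branch sees a different
# receptive field; depthwise convolutions keep the parallel branches cheap.
class MultiKernelConv(nn.Module):
    def __init__(self, channels=128, kernel_sizes=(1, 3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, k, padding=k // 2, groups=channels)
            for k in kernel_sizes
        )
        self.merge = nn.Conv2d(channels * len(kernel_sizes), channels, 1)

    def forward(self, x):
        return self.merge(torch.cat([b(x) for b in self.branches], dim=1))

print(MultiKernelConv()(torch.randn(1, 128, 40, 40)).shape)  # (1, 128, 40, 40)
```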

27 pages, 9667 KB  
Article
REU-YOLO: A Context-Aware UAV-Based Rice Ear Detection Model for Complex Field Scenes
by Dongquan Chen, Kang Xu, Wenbin Sun, Danyang Lv, Songmei Yang, Ranbing Yang and Jian Zhang
Agronomy 2025, 15(9), 2225; https://doi.org/10.3390/agronomy15092225 - 20 Sep 2025
Abstract
Accurate detection and counting of rice ears serve as a critical indicator for yield estimation, but the complex conditions of paddy fields limit the efficiency and precision of traditional sampling methods. We propose REU-YOLO, a model designed for rice ear detection in UAV low-altitude remote sensing imagery, to address issues such as high density and complex spatial distribution with occlusion in field scenes. First, we combine the Additive Block, containing Convolutional Additive Self-attention (CAS) and a Convolutional Gated Linear Unit (CGLU), to propose a novel module called Additive-CGLU-C2F (AC-C2f) as a replacement for the original C2f in YOLOv8. It captures contextual information between different regions of the image and improves the feature extraction ability of the model; we also introduce the DropBlock strategy to reduce overfitting and replace the original SPPF module with the SPPFCSPC-G module to enhance feature representation and improve the model’s capacity to extract features across varying scales. We further propose a feature fusion network called Multi-branch Bidirectional Feature Pyramid Network (MBiFPN), which introduces a small object detection head and adjusts the head to focus more on small and medium-sized rice ear targets. By using adaptive average pooling and bidirectional weighted feature fusion, shallow and deep features are dynamically fused to enhance the robustness of the model. Finally, the Inner-PIoU loss function is introduced to improve the adaptability of the model to rice ear morphology. On the self-developed dataset UAVR, REU-YOLO achieves a precision (P) of 90.76%, a recall (R) of 86.94%, an mAP0.5 of 93.51%, and an mAP0.5:0.95 of 78.45%, which are 4.22%, 3.76%, 4.85%, and 8.27% higher than the corresponding values obtained with YOLOv8s, respectively. Furthermore, three public datasets, DRPD, MrMT, and GWHD, were used to perform a comprehensive evaluation of REU-YOLO. The results show that REU-YOLO exhibits strong generalization capability and more stable detection performance. Full article
(This article belongs to the Section Precision and Digital Agriculture)
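The "bidirectional weighted feature fusion" in MBiFPN echoes the fast normalized fusion popularized by EfficientDet's BiFPN. A minimal sketch of that standard mechanism follows; the abstract does not give MBiFPN's exact wiring, so this is the generic form only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Fast normalized fusion: learnable non-negative weights, normalized to ~1,
# decide how much each incoming feature map contributes to the fused output.
class WeightedFusion(nn.Module):
    def __init__(self, num_inputs=2, eps=1e-4):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(num_inputs))
        self.eps = eps

    def forward(self, inputs):
        w = F.relu(self.weights)          # keep weights non-negative
        w = w / (w.sum() + self.eps)      # normalize so they sum to ~1
        return sum(wi * x for wi, x in zip(w, inputs))

a, b = torch.randn(1, 64, 80, 80), torch.randn(1, 64, 80, 80)
print(WeightedFusion(2)([a, b]).shape)  # torch.Size([1, 64, 80, 80])
```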

23 pages, 11401 KB  
Article
HSFANet: Hierarchical Scale-Sensitive Feature Aggregation Network for Small Object Detection in UAV Aerial Images
by Hongxing Zhang, Zhonghong Ou, Siyuan Yao, Yifan Zhu, Yangfu Zhu, Hailin Li, Shigeng Wang, Yang Guo and Meina Song
Drones 2025, 9(9), 659; https://doi.org/10.3390/drones9090659 - 19 Sep 2025
Abstract
Small object detection in aerial images, particularly from Unmanned Aerial Vehicle (UAV) platforms, remains a significant challenge due to limited object resolution, dense scenes, and background interference. Existing small object detectors often fail to make full use of hierarchical features and inevitably introduce noise through hierarchical upsampling operations, while commonly used loss metrics lack sensitivity to scale information; these two issues jointly degrade performance. To address them, we propose the Hierarchical Scale-Sensitive Feature Aggregation Network (HSFANet), a novel framework that conducts robust cross-layer feature interaction to perceive small objects’ position information in hierarchical feature pyramids and encourages the model to balance the multi-scale prediction heads for accurate instance localization. HSFANet introduces a Dynamic Position Aggregation (DPA) module to explicitly enhance object areas in both shallow and deep layers, exploiting the complementary salient representations of small objects. Additionally, an efficient Scale-Sensitive Loss (SSL) is proposed to balance the small object detection outputs across hierarchical prediction heads, thereby effectively improving small object detection performance. Extensive experiments on two challenging UAV benchmarks, VisDrone and UAVDT, demonstrate that HSFANet achieves state-of-the-art (SOTA) results, with a 1.3% gain in overall average precision (AP) and a notable 2.2% improvement in AP for small objects on VisDrone. On UAVDT, HSFANet outperforms previous methods by 0.3% in overall AP and 16.7% in small object AP. These results highlight the effectiveness of HSFANet in enhancing small object detection in complex aerial imagery, making it well suited for practical UAV-based applications. Full article
(This article belongs to the Special Issue Intelligent Image Processing and Sensing for Drones, 2nd Edition)
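The abstract does not give the SSL formula, but the underlying idea, rebalancing losses so small boxes are not drowned out by large ones, can be sketched. The inverse-area weighting below is an assumption for illustration, not the paper's loss.

```python
import torch

# Loose sketch of a scale-sensitive reweighting: up-weight per-object
# regression losses for small boxes. NOT the paper's SSL formula.
def scale_sensitive_weights(boxes_wh, alpha=0.5):
    """boxes_wh: (N, 2) tensor of normalized box widths and heights."""
    areas = boxes_wh[:, 0] * boxes_wh[:, 1]
    weights = (1.0 / areas.clamp(min=1e-6)) ** alpha
    return weights / weights.mean()  # keep the overall loss scale unchanged

wh = torch.tensor([[0.02, 0.03], [0.2, 0.3], [0.5, 0.6]])
per_box_loss = torch.tensor([1.0, 1.0, 1.0])
print(scale_sensitive_weights(wh) * per_box_loss)  # smallest box weighted most
```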

27 pages, 4122 KB  
Article
Development of a Tool to Detect Open-Mouthed Respiration in Caged Broilers
by Yali Ma, Yongmin Guo, Bin Gao, Pengshen Zheng and Changxi Chen
Animals 2025, 15(18), 2732; https://doi.org/10.3390/ani15182732 - 18 Sep 2025
Abstract
Open-mouth panting in broiler chickens is a visible and critical indicator of heat stress and compromised welfare. However, detecting this behavior in densely populated cages is challenging due to the small size of the target, frequent occlusions, and cluttered backgrounds. To overcome these issues, we propose an enhanced object detection method based on the lightweight YOLOv8n framework, incorporating four key improvements. First, we add a dedicated P2 detection head to improve the recognition of small targets. Second, a space-to-depth grouped convolution module (SGConv) is introduced to capture fine-grained texture and edge features crucial for panting identification. Third, a bidirectional feature pyramid network (BiFPN) merges multi-scale feature maps for richer representations. Finally, a squeeze-and-excitation (SE) channel attention mechanism emphasizes mouth-related cues while suppressing irrelevant background noise. We trained and evaluated the method on a comprehensive, full-cycle broiler panting dataset covering all growth stages. Experimental results show that our method significantly outperforms baseline YOLO models, achieving 0.92 mAP@50 on an independent test set and 0.927 mAP@50 under leakage-free retraining, confirming strong generalizability while maintaining real-time performance. Because the initial evaluation had data-partitioning limitations, generalizability was validated twice: through independent testing and through rigorous split-then-augment retraining. This approach provides a practical tool for intelligent broiler welfare monitoring and heat stress management, contributing to improved environmental control and animal well-being. Full article
(This article belongs to the Section Poultry)
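The SE mechanism the abstract adds is the standard squeeze-and-excitation block (Hu et al.). A minimal version is below; reduction=16 is the common default, not a value taken from the paper.

```python
import torch
import torch.nn as nn

# Squeeze-and-excitation: global-pool to per-channel statistics, then learn
# per-channel gates that amplify informative channels (e.g., mouth cues).
class SEBlock(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                       # squeeze
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),                                  # per-channel gates
        )

    def forward(self, x):
        return x * self.fc(x)  # excite: re-weight the channels

print(SEBlock(256)(torch.randn(1, 256, 20, 20)).shape)  # (1, 256, 20, 20)
```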

22 pages, 3632 KB  
Article
RFR-YOLO-Based Recognition Method for Dairy Cow Behavior in Farming Environments
by Congcong Li, Jialong Ma, Shifeng Cao and Leifeng Guo
Agriculture 2025, 15(18), 1952; https://doi.org/10.3390/agriculture15181952 - 15 Sep 2025
Abstract
Cow behavior recognition constitutes a fundamental element of effective cow health monitoring and intelligent farming systems. Within large-scale cow farming environments, several critical challenges persist, including the difficulty of accurately capturing behavioral feature information, substantial variations in multi-scale features, and high inter-class similarity among different cow behaviors. To address these limitations, this study introduces an enhanced target detection algorithm for cow behavior recognition, termed RFR-YOLO, developed upon the YOLOv11n framework. A well-structured dataset encompassing nine distinct cow behaviors—namely, lying, standing, walking, eating, drinking, licking, grooming, estrus, and limping—is constructed, comprising a total of 13,224 labeled samples. The proposed algorithm incorporates three major technical improvements. First, an inverted dilated convolution module, Region Semantic Inverted Convolution (RsiConv), is designed and seamlessly integrated with the C3K2 module to form the C3K2_Rsi module, which effectively reduces computational overhead while enhancing feature representation. Second, a Four-branch Multi-Scale Dilated Attention mechanism (FMSDA) is incorporated into the network architecture, enabling scale-specific features to align with their corresponding receptive fields and improving the model’s capacity to capture multi-scale characteristics. Third, a Reparameterized Generalized Residual Feature Pyramid Network (RepGRFPN) is introduced as the neck component, allowing features to propagate through differentiated pathways and enabling flexible control over multi-scale feature expression, thereby facilitating efficient feature fusion and mitigating the impact of behavioral similarity. The experimental results demonstrate that RFR-YOLO achieves precision, recall, mAP50, and mAP50:95 values of 95.9%, 91.2%, 94.9%, and 85.2%, respectively, representing performance gains of 5.5%, 5.0%, 5.6%, and 3.5% over the baseline model. Despite a marginal increase in computational complexity of 1.4 GFLOPs, the algorithm retains a high detection speed of 147.6 frames per second. The proposed RFR-YOLO algorithm significantly improves the accuracy and robustness of target detection in group cow farming scenarios. Full article
(This article belongs to the Section Farm Animal Production)
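A generic four-branch dilated-convolution block captures the FMSDA idea of matching scales to receptive fields: each branch uses a different dilation rate, and a learned gate mixes the branches. The rates and gating below are illustrative choices, not the paper's exact design.

```python
import torch
import torch.nn as nn

# Sketch: parallel depthwise dilated convolutions with different receptive
# fields, mixed by a per-pixel softmax gate over the four branches.
class FourBranchDilated(nn.Module):
    def __init__(self, channels=128, rates=(1, 2, 3, 5)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=r, dilation=r, groups=channels)
            for r in rates
        )
        self.gate = nn.Sequential(nn.Conv2d(channels * 4, 4, 1), nn.Softmax(dim=1))

    def forward(self, x):
        feats = [b(x) for b in self.branches]
        g = self.gate(torch.cat(feats, dim=1))              # (B, 4, H, W)
        return sum(g[:, i : i + 1] * f for i, f in enumerate(feats))

print(FourBranchDilated()(torch.randn(1, 128, 40, 40)).shape)  # (1, 128, 40, 40)
```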

27 pages, 5866 KB  
Article
DCGAN Feature-Enhancement-Based YOLOv8n Model in Small-Sample Target Detection
by Peng Zheng, Yun Cheng, Wei Zhu, Bo Liu, Chenhao Ye, Shijie Wang, Shuhong Liu and Jinyin Bai
Computers 2025, 14(9), 389; https://doi.org/10.3390/computers14090389 - 15 Sep 2025
Abstract
This paper proposes DCGAN-YOLOv8n, an integrated framework that significantly advances small-sample target detection by synergizing generative adversarial feature enhancement with multi-scale representation learning. The model’s core contribution lies in its novel adversarial feature enhancement module (AFEM), which leverages conditional generative adversarial networks to reconstruct discriminative multi-scale features while effectively mitigating mode collapse. Furthermore, the architecture incorporates a deformable multi-scale feature pyramid that dynamically fuses generated high-resolution features with hierarchical semantic representations through an attention mechanism. The proposed triple marginal constraint optimization jointly enhances intra-class compactness and inter-class separation, thereby structuring a highly discriminative feature space. Extensive experiments on the NWPU VHR-10 dataset demonstrate state-of-the-art performance, with the model achieving an mAP50 of 90.46% and an mAP50-95 of 57.06%, representing significant improvements of 4.52% and 4.08% over the baseline YOLOv8n, respectively. These results validate the framework’s effectiveness in addressing critical challenges of feature representation scarcity and cross-scale adaptation in data-limited scenarios. Full article
(This article belongs to the Special Issue Machine Learning Applications in Pattern Recognition)
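The "triple marginal constraint" targets intra-class compactness and inter-class separation; the classic triplet margin loss does exactly that, so it can serve as a stand-in sketch here, with the caveat that the paper's actual constraint may differ.

```python
import torch
import torch.nn as nn

# Stand-in for a margin-based embedding constraint: pull same-class
# embeddings together and push different-class embeddings apart.
triplet = nn.TripletMarginLoss(margin=1.0)

anchor = torch.randn(8, 128, requires_grad=True)   # embeddings of one class
positive = anchor + 0.1 * torch.randn(8, 128)      # same-class neighbors
negative = torch.randn(8, 128)                     # other-class samples

loss = triplet(anchor, positive, negative)
loss.backward()  # gradients tighten clusters and widen class margins
print(float(loss))
```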

25 pages, 3276 KB  
Article
CPB-YOLOv8: An Enhanced Multi-Scale Traffic Sign Detector for Complex Road Environment
by Wei Zhao, Lanlan Li and Xin Gong
Information 2025, 16(9), 798; https://doi.org/10.3390/info16090798 - 15 Sep 2025
Abstract
Traffic sign detection is critically important for intelligent transportation systems, yet persistent challenges like multi-scale variation and complex background interference severely degrade detection accuracy and real-time performance. To address these limitations, this study presents CPB-YOLOv8, an advanced multi-scale detection framework based on the YOLOv8 architecture. A Cross-Stage Partial-Partitioned Transformer Block (CSP-PTB) is incorporated into the feature extraction stage to preserve semantic information during downsampling while enhancing global feature representation. For feature fusion, a four-level bidirectional feature pyramid network (BiFPN) integrated with a P2 detection layer significantly improves small-target detection capability. Further enhancement is achieved via an optimized loss function that balances multi-scale objective localization. Comprehensive evaluations were conducted on TT100K, CCTSDB, and a custom multi-scenario road image dataset capturing urban and suburban environments at 1920 × 1080 resolution. The results demonstrate compelling performance: on TT100K, CPB-YOLOv8 achieved 90.73% mAP@0.5 with a 12.5 MB model size, exceeding the YOLOv8s baseline by 3.94 percentage points and achieving 6.43% higher small-target recall. On CCTSDB, it attained a near-saturation performance of 99.21% mAP@0.5. Crucially, the model demonstrated exceptional robustness across diverse environmental conditions. Rigorous analysis on partitioned CCTSDB subsets based on weather and illumination, alongside validation on a separate self-collected dataset reserved solely for inference, confirmed strong adaptability to real-world distribution shifts and low-visibility scenarios. Cross-dataset validation and visual comparisons further substantiated the model’s robustness and its effective suppression of background interference. Full article
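The generic ingredient behind transformer-augmented CNN stages like CSP-PTB is self-attention over a 2D feature map. A minimal version is sketched below; the actual partitioned block in the paper is more elaborate, so read this as the core idea only.

```python
import torch
import torch.nn as nn

# Sketch: treat each spatial position as a token, run multi-head
# self-attention globally, and add the result back as a residual.
class MapAttention(nn.Module):
    def __init__(self, channels=128, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x):                      # x: (B, C, H, W)
        b, c, h, w = x.shape
        t = x.flatten(2).transpose(1, 2)       # (B, H*W, C) token sequence
        t = self.norm(t)
        out, _ = self.attn(t, t, t)            # global spatial attention
        return x + out.transpose(1, 2).reshape(b, c, h, w)  # residual add

print(MapAttention()(torch.randn(1, 128, 20, 20)).shape)  # (1, 128, 20, 20)
```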

25 pages, 6352 KB  
Article
Multi-Level Structured Scattering Feature Fusion Network for Limited Sample SAR Target Recognition
by Chenxi Zhao, Daochang Wang, Siqian Zhang and Gangyao Kuang
Remote Sens. 2025, 17(18), 3186; https://doi.org/10.3390/rs17183186 - 15 Sep 2025
Abstract
Synthetic aperture radar (SAR) target recognition faces the challenge of limited training samples. Fusing target scattering features improves the network’s ability to perceive discriminative information and reduces its dependence on training samples. However, existing methods are inadequate in utilizing and fusing target scattering information, which limits progress in target recognition. To address these issues, the multi-level structured scattering feature fusion network is proposed. First, relying on the visual geometric structure of the target, correlations between local scattering points are established to construct a more realistic target scattering structure. On this basis, the scattering association pyramid network is proposed to mine the target’s multi-level structured scattering information and fully represent it. Subsequently, the discriminative information in the features is measured via information entropy, and the measurements are employed as weighting factors to achieve feature fusion. Additionally, the cosine space classifier is proposed to enhance the discriminative capability of features and their correlation with azimuth information. The effectiveness and superiority of the proposed method are verified on two publicly available SAR image target recognition datasets. Full article
(This article belongs to the Section Remote Sensing Image Processing)
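A standard cosine-space classifier computes logits as scaled cosine similarities between L2-normalized features and class weights, which is known to help under few training samples. The paper's variant may differ in details; this is the common baseline form.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Cosine classifier: normalize both features and class prototypes so logits
# depend only on angle, then scale to a usable softmax temperature.
class CosineClassifier(nn.Module):
    def __init__(self, feat_dim=256, num_classes=10, scale=16.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.scale = scale

    def forward(self, x):
        x = F.normalize(x, dim=1)             # unit-length features
        w = F.normalize(self.weight, dim=1)   # unit-length class prototypes
        return self.scale * x @ w.t()         # cosine-similarity logits

logits = CosineClassifier()(torch.randn(4, 256))
print(logits.shape)  # torch.Size([4, 10])
```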

15 pages, 1304 KB  
Article
Conv-ScaleNet: A Multiscale Convolutional Model for Federated Human Activity Recognition
by Xian Wu Ting, Ying Han Pang, Zheng You Lim, Shih Yin Ooi and Fu San Hiew
AI 2025, 6(9), 218; https://doi.org/10.3390/ai6090218 - 8 Sep 2025
Abstract
Background: Artificial Intelligence (AI) techniques have been extensively deployed in sensor-based Human Activity Recognition (HAR) systems. Recent advances in deep learning, especially Convolutional Neural Networks (CNNs), have advanced HAR by enabling automatic feature extraction from raw sensor data. However, these models often struggle to capture multiscale patterns in human activity, limiting recognition accuracy. Additionally, traditional centralized learning approaches raise data privacy concerns, as personal sensor data must be transmitted to a central server, increasing the risk of privacy breaches. Methods: To address these challenges, this paper introduces Conv-ScaleNet, a CNN-based model designed for multiscale feature learning and compatibility with federated learning (FL) environments. Conv-ScaleNet integrates a Pyramid Pooling Module to extract both fine-grained and coarse-grained features and employs sequential Global Average Pooling layers to progressively capture abstract global representations from inertial sensor data. The model supports federated learning by training locally on user devices, sharing only model updates rather than raw data, thus preserving user privacy. Results: Experimental results demonstrate that the proposed Conv-ScaleNet achieves approximately 98% and 96% F1-scores on the WISDM and UCI-HAR datasets, respectively, confirming its competitiveness in FL environments for activity recognition. Conclusions: The proposed Conv-ScaleNet model addresses key limitations of existing HAR systems by combining multiscale feature learning with privacy-preserving training. Its strong performance, data protection capability, and adaptability to decentralized environments make it a robust and scalable solution for real-world HAR applications. Full article
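The federated-learning pattern the abstract relies on, training locally and sharing only model updates, is captured by a minimal federated-averaging round like the sketch below. `local_train` is a placeholder for each client's on-device loop, and equal client weighting is assumed for brevity (FedAvg proper weights clients by sample count).

```python
import copy
import torch

# Minimal FedAvg round: clients train on-device; only weights leave the
# device, never raw sensor data. Clients are weighted equally here.
def fedavg_round(global_model, client_loaders, local_train):
    client_states = []
    for loader in client_loaders:
        local = copy.deepcopy(global_model)
        local_train(local, loader)            # runs on the user's device
        client_states.append(local.state_dict())
    avg = {
        k: torch.stack([s[k].float() for s in client_states]).mean(dim=0)
        for k in client_states[0]
    }
    global_model.load_state_dict(avg)         # server aggregates updates only
    return global_model
```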

25 pages, 20160 KB  
Article
A Robust Framework Fusing Visual SLAM and 3D Gaussian Splatting with a Coarse-Fine Method for Dynamic Region Segmentation
by Zhian Chen, Yaqi Hu and Yong Liu
Sensors 2025, 25(17), 5539; https://doi.org/10.3390/s25175539 - 5 Sep 2025
Abstract
Existing visual SLAM systems with neural representations excel in static scenes but fail in dynamic environments where moving objects degrade performance. To address this, we propose a robust dynamic SLAM framework combining classic geometric features for localization with learned photometric features for dense mapping. Our method first tracks objects using instance segmentation and a Kalman filter. We then introduce a cascaded, coarse-to-fine strategy for efficient motion analysis: a lightweight sparse optical flow method performs a coarse screening, while a fine-grained dense optical flow clustering is selectively invoked for ambiguous targets. By filtering features on dynamic regions, our system drastically improves camera pose estimation, reducing Absolute Trajectory Error by up to 95% on dynamic TUM RGB-D sequences compared to ORB-SLAM3, and generates clean dense maps. The 3D Gaussian Splatting backend, optimized with a Gaussian pyramid strategy, ensures high-quality reconstruction. Validations on diverse datasets confirm our system’s robustness, achieving accurate localization and high-fidelity mapping in dynamic scenarios while reducing motion analysis computation by 91.7% over a dense-only approach. Full article
(This article belongs to the Section Navigation and Positioning)
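The coarse screening stage can be illustrated with pyramidal Lucas-Kanade sparse flow: track a few corners inside an instance mask and flag the region as possibly moving when residual flow is large. The thresholds are illustrative, and a full system would also compensate for camera ego-motion, which this sketch omits.

```python
import cv2
import numpy as np

# Sketch of coarse motion screening with sparse LK optical flow.
def region_moves(prev_gray, cur_gray, mask, flow_thresh=2.0):
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=50,
                                  qualityLevel=0.01, minDistance=7, mask=mask)
    if pts is None:
        return False
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, cur_gray, pts, None)
    good = status.ravel() == 1
    if not good.any():
        return False
    # Median flow magnitude of successfully tracked corners in the mask
    disp = np.linalg.norm((nxt - pts).reshape(-1, 2)[good], axis=1)
    return float(np.median(disp)) > flow_thresh
```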

17 pages, 16767 KB  
Article
AeroLight: A Lightweight Architecture with Dynamic Feature Fusion for High-Fidelity Small-Target Detection in Aerial Imagery
by Hao Qiu, Xiaoyan Meng, Yunjie Zhao, Liang Yu and Shuai Yin
Sensors 2025, 25(17), 5369; https://doi.org/10.3390/s25175369 - 30 Aug 2025
Abstract
Small-target detection in Unmanned Aerial Vehicle (UAV) aerial images remains a significant and unresolved challenge in aerial image analysis, hampered by low target resolution, dense object clustering, and complex, cluttered backgrounds. To cope with these problems, we present AeroLight, a novel and efficient detection architecture that achieves high-fidelity performance in resource-constrained environments. AeroLight is built upon three key innovations. First, we optimize the feature pyramid at the architectural level by integrating a high-resolution head specifically designed for minute-object detection; this enhances sensitivity to fine-grained spatial details while streamlining redundant, computationally expensive network layers. Second, a Dynamic Feature Fusion (DFF) module adaptively recalibrates and merges multi-scale feature maps, mitigating information loss during integration and strengthening object representation across diverse scales. Finally, we improve the localization of irregularly shaped objects by refining bounding box regression with a Shape-IoU loss function. AeroLight improves mAP50 and mAP50-95 by 7.5% and 3.3%, respectively, on the VisDrone2019 dataset, while reducing the parameter count by 28.8% compared with the baseline model. Further validation on the RSOD dataset and the Huaxing Farm Drone dataset confirms its superior performance and generalization capabilities. AeroLight provides a powerful and efficient solution for real-world UAV applications, setting a new standard for lightweight, high-precision object recognition in aerial imaging scenarios. Full article
(This article belongs to the Section Remote Sensors)
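One plausible reading of "dynamic feature fusion" is predicting per-pixel fusion weights from the features themselves, so the merge adapts to content rather than using a fixed sum. The sketch below illustrates that concept only; it is not AeroLight's DFF module.

```python
import torch
import torch.nn as nn

# Sketch: a 1x1 conv predicts per-pixel, per-input fusion weights
# (softmax-normalized), which then blend the input feature maps.
class DynamicFusion(nn.Module):
    def __init__(self, channels=64, num_inputs=2):
        super().__init__()
        self.weigher = nn.Sequential(
            nn.Conv2d(channels * num_inputs, num_inputs, 1),
            nn.Softmax(dim=1),                 # weights sum to 1 per pixel
        )

    def forward(self, feats):                  # same-shape inputs
        w = self.weigher(torch.cat(feats, dim=1))
        return sum(w[:, i : i + 1] * f for i, f in enumerate(feats))

x1, x2 = torch.randn(1, 64, 40, 40), torch.randn(1, 64, 40, 40)
print(DynamicFusion()([x1, x2]).shape)  # torch.Size([1, 64, 40, 40])
```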

25 pages, 73925 KB  
Article
Attention-Guided Edge-Optimized Network for Real-Time Detection and Counting of Pre-Weaning Piglets in Farrowing Crates
by Ning Kong, Tongshuai Liu, Guoming Li, Lei Xi, Shuo Wang and Yuepeng Shi
Animals 2025, 15(17), 2553; https://doi.org/10.3390/ani15172553 - 30 Aug 2025
Abstract
Accurate, real-time, and cost-effective detection and counting of pre-weaning piglets are critical for improving piglet survival rates. However, achieving this remains technically challenging due to high computational demands, frequent occlusion, social behaviors, and cluttered backgrounds in commercial farming environments. To address these challenges, this study proposes a lightweight, attention-enhanced piglet detection and counting network based on an improved YOLOv8n architecture. The design includes three key innovations: (i) the standard C2f modules in the backbone were replaced with an efficient novel Multi-Scale Spatial Pyramid Attention (MSPA) module to enhance multi-scale feature representation while maintaining a low computational cost; (ii) an improved Gather-and-Distribute (GD) mechanism was incorporated into the neck to facilitate feature fusion and accelerate inference; and (iii) the detection head and the sample assignment strategy were optimized to better align the classification and localization tasks, thereby improving overall performance. Experiments on the custom dataset demonstrated the model’s superiority over state-of-the-art counterparts, achieving 88.5% precision and a 93.8% mAP0.5. Furthermore, ablation studies showed that the model reduced the parameters, floating point operations (FLOPs), and model size by 58.45%, 46.91%, and 56.45% compared to those of the baseline YOLOv8n, respectively, while achieving a 2.6% improvement in detection precision and a 4.41% reduction in the counting MAE. The trained model was deployed on a Raspberry Pi 4B with ncnn to verify the effectiveness of the lightweight design, reaching an average inference time below 87 ms per image. These findings confirm that the proposed method offers a practical, scalable solution for intelligent pig farming, combining high accuracy, efficiency, and real-time performance in resource-limited environments. Full article
(This article belongs to the Section Pigs)
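Parameter-count and model-size comparisons like the ones reported across these abstracts come down to simple bookkeeping. A quick sanity-check sketch, with a toy model standing in for any detector backbone and fp32 serialization assumed:

```python
import torch
import torch.nn as nn

# Count trainable parameters and estimate fp32 checkpoint size (rough:
# ignores buffers and serialization overhead).
def count_params(model):
    return sum(p.numel() for p in model.parameters())

def state_dict_megabytes(model):
    return 4 * count_params(model) / (1024 ** 2)  # 4 bytes per fp32 weight

toy = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.SiLU(),
                    nn.Conv2d(16, 32, 3, padding=1))
print(count_params(toy), f"{state_dict_megabytes(toy):.3f} MB")
```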
