Search Results (1,663)

Search Parameters:
Keywords = separable convolutions

28 pages, 4099 KB  
Article
Fatigue Crack Length Estimation Using Acoustic Emissions Technique-Based Convolutional Neural Networks
by Asaad Migot, Ahmed Saaudi, Roshan Joseph and Victor Giurgiutiu
Sensors 2026, 26(2), 650; https://doi.org/10.3390/s26020650 (registering DOI) - 18 Jan 2026
Abstract
Fatigue crack propagation is a critical failure mechanism in engineering structures, requiring meticulous monitoring for timely maintenance. This research introduces a deep learning framework for estimating fatigue crack length in metallic plates from acoustic emission (AE) signals. AE waveforms recorded during crack growth are transformed into time-frequency images using the Choi–Williams distribution. First, a clustering system is developed to analyze the distribution of the AE image-based dataset. This system employs a CNN-based model to extract features from the input images, and the AE dataset is then divided into three categories according to fatigue crack length using the K-means algorithm. Principal Component Analysis (PCA) reduces the feature vectors to two dimensions for visualization, and the resulting projection shows how compactly the data points group within each cluster. Second, convolutional neural network (CNN) models are trained on the AE dataset to categorize crack lengths into three separate ranges. A bespoke CNN is compared against transfer-learning models built on the pre-trained ResNet50V2 and VGG16 networks. The transfer-learning models outperform the custom CNN by a wide margin, reaching approximately 99% accuracy compared to 93%. This research confirms that CNNs, particularly when trained with transfer learning, are highly effective at interpreting AE data for data-driven structural health monitoring. Full article
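
As a hedged illustration of the clustering step described above, the sketch below runs K-means with three clusters on placeholder feature vectors and projects them to two dimensions with PCA; the random features and their dimensionality stand in for the paper's CNN embeddings of the AE time-frequency images and are not the authors' pipeline.

```python
# Minimal sketch, assuming 512-dimensional CNN features for 300 AE images.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
features = rng.normal(size=(300, 512))      # placeholder CNN feature vectors

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features)
coords = PCA(n_components=2).fit_transform(features)   # 2-D projection for display

print(coords.shape, np.bincount(labels))    # (300, 2) and the cluster sizes
```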

20 pages, 857 KB  
Article
Hybrid Spike-Encoded Spiking Neural Networks for Real-Time EEG Seizure Detection: A Comparative Benchmark
by Ali Mehrabi, Neethu Sreenivasan, Upul Gunawardana and Gaetano Gargiulo
Biomimetics 2026, 11(1), 75; https://doi.org/10.3390/biomimetics11010075 - 16 Jan 2026
Viewed by 50
Abstract
Reliable and low-latency seizure detection from electroencephalography (EEG) is critical for continuous clinical monitoring and emerging wearable health technologies. Spiking neural networks (SNNs) provide an event-driven computational paradigm that is well suited to real-time signal processing, yet achieving competitive seizure detection performance with constrained model complexity remains challenging. This work introduces a hybrid spike encoding scheme that combines Delta–Sigma (change-based) and stochastic rate representations, together with two spiking architectures designed for real-time EEG analysis: a compact feed-forward HybridSNN and a convolution-enhanced ConvSNN incorporating depthwise-separable convolutions and temporal self-attention. The architectures are intentionally designed to operate on short EEG segments and to balance detection performance with computational practicality for continuous inference. Experiments on the CHB–MIT dataset show that the HybridSNN attains 91.8% accuracy with an F1-score of 0.834 for seizure detection, while the ConvSNN further improves detection performance to 94.7% accuracy and an F1-score of 0.893. Event-level evaluation on continuous EEG recordings yields false-alarm rates of 0.82 and 0.62 per day for the HybridSNN and ConvSNN, respectively. Both models exhibit inference latencies of approximately 1.2 ms per 0.5 s window on standard CPU hardware, supporting continuous real-time operation. These results demonstrate that hybrid spike encoding enables spiking architectures with controlled complexity to achieve seizure detection performance comparable to larger deep learning models reported in the literature, while maintaining low latency and suitability for real-time clinical and wearable EEG monitoring. Full article
(This article belongs to the Special Issue Bioinspired Engineered Systems)
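
To make the hybrid encoding idea concrete, here is a minimal NumPy sketch that pairs a delta (change-based) spike channel with a stochastic rate channel for a single EEG trace; the threshold, normalization, and segment length are assumptions for illustration, not the paper's parameters.

```python
import numpy as np

rng = np.random.default_rng(1)
eeg = rng.normal(size=1000)                 # placeholder EEG segment

# Delta (change-based) channel: spike when the sample-to-sample change is large.
delta_spikes = (np.abs(np.diff(eeg, prepend=eeg[0])) > 0.5).astype(np.int8)

# Stochastic rate channel: spike probability proportional to normalized amplitude.
amp = (eeg - eeg.min()) / (np.ptp(eeg) + 1e-9)
rate_spikes = (rng.random(eeg.shape) < amp).astype(np.int8)

hybrid = np.stack([delta_spikes, rate_spikes])   # two spike channels per sample
print(hybrid.shape, hybrid.mean(axis=1))
```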

18 pages, 1144 KB  
Article
Hypersector-Based Method for Real-Time Classification of Wind Turbine Blade Defects
by Lesia Dubchak, Bohdan Rusyn, Carsten Wolff, Tomasz Ciszewski, Anatoliy Sachenko and Yevgeniy Bodyanskiy
Energies 2026, 19(2), 442; https://doi.org/10.3390/en19020442 - 16 Jan 2026
Viewed by 52
Abstract
This paper presents a novel hypersector-based method with Fuzzy Learning Vector Quantization (FLVQ) for the real-time classification of wind turbine blade defects using data acquired by unmanned aerial vehicles (UAVs). Unlike conventional prototype-based FLVQ approaches that rely on Euclidean distance in the feature space, the proposed method models each defect class as a hypersector on an n-dimensional hypersphere, where class boundaries are defined by angular similarity and fuzzy membership transitions. This geometric reinterpretation of FLVQ constitutes the core innovation of the study, enabling improved class separability, robustness to noise, and enhanced interpretability under uncertain operating conditions. Feature vectors extracted via the pre-trained SqueezeNet convolutional network are normalized onto the hypersphere, forming compact directional clusters that serve as the geometric foundation of the FLVQ classifier. A fuzzy softmax membership function and an adaptive prototype-updating mechanism are introduced to handle class overlap and improve learning stability. Experimental validation on a custom dataset of 900 UAV-acquired images achieved 95% classification accuracy on test data and 98.3% on an independent dataset, with an average F1-score of 0.91. Comparative analysis with the classical FLVQ prototype demonstrated superior performance and noise robustness. Owing to its low computational complexity and transparent geometric decision structure, the developed model is well-suited for real-time deployment on UAV embedded systems. Furthermore, the proposed hypersector FLVQ framework is generic and can be extended to other renewable-energy diagnostic tasks, including solar and hydropower asset monitoring, contributing to enhanced energy security and sustainability. Full article
(This article belongs to the Special Issue Modeling, Control and Optimization of Wind Power Systems)
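
A rough sketch of the angular-membership idea: feature vectors and class prototypes are L2-normalized onto the unit hypersphere, and class membership is a softmax over cosine similarity. The random prototypes, feature dimensionality, and temperature below are placeholders, not the trained FLVQ model.

```python
import numpy as np

rng = np.random.default_rng(2)
features = rng.normal(size=(5, 128))        # placeholder SqueezeNet-style features
prototypes = rng.normal(size=(4, 128))      # one stand-in prototype per defect class

f = features / np.linalg.norm(features, axis=1, keepdims=True)
p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)

cos_sim = f @ p.T                           # angular similarity to each prototype
membership = np.exp(5.0 * cos_sim)          # fuzzy softmax with assumed temperature
membership /= membership.sum(axis=1, keepdims=True)

print(membership.argmax(axis=1))            # hard class assignment per sample
```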

19 pages, 1722 KB  
Article
Light-YOLO-Pepper: A Lightweight Model for Detecting Missing Seedlings
by Qiang Shi, Yongzhong Zhang, Xiaoxue Du, Tianhua Chen and Yafei Wang
Agriculture 2026, 16(2), 231; https://doi.org/10.3390/agriculture16020231 - 15 Jan 2026
Viewed by 157
Abstract
This study aims to meet the demand for real-time detection of missing seedlings in large-scale seedling production, addressing the low precision of traditional models and the limited adaptability of mainstream lightweight models. A Light-YOLO-Pepper detection model is proposed based on an improved YOLOv8n. An SE (Squeeze-and-Excitation) attention module is introduced to dynamically suppress interference from the nutrient-soil background and enhance the features of missing-seedling regions. Depthwise separable convolution (DSConv) replaces the standard convolution, reducing computational redundancy while retaining core features. Customized anchor boxes generated by K-means clustering adapt the model to the hole sizes of 72-unit (large) and 128-unit (small, high-density) seedling trays. The results show that the overall mAP@0.5, accuracy, and recall of the Light-YOLO-Pepper model were 93.6 ± 0.5%, 94.6 ± 0.4%, and 93.2 ± 0.6%, respectively, which are 3.3%, 3.1%, and 3.4% higher than those of YOLOv8n. The model has only 1.82 M parameters and a computational cost of 3.2 GFLOPs, with inference speeds of 168.4 FPS on GPU and 28.9 FPS on CPU, outperforming mainstream models in both lightweight design and real-time performance. The precision difference between the two tray configurations was only 1.2%, and the precision retention rate in high-density scenes was 98.73%. The model achieves a strong balance of detection accuracy, lightweight performance, and scene adaptability, efficiently meeting the needs of embedded equipment and real-time detection in large-scale seedling production and providing technical support for automated replanting. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)
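
The two building blocks named in the abstract, depthwise separable convolution and SE channel attention, can be sketched in PyTorch as below; channel counts, kernel sizes, and the reduction ratio are illustrative and are not taken from the Light-YOLO-Pepper configuration.

```python
import torch
import torch.nn as nn

class DSConv(nn.Module):
    """Depthwise separable convolution: per-channel 3x3 followed by 1x1 mixing."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.depthwise = nn.Conv2d(c_in, c_in, 3, padding=1, groups=c_in)
        self.pointwise = nn.Conv2d(c_in, c_out, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class SE(nn.Module):
    """Squeeze-and-Excitation gate: global pooling, bottleneck MLP, channel reweighting."""
    def __init__(self, c, r=16):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(c, c // r), nn.ReLU(),
                                nn.Linear(c // r, c), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))      # squeeze: global average pool
        return x * w[:, :, None, None]       # excitation: per-channel gating

x = torch.randn(1, 32, 64, 64)
print(SE(32)(DSConv(32, 32)(x)).shape)       # torch.Size([1, 32, 64, 64])
```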

31 pages, 15738 KB  
Article
HiT_DS: A Modular and Physics-Informed Hierarchical Transformer Framework for Spatial Downscaling of Sea Surface Temperature and Height
by Min Wang, Weixuan Liu, Rong Chu, Xidong Wang, Shouxian Zhu and Guanghong Liao
Remote Sens. 2026, 18(2), 292; https://doi.org/10.3390/rs18020292 - 15 Jan 2026
Viewed by 55
Abstract
Recent advances in satellite observations have expanded the use of Sea Surface Temperature (SST) and Sea Surface Height (SSH) data in climate and oceanography, yet their low spatial resolution limits fine-scale analyses. We propose HiT_DS, a modular hierarchical Transformer framework for high-resolution downscaling of SST and SSH fields. To address challenges in multiscale feature representation and physical consistency, HiT_DS integrates three key modules: (1) Enhanced Dual Feature Extraction (E-DFE), which employs depth-wise separable convolutions to improve local feature modeling efficiently; (2) Gradient-Aware Attention (GA), which emphasizes dynamically important high-gradient structures such as oceanic fronts; and (3) Physics-Informed Loss Functions, which promote physical realism and dynamical consistency in the reconstructed fields. Experiments across two dynamically distinct oceanic regions demonstrate that HiT_DS achieves improved reconstruction accuracy and enhanced physical fidelity, with selective module combinations tailored to regional dynamical conditions. This framework provides an effective and extensible approach for oceanographic data downscaling. Full article
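
One plausible reading of the gradient-aware idea is a loss weighting that emphasizes high-gradient structures such as fronts; the sketch below weights an L1 reconstruction loss by the magnitude of the target field's spatial gradient. The weighting form and the alpha parameter are assumptions for illustration, not the paper's GA attention or physics-informed losses.

```python
import torch
import torch.nn.functional as F

def gradient_weighted_l1(pred, target, alpha=1.0):
    # Finite-difference gradient magnitude of the target field, padded back to size.
    gy = target[..., 1:, :] - target[..., :-1, :]
    gx = target[..., :, 1:] - target[..., :, :-1]
    grad_mag = F.pad(gy.abs(), (0, 0, 0, 1)) + F.pad(gx.abs(), (0, 1, 0, 0))
    weight = 1.0 + alpha * grad_mag / (grad_mag.amax() + 1e-8)
    return (weight * (pred - target).abs()).mean()

sst_hr = torch.randn(1, 1, 64, 64)          # placeholder high-resolution SST field
sst_pred = torch.randn(1, 1, 64, 64)        # placeholder network output
print(gradient_weighted_l1(sst_pred, sst_hr))
```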

21 pages, 23946 KB  
Article
Infrared Image Denoising Algorithm Based on Wavelet Transform and Self-Attention Mechanism
by Hongmei Li, Yang Zhang, Luxia Yang and Hongrui Zhang
Sensors 2026, 26(2), 523; https://doi.org/10.3390/s26020523 - 13 Jan 2026
Viewed by 104
Abstract
Infrared images are often degraded by complex noise due to hardware and environmental factors, posing challenges for subsequent processing and target detection. To overcome the shortcomings of existing denoising methods in balancing noise removal and detail preservation, this paper proposes a Wavelet Transform Enhanced Infrared Denoising Model (WTEIDM). Firstly, a Wavelet Transform Self-Attention (WTSA) module is designed, which combines the frequency-domain decomposition ability of the discrete wavelet transform (DWT) with the dynamic weighting mechanism of self-attention to achieve effective separation of noise and detail. Secondly, a Multi-Scale Gated Linear Unit (MSGLU) is devised to improve the ability to capture detail information and dynamically control features through a dual-branch multi-scale depth-wise convolution and a gating strategy. Finally, a Parallel Hybrid Attention Module (PHAM) is proposed to enhance the cross-dimensional feature fusion effect through the parallel cross-interaction of spatial and channel attention. Extensive experiments are conducted on five infrared datasets under different noise levels (σ = 15, 25, and 50). The results demonstrate that the proposed WTEIDM outperforms several state-of-the-art denoising algorithms on both PSNR and SSIM metrics, confirming its superior generalization capability and robustness. Full article
(This article belongs to the Section Sensing and Imaging)
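
To illustrate only the wavelet part of the design, the sketch below performs a single-level 2-D discrete wavelet transform with PyWavelets, rescales the detail sub-bands with fixed weights standing in for learned attention, and reconstructs the image; the Haar wavelet and the weights are assumptions for illustration, not the WTSA module.

```python
import numpy as np
import pywt

img = np.random.default_rng(3).normal(size=(256, 256))   # placeholder noisy IR image

# Single-level 2-D DWT: low-frequency approximation plus H/V/D detail bands.
cA, (cH, cV, cD) = pywt.dwt2(img, "haar")

weights = {"H": 0.5, "V": 0.5, "D": 0.2}    # stand-ins for attention-derived weights
denoised = pywt.idwt2((cA, (weights["H"] * cH,
                            weights["V"] * cV,
                            weights["D"] * cD)), "haar")
print(denoised.shape)                        # (256, 256)
```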

18 pages, 5889 KB  
Article
High-Resolution Mapping Coastal Wetland Vegetation Using Frequency-Augmented Deep Learning Method
by Ning Gao, Xinyuan Du, Peng Xu, Erding Gao and Yixin Yang
Remote Sens. 2026, 18(2), 247; https://doi.org/10.3390/rs18020247 - 13 Jan 2026
Viewed by 89
Abstract
Coastal wetland vegetation exhibits pronounced spectral mixing, complex mosaic spatial patterns, and small target sizes, posing considerable challenges for fine-grained classification in high-resolution UAV imagery. At present, remote sensing classification of ground objects based on deep learning mainly relies on spectral and structural features, while the frequency domain features of ground objects are not fully considered. To address these issues, this study proposes a vegetation classification model that integrates spatial-domain and frequency-domain features. The model enhances global contextual modeling through a large-kernel convolution branch, while a frequency-domain interaction branch separates and fuses low-frequency structural information with high-frequency details. In addition, a shallow auxiliary supervision module is introduced to improve local detail learning and stabilize training. With a compact parameter scale suitable for real-world deployment, the proposed framework effectively adapts to high-resolution remote sensing scenarios. Experiments on typical coastal wetland vegetation including Reeds, Spartina alterniflora, and Suaeda salsa demonstrate that the proposed method consistently outperforms representative segmentation models such as UNet, DeepLabV3, TransUNet, SegFormer, D-LinkNet, and MCCA across multiple metrics including Accuracy, Recall, F1 Score, and mIoU. Overall, the results show that the proposed model effectively addresses the challenges of subtle spectral differences, pervasive species mixture, and intricate structural details, offering a robust and efficient solution for UAV-based wetland vegetation mapping and ecological monitoring. Full article
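
A minimal sketch of the frequency-domain separation idea: a centered circular mask on the 2-D FFT keeps low-frequency structure, and the residual carries high-frequency detail. The cut-off radius and image size are illustrative, and this is not the paper's frequency-interaction branch.

```python
import numpy as np

img = np.random.default_rng(4).normal(size=(128, 128))   # placeholder image band
spec = np.fft.fftshift(np.fft.fft2(img))

yy, xx = np.mgrid[:128, :128]
low_mask = np.hypot(yy - 64, xx - 64) <= 16  # assumed cut-off radius

low = np.fft.ifft2(np.fft.ifftshift(spec * low_mask)).real   # structural component
high = img - low                                             # fine-detail residual
print(low.std(), high.std())
```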

21 pages, 15751 KB  
Article
Fault Diagnosis of Gearbox Bearings Based on Multi-Feature Fusion Dual-Channel CNN-Transformer-CAM
by Lihai Chen, Yonghui He, Ao Tan, Xiaolong Bai, Zhenshui Li and Xiaoqiang Wang
Machines 2026, 14(1), 92; https://doi.org/10.3390/machines14010092 - 13 Jan 2026
Viewed by 231
Abstract
As a core component of the gearbox, bearings are crucial to the stability and reliability of the transmission system. However, dynamic variations in operating conditions and complex noise interference limit the ability of existing fault diagnosis methods to process non-stationary signals and capture complex features. To address these challenges, this paper proposes a bearing fault diagnosis method based on a multi-feature fusion dual-channel CNN-Transformer-CAM framework. The model cross-fuses the two-dimensional feature images from the Gramian Angular Difference Field (GADF) and the Generalized S Transform (GST), preserving complete time–frequency domain information. First, a dual-channel parallel convolutional structure is employed to separately sample the GST maps and the GADF maps, enriching fault information from different dimensions and effectively enhancing the model’s feature extraction capability. Subsequently, a Transformer structure is introduced at the backend of the convolutional neural network to strengthen the representation and analysis of complex time–frequency features. Finally, a cross-attention mechanism is applied to dynamically adjust features from the two channels, achieving adaptive weighted fusion. Test results demonstrate that, under conditions of noise interference, limited samples, and multiple operating states, the proposed method can accurately assess bearing fault conditions. Full article
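
The cross-attention fusion step can be sketched with PyTorch's built-in multi-head attention, letting tokens from one branch query the other; the token count and embedding size are placeholders, not the paper's dual-channel architecture.

```python
import torch
import torch.nn as nn

gadf_tokens = torch.randn(1, 49, 256)        # stand-in tokens from the GADF branch
gst_tokens = torch.randn(1, 49, 256)         # stand-in tokens from the GST branch

cross_attn = nn.MultiheadAttention(embed_dim=256, num_heads=4, batch_first=True)
fused, _ = cross_attn(query=gadf_tokens, key=gst_tokens, value=gst_tokens)

print(fused.shape)                           # torch.Size([1, 49, 256])
```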

24 pages, 5571 KB  
Article
Bearing Fault Diagnosis Based on a Depthwise Separable Atrous Convolution and ASPP Hybrid Network
by Xiaojiao Gu, Chuanyu Liu, Jinghua Li, Xiaolin Yu and Yang Tian
Machines 2026, 14(1), 93; https://doi.org/10.3390/machines14010093 - 13 Jan 2026
Viewed by 86
Abstract
To address the computational redundancy, inadequate multi-scale feature capture, and poor noise robustness of traditional deep networks used for bearing vibration and acoustic signal feature extraction, this paper proposes a fault diagnosis method based on Depthwise Separable Atrous Convolution (DSAC) and Atrous Spatial Pyramid Pooling (ASPP). First, the Continuous Wavelet Transform (CWT) is applied to the vibration and acoustic signals to convert them into time–frequency representations. The vibration CWT is then fed into a multi-scale feature extraction module to obtain preliminary vibration features, whereas the acoustic CWT is processed by a Deep Residual Shrinkage Network (DRSN). The two feature streams are concatenated in a feature fusion module and subsequently fed into the DSAC and ASPP modules, which together expand the effective receptive field and aggregate multi-scale contextual information. Finally, global pooling followed by a classifier outputs the bearing fault category, enabling high-precision bearing fault identification. Experimental results show that, under both clean data and multiple low signal-to-noise ratio (SNR) noise conditions, the proposed DSAC-ASPP method achieves higher accuracy and lower variance than baselines such as ResNet, VGG, and MobileNet, while requiring fewer parameters and FLOPs and exhibiting superior robustness and deployability. Full article
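
A hedged PyTorch sketch of the two named modules: a depthwise separable atrous convolution and a small ASPP-style block with parallel dilation rates; the rates and channel counts are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class DSAtrousConv(nn.Module):
    """Depthwise separable atrous convolution with a given dilation rate."""
    def __init__(self, c, dilation):
        super().__init__()
        self.depthwise = nn.Conv2d(c, c, 3, padding=dilation,
                                   dilation=dilation, groups=c)
        self.pointwise = nn.Conv2d(c, c, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class MiniASPP(nn.Module):
    """Parallel atrous branches at several rates, concatenated and projected."""
    def __init__(self, c, rates=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(DSAtrousConv(c, r) for r in rates)
        self.project = nn.Conv2d(c * len(rates), c, 1)

    def forward(self, x):
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))

x = torch.randn(1, 16, 64, 64)
print(MiniASPP(16)(x).shape)                 # torch.Size([1, 16, 64, 64])
```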

17 pages, 710 KB  
Article
KD-SecBERT: A Knowledge-Distilled Bidirectional Encoder Optimized for Open-Source Software Supply Chain Security in Smart Grid Applications
by Qinman Li, Xixiang Zhang, Weiming Liao, Tao Dai, Hongliang Zheng, Beiya Yang and Pengfei Wang
Electronics 2026, 15(2), 345; https://doi.org/10.3390/electronics15020345 - 13 Jan 2026
Viewed by 145
Abstract
With the acceleration of digital transformation, open-source software has become a fundamental component of modern smart grids and other critical infrastructures. However, the complex dependency structures of open-source ecosystems and the continuous emergence of vulnerabilities pose substantial challenges to software supply chain security. In power information networks and cyber–physical control systems, vulnerabilities in open-source components integrated into Supervisory Control and Data Acquisition (SCADA), Energy Management System (EMS), and Distribution Management System (DMS) platforms and distributed energy controllers may propagate along the supply chain, threatening system security and operational stability. In such application scenarios, large language models (LLMs) often suffer from limited semantic accuracy when handling domain-specific security terminology, as well as deployment inefficiencies that hinder their practical adoption in critical infrastructure environments. To address these issues, this paper proposes KD-SecBERT, a domain-specific semantic bidirectional encoder optimized through multi-level knowledge distillation for open-source software supply chain security in smart grid applications. The proposed framework constructs a hierarchical multi-teacher ensemble that integrates general language understanding, cybersecurity-domain knowledge, and code semantic analysis, together with a lightweight student architecture based on depthwise separable convolutions and multi-head self-attention. In addition, a dynamic, multi-dimensional distillation strategy is introduced to jointly perform layer-wise representation alignment, ensemble knowledge fusion, and task-oriented optimization under a progressive curriculum learning scheme. Extensive experiments conducted on a multi-source dataset comprising National Vulnerability Database (NVD) and Common Vulnerabilities and Exposures (CVE) entries, security-related GitHub code, and Open Web Application Security Project (OWASP) test cases show that KD-SecBERT achieves an accuracy of 91.3%, a recall of 90.6%, and an F1-score of 89.2% on vulnerability classification tasks, indicating strong robustness in recognizing both common and low-frequency security semantics. These results demonstrate that KD-SecBERT provides an effective and practical solution for semantic analysis and software supply chain risk assessment in smart grids and other critical-infrastructure environments. Full article
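
The distillation objective at the heart of such a student model is typically a temperature-softened KL term against the teacher plus a hard-label cross-entropy; the sketch below shows that standard form with an assumed temperature, mixing weight, and class count, which is far simpler than the paper's multi-teacher, multi-level scheme.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: KL between temperature-softened student and teacher distributions.
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)
    # Hard targets: ordinary cross-entropy on the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

s = torch.randn(8, 5)                        # student logits for 5 assumed classes
t = torch.randn(8, 5)                        # teacher logits
y = torch.randint(0, 5, (8,))
print(distillation_loss(s, t, y))
```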

31 pages, 4778 KB  
Article
ESCFM-YOLO: Lightweight Dual-Stream Architecture for Real-Time Small-Scale Fire Smoke Detection on Edge Devices
by Jong-Chan Park, Myeongjun Kim, Sang-Min Choi and Gun-Woo Kim
Appl. Sci. 2026, 16(2), 778; https://doi.org/10.3390/app16020778 - 12 Jan 2026
Viewed by 121
Abstract
Early detection of small-scale fires is crucial for minimizing damage and enabling rapid emergency response. While recent deep learning-based fire detection systems have achieved high accuracy, they still face three key challenges: (1) limited deployability in resource-constrained edge environments due to high computational costs, (2) performance degradation caused by feature interference when jointly learning flame and smoke features in a single backbone, and (3) low sensitivity to small flames and thin smoke in the initial stages. To address these issues, we propose a lightweight dual-stream fire detection architecture based on YOLOv5n, which learns flame and smoke features separately to improve both accuracy and efficiency under strict edge constraints. The proposed method integrates two specialized attention modules: ESCFM++, which enhances spatial and channel discrimination for sharp boundaries and local flame structures (flame), and ESCFM-RS, which captures low-contrast, diffuse smoke patterns through depthwise convolutions and residual scaling (smoke). On the D-Fire dataset, the flame detector achieved 74.5% mAP@50 with only 1.89 M parameters, while the smoke detector achieved 89.2% mAP@50. When deployed on an NVIDIA Jetson Xavier NX (NVIDIA Corporation, Santa Clara, CA, USA), the system achieved 59.7 FPS (single-stream) and 28.3 FPS (dual-stream) with GPU utilization below 90% and power consumption under 17 W. Under identical on-device conditions, it outperforms YOLOv9t and YOLOv12n by 36–62% in FPS and 0.7–2.0% in detection accuracy. We further validate deployment via outdoor day/night long-range live-stream tests on Jetson using our flame detector, showing reliable capture of small, distant flames that appear as tiny cues on the screen, particularly in challenging daytime scenes. Overall, these results demonstrate that modality-specific stream specialization and ESCFM attention reduce feature interference while improving detection accuracy and computational efficiency for real-time edge-device fire monitoring. Full article
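
As a rough sketch of the depthwise-convolution-plus-residual-scaling idea attributed to the smoke branch, the module below applies a depthwise/pointwise pair and adds it back with a small fixed scale; the scale value and channel count are assumptions, not the ESCFM-RS design.

```python
import torch
import torch.nn as nn

class DWResidualScale(nn.Module):
    def __init__(self, c, scale=0.2):
        super().__init__()
        self.dw = nn.Conv2d(c, c, 3, padding=1, groups=c)   # depthwise convolution
        self.pw = nn.Conv2d(c, c, 1)                        # pointwise mixing
        self.scale = scale

    def forward(self, x):
        return x + self.scale * self.pw(self.dw(x))         # residual scaling

print(DWResidualScale(32)(torch.randn(1, 32, 80, 80)).shape)  # (1, 32, 80, 80)
```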

23 pages, 6446 KB  
Article
Lightweight GAFNet Model for Robust Rice Pest Detection in Complex Agricultural Environments
by Yang Zhou, Wanqiang Huang, Benjing Liu, Tianhua Chen, Jing Wang, Qiqi Zhang and Tianfu Yang
AgriEngineering 2026, 8(1), 26; https://doi.org/10.3390/agriengineering8010026 - 10 Jan 2026
Viewed by 176
Abstract
To address challenges such as small target size, high density, severe occlusion, complex background interference, and edge device computational constraints, a lightweight model, GAFNet, is proposed based on YOLO11n, optimized for rice pest detection in field environments. To improve feature perception, we propose the Global Attention Fusion and Spatial Pyramid Pooling (GAM-SPP) module, which captures global context and aggregates multi-scale features. Building on this, we introduce the C3-Efficient Feature Selection Attention (C3-EFSA) module, which refines feature representation by combining depthwise separable convolutions (DWConv) with lightweight channel attention to enhance background discrimination. The model’s detection head, Enhanced Ghost Detect (EGDetect), integrates Enhanced Ghost Convolution (EGConv), Squeeze-and-Excitation (SE), and Sigmoid-Weighted Linear Unit (SiLU) activation, reducing redundancy. Additionally, we propose the Focal-Enhanced Complete-IoU (FECIoU) loss function, incorporating stability and hard-sample weighting for improved localization. Compared to YOLO11n, GAFNet improves Precision, Recall, and mean Average Precision (mAP) by 3.5%, 4.2%, and 1.6%, respectively, while reducing parameters and computation by 5% and 21%. GAFNet can be deployed on edge devices, providing farmers with instant pest alerts. Further, GAFNet is evaluated on the AgroPest-12 dataset, demonstrating enhanced generalization and robustness across diverse pest detection scenarios. Overall, GAFNet provides an efficient, reliable, and sustainable solution for early pest detection, precision pesticide application, and eco-friendly pest control, advancing the future of smart agriculture. Full article
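
Ghost-style convolutions generate part of the output channels with a cheap depthwise operation on a smaller primary feature map; the sketch below shows that generic idea with an assumed 1:1 split, not the paper's EGConv.

```python
import torch
import torch.nn as nn

class GhostConv(nn.Module):
    def __init__(self, c_in, c_out):
        super().__init__()
        c_primary = c_out // 2
        self.primary = nn.Conv2d(c_in, c_primary, 1)         # ordinary 1x1 features
        self.cheap = nn.Conv2d(c_primary, c_out - c_primary, 3,
                               padding=1, groups=c_primary)  # cheap depthwise "ghosts"

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)

print(GhostConv(16, 32)(torch.randn(1, 16, 40, 40)).shape)   # (1, 32, 40, 40)
```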

22 pages, 4804 KB  
Article
SER-YOLOv8: An Early Forest Fire Detection Model Integrating Multi-Path Attention and NWD
by Juan Liu, Jiaxin Feng, Shujie Wang, Yian Ding, Jianghua Guo, Yuhang Li, Wenxuan Xue and Jie Hu
Forests 2026, 17(1), 93; https://doi.org/10.3390/f17010093 - 10 Jan 2026
Viewed by 128
Abstract
Forest ecosystems, as vital natural resources, are increasingly endangered by wildfires. Effective forest fire management relies on the accurate and early detection of small-scale flames and smoke. However, the complex and dynamic forest environment, along with the small size and irregular shape of early fire indicators, poses significant challenges to reliable early warning systems. To address these issues, this paper introduces SER-YOLOv8, an enhanced detection model based on the YOLOv8 architecture. The model incorporates the RepNCSPELAN4 module and an SPPELAN structure to strengthen multi-scale feature representation. Furthermore, to improve small target localization, the Normalized Wasserstein Distance (NWD) loss is adopted, providing a more robust similarity measure than traditional IoU-based losses. The newly designed SERDet module deeply integrates a multi-scale feature extraction mechanism with a multi-path fused attention mechanism, significantly enhancing the recognition capability for flame targets under complex backgrounds. Depthwise separable convolution (DWConv) is utilized to reduce parameters and boost inference efficiency. Experiments on the M4SFWD dataset show that the proposed method improves mAP50 by 1.2% for flames and 2.4% for smoke, with a 1.5% overall gain in mAP50-95 over the baseline YOLOv8, outperforming existing mainstream models and offering a reliable solution for forest fire prevention. Full article
(This article belongs to the Section Natural Hazards and Risk Management)
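
One common formulation of the Normalized Wasserstein Distance for tiny boxes treats each box (cx, cy, w, h) as a 2-D Gaussian and maps the 2-Wasserstein distance through an exponential; the sketch below follows that formulation with an assumed normalizing constant C, and may differ in detail from the loss used in the paper.

```python
import numpy as np

def nwd(box_a, box_b, C=12.8):
    # Gaussian parameters: centre plus half-width/half-height as standard deviations.
    a = np.array([box_a[0], box_a[1], box_a[2] / 2, box_a[3] / 2])
    b = np.array([box_b[0], box_b[1], box_b[2] / 2, box_b[3] / 2])
    w2 = np.sum((a - b) ** 2)                # squared 2-Wasserstein distance
    return np.exp(-np.sqrt(w2) / C)          # similarity in (0, 1]

print(nwd((10, 10, 4, 4), (11, 10, 4, 4)))   # close to 1 for nearly identical boxes
```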

20 pages, 6475 KB  
Article
Rolling Element Bearing Fault Diagnosis Based on Adversarial Autoencoder Network
by Wenbin Zhang, Xianyun Zhang and Han Xu
Processes 2026, 14(2), 245; https://doi.org/10.3390/pr14020245 - 10 Jan 2026
Viewed by 171
Abstract
Rolling bearing fault diagnosis is critical for the reliable operation of rotating machinery. However, many existing deep learning-based methods rely on complex signal preprocessing and lack interpretability. This paper proposes an adversarial autoencoder (AAE)-based framework that integrates adaptive, data-driven signal decomposition directly into a neural network. A convolutional autoencoder is employed to extract latent representations while preserving temporal resolution, enabling encoder channels to be interpreted as nonlinear signal components. A channel attention mechanism adaptively reweights these components, and a classifier acts as a discriminator to enhance class separability. The model is trained in an end-to-end manner by jointly optimizing reconstruction and classification objectives. Experiments on three benchmark datasets demonstrate that the proposed method achieves high diagnostic accuracy (99.64 ± 0.29%) without additional signal preprocessing and outperforms several representative deep learning-based methods. Moreover, the learned representations exhibit interpretable characteristics analogous to classical envelope demodulation, confirming the effectiveness and interpretability of the proposed approach. Full article
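
A minimal sketch of the idea of a convolutional autoencoder whose latent channels are reweighted by channel attention before decoding; the kernel size, channel count, and signal length are illustrative, and the adversarial classifier branch is omitted.

```python
import torch
import torch.nn as nn

class AttnAE(nn.Module):
    def __init__(self, c_latent=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv1d(1, c_latent, 9, padding=4), nn.ReLU())
        self.attn = nn.Sequential(nn.Linear(c_latent, c_latent), nn.Sigmoid())
        self.decoder = nn.Conv1d(c_latent, 1, 9, padding=4)

    def forward(self, x):
        z = self.encoder(x)                  # latent "components" at full resolution
        w = self.attn(z.mean(dim=2))         # channel-attention weights
        z = z * w[:, :, None]                # reweight the latent components
        return self.decoder(z), z

signal = torch.randn(1, 1, 2048)             # placeholder vibration segment
recon, latent = AttnAE()(signal)
print(recon.shape, latent.shape)             # (1, 1, 2048) and (1, 8, 2048)
```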

19 pages, 2336 KB  
Article
A Lightweight Upsampling and Cross-Modal Feature Fusion-Based Algorithm for Small-Object Detection in UAV Imagery
by Jianglei Gong, Zhe Yuan, Wenxing Li, Weiwei Li, Yanjie Guo and Baolong Guo
Electronics 2026, 15(2), 298; https://doi.org/10.3390/electronics15020298 - 9 Jan 2026
Viewed by 150
Abstract
Small-object detection in UAV remote sensing faces common challenges such as tiny target size, blurred features, and severe background interference. Furthermore, single imaging modalities exhibit limited representation capability in complex environments. To address these issues, this paper proposes CTU-YOLO, a UAV-based small-object detection algorithm built upon cross-modal feature fusion and lightweight upsampling. The algorithm incorporates a dynamic and adaptive cross-modal feature fusion (DCFF) module, which achieves efficient feature alignment and fusion by combining frequency-domain analysis with convolutional operations. Additionally, a lightweight upsampling module (LUS) is introduced, integrating dynamic sampling and depthwise separable convolution to enhance the recovery of fine details for small objects. Experiments on the DroneVehicle and LLVIP datasets demonstrate that CTU-YOLO achieves 73.9% mAP on DroneVehicle and 96.9% AP on LLVIP, outperforming existing mainstream methods. Meanwhile, the model has only 4.2 MB of parameters and a computational cost of 13.8 GFLOPs, with inference speeds reaching 129.9 FPS on DroneVehicle and 135.1 FPS on LLVIP, exhibiting an excellent lightweight design and real-time performance while maintaining high accuracy. Ablation studies confirm that both the DCFF and LUS modules contribute significantly to performance gains. Visualization analysis further indicates that the proposed method can accurately preserve the structure of small objects even under nighttime, low-light, and multi-scale background conditions, demonstrating strong robustness. Full article
(This article belongs to the Special Issue AI-Driven Image Processing: Theory, Methods, and Applications)
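
As a hedged illustration of a lightweight upsampling block built from a depthwise separable convolution, the sketch below combines it with pixel shuffle for 2x upsampling; this is an assumption-level stand-in, not the paper's LUS module with dynamic sampling.

```python
import torch
import torch.nn as nn

class LightUpsample(nn.Module):
    def __init__(self, c, scale=2):
        super().__init__()
        self.dw = nn.Conv2d(c, c, 3, padding=1, groups=c)    # depthwise convolution
        self.pw = nn.Conv2d(c, c * scale * scale, 1)         # pointwise expansion
        self.shuffle = nn.PixelShuffle(scale)                # rearrange to 2x resolution

    def forward(self, x):
        return self.shuffle(self.pw(self.dw(x)))

print(LightUpsample(16)(torch.randn(1, 16, 32, 32)).shape)   # (1, 16, 64, 64)
```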
