Search Results (1,189)

Search Parameters:
Keywords = lightweight attention mechanism

17 pages, 586 KiB  
Article
An Accurate and Efficient Diabetic Retinopathy Diagnosis Method via Depthwise Separable Convolution and Multi-View Attention Mechanism
by Qing Yang, Ying Wei, Fei Liu and Zhuang Wu
Appl. Sci. 2025, 15(17), 9298; https://doi.org/10.3390/app15179298 - 24 Aug 2025
Abstract
Diabetic retinopathy (DR), a critical ocular disease that can lead to blindness, demands early and accurate diagnosis to prevent vision loss. Current automated DR diagnosis methods face two core challenges: first, subtle early lesions such as microaneurysms are often missed due to insufficient feature extraction; second, there is a persistent trade-off between model accuracy and efficiency, as lightweight architectures often sacrifice precision for real-time performance, while high-accuracy models are computationally expensive and difficult to deploy on resource-constrained edge devices. To address these issues, this study presents a novel deep learning framework integrating depthwise separable convolution and a multi-view attention mechanism (MVAM) for efficient DR diagnosis using retinal images. The framework employs multi-scale feature fusion via parallel 3 × 3 and 5 × 5 convolutions to capture lesions of varying sizes and incorporates Gabor filters to enhance vascular texture and directional lesion modeling, improving sensitivity to early structural abnormalities while reducing computational costs. Experimental results on both the diabetic retinopathy (DR) dataset and ocular disease (OD) dataset demonstrate the superiority of the proposed method: it achieves a high accuracy of 0.9697 on the DR dataset and 0.9669 on the OD dataset, outperforming traditional methods such as CNN_eye, VGG, and U-Net by more than 1 percentage point. Moreover, its training time is only half that of U-Net (on the DR dataset) and VGG (on the OD dataset), highlighting its potential for clinical DR screening. Full article
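As a concrete illustration of the two building blocks named in this abstract, the PyTorch sketch below pairs a depthwise separable convolution with parallel 3 × 3 and 5 × 5 branches; the module names, channel counts, and the 1 × 1 fusion layer are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise conv followed by a 1x1 pointwise conv."""
    def __init__(self, in_ch, out_ch, k):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, k, padding=k // 2, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class MultiScaleFusion(nn.Module):
    """Parallel 3x3 and 5x5 branches whose outputs are concatenated and fused,
    capturing lesions of different sizes (illustrative sketch)."""
    def __init__(self, in_ch, branch_ch):
        super().__init__()
        self.branch3 = DepthwiseSeparableConv(in_ch, branch_ch, 3)
        self.branch5 = DepthwiseSeparableConv(in_ch, branch_ch, 5)
        self.fuse = nn.Conv2d(2 * branch_ch, branch_ch, 1)

    def forward(self, x):
        return self.fuse(torch.cat([self.branch3(x), self.branch5(x)], dim=1))

x = torch.randn(1, 32, 64, 64)            # a dummy retinal feature map
print(MultiScaleFusion(32, 64)(x).shape)  # torch.Size([1, 64, 64, 64])
```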
20 pages, 6878 KiB  
Article
EMR-YOLO: A Multi-Scale Benthic Organism Detection Algorithm for Degraded Underwater Visual Features and Computationally Constrained Environments
by Dehua Zou, Songhao Zhao, Jingchun Zhou, Guangqiang Liu, Zhiying Jiang, Minyi Xu, Xianping Fu and Siyuan Liu
J. Mar. Sci. Eng. 2025, 13(9), 1617; https://doi.org/10.3390/jmse13091617 - 24 Aug 2025
Abstract
Marine benthic organism detection (BOD) is essential for underwater robotics and seabed resource management but suffers from motion blur, perspective distortion, and background clutter in dynamic underwater environments. To address visual feature degradation and computational constraints, in this paper we introduce EMR-YOLO, a deep learning-based multi-scale BOD method. To handle the diverse sizes and morphologies of benthic organisms, we propose an Efficient Detection Sparse Head (EDSHead), which combines a unified attention mechanism and dynamic sparse operators to enhance spatial modeling. For robust feature extraction under resource limitations, we design a lightweight Multi-Branch Fusion Downsampling (MBFDown) module that utilizes cross-stage feature fusion and a multi-branch architecture to capture rich gradient information. Additionally, a Regional Two-Level Routing Attention (RTRA) mechanism is developed to mitigate background noise and sharpen focus on target regions. The experimental results demonstrate that EMR-YOLO achieves improvements of 2.33%, 1.50%, and 4.12% in AP, AP50, and AP75, respectively, outperforming state-of-the-art methods while maintaining efficiency. Full article
38 pages, 4775 KiB  
Article
Sparse-MoE-SAM: A Lightweight Framework Integrating MoE and SAM with a Sparse Attention Mechanism for Plant Disease Segmentation in Resource-Constrained Environments
by Benhan Zhao, Xilin Kang, Hao Zhou, Ziyang Shi, Lin Li, Guoxiong Zhou, Fangying Wan, Jiangzhang Zhu, Yongming Yan, Leheng Li and Yulong Wu
Plants 2025, 14(17), 2634; https://doi.org/10.3390/plants14172634 - 24 Aug 2025
Abstract
Plant disease segmentation has achieved significant progress with the help of artificial intelligence. However, deploying high-accuracy segmentation models in resource-limited settings faces three key challenges, as follows: (A) Traditional dense attention mechanisms incur quadratic computational complexity growth (O(n²d)), rendering them ill-suited for low-power hardware. (B) Naturally sparse spatial distributions and large-scale variations in the lesions on leaves necessitate models that concurrently capture long-range dependencies and local details. (C) Complex backgrounds and variable lighting in field images often induce segmentation errors. To address these challenges, we propose Sparse-MoE-SAM, an efficient framework based on an enhanced Segment Anything Model (SAM). This deep learning framework integrates sparse attention mechanisms with a two-stage mixture of experts (MoE) decoder. The sparse attention dynamically activates key channels aligned with lesion sparsity patterns, reducing self-attention complexity while preserving long-range context. Stage 1 of the MoE decoder performs coarse-grained boundary localization; Stage 2 achieves fine-grained segmentation by leveraging specialized experts within the MoE, significantly enhancing edge discrimination accuracy. The expert repository—comprising standard convolutions, dilated convolutions, and depthwise separable convolutions—dynamically routes features through optimized processing paths based on input texture and lesion morphology. This enables robust segmentation across diverse leaf textures and plant developmental stages. Further, we design a sparse attention-enhanced Atrous Spatial Pyramid Pooling (ASPP) module to capture multi-scale contexts for both extensive lesions and small spots. Evaluations on three heterogeneous datasets (PlantVillage Extended, CVPPP, and our self-collected field images) show that Sparse-MoE-SAM achieves a mean Intersection-over-Union (mIoU) of 94.2%—surpassing standard SAM by 2.5 percentage points—while reducing computational costs by 23.7% compared to the original SAM baseline. The model also demonstrates balanced performance across disease classes and enhanced hardware compatibility. Our work validates that integrating sparse attention with MoE mechanisms sustains accuracy while drastically lowering computational demands, enabling the scalable deployment of plant disease segmentation models on mobile and edge devices. Full article
(This article belongs to the Special Issue Advances in Artificial Intelligence for Plant Research)
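The abstract describes sparse attention that "dynamically activates key channels". A minimal way to express that idea is a top-k channel gate, sketched below; the scoring layer, the choice of k, and all names are assumptions rather than the paper's design.

```python
import torch
import torch.nn as nn

class TopKChannelGate(nn.Module):
    """Keep only the k highest-scoring channels per sample and zero the rest,
    a crude stand-in for 'dynamically activating key channels'."""
    def __init__(self, channels, k):
        super().__init__()
        self.score = nn.Linear(channels, channels)
        self.k = k

    def forward(self, x):                        # x: (B, C, H, W)
        pooled = x.mean(dim=(2, 3))              # (B, C) global descriptor
        scores = self.score(pooled)              # per-channel importance
        topk = scores.topk(self.k, dim=1).indices
        mask = torch.zeros_like(scores).scatter_(1, topk, 1.0)
        gate = torch.sigmoid(scores) * mask      # sparse channel weights
        return x * gate[:, :, None, None]

x = torch.randn(2, 256, 32, 32)
y = TopKChannelGate(256, k=64)(x)                # only 64 of 256 channels pass
```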
18 pages, 917 KiB  
Article
ATA-MSTF-Net: An Audio Texture-Aware MultiSpectro-Temporal Attention Fusion Network
by Yubo Su, Haolin Wang, Zhihao Xu, Chengxi Yin, Fucheng Chen and Zhaoguo Wang
Mathematics 2025, 13(17), 2719; https://doi.org/10.3390/math13172719 - 24 Aug 2025
Abstract
Unsupervised anomalous sound detection (ASD) models the normal sounds of machinery through classification operations, thereby identifying anomalies by quantifying deviations. Most recent approaches adopt depthwise separable modules from MobileNetV2. Extensive studies demonstrate that squeeze-and-excitation (SE) modules can enhance model fitting by dynamically weighting input features to adjust output distributions. However, we observe that conventional SE modules fail to adapt to the complex spectral textures of audio data. To address this, we propose an Audio Texture Attention (ATA) specifically designed for machine noise data, improving model robustness. Additionally, we integrate an LSTM layer and refine the temporal feature extraction architecture to strengthen the model’s sensitivity to sequential noise patterns. Experimental results on the DCASE 2020 Challenge Task 2 dataset show that our method achieves state-of-the-art performance, with AUC, pAUC, and mAUC scores of 96.15%, 90.58%, and 90.63%, respectively. Full article
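For readers unfamiliar with the squeeze-and-excitation (SE) module this abstract builds on, the sketch below shows the standard SE formulation (global average pooling followed by a bottleneck that re-weights channels). The proposed Audio Texture Attention itself is not specified in the abstract, so only the baseline SE block is shown; the spectrogram shape is an assumption.

```python
import torch
import torch.nn as nn

class SqueezeExcitation(nn.Module):
    """Standard SE block: squeeze with global average pooling, then excite
    with a two-layer bottleneck that re-weights each channel."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                 # x: (B, C, H, W) feature map
        w = self.fc(x.mean(dim=(2, 3)))   # per-channel weights in (0, 1)
        return x * w[:, :, None, None]

spec = torch.randn(4, 64, 128, 313)       # e.g. a batch of log-mel spectrogram features
out = SqueezeExcitation(64)(spec)
```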
18 pages, 2701 KiB  
Article
YOLOv11-CHBG: A Lightweight Fire Detection Model
by Yushuang Jiang, Peisheng Liu, Yunping Han and Bei Xiao
Fire 2025, 8(9), 338; https://doi.org/10.3390/fire8090338 - 24 Aug 2025
Abstract
Fire is a disaster that seriously threatens people’s lives. Because fires occur suddenly and spread quickly, especially in densely populated places or areas where it is difficult to evacuate quickly, they often cause major property damage and seriously endanger personal safety. Therefore, it is necessary to detect the occurrence of fires accurately and promptly and issue early warnings. This study introduces YOLOv11-CHBG, a novel detection model designed to identify flames and smoke. On the basis of YOLOv11, the C3K2-HFERB module is used in the backbone, the BiAdaGLSA module is proposed in the neck, and the SEAM attention mechanism is added to the detection head, making the proposed model more lightweight and offering potential support for fire rescue efforts. Experimental results show that the model achieves a mean average precision (mAP@0.5) of 78.4% on the Dfire dataset, with a 30.8% reduction in parameters compared to YOLOv11. The model achieves a lightweight design, enhancing its significance for real-time fire and smoke detection, and it provides a research basis for detecting fires earlier, preventing the spread of fires and reducing the harm caused by fires. Full article
24 pages, 1543 KiB  
Article
Intelligent Fault Diagnosis for Rotating Machinery via Transfer Learning and Attention Mechanisms: A Lightweight and Adaptive Approach
by Zhengjie Wang, Xing Yang, Tongjie Li, Lei She, Xuanchen Guo and Fan Yang
Actuators 2025, 14(9), 415; https://doi.org/10.3390/act14090415 - 23 Aug 2025
Abstract
Fault diagnosis under variable operating conditions remains challenging due to the limited adaptability of traditional methods. This paper proposes a transfer learning-based approach for bearing fault diagnosis across different rotational speeds, addressing the critical need for reliable detection in changing industrial environments. The method trains a diagnostic model on labeled source-domain data and transfers it to unlabeled target domains through a two-stage adaptation strategy. First, only the source-domain data are labeled to reflect real-world scenarios where target-domain labels are unavailable. The model architecture combines a convolutional neural network (CNN) for feature extraction with a self-attention mechanism for classification. During source-domain training, the feature extractor parameters are frozen to focus on classifier optimization. When transferring to target domains, the classifier parameters are frozen instead, allowing the feature extractor to adapt to new speed conditions. Experimental validation on the Case Western Reserve University bearing dataset (CWRU), Jiangnan University bearing dataset (JNU), and Southeast University gear and bearing dataset (SEU) demonstrates the method’s effectiveness, achieving accuracies of 99.95%, 99.99%, and 100%, respectively. The proposed method achieves significant model size reduction compared to conventional TL approaches (e.g., DANN and CDAN), with reductions of up to 91.97% and 64%, respectively. Furthermore, we observed a maximum reduction of 61.86% in FLOPs consumption. The results show significant improvement over conventional approaches in maintaining diagnostic performance across varying operational conditions. This study provides a practical solution for industrial applications where equipment operates under non-stationary speeds, offering both computational efficiency and reliable fault detection capabilities. Full article
(This article belongs to the Section Actuators for Manufacturing Systems)
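The two-stage freezing strategy described in the abstract maps directly onto toggling requires_grad in PyTorch. The sketch below assumes a hypothetical extractor/classifier split with placeholder layer sizes; only the freezing pattern follows the abstract.

```python
import torch.nn as nn

# Hypothetical model: a CNN feature extractor plus a classifier head.
model = nn.ModuleDict({
    "extractor": nn.Sequential(nn.Conv1d(1, 16, 64, stride=8), nn.ReLU(),
                               nn.AdaptiveAvgPool1d(32), nn.Flatten()),
    "classifier": nn.Sequential(nn.Linear(16 * 32, 128), nn.ReLU(),
                                nn.Linear(128, 10)),
})

def set_trainable(module, trainable):
    for p in module.parameters():
        p.requires_grad = trainable

# Stage 1 (labeled source domain): freeze the feature extractor and
# optimize only the classifier.
set_trainable(model["extractor"], False)
set_trainable(model["classifier"], True)

# Stage 2 (unlabeled target domain, new speed condition): freeze the
# classifier and let the feature extractor adapt instead.
set_trainable(model["extractor"], True)
set_trainable(model["classifier"], False)
```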
24 pages, 2671 KiB  
Article
CNN–Transformer-Based Model for Maritime Blurred Target Recognition
by Tianyu Huang, Chao Pan, Jin Liu and Zhiwei Kang
Electronics 2025, 14(17), 3354; https://doi.org/10.3390/electronics14173354 - 23 Aug 2025
Abstract
In maritime blurred image recognition, ship collision accidents frequently result from three primary blur types: (1) motion blur from vessel movement in complex sea conditions, (2) defocus blur due to water vapor refraction, and (3) scattering blur caused by sea fog interference. This paper proposes a dual-branch recognition method specifically designed for motion blur, which represents the most prevalent blur type in maritime scenarios. Conventional approaches exhibit constrained computational efficiency and limited adaptability across different modalities. To overcome these limitations, we propose a hybrid CNN–Transformer architecture: the CNN branch captures local blur characteristics, while the enhanced Transformer module models long-range dependencies via attention mechanisms. The CNN branch employs a lightweight ResNet variant, in which conventional residual blocks are substituted with Multi-Scale Gradient-Aware Residual Blocks (MSG-ARB). This architecture employs learnable gradient convolution for explicit local gradient feature extraction and utilizes gradient content gating to strengthen blur-sensitive region representation, significantly improving computational efficiency compared to conventional CNNs. The Transformer branch incorporates a Hierarchical Swin Transformer (HST) framework with Shifted Window-based Multi-head Self-Attention for global context modeling. The proposed method incorporates blur-invariant Positional Encoding (PE) to enhance blur spectrum modeling capability, while employing a DyT (Dynamic Tanh) module with learnable α parameters to replace traditional normalization layers. This architecture achieves a significant reduction in computational costs while preserving feature representation quality. Moreover, it efficiently computes long-range image dependencies using a compact 16 × 16 window configuration. The proposed feature fusion module synergistically integrates CNN-based local feature extraction with Transformer-enabled global representation learning, achieving comprehensive feature modeling across different scales. To evaluate the model’s performance and generalization ability, we conducted comprehensive experiments on four benchmark datasets: VAIS, GoPro, Mini-ImageNet, and Open Images V4. Experimental results show that our method achieves superior classification accuracy compared to state-of-the-art approaches, while simultaneously enhancing inference speed and reducing GPU memory consumption. Ablation studies confirm that the DyT module effectively suppresses outliers and improves computational efficiency, particularly when processing low-quality input data. Full article
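The DyT (Dynamic Tanh) layer mentioned here is commonly written as gamma * tanh(alpha * x) + beta with a learnable scalar alpha; a sketch of that formulation follows, which may differ in details from the authors' variant.

```python
import torch
import torch.nn as nn

class DyT(nn.Module):
    """Dynamic Tanh: replaces a normalization layer with an element-wise tanh
    whose input scale alpha is learned, followed by a per-channel affine
    transform (sketch of the commonly cited gamma * tanh(alpha * x) + beta)."""
    def __init__(self, dim, alpha_init=0.5):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(alpha_init))
        self.gamma = nn.Parameter(torch.ones(dim))
        self.beta = nn.Parameter(torch.zeros(dim))

    def forward(self, x):                  # x: (..., dim) token features
        return self.gamma * torch.tanh(self.alpha * x) + self.beta

tokens = torch.randn(2, 196, 384)          # dummy ViT-style token sequence
out = DyT(384)(tokens)                     # bounded outputs, no batch statistics
```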
26 pages, 5260 KiB  
Article
Blurred Lesion Image Segmentation via an Adaptive Scale Thresholding Network
by Qi Chen, Wenmin Wang, Zhibing Wang, Haomei Jia and Minglu Zhao
Appl. Sci. 2025, 15(17), 9259; https://doi.org/10.3390/app15179259 - 22 Aug 2025
Abstract
Medical image segmentation is crucial for disease diagnosis, as precise results aid clinicians in locating lesion regions. However, lesions often have blurred boundaries and complex shapes, challenging traditional methods in capturing clear edges and impacting accurate localization and complete excision. Small lesions are also critical but prone to detail loss during downsampling, reducing segmentation accuracy. To address these issues, we propose a novel adaptive scale thresholding network (AdSTNet) that acts as a lightweight post-processing network for enhancing sensitivity to lesion edges and cores through a dual-threshold adaptive mechanism. The dual-threshold adaptive mechanism is a key architectural component that includes a main threshold map for core localization and an edge threshold map for more precise boundary detection. AdSTNet is compatible with any segmentation network and introduces only a small computational and parameter cost. Additionally, Spatial Attention and Channel Attention (SACA), the Laplacian operator, and the Fusion Enhancement module are introduced to improve feature processing. SACA enhances spatial and channel attention for core localization; the Laplacian operator retains edge details without added complexity; and the Fusion Enhancement module adopts a concatenation operation and a Convolutional Gated Linear Unit (ConvGLU) to strengthen feature intensities, improving edge and small-lesion segmentation. Experiments show that AdSTNet achieves notable performance gains on the ISIC 2018, BUSI, and Kvasir-SEG datasets. Compared with the original U-Net, our method attains mIoU/mDice of 83.40%/90.24% on ISIC, 71.66%/80.32% on BUSI, and 73.08%/81.91% on Kvasir-SEG. Moreover, similar improvements are observed for the other networks evaluated. Full article
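The Laplacian operator used for edge retention can be applied as a fixed-kernel convolution with no learnable parameters; the sketch below shows that generic step on a dummy lesion probability map and is not tied to AdSTNet's exact pipeline.

```python
import torch
import torch.nn.functional as F

def laplacian_edges(feat):
    """Apply a fixed 3x3 Laplacian kernel depthwise to highlight boundaries;
    it has no learnable parameters, so it adds essentially no model complexity."""
    c = feat.shape[1]
    kernel = torch.tensor([[0., 1., 0.],
                           [1., -4., 1.],
                           [0., 1., 0.]], device=feat.device)
    kernel = kernel.view(1, 1, 3, 3).repeat(c, 1, 1, 1)   # one kernel per channel
    return F.conv2d(feat, kernel, padding=1, groups=c)

prob_map = torch.sigmoid(torch.randn(1, 1, 256, 256))      # e.g. a coarse lesion mask
edge_map = laplacian_edges(prob_map)                        # large values near boundaries
```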
24 pages, 9450 KiB  
Article
Industrial-AdaVAD: Adaptive Industrial Video Anomaly Detection Empowered by Edge Intelligence
by Jie Xiao, Haocheng Shen, Yasan Ding and Bin Guo
Mathematics 2025, 13(17), 2711; https://doi.org/10.3390/math13172711 - 22 Aug 2025
Abstract
The rapid advancement of Artificial Intelligence of Things (AIoT) has driven an urgent demand for intelligent video anomaly detection (VAD) to ensure industrial safety. However, traditional approaches struggle to detect unknown anomalies in complex and dynamic environments due to the scarcity of abnormal samples and limited generalization capabilities. To address these challenges, this paper presents an adaptive VAD framework powered by edge intelligence tailored for resource-constrained industrial settings. Specifically, a lightweight feature extractor is developed by integrating residual networks with channel attention mechanisms, achieving a 58% reduction in model parameters through dense connectivity and output pruning. A multidimensional evaluation strategy is introduced to dynamically select optimal models for deployment on heterogeneous edge devices. To enhance cross-scene adaptability, we propose a multilayer adversarial domain adaptation mechanism that effectively aligns feature distributions across diverse industrial environments. Extensive experiments on a real-world coal mine surveillance dataset demonstrate that the proposed framework achieves an accuracy of 86.7% with an inference latency of 23 ms per frame on edge hardware, improving both detection efficiency and transferability. Full article
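Adversarial domain adaptation of the kind described here is often implemented with a gradient reversal layer feeding a domain discriminator; the sketch below shows that generic pattern. The discriminator width, lambda value, and feature dimension are assumptions, not the paper's multilayer design.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies gradients by -lambda in the
    backward pass, so the feature extractor learns domain-invariant features."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class DomainDiscriminator(nn.Module):
    def __init__(self, feat_dim, lam=1.0):
        super().__init__()
        self.lam = lam
        self.net = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(),
                                 nn.Linear(128, 2))         # source vs. target

    def forward(self, feat):
        return self.net(GradReverse.apply(feat, self.lam))

feats = torch.randn(8, 256, requires_grad=True)             # pooled clip features
domain_logits = DomainDiscriminator(256)(feats)
domain_logits.sum().backward()                               # reversed gradients reach `feats`
```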
22 pages, 5943 KiB  
Article
LiteCOD: Lightweight Camouflaged Object Detection via Holistic Understanding of Local-Global Features and Multi-Scale Fusion
by Abbas Khan, Hayat Ullah and Arslan Munir
AI 2025, 6(9), 197; https://doi.org/10.3390/ai6090197 - 22 Aug 2025
Abstract
Camouflaged object detection (COD) represents one of the most challenging tasks in computer vision, requiring sophisticated approaches to accurately extract objects that seamlessly blend within visually similar backgrounds. While contemporary techniques demonstrate promising detection performance, they predominantly suffer from computational complexity and resource requirements that severely limit their deployment in real-time applications, particularly on mobile devices and edge computing platforms. To address these limitations, we propose LiteCOD, an efficient lightweight framework that integrates local and global perceptions through holistic feature fusion and specially designed efficient attention mechanisms. Our approach achieves superior detection accuracy while maintaining computational efficiency essential for practical deployment, with enhanced feature propagation and minimal computational overhead. Extensive experiments validate LiteCOD’s effectiveness, demonstrating that it surpasses existing lightweight methods with average improvements of 7.55% in the F-measure and 8.08% overall performance gain across three benchmark datasets. Our results indicate that our framework consistently outperforms 20 state-of-the-art methods across quantitative metrics, computational efficiency, and overall performance while achieving real-time inference capabilities with a significantly reduced parameter count of 5.15M parameters. LiteCOD establishes a practical solution bridging the gap between detection accuracy and deployment feasibility in resource-constrained environments. Full article
29 pages, 9158 KiB  
Review
Advancements and Future Prospects of Energy Harvesting Technology in Power Systems
by Haojie Du, Jiajing Lu, Wenye Zhang, Guang Yang, Wenzhuo Zhang, Zejun Xu, Huifeng Wang, Kejie Dai and Lingxiao Gao
Micromachines 2025, 16(8), 964; https://doi.org/10.3390/mi16080964 - 21 Aug 2025
Abstract
The electric power equipment industry is rapidly advancing toward “informationization,” with the swift progression of intelligent sensing technology serving as a key driving force behind this transformation, thereby triggering significant changes in global electric power equipment. In this process, intelligent sensing has created an urgent demand for high-performance integrated power systems that feature compact size, lightweight design, long operational life, high reliability, high energy density, and low cost. However, the performance metrics of traditional power supplies have increasingly failed to meet the requirements of modern intelligent sensing, thereby significantly hindering the advancement of intelligent power equipment. Energy harvesting technology, characterized by its long operational lifespan, compact size, environmental sustainability, and self-sufficient operation, is capable of capturing renewable energy from ambient power sources and converting it into electrical energy to supply power to sensors. Due to these advantages, it has garnered significant attention in the field of power sensing. This paper presents a comprehensive review of the current state of development of energy harvesting technologies within the power environment. It outlines recent advancements in magnetic field energy harvesting, electric field energy harvesting, vibration energy harvesting, wind energy harvesting, and solar energy harvesting. Furthermore, it explores the integration of multiple physical mechanisms and hybrid energy sources aimed at enhancing self-powered applications in this domain. A comparative analysis of the advantages and limitations associated with each technology is also provided. Additionally, the paper discusses potential future directions for the development of energy harvesting technologies in the power environment. Full article
(This article belongs to the Special Issue Nanogenerators: Design, Fabrication and Applications)
21 pages, 9325 KiB  
Article
Lightweight Model Improvement and Application for Rice Disease Classification
by Tonglai Liu, Mingguang Liu, Chengcheng Yang, Ancong Wu, Xiaodong Li and Wenzhao Wei
Electronics 2025, 14(16), 3331; https://doi.org/10.3390/electronics14163331 - 21 Aug 2025
Abstract
The timely and correct identification of rice diseases is essential to ensuring rice productivity. However, many methods have drawbacks such as slow recognition speed, low recognition accuracy and overly complex models that are unfavorable for portability. To address these difficulties, this study proposes an improved model for accurately classifying rice diseases based on a two-level routing attention mechanism and dynamic convolution. The model employs Alterable Kernel Convolution with dynamic, irregularly shaped convolutional kernels and Bi-level Routing Attention that exploits sparsity to reduce parameters and relies on GPU-friendly dense matrix multiplication, achieving high-precision rice disease recognition while remaining lightweight and fast. The model successfully classified 10 classes, comprising nine rice diseases and healthy rice, with 97.31% accuracy and a 97.18% F1-score. Our proposed method outperforms MobileNetV3-large, EfficientNet-b0, Swin Transformer-tiny and ResNet-50 by 1.73%, 1.82%, 1.25% and 0.67%, respectively. Meanwhile, the model contains only 4.453 × 10⁶ parameters and achieves an inference time of 6.13 s, which facilitates deployment on mobile devices. The proposed MobileViT_BiAK method effectively identifies rice diseases while providing a lightweight and high-performance classification solution. Full article
(This article belongs to the Special Issue Target Tracking and Recognition Techniques and Their Applications)
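The abstract's Alterable Kernel Convolution is not specified in detail; one common way to obtain dynamic, irregularly shaped sampling is a deformable convolution whose offsets are predicted from the input, sketched below under that assumption using torchvision's deform_conv2d. All names are illustrative.

```python
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d

class IrregularKernelConv(nn.Module):
    """Illustrative stand-in for an 'alterable kernel' convolution: a 3x3
    deformable convolution whose per-location sampling offsets are predicted
    from the input, so the effective kernel shape can deform per pixel."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.k = k
        self.weight = nn.Parameter(torch.empty(out_ch, in_ch, k, k))
        nn.init.kaiming_uniform_(self.weight, a=5 ** 0.5)
        self.bias = nn.Parameter(torch.zeros(out_ch))
        self.offset = nn.Conv2d(in_ch, 2 * k * k, 3, padding=1)  # (dx, dy) per tap

    def forward(self, x):
        return deform_conv2d(x, self.offset(x), self.weight, self.bias,
                             padding=self.k // 2)

x = torch.randn(1, 16, 56, 56)
y = IrregularKernelConv(16, 32)(x)         # shape (1, 32, 56, 56)
```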
23 pages, 10656 KiB  
Article
Lightweight YOLOv11n-Based Detection and Counting of Early-Stage Cabbage Seedlings from UAV RGB Imagery
by Rongrui Zhao, Rongxiang Luo, Xue Ding, Jiao Cui and Bangjin Yi
Horticulturae 2025, 11(8), 993; https://doi.org/10.3390/horticulturae11080993 - 21 Aug 2025
Abstract
This study proposes a lightweight adaptive neural network framework based on an improved YOLOv11n model to address the core challenges in identifying cabbage seedlings in visible light images captured by UAVs. These challenges include the loss of small-target features, poor adaptability to complex lighting conditions, and the low deployment efficiency of edge devices. First, the adaptive dual-path downsampling module (ADown) integrates average pooling and maximum pooling into a dual-branch structure to enhance background texture and crop edge features in a synergistic manner. Second, the Illumination Robust Contrast Learning Head (IRCLHead) utilizes a temperature-adaptive network to adjust the contrast loss function parameters dynamically. Combined with a dual-output supervision mechanism that integrates growth stage prediction and interference-resistant feature embedding, this module enhances the model’s robustness in complex lighting scenarios. Finally, a lightweight spatial-channel attention convolution module (LAConv) has been developed to optimize the model’s computational load by using multi-scale feature extraction paths and depth decomposition structures. Experiments demonstrate that the proposed architecture achieves an mAP@0.5 of 99.0% in detecting cabbage seedling growth cycles, improving upon the baseline model by 0.71 percentage points. Furthermore, it improves mAP@0.5:0.95 by 2.4 percentage points, reduces computational complexity (GFLOPs) by 12.7%, and drastically reduces inference time from 3.7 ms to 1.0 ms. Additionally, the model parameters are reduced by 3%. This model provides an efficient solution for the real-time counting of cabbage seedlings and lightweight operations in drone-based precision agriculture. Full article
(This article belongs to the Section Vegetable Production Systems)
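One plausible reading of the ADown idea, an average-pooling branch and a max-pooling branch that each halve the resolution before their outputs are concatenated, is sketched below; the channel split and branch kernels are assumptions, not the authors' exact layout.

```python
import torch
import torch.nn as nn

class DualPathDownsample(nn.Module):
    """Sketch of an ADown-style block: one branch favours smoothed context
    (average pooling), the other salient edges (max pooling); both halve the
    spatial resolution and their outputs are concatenated."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        half = out_ch // 2
        self.avg_branch = nn.Sequential(nn.AvgPool2d(2),               # smoothed context
                                        nn.Conv2d(in_ch // 2, half, 3, padding=1))
        self.max_branch = nn.Sequential(nn.MaxPool2d(2),               # salient edges
                                        nn.Conv2d(in_ch - in_ch // 2, half, 1))

    def forward(self, x):
        a, b = x.chunk(2, dim=1)            # split channels across the two paths
        return torch.cat([self.avg_branch(a), self.max_branch(b)], dim=1)

x = torch.randn(1, 64, 80, 80)
print(DualPathDownsample(64, 128)(x).shape)  # torch.Size([1, 128, 40, 40])
```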
18 pages, 7729 KiB  
Article
A Lightweight Traffic Sign Detection Model Based on Improved YOLOv8s for Edge Deployment in Autonomous Driving Systems Under Complex Environments
by Chen Xing, Haoran Sun and Jiafu Yang
World Electr. Veh. J. 2025, 16(8), 478; https://doi.org/10.3390/wevj16080478 - 21 Aug 2025
Abstract
Traffic sign detection is a core function of autonomous driving systems, requiring real-time and accurate target recognition in complex road environments. Existing lightweight detection models struggle to balance accuracy, efficiency, and robustness under computational constraints of vehicle-mounted edge devices. To address this, we propose a lightweight model integrating FasterNet, Efficient Multi-scale Attention (EMA), Bidirectional Feature Pyramid Network (BiFPN), and Group Separable Convolution (GSConv) based on YOLOv8s (FEBG-YOLOv8s). Key innovations include reconstructing the Cross Stage Partial Network 2 with Focus (C2f) module using FasterNet blocks to minimize redundant computation; integrating an EMA mechanism to enhance robustness against small and occluded targets; refining the neck network based on BiFPN via channel compression, downsampling layers, and skip connections to optimize shallow–deep semantic fusion; and designing a GSConv-based hybrid serial–parallel detection head (GSP-Detect) to preserve cross-channel information while reducing computational load. Experiments on Tsinghua–Tencent 100K (TT100K) show FEBG-YOLOv8s improves mean Average Precision at Intersection over Union 0.5 (mAP50) by 3.1% compared to YOLOv8s, with 4 million fewer parameters and 22.5% lower Giga Floating-Point Operations (GFLOPs). Generalizability experiments on the CSUST Chinese Traffic Sign Detection Benchmark (CCTSDB) validate robustness, with 3.3% higher mAP50, demonstrating its potential for real-time traffic sign detection on edge platforms. Full article
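GSConv is usually described as pairing a dense convolution with a cheap depthwise convolution and shuffling the concatenated channels; the sketch below follows that general description and is not taken from the FEBG-YOLOv8s code.

```python
import torch
import torch.nn as nn

class GSConv(nn.Module):
    """Sketch of a group/ghost-style separable convolution: a standard conv
    produces half the output channels, a cheap depthwise conv derives the
    other half from them, and a channel shuffle mixes the two groups."""
    def __init__(self, in_ch, out_ch, k=3, stride=1):
        super().__init__()
        half = out_ch // 2
        self.dense = nn.Conv2d(in_ch, half, k, stride, k // 2)
        self.cheap = nn.Conv2d(half, half, 5, 1, 2, groups=half)   # depthwise

    def forward(self, x):
        a = self.dense(x)
        b = self.cheap(a)
        y = torch.cat([a, b], dim=1)                 # (B, out_ch, H, W)
        B, C, H, W = y.shape                         # shuffle across the two groups
        return y.view(B, 2, C // 2, H, W).transpose(1, 2).reshape(B, C, H, W)

x = torch.randn(1, 64, 40, 40)
print(GSConv(64, 128)(x).shape)                      # torch.Size([1, 128, 40, 40])
```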
20 pages, 2239 KiB  
Article
Lightweight Financial Fraud Detection Using a Symmetrical GAN-CNN Fusion Architecture
by Yiwen Yang, Chengjun Xu and Guisheng Tian
Symmetry 2025, 17(8), 1366; https://doi.org/10.3390/sym17081366 - 21 Aug 2025
Abstract
With the rapid development of information technology and the deep integration of Internet platforms, the scale and forms of financial transactions continue to grow, significantly improving users’ payment experience and life efficiency. However, while financial transactions bring convenience, they also expose many security risks: money laundering, forged checks, and other forms of financial fraud occur frequently, seriously threatening the stability and security of the financial system. Due to the imbalance between the proportions of normal and abnormal transactions in the data, most existing deep learning-based methods still have obvious deficiencies in learning from minority-class samples, modeling context, and controlling computational complexity. To address these deficiencies, this paper proposes a symmetrical structure-based GAN-CNN model for lightweight financial fraud detection. The symmetrical structure improves feature extraction and fusion and enhances the model’s ability to recognize complex fraud patterns. Synthetic fraud samples are generated with a GAN to alleviate category imbalance. Multi-scale convolution and attention mechanisms are designed to extract local and global transaction features, and adaptive aggregation and context encoding modules are introduced to improve computational efficiency. We conducted repeated experiments on two public datasets, YelpChi and Amazon. The results show that on the Amazon dataset with a 50% training ratio, compared with the CNN-GAN model, the accuracy of our model improved by 1.64% while the number of parameters was reduced by approximately 88.4%. Compared with the hybrid CNN-LSTM–attention model under the same setting, accuracy improved by 0.70% and the number of parameters was reduced by approximately 87.6%. The symmetry-based lightweight architecture proposed in this work is novel in its structural design, and the experimental results show that it is both efficient and accurate in detecting imbalanced transactions. Full article
(This article belongs to the Section Computer)
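Generating synthetic fraud samples with a GAN to rebalance the classes, as the abstract describes, reduces to a standard generator/discriminator loop over minority-class feature vectors; the sketch below uses assumed feature and noise dimensions and simple MLPs, not the paper's symmetrical GAN-CNN architecture.

```python
import torch
import torch.nn as nn

FEAT_DIM, NOISE_DIM = 32, 16                     # assumed transaction-feature width

G = nn.Sequential(nn.Linear(NOISE_DIM, 64), nn.ReLU(), nn.Linear(64, FEAT_DIM))
D = nn.Sequential(nn.Linear(FEAT_DIM, 64), nn.LeakyReLU(0.2), nn.Linear(64, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_fraud):                      # real_fraud: (B, FEAT_DIM) minority samples
    B = real_fraud.size(0)
    # Discriminator: distinguish real fraud samples from generated ones.
    fake = G(torch.randn(B, NOISE_DIM)).detach()
    loss_d = bce(D(real_fraud), torch.ones(B, 1)) + bce(D(fake), torch.zeros(B, 1))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()
    # Generator: produce samples the discriminator labels as real.
    fake = G(torch.randn(B, NOISE_DIM))
    loss_g = bce(D(fake), torch.ones(B, 1))
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()

train_step(torch.randn(8, FEAT_DIM))             # after training, G(z) augments the fraud class
```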