Search Results (1,189)

Search Parameters:
Keywords = lightweight attention mechanism

17 pages, 586 KiB  
Article
An Accurate and Efficient Diabetic Retinopathy Diagnosis Method via Depthwise Separable Convolution and Multi-View Attention Mechanism
by Qing Yang, Ying Wei, Fei Liu and Zhuang Wu
Appl. Sci. 2025, 15(17), 9298; https://doi.org/10.3390/app15179298 - 24 Aug 2025
Abstract
Diabetic retinopathy (DR), a critical ocular disease that can lead to blindness, demands early and accurate diagnosis to prevent vision loss. Current automated DR diagnosis methods face two core challenges: first, subtle early lesions such as microaneurysms are often missed due to insufficient feature extraction; second, there is a persistent trade-off between model accuracy and efficiency, as lightweight architectures often sacrifice precision for real-time performance, while high-accuracy models are computationally expensive and difficult to deploy on resource-constrained edge devices. To address these issues, this study presents a novel deep learning framework integrating depthwise separable convolution and a multi-view attention mechanism (MVAM) for efficient DR diagnosis using retinal images. The framework employs multi-scale feature fusion via parallel 3 × 3 and 5 × 5 convolutions to capture lesions of varying sizes and incorporates Gabor filters to enhance vascular texture and directional lesion modeling, improving sensitivity to early structural abnormalities while reducing computational costs. Experimental results on both the diabetic retinopathy (DR) dataset and ocular disease (OD) dataset demonstrate the superiority of the proposed method: it achieves a high accuracy of 0.9697 on the DR dataset and 0.9669 on the OD dataset, outperforming traditional methods such as CNN_eye, VGG, and U-Net by more than 1 percentage point. Moreover, its training time is only half that of U-Net (on the DR dataset) and VGG (on the OD dataset), highlighting its potential for clinical DR screening. Full article
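As a concrete illustration of the two building blocks named in this abstract, the PyTorch sketch below pairs a depthwise separable convolution with parallel 3 × 3 and 5 × 5 branches; the module names, channel counts, and the 1 × 1 fusion layer are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise conv followed by a 1x1 pointwise conv."""
    def __init__(self, in_ch, out_ch, k):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, k, padding=k // 2, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class MultiScaleFusion(nn.Module):
    """Parallel 3x3 and 5x5 branches whose outputs are concatenated and fused,
    capturing lesions of different sizes (illustrative sketch)."""
    def __init__(self, in_ch, branch_ch):
        super().__init__()
        self.branch3 = DepthwiseSeparableConv(in_ch, branch_ch, 3)
        self.branch5 = DepthwiseSeparableConv(in_ch, branch_ch, 5)
        self.fuse = nn.Conv2d(2 * branch_ch, branch_ch, 1)

    def forward(self, x):
        return self.fuse(torch.cat([self.branch3(x), self.branch5(x)], dim=1))

x = torch.randn(1, 32, 64, 64)            # a dummy retinal feature map
print(MultiScaleFusion(32, 64)(x).shape)  # torch.Size([1, 64, 64, 64])
```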
20 pages, 6878 KiB  
Article
EMR-YOLO: A Multi-Scale Benthic Organism Detection Algorithm for Degraded Underwater Visual Features and Computationally Constrained Environments
by Dehua Zou, Songhao Zhao, Jingchun Zhou, Guangqiang Liu, Zhiying Jiang, Minyi Xu, Xianping Fu and Siyuan Liu
J. Mar. Sci. Eng. 2025, 13(9), 1617; https://doi.org/10.3390/jmse13091617 - 24 Aug 2025
Abstract
Marine benthic organism detection (BOD) is essential for underwater robotics and seabed resource management but suffers from motion blur, perspective distortion, and background clutter in dynamic underwater environments. To address visual feature degradation and computational constraints, in this paper we introduce EMR-YOLO, a deep learning-based multi-scale BOD method. To handle the diverse sizes and morphologies of benthic organisms, we propose an Efficient Detection Sparse Head (EDSHead), which combines a unified attention mechanism and dynamic sparse operators to enhance spatial modeling. For robust feature extraction under resource limitations, we design a lightweight Multi-Branch Fusion Downsampling (MBFDown) module that utilizes cross-stage feature fusion and a multi-branch architecture to capture rich gradient information. Additionally, a Regional Two-Level Routing Attention (RTRA) mechanism is developed to mitigate background noise and sharpen focus on target regions. The experimental results demonstrate that EMR-YOLO achieves improvements of 2.33%, 1.50%, and 4.12% in AP, AP50, and AP75, respectively, outperforming state-of-the-art methods while maintaining efficiency. Full article
38 pages, 4775 KiB  
Article
Sparse-MoE-SAM: A Lightweight Framework Integrating MoE and SAM with a Sparse Attention Mechanism for Plant Disease Segmentation in Resource-Constrained Environments
by Benhan Zhao, Xilin Kang, Hao Zhou, Ziyang Shi, Lin Li, Guoxiong Zhou, Fangying Wan, Jiangzhang Zhu, Yongming Yan, Leheng Li and Yulong Wu
Plants 2025, 14(17), 2634; https://doi.org/10.3390/plants14172634 - 24 Aug 2025
Abstract
Plant disease segmentation has achieved significant progress with the help of artificial intelligence. However, deploying high-accuracy segmentation models in resource-limited settings faces three key challenges, as follows: (A) Traditional dense attention mechanisms incur quadratic computational complexity growth (O(n²d)), rendering them ill-suited for low-power hardware. (B) Naturally sparse spatial distributions and large-scale variations in the lesions on leaves necessitate models that concurrently capture long-range dependencies and local details. (C) Complex backgrounds and variable lighting in field images often induce segmentation errors. To address these challenges, we propose Sparse-MoE-SAM, an efficient framework based on an enhanced Segment Anything Model (SAM). This deep learning framework integrates sparse attention mechanisms with a two-stage mixture of experts (MoE) decoder. The sparse attention dynamically activates key channels aligned with lesion sparsity patterns, reducing self-attention complexity while preserving long-range context. Stage 1 of the MoE decoder performs coarse-grained boundary localization; Stage 2 achieves fine-grained segmentation by leveraging specialized experts within the MoE, significantly enhancing edge discrimination accuracy. The expert repository—comprising standard convolutions, dilated convolutions, and depthwise separable convolutions—dynamically routes features through optimized processing paths based on input texture and lesion morphology. This enables robust segmentation across diverse leaf textures and plant developmental stages. Further, we design a sparse attention-enhanced Atrous Spatial Pyramid Pooling (ASPP) module to capture multi-scale contexts for both extensive lesions and small spots. Evaluations on three heterogeneous datasets (PlantVillage Extended, CVPPP, and our self-collected field images) show that Sparse-MoE-SAM achieves a mean Intersection-over-Union (mIoU) of 94.2%—surpassing standard SAM by 2.5 percentage points—while reducing computational costs by 23.7% compared to the original SAM baseline. The model also demonstrates balanced performance across disease classes and enhanced hardware compatibility. Our work validates that integrating sparse attention with MoE mechanisms sustains accuracy while drastically lowering computational demands, enabling the scalable deployment of plant disease segmentation models on mobile and edge devices. Full article
(This article belongs to the Special Issue Advances in Artificial Intelligence for Plant Research)
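The abstract describes sparse attention that "dynamically activates key channels". A minimal way to express that idea is a top-k channel gate, sketched below; the scoring layer, the choice of k, and all names are assumptions rather than the paper's design.

```python
import torch
import torch.nn as nn

class TopKChannelGate(nn.Module):
    """Keep only the k highest-scoring channels per sample and zero the rest,
    a crude stand-in for 'dynamically activating key channels'."""
    def __init__(self, channels, k):
        super().__init__()
        self.score = nn.Linear(channels, channels)
        self.k = k

    def forward(self, x):                        # x: (B, C, H, W)
        pooled = x.mean(dim=(2, 3))              # (B, C) global descriptor
        scores = self.score(pooled)              # per-channel importance
        topk = scores.topk(self.k, dim=1).indices
        mask = torch.zeros_like(scores).scatter_(1, topk, 1.0)
        gate = torch.sigmoid(scores) * mask      # sparse channel weights
        return x * gate[:, :, None, None]

x = torch.randn(2, 256, 32, 32)
y = TopKChannelGate(256, k=64)(x)                # only 64 of 256 channels pass
```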
18 pages, 917 KiB  
Article
ATA-MSTF-Net: An Audio Texture-Aware MultiSpectro-Temporal Attention Fusion Network
by Yubo Su, Haolin Wang, Zhihao Xu, Chengxi Yin, Fucheng Chen and Zhaoguo Wang
Mathematics 2025, 13(17), 2719; https://doi.org/10.3390/math13172719 - 24 Aug 2025
Abstract
Unsupervised anomalous sound detection (ASD) models the normal sounds of machinery through classification operations, thereby identifying anomalies by quantifying deviations. Most recent approaches adopt depthwise separable modules from MobileNetV2. Extensive studies demonstrate that squeeze-and-excitation (SE) modules can enhance model fitting by dynamically weighting input features to adjust output distributions. However, we observe that conventional SE modules fail to adapt to the complex spectral textures of audio data. To address this, we propose an Audio Texture Attention (ATA) specifically designed for machine noise data, improving model robustness. Additionally, we integrate an LSTM layer and refine the temporal feature extraction architecture to strengthen the model’s sensitivity to sequential noise patterns. Experimental results on the DCASE 2020 Challenge Task 2 dataset show that our method achieves state-of-the-art performance, with AUC, pAUC, and mAUC scores of 96.15%, 90.58%, and 90.63%, respectively. Full article
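For readers unfamiliar with the squeeze-and-excitation (SE) module this abstract builds on, the sketch below shows the standard SE formulation (global average pooling followed by a bottleneck that re-weights channels). The proposed Audio Texture Attention itself is not specified in the abstract, so only the baseline SE block is shown; the spectrogram shape is an assumption.

```python
import torch
import torch.nn as nn

class SqueezeExcitation(nn.Module):
    """Standard SE block: squeeze with global average pooling, then excite
    with a two-layer bottleneck that re-weights each channel."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                 # x: (B, C, H, W) feature map
        w = self.fc(x.mean(dim=(2, 3)))   # per-channel weights in (0, 1)
        return x * w[:, :, None, None]

spec = torch.randn(4, 64, 128, 313)       # e.g. a batch of log-mel spectrogram features
out = SqueezeExcitation(64)(spec)
```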
18 pages, 2701 KiB  
Article
YOLOv11-CHBG: A Lightweight Fire Detection Model
by Yushuang Jiang, Peisheng Liu, Yunping Han and Bei Xiao
Fire 2025, 8(9), 338; https://doi.org/10.3390/fire8090338 - 24 Aug 2025
Abstract
Fire is a disaster that seriously threatens people’s lives. Because fires occur suddenly and spread quickly, especially in densely populated places or areas where it is difficult to evacuate quickly, they often cause major property damage and seriously endanger personal safety. Therefore, it is necessary to detect the occurrence of fires accurately and promptly and issue early warnings. This study introduces YOLOv11-CHBG, a novel detection model designed to identify flames and smoke. On the basis of YOLOv11, the C3K2-HFERB module is used in the backbone, the BiAdaGLSA module is proposed in the neck, and the SEAM attention mechanism is added to the detection head, making the proposed model more lightweight and offering potential support for fire rescue efforts. Experimental results show that the model achieves a mean average precision (mAP@0.5) of 78.4% on the Dfire dataset, with a 30.8% reduction in parameters compared to YOLOv11. The model achieves a lightweight design, enhancing its significance for real-time fire and smoke detection, and it provides a research basis for detecting fires earlier, preventing the spread of fires and reducing the harm caused by fires. Full article
24 pages, 1543 KiB  
Article
Intelligent Fault Diagnosis for Rotating Machinery via Transfer Learning and Attention Mechanisms: A Lightweight and Adaptive Approach
by Zhengjie Wang, Xing Yang, Tongjie Li, Lei She, Xuanchen Guo and Fan Yang
Actuators 2025, 14(9), 415; https://doi.org/10.3390/act14090415 - 23 Aug 2025
Abstract
Fault diagnosis under variable operating conditions remains challenging due to the limited adaptability of traditional methods. This paper proposes a transfer learning-based approach for bearing fault diagnosis across different rotational speeds, addressing the critical need for reliable detection in changing industrial environments. The method trains a diagnostic model on labeled source-domain data and transfers it to unlabeled target domains through a two-stage adaptation strategy. First, only the source-domain data are labeled to reflect real-world scenarios where target-domain labels are unavailable. The model architecture combines a convolutional neural network (CNN) for feature extraction with a self-attention mechanism for classification. During source-domain training, the feature extractor parameters are frozen to focus on classifier optimization. When transferring to target domains, the classifier parameters are frozen instead, allowing the feature extractor to adapt to new speed conditions. Experimental validation on the Case Western Reserve University bearing dataset (CWRU), Jiangnan University bearing dataset (JNU), and Southeast University gear and bearing dataset (SEU) demonstrates the method’s effectiveness, achieving accuracies of 99.95%, 99.99%, and 100%, respectively. The proposed method achieves significant model size reduction compared to conventional TL approaches (e.g., DANN and CDAN), with reductions of up to 91.97% and 64%, respectively. Furthermore, we observed a maximum reduction of 61.86% in FLOPs consumption. The results show significant improvement over conventional approaches in maintaining diagnostic performance across varying operational conditions. This study provides a practical solution for industrial applications where equipment operates under non-stationary speeds, offering both computational efficiency and reliable fault detection capabilities. Full article
(This article belongs to the Section Actuators for Manufacturing Systems)
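The two-stage freezing strategy described in the abstract maps directly onto toggling requires_grad in PyTorch. The sketch below assumes a hypothetical extractor/classifier split with placeholder layer sizes; only the freezing pattern follows the abstract.

```python
import torch.nn as nn

# Hypothetical model: a CNN feature extractor plus a classifier head.
model = nn.ModuleDict({
    "extractor": nn.Sequential(nn.Conv1d(1, 16, 64, stride=8), nn.ReLU(),
                               nn.AdaptiveAvgPool1d(32), nn.Flatten()),
    "classifier": nn.Sequential(nn.Linear(16 * 32, 128), nn.ReLU(),
                                nn.Linear(128, 10)),
})

def set_trainable(module, trainable):
    for p in module.parameters():
        p.requires_grad = trainable

# Stage 1 (labeled source domain): freeze the feature extractor and
# optimize only the classifier.
set_trainable(model["extractor"], False)
set_trainable(model["classifier"], True)

# Stage 2 (unlabeled target domain, new speed condition): freeze the
# classifier and let the feature extractor adapt instead.
set_trainable(model["extractor"], True)
set_trainable(model["classifier"], False)
```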
24 pages, 2671 KiB  
Article
CNN–Transformer-Based Model for Maritime Blurred Target Recognition
by Tianyu Huang, Chao Pan, Jin Liu and Zhiwei Kang
Electronics 2025, 14(17), 3354; https://doi.org/10.3390/electronics14173354 - 23 Aug 2025
Abstract
In maritime blurred image recognition, ship collision accidents frequently result from three primary blur types: (1) motion blur from vessel movement in complex sea conditions, (2) defocus blur due to water vapor refraction, and (3) scattering blur caused by sea fog interference. This paper proposes a dual-branch recognition method specifically designed for motion blur, which represents the most prevalent blur type in maritime scenarios. Conventional approaches exhibit constrained computational efficiency and limited adaptability across different modalities. To overcome these limitations, we propose a hybrid CNN–Transformer architecture: the CNN branch captures local blur characteristics, while the enhanced Transformer module models long-range dependencies via attention mechanisms. The CNN branch employs a lightweight ResNet variant, in which conventional residual blocks are substituted with Multi-Scale Gradient-Aware Residual Blocks (MSG-ARB). This architecture employs learnable gradient convolution for explicit local gradient feature extraction and utilizes gradient content gating to strengthen blur-sensitive region representation, significantly improving computational efficiency compared to conventional CNNs. The Transformer branch incorporates a Hierarchical Swin Transformer (HST) framework with Shifted Window-based Multi-head Self-Attention for global context modeling. The proposed method incorporates blur-invariant Positional Encoding (PE) to enhance blur spectrum modeling capability, while employing a DyT (Dynamic Tanh) module with learnable α parameters to replace traditional normalization layers. This architecture achieves a significant reduction in computational costs while preserving feature representation quality. Moreover, it efficiently computes long-range image dependencies using a compact 16 × 16 window configuration. The proposed feature fusion module synergistically integrates CNN-based local feature extraction with Transformer-enabled global representation learning, achieving comprehensive feature modeling across different scales. To evaluate the model’s performance and generalization ability, we conducted comprehensive experiments on four benchmark datasets: VAIS, GoPro, Mini-ImageNet, and Open Images V4. Experimental results show that our method achieves superior classification accuracy compared to state-of-the-art approaches, while simultaneously enhancing inference speed and reducing GPU memory consumption. Ablation studies confirm that the DyT module effectively suppresses outliers and improves computational efficiency, particularly when processing low-quality input data. Full article
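The DyT (Dynamic Tanh) layer mentioned here is commonly written as gamma * tanh(alpha * x) + beta with a learnable scalar alpha; a sketch of that formulation follows, which may differ in details from the authors' variant.

```python
import torch
import torch.nn as nn

class DyT(nn.Module):
    """Dynamic Tanh: replaces a normalization layer with an element-wise tanh
    whose input scale alpha is learned, followed by a per-channel affine
    transform (sketch of the commonly cited gamma * tanh(alpha * x) + beta)."""
    def __init__(self, dim, alpha_init=0.5):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(alpha_init))
        self.gamma = nn.Parameter(torch.ones(dim))
        self.beta = nn.Parameter(torch.zeros(dim))

    def forward(self, x):                  # x: (..., dim) token features
        return self.gamma * torch.tanh(self.alpha * x) + self.beta

tokens = torch.randn(2, 196, 384)          # dummy ViT-style token sequence
out = DyT(384)(tokens)                     # bounded outputs, no batch statistics
```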
26 pages, 5260 KiB  
Article
Blurred Lesion Image Segmentation via an Adaptive Scale Thresholding Network
by Qi Chen, Wenmin Wang, Zhibing Wang, Haomei Jia and Minglu Zhao
Appl. Sci. 2025, 15(17), 9259; https://doi.org/10.3390/app15179259 - 22 Aug 2025
Abstract
Medical image segmentation is crucial for disease diagnosis, as precise results aid clinicians in locating lesion regions. However, lesions often have blurred boundaries and complex shapes, challenging traditional methods in capturing clear edges and impacting accurate localization and complete excision. Small lesions are also critical but prone to detail loss during downsampling, reducing segmentation accuracy. To address these issues, we propose a novel adaptive scale thresholding network (AdSTNet) that acts as a lightweight post-processing network for enhancing sensitivity to lesion edges and cores through a dual-threshold adaptive mechanism. The dual-threshold adaptive mechanism is a key architectural component that includes a main threshold map for core localization and an edge threshold map for more precise boundary detection. AdSTNet is compatible with any segmentation network and introduces only a small computational and parameter cost. Additionally, Spatial Attention and Channel Attention (SACA), the Laplacian operator, and the Fusion Enhancement module are introduced to improve feature processing. SACA enhances spatial and channel attention for core localization; the Laplacian operator retains edge details without added complexity; and the Fusion Enhancement module adopts a concatenation operation and a Convolutional Gated Linear Unit (ConvGLU) to strengthen feature intensities, improving edge and small-lesion segmentation. Experiments show that AdSTNet achieves notable performance gains on the ISIC 2018, BUSI, and Kvasir-SEG datasets. Compared with the original U-Net, our method attains mIoU/mDice of 83.40%/90.24% on ISIC, 71.66%/80.32% on BUSI, and 73.08%/81.91% on Kvasir-SEG. Moreover, similar improvements are observed for the other networks evaluated. Full article
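The Laplacian operator used for edge retention can be applied as a fixed-kernel convolution with no learnable parameters; the sketch below shows that generic step on a dummy lesion probability map and is not tied to AdSTNet's exact pipeline.

```python
import torch
import torch.nn.functional as F

def laplacian_edges(feat):
    """Apply a fixed 3x3 Laplacian kernel depthwise to highlight boundaries;
    it has no learnable parameters, so it adds essentially no model complexity."""
    c = feat.shape[1]
    kernel = torch.tensor([[0., 1., 0.],
                           [1., -4., 1.],
                           [0., 1., 0.]], device=feat.device)
    kernel = kernel.view(1, 1, 3, 3).repeat(c, 1, 1, 1)   # one kernel per channel
    return F.conv2d(feat, kernel, padding=1, groups=c)

prob_map = torch.sigmoid(torch.randn(1, 1, 256, 256))      # e.g. a coarse lesion mask
edge_map = laplacian_edges(prob_map)                        # large values near boundaries
```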
24 pages, 9450 KiB  
Article
Industrial-AdaVAD: Adaptive Industrial Video Anomaly Detection Empowered by Edge Intelligence
by Jie Xiao, Haocheng Shen, Yasan Ding and Bin Guo
Mathematics 2025, 13(17), 2711; https://doi.org/10.3390/math13172711 - 22 Aug 2025
Abstract
The rapid advancement of Artificial Intelligence of Things (AIoT) has driven an urgent demand for intelligent video anomaly detection (VAD) to ensure industrial safety. However, traditional approaches struggle to detect unknown anomalies in complex and dynamic environments due to the scarcity of abnormal samples and limited generalization capabilities. To address these challenges, this paper presents an adaptive VAD framework powered by edge intelligence tailored for resource-constrained industrial settings. Specifically, a lightweight feature extractor is developed by integrating residual networks with channel attention mechanisms, achieving a 58% reduction in model parameters through dense connectivity and output pruning. A multidimensional evaluation strategy is introduced to dynamically select optimal models for deployment on heterogeneous edge devices. To enhance cross-scene adaptability, we propose a multilayer adversarial domain adaptation mechanism that effectively aligns feature distributions across diverse industrial environments. Extensive experiments on a real-world coal mine surveillance dataset demonstrate that the proposed framework achieves an accuracy of 86.7% with an inference latency of 23 ms per frame on edge hardware, improving both detection efficiency and transferability. Full article
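Adversarial domain adaptation of the kind described here is often implemented with a gradient reversal layer feeding a domain discriminator; the sketch below shows that generic pattern. The discriminator width, lambda value, and feature dimension are assumptions, not the paper's multilayer design.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies gradients by -lambda in the
    backward pass, so the feature extractor learns domain-invariant features."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class DomainDiscriminator(nn.Module):
    def __init__(self, feat_dim, lam=1.0):
        super().__init__()
        self.lam = lam
        self.net = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(),
                                 nn.Linear(128, 2))         # source vs. target

    def forward(self, feat):
        return self.net(GradReverse.apply(feat, self.lam))

feats = torch.randn(8, 256, requires_grad=True)             # pooled clip features
domain_logits = DomainDiscriminator(256)(feats)
domain_logits.sum().backward()                               # reversed gradients reach `feats`
```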
22 pages, 5943 KiB  
Article
LiteCOD: Lightweight Camouflaged Object Detection via Holistic Understanding of Local-Global Features and Multi-Scale Fusion
by Abbas Khan, Hayat Ullah and Arslan Munir
AI 2025, 6(9), 197; https://doi.org/10.3390/ai6090197 - 22 Aug 2025
Abstract
Camouflaged object detection (COD) represents one of the most challenging tasks in computer vision, requiring sophisticated approaches to accurately extract objects that seamlessly blend within visually similar backgrounds. While contemporary techniques demonstrate promising detection performance, they predominantly suffer from computational complexity and resource requirements that severely limit their deployment in real-time applications, particularly on mobile devices and edge computing platforms. To address these limitations, we propose LiteCOD, an efficient lightweight framework that integrates local and global perceptions through holistic feature fusion and specially designed efficient attention mechanisms. Our approach achieves superior detection accuracy while maintaining computational efficiency essential for practical deployment, with enhanced feature propagation and minimal computational overhead. Extensive experiments validate LiteCOD’s effectiveness, demonstrating that it surpasses existing lightweight methods with average improvements of 7.55% in the F-measure and 8.08% overall performance gain across three benchmark datasets. Our results indicate that our framework consistently outperforms 20 state-of-the-art methods across quantitative metrics, computational efficiency, and overall performance while achieving real-time inference capabilities with a significantly reduced parameter count of 5.15M parameters. LiteCOD establishes a practical solution bridging the gap between detection accuracy and deployment feasibility in resource-constrained environments. Full article
29 pages, 9158 KiB  
Review
Advancements and Future Prospects of Energy Harvesting Technology in Power Systems
by Haojie Du, Jiajing Lu, Wenye Zhang, Guang Yang, Wenzhuo Zhang, Zejun Xu, Huifeng Wang, Kejie Dai and Lingxiao Gao
Micromachines 2025, 16(8), 964; https://doi.org/10.3390/mi16080964 - 21 Aug 2025
Abstract
The electric power equipment industry is rapidly advancing toward “informationization,” with the swift progression of intelligent sensing technology serving as a key driving force behind this transformation, thereby triggering significant changes in global electric power equipment. In this process, intelligent sensing has created an urgent demand for high-performance integrated power systems that feature compact size, lightweight design, long operational life, high reliability, high energy density, and low cost. However, the performance metrics of traditional power supplies have increasingly failed to meet the requirements of modern intelligent sensing, thereby significantly hindering the advancement of intelligent power equipment. Energy harvesting technology, characterized by its long operational lifespan, compact size, environmental sustainability, and self-sufficient operation, is capable of capturing renewable energy from ambient power sources and converting it into electrical energy to supply power to sensors. Due to these advantages, it has garnered significant attention in the field of power sensing. This paper presents a comprehensive review of the current state of development of energy harvesting technologies within the power environment. It outlines recent advancements in magnetic field energy harvesting, electric field energy harvesting, vibration energy harvesting, wind energy harvesting, and solar energy harvesting. Furthermore, it explores the integration of multiple physical mechanisms and hybrid energy sources aimed at enhancing self-powered applications in this domain. A comparative analysis of the advantages and limitations associated with each technology is also provided. Additionally, the paper discusses potential future directions for the development of energy harvesting technologies in the power environment. Full article
(This article belongs to the Special Issue Nanogenerators: Design, Fabrication and Applications)
21 pages, 9325 KiB  
Article
Lightweight Model Improvement and Application for Rice Disease Classification
by Tonglai Liu, Mingguang Liu, Chengcheng Yang, Ancong Wu, Xiaodong Li and Wenzhao Wei
Electronics 2025, 14(16), 3331; https://doi.org/10.3390/electronics14163331 - 21 Aug 2025
Abstract
The timely and correct identification of rice diseases is essential to ensuring rice productivity. However, many methods have drawbacks such as slow recognition speed, low recognition accuracy and overly complex models that are unfavorable for portability. To address these difficulties, this study proposes an improved model for accurately classifying rice diseases based on a two-level routing attention mechanism and dynamic convolution. The model employs Alterable Kernel Convolution with dynamic, irregularly shaped convolutional kernels and Bi-level Routing Attention that exploits sparsity to reduce parameters and relies on GPU-friendly dense matrix multiplication, achieving high-precision rice disease recognition while remaining lightweight and fast. The model successfully classified 10 classes, comprising nine rice diseases and healthy rice, with 97.31% accuracy and a 97.18% F1-score. Our proposed method outperforms MobileNetV3-large, EfficientNet-b0, Swin Transformer-tiny and ResNet-50 by 1.73%, 1.82%, 1.25% and 0.67%, respectively. Meanwhile, the model contains only 4.453 × 10⁶ parameters and achieves an inference time of 6.13 s, which facilitates deployment on mobile devices. The proposed MobileViT_BiAK method effectively identifies rice diseases while providing a lightweight and high-performance classification solution. Full article
(This article belongs to the Special Issue Target Tracking and Recognition Techniques and Their Applications)
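The abstract's Alterable Kernel Convolution is not specified in detail; one common way to obtain dynamic, irregularly shaped sampling is a deformable convolution whose offsets are predicted from the input, sketched below under that assumption using torchvision's deform_conv2d. All names are illustrative.

```python
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d

class IrregularKernelConv(nn.Module):
    """Illustrative stand-in for an 'alterable kernel' convolution: a 3x3
    deformable convolution whose per-location sampling offsets are predicted
    from the input, so the effective kernel shape can deform per pixel."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.k = k
        self.weight = nn.Parameter(torch.empty(out_ch, in_ch, k, k))
        nn.init.kaiming_uniform_(self.weight, a=5 ** 0.5)
        self.bias = nn.Parameter(torch.zeros(out_ch))
        self.offset = nn.Conv2d(in_ch, 2 * k * k, 3, padding=1)  # (dx, dy) per tap

    def forward(self, x):
        return deform_conv2d(x, self.offset(x), self.weight, self.bias,
                             padding=self.k // 2)

x = torch.randn(1, 16, 56, 56)
y = IrregularKernelConv(16, 32)(x)         # shape (1, 32, 56, 56)
```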
23 pages, 10656 KiB  
Article
Lightweight YOLOv11n-Based Detection and Counting of Early-Stage Cabbage Seedlings from UAV RGB Imagery
by Rongrui Zhao, Rongxiang Luo, Xue Ding, Jiao Cui and Bangjin Yi
Horticulturae 2025, 11(8), 993; https://doi.org/10.3390/horticulturae11080993 - 21 Aug 2025
Abstract
This study proposes a lightweight adaptive neural network framework based on an improved YOLOv11n model to address the core challenges in identifying cabbage seedlings in visible light images captured by UAVs. These challenges include the loss of small-target features, poor adaptability to complex lighting conditions, and the low deployment efficiency of edge devices. First, the adaptive dual-path downsampling module (ADown) integrates average pooling and maximum pooling into a dual-branch structure to enhance background texture and crop edge features in a synergistic manner. Second, the Illumination Robust Contrast Learning Head (IRCLHead) utilizes a temperature-adaptive network to adjust the contrast loss function parameters dynamically. Combined with a dual-output supervision mechanism that integrates growth stage prediction and interference-resistant feature embedding, this module enhances the model’s robustness in complex lighting scenarios. Finally, a lightweight spatial-channel attention convolution module (LAConv) has been developed to optimize the model’s computational load by using multi-scale feature extraction paths and depth decomposition structures. Experiments demonstrate that the proposed architecture achieves an mAP@0.5 of 99.0% in detecting cabbage seedling growth cycles, improving upon the baseline model by 0.71 percentage points. Furthermore, it improves mAP@0.5:0.95 by 2.4 percentage points, reduces computational complexity (GFLOPs) by 12.7%, and drastically reduces inference time from 3.7 ms to 1.0 ms. Additionally, the model parameters are reduced by 3%. This model provides an efficient solution for the real-time counting of cabbage seedlings and lightweight operations in drone-based precision agriculture. Full article
(This article belongs to the Section Vegetable Production Systems)
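One plausible reading of the ADown idea, an average-pooling branch and a max-pooling branch that each halve the resolution before their outputs are concatenated, is sketched below; the channel split and branch kernels are assumptions, not the authors' exact layout.

```python
import torch
import torch.nn as nn

class DualPathDownsample(nn.Module):
    """Sketch of an ADown-style block: one branch favours smoothed context
    (average pooling), the other salient edges (max pooling); both halve the
    spatial resolution and their outputs are concatenated."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        half = out_ch // 2
        self.avg_branch = nn.Sequential(nn.AvgPool2d(2),               # smoothed context
                                        nn.Conv2d(in_ch // 2, half, 3, padding=1))
        self.max_branch = nn.Sequential(nn.MaxPool2d(2),               # salient edges
                                        nn.Conv2d(in_ch - in_ch // 2, half, 1))

    def forward(self, x):
        a, b = x.chunk(2, dim=1)            # split channels across the two paths
        return torch.cat([self.avg_branch(a), self.max_branch(b)], dim=1)

x = torch.randn(1, 64, 80, 80)
print(DualPathDownsample(64, 128)(x).shape)  # torch.Size([1, 128, 40, 40])
```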
18 pages, 7729 KiB  
Article
A Lightweight Traffic Sign Detection Model Based on Improved YOLOv8s for Edge Deployment in Autonomous Driving Systems Under Complex Environments
by Chen Xing, Haoran Sun and Jiafu Yang
World Electr. Veh. J. 2025, 16(8), 478; https://doi.org/10.3390/wevj16080478 - 21 Aug 2025
Abstract
Traffic sign detection is a core function of autonomous driving systems, requiring real-time and accurate target recognition in complex road environments. Existing lightweight detection models struggle to balance accuracy, efficiency, and robustness under computational constraints of vehicle-mounted edge devices. To address this, we propose a lightweight model integrating FasterNet, Efficient Multi-scale Attention (EMA), Bidirectional Feature Pyramid Network (BiFPN), and Group Separable Convolution (GSConv) based on YOLOv8s (FEBG-YOLOv8s). Key innovations include reconstructing the Cross Stage Partial Network 2 with Focus (C2f) module using FasterNet blocks to minimize redundant computation; integrating an EMA mechanism to enhance robustness against small and occluded targets; refining the neck network based on BiFPN via channel compression, downsampling layers, and skip connections to optimize shallow–deep semantic fusion; and designing a GSConv-based hybrid serial–parallel detection head (GSP-Detect) to preserve cross-channel information while reducing computational load. Experiments on Tsinghua–Tencent 100K (TT100K) show FEBG-YOLOv8s improves mean Average Precision at Intersection over Union 0.5 (mAP50) by 3.1% compared to YOLOv8s, with 4 million fewer parameters and 22.5% lower Giga Floating-Point Operations (GFLOPs). Generalizability experiments on the CSUST Chinese Traffic Sign Detection Benchmark (CCTSDB) validate robustness, with 3.3% higher mAP50, demonstrating its potential for real-time traffic sign detection on edge platforms. Full article
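GSConv is usually described as pairing a dense convolution with a cheap depthwise convolution and shuffling the concatenated channels; the sketch below follows that general description and is not taken from the FEBG-YOLOv8s code.

```python
import torch
import torch.nn as nn

class GSConv(nn.Module):
    """Sketch of a group/ghost-style separable convolution: a standard conv
    produces half the output channels, a cheap depthwise conv derives the
    other half from them, and a channel shuffle mixes the two groups."""
    def __init__(self, in_ch, out_ch, k=3, stride=1):
        super().__init__()
        half = out_ch // 2
        self.dense = nn.Conv2d(in_ch, half, k, stride, k // 2)
        self.cheap = nn.Conv2d(half, half, 5, 1, 2, groups=half)   # depthwise

    def forward(self, x):
        a = self.dense(x)
        b = self.cheap(a)
        y = torch.cat([a, b], dim=1)                 # (B, out_ch, H, W)
        B, C, H, W = y.shape                         # shuffle across the two groups
        return y.view(B, 2, C // 2, H, W).transpose(1, 2).reshape(B, C, H, W)

x = torch.randn(1, 64, 40, 40)
print(GSConv(64, 128)(x).shape)                      # torch.Size([1, 128, 40, 40])
```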
20 pages, 2239 KiB  
Article
Lightweight Financial Fraud Detection Using a Symmetrical GAN-CNN Fusion Architecture
by Yiwen Yang, Chengjun Xu and Guisheng Tian
Symmetry 2025, 17(8), 1366; https://doi.org/10.3390/sym17081366 - 21 Aug 2025
Abstract
With the rapid development of information technology and the deep integration of Internet platforms, the scale and forms of financial transactions continue to grow, significantly improving users’ payment experience and life efficiency. However, while financial transactions bring convenience, they also expose many security risks: money laundering, forged checks, and other forms of financial fraud occur frequently, seriously threatening the stability and security of the financial system. Due to the imbalance between the proportions of normal and abnormal transactions in the data, most existing deep learning-based methods still have obvious deficiencies in learning from minority-class samples, modeling context, and controlling computational complexity. To address these deficiencies, this paper proposes a symmetrical structure-based GAN-CNN model for lightweight financial fraud detection. The symmetrical structure improves feature extraction and fusion and enhances the model’s ability to recognize complex fraud patterns. Synthetic fraud samples are generated with a GAN to alleviate category imbalance. Multi-scale convolution and attention mechanisms are designed to extract local and global transaction features, and adaptive aggregation and context encoding modules are introduced to improve computational efficiency. We conducted repeated experiments on two public datasets, YelpChi and Amazon. The results show that on the Amazon dataset with a 50% training ratio, compared with the CNN-GAN model, the accuracy of our model improved by 1.64% while the number of parameters was reduced by approximately 88.4%. Compared with the hybrid CNN-LSTM–attention model under the same setting, accuracy improved by 0.70% and the number of parameters was reduced by approximately 87.6%. The symmetry-based lightweight architecture proposed in this work is novel in its structural design, and the experimental results show that it is both efficient and accurate in detecting imbalanced transactions. Full article
(This article belongs to the Section Computer)
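Generating synthetic fraud samples with a GAN to rebalance the classes, as the abstract describes, reduces to a standard generator/discriminator loop over minority-class feature vectors; the sketch below uses assumed feature and noise dimensions and simple MLPs, not the paper's symmetrical GAN-CNN architecture.

```python
import torch
import torch.nn as nn

FEAT_DIM, NOISE_DIM = 32, 16                     # assumed transaction-feature width

G = nn.Sequential(nn.Linear(NOISE_DIM, 64), nn.ReLU(), nn.Linear(64, FEAT_DIM))
D = nn.Sequential(nn.Linear(FEAT_DIM, 64), nn.LeakyReLU(0.2), nn.Linear(64, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_fraud):                      # real_fraud: (B, FEAT_DIM) minority samples
    B = real_fraud.size(0)
    # Discriminator: distinguish real fraud samples from generated ones.
    fake = G(torch.randn(B, NOISE_DIM)).detach()
    loss_d = bce(D(real_fraud), torch.ones(B, 1)) + bce(D(fake), torch.zeros(B, 1))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()
    # Generator: produce samples the discriminator labels as real.
    fake = G(torch.randn(B, NOISE_DIM))
    loss_g = bce(D(fake), torch.ones(B, 1))
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()

train_step(torch.randn(8, FEAT_DIM))             # after training, G(z) augments the fraud class
```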