Search Results (3,741)

Search Parameters:
Keywords = feature map scaling

18 pages, 2081 KB  
Article
Breast Ultrasound Image Segmentation Integrating Mamba-CNN and Feature Interaction
by Guoliang Yang, Yuyu Zhang and Hao Yang
Sensors 2026, 26(1), 105; https://doi.org/10.3390/s26010105 - 23 Dec 2025
Abstract
The large scale and shape variation in breast lesions make their segmentation extremely challenging. A breast ultrasound image segmentation model integrating Mamba-CNN and feature interaction is proposed for breast ultrasound images with heavy speckle noise and multiple artifacts. The model first uses the visual state space model (VSS) as an encoder for feature extraction to better capture long-range dependencies. Second, a hybrid attention enhancement mechanism (HAEM) is designed at the bottleneck between the encoder and the decoder to provide fine-grained control of the feature map in both the channel and spatial dimensions, so that the network captures key features and regions more comprehensively. The decoder uses transposed convolution to upsample the feature map, gradually increasing the resolution and recovering its spatial information. Finally, a cross-fusion module (CFM) is constructed to attend simultaneously to the spatial information of the shallow feature maps and the deep semantic information, which effectively reduces the interference of noise and artifacts. Experiments on the BUSI and UDIAT datasets yield a Dice similarity coefficient of 76.04% and an HD95 of 20.28 mm, showing that the algorithm effectively handles noise and artifacts in ultrasound image segmentation and improves on the segmentation performance of existing algorithms.
(This article belongs to the Section Sensing and Imaging)
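
For illustration, a bottleneck attention block that gates feature maps along both channel and spatial dimensions can be sketched in a few lines of PyTorch. This is a generic CBAM-style sketch of the idea, not the paper's actual HAEM; the class name, reduction ratio, and kernel size are assumptions.

```python
import torch
import torch.nn as nn

class HybridAttention(nn.Module):
    """Channel attention followed by spatial attention (CBAM-style sketch)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        # Channel branch: squeeze spatial dims, excite per channel.
        self.channel_mlp = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # Spatial branch: 7x7 conv over pooled channel statistics.
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x):
        x = x * self.channel_mlp(x)                        # reweight channels
        stats = torch.cat([x.mean(1, keepdim=True),
                           x.amax(1, keepdim=True)], dim=1)
        return x * self.spatial_conv(stats)                # reweight locations

feat = torch.randn(1, 64, 32, 32)
print(HybridAttention(64)(feat).shape)  # torch.Size([1, 64, 32, 32])
```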

30 pages, 3181 KB  
Article
PRA-Unet: Parallel Residual Attention U-Net for Real-Time Segmentation of Brain Tumors
by Ali Zakaria Lebani, Medjeded Merati and Saïd Mahmoudi
Information 2026, 17(1), 14; https://doi.org/10.3390/info17010014 - 23 Dec 2025
Abstract
With the increasing prevalence of brain tumors, fast and reliable segmentation in MRI scans has become crucial. Medical professionals struggle with manual tumor segmentation due to its exhausting and time-consuming nature. Automated segmentation speeds up decision-making and diagnosis; however, achieving an optimal balance between accuracy and computational cost remains a significant challenge. In many cases, current methods trade speed for accuracy, or vice versa, consuming substantial computing power and making them difficult to use on devices with limited resources. To address this issue, we present PRA-UNet, a lightweight deep learning model optimized for fast and accurate 2D brain tumor segmentation. Using a single 2D input, the architecture processes four types of MRI scans (FLAIR, T1, T1c, and T2). The encoder uses inverted residual blocks and bottleneck residual blocks to capture features at different scales effectively. The Convolutional Block Attention Module (CBAM) and the Spatial Attention Module (SAM) improve the bridge and skip connections by refining feature maps and making it easier to detect and localize brain tumors. The decoder uses depthwise separable convolutions, which significantly reduce computational costs without degrading accuracy. Experiments on the BraTS2020 dataset show that PRA-UNet achieves a Dice score of 95.71%, an accuracy of 99.61%, and a processing speed of 60 ms per image, enabling real-time analysis. PRA-UNet outperforms other models in segmentation while requiring less computing power, suggesting it could be suitable for deployment on lightweight edge devices in clinical settings. Its speed and reliability enable radiologists to diagnose tumors quickly and accurately, enhancing practical medical applications.
(This article belongs to the Special Issue Feature Papers in Information in 2024–2025)
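
The decoder's main cost saving, depthwise separable convolution, is a standard building block: a per-channel 3x3 convolution followed by a 1x1 pointwise mix, giving roughly an 8–9x FLOP reduction versus a dense 3x3 convolution at the same width. A minimal sketch (channel sizes are illustrative, not taken from PRA-UNet):

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """3x3 depthwise conv (groups=in_ch) + 1x1 pointwise conv."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

x = torch.randn(1, 32, 64, 64)
print(DepthwiseSeparableConv(32, 64)(x).shape)  # torch.Size([1, 64, 64, 64])
```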

24 pages, 8036 KB  
Article
MarsTerrNet: A U-Shaped Dual-Backbone Framework with Feature-Guided Loss for Martian Terrain Segmentation
by Rui Wang, Jimin Sun, Kefa Zhou, Jinlin Wang, Jiantao Bi, Qing Zhang, Wei Wang, Guangjun Qu, Chao Li and Heshun Qiu
Remote Sens. 2026, 18(1), 35; https://doi.org/10.3390/rs18010035 - 23 Dec 2025
Abstract
Accurate terrain perception is essential for safe rover operations and reliable geotechnical interpretation of Martian surfaces. The heterogeneous scales, colors, and textures of Martian terrain present significant challenges for semantic segmentation. We present MarsTerrNet, a dual-backbone segmentation framework that combines Progressive Residual Blocks (PRB) with a Swin Transformer to jointly capture fine-grained local details and global contextual dependencies. To further enhance discrimination among geologically correlated classes, we design a feature-guided loss that aligns representative features across terrain categories and reduces confusion between visually similar but physically distinct types. For comprehensive evaluation, we establish MarsTerr2024, an extended dataset derived from the Curiosity rover, providing diverse geological scenes for terrain understanding. Experimental results show that MarsTerrNet achieves state-of-the-art performance and produces geologically consistent segmentation results, supporting automated mapping and geotechnical assessment for future Mars exploration missions.
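
The abstract describes a feature-guided loss that aligns representative features within each terrain class while separating visually similar classes. One common realization of that general idea is a prototype pull-push loss; the sketch below is an illustration under that assumption, not MarsTerrNet's actual formulation, and the `margin` hinge is invented for the example.

```python
import torch
import torch.nn.functional as F

def feature_guided_loss(feats, labels, margin=1.0):
    """Pull each feature toward its class mean; push class means apart."""
    classes = labels.unique()
    protos = torch.stack([feats[labels == c].mean(0) for c in classes])
    # Intra-class compactness.
    pull = torch.stack([((feats[labels == c] - protos[i]) ** 2).sum(1).mean()
                        for i, c in enumerate(classes)]).mean()
    # Inter-class separation with a hinge margin.
    dists = torch.cdist(protos, protos)
    off_diag = dists[~torch.eye(len(classes), dtype=torch.bool)]
    push = F.relu(margin - off_diag).mean()
    return pull + push

feats = torch.randn(16, 8)            # 16 pixel embeddings of dimension 8
labels = torch.randint(0, 3, (16,))   # 3 terrain classes
print(feature_guided_loss(feats, labels))
```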

27 pages, 5218 KB  
Article
A System-Level Approach to Pixel-Based Crop Segmentation from Ultra-High-Resolution UAV Imagery
by Aisulu Ismailova, Moldir Yessenova, Gulden Murzabekova, Jamalbek Tussupov and Gulzira Abdikerimova
Appl. Syst. Innov. 2026, 9(1), 3; https://doi.org/10.3390/asi9010003 - 22 Dec 2025
Abstract
This paper proposes a two-level hybrid stacking model for the classification of crops—wheat, soybean, and barley—based on multispectral orthomosaics obtained from uncrewed aerial vehicles. The proposed method unites gradient boosting algorithms (LightGBM, XGBoost, CatBoost), tree ensembles (RandomForest, ExtraTrees), and an Attention-MLP deep neural network, whose predictions are fused at the meta-level using an ExtraTreesClassifier. Spectral channels, along with a wide range of vegetation indices and their statistical characteristics, are used to construct the feature space. Experiments on an open dataset showed that the proposed model achieves high classification accuracy (Accuracy ≈ 95%, macro-F1 ≈ 0.95) and significantly outperforms individual algorithms across all key metrics. An analysis of the seasonal dynamics of vegetation indices confirmed the feasibility of monitoring phenological phases and early detection of stress factors. Furthermore, spatial segmentation of orthomosaics achieved approximately 99% accuracy in constructing crop maps, making the developed approach a promising tool for precision farming. The results showed the high potential of hybrid ensembles for scaling to other crops and regions, as well as for integration into digital agricultural information systems.
(This article belongs to the Section Information Systems)
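
The two-level stacking design maps directly onto scikit-learn's StackingClassifier: base learners produce out-of-fold predictions that train an ExtraTrees meta-learner. The sketch below keeps only the scikit-learn ensembles; the paper's LightGBM, XGBoost, CatBoost, and Attention-MLP base learners are omitted to stay dependency-free, and the synthetic features are stand-ins for the spectral bands and vegetation indices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (ExtraTreesClassifier, RandomForestClassifier,
                              StackingClassifier)

# Synthetic stand-in for per-pixel spectral/index features of 3 crop classes.
X, y = make_classification(n_samples=500, n_features=20, n_classes=3,
                           n_informative=8, random_state=0)

stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("et", ExtraTreesClassifier(n_estimators=100, random_state=0)),
    ],
    final_estimator=ExtraTreesClassifier(n_estimators=200, random_state=0),
    cv=5,  # out-of-fold base predictions feed the meta-level
)
stack.fit(X, y)
print(stack.score(X, y))
```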

36 pages, 2348 KB  
Article
LSTM-CA-YOLOv11: A Road Sign Detection Model Integrating LSTM Temporal Modeling and Multi-Scale Attention Mechanism
by Tianlei Ye, Yajie Pang, Yihong Li, Enming Liang, Yunfei Wang and Tong Zhou
Appl. Sci. 2026, 16(1), 116; https://doi.org/10.3390/app16010116 - 22 Dec 2025
Abstract
Traffic sign detection is crucial for intelligent transportation and autonomous driving, yet it faces challenges such as illumination variations, occlusions, and scale changes that impact accuracy. To address these issues, this paper proposes the LSTM-CA-YOLOv11 model. This approach pioneers the integration of a Bi-LSTM (Bi-directional Long Short-Term Memory) into the YOLOv11 backbone network to model spatial-sequence dependencies, thereby enhancing structured feature extraction capabilities. The lightweight CA (Coordinate Attention) module encodes precise positional information by capturing horizontal and vertical features. The MSEF (Multi-Scale Enhancement Fusion) module addresses scale variations through parallel convolutional and pooling branches with adaptive fusion processing. We further introduce the SPP-Plus (Spatial Pyramid Pooling-Plus) module to expand the receptive field while preserving fine details, and employ a focus IoU (Intersection over Union) loss to prioritise challenging samples, thereby improving regression accuracy. On a private dataset comprising 10,231 images, experiments demonstrate that this model achieves a mAP@0.5 of 93.4% and a mAP@0.5:0.95 of 79.5%, representing improvements of 5.3% and 4.7% over the baseline, respectively. Furthermore, the model's generalisation performance on the public TT100K (Tsinghua-Tencent 100K) dataset surpassed the latest YOLOv13n by 5.3% in mAP@0.5 and 3.9% in mAP@0.5:0.95, demonstrating robust cross-dataset capabilities and exceptional practical deployment feasibility.
(This article belongs to the Special Issue AI in Object Detection)
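
Coordinate attention (Hou et al., CVPR 2021) has a well-known structure: pool along each spatial axis separately so the resulting gates keep positional information along the other axis. A minimal sketch (the reduction ratio and use of mean pooling are conventional choices, not details from this paper):

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Axis-wise pooling + shared bottleneck + per-axis sigmoid gates."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        mid = max(8, channels // reduction)
        self.shared = nn.Sequential(
            nn.Conv2d(channels, mid, 1),
            nn.BatchNorm2d(mid),
            nn.ReLU(inplace=True),
        )
        self.to_h = nn.Conv2d(mid, channels, 1)
        self.to_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x):
        n, c, h, w = x.shape
        pool_h = x.mean(dim=3, keepdim=True)                     # (n, c, h, 1)
        pool_w = x.mean(dim=2, keepdim=True)                     # (n, c, 1, w)
        y = torch.cat([pool_h, pool_w.transpose(2, 3)], dim=2)   # (n, c, h+w, 1)
        y = self.shared(y)
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.to_h(y_h))                      # height gate
        a_w = torch.sigmoid(self.to_w(y_w.transpose(2, 3)))      # width gate
        return x * a_h * a_w

print(CoordinateAttention(64)(torch.randn(2, 64, 16, 16)).shape)
```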
23 pages, 5771 KB  
Article
F3M: A Frequency-Domain Feature Fusion Module for Robust Underwater Object Detection
by Tianyi Wang, Haifeng Wang, Wenbin Wang, Kun Zhang, Baojiang Ye and Huilin Dong
J. Mar. Sci. Eng. 2026, 14(1), 20; https://doi.org/10.3390/jmse14010020 - 22 Dec 2025
Abstract
In this study, we propose the Frequency-domain Feature Fusion Module (F3M) to address the challenges of underwater object detection, where optical degradation—particularly high-frequency attenuation and low-frequency color distortion—significantly compromises performance. We critically re-evaluate the need for strict invertibility in detection-oriented frequency modeling. Traditional wavelet-based methods incur high computational redundancy to maintain signal reconstruction, whereas F3M introduces a lightweight “Separate–Project–Fuse” paradigm. This mechanism decouples low-frequency illumination artifacts from high-frequency structural cues via spatial approximation, enabling the recovery of fine-scale details like coral textures and debris boundaries without the overhead of channel expansion. We validate F3M’s versatility by integrating it into both Convolutional Neural Networks (YOLO) and Transformer-based detectors (RT-DETR). Evaluations on the SCoralDet dataset show consistent improvements: F3M enhances the lightweight YOLO11n by 3.5% mAP50 and increases RT-DETR-n’s localization accuracy (mAP50–95) from 0.514 to 0.532. Additionally, cross-domain validation on the deep-sea TrashCan-Instance dataset shows F3M achieving comparable accuracy to the larger YOLOv8n while requiring 13% fewer parameters and 20% fewer GFLOPs. This study confirms that frequency-domain modulation provides an efficient and widely applicable enhancement for real-time underwater perception.
(This article belongs to the Section Ocean Engineering)
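
The "Separate–Project–Fuse" idea can be approximated cheaply: a blurred copy of the feature map stands in for the low-frequency band, the residual carries the high-frequency structure, and each band is projected before fusion. The module below is a loose sketch of that reading, with the blur kernel and 1x1 projections as assumptions; it is not the paper's F3M.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SeparateProjectFuse(nn.Module):
    """Split features into low/high-frequency bands, project, and fuse."""
    def __init__(self, channels, blur_kernel=5):
        super().__init__()
        self.blur_kernel = blur_kernel
        self.proj_low = nn.Conv2d(channels, channels, 1)
        self.proj_high = nn.Conv2d(channels, channels, 1)
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, x):
        k = self.blur_kernel
        low = F.avg_pool2d(x, k, stride=1, padding=k // 2)  # cheap low-pass
        high = x - low                                      # high-freq residual
        return self.fuse(torch.cat([self.proj_low(low),
                                    self.proj_high(high)], dim=1))

print(SeparateProjectFuse(32)(torch.randn(1, 32, 40, 40)).shape)
```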

27 pages, 3103 KB  
Article
IHBOFS: A Biomimetics-Inspired Hybrid Breeding Optimization Algorithm for High-Dimensional Feature Selection
by Chunli Xiang, Jing Zhou and Wen Zhou
Biomimetics 2026, 11(1), 3; https://doi.org/10.3390/biomimetics11010003 - 22 Dec 2025
Abstract
With the explosive growth of data across various fields, effective data preprocessing has become increasingly critical. Evolutionary and swarm intelligence algorithms have shown considerable potential in feature selection. However, their performance often deteriorates on large-scale problems due to premature convergence and limited exploration ability. To address these limitations, this paper proposes IHBOFS, a biomimetics-inspired optimization framework that integrates multiple adaptive strategies to enhance performance and stability. The introduction of the Good Point Set and Elite Opposition-Based Learning mechanisms provides the population with a well-distributed and diverse initialization. Furthermore, adaptive exploitation–exploration balancing strategies are designed for each subpopulation, effectively mitigating premature convergence. Extensive ablation studies on the CEC2022 benchmark functions verify the effectiveness of these strategies. Considering the discrete nature of feature selection, IHBOFS is further extended with continuous-to-discrete mapping functions and applied to six real-world datasets. Comparative experiments against nine metaheuristic-based methods, including Harris Hawk Optimization (HHO) and Ant Colony Optimization (ACO), demonstrate that IHBOFS achieves an average classification accuracy of 92.57%, confirming its superiority and robustness in high-dimensional feature selection tasks.
(This article belongs to the Section Biological Optimisation and Management)
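
Continuous-to-discrete mapping in wrapper feature selection usually passes each agent's position through an S-shaped transfer function and samples a binary mask. The sketch below shows that generic step; IHBOFS's exact mapping functions are not reproduced, and the keep-one-feature fallback is an added convention.

```python
import numpy as np

def sigmoid_transfer(position, rng):
    """Map a continuous agent position to a binary feature mask."""
    prob = 1.0 / (1.0 + np.exp(-position))        # S-shaped transfer function
    mask = rng.random(position.shape) < prob
    if not mask.any():                            # keep at least one feature
        mask[rng.integers(position.size)] = True
    return mask

rng = np.random.default_rng(0)
position = rng.normal(size=10)          # one agent over 10 candidate features
print(sigmoid_transfer(position, rng))  # boolean feature mask
```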

24 pages, 4820 KB  
Article
YOLOv11-SAFM: Enhancing Landslide Detection in Complex Mountainous Terrain Through Spatial Feature Adaptation
by Cheng Zhang, Bo-Hui Tang, Fangliang Cai, Menghua Li and Dong Fan
Remote Sens. 2026, 18(1), 24; https://doi.org/10.3390/rs18010024 - 22 Dec 2025
Abstract
Landslide detection in mountainous regions remains highly challenging due to complex terrain conditions, heterogeneous surface textures, and the fragmented distribution of landslide features. To address these limitations, this study proposes an enhanced object detection framework named YOLOv11-SAFM, which integrates a Spatially Adaptive Feature Modulation (SAFM) module, an optimized MPDIoU-based bounding box regression loss, and a multi-scale training strategy. These improvements strengthen the model's ability to detect small-scale landslides with blurred edges under complex geomorphic conditions. A high-resolution remote sensing dataset was constructed using imagery from Bijie and Zhaotong in southwest China, including GF-2 optical imagery at 1 m resolution and Sentinel-2 data at 10 m resolution, for model training and validation, while independent data from Zhenxiong County were used to assess generalization capability. Experimental results demonstrate that YOLOv11-SAFM achieves a precision of 95.05%, a recall of 90.10%, an F1-score of 92.51%, and a mAP@0.5 of 95.30% on the independent test set of the Zhaotong–Bijie dataset for detecting small-scale landslides in rugged plateau environments. Compared with the widely used Mask R-CNN, the proposed model improves precision by 13.87% and mAP@0.5 by 15.7%; against the traditional YOLOv8, it increases recall by 27.0% and F1-score by 22.47%. YOLOv11-SAFM enables efficient and robust automatic landslide detection in complex mountainous terrain and shows strong potential for integration into operational geohazard monitoring and early warning systems.
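
MPDIoU is a published bounding-box regression loss whose core idea is to penalize IoU by the squared distances between the two boxes' top-left and bottom-right corners, normalized by the image size. A sketch under that reading (the (x1, y1, x2, y2) layout and the epsilon are assumptions):

```python
import torch

def mpdiou_loss(pred, target, img_w, img_h):
    """1 - MPDIoU for boxes given as (x1, y1, x2, y2) rows."""
    # Intersection over union.
    lt = torch.max(pred[:, :2], target[:, :2])
    rb = torch.min(pred[:, 2:], target[:, 2:])
    wh = (rb - lt).clamp(min=0)
    inter = wh[:, 0] * wh[:, 1]
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + 1e-7)
    # Squared corner distances, normalized by the squared image diagonal.
    d1 = ((pred[:, :2] - target[:, :2]) ** 2).sum(1)
    d2 = ((pred[:, 2:] - target[:, 2:]) ** 2).sum(1)
    norm = img_w ** 2 + img_h ** 2
    return (1 - (iou - d1 / norm - d2 / norm)).mean()

pred = torch.tensor([[10., 10., 50., 50.]])
gt = torch.tensor([[12., 8., 48., 55.]])
print(mpdiou_loss(pred, gt, img_w=640, img_h=640))
```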

18 pages, 4935 KB  
Article
Automated Hurricane Damage Classification for Sustainable Disaster Recovery Using 3D LiDAR and Machine Learning: A Post-Hurricane Michael Case Study
by Jackson Kisingu Ndolo, Ivan Oyege and Leonel Lagos
Sustainability 2026, 18(1), 90; https://doi.org/10.3390/su18010090 - 21 Dec 2025
Abstract
Accurate mapping of hurricane-induced damage is essential for guiding rapid disaster response and long-term recovery planning. This study evaluates the Three-Dimensional Multi-Attributes, Multiscale, Multi-Cloud (3DMASC) framework for semantic classification of pre- and post-hurricane Light Detection and Ranging (LiDAR) data, using Mexico Beach, Florida, as a case study following Hurricane Michael. The goal was to assess the framework’s ability to classify stable landscape features and detect damage-specific classes in a highly complex post-disaster environment. Bitemporal topo-bathymetric LiDAR datasets from 2017 (pre-event) and 2018 (post-event) were processed to extract more than 80 geometric, radiometric, and echo-based features at multiple spatial scales. A Random Forest classifier was trained on a 2.37 km² pre-hurricane area (Zone A) and evaluated on an independent 0.95 km² post-hurricane area (Zone B). Pre-hurricane classification achieved an overall accuracy of 0.9711, with stable classes such as ground, water, and buildings achieving precision and recall exceeding 0.95. Post-hurricane classification maintained similar accuracy; however, damage-related classes exhibited lower performance, with debris reaching an F1-score of 0.77, damaged buildings 0.58, and vehicles recording a recall of only 0.13. These results indicate that the workflow is effective for rapid mapping of persistent structures, with additional refinements needed for detailed damage classification. Misclassifications were concentrated along class boundaries and in structurally ambiguous areas, consistent with known LiDAR limitations in disaster contexts. These results demonstrate the robustness and spatial transferability of the 3DMASC–Random Forest approach for disaster mapping. Integrating multispectral data, improving small-object representation, and incorporating automated debris volume estimation could further enhance classification reliability, enabling faster, more informed post-disaster decision-making. By enabling rapid, accurate damage mapping, this approach supports sustainable disaster recovery, resource-efficient debris management, and resilience planning in hurricane-prone regions.
(This article belongs to the Section Sustainable Urban and Rural Development)
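
The train-on-Zone-A, test-on-Zone-B protocol is a plain supervised classification over per-point feature vectors. A minimal scikit-learn sketch with synthetic stand-ins for the more than 80 multi-scale geometric, radiometric, and echo-based features (the class count and array shapes are illustrative):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

# Synthetic stand-ins: Zone A trains, Zone B tests spatial transferability.
rng = np.random.default_rng(0)
X_zone_a, y_zone_a = rng.normal(size=(1000, 80)), rng.integers(0, 5, 1000)
X_zone_b, y_zone_b = rng.normal(size=(400, 80)), rng.integers(0, 5, 400)

clf = RandomForestClassifier(n_estimators=200, n_jobs=-1, random_state=0)
clf.fit(X_zone_a, y_zone_a)
print(classification_report(y_zone_b, clf.predict(X_zone_b)))
```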

15 pages, 43560 KB  
Article
Research on Traffic Sign Detection Algorithm Based on Improved YOLO11n
by Haonan Feng, Jiaxu Meng, Zhiyong Guo, Pengchao Zhao, Wenchao Zhang, Yiran Cao and Cunman Liang
Technologies 2026, 14(1), 4; https://doi.org/10.3390/technologies14010004 - 21 Dec 2025
Abstract
In order to improve detection accuracy while minimizing computational overhead, a modified algorithm is proposed based on the YOLO11n baseline. The innovation incorporates a lightweight ADown module into the P4 and P5 layers of the backbone network, strategically reducing computational complexity. Simultaneously, a multi-scale attention mechanism with a parallel structure is integrated into the detection head to enhance feature representation, while a micro-detection head is appended to specifically improve the detection of tiny objects. Ablation experiments on the CCTSDB2021 dataset validate the contribution of each modification using classic metrics, including parameter count, mAP@50, mAP@50-95, recall, and FPS. Furthermore, comparative experiments against traditional YOLO variants are conducted on both the CCTSDB2021 and TT100K-2021 datasets. Experimental results demonstrate significant improvements across all evaluated metrics for the proposed algorithm, highlighting its exceptional capability to balance high accuracy with minimal computational complexity.
(This article belongs to the Special Issue Advanced Intelligent Driving Technology)
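
ADown originates in YOLOv9: it halves resolution by splitting the channels between a strided 3x3 convolution and a max-pool plus 1x1 convolution branch, then concatenating. The sketch below follows that published structure, with a conv-BN-SiLU block assumed as the basic convolution:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_bn_act(c_in, c_out, k, s, p):
    return nn.Sequential(nn.Conv2d(c_in, c_out, k, s, p, bias=False),
                         nn.BatchNorm2d(c_out), nn.SiLU(inplace=True))

class ADown(nn.Module):
    """Split-channel downsampling: strided conv on one half,
    max-pool + 1x1 conv on the other, concatenated."""
    def __init__(self, c1, c2):
        super().__init__()
        self.cv1 = conv_bn_act(c1 // 2, c2 // 2, 3, 2, 1)
        self.cv2 = conv_bn_act(c1 // 2, c2 // 2, 1, 1, 0)

    def forward(self, x):
        x = F.avg_pool2d(x, 2, stride=1)
        x1, x2 = x.chunk(2, dim=1)
        x1 = self.cv1(x1)
        x2 = self.cv2(F.max_pool2d(x2, 3, stride=2, padding=1))
        return torch.cat([x1, x2], dim=1)

print(ADown(64, 128)(torch.randn(1, 64, 64, 64)).shape)  # (1, 128, 32, 32)
```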

17 pages, 2395 KB  
Article
A Structurally Optimized and Efficient Lightweight Object Detection Model for Autonomous Driving
by Mingjing Li, Junshuai Wang, Shuang Chen, LinLin Liu, KaiJie Li, Zengzhi Zhao and Haijiao Yun
Sensors 2026, 26(1), 54; https://doi.org/10.3390/s26010054 - 21 Dec 2025
Abstract
Object detection plays a pivotal role in safety-critical applications, including autonomous driving, intelligent surveillance, and unmanned aerial systems. However, many state-of-the-art detectors remain highly resource-intensive; their large parameter sizes and substantial floating-point operations make it difficult to balance accuracy and efficiency, particularly under constrained computational budgets. To mitigate this accuracy–efficiency trade-off, we propose FE-YOLOv8, a lightweight yet more effective variant of YOLOv8 (You Only Look Once version 8). Specifically, two architectural refinements are introduced: (1) C2f-Faster (Cross-Stage-Partial 2-Conv Faster Block) modules embedded in both the backbone and neck, where PConv (partial convolution) prunes redundant computations without diminishing representational capacity; and (2) an EfficientHead detection head that integrates EMSConv (Efficient Multi-Scale Convolution) to enhance multi-scale feature fusion while simplifying the head design and maintaining low computational complexity. Extensive ablation and comparative experiments on the SODA-10M dataset show that FE-YOLOv8 reduces the parameter count by 31.09% and the computational cost by 43.31% relative to baseline YOLOv8 while achieving comparable or superior mean Average Precision (mAP). Generalization experiments conducted on the BDD100K dataset further validate these improvements, demonstrating that FE-YOLOv8 achieves a favorable balance between accuracy and efficiency within the YOLOv8 family and provides new architectural insights for lightweight object detector design.
(This article belongs to the Section Vehicular Sensing)
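
PConv here refers to FasterNet's partial convolution: a 3x3 convolution touches only a fraction of the channels while the rest pass through untouched, cutting FLOPs and memory access. A minimal sketch (the 1/4 ratio is FasterNet's default, assumed here rather than taken from FE-YOLOv8):

```python
import torch
import torch.nn as nn

class PConv(nn.Module):
    """Partial convolution: convolve the first 1/ratio of the channels only."""
    def __init__(self, channels, ratio=4):
        super().__init__()
        self.conv_ch = channels // ratio
        self.conv = nn.Conv2d(self.conv_ch, self.conv_ch, 3, padding=1)

    def forward(self, x):
        x1, x2 = x[:, :self.conv_ch], x[:, self.conv_ch:]
        return torch.cat([self.conv(x1), x2], dim=1)

print(PConv(64)(torch.randn(1, 64, 32, 32)).shape)  # shape unchanged
```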

24 pages, 3622 KB  
Article
Deep Learning-Based Intelligent Monitoring of Petroleum Infrastructure Using High-Resolution Remote Sensing Imagery
by Nannan Zhang, Hang Zhao, Pengxu Jing, Yan Gao, Song Liu, Jinli Shen, Shanhong Huang, Qihong Zeng, Yang Liu and Miaofen Huang
Processes 2026, 14(1), 28; https://doi.org/10.3390/pr14010028 - 20 Dec 2025
Abstract
The rapid advancement of high-resolution remote sensing technology has significantly expanded observational capabilities in the oil and gas sector, enabling more precise identification of petroleum infrastructure. Remote sensing now plays a critical role in providing real-time, continuous monitoring, yet manual interpretation remains the predominant approach and is plagued by multiple limitations. To overcome these limitations in large-scale monitoring of upstream petroleum assets, this study develops an end-to-end, deep learning-driven framework for intelligent extraction of key oilfield targets from high-resolution remote sensing imagery. The specific aims are: (1) to leverage temporal diversity in imagery to construct a representative training dataset; and (2) to automate multi-class detection of well sites, production discharge pools, and storage facilities with high precision. Leveraging the temporal richness of multi-temporal satellite data, a geolocation-based sampling strategy was adopted to construct a dedicated petroleum remote sensing dataset comprising over 8000 images and more than 30,000 annotated targets across three key classes: well pads, production ponds, and storage facilities. Four state-of-the-art object detection models were evaluated—two-stage frameworks (Faster R-CNN, Mask R-CNN) and single-stage algorithms (YOLOv3, YOLOv4)—with transfer learning integrated to improve accuracy, generalization, and robustness. Experimental results demonstrate that two-stage detectors significantly outperform their single-stage counterparts in terms of mean Average Precision (mAP). Specifically, the Mask R-CNN model, enhanced through transfer learning, achieved an mAP of 89.2% across all classes, exceeding the best-performing single-stage model (YOLOv4) by 11 percentage points. This performance gap highlights the trade-off between speed and accuracy inherent in single-shot detection models, which prioritize real-time inference at the expense of precision. Additionally, comparative analysis among similar architectures confirmed that newer versions (e.g., YOLOv4 over YOLOv3) and the incorporation of transfer learning consistently yield accuracy improvements of 2–4%, underscoring the effectiveness of transfer learning in remote sensing applications. Three oilfield areas were selected for practical application. The results indicate that the constructed model can automatically extract multiple target categories simultaneously, with average detection accuracies of 84% for well sites and 77% for production ponds. For multi-class targets over 100 square kilometers, detection that previously required one day of manual work now takes only one hour.
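
The transfer-learning step for a two-stage detector is well served by torchvision's standard recipe: start from COCO-pretrained weights and replace the box predictor with one sized for the task's classes. A sketch for the three oilfield classes plus background (the weights flag requires a recent torchvision; the training loop is omitted):

```python
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# COCO-pretrained backbone and heads; downloads weights on first use.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")

# Swap the box head: 3 oilfield classes (well pads, ponds, storage) + background.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=4)

# Fine-tune as usual with an optimizer over model.parameters().
print(sum(p.numel() for p in model.parameters() if p.requires_grad))
```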

21 pages, 6979 KB  
Article
A Lightweight Edge-Deployable Framework for Intelligent Rice Disease Monitoring Based on Pruning and Distillation
by Wei Liu, Baoquan Duan, Zhipeng Fan, Ming Chen and Zeguo Qiu
Sensors 2026, 26(1), 35; https://doi.org/10.3390/s26010035 - 20 Dec 2025
Abstract
Digital agriculture and smart farming require crop health monitoring methods that balance detection accuracy with computational cost. Rice leaf diseases threaten yield, while field images often contain small multi-scale lesions, variable illumination, and cluttered backgrounds. This paper investigates SCD-YOLOv11n, a lightweight detector designed with these constraints in mind. The model replaces the YOLOv11n backbone with a StarNet backbone and integrates a C3k2-Star module to enhance fine-grained, multi-scale feature extraction. A Detail-Strengthened Cross-scale Detection (DSCD) head is further introduced to improve localization of small lesions. On this architecture, we design a DepGraph-based mixed group-normalization pruning rule and apply channel-wise feature distillation to recover performance after pruning. Experiments on a public rice leaf disease dataset show that the compressed model requires 1.9 MB of storage, achieves 97.4% mAP@50 and 76.2% mAP@50:95, and attains a measured speed of 184 FPS under the tested settings. These results provide a quantitative reference for designing lightweight object detectors for rice disease monitoring in digital agriculture scenarios.
(This article belongs to the Topic Digital Agriculture, Smart Farming and Crop Monitoring)
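
Channel-wise feature distillation typically treats each channel's activation map as a distribution over spatial locations and trains the pruned student to match the teacher with a temperature-scaled KL term. The sketch below illustrates that standard form (the temperature and reduction are conventional choices, not the paper's exact recipe):

```python
import torch
import torch.nn.functional as F

def channel_distill_loss(student_feat, teacher_feat, tau=4.0):
    """Per-channel spatial softmax + KL between student and teacher maps."""
    n, c, h, w = student_feat.shape
    s = F.log_softmax(student_feat.reshape(n, c, -1) / tau, dim=-1)
    t = F.softmax(teacher_feat.reshape(n, c, -1) / tau, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * tau ** 2

student = torch.randn(2, 64, 20, 20)   # pruned model's feature map
teacher = torch.randn(2, 64, 20, 20)   # original model's feature map
print(channel_distill_loss(student, teacher))
```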

22 pages, 1922 KB  
Article
Research on Propeller Defect Diagnosis of Rotor UAVs Based on MDI-STFFNet
by Beining Cui, Dezhi Jiang, Xinyu Wang, Lv Xiao, Peisen Tan, Yanxia Li and Zhaobin Tan
Symmetry 2026, 18(1), 3; https://doi.org/10.3390/sym18010003 - 19 Dec 2025
Abstract
To address flight safety risks from rotor defects in rotorcraft drones operating in complex low-altitude environments, this study proposes a high-precision diagnostic model based on the Multimodal Data Input and Spatio-Temporal Feature Fusion Network (MDI-STFFNet). The model uses a dual-modality coupling mechanism that integrates vibration and air pressure signals, forming a “single-path temporal, dual-path representational” framework. The one-dimensional vibration signal and the five-channel pressure array are mapped into a texture space via phase space reconstruction and color-coded recurrence plots, followed by extraction of transient spatial features using a pre-trained ResNet-18 model. Parallel LSTM networks capture long-term temporal dependencies, while a parameter-free 1D max-pooling layer compresses redundant pressure data, limiting LSTM parameter growth. The CSW-FM module enables adaptive fusion across modal scales via shared-weight mapping and learnable query vectors that dynamically assign spatiotemporal weights. Experiments on a self-built dataset with seven defect types show that the model achieves 99.01% accuracy, an improvement of 4.46% and 1.98% over single-modality vibration and pressure inputs, respectively. Ablation studies confirm the benefits of spatiotemporal fusion and soft weighting for accuracy and robustness. The model provides a scalable, lightweight solution for UAV power system fault diagnosis under high-noise and varying operating conditions.
(This article belongs to the Section Engineering and Materials)
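
Phase space reconstruction plus a recurrence plot is a standard way to turn a 1D signal into an image a CNN can ingest: time-delay embedding builds state vectors, and the recurrence matrix marks pairs of states closer than a threshold. A minimal sketch (embedding dimension, delay, and threshold are illustrative; the paper additionally color-codes the plots):

```python
import numpy as np

def recurrence_plot(signal, dim=3, delay=2, eps=0.5):
    """Time-delay embedding, then R[i, j] = 1 where states are within eps."""
    n = len(signal) - (dim - 1) * delay
    # Each row is one reconstructed state vector.
    states = np.stack([signal[i:i + n]
                       for i in range(0, dim * delay, delay)], axis=1)
    dists = np.linalg.norm(states[:, None, :] - states[None, :, :], axis=-1)
    return (dists < eps).astype(np.uint8)

t = np.linspace(0, 8 * np.pi, 400)
sig = np.sin(t) + 0.1 * np.random.default_rng(0).normal(size=400)
print(recurrence_plot(sig).shape)  # (n_states, n_states) binary image
```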
23 pages, 7391 KB  
Article
TSE-YOLO: A Model for Tomato Ripeness Segmentation
by Liangquan Jia, Xinhui Yuan, Ze Chen, Tao Wang, Lu Gao, Guosong Gu, Xuechun Wang and Yang Wang
Agriculture 2026, 16(1), 8; https://doi.org/10.3390/agriculture16010008 - 19 Dec 2025
Abstract
Accurate and efficient tomato ripeness estimation is crucial for robotic harvesting and supply chain grading in smart agriculture. However, manual visual inspection is subjective, slow, and difficult to scale, while existing vision models often struggle with cluttered field backgrounds, small targets, and limited throughput. To overcome these limitations, we introduce TSE-YOLO, an improved real-time detector tailored for tomato ripeness estimation with joint detection and segmentation. The TSE-YOLO model introduces three key enhancements. The C2PSA module is improved with ConvGLU, adapted from TransNeXt, to strengthen feature extraction within tomato regions. A novel segmentation head is designed to accelerate ripeness-aware segmentation and improve recall. Additionally, the C3k2 module is augmented with partial and frequency-dynamic convolutions, enhancing feature representation under complex planting conditions. These components enable precise instance-level localization and pixel-wise segmentation of tomatoes at three ripeness stages: green, semi-ripe, and ripe. Experiments on a self-constructed tomato ripeness dataset demonstrate that TSE-YOLO achieves 92.5% mAP@0.5 for detection and 92.2% mAP@0.5 for segmentation with only 9.8 GFLOPs. Deployed on Android via the NCNN inference framework, the model runs at 30 fps on a Dimensity 9300 chipset, offering a practical solution for automated tomato harvesting and grading that accelerates the industrial adoption of smart agriculture.
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)
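
A gated convolution in the spirit of ConvGLU pairs a value branch with a locally conditioned gate so that nearby content modulates the features. The sketch below is an assumption-heavy illustration of that gating pattern, not TransNeXt's or TSE-YOLO's exact module:

```python
import torch
import torch.nn as nn

class ConvGLU(nn.Module):
    """1x1 value branch gated by a depthwise 3x3 + sigmoid branch."""
    def __init__(self, channels):
        super().__init__()
        self.value = nn.Conv2d(channels, channels, 1)
        self.gate = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, groups=channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.value(x) * self.gate(x)

print(ConvGLU(32)(torch.randn(1, 32, 24, 24)).shape)
```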
