Search Results (3,230)

Search Parameters:
Keywords = multi-scale feature fusion

17 pages, 4698 KB  
Article
Robust Feature Recognition of Slab Edges in Complex Industrial Environments Based on a Deep Dense Perception Network Model
by Yang Liu, Meiqin Liang, Xuejun Zhang and Junqi Yuan
Metals 2026, 16(4), 378; https://doi.org/10.3390/met16040378 (registering DOI) - 28 Mar 2026
Abstract
Defect detection in the hot rolling process is closely linked to the quality of the final product. Among these defects, slab camber during the intermediate rolling stage is one of the primary manifestations of asymmetry, which significantly impairs both the quality of the finished strip and the stability of subsequent rolling processes. Conventional image-based edge detection methods for slab camber are prone to detection deviations in complex industrial environments, mainly due to their weak noise robustness. To address the scientific challenge of low accuracy and poor robustness in feature extraction for hot-rolled intermediate slab camber detection, which is induced by environmental interference in complex industrial settings, we break through the technical bottlenecks of traditional edge detection methods and existing deep learning models in terms of channel–spatial feature collaborative optimization and anti-interference fusion of multi-scale features. We establish a dense perception network model integrated with a channel–spatial attention mechanism, realize robust feature recognition of slab edges under complex working conditions, and provide theoretical and technical support for the real-time quantitative detection of slab shape defects in the hot rolling process. The proposed model significantly improves detection accuracy and robustness through multi-scale feature enhancement and noise suppression, effectively meeting the requirements for real-time quantitative detection of slab camber in the roughing rolling stage. Field experiments verify that the method increases detection accuracy by 36.55% and achieves favorable performance on evaluation metrics, including ODS and OIS.
17 pages, 847 KB  
Article
Low-Dose CT Image Denoising Based on a Progressive Fusion Distillation Network with Pixel Attention
by Xinyi Wang and Bao Pang
Appl. Sci. 2026, 16(7), 3292; https://doi.org/10.3390/app16073292 (registering DOI) - 28 Mar 2026
Abstract
Low-dose computed tomography (LDCT) can effectively reduce ionizing radiation; however, the associated image noise and artifacts can severely compromise the accuracy of clinical diagnosis. To address the challenge of balancing noise suppression and detail preservation in LDCT images, this study proposes a deep learning (DL)-based image denoising method termed Progressive Fusion Distillation Network (PFDN). Building upon the Information Multi-distillation Network (IMDN), the proposed method incorporates a pixel attention (PA) mechanism and a progressive fusion strategy, and further designs a Pixel Parallel Extraction Block (PPEB) together with a Progressive Fusion Distillation Block (PFDB) to fully exploit multi-scale and multi-channel features, thereby optimizing the image denoising network through efficient feature separation and re-fusion. In addition, by explicitly leveraging the noise characteristics specific to LDCT images, the method establishes an end-to-end training framework suitable for medical imaging. Experimental results demonstrate that PFDN not only effectively reduces image noise and artifacts, but also enhances overall image quality while preserving diagnostically relevant image structures under the adopted evaluation setting.
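The abstract names a pixel attention (PA) mechanism without giving its formulation. A common form of pixel attention (as in PAN-style networks) gates each feature value with a sigmoid map produced by a 1x1 convolution; the following NumPy sketch illustrates that idea under stated assumptions, with the function name, weight shapes, and toy inputs all being illustrative rather than taken from the paper:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pixel_attention(feat, w, b):
    """Pixel attention: a 1x1 convolution produces a per-pixel, per-channel
    gate in (0, 1) that rescales the input feature map.

    feat: (C, H, W) feature map; w: (C, C) 1x1-conv weights; b: (C,) bias.
    """
    # a 1x1 convolution is a channel-mixing matmul at every spatial location
    mixed = np.tensordot(w, feat, axes=([1], [0])) + b[:, None, None]
    gate = sigmoid(mixed)   # attention map, elementwise in (0, 1)
    return feat * gate      # rescale features pixel-wise

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))          # toy 4-channel feature map
y = pixel_attention(x, rng.standard_normal((4, 4)) * 0.1, np.zeros(4))
```

Because the gate lies strictly in (0, 1), the module can only attenuate features, which is what lets it suppress noise-dominated pixels while passing structure through.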
28 pages, 5206 KB  
Article
CEA-DETR: A Multi-Scale Feature Fusion-Based Method for Wind Turbine Blade Surface Defect Detection
by Xudong Luo, Ruimin Wang, Jianhui Zhang, Junjie Zeng and Xiaohang Cai
Sensors 2026, 26(7), 2115; https://doi.org/10.3390/s26072115 (registering DOI) - 28 Mar 2026
Abstract
Wind turbine blade surface defect detection remains challenging due to large variations in defect scales, blurred edge textures, and severe interference from complex backgrounds, which often lead to insufficient detection accuracy and high false and missed detection rates. To address these issues, this paper proposes an improved RTDETR-based detection framework, termed CEA-DETR, for wind turbine blade surface defect inspection. First, a Cross-Scale Multi-Edge feature Extraction (CSME) backbone is designed by integrating multi-scale pooling and edge-enhancement units with a dual-domain feature selection mechanism, enabling effective extraction of fine-grained texture and edge features across different scales. Second, an Efficient Multi-Scale Feature Fusion Network (EMSFFN) is constructed to facilitate deep cross-level feature interaction through adaptive weighted fusion and multi-scale convolutional structures, thereby enhancing the representation of multi-scale defects. Furthermore, an adaptive sparse self-attention mechanism is introduced to reconstruct the AIFI module, strengthening global dependency modeling and guiding the network to focus on critical defect regions under complex background conditions. Experimental results demonstrate that CEA-DETR achieves mAP50 and mAP50:95 of 89.4% and 68.9%, respectively, representing improvements of 3.1% and 6.5% over the RT-DETR-r18 baseline. Meanwhile, the proposed model reduces computational cost (GFLOPs) by 20.1% and parameter count by 8.1%. These advantages make CEA-DETR more suitable for deployment on resource-constrained unmanned aerial vehicles (UAVs), enabling efficient and real-time autonomous inspection of wind turbine blades.

(This article belongs to the Section Industrial Sensors)
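The EMSFFN above performs "adaptive weighted fusion" of cross-level features; the abstract does not specify the scheme, but a widely used form of the idea is the fast normalized fusion from BiFPN, in which each input scale gets a learnable non-negative weight normalized by the sum of all weights. A minimal sketch under that assumption, with the function name and toy inputs invented for illustration:

```python
import numpy as np

def weighted_fusion(features, weights, eps=1e-4):
    """Fast normalized fusion (BiFPN-style): each input feature map gets a
    learnable non-negative weight; weights are normalized by their sum so the
    network can learn how much each scale should contribute.

    features: list of (C, H, W) arrays, already resized to a common shape.
    weights:  raw learnable scalars, one per input.
    """
    w = np.maximum(weights, 0.0)     # keep weights non-negative (ReLU)
    w = w / (w.sum() + eps)          # normalize so they sum to roughly 1
    return sum(wi * f for wi, f in zip(w, features))

f1 = np.ones((2, 4, 4))              # toy coarse-level feature map
f2 = np.full((2, 4, 4), 3.0)         # toy fine-level feature map
out = weighted_fusion([f1, f2], np.array([1.0, 1.0]))
```

With equal raw weights the result is close to the plain average of the inputs; during training the weights drift to favor whichever scale carries more defect signal.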

19 pages, 1666 KB  
Article
MTLL: A Novel Multi-Task Learning Approach for Lymphocytic Leukemia Classification and Nucleus Segmentation
by Cuisi Ou, Zhigang Hu, Xinzheng Wang, Kaiwen Cao and Yipei Wang
Electronics 2026, 15(7), 1419; https://doi.org/10.3390/electronics15071419 (registering DOI) - 28 Mar 2026
Abstract
Bone marrow cell classification and nucleus segmentation in microscopic images are fundamental tasks for computer-aided diagnosis of lymphocytic leukemia. However, bone marrow cells from different subtypes exhibit high morphological similarity, and structural information is often constrained under optical microscopic imaging, posing challenges for stable and effective feature representation. To address this issue, we propose MTLL (Multitask Model on Lymphocytic Leukemia), a novel multitask approach that performs cell classification and nucleus segmentation within a unified network to exploit their complementary information. The model constructs a hybrid backbone for shared feature representation based on a CNN-Transformer architecture, in which Fuse-MBConv modules are tightly integrated with multilayer multi-scale transformers to enable deep fusion of local texture and global semantic information. For the segmentation branch, we design an AM (Atrous Multilayer Perceptron) decoder that combines atrous spatial pyramid pooling with multilayer perceptrons to fuse multi-scale information and accurately delineate nucleus boundaries. The classification branch incorporates prior knowledge of cell nuclei structures to capture subtle variations in cellular morphology and texture, thereby enhancing the model’s ability to distinguish between leukemia subtypes. Experimental results demonstrate that the MTLL model significantly outperforms existing advanced single-task and multi-task models in both lymphocytic leukemia classification and cell nucleus segmentation. These results validate the effectiveness of the multi-task feature-sharing strategy for lymphocytic leukemia diagnosis using bone marrow microscopic images.
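The AM decoder above rests on atrous spatial pyramid pooling, whose core operation is the atrous (dilated) convolution: kernel taps are spaced apart by a dilation rate, enlarging the receptive field without adding weights. A naive single-channel NumPy sketch of that operation, with the function name and toy input chosen for illustration:

```python
import numpy as np

def dilated_conv2d(img, kernel, rate):
    """Naive 'atrous' (dilated) 2D convolution, valid padding: kernel taps
    are spaced `rate` pixels apart, so a 3x3 kernel at rate=2 covers a 5x5
    receptive field while keeping only 9 weights."""
    kh, kw = kernel.shape
    h, w = img.shape
    eh, ew = (kh - 1) * rate, (kw - 1) * rate   # effective kernel extent
    out = np.zeros((h - eh, w - ew))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # strided slice picks the dilated taps out of the input window
            patch = img[i:i + eh + 1:rate, j:j + ew + 1:rate]
            out[i, j] = (patch * kernel).sum()
    return out

img = np.arange(36, dtype=float).reshape(6, 6)
k = np.ones((3, 3)) / 9.0                       # averaging kernel
out = dilated_conv2d(img, k, rate=2)            # 5x5 receptive field
```

An ASPP block applies several such convolutions in parallel at different rates and concatenates the results, which is how it fuses nucleus-boundary detail at multiple scales.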

50 pages, 10525 KB  
Article
Passable Area Evaluation of Tractor Road Based on Improved YOLOv5s and Multi-Factor Fusion
by Qian Zhang, Wenjie Xu, Wenfei Wu, Lizhang Xu, Zhenghui Zhao and Shaowei Liang
Agriculture 2026, 16(7), 752; https://doi.org/10.3390/agriculture16070752 (registering DOI) - 28 Mar 2026
Abstract
The tractor road, as the core scene for autonomous driving of grain transport vehicles, is unstructured, complex, and obstacle-rich, leading to poor real-time performance and accuracy of joint road and obstacle detection with existing YOLOv5s. Furthermore, the reliability of passable area evaluation is low solely based on environmental factors. Therefore, YOLOv5s-C2S is proposed, fusing multi-scale features, attention mechanism, and dynamic features for joint detection. Firstly, YOLOv5s-CC is proposed for road detection by fusing context and spatial details and introducing Criss-Cross attention. Secondly, YOLOv5s-SGA is proposed for obstacle detection by grouped and spatial convolution, parameter-free attention, and adaptive feature fusion. By reusing YOLOv5s-CC weights, YOLOv5s-C2S shares low-level features and decouples high-level specificity. Based on the tractor road and obstacle information, combined with vehicle factors, a weighted scoring–based comprehensive method for passable area evaluation is proposed. Finally, the method was verified through experiments with an intelligent tracked grain transport vehicle using self-constructed datasets, including VOC_Road (11,927 images) and VOC_Obstacle (21,779 images). Compared with existing YOLOv5s, Deeplabv3+, FCN, Unet and SegNet, the mAP50 of road detection by YOLOv5s-CC increased by over 1.2%. Compared with existing YOLOv5s, R-CNN, YOLOv7, SSD and YOLOv8n, the mAP50 of obstacle detection by YOLOv5s-SGA increased by over 2%. Compared with YOLOv5s-SD, the mAP50 of joint detection by YOLOv5s-C2S increased by 9.3%, and the frame rate increased by 7.0 FPS. The proposed passable area evaluation method exhibits strong robustness and reliability in complex environments, meeting the accuracy and real-time requirements in autonomous driving of grain transport vehicles.

(This article belongs to the Section Artificial Intelligence and Digital Agriculture)
24 pages, 4811 KB  
Article
Lightweight Power Line Defect Detection Based on Improved YOLOv8n
by Yuhan Yin, Xiaoyi Liu, Kunxiao Wu, Ruilin Xu, Jianyong Zheng and Fei Mei
Sensors 2026, 26(7), 2112; https://doi.org/10.3390/s26072112 (registering DOI) - 28 Mar 2026
Abstract
To address the challenges of small targets, severe background clutter, and high deployment cost in UAV-based power-line defect detection, this paper proposes a lightweight defect detection model based on an improved YOLOv8n. In the downsampling stage, we design an improved lightweight adaptive downsampling module (ADownPro) to replace part of conventional convolutions, which uses a dual-branch parallel structure for stronger feature interaction and depthwise separable convolutions (DSConv) for complexity reduction. In the feature extraction stage, an integration of cross-stage partial connections and partial convolution (CSPPC) is proposed to replace the C2F module for efficient multi-scale feature fusion. In the detection head, mixed local channel attention (MLCA), which combines channel-spatial information and local–global contextual features, is introduced to strengthen defect-focused representations under complex backgrounds. For the loss function, a scale-annealed mixed-quality EIoU loss (SAMQ-EIoU) is proposed by combining iso-center scale transformation, scale factor annealing and focal-style quality reweighting to improve localization accuracy at high IoU thresholds. Experiments on a constructed dataset covering six typical defect categories show that the improved YOLOv8n achieves 91.4% mAP@0.50 and 64.5% mAP@0.50:0.95, with only 1.59 M parameters and 4.9 GFLOPs. Compared with mainstream detectors, the proposed model achieves a better balance between detection accuracy and lightweight design. In particular, compared with the recently proposed YOLOv8n-DSN and IDD-YOLO, it improves mAP@0.50 by 0.6% and 0.8%, and mAP@0.50:0.95 by 1.2% and 4.8%, respectively, while further reducing the parameter count by 1.00 M and 1.26 M, and the FLOPs by 1.7 G and 0.2 G. Moreover, the cross-dataset evaluation on the public UPID and SFID datasets further demonstrates the robustness and generalization ability of the proposed method.

(This article belongs to the Section Fault Diagnosis & Sensors)
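The SAMQ-EIoU loss above builds on the standard EIoU loss, which augments 1 - IoU with penalties on center distance and on width/height gaps, each normalized by the smallest enclosing box. The abstract does not give the annealing or quality-reweighting terms, so the sketch below covers only the plain EIoU base (function name and boxes are illustrative):

```python
def eiou_loss(box_p, box_g):
    """Plain EIoU loss for axis-aligned boxes in (x1, y1, x2, y2) form:
    1 - IoU, plus normalized penalties on center distance and on the
    width/height gaps, each scaled by the smallest enclosing box."""
    px1, py1, px2, py2 = box_p
    gx1, gy1, gx2, gy2 = box_g

    # intersection and union
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    area_p = (px2 - px1) * (py2 - py1)
    area_g = (gx2 - gx1) * (gy2 - gy1)
    iou = inter / (area_p + area_g - inter + 1e-9)

    # smallest enclosing box
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)

    # squared center distance, normalized by the enclosing diagonal
    dx = (px1 + px2 - gx1 - gx2) / 2.0
    dy = (py1 + py2 - gy1 - gy2) / 2.0
    dist = (dx * dx + dy * dy) / (cw * cw + ch * ch + 1e-9)

    # width / height gaps, normalized by the enclosing box sides
    dw = ((px2 - px1) - (gx2 - gx1)) ** 2 / (cw * cw + 1e-9)
    dh = ((py2 - py1) - (gy2 - gy1)) ** 2 / (ch * ch + 1e-9)

    return 1.0 - iou + dist + dw + dh
```

Splitting the aspect penalty into separate width and height terms (rather than CIoU's combined aspect-ratio term) is what gives EIoU its sharper gradients near high IoU, the regime the paper targets.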
35 pages, 6116 KB  
Article
Attention-Enhanced GAN for Spatial–Spectral Fusion and Chlorophyll-a Inversion in Chen Lake, China
by Chenxi Zeng, Cheng Shang, Yankun Wang, Shan Jiang, Ningsheng Chen, Chengyu Geng, Yadong Zhou and Yun Du
Sensors 2026, 26(7), 2107; https://doi.org/10.3390/s26072107 (registering DOI) - 28 Mar 2026
Abstract
The Sentinel-3 Ocean and Land Colour Instrument (OLCI) is designed for water monitoring. Its 21 spectral bands serve as the basis for the precise retrieval of water quality parameters. However, its coarse resolution restricts the depiction of the spatial distribution of water quality parameters in small inland water bodies. Spatial–spectral fusion is a common method to address the inherent constraints between the spatial and spectral resolutions of sensors. Among the popular methods, deep learning-based approaches are central. Nonetheless, deep-learning-based models still face challenges in fusing Sentinel-2 Multi-Spectral Instrument (MSI) and Sentinel-3 OLCI data. Here, we propose a Multi-Scale-Attention-based Unsupervised Generative Adversarial Network (MSA-UGAN), which effectively integrates OLCI’s spectral advantage and MSI’s spatial resolution. Quantitative evaluation was conducted against five benchmark methods, including traditional approaches (GS, SFIM, MTF-GLP) and deep learning models (SRCNN, UCGAN). The results show that MSA-UGAN achieves the best overall performance: QNR (0.9709) and SSIM (0.9087) are the highest, while SAM (1.1331), spatial distortion (DS = 0.0389), and spectral distortion (Dλ = 0.0252) are the lowest. This shows that MSA-UGAN can better preserve the spatial details of S2 MSI and the spectral features of S3 OLCI data. Moreover, ERGAS (2.2734) also performs excellently in the comparative experiments. The experiment of Chlorophyll-a inversion using the fused image in Chen Lake revealed a spatial gradient ranging from 3.25 to 19.33 µg/L, with the highest concentrations in the southwestern nearshore waters, likely associated with aquaculture. These results jointly indicate that MSA-UGAN can generate high-spatial-resolution multispectral images, and the fused images can be effectively utilized for water quality monitoring, thereby providing essential data support for the precision management and scientific decision-making regarding inland lakes.

(This article belongs to the Section Remote Sensors)
14 pages, 2326 KB  
Article
Steel Surface Defect Detection Based on Improved YOLOv8 with Multi-Scale Feature Fusion and Attention Mechanism
by Yalei Jia, Xian Zhang, Jianhui Meng and Jisong Zang
Electronics 2026, 15(7), 1408; https://doi.org/10.3390/electronics15071408 - 27 Mar 2026
Abstract
Identifying microscopic textural anomalies and filtering out complicated industrial background noise remain significant hurdles in inspecting metallic surfaces. To tackle these operational bottlenecks, our research introduces a refined multi-scale detection framework built upon the YOLOv8l architecture. Specifically, we engineer a fine-grained detection pathway utilizing the P2 layer, which aims to preserve critical details of miniature flaws that are otherwise discarded during feature extraction. Furthermore, a Bi-directional Feature Pyramid Network model is embedded to reconstruct the feature fusion path, balancing the preservation of shallow geometric textures with enhanced multi-scale representation capabilities. To bolster anti-interference performance, a Convolutional Block Attention Module (CBAM) is integrated prior to the detection head, employing adaptive channel and spatial weighting to suppress unstructured background noise. Experimental results utilizing TTA demonstrate that the mAP@0.5 reached 76.3%. Detection accuracies for patches and inclusions reached 93.1% and 85.3%.
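CBAM, used above before the detection head, is a published module: channel attention from global average and max pooling passed through a shared MLP, followed by spatial attention from channel-wise pooled maps. The sketch below keeps that pooling-and-gate structure but, to stay dependency-free, replaces the original 7x7 convolution of the spatial branch with a two-weight per-pixel mix, so it is an approximation rather than the exact module; all names and shapes are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cbam(feat, w1, w2, ws):
    """Simplified CBAM: channel attention (global avg+max pooling through a
    shared two-layer MLP with ReLU), then spatial attention. The spatial
    branch here mixes the channel-pooled avg/max maps with two scalar
    weights instead of the original 7x7 convolution.

    feat: (C, H, W); w1: (C//r, C); w2: (C, C//r); ws: (2,)
    """
    # --- channel attention ---
    avg = feat.mean(axis=(1, 2))                # (C,) global average pool
    mx = feat.max(axis=(1, 2))                  # (C,) global max pool
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0)  # shared bottleneck MLP
    ca = sigmoid(mlp(avg) + mlp(mx))            # channel gate in (0, 1)
    feat = feat * ca[:, None, None]
    # --- spatial attention ---
    s_avg = feat.mean(axis=0)                   # (H, W) pooled over channels
    s_max = feat.max(axis=0)
    sa = sigmoid(ws[0] * s_avg + ws[1] * s_max) # spatial gate in (0, 1)
    return feat * sa[None, :, :]

rng = np.random.default_rng(1)
x = rng.standard_normal((8, 5, 5))
out = cbam(x, rng.standard_normal((2, 8)) * 0.1,
           rng.standard_normal((8, 2)) * 0.1, np.array([0.5, 0.5]))
```

Both gates are bounded in (0, 1), so the module can only down-weight channels and locations, which is how it suppresses background clutter without amplifying noise.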

16 pages, 10364 KB  
Article
A Method for Filling Blank Stripes in Electrical Imaging Based on the Fusion of Arbitrary Kernel Convolution and Generative Adversarial Networks
by Ruhan A, Die Liu, Ge Cao, Kun Meng, Taiping Zhao, Lili Tian, Bin Zhao, Guilan Lin and Sinan Fang
Appl. Sci. 2026, 16(7), 3267; https://doi.org/10.3390/app16073267 - 27 Mar 2026
Abstract
Electrical imaging logging images play a crucial role in petroleum exploration; however, in practical applications, blank strips frequently appear due to instrument malfunctions or data transmission failures, severely compromising geological interpretation and hydrocarbon evaluation. Existing image inpainting methods have limited adaptability to blank strips at different depth scales and exhibit blurred high-resolution geological textures. To address these issues, this paper proposes a blank strip filling method that integrates Arbitrary Kernel Convolution (AKConv) with the Aggregated Contextual-Transformations Generative Adversarial Network (AOT-GAN). Specifically, the adaptive sampling mechanism of AKConv is incorporated into the generator network of AOT-GAN, enabling the model to effectively capture long-range contextual information and adaptively handle blank strips of varying scales and shapes through multi-scale feature fusion. Experimental results on real oilfield datasets demonstrate that the proposed method achieves significant improvements in PSNR, SSIM, and MAE, exhibiting superior structural preservation and texture sharpness, especially in restoring deep and large-scale blank strips. Furthermore, visual comparisons confirm the method’s superior performance in recovering key geological features, such as bedding continuity and fracture structures, thus providing an effective approach for electrical imaging logging image restoration.

(This article belongs to the Special Issue Applied Geophysical Imaging and Data Processing, 2nd Edition)

21 pages, 922 KB  
Article
DBCF-Net: A Dual-Branch Cross-Scale Fusion Network for Heterogeneous Satellite–UAV Change Detection
by Yan Ren, Ruiyong Li, Pengbo Zhai and Xinyu Chen
Remote Sens. 2026, 18(7), 1009; https://doi.org/10.3390/rs18071009 - 27 Mar 2026
Abstract
Heterogeneous change detection (HCD) using satellite and Unmanned Aerial Vehicle (UAV) imagery is a pivotal task in remote sensing and Earth observation. However, the effective utilization of such multi-source data is significantly hindered by extreme spatial resolution disparities and distinct radiometric characteristics. Existing deep learning methods, often based on weight-sharing Siamese architectures, struggle to bridge these domain gaps, leading to spectral pseudo-changes and blurred detection boundaries. To address these challenges, we propose a novel Dual-Branch Cross-Scale Fusion Network (DBCF-Net) specifically tailored for heterogeneous satellite–UAV change detection. We introduce a Difference-Aware Attention Module (DAAM) to explicitly align cross-modal feature spaces and suppress domain-related noise through a hybrid local–global attention mechanism. Furthermore, an Adaptive Gated Fusion Module (AGFM) is designed to dynamically weight multi-scale interactions, ensuring the preservation of high-frequency spatial details from UAV imagery while maintaining the semantic consistency of satellite data. Extensive experiments on the Heterogeneous Satellite–UAV Dataset (HSUD) demonstrate that DBCF-Net achieves state-of-the-art performance, reaching an F1-score of 88.75% and an IoU of 80.58%. This study provides a robust technical framework for heterogeneous sensor fusion and high-precision monitoring in complex remote sensing scenarios.

(This article belongs to the Section Remote Sensing Image Processing)
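The F1-score and IoU reported above are both derived from the same pixel-level confusion counts, and for a single confusion matrix they satisfy F1 = 2·IoU / (1 + IoU); reported scores may be aggregated per image, so the identity need not hold exactly on the published numbers. A small illustration with made-up counts:

```python
def f1_and_iou(tp, fp, fn):
    """Binary change-detection metrics from pixel counts:
    IoU = TP / (TP + FP + FN), F1 = 2*TP / (2*TP + FP + FN)."""
    iou = tp / (tp + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return f1, iou

# made-up counts for illustration only
f1, iou = f1_and_iou(tp=80, fp=12, fn=8)
```

Since F1 is a strictly increasing function of IoU, ranking methods by either metric on the same confusion matrix gives the same order.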
19 pages, 997 KB  
Article
A Dual-Branch Typhoon-Gated Axial Transformer for Accurate Tropical Cyclone Path Forecasting
by Xiaoyang Huang, Kenan Fan, Xiaolin Zhu and Wei Lv
Atmosphere 2026, 17(4), 339; https://doi.org/10.3390/atmos17040339 - 27 Mar 2026
Abstract
Typhoon track prediction is an important research direction in weather forecasting. Although deep learning methods have achieved some progress in this field, challenges remain, including insufficient fusion of meteorological features, limited capability in modeling temporal and spatial evolution, and high computational cost of some models. To address these issues, this paper proposes a dual-path, multi-modal typhoon track prediction model that incorporates a gated axial Transformer to enhance the modeling of deep structural features in the meteorological environment. Numerical experimental results show that the proposed model achieves higher prediction accuracy than comparative methods in typhoon track prediction tasks across multiple time scales, demonstrating the effectiveness of the approach.

(This article belongs to the Section Atmospheric Techniques, Instruments, and Modeling)
24 pages, 15151 KB  
Article
SG-YOLO: A Multispectral Small-Object Detector for UAV Imagery Based on YOLO
by Binjie Zhang, Lin Wang, Quanwei Yao, Keyang Li and Qinyan Tan
Remote Sens. 2026, 18(7), 1003; https://doi.org/10.3390/rs18071003 - 27 Mar 2026
Abstract
Object detection in unmanned aerial vehicle (UAV) imagery remains a crucial yet challenging task due to complex backgrounds, large scale variations, and the prevalence of small objects. Visible-spectrum images lack robustness under all-weather and all-illumination conditions; by contrast, multispectral sensing provides complementary cues (e.g., thermal signatures) that improve detection robustness. However, existing multispectral solutions often incur high computational costs and are therefore difficult to deploy on resource-constrained UAV platforms. To address these issues, SG-YOLO is proposed, a lightweight and efficient multispectral object detection framework that aims to balance accuracy and efficiency. First, a Spectral Gated Downsampling Stem (SGDS) is designed, in which grouped convolutions and a gating mechanism are employed at the early stage of the network to extract band-specific features, thereby maximizing spectral complementarity while minimizing redundancy. Second, a Spectral–Spatial Iterative Attention Fusion (SSIAF) module is introduced, in which spectral-wise (channel) attention and spatial-wise attention are iteratively coupled and cascaded in a multi-scale manner to jointly model cross-band dependencies and spatial saliency, thereby aggregating high-level semantic information while suppressing redundant spectral responses. Finally, a Spatial–Channel Synergistic Fusion (SCSF) module is designed to enhance multi-scale and cross-channel feature integration in the neck. Experiments on the MODA dataset show that SG-YOLO achieves 72.4% mAP50, outperforming the baseline by 3.2%. Moreover, compared with a range of mainstream one-stage detectors and multispectral detection methods, SG-YOLO delivers the best overall performance, providing an effective solution for UAV object detection while maintaining a favorable trade-off between model size and detection accuracy.

24 pages, 1541 KB  
Article
Infrared Moving Maritime Vessel Segmentation Based on Multi-Scale Spatial–Temporal Transformer Network
by Wenhui Liu, Yulong Qiao, Yue Zhao and Zhengyi Xing
Remote Sens. 2026, 18(7), 1006; https://doi.org/10.3390/rs18071006 - 27 Mar 2026
Abstract
Infrared moving maritime vessel segmentation is a crucial image processing task for maritime security, which is a challenging problem due to the complex backgrounds and targets with varying sizes. To address these issues, we propose an end-to-end segmentation network based on a multi-scale spatiotemporal vision transformer (ST-VT) for segmenting the moving maritime vessels in the infrared image sequence. Specifically, in the feature extraction module, we introduce a multi-scale feature encoding structure that combines a multi-scale backbone and Feature Pyramid Network technology. Then, the multi-scale deformable encoder structure and a cross-scale fusion module with the pixel decoder are proposed to generate the multi-scale spatiotemporal features. Subsequently, we employ the improved attention blocks that are the core blocks of the coarse-to-fine framework (across scales) of the prompt decoder to obtain the prompts. Finally, a multi-scale mask decoder is applied to achieve the final target segmentation. The experiments are conducted on the benchmark dataset IPATCH and our labeled dataset LAS-MassMIND. The results demonstrate that the proposed method achieves state-of-the-art performance, especially within complex backgrounds and targets of varying sizes.
30 pages, 2146 KB  
Article
Research on a Precision Counting Method and Web Deployment for Natural-Form Bothriochloa ischaemum Spikes and Seeds Based on Object Detection
by Huamin Zhao, Yongzhuo Zhang, Yabo Zheng, Erkang Zeng, Linjun Jiang, Weiqi Yan, Fangshan Xia and Defang Xu
Agronomy 2026, 16(7), 706; https://doi.org/10.3390/agronomy16070706 - 27 Mar 2026
Abstract
Bothriochloa ischaemum is a key forage species with strong grazing tolerance and high nutritional value, making precise quantification of spike and seed traits essential for germplasm evaluation and yield prediction. However, the compact architecture and minute seed size in natural field conditions render manual counting inefficient and labor-intensive. To address this limitation, this study presents a non-destructive and automated quantification framework integrating advanced object detection and regression analysis for accurate in situ estimation of spikes and seed numbers. To further address the challenges of dense spike detection caused by occlusion and small object sizes, this study developed a modified model named YOLOv12-DAN by integrating DySample dynamic upsampling, ASFF feature fusion, and NWD loss, which achieved a mean average precision (mAP) of 91.6%. Meanwhile, for the detection of dense kernels on compact spikes, an improved YOLOv12 architecture incorporating an Explicit Visual Center (EVC) module was proposed to enhance multi-scale feature representation. The optimized model attained a bounding box precision of 96.5%, a recall rate of 86.4%, an mAP50 of 94.3%, and an mAP50-95 of 73.9%. Furthermore, a univariate linear regression model based on 132 spike samples verified the reliable consistency between the predicted and actual seed counts, with a mean absolute error (MAE) of 6.30, a mean absolute percentage error (MAPE) of 9.35, and an R-squared (R2) value of 0.808. Finally, the model was deployed through a lightweight end-to-end web application, enabling real-time field operation and promoting its applicability in breeding programs and agronomic decision-making. This study provides a robust technical pathway for automated phenotyping and precision forage improvement.

(This article belongs to the Special Issue Digital Twins in Precision Agriculture)
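The seed-count step above fits a univariate linear regression between detected counts and ground-truth seed numbers, evaluated by MAE and R². A self-contained sketch of that pipeline using closed-form least squares; the data points are fabricated for illustration and are not the paper's 132 spike samples:

```python
def fit_line(x, y):
    """Ordinary least squares for a univariate model y = a*x + b,
    via the closed-form normal equations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    a = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    b = my - a * mx
    return a, b

def r_squared(y, yhat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = sum(y) / len(y)
    ss_res = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# hypothetical (detected count, true seed count) pairs
x = [50, 60, 70, 80, 90]
y = [55, 64, 76, 85, 96]
a, b = fit_line(x, y)
yhat = [a * xi + b for xi in x]
mae = sum(abs(yi - yh) for yi, yh in zip(y, yhat)) / len(y)
```

The regression serves as a cheap calibration layer: detector counts systematically miss occluded seeds, and the fitted slope and intercept correct that bias on held-out spikes.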
25 pages, 9555 KB  
Article
EFSL-YOLO: An Improved Model for Small Object Detection in UAV Vision
by Meng Zhou, Shuke He, Chang Wang and Jing Wang
Drones 2026, 10(4), 243; https://doi.org/10.3390/drones10040243 - 27 Mar 2026
Abstract
To address the challenges in UAV remote sensing imagery, such as small object size, dense occlusion and complex background interference, this paper proposes an enhanced small object detection algorithm based on an improved YOLOv13 model for drone applications in complex weather environments. First, an enhanced feature fusion attention network (EFFA-Net) is designed in the preprocessing stage to reduce image degradation and suppress the interference caused by smoke and haze. Then, in the backbone, a swish-gated convolution (SwiGLUConv) module is designed to adaptively expand the receptive field and enhance multi-scale feature extraction, which strengthens the representation of small targets while maintaining efficient computation. Furthermore, a locally enhanced multi-scale context fusion (LF-MSCF) module is integrated into the feature fusion neck of YOLO, combining multi-head self-attention, channel attention, and spatial attention to suppress background noise and redundant responses, thereby improving detection accuracy. Extensive experiments on the VisDrone-DET2019 dataset, UAVDT dataset, and HazyDet dataset demonstrate that the proposed algorithm outperforms other mainstream methods, showcasing excellent detection accuracy and robustness in complex UAV aerial scenarios.