Search Results (2,764)

Search Parameters:
Keywords = channel construction

24 pages, 5022 KiB  
Article
Aging-Invariant Sheep Face Recognition Through Feature Decoupling
by Suhui Liu, Chuanzhong Xuan, Zhaohui Tang, Guangpu Wang, Xinyu Gao and Zhipan Wang
Animals 2025, 15(15), 2299; https://doi.org/10.3390/ani15152299 - 6 Aug 2025
Abstract
Precise recognition of individual ovine specimens plays a pivotal role in implementing smart agricultural platforms and optimizing herd management systems. With the development of deep learning technology, sheep face recognition provides an efficient and contactless solution for individual sheep identification. However, as sheep grow, their facial features keep changing, which makes it difficult for existing sheep face recognition models to maintain accuracy over time and to meet practical needs. To address this limitation, we propose the Lifelong Biometric Learning Sheep Face Network (LBL-SheepNet), a feature decoupling network designed for continuous adaptation to ovine facial changes, and construct a dataset of 31,200 images from 55 sheep tracked monthly from 1 to 12 months of age. The LBL-SheepNet model addresses dynamic variations in facial features during sheep growth through a multi-module architectural framework. Firstly, a Squeeze-and-Excitation (SE) module enhances discriminative feature representation through adaptive channel-wise recalibration. Then, a nonlinear feature decoupling module employs a hybrid channel-batch attention mechanism to separate age-related features from identity-specific characteristics. Finally, a correlation analysis module utilizes adversarial learning to suppress age-biased feature interference, ensuring focus on age-invariant identifiers. Experimental results demonstrate that LBL-SheepNet achieves 95.5% identification accuracy and 95.3% average precision on the sheep face dataset. This study introduces a lifelong biometric learning (LBL) mechanism to mitigate recognition accuracy degradation caused by dynamic facial feature variations in growing sheep.
By designing a feature decoupling network integrated with adversarial age-invariant learning, the proposed method addresses the performance limitations of existing models in long-term individual identification. Full article
(This article belongs to the Section Animal System and Management)
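The Squeeze-and-Excitation step described in the abstract above is a standard, well-documented building block. A minimal NumPy sketch of channel-wise recalibration follows; the bottleneck weights `w1`/`w2` are random placeholders for illustration, not the paper's learned parameters.

```python
import numpy as np

def se_recalibrate(feat, w1, w2):
    """Squeeze-and-Excitation channel recalibration (illustrative sketch).

    feat: (C, H, W) feature map; w1: (C//r, C) and w2: (C, C//r) are the
    bottleneck MLP weights (random here, purely for demonstration).
    """
    # Squeeze: global average pooling collapses each channel to a scalar.
    z = feat.mean(axis=(1, 2))                                   # (C,)
    # Excitation: bottleneck MLP + sigmoid yields per-channel gates in (0, 1).
    s = 1.0 / (1.0 + np.exp(-(w2 @ np.maximum(w1 @ z, 0.0))))    # (C,)
    # Rescale: each channel is multiplied by its gate.
    return feat * s[:, None, None]

rng = np.random.default_rng(0)
C, r = 8, 2
feat = rng.standard_normal((C, 16, 16))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
out = se_recalibrate(feat, w1, w2)
print(out.shape)  # (8, 16, 16)
```

Because the gate is a single scalar per channel, the output differs from the input only by a per-channel scale factor between 0 and 1.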

9 pages, 838 KiB  
Review
Merging Neuroscience and Engineering Through Regenerative Peripheral Nerve Interfaces
by Melanie J. Wang, Theodore A. Kung, Alison K. Snyder-Warwick and Paul S. Cederna
Prosthesis 2025, 7(4), 97; https://doi.org/10.3390/prosthesis7040097 (registering DOI) - 6 Aug 2025
Abstract
Approximately 185,000 people in the United States experience limb loss each year. There is a need for an intuitive neural interface that can offer high-fidelity control signals to optimize the advanced functionality of prosthetic devices. The regenerative peripheral nerve interface (RPNI) is a pioneering advancement in neuroengineering that combines surgical techniques with biocompatible materials to create an interface for individuals with limb loss. RPNIs are surgically constructed from autologous muscle grafts that are neurotized by the residual peripheral nerves of an individual with limb loss. RPNIs amplify neural signals and demonstrate long-term stability. In this narrative review, the terms “Regenerative Peripheral Nerve Interface (RPNI)” and “RPNI surgery” are used interchangeably to refer to the same surgical and biological construct. This narrative review specifically focuses on RPNIs as a targeted approach to enhance prosthetic control through surgically created nerve–muscle interfaces. This area of research offers a promising solution to overcome the limitations of existing prosthetic control systems and could help improve the quality of life for people with limb loss. It allows for multi-channel control and bidirectional communication, while enhancing the functionality of prosthetics through improved sensory feedback. RPNI surgery holds significant promise for improving the quality of life for individuals with limb loss by providing a more intuitive and responsive prosthetic experience. Full article

21 pages, 49475 KiB  
Article
NRGS-Net: A Lightweight Uformer with Gated Positional and Local Context Attention for Nighttime Road Glare Suppression
by Ruoyu Yang, Huaixin Chen, Sijie Luo and Zhixi Wang
Appl. Sci. 2025, 15(15), 8686; https://doi.org/10.3390/app15158686 (registering DOI) - 6 Aug 2025
Abstract
Existing nighttime visibility enhancement methods primarily focus on improving overall brightness under low-light conditions. However, nighttime road images are also affected by glare, glow, and flare from complex light sources such as streetlights and headlights, making it challenging to suppress locally overexposed regions and recover fine details. To address these challenges, we propose a Nighttime Road Glare Suppression Network (NRGS-Net) for glare removal and detail restoration. Specifically, to handle diverse glare disturbances caused by the uncertainty in light source positions and shapes, we designed a gated positional attention (GPA) module that integrates positional encoding with local contextual information to guide the network in accurately locating and suppressing glare regions, thereby enhancing the visibility of affected areas. Furthermore, we introduced an improved Uformer backbone named LCAtransformer, in which the downsampling layers adopt efficient depthwise separable convolutions to reduce computational cost while preserving critical spatial information. The upsampling layers incorporate a residual PixelShuffle module to achieve effective restoration in glare-affected regions. Additionally, channel attention is introduced within the Local Context-Aware Feed-Forward Network (LCA-FFN) to enable adaptive adjustment of feature weights, effectively suppressing irrelevant and interfering features. To advance research in nighttime glare suppression, we constructed and publicly released the Night Road Glare Dataset (NRGD), captured in real nighttime road scenarios, enriching the evaluation system for this task. Experiments conducted on the Flare7K++ and NRGD datasets, using five evaluation metrics and comparing against six state-of-the-art methods, demonstrate that our method achieves superior performance on both subjective and objective measures. Full article
(This article belongs to the Special Issue Computational Imaging: Algorithms, Technologies, and Applications)
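The parameter saving that motivates the depthwise separable convolutions in the LCAtransformer's downsampling layers is easy to verify by counting weights. The layer sizes below are illustrative, not taken from the paper, and bias terms are ignored.

```python
def conv_params(c_in, c_out, k):
    # Standard 2D convolution: every output channel mixes all input channels.
    return c_in * c_out * k * k

def dw_separable_params(c_in, c_out, k):
    # Depthwise stage (one k x k filter per input channel)
    # followed by a 1 x 1 pointwise convolution that mixes channels.
    return c_in * k * k + c_in * c_out

std = conv_params(64, 128, 3)          # 64 * 128 * 9  = 73728 weights
dws = dw_separable_params(64, 128, 3)  # 576 + 8192    = 8768 weights
print(std, dws, round(std / dws, 2))   # roughly an 8x reduction here
```

The ratio approaches `k*k` as the channel count grows, which is why the swap is a common lightweighting move.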

11 pages, 2425 KiB  
Article
Single-Layer High-Efficiency Metasurface for Multi-User Signal Enhancement
by Hui Jin, Peixuan Zhu, Rongrong Zhu, Bo Yang, Siqi Zhang and Huan Lu
Micromachines 2025, 16(8), 911; https://doi.org/10.3390/mi16080911 (registering DOI) - 6 Aug 2025
Abstract
In multi-user wireless communication scenarios, signal degradation caused by channel fading and co-channel interference restricts system capacity, while traditional enhancement schemes face challenges of high coordination complexity and hardware integration. This paper proposes an electromagnetic focusing method using a single-layer transmissive passive metasurface. A high-efficiency metasurface array is fabricated based on PCB technology, which utilizes subwavelength units for wide-range phase modulation to construct a multi-user energy convergence model in the WiFi band. By optimizing phase gradients through the geometric phase principle, the metasurface achieves collaborative wavefront manipulation for multiple target regions with high transmission efficiency, reducing system complexity compared to traditional multi-layer structures. Measurements in a microwave anechoic chamber and tests in an office environment demonstrate that the metasurface can simultaneously create signal enhancement zones for multiple users, featuring stable focusing capability and environmental adaptability. This lightweight design facilitates deployment in dense networks, providing an effective solution for signal optimization in indoor distributed systems and IoT communications. Full article
(This article belongs to the Special Issue Novel Electromagnetic and Acoustic Devices)
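The phase profile a focusing metasurface must impose follows from simple path-length geometry: every element adds enough phase that all paths arrive at the focus in phase. The sketch below computes that hyperbolic phase map for a toy 10 x 10 array; the carrier frequency, focal length, and half-wavelength element spacing are assumptions for illustration, as the abstract does not specify them.

```python
import numpy as np

c = 3e8
f_hz = 2.4e9            # WiFi-band carrier (assumed)
lam = c / f_hz          # wavelength, ~0.125 m
focal = 0.5             # focal distance in metres (illustrative)

# Element positions on a 10 x 10 grid with lambda/2 spacing, centred on axis.
xs = (np.arange(10) - 4.5) * lam / 2
X, Y = np.meshgrid(xs, xs)

# Compensation phase: cancel the extra path length relative to the axial ray.
phi = (2 * np.pi / lam) * (np.sqrt(X**2 + Y**2 + focal**2) - focal)
phi_wrapped = np.mod(phi, 2 * np.pi)   # each unit only needs phase mod 2*pi
print(phi_wrapped.shape)
```

Edge elements need the largest compensation because their path to the focus is longest; the geometric (Pancharatnam-Berry) phase principle the paper invokes is one way to realize `phi_wrapped` with subwavelength units.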

23 pages, 6490 KiB  
Article
LISA-YOLO: A Symmetry-Guided Lightweight Small Object Detection Framework for Thyroid Ultrasound Images
by Guoqing Fu, Guanghua Gu, Wen Liu and Hao Fu
Symmetry 2025, 17(8), 1249; https://doi.org/10.3390/sym17081249 - 6 Aug 2025
Abstract
Non-invasive ultrasound diagnosis, combined with deep learning, is frequently used for detecting thyroid diseases. However, real-time detection on portable devices faces limitations due to constrained computational resources, and existing models often lack sufficient capability for small object detection of thyroid nodules. To address these limitations, this paper proposes an improved lightweight small object detection network framework called LISA-YOLO, which enhances the lightweight multi-scale collaborative fusion algorithm. The proposed framework exploits the inherent symmetrical characteristics of ultrasound images and the symmetrical architecture of the detection network to better capture and represent features of thyroid nodules. Specifically, an improved depthwise separable convolution algorithm replaces traditional convolution to construct a lightweight network (DG-FNet). Through symmetrical cross-scale fusion operations via FPN, detection accuracy is maintained while reducing computational overhead. Additionally, an improved bidirectional feature network (IMS F-NET) fully integrates the semantic and detailed information of high- and low-level features symmetrically, enhancing the representation capability for multi-scale features and improving the accuracy of small object detection. Finally, a collaborative attention mechanism (SAF-NET) uses a dual-channel and spatial attention mechanism to adaptively calibrate channel and spatial weights in a symmetric manner, effectively suppressing background noise and enabling the model to focus on small target areas in thyroid ultrasound images. Extensive experiments on two image datasets demonstrate that the proposed method achieves improvements of 2.3% in F1 score, 4.5% in mAP, and 9.0% in FPS, while maintaining only 2.6 M parameters and reducing GFLOPs from 6.1 to 5.8.
The proposed framework provides significant advancements in lightweight real-time detection and demonstrates the important role of symmetry in enhancing the performance of ultrasound-based thyroid diagnosis. Full article
(This article belongs to the Section Computer)

15 pages, 1241 KiB  
Article
Triplet Spatial Reconstruction Attention-Based Lightweight Ship Component Detection for Intelligent Manufacturing
by Bocheng Feng, Zhenqiu Yao and Chuanpu Feng
Appl. Sci. 2025, 15(15), 8676; https://doi.org/10.3390/app15158676 (registering DOI) - 5 Aug 2025
Abstract
Automatic component recognition plays a crucial role in intelligent ship manufacturing, but existing methods suffer from low recognition accuracy and high computational cost in industrial scenarios involving small samples, component stacking, and diverse categories. To address the requirements of shipbuilding industrial applications, a Triplet Spatial Reconstruction Attention (TSA) mechanism that combines threshold-based feature separation with triplet parallel processing is proposed, and a lightweight You Only Look Once Ship (YOLO-Ship) detection network is constructed. Unlike existing attention mechanisms that focus on either spatial reconstruction or channel attention independently, the proposed TSA integrates triplet parallel processing with spatial feature separation–reconstruction techniques to achieve enhanced target feature representation while significantly reducing parameter count and computational overhead. Experimental validation on a small-scale actual ship component dataset demonstrates that the improved network achieves 88.7% mean Average Precision (mAP), 84.2% precision, and 87.1% recall, improvements of 3.5%, 2.2%, and 3.8%, respectively, over the original YOLOv8n algorithm. The network requires only 2.6 M parameters and 7.5 Giga Floating-point Operations (GFLOPs), achieving a good balance between detection accuracy and lightweight model design. Future research directions include developing adaptive threshold learning mechanisms for varying industrial conditions and integration with surface defect detection capabilities to enhance comprehensive quality control in intelligent manufacturing systems. Full article
(This article belongs to the Special Issue Artificial Intelligence on the Edge for Industry 4.0)

23 pages, 85184 KiB  
Article
MB-MSTFNet: A Multi-Band Spatio-Temporal Attention Network for EEG Sensor-Based Emotion Recognition
by Cheng Fang, Sitong Liu and Bing Gao
Sensors 2025, 25(15), 4819; https://doi.org/10.3390/s25154819 - 5 Aug 2025
Abstract
Emotion analysis based on electroencephalogram (EEG) sensors is pivotal for human–machine interaction yet faces key challenges in spatio-temporal feature fusion and cross-band and brain-region integration from multi-channel sensor-derived signals. This paper proposes MB-MSTFNet, a novel framework for EEG emotion recognition. The model constructs a 3D tensor to encode band–space–time correlations of sensor data, explicitly modeling frequency-domain dynamics and spatial distributions of EEG sensors across brain regions. A multi-scale CNN-Inception module extracts hierarchical spatial features via diverse convolutional kernels and pooling operations, capturing localized sensor activations and global brain network interactions. Bi-directional GRUs (BiGRUs) model temporal dependencies in sensor time-series, adept at capturing long-range dynamic patterns. Multi-head self-attention highlights critical time windows and brain regions by assigning adaptive weights to relevant sensor channels, suppressing noise from non-contributory electrodes. Experiments on the DEAP dataset, containing multi-channel EEG sensor recordings, show that MB-MSTFNet achieves 96.80 ± 0.92% valence accuracy, 98.02 ± 0.76% arousal accuracy for binary classification tasks, and 92.85 ± 1.45% accuracy for four-class classification. Ablation studies validate that feature fusion, bidirectional temporal modeling, and multi-scale mechanisms significantly enhance performance by improving feature complementarity. This sensor-driven framework advances affective computing by integrating spatio-temporal dynamics and multi-band interactions of EEG sensor signals, enabling efficient real-time emotion recognition. Full article
(This article belongs to the Section Intelligent Sensors)
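The multi-head self-attention step in MB-MSTFNet reduces, per head, to scaled dot-product attention: each query position receives a convex combination of value vectors. A single-head NumPy sketch over toy "time window" features (all dimensions invented for illustration):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention (sketch): weights sum to 1 per query."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    # Numerically stable softmax over key positions.
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)
    return w @ V, w

rng = np.random.default_rng(1)
T, d = 6, 4                      # 6 time windows, 4-dim features (toy sizes)
Q = rng.standard_normal((T, d))
K = rng.standard_normal((T, d))
V = rng.standard_normal((T, d))
out, w = attention(Q, K, V)
print(out.shape)  # (6, 4)
```

The weight rows `w` are what the paper inspects when it says attention "highlights critical time windows": large entries mark the windows and channels a query attends to.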

22 pages, 4169 KiB  
Article
Multi-Scale Differentiated Network with Spatial–Spectral Co-Operative Attention for Hyperspectral Image Denoising
by Xueli Chang, Xiaodong Wang, Xiaoyu Huang, Meng Yan and Luxiao Cheng
Appl. Sci. 2025, 15(15), 8648; https://doi.org/10.3390/app15158648 (registering DOI) - 5 Aug 2025
Abstract
Hyperspectral image (HSI) denoising is a crucial step in image preprocessing as its effectiveness has a direct impact on the accuracy of subsequent tasks such as land cover classification, target recognition, and change detection. However, existing methods suffer from limitations in effectively integrating multi-scale features and adaptively modeling complex noise distributions, making it difficult to construct effective spatial–spectral joint representations. This often leads to issues like detail loss and spectral distortion, especially when dealing with complex mixed noise. To address these challenges, this paper proposes a multi-scale differentiated denoising network based on spatial–spectral cooperative attention (MDSSANet). The network first constructs a multi-scale image pyramid using three downsampling operations and independently models the features at each scale to better capture noise characteristics at different levels. Additionally, a spatial–spectral cooperative attention module (SSCA) and a differentiated multi-scale feature fusion module (DMF) are introduced. The SSCA module effectively captures cross-spectral dependencies and spatial feature interactions through parallel spectral channel and spatial attention mechanisms. The DMF module adopts a multi-branch parallel structure with differentiated processing to dynamically fuse multi-scale spatial–spectral features and incorporates a cross-scale feature compensation strategy to improve feature representation and mitigate information loss. The experimental results show that the proposed method outperforms state-of-the-art methods across several public datasets, exhibiting greater robustness and superior visual performance in tasks such as handling complex noise and recovering small targets. Full article
(This article belongs to the Special Issue Remote Sensing Image Processing and Application, 2nd Edition)

19 pages, 2359 KiB  
Article
Research on Concrete Crack Damage Assessment Method Based on Pseudo-Label Semi-Supervised Learning
by Ming Xie, Zhangdong Wang and Li’e Yin
Buildings 2025, 15(15), 2726; https://doi.org/10.3390/buildings15152726 - 1 Aug 2025
Viewed by 214
Abstract
To address the inefficiency of traditional concrete crack detection methods and the heavy reliance of supervised learning on extensive labeled data, this study proposes an intelligent assessment method for concrete damage based on pseudo-label semi-supervised learning and fractal geometry theory, targeting two core tasks: pixel-level binary crack segmentation, and multi-category assessment of the damage state based on crack morphology. Using three-channel RGB images as input, a dual-path collaborative training framework based on a U-Net encoder–decoder architecture is constructed, and a binary segmentation mask of the same size is output to achieve accurate segmentation of cracks at the pixel level. By constructing a dual-path collaborative training framework and employing a dynamic pseudo-label refinement mechanism, the model achieves an F1-score of 0.883 using only 50% labeled data—a mere 1.3% decrease compared to the fully supervised benchmark DeepCrack (F1 = 0.896)—while reducing manual annotation costs by over 60%. Furthermore, a quantitative correlation model between crack fractal characteristics and structural damage severity is established by combining a U-Net segmentation network with the differential box-counting algorithm. The experimental results demonstrate that under a cyclic loading of 147.6–221.4 kN, the fractal dimension monotonically increases from 1.073 (moderate damage) to 1.189 (failure), with 100% accuracy in damage state identification, closely aligning with the degradation trend of macroscopic mechanical properties. In complex crack scenarios, the model attains a recall rate (Re = 0.882), surpassing U-Net by 13.9%, with significantly enhanced edge reconstruction precision. Compared with the mainstream models, this method effectively alleviates the problem of data annotation dependence through a semi-supervised strategy while maintaining high accuracy.
It provides an efficient structural health monitoring solution for engineering practice, which is of great value to promote the application of intelligent detection technology in infrastructure operation and maintenance. Full article
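The differential box-counting algorithm cited above extends plain box counting to grayscale images; the binary variant below conveys the core idea: count occupied boxes at several scales and fit the slope of log N(s) versus log s. The straight-line "crack" mask is synthetic, chosen because its dimension is known to be 1.

```python
import numpy as np

def box_count_dim(mask, sizes=(2, 4, 8, 16)):
    """Box-counting fractal dimension of a binary mask (sketch).

    Counts occupied s x s boxes at several scales s, then fits
    log N(s) = -D log s + c by least squares.
    """
    n = mask.shape[0]
    counts = []
    for s in sizes:
        # Tile the (cropped) mask into s x s blocks and test occupancy.
        b = mask[:n - n % s, :n - n % s].reshape(n // s, s, n // s, s)
        counts.append(b.any(axis=(1, 3)).sum())
    logs, logn = np.log(sizes), np.log(counts)
    D = -np.polyfit(logs, logn, 1)[0]
    return D

# A straight "crack" should give a dimension close to 1.
mask = np.zeros((64, 64), dtype=bool)
mask[32, :] = True
print(round(box_count_dim(mask), 2))  # ~1.0
```

A branching, space-filling crack network occupies relatively more boxes at fine scales, pushing D above 1, which is why the paper can track damage severity through rising fractal dimension.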

15 pages, 5007 KiB  
Article
In Situ Construction of Thiazole-Linked Covalent Organic Frameworks on Cu2O for High-Efficiency Photocatalytic Tetracycline Degradation
by Zhifang Jia, Tingxia Wang, Zhaoxia Wu, Shumaila Razzaque, Zhixiang Zhao, Jiaxuan Cai, Wenao Xie, Junli Wang, Qiang Zhao and Kewei Wang
Molecules 2025, 30(15), 3233; https://doi.org/10.3390/molecules30153233 - 1 Aug 2025
Viewed by 171
Abstract
The strategic construction of heterojunctions through a simple and efficient strategy is one of the most effective means to boost the photocatalytic activity of semiconductor materials. Herein, a thiazole-linked covalent organic framework (TZ-COF) with large surface area, well-ordered pore structure, and high stability was developed. To further boost photocatalytic activity, the TZ-COF was synthesized in situ on the surface of Cu2O through a simple multicomponent reaction, yielding an encapsulated composite material (Cu2O@TZ-COF-18). In this composite, the outermost COF endows the material with abundant redox active sites and mass transfer channels, while the innermost Cu2O exhibits unique photoelectric properties. Notably, the synthesized Cu2O@TZ-COF-18 was shown to possess a heterojunction structure, which can efficiently restrain the recombination of photogenerated electron–hole pairs, thereby enhancing the photocatalytic performance. The photocatalytic degradation of tetracycline demonstrated that 3-Cu2O@TZ-COF-18 had the highest photocatalytic efficiency, with a removal rate of 96.3% within 70 min under visible light, which is better than that of pristine TZ-COF-18, Cu2O, the physical mixture of Cu2O and TZ-COF-18, and numerous reported COF-based composite materials. 3-Cu2O@TZ-COF-18 retained its original crystallinity and removal efficiency after five photodegradation cycles, displaying high stability and excellent cycle performance. Full article
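Photocatalytic degradation results such as the 96.3% removal in 70 min are commonly summarized by a pseudo-first-order rate constant via ln(C0/C) = k t. Assuming that kinetic model (the abstract does not state which model the authors fit), the implied rate constant is:

```python
import math

def pfo_rate_constant(removal_frac, t_min):
    """Pseudo-first-order rate constant k from ln(C0/C) = k t (sketch)."""
    return math.log(1.0 / (1.0 - removal_frac)) / t_min

# Reported removal for 3-Cu2O@TZ-COF-18: 96.3% within 70 min.
k = pfo_rate_constant(0.963, 70.0)
print(round(k, 4))  # ~0.0471 per minute
```

This single-number summary is what makes comparisons against pristine TZ-COF-18, Cu2O, and the physical mixture straightforward in the photocatalysis literature.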

16 pages, 4587 KiB  
Article
FAMNet: A Lightweight Stereo Matching Network for Real-Time Depth Estimation in Autonomous Driving
by Jingyuan Zhang, Qiang Tong, Na Yan and Xiulei Liu
Symmetry 2025, 17(8), 1214; https://doi.org/10.3390/sym17081214 - 1 Aug 2025
Viewed by 236
Abstract
Accurate and efficient stereo matching is fundamental to real-time depth estimation from symmetric stereo cameras in autonomous driving systems. However, existing high-accuracy stereo matching networks typically rely on computationally expensive 3D convolutions, which limit their practicality in real-world environments. In contrast, real-time methods often sacrifice accuracy or generalization capability. To address these challenges, we propose FAMNet (Fusion Attention Multi-Scale Network), a lightweight and generalizable stereo matching framework tailored for real-time depth estimation in autonomous driving applications. FAMNet consists of two novel modules: Fusion Attention-based Cost Volume (FACV) and Multi-scale Attention Aggregation (MAA). FACV constructs a compact yet expressive cost volume by integrating multi-scale correlation, attention-guided feature fusion, and channel reweighting, thereby reducing reliance on heavy 3D convolutions. MAA further enhances disparity estimation by fusing multi-scale contextual cues through pyramid-based aggregation and dual-path attention mechanisms. Extensive experiments on the KITTI 2012 and KITTI 2015 benchmarks demonstrate that FAMNet achieves a favorable trade-off between accuracy, efficiency, and generalization. On KITTI 2015, with the incorporation of FACV and MAA, the prediction accuracy of the baseline model is improved by 37% and 38%, respectively, and a total improvement of 42% is achieved by our final model. These results highlight FAMNet’s potential for practical deployment in resource-constrained autonomous driving systems requiring real-time and reliable depth perception. Full article
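A plain correlation-based cost volume, the structure that FACV compresses and reweights, can be sketched in a few lines: for each candidate disparity, correlate the left feature map with the right map shifted by that amount. The toy feature maps and disparity below are invented, and FACV's attention-guided fusion and channel reweighting are omitted.

```python
import numpy as np

def corr_cost_volume(left, right, max_disp):
    """Correlation cost volume (sketch): per-pixel dot product between the
    left features and the right features shifted by each candidate disparity."""
    C, H, W = left.shape
    cost = np.zeros((max_disp, H, W))
    for d in range(max_disp):
        shifted = np.zeros_like(right)
        shifted[:, :, d:] = right[:, :, :W - d]   # align right view to left
        cost[d] = (left * shifted).sum(axis=0) / C
    return cost

rng = np.random.default_rng(2)
left = rng.standard_normal((8, 4, 12))
right = np.zeros_like(left)
true_d = 3
right[:, :, :12 - true_d] = left[:, :, true_d:]   # right view shifted by 3 px
cost = corr_cost_volume(left, right, max_disp=6)
# Average matching score per disparity over the valid region; pick the best.
est = int(cost[:, :, true_d:].mean(axis=(1, 2)).argmax())
print(est)  # recovers the true disparity, 3
```

Because correlation collapses the channel dimension, this volume is far cheaper to aggregate than concatenation-based volumes that require 3D convolutions, which is the trade-off the abstract highlights.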

24 pages, 10190 KiB  
Article
MSMT-RTDETR: A Multi-Scale Model for Detecting Maize Tassels in UAV Images with Complex Field Backgrounds
by Zhenbin Zhu, Zhankai Gao, Jiajun Zhuang, Dongchen Huang, Guogang Huang, Hansheng Wang, Jiawei Pei, Jingjing Zheng and Changyu Liu
Agriculture 2025, 15(15), 1653; https://doi.org/10.3390/agriculture15151653 - 31 Jul 2025
Viewed by 616
Abstract
Accurate detection of maize tassels plays a crucial role in yield estimation of maize in precision agriculture. Recently, UAV and deep learning technologies have been widely introduced in various applications of field monitoring. However, complex field backgrounds pose multiple challenges to the precise detection of maize tassels, including multi-scale variations of maize tassels caused by varietal differences and growth stage variations, intra-class occlusion, and background interference. To achieve accurate maize tassel detection in UAV images under complex field backgrounds, this study proposes an MSMT-RTDETR detection model. The Faster-RPE Block is first designed to enhance multi-scale feature extraction while reducing model Params and FLOPs. To improve detection performance for multi-scale targets in complex field backgrounds, a Dynamic Cross-Scale Feature Fusion Module (Dy-CCFM) is constructed by upgrading the CCFM through dynamic sampling strategies and a multi-branch architecture. Furthermore, the MPCC3 module is built via re-parameterization methods, and further strengthens cross-channel information extraction capability and model stability to deal with intra-class occlusion. Experimental results on the MTDC-UAV dataset demonstrate that the MSMT-RTDETR significantly outperforms the baseline in detecting maize tassels under complex field backgrounds, achieving a precision of 84.2%. Compared with Deformable DETR and YOLOv10m, improvements of 2.8% and 2.0% were achieved, respectively, in mAP50 for UAV images. This study proposes an innovative solution for accurate maize tassel detection, establishing a reliable technical foundation for maize yield estimation. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)

25 pages, 25022 KiB  
Article
Research on Underwater Laser Communication Channel Attenuation Model Analysis and Calibration Device
by Wenyu Cai, Hengmei Wang, Meiyan Zhang and Yu Wang
J. Mar. Sci. Eng. 2025, 13(8), 1483; https://doi.org/10.3390/jmse13081483 - 31 Jul 2025
Viewed by 130
Abstract
To investigate the influence of different water quality conditions on the underwater transmission performance of laser communication signals, this paper systematically analyzes the absorption and scattering characteristics of the underwater laser communication channel and constructs a transmission model of laser propagation in water, so as to explore the transmission influence mechanism under typical water quality environments. On this basis, a system for in situ measurement of underwater laser channel attenuation is designed and constructed, and several sets of experiments are carried out to verify the rationality and applicability of the model. The collected experimental data are denoised by a fused discrete wavelet transform and adaptive Kalman filtering (DWT-AKF) algorithm and compared with data measured by an underwater hyperspectral Absorption Coefficient Spectrophotometer (ACS), showing that the model-inverted channel attenuation coefficients agree closely with the measured values. The research results provide a reliable theoretical basis and experimental support for the performance optimization and engineering design of underwater laser communication systems. Full article
(This article belongs to the Section Ocean Engineering)

25 pages, 21958 KiB  
Article
ESL-YOLO: Edge-Aware Side-Scan Sonar Object Detection with Adaptive Quality Assessment
by Zhanshuo Zhang, Changgeng Shuai, Chengren Yuan, Buyun Li, Jianguo Ma and Xiaodong Shang
J. Mar. Sci. Eng. 2025, 13(8), 1477; https://doi.org/10.3390/jmse13081477 - 31 Jul 2025
Abstract
Focusing on the problem of insufficient detection accuracy caused by blurred target boundaries, variable scales, and severe noise interference in side-scan sonar images, this paper proposes a high-precision detection network named ESL-YOLO, which integrates edge perception and adaptive quality assessment. Firstly, an Edge Fusion Module (EFM) is designed, which integrates the Sobel operator into depthwise separable convolution. Through a dual-branch structure, it realizes effective fusion of edge features and spatial features, significantly enhancing the ability to recognize targets with blurred boundaries. Secondly, a Self-Calibrated Dual Attention (SCDA) module is constructed. By means of feature cross-calibration and multi-scale channel attention fusion, it achieves adaptive fusion of shallow details and deep semantic features, improving detection accuracy for small targets and targets with complex shapes. Finally, a Location Quality Estimator (LQE) is introduced, which quantifies localization quality from the statistical characteristics of the bounding box distribution, effectively reducing false and missed detections. Experiments on the SIMD dataset show that the mAP@0.5 of ESL-YOLO reaches 84.65%, with precision and recall of 87.67% and 75.63%, respectively. Generalization experiments on additional sonar datasets further validate the effectiveness of the proposed method across different data distributions and target types, providing an effective technical solution for side-scan sonar image target detection. Full article
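The EFM described above embeds the Sobel operator, a fixed pair of 3x3 kernels whose responses give horizontal and vertical intensity gradients. A minimal pure-Python sketch of that edge-extraction step alone (helper names are assumptions; the depthwise separable convolution and dual-branch fusion of the paper's EFM are omitted):

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]  # horizontal gradient
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]  # vertical gradient

def conv3x3(img, kernel):
    """Valid (no-padding) 3x3 convolution on a 2D list-of-lists image."""
    h, w = len(img), len(img[0])
    return [[sum(kernel[j][i] * img[y - 1 + j][x - 1 + i]
                 for j in range(3) for i in range(3))
             for x in range(1, w - 1)]
            for y in range(1, h - 1)]

def sobel_magnitude(img):
    """Gradient magnitude sqrt(gx^2 + gy^2) at each interior pixel."""
    gx, gy = conv3x3(img, SOBEL_X), conv3x3(img, SOBEL_Y)
    return [[math.hypot(a, b) for a, b in zip(rx, ry)]
            for rx, ry in zip(gx, gy)]

# A vertical step edge: the response is strong along the boundary.
img = [[0, 0, 1, 1]] * 4
edges = sobel_magnitude(img)
```

Because the Sobel kernels are fixed rather than learned, fusing their output with learned spatial features adds an explicit boundary prior at negligible parameter cost, which is what helps with blurred sonar target outlines.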
(This article belongs to the Section Ocean Engineering)

26 pages, 62045 KiB  
Article
CML-RTDETR: A Lightweight Wheat Head Detection and Counting Algorithm Based on the Improved RT-DETR
by Yue Fang, Chenbo Yang, Chengyong Zhu, Hao Jiang, Jingmin Tu and Jie Li
Electronics 2025, 14(15), 3051; https://doi.org/10.3390/electronics14153051 - 30 Jul 2025
Abstract
Wheat is one of the most important grain crops, and spike counting is crucial for predicting yield. However, in complex farmland environments, wheat heads vary greatly in scale, their color closely resembles the background, and ears often overlap one another, making wheat ear detection highly challenging. At the same time, the increasing demand for high accuracy and fast response in wheat spike detection requires models to be lightweight in order to reduce hardware costs. Therefore, this study proposes a lightweight wheat ear detection model, CML-RTDETR, for efficient and accurate detection of wheat ears in real, complex farmland environments. In the model construction, the lightweight network CSPDarknet is first introduced as the backbone of CML-RTDETR to improve feature extraction efficiency. In addition, the FM module is introduced to modify the bottleneck layer in the C2f component, realizing hybrid feature extraction through spatial- and frequency-domain concatenation to strengthen feature extraction for wheat in complex scenes. Secondly, to improve detection of targets at different scales, a multi-scale feature enhancement pyramid (MFEP) is designed, consisting of GHSDConv, for efficiently obtaining low-level detail, and CSPDWOK, for constructing a multi-scale semantic fusion structure. Finally, channel pruning based on Layer-Adaptive Magnitude Pruning (LAMP) scoring is performed to reduce model parameters and runtime memory. Experimental results on the GWHD2021 dataset show that the AP50 of CML-RTDETR reaches 90.5%, an improvement of 1.2% over the baseline RTDETR-R18 model, while the parameters and GFLOPs decrease to 11.03 M and 37.8 G, reductions of 42% and 34%, respectively. The real-time frame rate reaches 73 fps, achieving both parameter reduction and speed improvement. Full article
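The LAMP criterion used for channel pruning scores each weight by its squared magnitude normalized by the sum of squares of all weights in the same layer that are at least as large; the lowest-scoring weights are pruned first across the whole model. A minimal sketch of the per-layer score (the function name is an assumption, and this is the generic LAMP score rather than CML-RTDETR's channel-pruning code):

```python
def lamp_scores(weights):
    """Layer-Adaptive Magnitude Pruning (LAMP) scores for one layer:
    score(w) = w^2 / sum of v^2 over all weights v with v^2 >= w^2.
    Weights with the smallest scores are pruned first (globally)."""
    order = sorted(range(len(weights)), key=lambda i: weights[i] ** 2)
    scores = [0.0] * len(weights)
    # Running sum of squares of this weight and all larger-magnitude ones.
    suffix = sum(w ** 2 for w in weights)
    for i in order:  # ascending magnitude
        scores[i] = weights[i] ** 2 / suffix
        suffix -= weights[i] ** 2
    return scores

scores = lamp_scores([1.0, -2.0, 3.0])
```

A consequence of the normalization is that the largest-magnitude weight in every layer scores exactly 1.0, so global pruning by LAMP score never empties a layer entirely, which is what makes the sparsity allocation layer-adaptive.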
(This article belongs to the Section Artificial Intelligence)
