Search Results (196)

Search Parameters:
Keywords = multiple-channel feature fusion

27 pages, 2162 KB  
Article
A Dual-Attention Temporal Convolutional Network-Based Track Initiation Method for Maneuvering Targets
by Hanbao Wu, Yiming Hao, Wei Chen and Mingli Liao
Electronics 2025, 14(21), 4215; https://doi.org/10.3390/electronics14214215 - 28 Oct 2025
Viewed by 117
Abstract
In strong clutter and maneuvering scenarios, radar track initiation faces the dual challenges of a low initiation rate and a high false alarm rate. Although existing deep learning methods show promise, the commonly adopted "feature flattening" input strategy destroys the intrinsic temporal structure and feature relationships of track data, limiting discriminative performance. To address this issue, this paper proposes a novel radar track initiation method based on a Dual-Attention Temporal Convolutional Network (DA-TCN), reformulating track initiation as a binary classification task over very short multi-channel time series that preserve the complete temporal structure. The DA-TCN model employs a TCN as its backbone network to extract local dynamic features and constructs a dual-attention architecture: a channel attention branch dynamically calibrates the importance of each kinematic feature, while a temporal attention branch integrates a Bi-GRU and self-attention mechanisms to capture dependencies at critical time steps. Finally, a learnable gated fusion mechanism adaptively weights the two branches for an optimal characterization of the track. Experimental results on maneuvering target datasets demonstrate that the proposed method significantly outperforms multiple baseline models across varying clutter densities: under the highest clutter density, DA-TCN achieves a 95.12% true track initiation rate (+1.6% over the best baseline) with a 9.65% false alarm rate (a 3.63% reduction), validating its effectiveness for high-precision, highly robust track initiation in complex environments. Full article
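
For readers who want the gist of the gated dual-branch design, here is a minimal PyTorch sketch — not the authors' code; layer sizes, module names, and the gating form are assumptions — showing a channel-attention branch, a Bi-GRU + self-attention temporal branch, and a learnable gate that fuses them:

```python
import torch
import torch.nn as nn

class GatedDualAttention(nn.Module):
    def __init__(self, channels: int, hidden: int = 32):
        super().__init__()
        # Channel branch: squeeze-and-excitation-style gating per kinematic feature.
        self.channel_fc = nn.Sequential(
            nn.Linear(channels, channels // 2), nn.ReLU(),
            nn.Linear(channels // 2, channels), nn.Sigmoid())
        # Temporal branch: Bi-GRU followed by self-attention over time steps.
        self.bigru = nn.GRU(channels, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.MultiheadAttention(2 * hidden, num_heads=2, batch_first=True)
        self.proj = nn.Linear(2 * hidden, channels)
        # Learnable gate deciding how much each branch contributes.
        self.gate = nn.Sequential(nn.Linear(2 * channels, channels), nn.Sigmoid())

    def forward(self, x):                      # x: (batch, time, channels)
        w = self.channel_fc(x.mean(dim=1))     # (batch, channels) channel weights
        chan_out = x * w.unsqueeze(1)          # recalibrated feature channels
        h, _ = self.bigru(x)                   # (batch, time, 2*hidden)
        h, _ = self.attn(h, h, h)              # dependencies at critical time steps
        temp_out = self.proj(h)                # back to (batch, time, channels)
        g = self.gate(torch.cat([chan_out, temp_out], dim=-1))
        return g * chan_out + (1 - g) * temp_out

tracks = torch.randn(8, 6, 4)                  # 8 candidate tracks, 6 steps, 4 features
print(GatedDualAttention(channels=4)(tracks).shape)  # torch.Size([8, 6, 4])
```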

23 pages, 5261 KB  
Article
FocusNet: A Lightweight Insulator Defect Detection Network via First-Order Taylor Importance Assessment and Knowledge Distillation
by Yurong Jing, Zhiyong Tao and Sen Lin
Algorithms 2025, 18(10), 649; https://doi.org/10.3390/a18100649 - 16 Oct 2025
Viewed by 280
Abstract
In the detection of small targets such as insulator defects and flashovers, the existing YOLOv11 suffers from insufficient feature extraction and difficulty in balancing a lightweight design with detection accuracy. We propose a lightweight architecture called FocusNet based on YOLOv11n. To improve the feature expression of small targets, an Aggregation Diffusion Neck is designed to deeply integrate and optimize features at different levels through multiple rounds of multi-scale feature fusion and scale adaptation, and a Focus module is introduced to emphasize and strengthen the key features of small targets. On this basis, to enable efficient deployment, a Group-Level First-Order Taylor Expansion Importance Assessment Method is proposed to streamline the model structure by eliminating channels that have little impact on detection accuracy. Channel Distribution Distillation then compensates for the slight accuracy loss caused by pruning, achieving both high accuracy and high efficiency. Furthermore, we analyze the interpretability of FocusNet via heatmaps generated by KPCA-CAM. Experiments show that FocusNet achieves 98.50% precision and 99.20% mAP@0.5 on a proprietary insulator defect detection database created for this project, using only 3.80 GFLOPs. This research provides reliable technical support for insulator monitoring in power systems. Full article
(This article belongs to the Special Issue Algorithms for Feature Selection (3rd Edition))
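
The Taylor-based pruning criterion here builds on a standard idea (Molchanov et al.): after a backward pass, a channel's importance is approximated by |w · ∂L/∂w| accumulated over its weights. A minimal sketch of that base criterion follows — the paper's group-level variant and the distillation stage are not reproduced, and the loss is a stand-in:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(16, 8, 3, padding=1))
x = torch.randn(4, 3, 64, 64)
loss = model(x).abs().mean()       # stand-in for the real detection loss
loss.backward()

conv = model[0]
# First-order Taylor score per output channel of the first conv layer
# (bias term omitted for brevity).
scores = (conv.weight * conv.weight.grad).abs().sum(dim=(1, 2, 3))
keep = scores.argsort(descending=True)[:8]   # keep the 8 most important of 16
print(keep.tolist())
```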

22 pages, 3964 KB  
Article
MultiScaleSleepNet: A Hybrid CNN–BiLSTM–Transformer Architecture with Multi-Scale Feature Representation for Single-Channel EEG Sleep Stage Classification
by Cenyu Liu, Qinglin Guan, Wei Zhang, Liyang Sun, Mengyi Wang, Xue Dong and Shuogui Xu
Sensors 2025, 25(20), 6328; https://doi.org/10.3390/s25206328 - 13 Oct 2025
Viewed by 836
Abstract
Accurate automatic sleep stage classification from single-channel EEG remains challenging due to the need for effective extraction of multiscale neurophysiological features and modeling of long-range temporal dependencies. This study aims to address these limitations by developing an efficient and compact deep learning architecture tailored for wearable and edge device applications. We propose MultiScaleSleepNet, a hybrid convolutional neural network–bidirectional long short-term memory–transformer architecture that extracts multiscale temporal and spectral features through parallel convolutional branches, followed by sequential modeling using a BiLSTM network and transformer-based attention mechanisms. The model achieved an accuracy, macro-averaged F1 score, and kappa coefficient of 88.6%, 0.833, and 0.84 on the Sleep-EDF dataset; 85.6%, 0.811, and 0.80 on the Sleep-EDF Expanded dataset; and 84.6%, 0.745, and 0.79 on the SHHS dataset. Ablation studies indicate that attention mechanisms and spectral fusion consistently improve performance, with the most notable gains observed for stages N1, N3, and rapid eye movement. MultiScaleSleepNet demonstrates competitive performance across multiple benchmark datasets while maintaining a compact size of 1.9 million parameters, suggesting robustness to variations in dataset size and class distribution. The study supports the feasibility of real-time, accurate sleep staging from single-channel EEG using parameter-efficient deep models suitable for portable systems. Full article
(This article belongs to the Special Issue AI on Biomedical Signal Sensing and Processing for Health Monitoring)
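
As a hedged illustration of the hybrid pipeline — parallel 1-D convolutions at different kernel sizes for multiscale features, then a BiLSTM, then transformer attention — a compact PyTorch sketch, with all sizes assumed rather than taken from the paper:

```python
import torch
import torch.nn as nn

class SleepStager(nn.Module):
    def __init__(self, n_classes: int = 5):
        super().__init__()
        # Parallel temporal branches: short kernels for fast rhythms,
        # long kernels for slow waves.
        self.branch_s = nn.Sequential(nn.Conv1d(1, 16, 7, stride=4, padding=3), nn.ReLU())
        self.branch_l = nn.Sequential(nn.Conv1d(1, 16, 51, stride=4, padding=25), nn.ReLU())
        self.lstm = nn.LSTM(32, 32, batch_first=True, bidirectional=True)
        layer = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=1)
        self.head = nn.Linear(64, n_classes)

    def forward(self, x):                        # x: (batch, 1, samples)
        f = torch.cat([self.branch_s(x), self.branch_l(x)], dim=1)  # (batch, 32, T)
        h, _ = self.lstm(f.transpose(1, 2))      # (batch, T, 64)
        h = self.encoder(h)                      # attention over time
        return self.head(h.mean(dim=1))          # logits per sleep stage

epoch = torch.randn(2, 1, 3000)                  # one 30 s epoch at 100 Hz
print(SleepStager()(epoch).shape)                # torch.Size([2, 5])
```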

39 pages, 13725 KB  
Article
SRTSOD-YOLO: Stronger Real-Time Small Object Detection Algorithm Based on Improved YOLO11 for UAV Imageries
by Zechao Xu, Huaici Zhao, Pengfei Liu, Liyong Wang, Guilong Zhang and Yuan Chai
Remote Sens. 2025, 17(20), 3414; https://doi.org/10.3390/rs17203414 - 12 Oct 2025
Viewed by 1111
Abstract
To address the challenges of small target detection in UAV aerial images—such as difficulty in feature extraction, complex background interference, high miss rates, and stringent real-time requirements—this paper proposes an innovative model series named SRTSOD-YOLO, based on YOLO11. The backbone network incorporates a Multi-scale Feature Complementary Aggregation Module (MFCAM), designed to mitigate the loss of small target information as network depth increases. By integrating channel and spatial attention mechanisms with multi-scale convolutional feature extraction, MFCAM effectively locates small objects in the image. Furthermore, we introduce a novel neck architecture termed Gated Activation Convolutional Fusion Pyramid Network (GAC-FPN). This module enhances multi-scale feature fusion by emphasizing salient features while suppressing irrelevant background information. GAC-FPN employs three key strategies: adding a detection head with a small receptive field while removing the original largest one, leveraging large-scale features more effectively, and incorporating gated activation convolutional modules. To tackle the issue of positive-negative sample imbalance, we replace the conventional binary cross-entropy loss with an adaptive threshold focal loss in the detection head, accelerating network convergence. Additionally, to accommodate diverse application scenarios, we develop multiple versions of SRTSOD-YOLO by adjusting the width and depth of the network modules: a nano version (SRTSOD-YOLO-n), small (SRTSOD-YOLO-s), medium (SRTSOD-YOLO-m), and large (SRTSOD-YOLO-l). Experimental results on the VisDrone2019 and UAVDT datasets demonstrate that SRTSOD-YOLO-n improves the mAP@0.5 by 3.1% and 1.2% compared to YOLO11n, while SRTSOD-YOLO-l achieves gains of 7.9% and 3.3% over YOLO11l, respectively. Compared to other state-of-the-art methods, SRTSOD-YOLO-l attains the highest detection accuracy while maintaining real-time performance, underscoring the superiority of the proposed approach. Full article
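
The abstract does not spell out the adaptive threshold focal loss, so the sketch below shows only the standard binary focal loss (Lin et al.) that this family of losses builds on: well-classified examples are down-weighted by (1 − p_t)^γ, easing the positive–negative imbalance the paper targets:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Binary focal loss over raw logits; targets are 0/1 floats."""
    p = torch.sigmoid(logits)
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = p * targets + (1 - p) * (1 - targets)           # prob of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()

logits = torch.randn(4, 100)                     # e.g. objectness scores for 100 anchors
targets = (torch.rand(4, 100) < 0.05).float()    # sparse positives, as with small targets
print(focal_loss(logits, targets))
```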

30 pages, 13570 KB  
Article
DVIF-Net: A Small-Target Detection Network for UAV Aerial Images Based on Visible and Infrared Fusion
by Xiaofeng Zhao, Hui Zhang, Chenxiao Li, Kehao Wang and Zhili Zhang
Remote Sens. 2025, 17(20), 3411; https://doi.org/10.3390/rs17203411 - 11 Oct 2025
Viewed by 897
Abstract
During UAV aerial photography tasks, influenced by flight altitude and imaging mechanisms, targets in images often exhibit small size, complex backgrounds, and small inter-class differences. Under a single optical modality, the weak, less discriminative feature representation of targets in drone-captured images leaves them easily overwhelmed by complex background noise, leading to low detection accuracy and high missed- and false-detection rates in current object detection networks. Moreover, such methods struggle to meet all-weather, all-scenario application requirements. To address these issues, this paper proposes DVIF-Net, a visible–infrared fusion network for small-target detection in UAV aerial images, which leverages the complementary characteristics of visible and infrared images to enhance detection capability in complex environments. Firstly, a dual-branch feature extraction structure is designed based on the YOLO architecture to separately extract features from visible and infrared images. Secondly, a P4-level cross-modal fusion strategy is proposed to effectively integrate features from both modalities while reducing computational complexity. Meanwhile, we design a novel dual context-guided fusion module that captures complementary features through channel attention of the visible and infrared images during fusion and enhances interaction between modalities via element-wise multiplication. Finally, an edge information enhancement module based on a cross stage partial structure is developed to improve sensitivity to small-target edges. Experimental results on two cross-modal datasets, DroneVehicle and VEDAI, demonstrate that DVIF-Net achieves detection accuracies of 85.8% and 62%, respectively. Compared with YOLOv10n, it improves by 21.7% and 10.5% in the visible modality and by 7.4% and 30.5% in the infrared modality, while maintaining a parameter count of only 2.49 M. Furthermore, compared with 15 other algorithms, the proposed DVIF-Net attains state-of-the-art performance. These results indicate that the method significantly enhances the detection of small targets in UAV aerial images, offering a high-precision, lightweight solution for real-time applications in complex aerial scenarios. Full article
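
A hedged sketch of the fusion idea described above — per-modality channel attention plus element-wise multiplicative interaction between the visible and infrared streams. Layer names, sizes, and the exact merge are assumptions, not the paper's module:

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    def __init__(self, c: int):
        super().__init__()
        self.att_vis = nn.Sequential(nn.Linear(c, c), nn.Sigmoid())
        self.att_ir = nn.Sequential(nn.Linear(c, c), nn.Sigmoid())
        self.merge = nn.Conv2d(2 * c, c, kernel_size=1)

    def forward(self, vis, ir):                   # both (batch, c, h, w)
        wv = self.att_vis(vis.mean(dim=(2, 3)))[:, :, None, None]  # channel weights
        wi = self.att_ir(ir.mean(dim=(2, 3)))[:, :, None, None]
        inter = (vis * wi) * (ir * wv)            # multiplicative cross-modal interaction
        return self.merge(torch.cat([vis + inter, ir + inter], dim=1))

vis, ir = torch.randn(2, 64, 40, 40), torch.randn(2, 64, 40, 40)
print(CrossModalFusion(64)(vis, ir).shape)        # torch.Size([2, 64, 40, 40])
```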

31 pages, 3160 KB  
Article
Multimodal Image Segmentation with Dynamic Adaptive Window and Cross-Scale Fusion for Heterogeneous Data Environments
by Qianping He, Meng Wu, Pengchang Zhang, Lu Wang and Quanbin Shi
Appl. Sci. 2025, 15(19), 10813; https://doi.org/10.3390/app151910813 - 8 Oct 2025
Viewed by 570
Abstract
Multi-modal image segmentation is a key task in fields such as urban planning, infrastructure monitoring, and environmental analysis. However, it remains challenging due to complex scenes, varying object scales, and the integration of heterogeneous data sources (such as RGB, depth maps, and infrared). To address these challenges, we propose a novel multi-modal segmentation framework, DyFuseNet, which features dynamic adaptive windows and cross-scale feature fusion capabilities. This framework consists of three key components: (1) the Dynamic Window Module (DWM), which uses dynamic partitioning and continuous position bias to adaptively adjust window sizes, thereby improving the representation of irregular and fine-grained objects; (2) Scale Context Attention (SCA), a hierarchical mechanism that associates local details with global semantics in a coarse-to-fine manner, enhancing segmentation accuracy in low-texture or occluded regions; and (3) the Hierarchical Adaptive Fusion Architecture (HAFA), which aligns and fuses features from multiple modalities through shallow synchronization and deep channel attention, effectively balancing complementarity and redundancy. Evaluated on benchmark datasets (ISPRS Vaihingen and Potsdam), DyFuseNet achieved state-of-the-art performance, with mean Intersection over Union (mIoU) scores of 80.40% and 80.85%, surpassing MFTransNet by 1.91% and 1.77%, respectively. The model also demonstrated strong robustness in challenging scenes (such as building edges and shadowed objects), achieving an average F1 score of 85% while maintaining high efficiency (26.19 GFLOPs, 30.09 FPS), making it suitable for real-time deployment. This work presents a practical, versatile, and computationally efficient solution for multi-modal image analysis, with potential applications beyond remote sensing, including smart monitoring, industrial inspection, and multi-source data fusion tasks. Full article
(This article belongs to the Special Issue Signal and Image Processing: From Theory to Applications: 2nd Edition)

29 pages, 4573 KB  
Article
LCW-YOLO: A Lightweight Multi-Scale Object Detection Method Based on YOLOv11 and Its Performance Evaluation in Complex Natural Scenes
by Gang Li and Juelong Fang
Sensors 2025, 25(19), 6209; https://doi.org/10.3390/s25196209 - 7 Oct 2025
Viewed by 736
Abstract
Accurate object detection is fundamental to computer vision, yet detecting small targets in complex backgrounds remains challenging due to feature loss and limited model efficiency. To address this, we propose LCW-YOLO, a lightweight detection framework that integrates three innovations: Wavelet Pooling, a CGBlock-enhanced C3K2 structure, and an improved LDHead detection head. The Wavelet Pooling strategy employs Haar-based multi-frequency reconstruction to preserve fine-grained details while mitigating noise sensitivity. CGBlock introduces dynamic channel interactions within C3K2, facilitating the fusion of shallow visual cues with deep semantic features without excessive computational overhead. LDHead incorporates classification and localization functions, thereby improving target recognition accuracy and spatial precision. Extensive experiments across multiple public datasets demonstrate that LCW-YOLO outperforms mainstream detectors in both accuracy and inference speed, with notable advantages in small-object, sparse, and cluttered scenarios. Here we show that the combination of multi-frequency feature preservation and efficient feature fusion enables stronger representations under complex conditions, advancing the design of resource-efficient detection models for safety-critical and real-time applications. Full article
(This article belongs to the Section Remote Sensors)
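
The Wavelet Pooling here plausibly follows the common 2 × 2 orthonormal Haar decomposition; a minimal sketch (the paper's exact variant may differ): each non-overlapping 2 × 2 block yields an approximation subband plus three detail subbands, halving resolution without discarding the high-frequency detail that max or average pooling would:

```python
import torch

def haar_pool(x):                 # x: (batch, c, h, w), h and w even
    a = x[..., 0::2, 0::2]        # top-left of each 2x2 block
    b = x[..., 0::2, 1::2]        # top-right
    c = x[..., 1::2, 0::2]        # bottom-left
    d = x[..., 1::2, 1::2]        # bottom-right
    ll = (a + b + c + d) / 2      # low-pass approximation
    lh = (a - b + c - d) / 2      # horizontal detail
    hl = (a + b - c - d) / 2      # vertical detail
    hh = (a - b - c + d) / 2      # diagonal detail
    # Stack subbands along channels so later convs can reweight them.
    return torch.cat([ll, lh, hl, hh], dim=1)

x = torch.randn(1, 16, 64, 64)
print(haar_pool(x).shape)         # torch.Size([1, 64, 32, 32])
```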

24 pages, 73520 KB  
Article
2C-Net: A Novel Spatiotemporal Dual-Channel Network for Soil Organic Matter Prediction Using Multi-Temporal Remote Sensing and Environmental Covariates
by Jiale Geng, Chong Luo, Jun Lu, Depiao Kong, Xue Li and Huanjun Liu
Remote Sens. 2025, 17(19), 3358; https://doi.org/10.3390/rs17193358 - 3 Oct 2025
Viewed by 415
Abstract
Soil organic matter (SOM) is essential for ecosystem health and agricultural productivity. Accurate prediction of SOM content is critical for modern agricultural management and sustainable soil use. Existing digital soil mapping (DSM) models, when processing temporal data, primarily focus on modeling the changes in input data across successive time steps. However, they do not adequately model the relationships among different input variables, which hinders the capture of complex data patterns and limits the accuracy of predictions. To address this problem, this paper proposes a novel deep learning model, the 2-Channel Network (2C-Net), which leverages sequential multi-temporal remote sensing images to improve SOM prediction. The network separates input data into temporal and spatial data, processing them through independent temporal and spatial channels. Temporal data includes multi-temporal Sentinel-2 spectral reflectance, while spatial data consists of environmental covariates including climate and topography. The Multi-sequence Feature Fusion Module (MFFM) is proposed to globally model spectral data across multiple bands and time steps, and the Diverse Convolutional Architecture (DCA) extracts spatial features from environmental data. Experimental results show that 2C-Net outperforms the baseline model (CNN-LSTM) and mainstream machine learning models for DSM, with R² = 0.524, RMSE = 0.884 (%), MAE = 0.581 (%), and MSE = 0.781 (%)². Furthermore, this study demonstrates the significant importance of sequential spectral data for the inversion of SOM content and concludes the following: for the SOM inversion task, the bare soil period after tilling is a more important time window than other bare soil periods. The 2C-Net model effectively captures spatiotemporal features, offering high-accuracy SOM predictions and supporting future DSM and soil management. Full article
(This article belongs to the Special Issue Remote Sensing in Soil Organic Carbon Dynamics)
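
A rough sketch of the two-channel idea under assumed sizes (band count, date count, and covariate count are illustrative): a temporal channel attends over multi-date spectral vectors, a spatial channel convolves an environmental-covariate patch, and the concatenated embeddings regress SOM content:

```python
import torch
import torch.nn as nn

class TwoChannelSOM(nn.Module):
    def __init__(self, n_bands=10, n_dates=6, n_cov=8):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True)
        self.band_proj = nn.Linear(n_bands, 32)
        self.temporal = nn.TransformerEncoder(layer, num_layers=1)
        self.spatial = nn.Sequential(
            nn.Conv2d(n_cov, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.head = nn.Linear(32 + 16, 1)         # predicted SOM content (%)

    def forward(self, spectra, covariates):
        # spectra: (batch, n_dates, n_bands); covariates: (batch, n_cov, h, w)
        t = self.temporal(self.band_proj(spectra)).mean(dim=1)   # (batch, 32)
        s = self.spatial(covariates)                             # (batch, 16)
        return self.head(torch.cat([t, s], dim=1)).squeeze(-1)

som = TwoChannelSOM()(torch.randn(4, 6, 10), torch.randn(4, 8, 9, 9))
print(som.shape)                                   # torch.Size([4])
```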

16 pages, 10633 KB  
Article
HVI-Based Spatial–Frequency-Domain Multi-Scale Fusion for Low-Light Image Enhancement
by Yuhang Zhang, Huiying Zheng, Xinya Xu and Hancheng Zhu
Appl. Sci. 2025, 15(19), 10376; https://doi.org/10.3390/app151910376 - 24 Sep 2025
Viewed by 497
Abstract
Low-light image enhancement aims to restore images captured under extreme low-light conditions. Existing methods demonstrate that fusing Fourier transform magnitude and phase information within the RGB color space effectively improves enhancement results. Meanwhile, recent advances have demonstrated that certain color spaces based on human visual perception, such as Hue–Value–Intensity (HVI), are superior to RGB for enhancing low-light images. However, these methods neglect the key impact of the coupling relationship between spatial and frequency-domain features on image enhancement. This paper proposes a spatial–frequency-domain multi-scale fusion method for low-light image enhancement that explores the intrinsic relationships among the three channels of the HVI space and adopts a dual-path parallel processing architecture. In the spatial domain, a specifically designed multi-scale feature extraction module systematically captures comprehensive structural information. In the frequency domain, our model establishes deep coupling between spatial features and Fourier transform features in the I-channel. The fused features from both domains jointly drive an encoder–decoder network to achieve superior image enhancement performance. Extensive experiments on multiple public benchmark datasets show that the proposed method significantly outperforms state-of-the-art approaches in both quantitative metrics and visual quality. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
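
A minimal sketch of the frequency-domain path, assuming the usual torch.fft recipe: the intensity channel is transformed, amplitude and phase are filtered separately (amplitude carries illumination, phase carries structure), and the spectrum is rebuilt. The 1 × 1 convolutions are placeholders, not the paper's layers:

```python
import torch
import torch.nn as nn

class FourierBranch(nn.Module):
    def __init__(self):
        super().__init__()
        self.amp_conv = nn.Sequential(nn.Conv2d(1, 8, 1), nn.ReLU(),
                                      nn.Conv2d(8, 1, 1), nn.ReLU())
        self.pha_conv = nn.Sequential(nn.Conv2d(1, 8, 1), nn.ReLU(),
                                      nn.Conv2d(8, 1, 1))

    def forward(self, i_chan):                    # (batch, 1, h, w) intensity channel
        freq = torch.fft.rfft2(i_chan, norm="ortho")
        amp = self.amp_conv(torch.abs(freq))      # enhance illumination (amplitude)
        pha = self.pha_conv(torch.angle(freq))    # keep structure (phase)
        out = torch.polar(amp, pha)               # rebuild the complex spectrum
        return torch.fft.irfft2(out, s=i_chan.shape[-2:], norm="ortho")

x = torch.rand(2, 1, 64, 64)
print(FourierBranch()(x).shape)                   # torch.Size([2, 1, 64, 64])
```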

24 pages, 4503 KB  
Article
Single-Phase Ground Fault Detection Method in Three-Phase Four-Wire Distribution Systems Using Optuna-Optimized TabNet
by Xiaohua Wan, Hui Fan, Min Li and Xiaoyuan Wei
Electronics 2025, 14(18), 3659; https://doi.org/10.3390/electronics14183659 - 16 Sep 2025
Viewed by 558
Abstract
Single-phase ground (SPG) faults pose significant challenges in three-phase four-wire distribution systems due to their complex transient characteristics and the presence of multiple influencing factors. To solve the aforementioned issues, a comprehensive fault identification framework is proposed, which uses the TabNet deep learning architecture with hyperparameters optimized by Optuna. Firstly, a 10 kV simulation model is developed in Simulink to generate a diverse fault dataset. For each simulated fault, voltage and current signals from eight channels (L1–L4 voltage and current) are collected. Secondly, multi-domain features are extracted from each channel across time, frequency, waveform, and wavelet perspectives. Then, an attention-based fusion mechanism is employed to capture cross-channel dependencies, followed by L2-norm-based feature selection to enhance generalization. Finally, the optimized TabNet model effectively classifies 24 fault categories, achieving an accuracy of 97.33%, and outperforms baseline methods including Temporal Convolutional Network (TCN), Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM), Capsule Network with Sparse Filtering (CNSF), and Dual-Branch CNN in terms of accuracy, macro-F1 score, and kappa coefficient. It also exhibits strong stability and fast convergence during training. These results demonstrate the robustness and interpretability of the proposed method for SPG fault detection. Full article
(This article belongs to the Section Power Electronics)
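
A skeleton of the Optuna search loop: the search space is assumed, the data are synthetic stand-ins, and the classifier call follows the community pytorch-tabnet package rather than the authors' code:

```python
import optuna
import numpy as np
from pytorch_tabnet.tab_model import TabNetClassifier

X_train, y_train = np.random.rand(512, 40).astype("float32"), np.random.randint(0, 24, 512)
X_val, y_val = np.random.rand(128, 40).astype("float32"), np.random.randint(0, 24, 128)

def objective(trial: optuna.Trial) -> float:
    clf = TabNetClassifier(
        n_d=trial.suggest_int("n_d", 8, 64),          # decision embedding width
        n_steps=trial.suggest_int("n_steps", 3, 10),  # sequential attention steps
        gamma=trial.suggest_float("gamma", 1.0, 2.0), # feature-reuse relaxation
        verbose=0)
    clf.fit(X_train, y_train, eval_set=[(X_val, y_val)], max_epochs=50, patience=10)
    return (clf.predict(X_val) == y_val).mean()       # validation accuracy

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20)
print(study.best_params)
```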

19 pages, 3899 KB  
Article
An Automatic Brain Cortex Segmentation Technique Based on Dynamic Recalibration and Region Awareness
by Jiaofen Nan, Gaodeng Fan, Kaifan Zhang, Shuyao Zhai, Xueqi Jin, Duan Li and Chunlai Yu
Electronics 2025, 14(18), 3631; https://doi.org/10.3390/electronics14183631 - 13 Sep 2025
Viewed by 367
Abstract
To address the limitations in the accuracy of current cerebral cortex structure segmentation methods, this study proposes an automatic segmentation network based on dynamic recalibration and region awareness. The network is an improved version of the classic U-shaped architecture, incorporating a Dynamic Recalibration Block (DRB) and a Region-Aware Block (RAB). The DRB enhances important feature channels by extracting global feature information across channels, computing significance weights via a two-layer fully connected network, and applying these weights to the original feature maps for dynamic feature reweighting. Meanwhile, the RAB integrates spatial positional information and captures both global and local context across multiple dimensions. It recalibrates features using dimension-specific weights, enabling region-aware feature association and complementing the DRB's function. Together, these components enable efficient and accurate segmentation of brain structures. The proposed DRA-Net model effectively overcomes the accuracy–efficiency trade-off in cortical segmentation through multi-scale feature fusion, dual attention mechanisms, and deep feature extraction strategies. Experimental results demonstrate that DRA-Net achieves an average Dice score of 91.35% across multiple datasets, outperforming segmentation methods such as U-Net, QuickNAT, and FastSurfer. Full article
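
The DRB as described is close to squeeze-and-excitation, so a faithful-looking sketch is short (reduction ratio and sizes assumed): global pooling gathers per-channel context, a two-layer fully connected network computes significance weights, and the weights rescale the feature maps:

```python
import torch
import torch.nn as nn

class DynamicRecalibration(nn.Module):
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                          # x: (batch, c, h, w)
        w = self.fc(x.mean(dim=(2, 3)))            # global context -> channel weights
        return x * w[:, :, None, None]             # dynamic feature reweighting

fmap = torch.randn(2, 32, 48, 48)
print(DynamicRecalibration(32)(fmap).shape)        # torch.Size([2, 32, 48, 48])
```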

15 pages, 5090 KB  
Article
EFIMD-Net: Enhanced Feature Interaction and Multi-Domain Fusion Deep Forgery Detection Network
by Hao Cheng, Weiye Pang, Kun Li, Yongzhuang Wei, Yuhang Song and Ji Chen
J. Imaging 2025, 11(9), 312; https://doi.org/10.3390/jimaging11090312 - 12 Sep 2025
Viewed by 594
Abstract
Currently, deepfake detection has garnered widespread attention as a key defense mechanism against the misuse of deepfake technology. However, existing deepfake detection networks still face challenges such as insufficient robustness, limited generalization capabilities, and a single feature extraction domain (e.g., using only spatial domain features) when confronted with evolving algorithms or diverse datasets, which severely limits their application capabilities. To address these issues, this study proposes a deepfake detection network named EFIMD-Net, which enhances performance by strengthening feature interaction and integrating spatial and frequency domain features. The proposed network integrates a Cross-feature Interaction Enhancement module (CFIE) based on cosine similarity, which achieves adaptive interaction between spatial domain features (RGB stream) and frequency domain features (SRM, Spatial Rich Model stream) through a channel attention mechanism, effectively fusing macro-semantic information with high-frequency artifact information. Additionally, an Enhanced Multi-scale Feature Fusion (EMFF) module is proposed, which effectively integrates multi-scale feature information from various layers of the network through adaptive feature enhancement and reorganization techniques. Experimental results show that compared to the baseline network Xception, EFIMD-Net achieves comparable or even better Area Under the Curve (AUC) on multiple datasets. Ablation experiments also validate the effectiveness of the proposed modules. Furthermore, compared to the baseline traditional two-stream network Locate and Verify, EFIMD-Net significantly improves forgery detection performance, with a 9-percentage-point increase in Area Under the Curve on the CelebDF-v1 dataset and a 7-percentage-point increase on the CelebDF-v2 dataset. These results fully demonstrate the effectiveness and generalization of EFIMD-Net in forgery detection. Potential limitations regarding real-time processing efficiency are acknowledged. Full article
(This article belongs to the Section Biometrics, Forensics, and Security)
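
A hedged sketch of cosine-similarity-guided interaction between an RGB stream and a noise (SRM-style) stream — channels on which the two streams agree exchange information more strongly. The gating form and sizes are assumptions, not the paper's exact CFIE module:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CosineInteraction(nn.Module):
    def __init__(self, c: int):
        super().__init__()
        self.gate_rgb = nn.Sequential(nn.Linear(c, c), nn.Sigmoid())
        self.gate_srm = nn.Sequential(nn.Linear(c, c), nn.Sigmoid())

    def forward(self, rgb, srm):                   # both (batch, c, h, w)
        b, c = rgb.shape[:2]
        # Per-channel cosine similarity between the two streams.
        sim = F.cosine_similarity(rgb.reshape(b, c, -1), srm.reshape(b, c, -1), dim=2)
        w_rgb = self.gate_rgb(sim)[:, :, None, None]   # how much SRM flows into RGB
        w_srm = self.gate_srm(sim)[:, :, None, None]
        return rgb + w_rgb * srm, srm + w_srm * rgb

rgb, srm = torch.randn(2, 32, 56, 56), torch.randn(2, 32, 56, 56)
out_rgb, out_srm = CosineInteraction(32)(rgb, srm)
print(out_rgb.shape, out_srm.shape)
```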

17 pages, 3935 KB  
Article
Markerless Force Estimation via SuperPoint-SIFT Fusion and Finite Element Analysis: A Sensorless Solution for Deformable Object Manipulation
by Qingqing Xu, Ruoyang Lai and Junqing Yin
Biomimetics 2025, 10(9), 600; https://doi.org/10.3390/biomimetics10090600 - 8 Sep 2025
Viewed by 532
Abstract
Contact-force perception is a critical component of safe robotic grasping. With the rapid advances in embodied intelligence technology, humanoid robots have enhanced their multimodal perception capabilities. Conventional force sensors face limitations, such as complex spatial arrangements, installation challenges at multiple nodes, and potential interference with robotic flexibility. Consequently, these conventional sensors are unsuitable for biomimetic robot requirements in object perception, natural interaction, and agile movement. Therefore, this study proposes a sensorless external force detection method that integrates SuperPoint-Scale Invariant Feature Transform (SIFT) feature extraction with finite element analysis to address force perception challenges. A visual analysis method based on the SuperPoint-SIFT feature fusion algorithm was implemented to reconstruct a three-dimensional displacement field of the target object. Subsequently, the displacement field was mapped to the contact force distribution using finite element modeling. Experimental results demonstrate a mean force estimation error of 7.60% (isotropic) and 8.15% (anisotropic), with RMSE < 8%, validated by flexible pressure sensors. To enhance the model’s reliability, a dual-channel video comparison framework was developed. By analyzing the consistency of the deformation patterns and mechanical responses between the actual compression and finite element simulation video keyframes, the proposed approach provides a novel solution for real-time force perception in robotic interactions. The proposed solution is suitable for applications such as precision assembly and medical robotics, where sensorless force feedback is crucial. Full article
(This article belongs to the Special Issue Bio-Inspired Intelligent Robot)
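
SuperPoint needs a trained network, so this sketch exercises only the SIFT half with OpenCV: match keypoints between an undeformed and a deformed frame and read off sparse displacement vectors, which would then drive the finite element model. File names are placeholders:

```python
import cv2
import numpy as np

ref = cv2.imread("undeformed.png", cv2.IMREAD_GRAYSCALE)
cur = cv2.imread("deformed.png", cv2.IMREAD_GRAYSCALE)
assert ref is not None and cur is not None, "replace placeholder image paths"

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(ref, None)
kp2, des2 = sift.detectAndCompute(cur, None)

# Lowe ratio-test matching: keep matches clearly better than the runner-up.
matcher = cv2.BFMatcher(cv2.NORM_L2)
good = [m for m, n in matcher.knnMatch(des1, des2, k=2) if m.distance < 0.75 * n.distance]

# Sparse displacement vectors at the matched keypoints.
disp = np.array([np.array(kp2[m.trainIdx].pt) - np.array(kp1[m.queryIdx].pt)
                 for m in good])
print(f"{len(good)} matches, mean displacement {disp.mean(axis=0)} px")
```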

24 pages, 3398 KB  
Article
DEMNet: Dual Encoder–Decoder Multi-Frame Infrared Small Target Detection Network with Motion Encoding
by Feng He, Qiran Zhang, Yichuan Li and Tianci Wang
Remote Sens. 2025, 17(17), 2963; https://doi.org/10.3390/rs17172963 - 26 Aug 2025
Viewed by 976
Abstract
Infrared dim and small target detection aims to accurately localize targets within complex backgrounds or clutter. However, under extremely low signal-to-noise ratio (SNR) conditions, single-frame detection methods often fail to effectively detect such targets. In contrast, multi-frame detection can exploit temporal cues to significantly improve the probability of detection (Pd) and reduce false alarms (Fa). Existing multi-frame approaches often employ 3D convolutions/RNNs to implicitly extract temporal features. However, they typically lack explicit modeling of target motion. To address this, we propose a Dual Encoder–Decoder Multi-Frame Infrared Small Target Detection Network with Motion Encoding (DEMNet) that explicitly incorporates motion information into the detection process. The first multi-level encoder–decoder module leverages spatial and channel attention mechanisms to fuse hierarchical features across multiple scales, enabling robust spatial feature extraction from each frame of the temporally aligned input sequence. The second encoder–decoder module encodes both inter-frame target motion and intra-frame target positional information, followed by 3D convolution to achieve effective motion information fusion. Extensive experiments demonstrate that DEMNet achieves state-of-the-art performance, outperforming recent advanced methods such as DTUM and SSTNet. For the DAUB dataset, compared to the second-best model, DEMNet improves Pd by 2.42 percentage points and reduces Fa by 4.13 × 10⁻⁶ (a 68.72% reduction). For the NUDT dataset, it improves Pd by 1.68 percentage points and reduces Fa by 0.67 × 10⁻⁶ (a 7.26% reduction) compared to the next-best model. Notably, DEMNet demonstrates even greater advantages on test sequences with SNR ≤ 3. Full article
(This article belongs to the Special Issue Recent Advances in Infrared Target Detection)
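
A hedged sketch of the motion-encoding stage: per-frame feature maps from the aligned sequence are stacked along a time axis and mixed by 3-D convolution, so inter-frame displacement of a dim target becomes a learnable cue. Channel sizes are illustrative, not DEMNet's:

```python
import torch
import torch.nn as nn

class MotionFusion(nn.Module):
    def __init__(self, c: int = 16):
        super().__init__()
        self.frame_enc = nn.Sequential(nn.Conv2d(1, c, 3, padding=1), nn.ReLU())
        # 3-D conv mixes neighbouring frames as well as spatial neighbours.
        self.fuse = nn.Conv3d(c, c, kernel_size=(3, 3, 3), padding=1)
        self.head = nn.Conv2d(c, 1, 1)             # per-pixel target evidence

    def forward(self, frames):                     # (batch, t, 1, h, w)
        b, t = frames.shape[:2]
        f = self.frame_enc(frames.flatten(0, 1))   # encode each frame alone
        f = f.view(b, t, *f.shape[1:]).transpose(1, 2)  # (batch, c, t, h, w)
        f = self.fuse(f).mean(dim=2)               # fuse motion, collapse time
        return self.head(f)                        # (batch, 1, h, w) target map

seq = torch.randn(2, 5, 1, 128, 128)               # 5 aligned infrared frames
print(MotionFusion()(seq).shape)                   # torch.Size([2, 1, 128, 128])
```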

16 pages, 3575 KB  
Article
A Fusion Model for Intelligent Diagnosis of Gear Faults with Small Sample Sizes
by Jianing Huang, Zikang Liu, Jianggui Han, Chenghao Cao and Xiaofeng Li
Sensors 2025, 25(17), 5230; https://doi.org/10.3390/s25175230 - 22 Aug 2025
Viewed by 758
Abstract
Gear faults are a frequent cause of rotating machinery breakdowns. Two issues remain open in current intelligent gear fault diagnosis models: (1) shallow models demand less data but require handcrafted feature extraction from raw signals, relying on prior knowledge; (2) deep networks can adaptively extract fault features but require large datasets to train their parameters. In this paper, a novel fusion model, called CBAM-TCN-SVM, is proposed for intelligent gear fault diagnosis. It consists of a temporal convolutional network (TCN) module, a convolutional block attention module (CBAM), and a support vector machine (SVM) module. More specifically, frequency-domain sequence data are fed into the CBAM-TCN model, which extracts deep fault features via multiple convolutional layers and channel and spatial attention mechanisms. The SVM classifier is then employed for classification. The fusion model combines the advantages of deep networks and shallow classifiers, addressing the issues that arise when diagnostic accuracy is constrained by data scale and feature extraction relies on prior knowledge. In experiments, the proposed method achieves a classification accuracy of 98.3%, demonstrating that it is a feasible approach for gear fault prediction. Full article
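
A hedged sketch of the deep-features-plus-shallow-classifier pattern (a small attention-equipped 1-D conv net stands in for CBAM-TCN, and the network is left untrained here for brevity): frequency-domain sequences become fixed-length feature vectors, and an SVM does the final classification:

```python
import torch
import torch.nn as nn
from sklearn.svm import SVC

class FeatureNet(nn.Module):
    def __init__(self, c: int = 16):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv1d(1, c, 9, padding=4), nn.ReLU(),
                                  nn.Conv1d(c, c, 9, padding=4), nn.ReLU())
        self.att = nn.Sequential(nn.Linear(c, c), nn.Sigmoid())  # channel attention

    def forward(self, x):                          # (batch, 1, bins)
        f = self.conv(x)
        f = f * self.att(f.mean(dim=2))[:, :, None]   # reweight feature channels
        return f.mean(dim=2)                       # (batch, c) feature vector

net = FeatureNet()
spectra = torch.randn(60, 1, 256)                  # 60 small-sample spectra
labels = torch.randint(0, 4, (60,))                # 4 gear health states
with torch.no_grad():
    feats = net(spectra).numpy()
svm = SVC(kernel="rbf").fit(feats, labels.numpy())
print(svm.score(feats, labels.numpy()))            # training accuracy of the hybrid
```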
