MDPI - Publisher of Open Access Journals

24 pages, 59144 KB

Open Access Article

EWAM: Scene-Adaptive Infrared-Visible Image Matching with Radiation-Prior Encoding and Learnable Wavelet Edge Enhancement

by Mingwei Li, Hai Tan, Haoran Zhai and Jinlong Ci

Remote Sens. 2025, 17(22), 3666; https://doi.org/10.3390/rs17223666 - 7 Nov 2025

Abstract

Infrared–visible image matching is a prerequisite for environmental monitoring, military reconnaissance, and multisource geospatial analysis. However, pronounced texture disparities, intensity drift, and complex non-linear radiometric distortions in such cross-modal pairs mean that existing frameworks such as SuperPoint + SuperGlue (SP + SG) and LoFTR cannot reliably establish correspondences. To address this issue, we propose a dual-path architecture, the Environment-Adaptive Wavelet Enhancement and Radiation Priors Aided Matcher (EWAM). EWAM incorporates two synergistic branches: (1) an Environment-Adaptive Radiation Feature Extractor, which first classifies the scene according to radiation-intensity variations and then incorporates a physical radiation model into a learnable gating mechanism for selective feature propagation; (2) a Wavelet-Transform High-Frequency Enhancement Module, which recovers blurred edge structures by boosting wavelet coefficients under directional perceptual losses. The two branches collectively increase the number of tie points (reliable correspondences) and refine their spatial localization. A coarse-to-fine matcher subsequently refines the cross-modal correspondences. We benchmarked EWAM against SIFT, AKAZE, D2-Net, SP + SG, and LoFTR on a newly compiled dataset that fuses GF-7, Landsat-8, and Five-Billion-Pixels imagery. Across desert, mountain, gobi, urban and farmland scenes, EWAM reduced the average RMSE to 1.85 pixels and outperformed the best competing method by 2.7%, 2.6%, 2.0%, 2.3% and 1.8% in accuracy, respectively. These findings demonstrate that EWAM yields a robust and scalable framework for large-scale multi-sensor remote-sensing data fusion. Full article
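As a rough illustration of what boosting wavelet coefficients can mean, the sketch below applies a one-level 1D Haar transform, scales the detail (high-frequency) coefficients by a fixed gain, and inverts the transform. The function names and the fixed `gain` are assumptions made for illustration; the abstract describes a learnable module for 2D imagery trained with directional perceptual losses, which is not reproduced here.

```python
import math

_S = 1.0 / math.sqrt(2.0)  # Haar normalization factor


def haar_forward(signal):
    """One-level 1D Haar transform of an even-length sequence."""
    approx = [(signal[i] + signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed samples."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * _S)
        out.append((a - d) * _S)
    return out


def boost_edges(signal, gain=1.5):
    """Scale the high-frequency coefficients by `gain` (hypothetical fixed value)."""
    approx, detail = haar_forward(signal)
    boosted = [gain * d for d in detail]
    return haar_inverse(approx, boosted)
```

With `gain=1.0` the signal is reconstructed unchanged; a larger gain widens the difference within each sample pair while preserving the pair's mean, which is the sense in which edges are sharpened in this toy version.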

16 pages, 450 KB

Open Access Article

Uncertainty-Aware Multi-Branch Graph Attention Network for Transient Stability Assessment of Power Systems Under Disturbances

by Ke Wang, Shixiong Fan, Haotian Xu, Jincai Huang and Kezheng Jiang

Mathematics 2025, 13(22), 3575; https://doi.org/10.3390/math13223575 - 7 Nov 2025

Abstract

With the rapid development of modern society and the continuous growth of electricity demand, the stability of power systems has become increasingly critical. In particular, Transient Stability Assessment (TSA) plays a vital role in ensuring the secure and reliable operation of power systems. Existing studies have employed Graph Attention Networks (GAT) to model both the topological structure and vertex attributes of power systems, achieving excellent results under ideal test environments. However, the continuous expansion of power systems and the large-scale integration of renewable energy sources have significantly increased system complexity, posing major challenges to TSA. Traditional methods often struggle to handle various disturbances. To address this issue, we propose a graph attention network framework with multi-branch feature aggregation. This framework constructs multiple GAT branches from different information sources and employs a learnable mask mechanism to enhance diversity among branches. In addition, this framework adopts an uncertainty-aware aggregation strategy to efficiently fuse the information from all branches. Extensive experiments conducted on the IEEE-39 bus and IEEE-118 bus systems demonstrate that our method consistently outperforms existing approaches under different disturbance scenarios, providing more accurate and reliable identification of potential instability risks. Full article

(This article belongs to the Special Issue Advanced Neural Network and Machine Learning Algorithms, Models and Architectures in Data Mining)

24 pages, 2181 KB

Open Access Article

DPDQN-TER: An Improved Deep Reinforcement Learning Approach for Mobile Robot Path Planning in Dynamic Scenarios

by Shuyuan Gao, Yang Xu, Xiaoxiao Guo, Chenchen Liu and Xiaobai Wang

Sensors 2025, 25(21), 6741; https://doi.org/10.3390/s25216741 - 4 Nov 2025

Viewed by 433

Abstract

Efficient and stable path planning in dynamic and obstacle-dense environments, such as large-scale structure assembly measurement, is essential for improving the practicality and environmental adaptability of mobile robots in measurement and quality inspection tasks. However, traditional reinforcement learning methods often suffer from inefficient use of experience and limited capability to represent policy structures in complex dynamic scenarios. To overcome these limitations, this study proposes a method named DPDQN-TER that integrates Transformer-based sequence modeling with a multi-branch parameter policy network. The proposed method introduces a temporal-aware experience replay mechanism that employs multi-head self-attention to capture causal dependencies within state transition sequences. By dynamically weighting and sampling critical obstacle-avoidance experiences, this mechanism significantly improves learning efficiency, policy performance, and stability in dynamic environments. Furthermore, a multi-branch parameter policy structure is designed to decouple the continuous parameter generation tasks of different action categories into independent subnetworks, thereby reducing parameter interference and improving deployment-time efficiency. Extensive simulation experiments, including cross-environment validation, were conducted in both static and dynamic obstacle environments. The results show that DPDQN-TER achieves higher success rates, shorter path lengths, and faster convergence than benchmark algorithms including Parameterized Deep Q-Network (PDQN), Multi-Pass Deep Q-Network (MPDQN), and PDQN-TER. Ablation studies further confirm that both the Transformer-enhanced replay mechanism and the multi-branch parameter policy network contribute significantly to these improvements.
These findings demonstrate improved overall performance (e.g., success rate, path length, and convergence) and generalization capability of the proposed method, indicating its potential as a practical solution for autonomous navigation of mobile robots in complex industrial measurement scenarios. Full article

(This article belongs to the Special Issue Advanced Sensors for Path Planning and Navigation in Challenging Environments)

17 pages, 3049 KB

Open Access Article

PECNet: A Lightweight Single-Image Super-Resolution Network with Periodic Boundary Padding Shift and Multi-Scale Adaptive Feature Aggregation

by Tianyu Gao and Yuhao Liu

Symmetry 2025, 17(11), 1833; https://doi.org/10.3390/sym17111833 - 1 Nov 2025

Viewed by 177

Abstract

Lightweight Single-Image Super-Resolution (SISR) faces the core challenge of balancing computational efficiency with reconstruction quality, particularly in preserving both high-frequency details and global structures under constrained resources. To address this, we propose the Periodically Enhanced Cascade Network (PECNet). Our main contributions are as follows: 1. Its core component, a novel Multi-scale Adaptive Feature Aggregation (MAFA) module, which employs three functionally complementary branches that work synergistically: one dedicated to extracting local high-frequency details, another to efficiently modeling long-range dependencies and a third to capturing structured contextual information within windows. 2. To seamlessly integrate these branches and enable cross-window information interaction, we introduce the Periodic Boundary Padding Shift (PBPS) mechanism. This mechanism serves as a symmetric preprocessing step that achieves implicit window shifting without introducing any additional computational overhead. Extensive benchmarking shows PECNet achieves better reconstruction quality without a complexity increase. Taking the representative shift-window-based lightweight model, NGswin, as an example, for ×4 SR on the Manga109 dataset, PECNet achieves an average PSNR 0.25 dB higher, while its computational cost (in FLOPs) constitutes merely 40% of NGswin’s. Full article
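One plausible reading of "implicit window shifting" through periodic padding is sketched below: wrap-around padding followed by a crop back to the original size is equivalent to a cyclic shift, so a fixed window grid then covers different pixel neighbourhoods. This is an assumption about the general idea only; the function names are hypothetical and the abstract does not specify how PBPS is actually implemented.

```python
def periodic_pad_shift(grid, shift):
    """Cyclically shift a 2D list down/right by `shift` (0 <= shift <= size)
    using wrap-around padding and a crop."""
    h, w = len(grid), len(grid[0])
    # Periodic padding: prepend the last `shift` rows, then the last `shift` columns.
    padded_rows = grid[h - shift:] + grid
    padded = [row[w - shift:] + row for row in padded_rows]
    # Crop back to the original size; the content is now cyclically shifted.
    return [row[:w] for row in padded[:h]]


def window_partition(grid, size):
    """Split a 2D list into non-overlapping size x size windows, row-major."""
    h, w = len(grid), len(grid[0])
    windows = []
    for r in range(0, h, size):
        for c in range(0, w, size):
            windows.append([row[c:c + size] for row in grid[r:r + size]])
    return windows
```

For a 4×4 grid numbered 0 to 15 row by row, the first 2×2 window after a shift of 1 holds the values 15, 12, 3 and 0, which originally sat in four different windows. The same fixed partition therefore mixes information across the original window boundaries.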

(This article belongs to the Special Issue Symmetry in Deep Learning Networks and Its Applications in the Real World)

19 pages, 1895 KB

Open Access Article

Cross-Context Aggregation for Multi-View Urban Scene and Building Facade Matching

by Yaping Yan and Yuhang Zhou

ISPRS Int. J. Geo-Inf. 2025, 14(11), 425; https://doi.org/10.3390/ijgi14110425 - 31 Oct 2025

Viewed by 310

Abstract

Accurate and robust feature matching across multi-view urban imagery is fundamental for urban mapping, 3D reconstruction, and large-scale spatial alignment. Real-world urban scenes involve significant variations in viewpoint, illumination, and occlusion, as well as repetitive architectural patterns that make correspondence estimation challenging. To address these issues, we propose the Cross-Context Aggregation Matcher (CCAM), a detector-free framework that jointly leverages multi-scale local features, long-range contextual information, and geometric priors to produce spatially consistent matches. Specifically, CCAM integrates a multi-scale local enhancement branch with a parallel self- and cross-attention Transformer, enabling the model to preserve detailed local structures while maintaining a coherent global context. In addition, an independent positional encoding scheme is introduced to strengthen geometric reasoning in repetitive or low-texture regions. Extensive experiments demonstrate that CCAM outperforms state-of-the-art methods, achieving up to +31.8%, +19.1%, and +11.5% improvements in AUC@{5°, 10°, 20°} over detector-based approaches and up to 1.72% higher precision compared with detector-free counterparts. These results confirm that CCAM delivers reliable and spatially coherent matches, thereby facilitating downstream geospatial applications. Full article

18 pages, 6298 KB

Open Access Article

The Influence of Multi-Level Structure on the Bearing and Crack Propagation Mechanism of Tooth Enamel

by Yiyun Kong, Haiyan Xin, Siqi Zhu, Mengmeng Chen, Yujie Fan and Jing Xia

Coatings 2025, 15(11), 1255; https://doi.org/10.3390/coatings15111255 - 30 Oct 2025

Viewed by 300

Abstract

Dental enamel exhibits a unique combination of high hardness and high toughness. This outstanding mechanical property is closely tied to its multi-scale hierarchical structure. In this study, rat tooth enamel was selected as the research object, and its different structural layers and mechanical properties were investigated and characterized experimentally. Multi-scale mechanical models with different structural layers were developed and analyzed using numerical simulations. The results indicate that, regarding the load-bearing mechanism, the outer layer of tooth enamel consists of hydroxyapatite crystal bundles arranged in parallel and inclined orientations, a structural feature that gives it a high elastic modulus and resistance to deformation, while the inner layer with cross-arranged crystal bundles shows different mechanical response characteristics. In terms of crack propagation behavior, the outer layer is more prone to crack initiation due to the consistency of crystal orientation, and its cracks tend to extend in a straight line, while the cross arrangement of crystals in the inner layer can effectively inhibit crack propagation by inducing crack deflection and branching, thus providing superior fracture toughness. This “outer hard and inner flexible” gradient structure design elucidates the synergistic mechanism between crystal orientation and crack propagation behavior in tooth enamel, offering significant design insights for biomimetic composite materials. Full article

(This article belongs to the Section Surface Coatings for Biomedicine and Bioengineering)

18 pages, 2271 KB

Open Access Article

DAFF-Net: A Dual-Branch Attention-Guided Feature Fusion Network for Vehicle Re-Identification

by Yi Guo, Guowu Yuan, Wen Li and Hao Li

Algorithms 2025, 18(11), 690; https://doi.org/10.3390/a18110690 - 29 Oct 2025

Viewed by 299

Abstract

Vehicle re-identification (Re-ID) is a critical task in the fields of intelligent transportation and urban surveillance. This task faces numerous challenges, such as significant changes in shooting angles, strong similarities in appearance between different vehicles of the same model, and difficulties in modeling fine-grained differences. To overcome the shortcomings of existing methods in local feature extraction and multi-scale fusion, this paper proposes an attention-guided dual-branch feature fusion network (DAFF-Net). The network uses ResNet50-ibn as its backbone and designs two complementary feature extraction branches. One branch fuses cross-layer attention between shallow and deep features, introducing a Temperature-Calibration Attention Fusion Module (TCAF) to improve the accuracy of cross-layer feature fusion effectively. The other branch enhances multi-scale attention for mid-layer features, constructing a Multi-Scale Gated Attention Module (MSGA) to extract local details and directional structural information. Finally, the discriminative ability of the enhanced features is improved by concatenating the two branch features and jointly optimizing the network using triplet loss, cross-entropy loss, and center loss. Experimental results on the VeRi-776 and VehicleID public datasets indicate that the proposed DAFF-Net outperforms existing mainstream methods in multiple key metrics. On the VeRi-776 dataset, mAP and CMC@1 increased to 82.2% and 97.5%, respectively. In the three test subsets of the VehicleID dataset, the CMC@1 metric achieved 90.7%, 84.6%, and 82.1%, respectively, demonstrating the effectiveness of the proposed network in vehicle re-identification tasks. Full article

27 pages, 2162 KB

Open Access Article

A Dual-Attention Temporal Convolutional Network-Based Track Initiation Method for Maneuvering Targets

by Hanbao Wu, Yiming Hao, Wei Chen and Mingli Liao

Electronics 2025, 14(21), 4215; https://doi.org/10.3390/electronics14214215 - 28 Oct 2025

Viewed by 197

Abstract

In strong clutter and maneuvering scenarios, radar track initiation faces the dual challenges of a low initiation rate and high false alarm rate. Although the existing deep learning methods show promise, the commonly adopted “feature flattening” input strategy destroys the intrinsic temporal structure and feature relationships of track data, limiting its discriminative performance. To address this issue, this paper proposes a novel radar track initiation method based on Dual-Attention Temporal Convolutional Network (DA-TCN), reformulating track initiation as a binary classification task for very short multi-channel time series that preserve complete temporal structure. The DA-TCN model employs the TCN as its backbone network to extract local dynamic features and innovatively constructs a dual-attention architecture: a channel attention branch dynamically calibrates the importance of each kinematic feature, while a temporal attention branch integrates Bi-GRU and self-attention mechanisms to capture the dependencies at critical time steps. Ultimately, a learnable gated fusion mechanism adaptively weights the dual-branch information for optimal characterization of track characteristics. Experimental results on maneuvering target datasets demonstrate that the proposed method significantly outperforms multiple baseline models across varying clutter densities: Under the highest clutter density, DA-TCN achieves 95.12% true track initiation rate (+1.6% over best baseline) with 9.65% false alarm rate (3.63% reduction), validating its effectiveness for high-precision and highly robust track initiation in complex environments. Full article
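A gated fusion of two branches is commonly written as a sigmoid gate that blends the branch outputs element by element. The sketch below shows that generic pattern for two feature vectors; the per-element parameterization, the function names, and the parameter layout are assumptions for illustration and are not taken from the DA-TCN paper.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(channel_feat, temporal_feat, gate_weights, gate_bias):
    """Blend two equal-length feature vectors with a per-element gate.

    gate_i  = sigmoid(w_c * c_i + w_t * t_i + b_i)
    fused_i = gate_i * c_i + (1 - gate_i) * t_i

    `gate_weights` is a list of (w_c, w_t) pairs and `gate_bias` a list of
    floats; in a trained model these would be learned (hypothetical layout).
    """
    fused = []
    for c, t, (w_c, w_t), b in zip(channel_feat, temporal_feat, gate_weights, gate_bias):
        gate = sigmoid(w_c * c + w_t * t + b)
        fused.append(gate * c + (1.0 - gate) * t)
    return fused
```

With all gate parameters at zero the gate is 0.5 and the output is the plain average of the two branches; a large positive bias pushes the output toward the channel-attention features, and a large negative bias toward the temporal ones.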

(This article belongs to the Topic Radar Signal and Data Processing with Applications, 2nd Edition)

19 pages, 2911 KB

Open Access Article

MCFI-Net: Multi-Scale Cross-Layer Feature Interaction Network for Landslide Segmentation in Remote Sensing Imagery

by Jianping Liao and Lihua Ye

Electronics 2025, 14(21), 4190; https://doi.org/10.3390/electronics14214190 - 27 Oct 2025

Viewed by 212

Abstract

Accurate and reliable detection of landslides plays a crucial role in disaster prevention and mitigation efforts. However, due to unfavorable environmental conditions, uneven surface structures, and other disturbances similar to those of landslides, traditional methods often fail to achieve the desired results. To address these challenges, this study introduces a novel multi-scale cross-layer feature interaction network, specifically designed for landslide segmentation in remote sensing images. In the MCFI-Net framework, we adopt the encoder–decoder as the foundational architecture, and integrate cross-layer feature information to capture fine-grained local textures and broader contextual patterns. Then, we introduce the receptive field block (RFB) into the skip connections to effectively aggregate multi-scale contextual information. Additionally, we design the multi-branch dynamic convolution block (MDCB), which possesses both dynamic perception ability and multi-scale feature representation capability. The comprehensive evaluation conducted on both the Landslide4Sense and Bijie datasets demonstrates the superior performance of MCFI-Net in landslide segmentation tasks. Specifically, on the Landslide4Sense dataset, MCFI-Net achieved a Dice score of 0.7254, a Matthews correlation coefficient (Mcc) of 0.7138, and a Jaccard score of 0.5699. Similarly, on the Bijie dataset, MCFI-Net maintained high accuracy with a Dice score of 0.8201, an Mcc of 0.8004, and a Jaccard score of 0.6951. Furthermore, when evaluated on the optical remote sensing dataset EORSSD, MCFI-Net obtained a Dice score of 0.7770, an Mcc of 0.7732, and a Jaccard score of 0.6571. Finally, ablation experiments carried out on the Landslide4Sense dataset further validated the effectiveness of each proposed module. 
These results affirm MCFI-Net’s capability to accurately identify landslide regions in complex remote sensing imagery and indicate strong potential for real-world geological disaster analysis. Full article
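For readers unfamiliar with the three reported metrics, the following sketch computes the Dice score, Jaccard score, and Matthews correlation coefficient from flattened binary masks using their standard definitions. The helper names are illustrative, and how the paper aggregates scores across images is not stated in the abstract.

```python
import math


def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN for two equal-length binary sequences."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def dice_score(tp, fp, fn):
    """Dice = 2TP / (2TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)


def jaccard_score(tp, fp, fn):
    """Jaccard (IoU) = TP / (TP + FP + FN)."""
    return tp / (tp + fp + fn)


def matthews_corrcoef(tp, fp, fn, tn):
    """MCC; returns 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

For a single confusion matrix the two overlap metrics are tied by Dice = 2J / (1 + J), where J is the Jaccard score, so they rank methods identically; MCC additionally accounts for true negatives.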

(This article belongs to the Special Issue Advanced Machine Learning Technologies and Their Applications in Intelligent Imaging and Image Processing)

29 pages, 6329 KB

Open Access Article

Non-Contact Measurement of Sunflower Flowerhead Morphology Using Mobile-Boosted Lightweight Asymmetric (MBLA)-YOLO and Point Cloud Technology

by Qiang Wang, Xinyuan Wei, Kaixuan Li, Boxin Cao and Wuping Zhang

Agriculture 2025, 15(21), 2180; https://doi.org/10.3390/agriculture15212180 - 22 Oct 2025

Viewed by 354

Abstract

The diameter of the sunflower flower head and the thickness of its margins are important crop phenotypic parameters. Traditional, single-dimensional two-dimensional imaging methods often struggle to balance precision with computational efficiency. This paper addresses the limitations of the YOLOv11n-seg model in the instance segmentation of fine structures of the floral disk by proposing the MBLA-YOLO instance segmentation model, which achieves both lightweight efficiency and high accuracy. Building upon this foundation, a non-contact measurement method is proposed that combines the improved model with three-dimensional point cloud analysis to precisely extract key structural parameters of the flower head. First, image annotation is employed to eliminate interference from petals and sepals, whilst instance segmentation models are used to delineate the target region. The segmentation results for the disc surface (front) and edges (sides) are then mapped onto the three-dimensional point cloud space. Target regions are extracted and, following processing, separate models are constructed for the disc surface and edges. Finally, because the surface and edge structures differ, a targeted calculation method is applied to each. Whilst maintaining lightweight characteristics, the proposed MBLA-YOLO model achieves simultaneous improvements in accuracy and efficiency compared to the baseline YOLOv11n-seg. The introduced CKMB backbone module enhances feature modelling capabilities for complex structural details, whilst the LADH detection head improves small object recognition and boundary segmentation accuracy. Specifically, the CKMB module integrates MBConv and channel attention to strengthen multi-scale feature extraction and representation, while the LADH module adopts a tri-branch design for classification, regression, and IoU prediction, structurally improving detection precision and boundary recognition.
This research not only demonstrates superior accuracy and robustness but also significantly reduces computational overhead, thereby achieving an excellent balance between model efficiency and measurement precision. This method avoids the need for three-dimensional reconstruction of the entire plant and multi-view point cloud registration, thereby reducing data redundancy and computational resource expenditure. Full article

(This article belongs to the Section Artificial Intelligence and Digital Agriculture)

22 pages, 1678 KB

Open Access Article

Image Completion Network Considering Global and Local Information

by Yubo Liu, Ke Chen and Alan Penn

Buildings 2025, 15(20), 3746; https://doi.org/10.3390/buildings15203746 - 17 Oct 2025

Viewed by 307

Abstract

Accurate depth image inpainting in complex urban environments remains a critical challenge due to occlusions, reflections, and sensor limitations, which often result in significant data loss. We propose a hybrid deep learning framework that explicitly combines local and global modelling through Convolutional Neural Networks (CNNs) and Transformer modules. The model employs a multi-branch parallel architecture, where the CNN branch captures fine-grained local textures and edges, while the Transformer branch models global semantic structures and long-range dependencies. We introduce an optimized attention mechanism, Agent Attention, which differs from existing efficient/linear attention methods by using learnable proxy tokens tailored for urban scene categories (e.g., façades, sky, ground). A content-guided dynamic fusion module adaptively combines multi-scale features to enhance structural alignment and texture recovery. The framework is trained with a composite loss function incorporating pixel accuracy, perceptual similarity, adversarial realism, and structural consistency. Extensive experiments on the Paris StreetView dataset demonstrate that the proposed method achieves state-of-the-art performance, outperforming existing approaches in PSNR, SSIM, and LPIPS metrics. The study highlights the potential of multi-scale modeling for urban depth inpainting and discusses challenges in real-world deployment, ethical considerations, and future directions for multimodal integration. Full article

(This article belongs to the Special Issue Advanced Technologies for Construction and Maintenance of Engineering Structures)

20 pages, 1288 KB

Open Access Article

Spatio-Temporal Residual Attention Network for Satellite-Based Infrared Small Target Detection

by Yan Chang, Decao Ma, Qisong Yang, Shaopeng Li and Daqiao Zhang

Remote Sens. 2025, 17(20), 3457; https://doi.org/10.3390/rs17203457 - 16 Oct 2025

Viewed by 381

Abstract

With the development of infrared remote sensing technology and the deployment of satellite constellations, infrared video from orbital platforms is playing an increasingly important role in airborne target surveillance. However, due to the limitations of remote sensing imaging, the aerial targets in such videos are often small in scale, low in contrast, and slow in movement, making them difficult to detect in complex backgrounds. In this paper, we propose a novel detection network that integrates inter-frame residual guidance with spatio-temporal feature enhancement to address the challenge of small object detection in infrared satellite video. The method first extracts residual features to highlight motion-sensitive regions, then uses a dual-branch structure to encode spatial semantics and temporal evolution, and finally fuses them deeply through a multi-scale feature enhancement module. Extensive experiments show that this method outperforms mainstream methods on various infrared small target video datasets and remains robust under low signal-to-noise-ratio conditions. Full article

(This article belongs to the Section AI Remote Sensing)

21 pages, 2372 KB

Open Access Article

IDG-ViolenceNet: A Video Violence Detection Model Integrating Identity-Aware Graphs and 3D-CNN

by Hong Huang and Qingping Jiang

Sensors 2025, 25(20), 6272; https://doi.org/10.3390/s25206272 - 10 Oct 2025

Viewed by 608

Abstract

Video violence detection plays a crucial role in intelligent surveillance and public safety, yet existing methods still face challenges in modeling complex multi-person interactions. To address this, we propose IDG-ViolenceNet, a dual-stream video violence detection model that integrates identity-aware spatiotemporal graphs with three-dimensional convolutional neural networks (3D-CNN). Specifically, the model utilizes YOLOv11 for high-precision person detection and cross-frame identity tracking, constructing a dynamic spatiotemporal graph that encodes spatial proximity, temporal continuity, and individual identity information. On this basis, a GINEConv branch extracts structured interaction features, while an R3D-18 branch models local spatiotemporal patterns. The two representations are fused in a dedicated module for cross-modal feature integration. Experimental results show that IDG-ViolenceNet achieves accuracies of 97.5%, 99.5%, and 89.4% on the Hockey Fight, Movies Fight, and RWF-2000 datasets, respectively, significantly outperforming state-of-the-art methods. Additionally, ablation studies validate the contributions of key components in improving detection accuracy and robustness. Full article

(This article belongs to the Section Sensing and Imaging)

12 pages, 768 KB

Open Access Article

ECG Waveform Segmentation via Dual-Stream Network with Selective Context Fusion

by Yongpeng Niu, Nan Lin, Yuchen Tian, Kaipeng Tang and Baoxiang Liu

Electronics 2025, 14(19), 3925; https://doi.org/10.3390/electronics14193925 - 2 Oct 2025

Viewed by 410

Abstract

Electrocardiogram (ECG) waveform delineation is fundamental to cardiac disease diagnosis. This task requires precise localization of key fiducial points, specifically the onset, peak, and offset positions of P-waves, QRS complexes, and T-waves. Current methods exhibit significant performance degradation in noisy clinical environments (baseline drift, electromyographic interference, powerline interference, etc.), compromising diagnostic reliability. To address this limitation, we introduce ECG-SCFNet: a novel dual-stream architecture employing selective context fusion. Our framework is further enhanced by a consistency training paradigm, enabling it to maintain robust waveform delineation accuracy under challenging noise conditions. The network comprises three components: (1) A temporal stream captures dynamic rhythmic features through sequential multi-branch convolution and temporal attention mechanisms; (2) A morphology stream combines parallel multi-scale convolution with feature pyramid integration to extract multi-scale waveform structural features through morphological attention; (3) The Selective Context Fusion (SCF) module adaptively integrates features from the temporal and morphology streams using a dual attention mechanism, which operates across both channel and spatial dimensions to selectively emphasize informative features from each stream, thereby enhancing the representation learning for accurate ECG segmentation. On the LUDB and QT datasets, ECG-SCFNet achieves high performance, with F1-scores of 97.83% and 97.80%, respectively. Crucially, it maintains robust performance under challenging noise conditions on these datasets, with F1-scores of 88.49% and 86.25%, a significant improvement in noise robustness over other methods that retains precise boundary localization for clinical ECG analysis. Full article


12 pages, 1340 KB

Open AccessArticle

Research on Well Depth Tracking Calculation Method Based on Branching Parallel Neural Networks

by Weikai Liu, Baoquan Ma and Xiaolei Yu

Processes 2025, 13(10), 3147; https://doi.org/10.3390/pr13103147 - 30 Sep 2025

Viewed by 367

Abstract

To address the problem that well depth parameters cannot be obtained downhole in existing intelligent drilling technology, a multi-branch parallel neural network is proposed for downhole well depth tracking, and its effectiveness is verified with a field example. After the quality of the logging data collected on site is analyzed and corrected using DBSCAN (a density-based clustering algorithm), five parameters (WOB, rotating speed, displacement, pump pressure, and torque) are selected to predict the downhole mechanical ROP. The structure of a traditional BP neural network is adjusted into a multi-branch parallel neural network, which replaces the original layer-by-layer operation with parallel operation to make full use of the computer's processing efficiency. The well depth tracking effect is evaluated by point-to-point depth comparison. The results indicate that the MAE and mechanical drilling rate evaluation values obtained were 1.18 and 0.873, respectively. The multi-branch parallel neural network achieved a 66.55% improvement in MAE compared to the original BP neural network, while R² increased by 61.82%. The average point-by-point comparison error in the example calculation was 0.012 m, with a maximum error of 0.268 m. This result can serve as a basis for judging changes in well depth during the drilling process.

(This article belongs to the Special Issue Applications of Intelligent Models in the Petroleum Industry)
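The evaluation quantities named in this abstract (MAE, R², and the point-by-point depth error) can be sketched as follows. The abstract does not say how depth is derived from the predicted ROP, so accumulating ROP over fixed time steps in `track_depth` is an assumption, and all function names here are hypothetical.

```python
def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def track_depth(start_depth, rop, dt):
    """Accumulate depth (m) from ROP samples (m/h) taken every dt hours.

    Assumed tracking rule, not taken from the paper.
    """
    depths = []
    depth = start_depth
    for rate in rop:
        depth += rate * dt
        depths.append(depth)
    return depths


def pointwise_errors(measured, tracked):
    """Return (mean, max) absolute depth error over matching points."""
    errs = [abs(m - t) for m, t in zip(measured, tracked)]
    return sum(errs) / len(errs), max(errs)
```

For example, `track_depth(1000.0, [10.0, 20.0], 0.5)` yields depths of 1005.0 m and 1015.0 m, which can then be compared against measured depths with `pointwise_errors`.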


Search Results (591)
