Search Results (368)

Search Parameters:
Keywords = spatio-temporal feature fusion

21 pages, 5023 KB  
Article
Robust 3D Target Detection Based on LiDAR and Camera Fusion
by Miao Jin, Bing Lu, Gang Liu, Yinglong Diao, Xiwen Chen and Gaoning Nie
Electronics 2025, 14(21), 4186; https://doi.org/10.3390/electronics14214186 (registering DOI) - 27 Oct 2025
Abstract
Autonomous driving relies on multimodal sensors to acquire environmental information that supports decision making and control. While significant progress has been made in 3D object detection regarding point cloud processing and multi-sensor fusion, existing methods still suffer from shortcomings, such as sparse point clouds of foreground targets, fusion instability caused by fluctuating sensor data quality, and inadequate modeling of cross-frame temporal consistency in video streams, which severely restrict the practical performance of perception systems. To address these issues, this paper proposes a multimodal video-stream 3D object detection framework based on reliability evaluation. Specifically, it dynamically perceives the reliability of each modality by evaluating the Region of Interest (RoI) features of the camera and LiDAR, and adaptively adjusts their contribution ratios in the fusion process accordingly. Additionally, a target-level semantic soft matching graph is constructed within the RoI region. Combined with spatial self-attention and temporal cross-attention mechanisms, the spatio-temporal correlations between consecutive frames are fully exploited to achieve feature completion and enhancement. Verification on the nuScenes dataset shows that the proposed algorithm achieves 67.3% mAP and 70.6% NDS, outperforming existing mainstream 3D object detection algorithms on these two core metrics. Ablation experiments confirm that each module plays a crucial role in improving overall performance, and the algorithm exhibits better robustness and generalization in dynamically complex scenarios. Full article
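The reliability-weighted fusion idea can be sketched as follows. This is a minimal illustration, not the paper's method: the scalar reliability scores and the softmax weighting are assumptions standing in for the paper's learned reliability evaluation.

```python
import numpy as np

def reliability_weighted_fusion(cam_feat, lidar_feat, cam_score, lidar_score):
    """Fuse camera and LiDAR RoI features, weighting each modality by a
    softmax over per-modality reliability scores (hypothetical stand-in
    for the paper's reliability evaluation)."""
    scores = np.array([cam_score, lidar_score], dtype=float)
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    fused = weights[0] * np.asarray(cam_feat) + weights[1] * np.asarray(lidar_feat)
    return fused, weights
```

A higher reliability score shifts the fused feature toward that modality, which is the adaptive-contribution behavior the abstract describes.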

18 pages, 11993 KB  
Article
Spatiotemporal Coupling Analysis of Street Vitality and Built Environment: A Multisource Data-Driven Dynamic Assessment Model
by Caijian Hua, Wei Lv and Yan Zhang
Sustainability 2025, 17(21), 9517; https://doi.org/10.3390/su17219517 (registering DOI) - 26 Oct 2025
Abstract
To overcome the limited accuracy of existing street vitality assessments under dense occlusion and their lack of dynamic, multi-source data fusion, this study proposes an integrated dynamic model that couples an enhanced YOLOv11 with heterogeneous spatiotemporal datasets. The network introduces a two-backbone architecture for stronger multi-scale fusion, Spatial Pyramid Depth Convolution (SPDConv) for richer urban scene features, and Dynamic Sparse Sampling (DySample) for robust occlusion handling. Validated in Yibin, the model achieves 90.4% precision, 67.3% recall, and 77.2% mAP@50, gains of 6.5%, 5.3%, and 5.1% over the baseline. By fusing Baidu heatmaps, street-view imagery, road networks, and POI data, a spatial coupling framework quantifies the interplay between commercial facilities and street vitality, enabling dynamic assessment of urban space use and offering insights for targeted retail regulation and adaptive traffic management. By enabling continuous monitoring of urban space use, the model enhances the allocation of public resources and cuts energy waste from idle traffic, thereby advancing urban sustainability via improved commercial planning and responsive traffic control. The work provides a methodological foundation for shifting urban resource allocation from static planning to dynamic, responsive systems. Full article
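As a minimal illustration of quantifying the facility-vitality interplay, a plain correlation between a per-street POI density and a footfall-derived vitality score might look like this. The variable names are hypothetical and the paper's coupling framework is considerably richer than a single correlation coefficient.

```python
def pearson(x, y):
    """Pearson correlation between, e.g., per-street POI density and a
    heatmap-derived vitality score (illustrative stand-in for the paper's
    spatial coupling analysis)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5
```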

27 pages, 2176 KB  
Article
Intelligent Fault Diagnosis of Rolling Bearings Based on Digital Twin and Multi-Scale CNN-AT-BiGRU Model
by Jiayu Shi, Liang Qi, Shuxia Ye, Changjiang Li, Chunhui Jiang, Zhengshun Ni, Zheng Zhao, Zhe Tong, Siyu Fei, Runkang Tang, Danfeng Zuo and Jiajun Gong
Symmetry 2025, 17(11), 1803; https://doi.org/10.3390/sym17111803 (registering DOI) - 26 Oct 2025
Abstract
Rolling bearings constitute critical rotating components within rolling mill equipment. Production efficiency and the operational safety of the whole mechanical system are directly governed by their health state. To address the dual challenges of conventional diagnostic methods' over-reliance on expert experience and the scarcity of fault samples in industrial scenarios, we propose a virtual–physical data fusion-optimized intelligent fault diagnosis framework. Initially, a dynamics-based digital twin model for rolling bearings is developed by leveraging their geometric symmetry; it is capable of generating comprehensive fault datasets through parametric adjustments of bearing dimensions and operational environments in virtual space. Subsequently, a symmetry-informed architecture is constructed, which integrates multi-scale convolutional neural networks with attention mechanisms and bidirectional gated recurrent units (MCNN-AT-BiGRU). This architecture enables spatiotemporal feature extraction and enhances critical fault characteristics. Experimental results demonstrate 99.5% fault identification accuracy under single operating conditions, with stable performance maintained under low-SNR conditions. Furthermore, the framework exhibits superior generalization capability and transferability across different bearing types. Full article
(This article belongs to the Section Computer)
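The multi-scale idea can be sketched in a toy form: filter the same 1-D vibration signal at several kernel widths and stack the results. The real MCNN branch uses learned convolution kernels; the averaging kernels and kernel sizes here are assumptions for illustration.

```python
import numpy as np

def multiscale_features(signal, kernel_sizes=(3, 5, 7)):
    """Toy multi-scale feature extraction for a 1-D vibration signal:
    one smoothed view per kernel width, stacked into a feature map
    (illustrative stand-in for learned multi-scale convolutions)."""
    feats = []
    for k in kernel_sizes:
        kernel = np.ones(k) / k          # fixed averaging kernel (assumption)
        feats.append(np.convolve(signal, kernel, mode="same"))
    return np.stack(feats)               # shape: (num_scales, signal_length)
```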

16 pages, 3350 KB  
Article
A Novel Demographic Indicator Fusion Network (DIFNet) for Dynamic Fusion of EEG and Demographic Indicators for Robust Depression Detection
by Chaoliang Wang, Qingshu Zhou, Mengfan Li, Jiaxin Li and Jing Zhao
Sensors 2025, 25(21), 6549; https://doi.org/10.3390/s25216549 (registering DOI) - 24 Oct 2025
Viewed by 179
Abstract
Electroencephalography (EEG) has proven to be effective for detecting major depressive disorder (MDD), with deep learning models further advancing its potential. However, the performance of these models may be limited by their neglect of demographic factors (e.g., age, sex, and education), which are known to influence EEG characteristics of depression. To address this, we propose DIFNet, a deep learning framework that dynamically fuses EEG features with demographic indicators (age, sex, and years of education) to enhance depression recognition accuracy. DIFNet is composed of four modules: a multiscale convolutional module, a Transformer encoder module, a temporal convolutional network (TCN) module, and a demographic indicator fusion module. The fusion model leverages convolution to process demographic vectors and integrates them with spatiotemporal EEG features, thereby embedding demographic indicators within the deep learning model for classification. Cross-validation between data trials showed that the DIFNet fusing age and years of education achieves a superior accuracy of 99.66%; the dynamic fusion mechanism improves accuracy by 0.72% compared to the baseline without fusing demographic indicators (98.94%), outperforming state-of-the-art methods (SparNet 94.37% and DBGCN 98.30%). Full article
(This article belongs to the Collection EEG-Based Brain–Computer Interface for a Real-Life Appliance)
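The fusion step can be sketched as projecting the demographic vector and concatenating it with the spatiotemporal EEG features. The projection matrix and tanh nonlinearity below are assumptions; DIFNet uses a learned convolution over the demographic vector.

```python
import numpy as np

def fuse_demographics(eeg_feat, demo, w_demo):
    """Sketch of demographic indicator fusion: embed a demographic vector
    (e.g., age, sex, years of education) and concatenate it with the
    spatiotemporal EEG features (w_demo is an assumed learned matrix)."""
    demo_emb = np.tanh(w_demo @ np.asarray(demo, dtype=float))
    return np.concatenate([np.asarray(eeg_feat, dtype=float), demo_emb])
```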

19 pages, 3240 KB  
Article
AI-Based Downscaling of MODIS LST Using SRDA-Net Model for High-Resolution Data Generation
by Hongxia Ma, Kebiao Mao, Zijin Yuan, Longhao Xu, Jiancheng Shi, Zhonghua Guo and Zhihao Qin
Remote Sens. 2025, 17(21), 3510; https://doi.org/10.3390/rs17213510 - 22 Oct 2025
Viewed by 160
Abstract
Land surface temperature (LST) is a critical parameter in agricultural drought monitoring, crop growth analysis, and climate change research. However, acquiring LST data at both fine spatial and temporal scales remains a significant obstacle in remote sensing applications. Despite the high temporal resolution afforded by daily MODIS LST observations, the coarse (1 km) spatial scale of these data restricts their applicability for studies demanding finer spatial resolution. To address this challenge, a novel deep learning-based approach is proposed for LST downscaling: the spatial resolution downscaling attention network (SRDA-Net). The model is designed to refine the spatial resolution of MODIS LST from 1000 m to 250 m, overcoming the shortcomings of traditional interpolation techniques in reconstructing spatial details and reducing the reliance on linear models and multi-source high-temporal LST data typical of conventional fusion approaches. SRDA-Net captures the feature interaction between MODIS LST and auxiliary data through global resolution attention to address spatial heterogeneity, further enhances feature representation under heterogeneous surface conditions by optimizing multi-source features, and strengthens the modeling of spatial dependencies through a multi-level feature refinement module. Moreover, this study constructs a composite loss function that integrates physical mechanisms and data characteristics, improving reconstruction detail while maintaining numerical accuracy and model interpretability through a triple collaborative constraint mechanism. Experimental results show that the proposed model performs excellently in the simulation experiment (from 2000 m to 1000 m), with an MAE of 0.928 K and an R2 of 0.95. In farmland areas, the model performs particularly well (MAE = 0.615 K, R2 = 0.96, RMSE = 0.823 K), effectively supporting irrigation scheduling and crop health monitoring. It also maintains good expression of vegetation heterogeneity in grassland areas, making it suitable for drought monitoring tasks. In the target downscaling experiment (from 1000 m to 500 m and 250 m), the model achieved an RMSE of 1.804 K, an MAE of 1.587 K, and an R2 of 0.915, confirming stable generalization across scales. This study supports agricultural drought warning and precise irrigation, provides data support for interdisciplinary applications such as climate change research and ecological monitoring, and offers a new approach to generating LST at high spatio-temporal resolution. Full article
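A composite loss of the kind described can be sketched as a weighted sum of a pixel-fidelity term and a detail-preserving gradient term. The terms and weights below are assumptions for illustration; the paper's triple-constraint loss also embeds physical mechanisms.

```python
import numpy as np

def composite_loss(pred, target, w_pix=1.0, w_grad=0.1):
    """Illustrative two-term composite loss: pixel-wise MAE plus a
    horizontal-gradient term that rewards sharp spatial detail
    (term choice and weights are assumptions, not the paper's values)."""
    pix = np.abs(pred - target).mean()
    grad = np.abs(np.diff(pred, axis=-1) - np.diff(target, axis=-1)).mean()
    return w_pix * pix + w_grad * grad
```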

24 pages, 11432 KB  
Article
MRDAM: Satellite Cloud Image Super-Resolution via Multi-Scale Residual Deformable Attention Mechanism
by Liling Zhao, Zichen Liao and Quansen Sun
Remote Sens. 2025, 17(21), 3509; https://doi.org/10.3390/rs17213509 - 22 Oct 2025
Viewed by 275
Abstract
High-resolution meteorological satellite cloud imagery plays a crucial role in diagnosing and forecasting severe convective weather phenomena characterized by suddenness and locality, such as tropical cyclones. However, constrained by imaging principles and various internal/external interferences during satellite data acquisition, current satellite imagery often fails to meet the spatiotemporal resolution requirements for fine-scale monitoring of these weather systems. Particularly for real-time tracking of tropical cyclone genesis-evolution dynamics and capturing detailed cloud structure variations within cyclone cores, existing spatial resolutions remain insufficient. Therefore, developing super-resolution techniques for meteorological satellite cloud imagery through software-based approaches holds significant application potential. This paper proposes a Multi-scale Residual Deformable Attention Model (MRDAM) based on Generative Adversarial Networks (GANs), specifically designed for satellite cloud image super-resolution tasks considering their morphological diversity and non-rigid deformation characteristics. The generator architecture incorporates two key components: a Multi-scale Feature Progressive Fusion Module (MFPFM), which enhances texture detail preservation and spectral consistency in reconstructed images, and a Deformable Attention Additive Fusion Module (DAAFM), which captures irregular cloud pattern features through adaptive spatial-attention mechanisms. Comparative experiments against multiple GAN-based super-resolution baselines demonstrate that MRDAM achieves superior performance in both objective evaluation metrics (PSNR/SSIM) and subjective visual quality, proving its superior performance for satellite cloud image super-resolution tasks. Full article
(This article belongs to the Special Issue Neural Networks and Deep Learning for Satellite Image Processing)
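PSNR, one of the objective metrics used above (alongside SSIM), is straightforward to compute from the mean squared error between the reference and the super-resolved image:

```python
import numpy as np

def psnr(reference, reconstructed, data_range=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(range^2 / MSE)."""
    mse = np.mean((np.asarray(reference, float) - np.asarray(reconstructed, float)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)
```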

23 pages, 5146 KB  
Article
Spatio-Temporal Multi-Graph Convolution Traffic Flow Prediction Model Based on Multi-Source Information Fusion and Attention Enhancement
by Wenjing Li, Zhongning Sun and Yao Wan
Appl. Sci. 2025, 15(20), 11295; https://doi.org/10.3390/app152011295 - 21 Oct 2025
Viewed by 182
Abstract
Traffic flow prediction plays a vital role in intelligent transportation systems, directly affecting travel scheduling, road planning, and traffic management efficiency. However, traditional methods often struggle to capture complex spatiotemporal dependencies and integrate heterogeneous data sources. To overcome these challenges, we propose a Spatio-temporal Multi-graph Convolution Traffic Flow Prediction Model based on Multi-source Information Fusion and Attention Enhancement (MIFA-ST-MGCN). The model adopts adaptive data fusion strategies according to spatiotemporal characteristics, achieving effective integration through feature concatenation and multi-graph structure construction. A spatiotemporal attention mechanism is designed to dynamically capture the varying contributions of different adjacency relations and temporal dependencies, thereby enhancing feature representation. In addition, recurrent units are combined with graph convolutional networks to model spatiotemporal data and generate more accurate prediction results. Experiments conducted on a real-world traffic dataset demonstrate that the proposed model achieves superior performance, reducing the mean absolute error by 3.57% compared with mainstream traffic flow prediction models. These results confirm the effectiveness of multi-source fusion and attention enhancement in improving prediction accuracy. Full article
(This article belongs to the Special Issue Advanced Methods for Time Series Forecasting)
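The multi-graph fusion idea can be sketched as a softmax-weighted combination of adjacency matrices. The graph types named in the comment and the externally supplied attention scores are assumptions; in the model those weights come from the spatiotemporal attention mechanism.

```python
import numpy as np

def fuse_graphs(adjacencies, scores):
    """Fuse several adjacency matrices (e.g., distance, similarity,
    connectivity graphs) with softmax attention weights -- a sketch of
    the multi-graph construction (scores assumed given)."""
    scores = np.asarray(scores, dtype=float)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return sum(w * a for w, a in zip(weights, adjacencies))
```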

19 pages, 2109 KB  
Article
SF6 Leak Detection in Infrared Video via Multichannel Fusion and Spatiotemporal Features
by Zhiwei Li, Xiaohui Zhang, Zhilei Xu, Yubo Liu and Fengjuan Zhang
Appl. Sci. 2025, 15(20), 11141; https://doi.org/10.3390/app152011141 - 17 Oct 2025
Viewed by 173
Abstract
With the development of infrared imaging technology and the integration of intelligent algorithms, the realization of non-contact, dynamic and real-time detection of SF6 gas leakage based on infrared video has been a significant research direction. However, the existing real-time detection algorithms exhibit low accuracy in detecting SF6 leakage and are susceptible to noise, which makes it difficult to meet the actual needs of engineering. To address this problem, this paper proposes a real-time SF6 leakage detection method, VGEC-Net, based on multi-channel fusion and spatiotemporal feature extraction. The proposed method first employs the ViBe-GMM algorithm to extract foreground masks, which are then fused with infrared images to construct a dual-channel input. In the backbone network, a CE-Net structure—integrating CBAM and ECA-Net—is combined with the P3D network to achieve efficient spatiotemporal feature extraction. A Feature Pyramid Network (FPN) and a temporal Transformer module are further integrated to enhance multi-scale feature representation and temporal modeling, thereby significantly improving the detection performance for small-scale targets. Experimental results demonstrate that VGEC-Net achieves a mean average precision (mAP) of 61.7% on the dataset used in this study, with a mAP@50 of 87.3%, which represents a significant improvement over existing methods. These results validate the effectiveness and advancement of the proposed method for infrared video-based gas leakage detection. Furthermore, the model achieves 78.2 frames per second (FPS) during inference, demonstrating good real-time processing capability while maintaining high detection accuracy, exhibiting strong application potential. Full article
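The dual-channel input construction can be sketched as stacking a foreground mask with the raw infrared frame. The simple threshold-difference mask below is a crude stand-in for the ViBe-GMM extractor, and the threshold value is an assumption.

```python
import numpy as np

def dual_channel_input(frame, background, thresh=15):
    """Build a dual-channel input: a crude background-subtraction mask
    (stand-in for ViBe-GMM) stacked with the raw infrared frame."""
    diff = np.abs(frame.astype(int) - background.astype(int))
    mask = (diff > thresh).astype(frame.dtype)
    return np.stack([mask, frame])  # shape: (2, H, W)
```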

20 pages, 2565 KB  
Article
GBV-Net: Hierarchical Fusion of Facial Expressions and Physiological Signals for Multimodal Emotion Recognition
by Jiling Yu, Yandong Ru, Bangjun Lei and Hongming Chen
Sensors 2025, 25(20), 6397; https://doi.org/10.3390/s25206397 - 16 Oct 2025
Viewed by 488
Abstract
A core challenge in multimodal emotion recognition lies in the precise capture of the inherent multimodal interactive nature of human emotions. Addressing the limitation of existing methods, which often process visual signals (facial expressions) and physiological signals (EEG, ECG, EOG, and GSR) in isolation and thus fail to exploit their complementary strengths effectively, this paper presents a new multimodal emotion recognition framework called the Gated Biological Visual Network (GBV-Net). This framework enhances emotion recognition accuracy through deep synergistic fusion of facial expressions and physiological signals. GBV-Net integrates three core modules: (1) a facial feature extractor based on a modified ConvNeXt V2 architecture incorporating lightweight Transformers, specifically designed to capture subtle spatio-temporal dynamics in facial expressions; (2) a hybrid physiological feature extractor combining 1D convolutions, Temporal Convolutional Networks (TCNs), and convolutional self-attention mechanisms, adept at modeling local patterns and long-range temporal dependencies in physiological signals; and (3) an enhanced gated attention fusion module capable of adaptively learning inter-modal weights to achieve dynamic, synergistic integration at the feature level. A thorough investigation of the publicly accessible DEAP and MAHNOB-HCI datasets reveals that GBV-Net surpasses contemporary methods. Specifically, on the DEAP dataset, the model attained classification accuracies of 95.10% for Valence and 95.65% for Arousal, with F1-scores of 95.52% and 96.35%, respectively. On MAHNOB-HCI, the accuracies achieved were 97.28% for Valence and 97.73% for Arousal, with F1-scores of 97.50% and 97.74%, respectively. These experimental findings substantiate that GBV-Net effectively captures deep-level interactive information between multimodal signals, thereby improving emotion recognition accuracy. Full article
(This article belongs to the Section Biomedical Sensors)
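The gated fusion idea can be sketched per feature dimension: a sigmoid gate decides how much of the facial versus physiological representation to keep. The gate matrix stands in for learned parameters, and this single-gate form is a simplification of the module described above.

```python
import numpy as np

def gated_fusion(face_feat, physio_feat, w_gate):
    """Gated fusion sketch: gate = sigmoid(W [face; physio]), then blend
    the two modality features dimension-wise (w_gate assumed learned)."""
    joint = np.concatenate([face_feat, physio_feat])
    gate = 1.0 / (1.0 + np.exp(-(w_gate @ joint)))
    return gate * face_feat + (1.0 - gate) * physio_feat
```

With an all-zero gate matrix the gate is 0.5 everywhere, so the output is the plain average of the two modalities; training moves the gate away from this neutral point.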

20 pages, 1288 KB  
Article
Spatio-Temporal Residual Attention Network for Satellite-Based Infrared Small Target Detection
by Yan Chang, Decao Ma, Qisong Yang, Shaopeng Li and Daqiao Zhang
Remote Sens. 2025, 17(20), 3457; https://doi.org/10.3390/rs17203457 - 16 Oct 2025
Viewed by 261
Abstract
With the development of infrared remote sensing technology and the deployment of satellite constellations, infrared video from orbital platforms is playing an increasingly important role in airborne target surveillance. However, due to the limitations of remote sensing imaging, the aerial targets in such videos are often small in scale, low in contrast, and slow in movement, making them difficult to detect against complex backgrounds. In this paper, we propose a novel detection network that integrates inter-frame residual guidance with spatio-temporal feature enhancement to address the challenge of small object detection in infrared satellite video. The method first extracts residual features to highlight motion-sensitive regions, then uses a dual-branch structure to encode spatial semantics and temporal evolution, and finally fuses them deeply through a multi-scale feature enhancement module. Extensive experiments show that this method outperforms mainstream methods on various infrared small target video datasets and exhibits good robustness under low-signal-to-noise-ratio conditions. Full article
(This article belongs to the Section AI Remote Sensing)
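The residual-guidance input can be sketched as a simple absolute difference between consecutive frames; the paper may use a learned variant, so treat this as the baseline formulation only.

```python
import numpy as np

def residual_map(prev_frame, curr_frame):
    """Inter-frame residual that highlights motion-sensitive regions
    before the dual-branch spatio-temporal encoding."""
    return np.abs(curr_frame.astype(float) - prev_frame.astype(float))
```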

18 pages, 1960 KB  
Article
CasDacGCN: A Dynamic Attention-Calibrated Graph Convolutional Network for Information Popularity Prediction
by Bofeng Zhang, Yanlin Zhu, Zhirong Zhang, Kaili Liao, Sen Niu, Bingchun Li and Haiyan Li
Entropy 2025, 27(10), 1064; https://doi.org/10.3390/e27101064 - 14 Oct 2025
Viewed by 394
Abstract
Information popularity prediction is a critical problem in social network analysis. With the increasing prevalence of social platforms, accurate prediction of the diffusion process has become increasingly important. Existing methods mainly rely on graph neural networks to model structural relationships, but they are often insufficient in capturing the complex interplay between temporal evolution and local cascade structures, especially in real-world scenarios involving sparse or rapidly changing cascades. To address this issue, we propose the Cascading Dynamic attention-calibrated Graph Convolutional Network, named CasDacGCN. It enhances prediction performance through spatiotemporal feature fusion and adaptive representation learning. The model integrates snapshot-level local encoding, global temporal modeling, cross-attention mechanisms, and a hypernetwork-based sample-wise calibration strategy, enabling flexible modeling of multi-scale diffusion patterns. Results from experiments demonstrate that the proposed model consistently surpasses existing approaches on two real-world datasets, validating its effectiveness in popularity prediction tasks. Full article
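The sample-wise calibration idea can be sketched in reduced form: a hypernetwork (here collapsed to one assumed weight matrix) maps cascade features to a per-sample scale and shift applied to the base popularity prediction. This is an illustration of the mechanism, not CasDacGCN's actual architecture.

```python
import numpy as np

def calibrated_prediction(cascade_feat, base_pred, w_hyper):
    """Sample-wise calibration sketch: derive (scale, shift) from cascade
    features and adjust the base prediction (w_hyper assumed learned)."""
    scale, shift = w_hyper @ np.asarray(cascade_feat, dtype=float)
    return scale * base_pred + shift
```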

24 pages, 14760 KB  
Article
Remaining Useful Life Prediction of Electric Drive Bearings in New Energy Vehicles: Based on Degradation Assessment and Spatiotemporal Feature Fusion
by Fang Yang, En Dong, Zhidan Zhong, Weiqi Zhang, Yunhao Cui and Jun Ye
Machines 2025, 13(10), 914; https://doi.org/10.3390/machines13100914 - 3 Oct 2025
Cited by 1 | Viewed by 340
Abstract
Accurate prediction of the remaining useful life (RUL) of electric drive bearings over the entire service life cycle of new energy vehicles optimizes maintenance strategies and reduces costs, addressing clear application needs. Full-life data of electric drive bearings exhibit long time spans and abrupt degradation, complicating the modeling of time-dependent relationships and degradation states; a piecewise linear degradation model is therefore appropriate. An RUL prediction method is proposed based on degradation assessment and spatiotemporal feature fusion, which extracts strongly time-correlated features from bearing vibration data, evaluates sensitive indicators, constructs weighted fused degradation features, and identifies abrupt degradation points. On this basis, a piecewise linear degradation model is constructed that uses a path-graph structure to represent temporal dependencies and a temporal observation window to embed temporal features. By incorporating GAT-LSTM, RUL prediction for bearings is performed. The method is validated on the XJTU-SY dataset and on a loaded ball bearing test rig for electric vehicle drive motors, yielding comprehensive vibration measurements for life prediction. The results show that the method captures deep degradation information across the full bearing life cycle and delivers accurate, robust predictions, providing guidance for the health assessment of electric drive bearings in new energy vehicles. Full article
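A weighted fused degradation feature can be sketched as a combination of standard vibration health indicators. The indicator pair (RMS and kurtosis) and the equal weights below are illustrative assumptions; the paper selects indicators by evaluating their sensitivity first.

```python
import numpy as np

def degradation_index(signal, weights=(0.5, 0.5)):
    """Weighted fusion of two common vibration indicators (RMS and
    kurtosis) into a single degradation feature (illustrative only)."""
    x = np.asarray(signal, dtype=float)
    rms = np.sqrt(np.mean(x ** 2))
    centred = x - x.mean()
    kurt = np.mean(centred ** 4) / (np.mean(centred ** 2) ** 2 + 1e-12)
    return weights[0] * rms + weights[1] * kurt
```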

24 pages, 73520 KB  
Article
2C-Net: A Novel Spatiotemporal Dual-Channel Network for Soil Organic Matter Prediction Using Multi-Temporal Remote Sensing and Environmental Covariates
by Jiale Geng, Chong Luo, Jun Lu, Depiao Kong, Xue Li and Huanjun Liu
Remote Sens. 2025, 17(19), 3358; https://doi.org/10.3390/rs17193358 - 3 Oct 2025
Viewed by 388
Abstract
Soil organic matter (SOM) is essential for ecosystem health and agricultural productivity. Accurate prediction of SOM content is critical for modern agricultural management and sustainable soil use. Existing digital soil mapping (DSM) models, when processing temporal data, primarily focus on modeling the changes in input data across successive time steps. However, they do not adequately model the relationships among different input variables, which hinders the capture of complex data patterns and limits prediction accuracy. To address this problem, this paper proposes a novel deep learning model, 2-Channel Network (2C-Net), that leverages sequential multi-temporal remote sensing images to improve SOM prediction. The network separates input data into temporal and spatial data, processing them through independent temporal and spatial channels. Temporal data comprise multi-temporal Sentinel-2 spectral reflectance, while spatial data consist of environmental covariates including climate and topography. A Multi-sequence Feature Fusion Module (MFFM) is proposed to globally model spectral data across multiple bands and time steps, and a Diverse Convolutional Architecture (DCA) extracts spatial features from environmental data. Experimental results show that 2C-Net outperforms the baseline model (CNN-LSTM) and mainstream machine learning models for DSM, with R2 = 0.524, RMSE = 0.884 (%), MAE = 0.581 (%), and MSE = 0.781 (%)². Furthermore, this study demonstrates the significant importance of sequential spectral data for the inversion of SOM content and concludes that, for the SOM inversion task, the bare-soil period after tilling is a more important time window than other bare-soil periods. The 2C-Net model effectively captures spatiotemporal features, offering high-accuracy SOM predictions and supporting future DSM and soil management. Full article
(This article belongs to the Special Issue Remote Sensing in Soil Organic Carbon Dynamics)
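The R2 metric reported above is the standard coefficient of determination, computed from residual and total sums of squares:

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```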

21 pages, 2248 KB  
Article
TSFNet: Temporal-Spatial Fusion Network for Hybrid Brain-Computer Interface
by Yan Zhang, Bo Yin and Xiaoyang Yuan
Sensors 2025, 25(19), 6111; https://doi.org/10.3390/s25196111 - 3 Oct 2025
Viewed by 485
Abstract
Unimodal brain–computer interfaces (BCIs) often suffer from inherent limitations due to the characteristic of using single modalities. While hybrid BCIs combining electroencephalography (EEG) and functional near-infrared spectroscopy (fNIRS) offer complementary advantages, effectively integrating their spatiotemporal features remains a challenge due to inherent signal asynchrony. This study aims to develop a novel deep fusion network to achieve synergistic integration of EEG and fNIRS signals for improved classification performance across different tasks. We propose a novel Temporal-Spatial Fusion Network (TSFNet), which consists of two key sublayers: the EEG-fNIRS-guided Fusion (EFGF) layer and the Cross-Attention-based Feature Enhancement (CAFÉ) layer. The EFGF layer extracts temporal features from EEG and spatial features from fNIRS to generate a hybrid attention map, which is utilized to achieve more effective and complementary integration of spatiotemporal information. The CAFÉ layer enables bidirectional interaction between fNIRS and fusion features via a cross-attention mechanism, which enhances the fusion features and selectively filters informative fNIRS representations. Through the two sublayers, TSFNet achieves deep fusion of multimodal features. Finally, TSFNet is evaluated on motor imagery (MI), mental arithmetic (MA), and word generation (WG) classification tasks. Experimental results demonstrate that TSFNet achieves superior classification performance, with average accuracies of 70.18% for MI, 86.26% for MA, and 81.13% for WG, outperforming existing state-of-the-art multimodal algorithms. These findings suggest that TSFNet provides an effective solution for spatiotemporal feature fusion in hybrid BCIs, with potential applications in real-world BCI systems. Full article
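The cross-attention core of the CAFÉ layer can be sketched as a single head with learned projections omitted for brevity; in the model, fNIRS features attend to the fused representation (and vice versa) through this mechanism.

```python
import numpy as np

def cross_attention(query, keys, values):
    """Single-head cross-attention sketch: softmax(Q K^T / sqrt(d)) V.
    Learned projection matrices are omitted for brevity."""
    d = query.shape[-1]
    scores = query @ keys.T / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ values
```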

32 pages, 13081 KB  
Article
FedIFD: Identifying False Data Injection Attacks in Internet of Vehicles Based on Federated Learning
by Huan Wang, Junying Yang, Jing Sun, Zhe Wang, Qingzheng Liu and Shaoxuan Luo
Big Data Cogn. Comput. 2025, 9(10), 246; https://doi.org/10.3390/bdcc9100246 - 26 Sep 2025
Viewed by 417
Abstract
With the rapid development of intelligent connected vehicle technology, false data injection (FDI) attacks have become a major challenge in the Internet of Vehicles (IoV). While deep learning methods can effectively identify such attacks, the dynamic, distributed architecture of the IoV and limited computing resources hinder both privacy protection and lightweight computation. To address this, we propose FedIFD, a federated learning (FL)-based detection method for false data injection attacks. The lightweight threat detection model utilizes basic safety messages (BSM) for local incremental training, and the Q-FedCG algorithm compresses gradients for global aggregation. Original features are reshaped using a time window. To ensure temporal and spatial consistency, a sliding average strategy aligns samples before spatial feature extraction. A dual-branch architecture enables parallel extraction of spatiotemporal features: a three-layer stacked Bidirectional Long Short-Term Memory (BiLSTM) captures temporal dependencies, and a lightweight Transformer models spatial relationships. A dynamic feature fusion weight matrix calculates attention scores for adaptive feature weighting. Finally, a differentiated pooling strategy is applied to emphasize critical features. Experiments on the VeReMi dataset show that the accuracy reaches 97.8%. Full article
(This article belongs to the Special Issue Big Data Analytics with Machine Learning for Cyber Security)
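The sliding-average alignment step can be sketched as a moving-mean filter over a BSM-derived feature sequence; the window size below is an assumption.

```python
import numpy as np

def sliding_average(samples, window=3):
    """Sliding-average smoothing of a BSM-derived feature sequence,
    used here as a stand-in for the alignment step before spatial
    feature extraction."""
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(samples, dtype=float), kernel, mode="valid")
```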
