Search Results (1,118)

Search Parameters:
Keywords = spatiotemporal convolutional network

23 pages, 4327 KB  
Article
A Global TEC Map Forecasting Method Based on Periodic-Matched Residual Prediction and Longitude-Circular Boundary-Aware Convolution
by Yingli Chang, Yu Gao, Mengjie Wu and Peng Guo
Appl. Sci. 2026, 16(11), 5651; https://doi.org/10.3390/app16115651 - 4 Jun 2026
Abstract
Total Electron Content (TEC) is a key parameter for characterizing the state of the ionosphere, and its spatiotemporal variations can significantly affect satellite navigation, radio communication, and space weather monitoring. To address the pronounced diurnal periodicity in global TEC map forecasting and the commonly neglected continuity at longitudinal boundaries, this study proposes an encoder–decoder ConvLSTM model that integrates periodic-matched residual prediction with longitude-circular boundary-aware convolution, namely the Longitude-Circular Periodic-Residual ED-ConvLSTM (LC-PR-EDConvLSTM). In the proposed model, the TEC map at the same temporal phase on the previous day is used as a periodic background field, enabling the network to focus on learning the residual variation in future TEC relative to this background. Meanwhile, longitude-circular padding is introduced into the convolution operations to preserve the spatial continuity of global TEC maps across the −180° and 180° meridians. Experiments were conducted using CODE global ionospheric map products from 2009 to 2019, with 12 TEC maps from the previous day used as inputs to predict 12 TEC maps for the following day. The results show that LC-PR-EDConvLSTM achieves RMSE values of 3.68 TECU and 1.37 TECU on the 2015 high-solar-activity test set and the 2019 low-solar-activity test set, respectively, outperforming the C1pg, ED-ConvGRU, and ED-ConvLSTM benchmark models. Ablation experiments further verify the effectiveness of the periodic-matched residual prediction strategy and the longitude-circular boundary-aware convolution. Analyses of typical space weather events and latitudinal regions demonstrate that the proposed model provides stable forecasting performance under complex space weather conditions and across most latitude regions. Full article
(This article belongs to the Collection Space Applications)
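
The two ideas named in this abstract, longitude-circular padding and prediction of a residual relative to the previous day's map, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the edge replication in latitude, and the assumption that the first and last longitude columns are distinct are all illustrative choices. Some gridded products repeat the seam meridian, in which case one copy would need to be dropped first.

```python
import numpy as np

def pad_longitude_circular(tec_map, pad):
    """Pad a (lat, lon) map before a convolution.

    Longitude wraps around (the column east of the last one is the first
    one); latitude is padded by repeating the edge rows (an assumption).
    """
    wrapped = np.pad(tec_map, ((0, 0), (pad, pad)), mode="wrap")
    return np.pad(wrapped, ((pad, pad), (0, 0)), mode="edge")

def residual_target(future_map, background_map):
    """Training target: future map minus the same-phase map of the previous day."""
    return future_map - background_map

def reconstruct(predicted_residual, background_map):
    """Final forecast: periodic background plus the predicted residual."""
    return background_map + predicted_residual

# Hypothetical 3 x 4 grid, used only to show the wrap-around behaviour.
demo = np.arange(12.0).reshape(3, 4)
padded = pad_longitude_circular(demo, 1)
```

With this padding, a 3 x 3 kernel centred on the first longitude column sees the last column as its western neighbour, which is the continuity across the −180°/180° meridian that the abstract describes.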

21 pages, 9092 KB  
Article
Prior-Knowledge-Guided Graph Attention Network for Fault Diagnosis of Engine Valve Clearance
by Mingyu Li, Jingqian Wen, Xiaonan Yang, Yaoguang Hu, Xinlong Li and Zhongjie Shi
Sensors 2026, 26(11), 3565; https://doi.org/10.3390/s26113565 - 3 Jun 2026
Viewed by 258
Abstract
Fault diagnosis of diesel engines is a critical task in the operation and maintenance of complex equipment. Diesel engine fault diagnosis technology based on deep learning has seen widespread development due to its powerful feature learning and fault classification capabilities. However, traditional data-driven deep learning models cannot explicitly uncover relationships between signals, which hinders better fault information capture. Therefore, this paper proposes a diesel-engine valve-clearance fault diagnosis method driven by a combination of knowledge and data. Firstly, the original signals are converted into graph data with a topological structure based on the spatiotemporal relationships of events occurring within the cylinder, thereby uncovering the intrinsic structural information of the samples. Then, the graph structure is input into a graph convolutional attention network to extract features and learn fault patterns. Valve fault experiments were conducted on a diesel engine test bench, and the results indicate that the proposed knowledge and data-driven deep learning fault diagnosis model achieves better diagnostic performance and clearer interpretability compared to traditional data-driven deep learning fault diagnosis models, and it still has a relatively high accuracy in a diagnostic environment with scarce data. Full article
(This article belongs to the Section Fault Diagnosis & Sensors)

28 pages, 7559 KB  
Article
GA-GBDT: A Spatio-Temporal Graph-Augmented Gradient Boosting Framework for GNSS Network–Based Landslide Event Warning in Mining Areas
by Jinhua Wu, Liang Fei, Wei Dong, Chengdu Cao, Bo Zhang, Xiangyang Han, Ting On Chan, Yuli Wang and Joseph Awange
Appl. Sci. 2026, 16(11), 5569; https://doi.org/10.3390/app16115569 - 2 Jun 2026
Viewed by 207
Abstract
Landslide event warning in mining areas is essential for geohazard risk mitigation and infrastructure safety. With the increasing use of Global Navigation Satellite System (GNSS) monitoring networks, warning decisions are often derived from abnormal deformation responses in continuous displacement records. However, deriving stable and transferable warning decisions from GNSS networks is challenged by spatially coupled station responses, time-varying displacement patterns, and incomplete or disturbed observations. To address these issues, this study proposes a graph-augmented gradient boosting decision tree framework, termed GA-GBDT (Graph-Augmented Gradient Boosting Decision Trees), for multi-station landslide event warning in mining areas. The framework first constructs a weighted station graph to encode spatial dependence across stations. Based on this graph, a Gated Recurrent Unit (GRU) and a Graph Convolutional Network (GCN) are integrated to learn spatio-temporal embeddings, which are then fused with station-wise features and fed into XGBoost (eXtreme Gradient Boosting) for warning decision-making. Experiments on a 90-station GNSS network show that GA-GBDT outperforms representative rule-based, machine-learning, and deep-learning baselines, achieving more robust warning performance with improved generalization and false-alarm control. These results indicate that GA-GBDT improves warning robustness, decision stability, and cross-zone generalization for GNSS-based landslide warning in mining areas, with potential transferability to other slope warning scenarios. Full article
(This article belongs to the Section Earth Sciences)

27 pages, 5561 KB  
Article
A Short-Term Traffic Flow Prediction Model Based on IHO-CNN-BiLSTM-Attention
by Zihan Shen, Yuefang Sun and Xuze Dong
Electronics 2026, 15(11), 2418; https://doi.org/10.3390/electronics15112418 - 2 Jun 2026
Viewed by 168
Abstract
Accurate short-term traffic flow prediction is crucial for managing macroscopic Intelligent Transportation Systems (ITS). To overcome limitations in capturing complex spatiotemporal dependencies and the severe challenges of hyperparameter tuning, this paper proposes IHO-CNN-BiLSTM-Attention, a novel hybrid deep learning framework. Specifically, a Convolutional Neural Network (CNN) extracts local spatial features, a Bidirectional Long Short-Term Memory (BiLSTM) network captures temporal dependencies, and an attention mechanism dynamically weights key timesteps. To maximize the architecture’s performance, an Improved Hippopotamus Optimization (IHO) algorithm is proposed for automatic hyperparameter optimization. The IHO algorithm effectively overcomes the premature convergence of traditional optimizers by integrating a Piecewise Linear Chaotic Map (PWLCM) for initialization, tangent-based non-linear adaptive weights, a Tangent Flight defense mechanism, and Lens Opposition-Based Learning (LOBL) for local optimum escape. Evaluated comprehensively across three distinct macroscopic traffic benchmark datasets (a multimodal intersection, METR-LA velocity, and PeMSD4 volume), the IHO algorithm first demonstrated statistically significant superiority on standard CEC benchmark functions. Subsequently, the proposed hybrid model achieved state-of-the-art traffic state classification performance, maintaining peak F1-Scores of 0.9798, 0.8436, and 0.9561 across the highly diverse datasets. It significantly outperformed both classical optimized baselines (e.g., PSO, GWO) and contemporary heavy deep learning architectures (e.g., ASTformer, DiffSTG) under severe class imbalance and varying topological conditions. This work offers a robust, scalable, and highly generalized spatiotemporal forecasting solution with strong theoretical guarantees for intelligent traffic control. Full article
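
The chaotic initialization mentioned in this abstract can be sketched with the standard Piecewise Linear Chaotic Map. This is only an illustration of the general technique: the control parameter `p`, the seed `x0`, and the helper names are hypothetical, and the abstract does not state which values or variant the authors use.

```python
import numpy as np

def pwlcm_step(x, p=0.4):
    """One iteration of the standard PWLCM on [0, 1], with 0 < p < 0.5."""
    if x >= 0.5:
        x = 1.0 - x  # the map is symmetric about 0.5
    if x < p:
        return x / p
    return (x - p) / (0.5 - p)

def pwlcm_init(pop_size, dim, lower, upper, x0=0.37, p=0.4):
    """Fill a population by scaling a chaotic sequence into [lower, upper].

    x0 and p are illustrative values, not taken from the paper.
    """
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    pop = np.empty((pop_size, dim))
    x = x0
    for i in range(pop_size):
        for j in range(dim):
            x = pwlcm_step(x, p)
            pop[i, j] = lower[j] + x * (upper[j] - lower[j])
    return pop

# Hypothetical search space with three hyperparameters.
population = pwlcm_init(5, 3, lower=[0.0, 0.0, 0.0], upper=[1.0, 10.0, 100.0])
```

Compared with uniform random sampling, a chaotic sequence is deterministic for a given seed, which makes the initial population reproducible without a random number generator.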

21 pages, 3700 KB  
Article
Enhanced Attention-Based Multi-Channel Feature Fusion Network for Accurate Epilepsy Prediction
by Ziyang Gong, Junho Yoon, Xin Su and Chang Choi
Mathematics 2026, 14(11), 1926; https://doi.org/10.3390/math14111926 - 1 Jun 2026
Viewed by 286
Abstract
Artificial Intelligence (AI) has advanced electroencephalography (EEG)-based epilepsy management, yet high-dimensional multi-channel EEG signals remain difficult to exploit effectively. Many existing approaches inadequately capture spatiotemporal characteristics and often fail to identify seizure-sensitive channels, with more emphasis placed on classification than prevention. To address these limitations, a multi-channel feature fusion framework is proposed. Temporal dynamics are modeled by a Temporal Convolutional Network (TCN), and spatial attention is learned by a Vision Transformer (ViT). Channel selection and attention-based reweighting are further introduced to optimize the fusion process. The proposed framework was evaluated on the CHB-MIT scalp EEG dataset and the Mayo Clinic intracranial EEG (iEEG) dataset. AUC scores of 95.6% and 90.8% were obtained, with false-positive rates of 2.7% and 8.2%, respectively. These results indicate that preictal EEG segments can be identified prior to seizure onset with improved robustness. Full article

30 pages, 4496 KB  
Article
Identification of Mown Grassland in the Xilingol League by Leveraging Multi-Modal Remote Sensing Data and the MAD-Net Model
by Yalei Yang, Hong Wang, Xiaobing Li, Yixuan Wang, Zengwei Tang, Zixuan Jia and Ziru Wang
Remote Sens. 2026, 18(11), 1778; https://doi.org/10.3390/rs18111778 - 1 Jun 2026
Viewed by 86
Abstract
As a crucial grassland management practice, mowing plays a key role in maintaining the stability, productivity, and economic value of grassland ecosystems. The development of large-scale monitoring techniques for detecting whether mowing has occurred is of significant scientific and practical importance for improving the understanding of grassland ecosystem response mechanisms and optimizing management strategies. This study focuses on the concentrated grassland area of the Xilingol League in Inner Mongolia, restricted to the SAR-covered western sub-region. All classification accuracies reported here are obtained under spatially random train/test splits and represent an upper bound; generalization to geographically disjoint blocks remains unverified. By utilizing Sentinel-1, Sentinel-2, and Landsat-8 remote sensing images during the mowing season (August to September 2023) along with field survey data, we first applied the random forest-SHAP algorithm to select the optimal features from 70 texture features and construct a multimodal remote sensing dataset. Subsequently, we proposed the MAD-Net (Multi-Modal Attention Fusion Network with Dynamic Weighting) model to fully exploit information related to mowing identification from both optical and SAR data and conducted comparative analyses with other models. The results indicate that the CNN_LSTM_Attention model, which integrates convolutional neural networks, long short-term memory networks, and convolutional block attention modules, performed best in terms of capturing spatiotemporal variations in time series NDVI data. The U-Net model achieved the highest performance on the optimized texture dataset, while the MAD-Net model, which consists of three subnetworks that target different feature data, reached an identification accuracy of 92.59% in the SAR-covered western sub-region under a spatially random train/test split. 
This result represents an optimistic upper bound, as generalization to geographically independent blocks has not been evaluated. Ablation studies reveal that NDVI time series is the most informative single modality, while texture and SAR features provide complementary information; the proposed dynamic weighting module outperforms conventional fusion strategies. This study provides a new perspective for the large-scale binary classification of mown vs. non-mown grassland and effectively combines multimodal remote sensing data with deep learning models. Thus, this work not only offers a comparative basis for timely and effective identification of mowed grasslands but also provides insights for formulating optimized regional grassland management policies. Full article

37 pages, 3488 KB  
Article
Explainable Seizure Detection from Intracranial EEG Using a Spatio-Temporal Model
by Javier García-Sigüenza, Manuel Curado, Faraón Llorens-Largo and Jose F. Vicent
Mathematics 2026, 14(11), 1889; https://doi.org/10.3390/math14111889 - 29 May 2026
Viewed by 217
Abstract
Seizure detection based on intracranial electroencephalography (iEEG) signals is a relevant task in the analysis of epilepsy. In this context, it is not only important to achieve high predictive performance but also to ensure explainability, which allows for the analysis of the model’s behavior. The properties of the problem allow it to be formulated as a spatio-temporal problem due to the multichannel nature of iEEG and the temporal evolution of epileptic activity. Therefore, the data must be modeled jointly due to spatial and temporal dependencies. In this work, we propose the Exact Self Explainable Graph Convolutional Recurrent Network (ESEGCRN) for the detection of ictal and interictal periods in a patient-specific setting. The model represents the iEEG channels as nodes in a graph and the temporal evolution of the signal as a sequence over that structure. To validate the proposal, ESEGCRN is compared with various models that address the same problem. The results show that our model achieves the best overall predictive performance among the compared models. Furthermore, our model incorporates an internal explainability mechanism that generates a mask allowing for the analysis of node relevance. Analysis of the mask shows that, as the use of connections is restricted, incoming edges tend to progressively concentrate on seizure onset zone (SOZ) nodes. This reinforces confidence in the model and suggests that the relevance inferred by ESEGCRN is related to clinically significant nodes. Full article
(This article belongs to the Special Issue Computational Methods and Applications of Neural Networks)

24 pages, 4040 KB  
Article
SSA-A-BiGCRNN: An Attention-Based Spectrum Prediction Method for Spatio-Temporal Feature Synergy
by Yueshun He, Hao Song, Ping Du, Linlin He, Xiaoyu Cao, Yunzhe Liu and Weiqian Song
Telecom 2026, 7(3), 61; https://doi.org/10.3390/telecom7030061 - 28 May 2026
Viewed by 160
Abstract
Spectrum prediction is essential for implementing dynamic spectrum management and mitigating spectrum congestion. However, spectrum data in real electromagnetic environments exhibit high non-stationarity, multi-scale features, and complex non-Euclidean spatio-temporal coupling characteristics, which limit the prediction accuracy of existing models. To address these issues, this paper proposes an attention-based spectrum prediction method for spatio-temporal feature synergy (SSA-A-BiGCRNN). First, Singular Spectrum Analysis (SSA) is introduced to decompose and reconstruct the non-stationary spectrum signals, filtering out high-frequency burst noise and extracting core evolutionary trends. Second, a spatial topology graph among multiple frequency bands is constructed based on the Spearman rank correlation coefficient. A Bidirectional Graph Convolutional Recurrent Neural Network is then designed to simultaneously capture the spatial dependencies between frequency bands and the bidirectional evolutionary patterns in the time dimension. Finally, an attention mechanism is incorporated during the feature fusion stage to evaluate and focus on critical spatio-temporal information, further enhancing global prediction accuracy. Experimental results based on a real electromagnetic monitoring dataset demonstrate that the proposed model achieves an accuracy of 96.82%, a coefficient of determination (R2) of 0.9966, a Root Mean Square Error (RMSE) of 0.5597, and a Mean Absolute Error (MAE) of 0.4031, significantly outperforming existing models. Full article
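
A graph built from Spearman rank correlations between frequency bands, as this abstract describes, might look like the following NumPy sketch. The threshold, the use of the absolute correlation, and the simplified tie handling are assumptions made for illustration; the abstract does not specify how edges are selected.

```python
import numpy as np

def rank_columns(x):
    """Rank each column from 0 to n-1 (ties broken by order, a simplification)."""
    order = np.argsort(x, axis=0, kind="stable")
    ranks = np.empty_like(order)
    rows = np.broadcast_to(np.arange(x.shape[0])[:, None], x.shape)
    np.put_along_axis(ranks, order, rows, axis=0)
    return ranks.astype(float)

def spearman_adjacency(series, threshold=0.5):
    """Return (correlation matrix, binary adjacency) for a (time, band) array.

    Spearman correlation is the Pearson correlation of the ranks.
    The threshold value is hypothetical.
    """
    corr = np.corrcoef(rank_columns(series), rowvar=False)
    adj = (np.abs(corr) >= threshold).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-loops
    return corr, adj

# Toy data: band 1 is a monotone function of band 0, band 2 is its mirror image.
t = np.arange(20.0)
corr, adj = spearman_adjacency(np.column_stack([t, t ** 3, -t]))
```

Because the measure works on ranks, the cubic band correlates perfectly with the linear one even though their relationship is nonlinear, which is the usual argument for preferring Spearman over Pearson in this setting.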

25 pages, 1006 KB  
Article
MADS-GCN: A Robust Interactive Memory-Augmented Dual-Stream GCN with Adaptive Spatiotemporal Modeling for Human Action Recognition
by Qian Wang, Yini Zhou, Haowen Shi and Qian Huang
Appl. Sci. 2026, 16(11), 5408; https://doi.org/10.3390/app16115408 - 28 May 2026
Viewed by 131
Abstract
Human action recognition is a key research area in computer vision, where accurate recognition relies on effective modeling of both global and local spatiotemporal information. However, existing GCN-based methods often overemphasize the local topological connectivity of human skeletons. Moreover, their temporal modules fail to fully capture the evolution of action sequences, leading to critical instantaneous information being obscured by global representations. To address these problems, we propose an integrated framework termed MADS-GCN. In the spatial modeling stage, we introduce two parallel streams: the Physical Stream uses the adjacency matrix to constrain convolution and capture global structural patterns, while the Topological Stream leverages spatial attention to assign adaptive weights to joints, preserving discriminative local adaptive features. For temporal modeling, a channel-temporal attention mechanism is applied to adaptively refine feature maps, followed by a bidirectional GRU to capture multi-scale temporal patterns. Extensive experiments on NTU RGB+D60, Northwestern-UCLA, and our custom DanceBasic-Set demonstrate the effectiveness of MADS-GCN and indicate its applicability to dance action recognition scenarios. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
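
Convolution constrained by a skeleton adjacency matrix, as in the Physical Stream described above, is commonly written as a normalized neighbourhood aggregation followed by a linear map. The sketch below uses the widely known symmetric normalization with self-loops. Whether MADS-GCN uses this exact form is an assumption, and all names are illustrative.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric normalization D^(-1/2) (A + I) D^(-1/2) with self-loops."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def graph_conv(x, adj, weight):
    """One graph-convolution layer without a nonlinearity.

    x:      (joints, in_features) joint features
    adj:    (joints, joints) skeleton adjacency (1 where two joints share a bone)
    weight: (in_features, out_features) learnable projection
    """
    return normalize_adjacency(adj) @ x @ weight

# Hypothetical three-joint chain (for example shoulder, elbow, wrist).
chain = np.array([[0.0, 1.0, 0.0],
                  [1.0, 0.0, 1.0],
                  [0.0, 1.0, 0.0]])
features = np.ones((3, 2))
out = graph_conv(features, chain, np.eye(2))
```

Each joint's output mixes only its own features and those of joints it is physically connected to, which is the sense in which the adjacency matrix constrains the convolution.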

34 pages, 39880 KB  
Article
A Soil Moisture Prediction Model Based on GCN-LSTM Network Incorporating Channel and Temporal Attention
by Jing Wang, Bojia Liu, Xiaohe Han, Yuheng Ji and Qingliang Li
Water 2026, 18(11), 1308; https://doi.org/10.3390/w18111308 - 28 May 2026
Viewed by 163
Abstract
Accurate soil moisture prediction matters for combating drought and desertification. To address insufficient spatiotemporal modeling and redundant attention mechanisms in global soil moisture prediction, we built a deep learning model, CTA-GraphConvLSTM, that better captures how soil moisture changes across both space and time and provides technical support for drought early warning, precision agriculture, and water resource management. It combines graph convolutional networks to map geographic relationships and uses a 3D-SENet attention mechanism to extract key temporal patterns. Using the LandBench dataset, we compared the proposed model with LSTM, GraphLSTM, and ConvLSTM across multiple lead times and drought levels. Performance was evaluated using root mean square error (RMSE) and R². The CTA-GraphConvLSTM achieved the highest predictive accuracy (R² = 0.555 for 1-day lead), outperforming ConvLSTM (R² = 0.444), LSTM (R² = 0.430), and GraphLSTM (R² = 0.088). The GraphLSTM value shows that this baseline can hardly explain the variance in the data, performing only slightly better than a simple mean predictor. The comparison confirms that the proposed model has higher prediction accuracy. These results demonstrate the effectiveness of graph-scale spatiotemporal modeling for soil moisture prediction. Our research has direct practical applications: it can support precision agriculture by optimizing irrigation schedules, enhance water resource management through improved reservoir operation, and strengthen drought early warning systems, thereby contributing to sustainable land use and food security. Full article
(This article belongs to the Special Issue Data Assimilation and Modeling for Sustainable Soil–Water Systems)
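
The abstract's comparison with a simple mean predictor follows directly from the definition of the coefficient of determination, which a short NumPy sketch makes concrete. The toy numbers below are invented for illustration and are unrelated to LandBench.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean square error."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Invented observations; a model that always predicts their mean scores R^2 = 0.
obs = np.array([1.0, 2.0, 3.0, 4.0])
mean_baseline = np.full_like(obs, obs.mean())
baseline_r2 = r2_score(obs, mean_baseline)
```

Under this definition, predicting the observed mean everywhere gives a score of exactly zero, so a value close to zero indicates a model that adds little over that baseline, and negative values are possible for models that do worse.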

31 pages, 7005 KB  
Article
Comparative Evaluation of Machine Learning Models for Satellite Chlorophyll-a Gap Reconstruction in the Chesapeake Bay
by Rakshita Chidananda, Anusha Srirenganathan Malarvizhi, Samir Ahmed, Elena Zhang and Chaowei Phil Yang
Remote Sens. 2026, 18(11), 1736; https://doi.org/10.3390/rs18111736 - 28 May 2026
Viewed by 308
Abstract
Harmful algal blooms (HABs) are increasing in frequency in the Chesapeake Bay, posing risks to marine ecosystems, water quality, and public health. Chlorophyll-a (Chl-a) is a widely used indicator of algal biomass, and satellite observations such as Sentinel-3 Ocean and Land Color Instrument (OLCI) enable large-scale monitoring of bloom dynamics. However, cloud cover and atmospheric interference frequently introduce missing pixels in daily satellite products, reducing temporal continuity and limiting monitoring reliability. Satellite-derived chlorophyll-a (Chl-a) data exhibit substantial missingness, with daily pixel gaps ranging from approximately 52.30% to 100% (mean ≈ 88.95%). This study evaluates spatial interpolation, EOF-based, supervised machine-learning, deep-learning, and convolutional autoencoder approaches for reconstructing missing Chl-a values. Sentinel-3 OLCI Chl-a data from 2023–2024 were used for model training, while data from 2025 served as a temporally independent test set to avoid spatiotemporal leakage. To simulate cloud-induced data gaps, artificial missingness scenarios ranging from 50% to 90% were applied for the Inverse Distance Weighting (IDW) and Data Interpolating Empirical Orthogonal Functions (DINEOF) baseline approaches, while machine-learning, deep-learning, and convolutional autoencoder models were evaluated using real satellite-derived missing observations. The evaluated models include IDW, DINEOF, K-Nearest Neighbors (KNN), Random Forest (RF), Extra Trees (ET), XGBoost, a Long Short-Term Memory (LSTM) network, and a Temporal Data Interpolating Convolutional Autoencoder (Temporal DINCAE). Model performance was assessed using Root Mean Square Error (RMSE), Mean Absolute Error (MAE), prediction bias, and the coefficient of determination (R2). Results indicate that tree-based ensemble models outperform spatial interpolation and EOF-based methods, with XGBoost achieving the best overall performance (R2 ≈ 0.86; RMSE ≈ 9.61 mg m−3). 
The LSTM model achieved lower prediction errors (RMSE ≈ 5.87 mg m−3; MAE ≈ 2.16 mg m−3), highlighting the benefit of incorporating temporal dependencies, although with slightly reduced variance capture. The convolutional autoencoder-based Temporal DINCAE model achieved strong reconstruction performance (R2 ≈ 0.84; RMSE ≈ 11.15 mg m−3). Uncertainty quantification shows that Extra Trees tends to underestimate uncertainty with narrower prediction intervals, whereas XGBoost provides better-calibrated but wider intervals. Full article
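
Inverse Distance Weighting, the simplest of the baselines listed in this abstract, can be sketched in a few lines of NumPy. The version below fills every missing pixel from all valid pixels with a hypothetical power of 2. The study's actual neighbourhood size, power, and masking rules are not given in the abstract, so this is an illustration of the general method only.

```python
import numpy as np

def idw_fill(grid, power=2.0):
    """Fill NaN pixels with an inverse-distance-weighted mean of valid pixels.

    Distances are measured in pixel units. Every valid pixel contributes,
    which is simple but slow on large scenes (an illustrative choice).
    The grid must contain at least one valid pixel.
    """
    filled = grid.copy()
    valid = ~np.isnan(grid)
    vy, vx = np.nonzero(valid)
    values = grid[valid]  # same row-major order as (vy, vx)
    for y, x in zip(*np.nonzero(~valid)):
        dist = np.hypot(vy - y, vx - x)  # never zero: the target pixel is missing
        weights = 1.0 / dist ** power
        filled[y, x] = np.sum(weights * values) / np.sum(weights)
    return filled

# Invented one-row scene with a single gap between two observations.
scene = np.array([[1.0, np.nan, 3.0]])
restored = idw_fill(scene)
```

Because the estimate is always a weighted average of observed values, IDW cannot reproduce peaks that fall inside a gap, which is one reason learned models can outperform it when missingness is high.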

26 pages, 14829 KB  
Article
A Method for Predicting Arctic Sea Ice Concentration Based on Multimodal Feature Fusion and Temporal Trend Analysis
by Liang Huang, Jianhua Miao, Haishao Chen, Xiaojun Mei, Zhongdai Wu, Feng Wang and Yuxuan Zhang
J. Mar. Sci. Eng. 2026, 14(11), 993; https://doi.org/10.3390/jmse14110993 - 28 May 2026
Viewed by 219
Abstract
The accurate prediction of Arctic sea ice concentration is essential for polar ecological protection and shipping safety. However, existing prediction methods suffer from insufficient feature representation, which limits their ability to capture the complex spatiotemporal distribution of sea ice. Furthermore, they cannot effectively integrate multi-source, heterogeneous sea ice-related data, resulting in limited prediction accuracy. To address these issues, this paper proposes a Multimodal Feature and Trend analysis (MFT) method for sea ice concentration prediction. In the feature extraction stage, MFT combines a Convolutional Neural Network with a Convolutional Block Attention Module to deeply extract global deep semantic features while also employing the Scale-Invariant Feature Transform algorithm to accurately capture local stable features. To improve processing efficiency for high-dimensional remote sensing data, a coarse-resolution dimensionality reduction strategy is developed to select core spatial features, thereby preserving key spatial distribution information while optimizing computational efficiency. For temporal analysis, the Mann–Kendall (MK) non-parametric test and Sen’s slope method are integrated to quantitatively analyze long-term evolution trends in Arctic sea ice concentration. Experimental results show that the proposed MFT model outperforms random forest (RF), LSTM, and traditional MK methods in both prediction accuracy and computational efficiency. Full article
(This article belongs to the Section Ocean Engineering)
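
The Mann–Kendall test and Sen's slope named in this abstract are standard trend tools, sketched below with NumPy. The variance formula omits the correction for tied values, and the evenly spaced toy series is invented, so this should be read as a textbook illustration and not as the authors' procedure.

```python
import numpy as np

def mann_kendall_s(x):
    """S statistic: concordant minus discordant pairs over all i < j."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(len(x), k=1)
    return int(np.sum(np.sign(x[j] - x[i])))

def mann_kendall_z(x):
    """Standardized statistic (normal approximation, no tie correction)."""
    n = len(x)
    s = mann_kendall_s(x)
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        return (s - 1) / np.sqrt(var_s)
    if s < 0:
        return (s + 1) / np.sqrt(var_s)
    return 0.0

def sens_slope(x):
    """Median of all pairwise slopes, assuming evenly spaced time steps."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(len(x), k=1)
    return float(np.median((x[j] - x[i]) / (j - i)))

# Invented series with a perfectly linear upward trend of 2 units per step.
series = 2.0 * np.arange(10) + 1.0
z_value = mann_kendall_z(series)
slope = sens_slope(series)
```

A |Z| above about 1.96 indicates a trend that is significant at the 5% level under the normal approximation, while Sen's slope gives its magnitude in data units per time step and is robust to outliers because it is a median.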

22 pages, 8915 KB  
Article
Explainable Deep Learning for Greenhouse Horticulture: Feature and Temporal Interpretability in Crop Yield and Energy Optimization
by Yiqiao Li, Boyuan Zheng, Victor W. Chu, Jianlong Zhou, Fang Chen, Sachin Chavan, Jing He, Meng Xu, Zhonghua Chen and David Tissue
AgriEngineering 2026, 8(6), 213; https://doi.org/10.3390/agriengineering8060213 - 28 May 2026
Viewed by 171
Abstract
Optimizing crop yield while minimizing energy consumption remains a central challenge in greenhouse horticulture. This study introduces an integrated deep learning framework that couples multi-horizon time-series forecasting with dual-layered explainability to address the critical need for spatiotemporal transparency in optimizing greenhouse crop yield and energy efficiency. Four deep learning architectures, including the One-Dimensional Convolutional Neural Network (1D-CNN), Long Short-Term Memory Network (LSTM), Bidirectional Long Short-Term Memory Network (BiLSTM), and TinyTimeMixer (TTM), were evaluated across two varieties of capsicum. LSTM and BiLSTM achieved the highest accuracy for incremental yield prediction, whereas TTM outperformed other models in forecasting daily energy usage, reflecting the distinct temporal characteristics of biological growth and environment-driven energy demand. To uncover the factors driving these predictions, two complementary explainability methods were applied: Gradient SHapley Additive exPlanations (SHAP) for feature-level attribution and a Temporal Convolutional Network with Convolutional Block Attention Module (TCN–CBAM) attention mechanism for joint temporal-feature interpretation. Radiation and drainage-related variables consistently emerged as the dominant contributors to yield, whereas external temperature, and humidity were the primary determinants of energy usage. Temporal attention further showed that yield is influenced by both recent irrigation responses and longer-term developmental dynamics, while energy consumption is driven mainly by short-term climatic fluctuations. These findings provide actionable insights for irrigation scheduling, climate-control strategies, and energy optimization, supporting more transparent and sustainable greenhouse management. Full article

23 pages, 5712 KB  
Article
MGFNet: A Multi-Granularity Fusion Network with Coupling-Guided Sparse Routing for Hybrid EEG-fNIRS Decoding
by Yan Zhang, Xiaoyu Gong and Xiaoyang Yuan
Sensors 2026, 26(11), 3402; https://doi.org/10.3390/s26113402 - 27 May 2026
Abstract
Hybrid brain–computer interfaces (BCIs) have attracted growing research attention because they combine the millisecond-level temporal resolution of electroencephalography (EEG) with the spatially informative hemodynamic responses of functional near-infrared spectroscopy (fNIRS). However, most existing deep fusion methods rely on static late-fusion strategies, which tend to underexploit latent cross-modal dependencies and are vulnerable to modality-specific signal degradation. To address these limitations, we propose MGFNet, a multi-granularity fusion network for hybrid BCI decoding. MGFNet contains three components: (1) intra-modal encoders that learn modality-specific spatiotemporal representations from EEG, oxygenated hemoglobin (HbO), and deoxygenated hemoglobin (HbR) signals; (2) cross-modal interaction encoders that temporally align paired modalities and use dilated convolutions to capture long-range EEG-fNIRS dependencies; and (3) a Coupling-Guided Sparse Component Routing (CGSCR) module that estimates sample-specific cross-modal coupling and performs adaptive discrete routing. We further introduce a deep supervision strategy to stabilize optimization and improve branch-level discriminability. Under a within-subject held-out evaluation protocol on a public benchmark dataset, MGFNet achieved classification accuracies of 99.40% on the n-back task and 99.03% on the word generation (WG) task, outperforming representative comparison methods evaluated under a matched protocol. Ablation studies further confirmed the contributions of the intra-modal encoders, the cross-modal interaction encoders, and the CGSCR module. Under controlled EEG corruption with additive white Gaussian noise at −10 dB, MGFNet outperformed a static-fusion variant by 9.23 percentage points on the n-back task and 6.31 percentage points on the WG task. 
These results support the effectiveness of MGFNet in the present offline within-subject setting and indicate improved robustness under controlled single-modality degradation.
(This article belongs to the Special Issue Challenges and Future Trends in Biomedical Signal Processing)
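The robustness test described in the MGFNet abstract corrupts EEG with additive white Gaussian noise at −10 dB. A minimal sketch of that kind of corruption is shown below, assuming the −10 dB figure denotes a signal-to-noise ratio; the function name `add_awgn` and the sine-wave stand-in for an EEG channel are hypothetical and not taken from the article.

```python
import numpy as np

def add_awgn(signal, snr_db, rng=None):
    """Return `signal` plus white Gaussian noise scaled to the requested SNR (dB)."""
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

# Hypothetical stand-in for one EEG channel: a 10 Hz sine sampled for 1 s.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 20000, endpoint=False)
clean = np.sin(2 * np.pi * 10 * t)

noisy = add_awgn(clean, snr_db=-10.0, rng=rng)

# Empirical check: the measured SNR should be close to the requested value.
measured_snr_db = 10 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
```

At −10 dB the noise power is ten times the signal power, so this is a severe degradation of the corrupted modality.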

20 pages, 4796 KB  
Article
UHPose-VAD: Unsupervised Video Anomaly Detection via Pose-Graph Learning and Normalizing Flow
by Di Jiang, Huicheng Lai, Guxue Gao, Dan Ma and Liejun Wang
J. Imaging 2026, 12(6), 227; https://doi.org/10.3390/jimaging12060227 - 27 May 2026
Abstract
Unsupervised video anomaly detection (VAD) aims to identify unusual events by learning from unlabeled videos. However, many current methods overlook the fine-grained spatiotemporal dynamics of human poses, which are crucial for detecting localized anomalies like falls or assaults. Prevailing methods that rely on raw RGB frames are often susceptible to variations in lighting and background and struggle to capture the precise structural relationships of human bodies over time. To bridge this gap, we propose UHPose-VAD, a novel unsupervised framework that integrates human pose dynamics with normalizing flow within a graph-based probabilistic model to capture anomalies through spatiotemporal Gaussian distributions. Our framework first extracts human pose keypoints and normalizing flow features. These are then modeled by a graph convolutional network that adaptively learns the graph connectivity, effectively mapping the data to a latent space. This approach allows the model to explicitly reason about the spatiotemporal relationships between body joints, making it inherently more robust and interpretable for human-centric anomaly detection. Finally, a Gaussian Mixture Model fits the latent features of normal training data, learning the intrinsic manifold of regular motion patterns. Extensive experiments on ShanghaiTech and UBnormal datasets show that UHPose-VAD achieves state-of-the-art performance among unsupervised methods, with AUC scores of 86.1% and 69.4%, respectively.
(This article belongs to the Special Issue From Visual Perception to Spatiotemporal Understanding)