

Search Results (1,156)

Search Parameters:
Keywords = attention-based LSTM

21 pages, 8249 KB  
Article
Short-Term Passenger Flow Forecasting for Rail Transit Integrating Multi-Scale Decomposition and Deep Attention Mechanism
by Youpeng Lu and Jiming Wang
Sustainability 2025, 17(19), 8880; https://doi.org/10.3390/su17198880 - 6 Oct 2025
Abstract
Short-term passenger flow prediction provides critical data-driven support for optimizing resource allocation, guiding passenger mobility, and enhancing risk response capabilities in urban rail transit systems. To further improve prediction accuracy, this study proposes a hybrid SMA-VMD-Informer-BiLSTM prediction model. Addressing the challenge of error propagation caused by non-stationary components (e.g., noise and abrupt fluctuations) in conventional passenger flow signals, the Variational Mode Decomposition (VMD) method is introduced to decompose raw flow data into multiple intrinsic mode functions (IMFs). A Slime Mould Algorithm (SMA)-based optimization mechanism is designed to adaptively tune VMD parameters, effectively mitigating mode redundancy and information loss. Furthermore, to circumvent error accumulation inherent in serial modeling frameworks, a parallel prediction architecture is developed: the Informer branch captures long-term dependencies through its ProbSparse self-attention mechanism, while the Bidirectional Long Short-Term Memory (BiLSTM) network extracts localized short-term temporal patterns. The outputs of both branches are fused via a fully connected layer, balancing global trend adherence and local fluctuation characterization. Experimental validation using historical entry flow data from Weihouzhuang Station on Xi’an Metro demonstrated the superior performance of the SMA-VMD-Informer-BiLSTM model. Compared to benchmark models (CNN-BiLSTM, CNN-BiGRU, Transformer-LSTM, ARIMA-LSTM), the proposed model achieved reductions of 7.14–53.33% in f_MSE, 3.81–31.14% in f_RMSE, and 8.87–38.08% in f_MAE, alongside a 4.11–5.48% improvement in R². Cross-station validation across multiple Xi’an Metro hubs further confirmed robust spatial generalizability, with prediction errors bounded within f_MSE: 0.0009–0.01, f_RMSE: 0.0303–0.1, f_MAE: 0.0196–0.0697, and R²: 0.9011–0.9971.
Furthermore, the model demonstrated favorable predictive performance when applied to forecasting passenger inflows at multiple stations in Nanjing and Zhengzhou, showcasing its excellent spatial transferability. By integrating multi-level, multi-scale data processing and adaptive feature extraction mechanisms, the proposed model significantly mitigates error accumulation observed in traditional approaches. These findings collectively indicate its potential as a scientific foundation for refined operational decision-making in urban rail transit management, thereby significantly promoting the sustainable development and long-term stable operation of urban rail transit systems. Full article
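The fusion step this abstract describes (outputs of the Informer and BiLSTM branches concatenated and passed through a fully connected layer) can be sketched as follows. This is a minimal numpy stand-in with toy random vectors in place of the trained branches, not the authors' implementation:

```python
import numpy as np

def fuse_branches(informer_out, bilstm_out, W, b=0.0):
    """Concatenate the global (Informer) and local (BiLSTM) branch outputs
    and apply one fully connected layer.
    informer_out, bilstm_out: (batch, d) feature matrices.
    W: (2*d, 1) weight matrix, b: scalar bias.
    Returns a (batch,) vector of fused flow predictions."""
    z = np.concatenate([informer_out, bilstm_out], axis=1)  # (batch, 2*d)
    return (z @ W + b).ravel()

# Toy stand-ins for the two trained branches (batch of 3, d = 4 each).
rng = np.random.default_rng(0)
informer_out = rng.normal(size=(3, 4))
bilstm_out = rng.normal(size=(3, 4))
W = rng.normal(size=(8, 1))
fused = fuse_branches(informer_out, bilstm_out, W)
```

The single linear layer lets the model learn how much weight to give global trend adherence versus local fluctuation characterization at training time.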

43 pages, 4746 KB  
Article
The BTC Price Prediction Paradox Through Methodological Pluralism
by Mariya Paskaleva and Ivanka Vasenska
Risks 2025, 13(10), 195; https://doi.org/10.3390/risks13100195 - 4 Oct 2025
Abstract
Bitcoin’s extreme price volatility presents significant challenges for investors and traders, necessitating accurate predictive models to guide decision-making in cryptocurrency markets. This study compares the performance of machine learning approaches for Bitcoin price prediction, specifically examining XGBoost gradient boosting, Long Short-Term Memory (LSTM), and GARCH-DL neural networks using comprehensive market data spanning December 2013 to May 2025. We employed extensive feature engineering incorporating technical indicators, applied multiple machine and deep learning model configurations including standalone and ensemble approaches, and utilized cross-validation techniques to assess model robustness. Based on the empirical results, the most significant practical implication is that traders and financial institutions should adopt a dual-model approach, deploying XGBoost for directional trading strategies and utilizing LSTM models for applications requiring precise magnitude predictions, due to their superior continuous forecasting performance. This research demonstrates that traditional technical indicators, particularly market capitalization and price extremes, remain highly predictive in algorithmic trading contexts, validating their continued integration into modern cryptocurrency prediction systems. For risk management applications, the attention-based LSTM’s superior risk-adjusted returns, combined with enhanced interpretability, make it particularly valuable for institutional portfolio optimization and regulatory compliance requirements. The findings suggest that ensemble methods offer balanced performance across multiple evaluation criteria, providing a robust foundation for production trading systems where consistent performance is more valuable than optimization for single metrics.
These results enable practitioners to make evidence-based decisions about model selection based on their specific trading objectives, whether focused on directional accuracy for signal generation or precision of magnitude for risk assessment and portfolio management. Full article
(This article belongs to the Special Issue Portfolio Theory, Financial Risk Analysis and Applications)

28 pages, 7501 KB  
Article
Multi-Step Apparent Temperature Prediction in Broiler Houses Using a Hybrid SE-TCN–Transformer Model with Kalman Filtering
by Pengshen Zheng, Wanchao Zhang, Bin Gao, Yali Ma and Changxi Chen
Sensors 2025, 25(19), 6124; https://doi.org/10.3390/s25196124 - 3 Oct 2025
Abstract
In intensive broiler production, rapid environmental fluctuations can induce heat stress, adversely affecting flock welfare and productivity. Apparent temperature (AT), integrating temperature, humidity, and wind speed, provides a comprehensive thermal index, guiding predictive climate control. This study develops a multi-step AT forecasting model based on a hybrid SE-TCN–Transformer architecture enhanced with Kalman filtering. The temporal convolutional network with SE attention extracts short-term local trends, the Transformer captures long-range dependencies, and Kalman smoothing reduces prediction noise, collectively improving robustness and accuracy. The model was trained on multi-source time-series data from a commercial broiler house and evaluated for 5, 15, and 30 min horizons against LSTM, GRU, Autoformer, and Informer benchmarks. Results indicate that the proposed model achieves substantially lower prediction errors and higher determination coefficients. By combining multi-variable feature integration, local–global temporal modeling, and dynamic smoothing, the model offers a precise and reliable tool for intelligent ventilation control and heat stress management. These findings provide both scientific insight into multi-step thermal environment prediction and practical guidance for optimizing broiler welfare and production performance. Full article
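The Kalman smoothing stage mentioned above can be illustrated with a one-dimensional filter under a random-walk state model; the noise variances q and r below are hypothetical placeholders, not values from the paper:

```python
def kalman_smooth(series, q=1e-3, r=1e-2):
    """One-dimensional Kalman filter with a random-walk state model.

    series: noisy model predictions (e.g., apparent temperature readings).
    q: process-noise variance, r: measurement-noise variance
    (both hypothetical, tuned per deployment).
    Returns the filtered series of the same length."""
    x, p = series[0], 1.0          # initial state estimate and variance
    out = [x]
    for z in series[1:]:
        p = p + q                  # predict: variance grows by process noise
        k = p / (p + r)            # Kalman gain
        x = x + k * (z - x)        # update toward the new measurement
        p = (1 - k) * p
        out.append(x)
    return out

# A single-sample spike (25.0) is pulled back toward the recent trend.
smoothed = kalman_smooth([20.0, 20.5, 19.8, 25.0, 20.2, 20.1])
```

Raising r relative to q trusts the model's past state more and damps spikes harder, which is the usual lever for reducing prediction noise.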
(This article belongs to the Section Smart Agriculture)
19 pages, 9302 KB  
Article
Real-Time Face Gesture-Based Robot Control Using GhostNet in a Unity Simulation Environment
by Yaseen
Sensors 2025, 25(19), 6090; https://doi.org/10.3390/s25196090 - 2 Oct 2025
Abstract
Unlike traditional control systems that rely on physical input devices, facial gesture-based interaction offers a contactless and intuitive method for operating autonomous systems. Recent advances in computer vision and deep learning have enabled the use of facial expressions and movements for command recognition in human–robot interaction. In this work, we propose a lightweight, real-time facial gesture recognition method, GhostNet-BiLSTM-Attention (GBA), which integrates GhostNet and BiLSTM with an attention mechanism, is trained on the FaceGest dataset, and is integrated with a 3D robot simulation in Unity. The system is designed to recognize predefined facial gestures such as head tilts, eye blinks, and mouth movements with high accuracy and low inference latency. Recognized gestures are mapped to specific robot commands and transmitted to a Unity-based simulation environment via socket communication across machines. This framework enables smooth and immersive robot control without the need for conventional controllers or sensors. Real-time evaluation demonstrates the system’s robustness and responsiveness under varied user and lighting conditions, achieving a classification accuracy of 99.13% on the FaceGest dataset. The GBA holds strong potential for applications in assistive robotics, contactless teleoperation, and immersive human–robot interfaces. Full article
(This article belongs to the Special Issue Smart Sensing and Control for Autonomous Intelligent Unmanned Systems)

27 pages, 10646 KB  
Article
Deep Learning-Based Hybrid Model with Multi-Head Attention for Multi-Horizon Stock Price Prediction
by Rajesh Kumar Ghosh, Bhupendra Kumar Gupta, Ajit Kumar Nayak and Samit Kumar Ghosh
J. Risk Financial Manag. 2025, 18(10), 551; https://doi.org/10.3390/jrfm18100551 - 1 Oct 2025
Abstract
The prediction of stock prices is challenging due to their volatility, irregular patterns, and complex time-series structure. Reliably forecasting stock market data plays a crucial role in minimizing financial risk and optimizing investment strategies. However, traditional models often struggle to capture temporal dependencies and extract relevant features from noisy inputs, which limits their predictive performance. To improve this, we developed an enhanced recursive feature elimination (RFE) method that blends the importance of impurity-based features from random forest and gradient boosting models with Kendall tau correlation analysis, and we applied SHapley Additive exPlanations (SHAP) analysis to externally validate the reliability of the selected features. This approach leads to more consistent and reliable feature selection for short-term stock prediction over 1-, 3-, and 7-day intervals. The proposed deep learning (DL) architecture integrates a temporal convolutional network (TCN) for long-term pattern recognition, a gated recurrent unit (GRU) for sequence capture, and multi-head attention (MHA) for focusing on critical information, thereby achieving superior predictive performance. We evaluate the proposed approach using daily stock price data from three leading companies—HDFC Bank, Tata Consultancy Services (TCS), and Tesla—and two major stock indices: Nifty 50 and S&P 500. The performance of our model is compared against five benchmark models: temporal convolutional network (TCN), long short-term memory (LSTM), GRU, Bidirectional GRU, and a hybrid TCN–GRU model. Our method consistently shows lower error rates and higher predictive accuracy across all datasets, as measured by four commonly used performance metrics. Full article
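The blended ranking idea (impurity importances from two tree ensembles combined, then filtered by Kendall tau correlation with the target) might be sketched like this; the averaging rule, threshold, and toy data are assumptions, not the paper's exact enhanced-RFE procedure:

```python
import numpy as np

def kendall_tau(x, y):
    """Kendall rank correlation via pairwise concordance (no ties assumed)."""
    n, s = len(x), 0.0
    for i in range(n):
        for j in range(i + 1, n):
            s += np.sign(x[i] - x[j]) * np.sign(y[i] - y[j])
    return 2 * s / (n * (n - 1))

def blended_select(imp_rf, imp_gb, X, y, k=2, tau_min=0.2):
    """Rank features by the mean of two impurity-based importance vectors,
    then keep the top-k whose |Kendall tau| with the target passes tau_min."""
    blended = (np.asarray(imp_rf) + np.asarray(imp_gb)) / 2
    order = np.argsort(blended)[::-1]                      # high importance first
    keep = [f for f in order if abs(kendall_tau(X[:, f], y)) >= tau_min]
    return keep[:k]

# Feature 0 is monotonically related to y; feature 1 is noise (tau = 0).
X = np.array([[1, 2], [2, 5], [3, 1], [4, 4], [5, 3]], dtype=float)
y = np.array([1, 2, 3, 4, 5], dtype=float)
selected = blended_select([0.8, 0.2], [0.7, 0.3], X, y)
```

In practice the importance vectors would come from fitted random forest and gradient boosting models, and SHAP values could then be used as an external check on the surviving features.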
(This article belongs to the Section Financial Markets)

18 pages, 3177 KB  
Article
Ground Type Classification for Hexapod Robots Using Foot-Mounted Force Sensors
by Yong Liu, Rui Sun, Xianguo Tuo, Tiantao Sun and Tao Huang
Machines 2025, 13(10), 900; https://doi.org/10.3390/machines13100900 - 1 Oct 2025
Abstract
In field exploration, disaster rescue, and complex terrain operations, the accuracy of ground type recognition directly affects the walking stability and task execution efficiency of legged robots. To address the problem of terrain recognition in complex ground environments, this paper proposes a high-precision classification method based on single-leg triaxial force signals. The method first employs a one-dimensional convolutional neural network (1D-CNN) module to extract local temporal features, then introduces a long short-term memory (LSTM) network to model long-term and short-term dependencies during ground contact, and incorporates a convolutional block attention module (CBAM) to adaptively enhance the feature responses of critical channels and time steps, thereby improving discriminative capability. In addition, an improved whale optimization algorithm (iBWOA) is adopted to automatically perform global search and optimization of key hyperparameters, including the number of convolution kernels, the number of LSTM units, and the dropout rate, to achieve the optimal training configuration. Experimental results demonstrate that the proposed method achieves excellent classification performance on five typical ground types—grass, cement, gravel, soil, and sand—under varying slope and force conditions, with an overall classification accuracy of 96.94%. Notably, it maintains high recognition accuracy even between ground types with similar contact mechanical properties, such as soil vs. grass and gravel vs. sand. This study provides a reliable perception foundation and technical support for terrain-adaptive control and motion strategy optimization of legged robots in real-world environments. Full article
(This article belongs to the Section Robotics, Mechatronics and Intelligent Machines)

34 pages, 4605 KB  
Article
Forehead and In-Ear EEG Acquisition and Processing: Biomarker Analysis and Memory-Efficient Deep Learning Algorithm for Sleep Staging with Optimized Feature Dimensionality
by Roberto De Fazio, Şule Esma Yalçınkaya, Ilaria Cascella, Carolina Del-Valle-Soto, Massimo De Vittorio and Paolo Visconti
Sensors 2025, 25(19), 6021; https://doi.org/10.3390/s25196021 - 1 Oct 2025
Abstract
Advancements in electroencephalography (EEG) technology and feature extraction methods have paved the way for wearable, non-invasive systems that enable continuous sleep monitoring outside clinical environments. This study presents the development and evaluation of an EEG-based acquisition system for sleep staging, which can be adapted for wearable applications. The system utilizes a custom experimental setup with the ADS1299EEG-FE-PDK evaluation board to acquire EEG signals from the forehead and in-ear regions under various conditions, including visual and auditory stimuli. Afterward, the acquired signals were processed to extract a wide range of features in time, frequency, and non-linear domains, selected based on their physiological relevance to sleep stages and disorders. The feature set was reduced using the Minimum Redundancy Maximum Relevance (mRMR) algorithm and Principal Component Analysis (PCA), resulting in a compact and informative subset of principal components. Experiments were conducted on the Bitbrain Open Access Sleep (BOAS) dataset to validate the selected features and assess their robustness across subjects. The feature set extracted from a single EEG frontal derivation (F4-F3) was then used to train and test a two-step deep learning model that combines Long Short-Term Memory (LSTM) and dense layers for 5-class sleep stage classification, utilizing attention and augmentation mechanisms to mitigate the natural imbalance of the feature set. The results—overall accuracies of 93.5% and 94.7% using the reduced feature sets (94% and 98% cumulative explained variance, respectively) and 97.9% using the complete feature set—demonstrate the feasibility of obtaining a reliable classification using a single EEG derivation, mainly for unobtrusive, home-based sleep monitoring systems. Full article

24 pages, 2681 KB  
Article
A Method for Operation Risk Assessment of High-Current Switchgear Based on Ensemble Learning
by Weidong Xu, Peng Chen, Cong Yuan, Zhi Wang, Shuyu Liang, Yanbo Hao, Jiahao Zhang and Bin Liao
Processes 2025, 13(10), 3136; https://doi.org/10.3390/pr13103136 - 30 Sep 2025
Abstract
In the context of the new power system, high-current switchgear is prone to various faults due to complex operation environments and severe load fluctuations. Among them, an abnormal temperature rise can lead to contact oxidation, insulation aging, and even equipment failure, posing a serious threat to the safety of the distribution system. The operation risk assessment of high-current switchgear has thus become key to ensuring the safety of the distribution system. Ensemble learning, which integrates the advantages of multiple models, provides an effective approach for accurate and intelligent risk assessment. However, existing ensemble learning methods have shortcomings in feature extraction, time-series modeling, and generalization ability. Therefore, this paper first preprocesses and reduces the dimensionality of multi-source data, such as historical load and equipment operation status. Secondly, we propose an operation risk assessment method for high-current switchgear based on ensemble learning: in the first layer, an improved random forest (RF) is used to optimize feature extraction; in the second layer, an improved long short-term memory (LSTM) network with an attention mechanism is adopted to capture time-series dependencies; in the third layer, an adaptive back propagation neural network (ABPNN) model fused with an adaptive genetic algorithm is utilized to correct the previous results, improving the stability of the assessment. Simulation results show that in temperature rise prediction, the proposed algorithm significantly improves the goodness-of-fit indicator with increases of 15.4%, 4.9%, and 24.8% compared to three baseline algorithms, respectively. It can accurately assess the operation risk of switchgear, providing technical support for intelligent equipment operation and maintenance, and safe operation of the system. Full article

22 pages, 3419 KB  
Article
A Small-Sample Prediction Model for Ground Surface Settlement in Shield Tunneling Based on Adjacent-Ring Graph Convolutional Networks (GCN-SSPM)
by Jinpo Li, Haoxuan Huang and Gang Wang
Buildings 2025, 15(19), 3519; https://doi.org/10.3390/buildings15193519 - 30 Sep 2025
Abstract
In some projects, a lack of data makes it difficult to build an accurate prediction model for surface settlement caused by shield tunneling. Existing models often rely on large volumes of data and struggle to maintain accuracy and reliability in shield tunneling. In particular, the spatial dependency between adjacent rings is overlooked. To address these limitations, this study presents a small-sample prediction framework for settlement induced by shield tunneling, using an adjacent-ring graph convolutional network (GCN-SSPM). Gaussian smoothing, empirical mode decomposition (EMD), and principal component analysis (PCA) are integrated into the model, which incorporates spatial topological priors by constructing a ring-based adjacency graph to extract essential features. A dynamic ensemble strategy is further employed to enhance robustness across layered geological conditions. Monitoring data from the Wuhan Metro project is used to demonstrate that GCN-SSPM yields accurate and stable predictions, particularly in zones facing abrupt settlement shifts. Compared to LSTM+GRU+Attention and XGBoost, the proposed model reduces RMSE by over 90% and 75%, respectively, while achieving an R² of about 0.71. Notably, the ensemble assigns over 70% of predictive weight to GCN-SSPM in disturbance-sensitive zones, emphasizing its effectiveness in capturing spatially coupled and nonlinear settlement behavior. The prediction error remains within ±1.2 mm, indicating strong potential for practical applications in intelligent construction and early risk mitigation in complex geological conditions. Full article
(This article belongs to the Section Building Structures)

18 pages, 4522 KB  
Article
PGTFT: A Lightweight Graph-Attention Temporal Fusion Transformer for Predicting Pedestrian Congestion in Shadow Areas
by Jiyoon Lee and Youngok Kang
ISPRS Int. J. Geo-Inf. 2025, 14(10), 381; https://doi.org/10.3390/ijgi14100381 - 28 Sep 2025
Abstract
Forecasting pedestrian congestion in urban back streets is challenging due to “shadow areas” where CCTV coverage is absent and trajectory data cannot be directly collected. To address these gaps, we propose the Peak-aware Graph-attention Temporal Fusion Transformer (PGTFT), a lightweight hybrid model that extends the Temporal Fusion Transformer by integrating a non-parametric attention-based Graph Convolutional Network, a peak-aware Gated Residual Network, and a Peak-weighted Quantile Loss. The model leverages both physical connectivity and functional similarity between roads through a fused adjacency matrix, while enhancing sensitivity to high-congestion events. Using real-world trajectory data from 38 CCTVs in Anyang, South Korea, experiments show that PGTFT outperforms LSTM, TFT, and GCN-TFT across different sparsity settings. Under sparse 5 m neighbor conditions, the model achieved the lowest MAE (0.059) and RMSE (0.102), while under denser 30 m settings it maintained superior accuracy with standard quantile loss. Importantly, PGTFT requires only 1.54 million parameters—about half the size of conventional Transformer–GCN hybrids—while delivering equal or better predictive performance. These results demonstrate that PGTFT is both parameter-efficient and robust, offering strong potential for deployment in smart city monitoring, emergency response, and transportation planning, as well as a practical approach to addressing data sparsity in urban sensing systems. Full article
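A peak-weighted quantile (pinball) loss of the kind this abstract names can be written in a few lines; the threshold and weight multiplier here are hypothetical, since the abstract does not give the paper's exact weighting scheme:

```python
import numpy as np

def peak_weighted_quantile_loss(y_true, y_pred, q=0.5,
                                peak_thresh=0.8, peak_w=2.0):
    """Pinball (quantile) loss with extra weight on high-congestion targets.

    y_true, y_pred: observed / predicted congestion values.
    q: quantile level; peak_thresh: level above which a sample counts as a
    peak; peak_w: multiplier for peak samples (both hypothetical)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    e = y_true - y_pred
    pinball = np.maximum(q * e, (q - 1) * e)       # standard quantile loss
    w = np.where(y_true >= peak_thresh, peak_w, 1.0)
    return float(np.mean(w * pinball))
```

Up-weighting samples above the peak threshold pushes the model to stay accurate exactly where under-prediction is most costly, i.e., during high-congestion events.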

26 pages, 5279 KB  
Article
A Deep Learning-Based Method for Mechanical Equipment Unknown Fault Detection in the Industrial Internet of Things
by Xiaokai Liu, Xiangheng Meng, Lina Ning, Fangmin Xu, Qiguang Li and Chenglin Zhao
Sensors 2025, 25(19), 5984; https://doi.org/10.3390/s25195984 - 27 Sep 2025
Abstract
With the development of the Industrial Internet of Things (IIoT) technology, fault diagnosis has emerged as a critical component of its operational reliability, and machine learning algorithms play a crucial role in fault diagnosis. To achieve good fault diagnosis results, a sufficient number of fault samples must participate in model training. In actual industrial scenarios, however, fault samples are often difficult to obtain, and in some cases none exist at all. For such scenarios, accurately identifying unknown equipment faults is an issue that requires focused attention. This paper presents a normal-sample-based method for detecting unknown faults in mechanical equipment. By leveraging the feature extraction and sample reconstruction capabilities of the autoencoder (AE) network in deep learning, normal samples are used to train the AE network. Whether an input sample is abnormal is determined from the reconstruction error and a threshold value, achieving anomaly detection without relying on fault samples. As input data, the frequency-domain features of normal samples are used to train the AE network, which improves training stability while reducing the number of network parameters and the memory footprint. Moreover, this paper improves on the traditional AE network by incorporating a convolutional neural network (CNN) and a long short-term memory (LSTM) network, enhancing the AE network's ability to extract spatial and temporal features of the input data and, in turn, to extract and recognize abnormal features. In the simulation section, experiments on public datasets collected in factories fully verify the advantages and practicality of this method over other algorithms in detecting unknown faults. Full article
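The reconstruction-error-plus-threshold scheme is easy to demonstrate with a linear stand-in for the autoencoder (projection onto top principal components via SVD); the paper's actual model is a CNN-LSTM autoencoder, so this is only a conceptual sketch:

```python
import numpy as np

rng = np.random.default_rng(1)

def fit_linear_ae(X_normal, n_components=1):
    """Fit a linear 'autoencoder' (top principal components) on normal data.
    Returns (mean, components) used to encode and decode samples."""
    mu = X_normal.mean(axis=0)
    _, _, vt = np.linalg.svd(X_normal - mu, full_matrices=False)
    return mu, vt[:n_components]

def reconstruction_error(X, mu, comps):
    """Encode to the component subspace, decode back, per-sample MSE."""
    Z = (X - mu) @ comps.T
    X_hat = Z @ comps + mu
    return ((X - X_hat) ** 2).mean(axis=1)

# Normal samples lie near the line x2 = 2*x1; train only on normal data.
t = rng.normal(size=(200, 1))
X_normal = np.hstack([t, 2 * t]) + 0.01 * rng.normal(size=(200, 2))
mu, comps = fit_linear_ae(X_normal)

# Threshold from normal reconstruction errors (mean + 3*std is a common
# heuristic; the paper's exact rule is not stated in the abstract).
err_normal = reconstruction_error(X_normal, mu, comps)
thresh = err_normal.mean() + 3 * err_normal.std()

X_test = np.array([[1.0, 2.0],    # on the normal manifold
                   [1.0, -3.0]])  # off the manifold -> anomalous
is_anomaly = reconstruction_error(X_test, mu, comps) > thresh
```

The key property carries over to the deep version: a model trained only on normal data reconstructs normal inputs well and anomalous inputs poorly, so the error itself is the detector.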
(This article belongs to the Section Internet of Things)

32 pages, 13081 KB  
Article
FedIFD: Identifying False Data Injection Attacks in Internet of Vehicles Based on Federated Learning
by Huan Wang, Junying Yang, Jing Sun, Zhe Wang, Qingzheng Liu and Shaoxuan Luo
Big Data Cogn. Comput. 2025, 9(10), 246; https://doi.org/10.3390/bdcc9100246 - 26 Sep 2025
Abstract
With the rapid development of intelligent connected vehicle technology, false data injection (FDI) attacks have become a major challenge in the Internet of Vehicles (IoV). While deep learning methods can effectively identify such attacks, the dynamic, distributed architecture of the IoV and limited computing resources hinder both privacy protection and lightweight computation. To address this, we propose FedIFD, a federated learning (FL)-based detection method for false data injection attacks. The lightweight threat detection model utilizes basic safety messages (BSM) for local incremental training, and the Q-FedCG algorithm compresses gradients for global aggregation. Original features are reshaped using a time window. To ensure temporal and spatial consistency, a sliding average strategy aligns samples before spatial feature extraction. A dual-branch architecture enables parallel extraction of spatiotemporal features: a three-layer stacked Bidirectional Long Short-Term Memory (BiLSTM) captures temporal dependencies, and a lightweight Transformer models spatial relationships. A dynamic feature fusion weight matrix calculates attention scores for adaptive feature weighting. Finally, a differentiated pooling strategy is applied to emphasize critical features. Experiments on the VeReMi dataset show that the accuracy reaches 97.8%. Full article
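The dynamic attention-weighted fusion of the two branches can be sketched as a softmax over per-branch scores; the scoring vectors below stand in for the learned fusion weight matrix and are not the paper's parameterization:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_fuse(temporal, spatial, w_t, w_s):
    """Fuse the temporal (BiLSTM) and spatial (Transformer) feature branches
    with adaptive weights. temporal/spatial: (batch, d) features;
    w_t, w_s: (d,) scoring vectors (hypothetical stand-ins for the learned
    fusion weight matrix). Returns (batch, d) fused features."""
    logits = np.stack([temporal @ w_t, spatial @ w_s], axis=1)  # (batch, 2)
    alpha = softmax(logits, axis=1)                             # branch weights
    return alpha[:, :1] * temporal + alpha[:, 1:] * spatial

# With zero scoring vectors both branches get weight 0.5 (plain average).
fused = attention_fuse(np.ones((1, 3)), 3 * np.ones((1, 3)),
                       np.zeros(3), np.zeros(3))
```

Because the weights are computed per sample, the fusion can lean on temporal evidence for some messages and spatial evidence for others, which is what "dynamic" means here.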
(This article belongs to the Special Issue Big Data Analytics with Machine Learning for Cyber Security)

21 pages, 2265 KB  
Article
Enhancing Wind Power Forecasting in the Spanish Market Through Transformer Neural Networks and Temporal Optimization
by Teresa Oriol, Jenny Cifuentes and Geovanny Marulanda
Sustainability 2025, 17(19), 8655; https://doi.org/10.3390/su17198655 - 26 Sep 2025
Abstract
The increasing penetration of renewable energy, and wind power in particular, requires accurate short-term forecasting to ensure grid stability, reduce operational uncertainty, and facilitate large-scale integration of intermittent resources. This study evaluates Transformer-based architectures for wind power forecasting using hourly generation data from Spain (2020–2024). Time series were segmented into input windows of 12, 24, and 36 h, and multiple model configurations were systematically tested. For benchmarking, LSTM and GRU models were trained under identical protocols. The results show that the Transformer consistently outperformed recurrent baselines across all horizons. The best configuration, using a 36 h input sequence with moderate dimensionality and shallow depth, achieved an RMSE of 370.71 MW, MAE of 258.77 MW, and MAPE of 4.92%, reducing error by a significant margin compared to LSTM and GRU models, whose best performances reached RMSEs above 395 MW and MAPEs above 5.7%. Beyond predictive accuracy, attention maps revealed that the Transformer effectively captured short-term fluctuations while also attending to longer-range dependencies, offering a transparent mechanism for interpreting the contribution of historical information to forecasts. These findings demonstrate the superior performance of Transformer-based models in short-term wind power forecasting, underscoring their capacity to deliver more accurate and interpretable predictions that support the reliable integration of renewable energy into modern power systems. Full article
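Segmenting an hourly series into fixed look-back windows, as done here for the 12, 24, and 36 h inputs, is a standard preprocessing step and can be sketched as:

```python
import numpy as np

def make_windows(series, input_len, horizon=1):
    """Segment an hourly series into (input, target) pairs for forecasting.

    series: 1-D array of hourly generation values.
    input_len: look-back window length (12, 24, or 36 h in the study).
    horizon: steps ahead to predict.
    Returns X of shape (n, input_len) and y of shape (n,)."""
    series = np.asarray(series, dtype=float)
    n = len(series) - input_len - horizon + 1
    X = np.stack([series[i:i + input_len] for i in range(n)])
    y = series[input_len + horizon - 1:input_len + horizon - 1 + n]
    return X, y

# 10 hourly values, 3 h look-back, 1 h ahead -> 7 training pairs.
X, y = make_windows(np.arange(10.0), input_len=3)
```

Longer windows (the study's best was 36 h) give the attention mechanism more history to weigh, at the cost of more computation per forecast.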
(This article belongs to the Section Energy Sustainability)
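The windowing and error metrics the abstract describes can be sketched in pure Python. This is an illustrative sketch only, not the authors' pipeline: the `window` and `horizon` parameters are assumptions standing in for the 12/24/36 h input segments and the forecast target.

```python
import math

def make_windows(series, window=36, horizon=1):
    """Segment an hourly series into (input window, target) pairs,
    mimicking the 12/24/36 h input segmentation described above."""
    pairs = []
    for i in range(len(series) - window - horizon + 1):
        pairs.append((series[i:i + window], series[i + window + horizon - 1]))
    return pairs

def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    # Percentage error; assumes no zero targets (reasonable for aggregate MW output).
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)
```

Any forecaster (Transformer, LSTM, or GRU) trained on the `pairs` produced this way can then be compared with the same three metrics, which is how the RMSE/MAE/MAPE figures quoted above are made directly comparable across models.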
16 pages, 1473 KB  
Article
MASleepNet: A Sleep Staging Model Integrating Multi-Scale Convolution and Attention Mechanisms
by Zhiyuan Wang, Zian Gong, Tengjie Wang, Qi Dong, Zhentao Huang, Shanwen Zhang and Yahong Ma
Biomimetics 2025, 10(10), 642; https://doi.org/10.3390/biomimetics10100642 - 23 Sep 2025
Abstract
With the rapid development of modern industry, people’s living pressures are gradually increasing, and a growing number of individuals are affected by sleep disorders such as insomnia, hypersomnia, and sleep apnea syndrome. Many cardiovascular and psychiatric diseases are also closely related to sleep. The early detection, accurate diagnosis, and treatment of sleep disorders have therefore become an urgent research priority. Traditional manual sleep staging is time-consuming and cumbersome, relies on expert experience, and is prone to subjectivity. To address these issues, researchers have in recent years proposed multiple deep-learning strategies for automated sleep staging. This paper studies MASleepNet, a sleep staging neural network model that integrates multimodal deep features. The model takes multi-channel polysomnography (PSG) signals (including EEG (Fpz-Cz, Pz-Oz), EOG, and EMG) as input and employs a multi-scale convolutional module to extract features at different time scales in parallel. It then adaptively weights and fuses the features from each modality using a channel-wise attention mechanism. The fused temporal features are fed into a Bidirectional Long Short-Term Memory (BiLSTM) sequence encoder, where an attention mechanism identifies key temporal segments, and a fully connected layer produces the final classification. The proposed model was evaluated on the Sleep-EDF dataset (comprising two subsets, Sleep-EDF-78 and Sleep-EDF-20), achieving classification accuracies of 82.56% and 84.53% on the two subsets, respectively. These results suggest that deep models integrating multimodal signals with attention mechanisms can improve automatic sleep staging relative to state-of-the-art methods.
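The channel-wise attention fusion step described above can be illustrated with a minimal pure-Python sketch. This is an assumption-laden simplification, not MASleepNet's learned gating: here each modality's attention score is just its mean activation, whereas the real model learns the scoring from data.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def channel_attention_fuse(modality_feats):
    """Weight each modality's feature vector (e.g. EEG, EOG, EMG) by a
    softmax over a per-modality score, then sum into one fused vector."""
    scores = [sum(f) / len(f) for f in modality_feats]  # illustrative score
    weights = softmax(scores)
    dim = len(modality_feats[0])
    fused = [sum(w * f[d] for w, f in zip(weights, modality_feats))
             for d in range(dim)]
    return fused, weights
```

In the full model, the fused vector for each 30 s epoch would then enter the BiLSTM sequence encoder; the sketch only shows how modality features are reduced to one weighted representation.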
22 pages, 1250 KB  
Article
Entity Span Suffix Classification for Nested Chinese Named Entity Recognition
by Jianfeng Deng, Ruitong Zhao, Wei Ye and Suhong Zheng
Information 2025, 16(10), 822; https://doi.org/10.3390/info16100822 - 23 Sep 2025
Abstract
Named entity recognition (NER) is one of the fundamental tasks in building knowledge graphs. In some domain-specific corpora, the text descriptions exhibit limited standardization, and some entity structures are nested. Existing entity recognition methods suffer from word-matching noise and from difficulty distinguishing different entity labels for the same character in sequence label prediction. This paper proposes a span-based feature-reuse stacked bidirectional long short-term memory (BiLSTM) nested named entity recognition (SFRSN) model, which recasts sequence-prediction entity recognition as a problem of entity span suffix category classification. First, character feature embeddings are generated through bidirectional encoder representations from transformers (BERT). Second, a feature-reuse stacked BiLSTM is proposed to obtain deep context features while alleviating deep-network degradation. Third, span features are obtained through a dilated convolutional neural network (DCNN), and a single-tail selection function is introduced to obtain the classification feature of the entity span suffix, reducing the number of trainable parameters. Fourth, a global-feature gated attention mechanism is proposed, integrating span features and span suffix classification features to achieve span suffix classification. Experimental results on four Chinese domain-specific datasets demonstrate the effectiveness of the approach: SFRSN achieves micro-F1 scores of 83.34% on OntoNotes, 73.27% on Weibo, 96.90% on Resume, and 86.77% on the supply chain management dataset, improvements of up to 1.55%, 4.94%, 2.48%, and 3.47% over state-of-the-art baselines, respectively. These results confirm the model's effectiveness in handling nested entities and entity label ambiguity.
(This article belongs to the Section Artificial Intelligence)
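The core reformulation in the abstract above, treating NER as classification over candidate spans rather than per-token sequence labels, can be sketched in a few lines. This is a generic span-enumeration sketch under stated assumptions (a `max_len` cap and a hypothetical `gold` span-to-category mapping), not the SFRSN architecture itself.

```python
def enumerate_spans(tokens, max_len=4):
    """Enumerate all candidate spans up to max_len tokens.
    Because spans overlap freely, nested entities coexist naturally,
    unlike in per-token BIO labeling where one token gets one tag."""
    spans = []
    for start in range(len(tokens)):
        for end in range(start + 1, min(start + max_len, len(tokens)) + 1):
            spans.append((start, end))
    return spans

def spans_to_labels(spans, gold):
    """Label each candidate span with its entity category ('O' if none).
    `gold` is a hypothetical mapping, e.g. {(0, 2): 'LOC', (0, 3): 'ORG'}."""
    return [gold.get(span, "O") for span in spans]
```

With this framing, a classifier scores every span independently, so "New York" (LOC) nested inside "New York City" (LOC) can both be predicted, which is the nesting problem the sequence-labeling baselines struggle with.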