Search Results (1,546)

Search Parameters:
Keywords = bi-directional long short-term memory

18 pages, 3893 KB  
Article
A Method for Asymmetric Fault Location in HVAC Transmission Lines Based on the Modal Amplitude Ratio
by Bin Zhang, Shihao Yin, Shixian Hui, Mingliang Yang, Yunchuan Chen and Ning Tong
Energies 2026, 19(2), 411; https://doi.org/10.3390/en19020411 (registering DOI) - 14 Jan 2026
Abstract
To address the issues of insensitivity to high-impedance ground faults and difficulty in identifying reflected wavefronts in single-ended traveling-wave fault location methods for asymmetric ground faults in high-voltage AC transmission lines, this paper proposes a single-ended fault location method based on the modal amplitude ratio and deep learning. First, based on the dispersion characteristics of traveling waves, an approximate formula is derived between the fault distance and the amplitude ratio of the sum of the initial transient voltage traveling-wave 1-mode and 2-mode to 0-mode at the measurement point. Simulation verifies that the fault distance x from the measurement point at the line head is unaffected by transition resistance and fault inception angle, and that a nonlinear positive correlation exists between the distance x and the modal amplitude ratio. The multi-scale wavelet modal maximum ratio of the sum of 1-mode and 2-mode to 0-mode is used to characterize the amplitude ratio. This ratio serves as the input for a Residual Bidirectional Long Short-Term Memory (BiLSTM) network, which is optimized using the Dung Beetle Optimizer (DBO). The DBO-Res-BiLSTM model fits the nonlinear mapping between the fault distance x and the amplitude ratio. Simulation results demonstrate that the proposed method achieves high location accuracy. Furthermore, it remains robust against variations in fault type, location, transition resistance, and inception angle. Full article

17 pages, 3529 KB  
Article
Study on Multimodal Sensor Fusion for Heart Rate Estimation Using BCG and PPG Signals
by Jisheng Xing, Xin Fang, Jing Bai, Luyao Cui, Feng Zhang and Yu Xu
Sensors 2026, 26(2), 548; https://doi.org/10.3390/s26020548 - 14 Jan 2026
Abstract
Continuous heart rate monitoring is crucial for early cardiovascular disease detection. To overcome the discomfort and limitations of ECG in home settings, we propose a multimodal temporal fusion network (MM-TFNet) that integrates ballistocardiography (BCG) and photoplethysmography (PPG) signals. The network extracts temporal features from BCG and PPG signals through temporal convolutional networks (TCNs) and bidirectional long short-term memory networks (BiLSTMs), respectively, achieving cross-modal dynamic fusion at the feature level. First, bimodal features are projected into a unified dimensional space through fully connected layers. Subsequently, a cross-modal attention weight matrix is constructed for adaptive learning of the complementary correlation between BCG mechanical vibration and PPG volumetric flow features. Combined with dynamic focusing on key heartbeat waveforms through multi-head self-attention (MHSA), the model’s robustness under dynamic activity states is significantly enhanced. Experimental validation using a publicly available BCG-PPG-ECG simultaneous acquisition dataset comprising 40 subjects demonstrates that the model achieves excellent performance with a mean absolute error (MAE) of 0.88 BPM in heart rate prediction tasks, outperforming current mainstream deep learning methods. This study provides theoretical foundations and engineering guidance for developing contactless, low-power, edge-deployable home health monitoring systems, demonstrating the broad application potential of multimodal fusion methods in complex physiological signal analysis. Full article
(This article belongs to the Section Biomedical Sensors)

23 pages, 1151 KB  
Article
CNN–BiLSTM–Attention-Based Hybrid-Driven Modeling for Diameter Prediction of Czochralski Silicon Single Crystals
by Pengju Zhang, Hao Pan, Chen Chen, Yiming Jing and Ding Liu
Crystals 2026, 16(1), 57; https://doi.org/10.3390/cryst16010057 - 13 Jan 2026
Abstract
High-precision prediction of the crystal diameter during the growth of electronic-grade silicon single crystals is a critical step for the fabrication of high-quality single crystals. However, the process features high-temperature operation, strong nonlinearities, significant time-delay dynamics, and external disturbances, which limit the accuracy of conventional mechanism-based models. In this study, mechanism-based models denote physics-informed heat-transfer and geometric models that relate heater power and pulling rate to diameter evolution. To address this challenge, this paper proposes a hybrid deep learning model combining a convolutional neural network (CNN), a bidirectional long short-term memory network (BiLSTM), and self-attention to improve diameter prediction during the shoulder-formation and constant-diameter stages. The proposed model leverages the CNN to extract localized spatial features from multi-source sensor data, employs the BiLSTM to capture temporal dependencies inherent to the crystal growth process, and utilizes the self-attention mechanism to dynamically highlight critical feature information, thereby substantially enhancing the model’s capacity to represent complex industrial operating conditions. Experiments on operational production data collected from an industrial Czochralski (Cz) furnace, model TDR-180, demonstrate improved prediction accuracy and robustness over mechanism-based and single data-driven baselines, supporting practical process control and production optimization. Full article
(This article belongs to the Section Inorganic Crystalline Materials)
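Several of the listed abstracts rely on a bidirectional recurrent layer to capture temporal dependencies. The idea of running one pass forward and one pass backward over the same sequence can be sketched as follows. This is only an illustration: a toy tanh recurrent cell stands in for a real LSTM cell, and the function names and weights (`run_cell`, `bidirectional`, `weight_in`, `weight_rec`) are assumptions, not anything taken from the papers.

```python
import math


def run_cell(sequence, weight_in=0.5, weight_rec=0.5):
    """Run a toy tanh recurrent cell (a stand-in for an LSTM cell) over a sequence."""
    hidden = 0.0
    states = []
    for value in sequence:
        hidden = math.tanh(weight_in * value + weight_rec * hidden)
        states.append(hidden)
    return states


def bidirectional(sequence):
    """Pair each time step's forward state with its backward state."""
    forward = run_cell(sequence)
    # Process the reversed sequence, then flip the states back into time order.
    backward = run_cell(sequence[::-1])[::-1]
    return list(zip(forward, backward))


features = bidirectional([0.1, 0.4, -0.2, 0.3])
for step, (fwd, bwd) in enumerate(features):
    print(step, round(fwd, 4), round(bwd, 4))
```

At each time step the forward state depends only on past inputs and the backward state only on future inputs, which is why the concatenated pair sees the whole sequence.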

23 pages, 2063 KB  
Article
A Hybrid LSTM–Attention Model for Multivariate Time Series Imputation: Evaluation on Environmental Datasets
by Ammara Laeeq, Jie Li and Usman Adeel
Mach. Learn. Knowl. Extr. 2026, 8(1), 18; https://doi.org/10.3390/make8010018 - 12 Jan 2026
Abstract
Environmental monitoring systems generate large volumes of multivariate time series data from heterogeneous sensors, including those measuring soil, weather, and air quality parameters. However, sensor malfunctions and transmission failures frequently lead to missing values, compromising the performance of downstream analytical and predictive models. To address this challenge, this study presents a comprehensive and systematic evaluation of a previously proposed hybrid architecture that interleaves Long Short-Term Memory (LSTM) layers with a Multi-Head Attention mechanism in a “sandwiched” setting (LSTM–Attention–LSTM) for robust multivariate data imputation in environmental IoT datasets. The first LSTM layer captures short-term temporal dependencies, the attention layer emphasises long-range relationships among correlated features, and the second LSTM layer re-integrates these enriched representations into a coherent temporal sequence. The model is evaluated using multiple environmental datasets of soil temperature, meteorological (precipitation, temperature, wind speed, humidity), and air quality data across missingness levels ranging from 10% to 90%. Performance is compared against baseline methods, including K-Nearest Neighbour (KNN) and Bidirectional Recurrent Imputation for Time Series (BRITS). Across all datasets, the Hybrid model consistently outperforms baseline methods, achieving MAE reductions exceeding 50% and reaching over 80% in several scenarios, along with RMSE reductions of up to approximately 85%, particularly under moderate to high missingness conditions. An ablation study further examines the contribution of each layer to overall model performance. Results demonstrate that the proposed Hybrid model achieves superior accuracy and robustness across datasets, confirming its effectiveness for environmental sensor data imputation under varying missing data conditions.
(This article belongs to the Section Learning)
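An evaluation across missingness levels, as described in the abstract above, typically hides a known fraction of values and scores the imputer only on the hidden positions. The sketch below shows that protocol under stated assumptions: the series is synthetic, the masking is uniformly random, and a forward-fill imputer is used purely as a placeholder (it is not the LSTM–Attention–LSTM model, KNN, or BRITS). All function names are hypothetical.

```python
import random


def mask_values(series, missing_rate, seed=0):
    """Hide a fixed fraction of entries; return the masked copy and the hidden indices."""
    rng = random.Random(seed)
    count = round(missing_rate * len(series))
    hidden = sorted(rng.sample(range(len(series)), count))
    hidden_set = set(hidden)
    masked = [None if i in hidden_set else v for i, v in enumerate(series)]
    return masked, hidden


def forward_fill(masked, default=0.0):
    """Placeholder imputer: carry the last observed value forward."""
    filled, last = [], default
    for value in masked:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def masked_mae(truth, imputed, hidden):
    """Mean absolute error computed over the hidden positions only."""
    return sum(abs(truth[i] - imputed[i]) for i in hidden) / len(hidden)


truth = [float(i % 7) for i in range(100)]  # synthetic stand-in for sensor data
for rate in (0.1, 0.5, 0.9):
    masked, hidden = mask_values(truth, rate)
    error = masked_mae(truth, forward_fill(masked), hidden)
    print(rate, len(hidden), round(error, 3))
```

Scoring only the hidden positions keeps the observed values, which any imputer reproduces trivially, from diluting the error.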

20 pages, 3746 KB  
Article
Fault Diagnosis and Classification of Rolling Bearings Using ICEEMDAN–CNN–BiLSTM and Acoustic Emission
by Jinliang Li, Haoran Sheng, Bin Liu and Xuewei Liu
Sensors 2026, 26(2), 507; https://doi.org/10.3390/s26020507 - 12 Jan 2026
Viewed by 25
Abstract
Reliable operation of rolling bearings is essential for mechanical systems. Acoustic emission (AE) offers a promising approach for bearing fault detection because of its high-frequency response and strong noise-suppression capability. This study proposes an intelligent diagnostic method that combines an improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN) and a convolutional neural network–bidirectional long short-term memory (CNN–BiLSTM) architecture. The method first applies wavelet denoising to AE signals, then uses ICEEMDAN decomposition followed by kurtosis-based screening to extract key fault components and construct feature vectors. Subsequently, a CNN automatically learns deep time–frequency features, and a BiLSTM captures temporal dependencies among these features, enabling end-to-end fault identification. Experiments were conducted on a bearing acoustic emission dataset comprising 15 operating conditions, five fault types, and three rotational speeds; comparative model tests were also performed. Results indicate that ICEEMDAN effectively suppresses mode mixing (average mixing rate 6.08%), and the proposed model attained an average test-set recognition accuracy of 98.00%, significantly outperforming comparative models. Moreover, the model maintained 96.67% accuracy on an independent validation set, demonstrating strong generalization and practical application potential. Full article
(This article belongs to the Special Issue Deep Learning Based Intelligent Fault Diagnosis)
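Kurtosis-based screening, mentioned in the abstract above, ranks decomposed components by how impulsive they are, since bearing faults tend to produce sharp transients. The following is a minimal sketch of that ranking step only. The ICEEMDAN decomposition is not implemented here, the three components are synthetic, and the top-k selection rule is an assumption, as the abstract does not state the actual criterion.

```python
def kurtosis(signal):
    """Population kurtosis (fourth standardized moment); about 3 for Gaussian data."""
    n = len(signal)
    mean = sum(signal) / n
    variance = sum((x - mean) ** 2 for x in signal) / n
    fourth = sum((x - mean) ** 4 for x in signal) / n
    return fourth / variance ** 2


def screen_components(components, keep=2):
    """Return the indices of the `keep` components with the largest kurtosis."""
    order = sorted(
        range(len(components)),
        key=lambda i: kurtosis(components[i]),
        reverse=True,
    )
    return sorted(order[:keep])


# Synthetic stand-ins for decomposed modes (not real acoustic emission data).
smooth = [(-1.0) ** i for i in range(64)]  # alternating +1/-1, no impulses
impulsive = [5.0 if i % 16 == 0 else 0.1 * (-1.0) ** i for i in range(64)]
ramp = [i / 63 for i in range(64)]

components = [smooth, impulsive, ramp]
print([round(kurtosis(c), 2) for c in components])
print(screen_components(components, keep=1))
```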

15 pages, 3033 KB  
Article
Comparative Study of Different Algorithms for Human Motion Direction Prediction Based on Multimodal Data
by Hongyu Zhao, Yichi Zhang, Yongtao Chen, Hongkai Zhao, Zhuoran Jiang, Mingwei Cao, Haiqing Yang, Yuhang Ding and Peng Li
Sensors 2026, 26(2), 501; https://doi.org/10.3390/s26020501 - 12 Jan 2026
Viewed by 37
Abstract
The accurate prediction of human movement direction plays a crucial role in fields such as rehabilitation monitoring, sports science, and intelligent military systems. Based on plantar pressure and inertial sensor data, this study developed a hybrid deep learning model integrating a Convolutional Neural Network (CNN) and a Bidirectional Long Short-Term Memory (BiLSTM) network to enable joint spatiotemporal feature learning. Systematic comparative experiments involving four distinct deep learning models—CNN, BiLSTM, CNN-LSTM, and CNN-BiLSTM—were conducted to evaluate their convergence performance and prediction accuracy comprehensively. Results show that the CNN-BiLSTM model outperforms the other three models, achieving the lowest RMSE (0.26) and MAE (0.14) on the test set, with an R2 of 0.86, which indicates superior fitting accuracy and generalization ability. The superior performance of the CNN-BiLSTM model is attributed to its ability to effectively capture local spatial features via CNN and model bidirectional temporal dependencies via BiLSTM, thus demonstrating strong adaptability for complex motion scenarios. This work focuses on the optimization and comparison of deep learning algorithms for spatiotemporal feature extraction, providing a reliable framework for real-time human motion prediction and offering potential applications in intelligent gait analysis, wearable monitoring, and adaptive human–machine interaction. Full article
(This article belongs to the Section Intelligent Sensors)
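RMSE, MAE, and R2 recur throughout these abstracts as the reported regression metrics. For readers comparing results across entries, the standard definitions can be sketched as below; the function name and the sample values are illustrative only and are unrelated to any paper's data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute RMSE, MAE, and the coefficient of determination (R2)."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 2.0])
print(metrics)
```

RMSE penalises large errors more heavily than MAE, so a model with a much higher RMSE than MAE is making a few large mistakes rather than many small ones.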

16 pages, 947 KB  
Article
Depression Detection Method Based on Multi-Modal Multi-Layer Collaborative Perception Attention Mechanism of Symmetric Structure
by Shaorong Jiang, Chengjun Xu and Xiuya Fang
Informatics 2026, 13(1), 8; https://doi.org/10.3390/informatics13010008 - 12 Jan 2026
Viewed by 28
Abstract
Depression is a mental illness with hidden characteristics that affects human physical and mental health. In severe cases, it may lead to suicidal behavior (for example, among college students and social groups). Therefore, it has attracted widespread attention. Scholars have developed numerous models and methods for depression detection. However, most of these methods focus on a single modality and do not consider the influence of gender on depression, while the existing models have limitations such as complex structures. To solve this problem, we propose a symmetric-structured, multi-modal, multi-layer cooperative perception model for depression detection that dynamically focuses on critical features. First, the double-branch symmetric structure of the proposed model is designed to account for gender-based variations in emotional factors. Second, we introduce a stacked multi-head attention (MHA) module and an interactive cross-attention module to comprehensively extract key features while suppressing irrelevant information. A bidirectional long short-term memory network (BiLSTM) module enhances depression detection accuracy. To verify the effectiveness and feasibility of the model, we conducted a series of experiments using the proposed method on the AVEC 2014 dataset. Compared with the most advanced HMTL-IMHAFF model, our model improves the accuracy by 0.0308. The results indicate that the proposed framework demonstrates superior performance. Full article

18 pages, 1386 KB  
Article
Long-Term and Short-Term Photovoltaic Power Generation Forecasting Using a Multi-Scale Fusion MHA-BiLSTM Model
by Mengkun Li, Letian Sun and Yitian Sun
Energies 2026, 19(2), 363; https://doi.org/10.3390/en19020363 - 12 Jan 2026
Viewed by 39
Abstract
As the proportion of photovoltaic (PV) power generation continues to increase in power systems, high-precision PV power forecasting has become a critical challenge for smart grid scheduling. Traditional forecasting methods often struggle with accuracy and error propagation, particularly when handling short-term fluctuations and long-term trends. To address these issues, this paper proposes a multi-time scale forecasting model, MHA-BiLSTM, based on Bidirectional Long Short-Term Memory (BiLSTM) and Multi-Head Attention (MHA). The model combines the short-term dependency modeling ability of BiLSTM with the long-term trend capturing ability of the multi-head attention mechanism, effectively addressing both short-term (within 6 h) and long-term (up to 72 h) dependencies in PV power data. The experimental results on a simulated PV dataset demonstrate that the MHA-BiLSTM model outperforms traditional models such as LSTM, BiLSTM, and Transformer in multiple evaluation metrics (e.g., MSE, RMSE, R2), particularly showing stronger robustness and generalization ability in long-term forecasting tasks. The results prove that MHA-BiLSTM effectively improves the accuracy of both short-term and long-term PV power predictions, providing valuable support for future microgrid scheduling, energy storage optimization, and the development of smart energy systems. Full article
(This article belongs to the Section A2: Solar Energy and Photovoltaic Systems)

24 pages, 7954 KB  
Article
Machine Learning-Based Prediction of Maximum Stress in Observation Windows of HOV
by Dewei Li, Zhijie Wang, Zhongjun Ding and Xi An
J. Mar. Sci. Eng. 2026, 14(2), 151; https://doi.org/10.3390/jmse14020151 - 10 Jan 2026
Viewed by 149
Abstract
With advances in deep-sea exploration technologies, utilizing human-occupied vehicles (HOV) in marine science has become widespread. The observation window is a critical component, as its structural strength affects submersible safety and performance. Under load, it experiences stress concentration, deformation, cracking, and catastrophic failure. The observation window will experience different stress distributions in high-pressure environments. The maximum principal stress is the most significant phenomenon that determines the most likely failure of materials in windows of HOV. This study proposes an artificial intelligence-based method to predict the maximum principal stress of observation windows in HOV for rapid safety assessment. Samples were designed, while strain data with corresponding maximum principal stress values were collected under different loading conditions. Three machine learning algorithms were employed for analysis: transformer–CNN-BiLSTM, CNN-LSTM, and Gaussian process regression (GP). Results show that the transformer–CNN-BiLSTM model achieved the highest accuracy, particularly at the point exhibiting the maximum principal stress value. Evaluation metrics, including mean squared error (MSE), mean absolute error (MAE), and root squared residual (RSR), confirmed its superior performance. The proposed hybrid model incorporates a positional encoding layer to enrich input data with locational information and combines the strengths of bidirectional long short-term memory (LSTM), one-dimensional CNN, and transformer–CNN-BiLSTM encoders. This approach effectively captures local and global stress features, offering a reliable predictive tool for health monitoring of submersible observation windows.
(This article belongs to the Section Ocean Engineering)

30 pages, 4543 KB  
Article
Dynamic Risk Assessment of the Coal Slurry Preparation System Based on LSTM-RNN Model
by Ziheng Zhang, Rijia Ding, Wenxin Zhang, Liping Wu and Ming Liu
Sustainability 2026, 18(2), 684; https://doi.org/10.3390/su18020684 - 9 Jan 2026
Viewed by 104
Abstract
As the core technology of clean and efficient utilization of coal, coal gasification technology plays an important role in reducing environmental pollution, improving coal utilization, and achieving sustainable energy development. In order to ensure the safe, stable, and long-term operation of coal gasification plant, aiming to address the strong subjectivity of dynamic Bayesian network (DBN) prior data in dynamic risk assessment, this study takes the coal slurry preparation system—the main piece of equipment in the initial stage of the coal gasification process—as the research object and uses a long short-term memory (LSTM) model combined with a back propagation (BP) neural network model to optimize DBN prior data. To further validate the superiority of the model, a gated recurrent unit (GRU) model was introduced for comparative verification. The mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination are used to evaluate the generalization ability of the LSTM model. The results show that the LSTM model’s predictions are more accurate and stable. Bidirectional inference is performed on the DBN of the optimized coal slurry preparation system to achieve dynamic reliability analysis. Thanks to the forward reasoning of DBN in the coal slurry preparation system, quantitative analysis of the system’s reliability effects is conducted to clearly demonstrate the trend of system reliability over time, providing data support for stable operation and subsequent upgrades. By conducting reverse reasoning, key events and weak links before and after system optimization can be identified, and targeted improvement measures can be proposed accordingly. Full article
(This article belongs to the Special Issue Process Safety and Control Strategies for Urban Clean Energy Systems)

27 pages, 4631 KB  
Article
Multimodal Minimal-Angular-Geometry Representation for Real-Time Dynamic Mexican Sign Language Recognition
by Gerardo Garcia-Gil, Gabriela del Carmen López-Armas and Yahir Emmanuel Ramirez-Pulido
Technologies 2026, 14(1), 48; https://doi.org/10.3390/technologies14010048 - 8 Jan 2026
Viewed by 156
Abstract
Current approaches to dynamic sign language recognition commonly rely on dense landmark representations, which impose high computational cost and hinder real-time deployment on resource-constrained devices. To address this limitation, this work proposes a computationally efficient framework for real-time dynamic Mexican Sign Language (MSL) recognition based on a multimodal minimal angular-geometry representation. Instead of processing complete landmark sets (e.g., MediaPipe Holistic with up to 468 keypoints), the proposed method encodes the relational geometry of the hands, face, and upper body into a compact set of 28 invariant internal angular descriptors. This representation substantially reduces feature dimensionality and computational complexity while preserving linguistically relevant manual and non-manual information required for grammatical and semantic discrimination in MSL. A real-time end-to-end pipeline is developed, comprising multimodal landmark extraction, angular feature computation, and temporal modeling using a Bidirectional Long Short-Term Memory (BiLSTM) network. The system is evaluated on a custom dataset of dynamic MSL gestures acquired under controlled real-time conditions. Experimental results demonstrate that the proposed approach achieves 99% accuracy and 99% macro F1-score, matching state-of-the-art performance while using dramatically fewer features. The compactness, interpretability, and efficiency of the minimal angular descriptor make the proposed system suitable for real-time deployment on low-cost devices, contributing toward more accessible and inclusive sign language recognition technologies.
(This article belongs to the Special Issue Image Analysis and Processing)
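An internal angular descriptor of the kind described above can be computed from three landmarks as the angle at the middle point. The sketch below shows why such a feature is unaffected by where the signer stands or how large they appear in the frame. The landmark coordinates and their names (shoulder, elbow, wrist) are hypothetical, and the paper's actual set of 28 descriptors is not reproduced here.

```python
import math


def internal_angle(a, b, c):
    """Angle in degrees at vertex b, formed by the 2D points a and c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    cosine = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cosine))


def transform(point, scale, dx, dy):
    """Uniformly scale and translate a point, as a camera or position change would."""
    return (scale * point[0] + dx, scale * point[1] + dy)


# Hypothetical landmark positions in image coordinates.
shoulder, elbow, wrist = (0.2, 0.5), (0.4, 0.5), (0.4, 0.2)

original = internal_angle(shoulder, elbow, wrist)
moved = internal_angle(
    *(transform(p, scale=3.0, dx=10.0, dy=-4.0) for p in (shoulder, elbow, wrist))
)
print(round(original, 3), round(moved, 3))
```

Because the angle depends only on the directions of the two limb vectors, a whole arm can be summarised by a few such values instead of every raw keypoint coordinate.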

19 pages, 3298 KB  
Article
Detection of Cadmium Content in Pak Choi Using Hyperspectral Imaging Combined with Feature Selection Algorithms and Multivariate Regression Models
by Yongkuai Chen, Tao Wang, Shanshan Lin, Shuilan Liao and Songliang Wang
Appl. Sci. 2026, 16(2), 670; https://doi.org/10.3390/app16020670 - 8 Jan 2026
Viewed by 98
Abstract
Pak choi (Brassica chinensis L.) has a strong adsorption capacity for the heavy metal cadmium (Cd), which poses a serious threat to human health. Traditional detection methods have drawbacks such as destructiveness, time-consuming processes, and low efficiency. Therefore, this study aimed to construct a non-destructive prediction model for Cd content in pak choi leaves using hyperspectral technology combined with feature selection algorithms and multivariate regression models. Four different cadmium concentration treatments (0 (CK), 25, 50, and 100 mg/L) were established to monitor the apparent characteristics, chlorophyll content, cadmium content, chlorophyll fluorescence parameters, and spectral features of pak choi. Competitive adaptive reweighted sampling (CARS), the successive projections algorithm (SPA), and random frog (RF) were used for feature wavelength selection. Partial least squares regression (PLSR), random forest regression (RFR), the Elman neural network, and bidirectional long short-term memory (BiLSTM) models were established using both full spectra and feature wavelengths. The results showed that high-concentration Cd (100 mg/L) significantly inhibited pak choi growth, leaf Cd content was significantly higher than that in the control group, chlorophyll content decreased by 16.6%, and damage to the PSII reaction centre was aggravated. Among the models, the FD–RF–BiLSTM model demonstrated the best prediction performance, with a determination coefficient of the prediction set (Rp2) of 0.913 and a root mean square error of the prediction set (RMSEP) of 0.032. This study revealed the physiological, ecological, and spectral response characteristics of pak choi under Cd stress. It is feasible to detect leaf Cd content in pak choi using hyperspectral imaging technology, and non-destructive, high-precision detection was achieved by combining chemometric methods. This provides an efficient technical means for the rapid screening of Cd pollution in vegetables and holds important practical significance for ensuring the quality and safety of agricultural products.
(This article belongs to the Section Agricultural Science and Technology)

15 pages, 1386 KB  
Article
Symmetry and Asymmetry Principles in Deep Speaker Verification Systems: Balancing Robustness and Discrimination Through Hybrid Neural Architectures
by Sundareswari Thiyagarajan and Deok-Hwan Kim
Symmetry 2026, 18(1), 121; https://doi.org/10.3390/sym18010121 - 8 Jan 2026
Viewed by 108
Abstract
Symmetry and asymmetry are foundational design principles in artificial intelligence, defining the balance between invariance and adaptability in multimodal learning systems. In audio-visual speaker verification, where speech and lip-motion features are jointly modeled to determine whether two utterances belong to the same individual, these principles govern both fairness and discriminative power. In this work, we analyze how symmetry and asymmetry emerge within a gated-fusion architecture that integrates Time-Delay Neural Networks and Bidirectional Long Short-Term Memory encoders for speech, ResNet-based visual lip encoders, and a shared Conformer-based temporal backbone. Structural symmetry is preserved through weight-sharing across paired utterances and symmetric cosine-based scoring, ensuring verification consistency regardless of input order. In contrast, asymmetry is intentionally introduced through modality-dependent temporal encoding, multi-head attention pooling, and a learnable gating mechanism that dynamically re-weights the contribution of audio and visual streams at each timestep. This controlled asymmetry allows the model to rely on visual cues when speech is noisy, and conversely on speech when lip visibility is degraded, yielding adaptive robustness under cross-modal degradation. Experimental results demonstrate that combining symmetric embedding space design with adaptive asymmetric fusion significantly improves generalization, reducing Equal Error Rate (EER) to 3.419% on the VoxCeleb-2 test dataset without sacrificing interpretability. The findings show that symmetry ensures stable and fair decision-making, while learnable asymmetry enables modality awareness, together forming a principled foundation for next-generation audio-visual speaker verification systems.
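The two ideas contrasted in this abstract, order-independent cosine scoring and a gate that re-weights audio against visual features, can be illustrated in a few lines. This is a sketch under assumptions: the embeddings are made-up vectors, and the gate is driven by a fixed hypothetical logit (`gate_logit`) rather than a learned, per-timestep value as in the paper.

```python
import math


def cosine_score(u, v):
    """Cosine similarity between two embeddings; swapping u and v gives the same score."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def gated_fusion(audio, visual, gate_logit):
    """Blend two feature vectors with a sigmoid gate (gate near 1 favours audio)."""
    gate = 1.0 / (1.0 + math.exp(-gate_logit))
    return [gate * a + (1.0 - gate) * v for a, v in zip(audio, visual)]


# Made-up feature vectors for one time step.
audio = [0.9, 0.1, 0.3]
visual = [0.2, 0.8, 0.4]

clean_speech = gated_fusion(audio, visual, gate_logit=4.0)   # trust audio
noisy_speech = gated_fusion(audio, visual, gate_logit=-4.0)  # trust visual
print(round(cosine_score(clean_speech, audio), 4))
print(round(cosine_score(noisy_speech, visual), 4))
```

The scoring function treats both utterances identically, while the gate deliberately treats the two modalities differently depending on signal quality.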

28 pages, 6394 KB  
Article
Prediction of Blade Root Loads for Wind Turbine Based on RBMO-VMD and TCN-BiLSTM-Attention
by Yifan Liu and Jing Cheng
Mathematics 2026, 14(2), 218; https://doi.org/10.3390/math14020218 - 6 Jan 2026
Viewed by 98
Abstract
To address the challenges associated with wind turbine blade root loads, including nonlinearity, strong coupling effects, high computational complexity, and the limitations of conventional mathematical-physical modeling approaches, this paper proposes a wind turbine blade root load prediction model that integrates Variational Mode Decomposition (VMD) optimized by the Red-billed Blue Magpie Algorithm (RBMO) with a combined Temporal Convolutional Network (TCN), Bidirectional Long Short-Term Memory (BiLSTM), and Attention mechanism. First, the RBMO algorithm optimizes the VMD parameters. VMD then decomposes the data into multiple sub-sequences, which are combined with environmental and operational parameters to form input components for the TCN-BiLSTM-Attention ensemble prediction model. Next, the RBMO algorithm determines the optimal hyperparameter configuration for the combined model. Prediction outputs from each component are then aggregated and reconstructed to yield the final blade root load prediction. Predictions are compared against actual data and against results from other forecasting models. The results demonstrate superior predictive performance for the proposed model, which effectively enhances the accuracy of blade root load prediction for wind turbines.
(This article belongs to the Collection Applied Mathematics for Emerging Trends in Mechatronic Systems)

26 pages, 3117 KB  
Article
C-STEER: A Dynamic Sentiment-Aware Framework for Fake News Detection with Lifecycle Emotional Evolution
by Ziyi Zhen and Ying Li
Informatics 2026, 13(1), 4; https://doi.org/10.3390/informatics13010004 - 5 Jan 2026
Viewed by 290
Abstract
The dynamic evolution of collective emotions across the news dissemination life-cycle is a powerful yet underexplored signal in affective computing. While phenomena like the spread of fake news depend on eliciting specific emotional trajectories, existing methods often fail to capture these crucial dynamic affective cues. Many approaches focus on static text or propagation topology, limiting their robustness and failing to model the complete emotional life-cycle for applications such as assessing veracity. This paper introduces C-STEER (Cycle-aware Sentiment-Temporal Emotion Evolution), a novel framework grounded in communication theory, designed to model the characteristic initiation, burst, and decay stages of these emotional arcs. Guided by Diffusion of Innovations Theory, C-STEER first segments an information cascade into its life-cycle phases. It then operationalizes insights from Uses and Gratifications Theory and Emotional Contagion Theory to extract stage-specific emotional features and model their temporal dependencies using a Bidirectional Long Short-Term Memory (BiLSTM) network. To validate the framework’s descriptive and predictive power, we apply it to the challenging domain of fake news detection. Experiments on the Weibo21 and Twitter16 datasets demonstrate that modeling life-cycle emotion dynamics significantly improves detection performance, achieving F1-macro scores of 91.6% and 90.1%, respectively, outperforming state-of-the-art baselines by margins of 1.6% to 2.4%. This work validates the C-STEER framework as an effective approach for the computational modeling of collective emotion life-cycles.
(This article belongs to the Special Issue Practical Applications of Sentiment Analysis)
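
For readers unfamiliar with the F1-macro metric reported in the abstract above, the following is a minimal sketch of how it is conventionally computed: per-class F1 scores are averaged with equal weight, regardless of class frequency. The function name and the example labels are hypothetical and are not taken from the paper or its datasets.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all labels seen."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical labels for illustration only.
y_true = ["fake", "fake", "real", "real"]
y_pred = ["fake", "real", "real", "real"]
score = macro_f1(y_true, y_pred)
```

In this toy case the "fake" class has F1 = 2/3 and the "real" class has F1 = 0.8, so the macro average is their plain mean; a minority class therefore counts as much as a majority class.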