Search Results (2,618)

Search Parameters:
Keywords = self-attention network

24 pages, 3181 KB  
Article
Distributed Cooperative Self-Localization Algorithm for Multi-UAVs in Aerial Gaming Scenarios
by Qing Liang, Yingzhi Ouyang and Hui Li
Aerospace 2026, 13(7), 574; https://doi.org/10.3390/aerospace13070574 (registering DOI) - 25 Jun 2026
Abstract
Accurate and consistent self-localization is essential for multi-UAV aerial missions in complex dynamic environments. However, communication constraints and heterogeneous sensor reliability variations often lead to cumulative localization errors and degraded robustness in conventional fusion frameworks. To address these challenges, this paper proposes a distributed cooperative localization framework integrating deep temporal feature learning, heterogeneous multi-sensor fusion, and consistency-aware distributed state estimation. First, an LSTM-based staged fusion strategy is designed to integrate VIO, GPS, and UWB measurements for accurate single-UAV localization. Second, a Squeeze-and-Excitation LSTM Self-Attention (SE-LSTM-SA) network is developed to adaptively recalibrate heterogeneous sensor channels and enhance temporal feature extraction under dynamic sensing conditions. Finally, a consistency-aware distributed fusion mechanism based on the Labeled Multi-Bernoulli (LMB) framework is introduced to improve inter-UAV state consistency through iterative local-neighbor information exchange. Experiments conducted on the XTDrone platform demonstrate that the proposed framework achieves superior localization accuracy compared with traditional EKF and conventional LSTM-based methods. Specifically, the proposed method achieves lower RMSE, MAE, and Maximum Prediction Error (MaxPE), while significantly improving global consistency performance. Experimental results demonstrate that the proposed framework provides accurate and consistent localization performance for multi-UAV systems in complex dynamic environments.

27 pages, 1575 KB  
Article
Intelligent Time-Series Warning Method Based on LSTM–Transformer Hybrid Network for Digital Twin Applications in Refining Enterprises
by Tao Xu, Xiang Jin, Lei Liu, Song Zhang, Jianzhou Zhang and Wei Wang
Appl. Syst. Innov. 2026, 9(7), 134; https://doi.org/10.3390/asi9070134 (registering DOI) - 25 Jun 2026
Abstract
This paper proposes an intelligent time-series early warning framework based on a production LSTM–Transformer network for petrochemical refining processes. A cascaded encoder–decoder architecture is designed, where the LSTM extracts local temporal patterns and medium-term memory from noisy industrial data, while the Transformer models global dependencies and cross-unit interactions via multi-head self-attention. An adaptive feature fusion layer bridges the representational gap between the two networks. A multi-stage preprocessing pipeline tailored for refining MES data handles missing values, outliers, and mixed operating conditions. Using 120 variables from five units of a fluid catalytic cracking unit, the framework predicts the regenerator bed temperature up to 8 h (48 steps) ahead. Comparative experiments show that the production LSTM–Transformer achieves a mean MAE of 0.088, a mean RMSE of 0.113, and the lowest median MAPE of 19.91% among all models, outperforming standalone LSTM (MAE 0.095, MAPE 20.85%) and Transformer (MAE 0.088, MAPE 20.49%). Robustness analysis confirms stable performance under strong noise (down to 5 dB) and missing rates up to 50%, with a median MAE of 0.1027 across tags. This work provides an effective, end-to-end predictive early warning solution that balances accuracy, production importance coverage, and industrial robustness, offering a generalizable data-driven paradigm for process industries.
(This article belongs to the Special Issue Autonomous Robotics and Hybrid Intelligent Systems)

32 pages, 2844 KB  
Article
Robust Tilapia Disease Diagnosis Based on Prompt-Enhanced Segment Anything Model and Neuro-Fuzzy Inference
by Yicheng Gao and Guofu Feng
Appl. Sci. 2026, 16(13), 6359; https://doi.org/10.3390/app16136359 (registering DOI) - 25 Jun 2026
Abstract
Diagnosing tilapia diseases in complex aquaculture environments is severely hindered by noisy backgrounds and limited high-quality pathological data. To overcome these bottlenecks, this study presents a two-stage diagnostic framework integrating an enhanced Segment Anything Model (SAM) with an Adaptive Neuro-Fuzzy Inference System (ANFIS). In the first stage, SAM is augmented with a Convolutional Block Attention Module (CBAM) feature adapter and a Region Proposal Network (RPN)-based prompt encoder. This design enables the automated and precise extraction of irregular disease lesions by self-generating spatial prompts, thereby isolating water background noise. In the second stage, clinical color features extracted from the lesion masks are classified using ANFIS. To optimize performance on small-scale datasets, ANFIS parameters are trained via Particle Swarm Optimization (PSO) under a numerically stable One-vs-Rest (OvR) binary cross-entropy loss. Validated on the public dataset “Enhancing Disease Detection in Nile Tilapia”, our method delivers an average segmentation Dice coefficient of 86.2% and a classification accuracy of 93.5%. This hybrid approach demonstrates strong potential as a foundational baseline for the automated monitoring of aquaculture diseases.

31 pages, 2776 KB  
Article
A Multimodal Biomedical Transformer Fusion Network for Disease-Level Rare-Disease-Inheritance Classification Using Ontology-Enriched Text, Metadata, and Gene Associations
by Mahmood A. Mahmood and Khalaf Alsalem
Biomedicines 2026, 14(7), 1439; https://doi.org/10.3390/biomedicines14071439 (registering DOI) - 25 Jun 2026
Abstract
Background/Objectives: Inheritance classification in rare diseases remains challenging because curated knowledge is incomplete, heterogeneous, and imbalanced across inheritance categories. Disease-level inheritance modeling can support knowledge organization, annotation review, and hypothesis generation in rare-disease resources. This paper introduces RareFusion-Net, a multimodal benchmark framework for disease-level inheritance classification, and evaluates whether integrating ontology-enriched disease text, structured epidemiological metadata, and gene-association information improves prediction in curated rare-disease knowledge bases. RareFusion-Net is intended for knowledge modeling, not individual patient diagnosis. Methods: We developed RareFusionBalanced, a gated multimodal fusion model that combines biomedical disease descriptions, structured metadata, and gene-related information using auxiliary supervision. Ontology-enriched disease text was treated as the dominant semantic modality, while tabular and gene modalities were incorporated as complementary evidence when available. Robustness was improved using balanced regularization, selective transformer fine-tuning, dropout, weight decay, label smoothing, early stopping, and prediction aggregation across random seeds. Evaluation included accuracy, macro-F1, micro-F1, macro-AUC, mean average precision, calibration metrics, class-wise analysis, statistical testing, and ablation experiments. Results: RareFusionBalanced achieved 0.7382 test accuracy, 0.6284 macro-F1, 0.7382 micro-F1, 0.9183 macro-AUC, and 0.6686 mean average precision. Calibration was favorable, with an expected calibration error of 0.0395 and a Brier-OVR of 0.0528. The multimodal model slightly outperformed TextOnly-TransformerBalanced, but improvement over the best TF-IDF baseline was not statistically significant. Ablation showed ontology-enriched text as the strongest modality, with gene associations adding complementary value. Conclusions: RareFusion-Net provides a practical benchmark for ontology-aware rare-disease inheritance modeling. Results suggest selective multimodal benefit while highlighting minority-class difficulty, limited statistical superiority, need for external validation, and improved biological interpretability.

21 pages, 3076 KB  
Article
Research on Gas Concentration Prediction Method Based on Decoupling of Temporal Feature and Dynamic Relationship Reconstruction
by Yongle Yan, Yichao Zhao and Jiuwu Hui
Fire 2026, 9(7), 267; https://doi.org/10.3390/fire9070267 (registering DOI) - 24 Jun 2026
Abstract
Accurate multi-channel gas concentration prediction is very important for coal mine safety. However, the dynamic reconstruction of the sensor network often interferes with the input sequence. Existing models face a critical trade-off: channel-independent models are robust to sequence changes but ignore spatial coupling, while channel-dependent models overfit fixed sequences, leading to performance collapse during rearrangements. This paper presents a gas concentration prediction framework based on channel permutation-invariant interaction (CPiRi) to reconcile these limitations. CPiRi employs a spatio-temporal decoupling architecture where a frozen univariate pre-trained encoder independently extracts temporal features to ensure sequence robustness. Subsequently, a permutation-equivariant spatial module utilizes self-attention to model inter-channel gas emission relationships based on data content rather than positional indices. To achieve true permutation invariance, we introduce channel-shuffling regularization during training, forcing the model to learn content-driven relational reasoning. Evaluations on 15 real-world Chinese coal mine datasets demonstrate that CPiRi achieves highly competitive accuracy and consistently outperforms mainstream baselines in both prediction precision and structural adaptability. This study offers a robust technical pathway for gas monitoring in dynamic environments, substantially improving the reliability of intelligent mine safety systems.
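The permutation-equivariance property mentioned in this abstract can be illustrated with a toy example. The sketch below is not the CPiRi implementation: it assumes a single attention head with identity query/key/value projections and made-up channel features, and only checks that shuffling the channels before attention gives the same result as shuffling them afterwards.

```python
import math

def self_attention(rows):
    """Single-head self-attention with identity Q/K/V projections.

    Each row is one sensor channel's feature vector. Identity projections
    are an illustrative simplification, not the paper's architecture.
    """
    dim = len(rows[0])
    outputs = []
    for query in rows:
        # Scaled dot-product scores of this channel against every channel.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in rows]
        # Numerically stable softmax over the scores.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        # Weighted average of the channel vectors.
        outputs.append([sum(w / total * value[j] for w, value in zip(weights, rows))
                        for j in range(dim)])
    return outputs

def permute(rows, order):
    """Reorder rows so that new row i is old row order[i]."""
    return [rows[i] for i in order]

channels = [[1.0, 0.0], [0.0, 2.0], [1.5, -0.5]]  # made-up channel features
order = [2, 0, 1]                                  # hypothetical sensor reordering

shuffled_then_attended = self_attention(permute(channels, order))
attended_then_shuffled = permute(self_attention(channels), order)
```

Because no positional index enters the computation, the two results agree up to floating-point rounding, which is the sense in which attention over channels depends on content rather than channel order.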

19 pages, 7335 KB  
Article
MSA-DET: A Multi-Scale Attention Network with Adaptive Feature Fusion for SAR Ship Detection
by Sai Wan, Zhiyong Tao and Lu Chen
Sensors 2026, 26(13), 3970; https://doi.org/10.3390/s26133970 (registering DOI) - 23 Jun 2026
Abstract
Synthetic aperture radar (SAR) ship detection faces three persistent challenges: coherent speckle noise that obscures target boundaries, heterogeneous background clutter in coastal and harbor scenes, and ship targets whose spatial extent varies by more than an order of magnitude within the same image. To address these issues jointly, this paper proposes MSA-DET, an improved SAR ship detection network built upon YOLOv11. In the backbone, a Multi-Scale Cross-axis Attention module (MSCAttention) runs horizontal and vertical axial attention branches in parallel across multiple receptive-field scales, sharpening feature representations for ship targets that vary widely in size and orientation. In the neck, the standard C3k2 block is redesigned as C3k2_SSA by embedding sparse self-attention, which selectively focuses on the most discriminative spatial tokens while suppressing speckle interference and reducing computational overhead. An Adaptive Spatial Feature Fusion detection head (ASFF) replaces fixed pyramid-level aggregation with learned per-pixel blending weights, resolving gradient conflicts across scales and improving localization consistency for both small and large ships. On the HRSID dataset, MSA-DET achieves an mAP@0.5:0.95 of 63.6% and mAP@0.5 of 88.1%, representing gains of 4.0% and 1.6% over the YOLOv11n baseline; on SSDD, it reaches 69.6% and 97.7%, surpassing the baseline by 7.2% and 2.1%, respectively. These results demonstrate that coordinated multi-stage redesign, rather than isolated module substitution, is an effective strategy for SAR-oriented ship detection. The accuracy gains are accompanied by a moderate increase in model size (8.9 M parameters versus 2.6 M for YOLOv11n) and computational cost (9.6 G FLOPs versus 6.3 G), a trade-off that is justified by the substantial improvement in detection quality.
(This article belongs to the Section Remote Sensors)

30 pages, 4938 KB  
Article
Intelligent Smart Grid Energy Management for EV Charging Stations Using GOA–HMGIGCN
by Mlungisi Ntombela
Algorithms 2026, 19(6), 497; https://doi.org/10.3390/a19060497 (registering DOI) - 22 Jun 2026
Viewed by 144
Abstract
Electric Vehicle Charging Stations (EVCSs) have become increasingly important due to the growing penetration of electric vehicles (EVs) and renewable-based power generation. However, challenges such as fluctuating renewable energy availability, increasing charging demand, power losses, operational cost, and charging delays continue to affect overall grid performance and stability. To address these issues, this study proposes a hybrid Goat Optimization Algorithm–Hierarchical Multi-Granularity Interaction Graph Convolutional Network (GOA–HMGIGCN) framework for intelligent smart grid energy management and EV charging coordination. The proposed framework combines the Goat Optimization Algorithm (GOA) for optimal EVCS placement and charging scheduling with the Hierarchical Multi-Granularity Interaction Graph Convolutional Network (HMGIGCN) for forecasting renewable generation, charging demand, and load variations. The framework was implemented and evaluated in MATLAB/Simulink R2024a using the IEEE 14-bus smart grid test system under varying operating conditions. Simulation results demonstrated that the proposed framework achieved superior performance compared with the Coot Optimization Algorithm–Fractional Backpropagation Physics-Informed Neural Network (COA-FBPINN), Dingo Optimization Algorithm–Convolutional Hypergraph Graph Neural Network (DOA-CHGNN), Self-Feedback Feedforward Artificial Neural Network (SFFANN), Deep Neural Network (DNN), and Golden Jackal Optimization–Attention-Based Probabilistic Convolutional Neural Network (GJO-APCNN) techniques by attaining the lowest operational cost of USD 1561, the highest efficiency of 99.2%, the minimum power loss of 10.6 kW, and the shortest charging time of 32 min. In addition, the proposed framework improved overall grid reliability, confirming its effectiveness for intelligent renewable-integrated smart grid applications.

25 pages, 40725 KB  
Article
A Method for Extracting Sedimentary Outcrops from UAV Oblique Photogrammetry Point Clouds
by Chufan Ren, Chaodong Wu, Yanan Zhang, Cong Lin, Xinyue Niu and Yanan Chu
Sensors 2026, 26(12), 3946; https://doi.org/10.3390/s26123946 (registering DOI) - 21 Jun 2026
Viewed by 240
Abstract
Point-cloud analysis of sedimentary outcrops using Unmanned Aerial Vehicle (UAV) oblique photogrammetry is a crucial approach to sedimentary system characterization, stratigraphic correlation, and petroleum exploration analog studies. In large-scale field settings, however, outcrops are often scattered and fragmented, vegetation and soil cover is extensive, and class imbalance is pronounced. Manual interpretation is labor-intensive, while existing clustering algorithms, conventional machine learning methods, and general-purpose point-cloud segmentation networks struggle to simultaneously ensure geometric fidelity, rare-class recognition, and multi-scale feature integration. To address these challenges, we propose a method for extracting sedimentary outcrop point clouds from field surface point clouds using a UAV oblique photogrammetry acquisition strategy. The core segmentation module of the method, sedimentary cross-scale self-attention network (SedCSA-Net), is an enhanced version of PointNet++ that integrates collaborative improvements across four dimensions: data augmentation, sampling strategy, feature encoding, and loss optimization. Taking the Cretaceous Qingshuihe Formation in the Louzhuangzi area of the southern Junggar Basin as a case study, our experimental results indicate that SedCSA-Net overcomes the natural variability of UAV oblique photogrammetry point clouds (such as shadows, voids, and uneven density), achieving a mean Intersection over Union (mIoU) of 89.51% and an Overall Accuracy (OA) of 96.08%, with an outcrop-class Intersection over Union (IoU) of 86.90%. Attitude measurements derived from segmentation results deviate by less than 3° from manually annotated references, demonstrating that the proposed framework provides an end-to-end, generalizable approach for intelligent segmentation, geometric reconstruction, and attitude extraction of large-scale sedimentary outcrop point clouds.

15 pages, 6283 KB  
Article
Robust Polyurethane Hydrogels Based on Dynamic Disulfide Bonds and Pendant Tertiary Amines with Room-Temperature Self-Healing and pH Responsiveness
by Xia Ding, Bing Yang, Xinyi Si, Lei Ni, Chao Fang and Zhaosheng Hou
Gels 2026, 12(6), 555; https://doi.org/10.3390/gels12060555 (registering DOI) - 20 Jun 2026
Viewed by 82
Abstract
Hydrogels have garnered significant attention due to their tunable structures and broad applicability in biomedical and smart materials. However, achieving a balance between excellent mechanical performance and multifunctionality remains a major challenge. In this study, a series of multifunctional polyurethane hydrogels (PUGs) was developed by integrating dynamic disulfide bonds and pendant tertiary amine groups into poly(ethylene glycol)-based networks using a solvent-exchange method. Structural characterization confirmed the successful formation of a crosslinked porous network. The hydrogels demonstrated remarkable mechanical properties, with PUG–II exhibiting a tensile strength of 448 kPa and an elongation at break of 489%, as well as exceptional compressibility (371 kPa at 90% strain) and fatigue resistance. Meanwhile, the PUGs displayed efficient room-temperature self-healing with a healing efficiency of up to 94.5%. The reversible protonation of tertiary amine groups imparted pronounced pH-responsive swelling behavior, with the equilibrium swelling ratio of PUG–I at pH 2.0 being 5.8 times higher than that at pH 12.0. This study provides a promising strategy for developing PU-based hydrogels that combine robust mechanical performance and multifunctionality, offering potential for advanced smart material applications.

32 pages, 3105 KB  
Review
A Review on Deep State Space Models for Sequential Healthcare Data Prediction
by Wenjie Li, Yongming Xie and Yinglong Dai
Mathematics 2026, 14(12), 2210; https://doi.org/10.3390/math14122210 - 19 Jun 2026
Viewed by 126
Abstract
Sequential data prediction is a crucial area in healthcare. Healthcare data are characterized by non-stationarity, long-range dependence (LRD), and irregular sampling, and modeling these complex temporal features is highly challenging. Recurrent Neural Networks (RNNs) and their variants are limited in learning long-range dependencies (LRDs) due to the inherent issues of vanishing and exploding gradients. Transformers alleviate this limitation through the self-attention mechanism, but their quadratic computational complexity and memory bottleneck limit their scalability on long-sequence healthcare data. In this context, Structured State Space Models (SSMs) have emerged as a promising alternative. Compared with conventional RNNs, they can alleviate the difficulty of modeling LRDs more efficiently, and many modern SSM variants achieve linear-time sequence modeling while reducing the computational burden associated with Transformers. In this review, we provide a formal definition of Healthcare Process Modeling, compare the core theoretical frameworks of RNNs, Transformers, and SSMs, trace the evolution of SSM architectures, including LSSL, S4, S5, Mamba, and their related variants, and provide a comprehensive review of healthcare applications and open challenges. Existing studies suggest that structured SSMs are promising for selected long-sequence healthcare prediction tasks, particularly when computational efficiency and long-context retention are important. With these advantages, they may help alleviate the computational burden in certain healthcare tasks and provide a basis for further exploring the practical application of data-driven healthcare systems in clinical practice.

42 pages, 15288 KB  
Article
A Hybrid Model for Stock Index Forecasting Integrating Adaptive Frequency-Domain Decomposition and Enhanced Transformer Encoder
by Hairong Zheng, Xiaozheng Zeng, Guoyu Hu and Tingting Zhang
Mathematics 2026, 14(12), 2202; https://doi.org/10.3390/math14122202 - 18 Jun 2026
Viewed by 214
Abstract
Stock index price series are composed of superimposed multi-frequency components, including long-term trends, cyclical fluctuations, and stochastic noise. Effectively decoupling these heterogeneous components and modeling them separately is key to improving forecasting accuracy. Existing methods under the “decomposition–prediction” paradigm mostly employ fixed-scale decomposition, and the forecasting models are not specifically adapted to the non-stationary and high-noise characteristics of financial data, resulting in limitations in adaptivity and local dynamic capture. This paper proposes a frequency-aware adaptive multi-scale decomposition Transformer hybrid model (FAMS-Transformer). At the decomposition level, the fast Fourier transform is used to dynamically identify dominant cycles, thereby adaptively decoupling trends and fluctuations, overcoming the limitations of fixed-scale decomposition. At the forecasting level, a lightweight depthwise separable convolution is embedded between the self-attention and feedforward network of the Transformer encoder, enhancing the model’s ability to capture local temporal dynamics and achieving collaborative modeling of global dependencies and local information. Comparative experiments with 15 baseline models including LSTM, Transformer, TimesNet, and FreTS on three representative Chinese market indices (the Shanghai Composite Index, the Shenzhen Component Index, and the Small and Medium Enterprises 100 Index) across four prediction horizons from one step to 15 steps demonstrate that FAMS-Transformer achieves the best forecasting accuracy in all scenarios. The coefficient of determination for 15-step prediction remains stably between 0.730 and 0.928. Moreover, the model still performs well on the S&P 500 dataset. Ablation studies and significance tests further validate the effectiveness of each core module and the statistical significance of the performance improvements.
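The idea of using the Fourier spectrum to pick a dominant cycle and then splitting a series into trend and fluctuation can be sketched as follows. This is an illustrative assumption about how such a step might look, not the FAMS-Transformer code: it uses a plain discrete Fourier transform written with the standard library instead of an FFT routine, a moving average whose window equals the detected period, and a synthetic series. The function names `dominant_period` and `split_trend_fluctuation` are hypothetical.

```python
import cmath
import math

def dominant_period(series):
    """Return the period (in steps) of the strongest non-zero frequency."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]  # drop the zero-frequency term
    best_k, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):
        # k-th DFT coefficient of the centered series.
        coeff = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                    for t, c in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k

def split_trend_fluctuation(series, period):
    """Moving-average trend with a window tied to the period, plus residual."""
    half = max(1, round(period)) // 2
    n = len(series)
    trend = []
    for t in range(n):
        lo, hi = max(0, t - half), min(n, t + half + 1)  # clamp at the edges
        trend.append(sum(series[lo:hi]) / (hi - lo))
    fluctuation = [v - m for v, m in zip(series, trend)]
    return trend, fluctuation

# Synthetic series: slow upward drift plus an 8-step cycle (made-up data).
series = [0.01 * t + math.sin(2 * math.pi * t / 8) for t in range(64)]
period = dominant_period(series)
trend, fluctuation = split_trend_fluctuation(series, period)
```

Tying the averaging window to the detected period is what makes the split adaptive: a series with a different dominant cycle would get a different window without any manual tuning.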

21 pages, 50702 KB  
Article
A Target Tracking Method Based on Frequency and Spatial Information Perception in UAV Vision
by Chenyang Li, Zhiheng Liu and Suiping Zhou
Remote Sens. 2026, 18(12), 2036; https://doi.org/10.3390/rs18122036 - 18 Jun 2026
Viewed by 176
Abstract
Target tracking for Unmanned Aerial Vehicles (UAVs) can be significantly impacted by environmental factors such as lighting variations, background clutter, and target occlusion. To address these challenges, we developed a target tracking method that integrates both frequency-domain and spatial perception capabilities in UAV vision (FSTrack). Specifically: (1) we utilized the Swin Transformer as the core network to extract features from both the template and search images; (2) we introduced a Transformer-based module to enhance both frequency and spatial information, improving tracking accuracy under varying illumination conditions; (3) we designed a spatio-temporal feature fusion module with multiple multi-head self-attention mechanisms to precisely model the tracking state, thus increasing reliability in cluttered and occluded environments; and (4) we created a hybrid loss function to boost accuracy in both classification and regression tasks. Our experimental results on the UAV123, DTB70, and UAVDT datasets show that our approach not only surpasses current state-of-the-art methods in success rates and precision but also operates more swiftly.

24 pages, 15691 KB  
Article
A Joint Fault Diagnosis and Severity Prediction Framework for Rolling Bearings Using PPCA-EMD and 1DCNN-BiGRU
by Wangshen Hao, Chunhui Zhu, Dongliang Zou, Chenyang Li, Shenglin Song and Shilong Zhang
Machines 2026, 14(6), 701; https://doi.org/10.3390/machines14060701 (registering DOI) - 18 Jun 2026
Viewed by 204
Abstract
Rolling bearing fault diagnosis remains challenging due to environmental noise, insufficient information sharing between diagnosis and prediction tasks, and poor model generalization ability. To address these issues, this paper proposes a fault diagnosis and severity prediction method integrating probabilistic principal component analysis (PPCA) and empirical mode decomposition (EMD) with a one-dimensional convolutional neural network (1DCNN) and bidirectional gated recurrent unit (BiGRU). The proposed model consists of two parallel branches for fault diagnosis and fault severity prediction. A self-attention mechanism is integrated into both branches to enhance feature extraction via adaptive feature weighting. In addition, parameter sharing and weighted loss functions are adopted to improve the training efficiency and collaborative learning between the two tasks. PPCA and EMD are employed for signal denoising and reconstruction while preserving fault-related features. Experiments on public datasets and industrial production-line data show that the proposed method improves the fault classification accuracy from 92.43% to 99.71% under different load conditions, while achieving 98.99% accuracy in fault severity prediction. Noise interference tests further demonstrate the effectiveness of the model. A production-line case study further illustrates the feasibility of applying the proposed method to real monitoring signals. These results confirm the effectiveness and practical potential of the proposed method for rolling bearing fault diagnosis and health assessment.

29 pages, 14449 KB  
Article
RUL Prediction of Rotating Machinery: A Multi-Channel Information Fusion Forecasting Framework and GMM Evolution-Based Health Indicator Construction
by Qinqing Fan, Xiaoman Zhang and Xiaochen Zhang
Appl. Sci. 2026, 16(12), 6151; https://doi.org/10.3390/app16126151 - 17 Jun 2026
Viewed by 191
Abstract
To address the challenges of complex multi-channel signal coupling and insufficient long-term temporal dependency characterization in remaining useful life (RUL) prediction of rotating machinery, this paper proposes a multivariate time series forecasting framework integrating multi-channel information fusion and a self-attention gated augmentation unit (SGAU). First, a multilayer perceptron (MLP) explicitly models nonlinear coupling among channels; SGAU replaces the conventional feed-forward network in the Transformer encoder, using multi-head self-attention outputs as gating signals to adaptively regulate feature transformation. Second, multi-channel signals are predicted via this framework, and high-dimensional feature vectors are extracted to construct multi-channel Gaussian mixture models (GMMs). Third, Jensen–Shannon divergence (JSD) quantifies deviations between the target and initial data clusters, and the evolutionary trajectory of the centroid distance is fused with JSD to construct the health indicator (HI). Continuous HI predictions yield the RUL prediction curve. Experiments on a self-designed wind turbine gearbox platform and the XJTU-SY bearing dataset demonstrate that the proposed framework outperforms baseline methods on Mean Square (MS), Root Mean Square (RMS), and Energy metrics, with average error reductions of 6.6% and 12.1% in the horizontal and vertical directions on the gearbox dataset and 20.9% and 32.3% on the bearing dataset, confirming its effectiveness and generalization capability.
(This article belongs to the Section Acoustics and Vibrations)
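The abstract above uses Jensen–Shannon divergence to quantify how far a target data cluster has drifted from an initial one. As a rough illustration only, the sketch below computes JSD between two discrete probability distributions (for example, normalized histograms). This is an assumption made for simplicity: the paper applies JSD to Gaussian mixture models, whose divergence has no closed form and would have to be approximated, and the function names here are hypothetical rather than taken from the paper.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats for discrete distributions.

    Terms with p_i == 0 contribute nothing, by the usual convention.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions.

    JSD(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m), with m = (p + q) / 2.
    The mixture m is positive wherever p or q is positive, so the KL terms
    are always finite.
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical histograms standing in for an initial and a later cluster.
initial = [0.7, 0.2, 0.1]
later = [0.2, 0.3, 0.5]
drift = js_divergence(initial, later)
```

JSD is symmetric, equals zero for identical distributions, and is bounded above by ln 2 when natural logarithms are used, which makes it convenient as a normalized deviation measure.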

25 pages, 28692 KB  
Article
Semi-Supervised Degradation-Aware Learning for All-in-One Weather-Degraded Image Restoration
by Lei Cai, Fang Ruan, Wei Lu, Qi Lin, Huijie Zheng, Wenjie Xiang and Tao Zhu
Electronics 2026, 15(12), 2686; https://doi.org/10.3390/electronics15122686 - 17 Jun 2026
Viewed by 107
Abstract
All-in-one weather-degraded image restoration aims to restore clean images from diverse weather-degraded observations (such as rain, haze, and snow) using a unified model. However, this task remains challenging due to its ill-posed nature and the scarcity of large-scale paired training data. This article develops a novel semi-supervised learning framework, termed Semi-Supervised Degradation-Aware Learning (S2DAL), that adjusts the feature space to align with the unified parameter space for all-in-one adverse weather removal. Specifically, the proposed S2DAL consists of two backbone networks: a Degradation-guided Histogram Transformer (DHformer) for weather-degraded image restoration and a Degradation-guided Convolutional Neural Network (DCNN) for degradation generation. A key component, the Degradation-guided Histogram Transformer (DHT) block, is designed to effectively capture intrinsic image features while suppressing diverse degradation interference through channel shuffling modulation, dynamic-range histogram self-attention, and dual-scale gated feed-forward. Furthermore, a Monte Carlo-based Expectation-Maximization (EM) algorithm is introduced to jointly optimize latent variables and network parameters using both labeled and unlabeled data. Extensive quantitative and qualitative results on synthetic and real-world datasets consistently demonstrate that the proposed S2DAL achieves superior restoration performance compared to multiple state-of-the-art fully supervised and semi-supervised approaches.
(This article belongs to the Topic Computer Vision and Image Processing, 3rd Edition)