Search Results (147)

Search Parameters:
Keywords = weighted sparse representation

31 pages, 16969 KB  
Article
Research on Cooperative Vehicle–Infrastructure Perception Integrating Enhanced Point-Cloud Features and Spatial Attention
by Shiyang Yan, Yanfeng Wu, Zhennan Liu and Chengwei Xie
World Electr. Veh. J. 2026, 17(4), 164; https://doi.org/10.3390/wevj17040164 - 24 Mar 2026
Abstract
Vehicle–infrastructure cooperative perception (VICP) extends the sensing capability of single-vehicle systems by integrating multi-source information from onboard and roadside sensors, thereby alleviating limitations in sensing range and field-of-view coverage. However, in complex urban environments, the robustness of such systems—particularly in terms of blind-spot coverage and feature representation—is severely affected by both static and dynamic occlusions, as well as distance-induced sparsity in point cloud data. To address these challenges, a 3D object detection framework incorporating point cloud feature enhancement and spatially adaptive fusion is proposed. First, to mitigate feature degradation under sparse and occluded conditions, a Redefined Squeeze-and-Excitation Network (R-SENet) attention module is integrated into the feature encoding stage. This module employs a dual-dimensional squeeze-and-excitation mechanism operating across pillars and intra-pillar points, enabling adaptive recalibration of critical geometric features. In addition, a Feature Pyramid Backbone Network (FPB-Net) is designed to improve target representation across varying distances through multi-scale feature extraction and cross-layer aggregation. Second, to address feature heterogeneity and spatial misalignment between heterogeneous sensing agents, a Spatial Adaptive Feature Fusion (SAFF) module is introduced. By explicitly encoding the origin of features and leveraging spatial attention mechanisms, the SAFF module enables dynamic weighting and complementary fusion between fine-grained vehicle-side features and globally informative roadside semantics. Extensive experiments conducted on the DAIR-V2X benchmark and a custom dataset demonstrate that the proposed approach outperforms several state-of-the-art methods. 
Specifically, Average Precision (AP) scores of 0.762 and 0.694 are achieved at an IoU threshold of 0.5, while AP scores of 0.617 and 0.563 are obtained at an IoU threshold of 0.7 on the two datasets, respectively. Furthermore, the proposed framework maintains real-time inference performance, highlighting its effectiveness and practical potential for real-world deployment. Full article
(This article belongs to the Section Automated and Connected Vehicles)
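The dual-dimensional squeeze-and-excitation recalibration this abstract describes builds on the standard SE block, which can be sketched as follows. This is an illustrative NumPy sketch of a generic SE gate over per-pillar features, not the authors' R-SENet code; the shapes, reduction ratio, and weight matrices are assumptions.

```python
import numpy as np

def squeeze_excite(features, w1, w2):
    """Standard squeeze-and-excitation recalibration (illustrative sketch).

    features: (N, C) per-pillar feature vectors.
    w1: (C, C//r), w2: (C//r, C) -- the two FC layers of the SE bottleneck.
    """
    # Squeeze: global average over the pillar axis -> one descriptor per channel.
    z = features.mean(axis=0)                      # (C,)
    # Excitation: bottleneck MLP with ReLU, then sigmoid gating per channel.
    s = np.maximum(z @ w1, 0.0)                    # (C//r,)
    gate = 1.0 / (1.0 + np.exp(-(s @ w2)))         # (C,) in (0, 1)
    # Recalibrate: scale each channel by its learned importance.
    return features * gate

rng = np.random.default_rng(0)
N, C, r = 32, 8, 2
x = rng.normal(size=(N, C))
out = squeeze_excite(x, rng.normal(size=(C, C // r)), rng.normal(size=(C // r, C)))
print(out.shape)  # (32, 8)
```

R-SENet applies this kind of gating twice, across pillars and across intra-pillar points; the sketch shows only one of the two squeeze dimensions.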

26 pages, 8183 KB  
Article
Tri-View Adaptive Contrastive for Bundle Recommendation
by Xueli Shen and Han Wu
Electronics 2026, 15(6), 1302; https://doi.org/10.3390/electronics15061302 - 20 Mar 2026
Abstract
Bundle recommendation has gained significant attention, but it faces two key challenges: sparse interaction data and complex user–bundle (UB), user–item (UI), and bundle–item (BI) relations. Recent work uses multi-view contrastive learning, yet current frameworks rely on fixed-weight fusion that ignores view-specific importance and suffers from gradient suppression on sparse data. We propose TriadCBR, a tri-view adaptive contrastive learning architecture for bundle recommendation. It uses a simplified GCN to learn view-specific representations and a Mixture-of-Experts (MoE) module to generate personalized fusion weights, addressing the limitations of fixed-weight fusion. TriadCBR further incorporates a fine-grained contrastive module integrating InfoNCE, DCL, and Barlow Twins. This combination effectively mitigates gradient vanishing from invalid negatives and minimizes cross-view feature redundancy. To handle data sparsity, we design a Difficulty-Aware BPR (DA-BPR) with curriculum augmentation to dynamically refine the ranking trajectory. Extensive experiments on Youshu, iFashion, and NetEase demonstrate that TriadCBR achieves statistically significant improvements over state-of-the-art baselines, boosting Recall and NDCG by an average of 3.61%, with 9 of 12 metric–dataset combinations reaching statistical significance, validating the robustness of its dynamic fusion and adaptive optimization. Full article
(This article belongs to the Special Issue Data Mining and Recommender Systems)
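Of the three contrastive objectives the abstract combines, InfoNCE is the standard one; a minimal single-anchor version can be sketched as below. This is an illustrative NumPy sketch of the generic InfoNCE loss, not TriadCBR's implementation; the temperature value and cosine similarity are assumptions.

```python
import numpy as np

def info_nce(anchor, positive, negatives, tau=0.2):
    """InfoNCE loss for one anchor (illustrative sketch).

    anchor, positive: (D,) embeddings; negatives: (K, D).
    Returns -log( exp(s+/tau) / (exp(s+/tau) + sum_k exp(s-_k/tau)) ).
    """
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    s_pos = cos(anchor, positive) / tau
    s_neg = np.array([cos(anchor, n) for n in negatives]) / tau
    logits = np.concatenate([[s_pos], s_neg])
    # Numerically stable log-sum-exp.
    m = logits.max()
    return -(s_pos - (m + np.log(np.exp(logits - m).sum())))

rng = np.random.default_rng(1)
a = rng.normal(size=4)
loss_easy = info_nce(a, a, rng.normal(size=(8, 4)))   # positive identical to anchor
loss_hard = info_nce(a, -a, rng.normal(size=(8, 4)))  # positive opposite to anchor
print(loss_easy < loss_hard)  # True
```

The loss is small when the positive is well aligned with the anchor and grows as it drifts toward the negatives, which is what drives the views' representations together.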

28 pages, 15951 KB  
Article
Local–Global Aware Concept Bottleneck Models for Interpretable Image Classification
by Ci Liu, Zijie Lin and Chen Tang
Sensors 2026, 26(6), 1833; https://doi.org/10.3390/s26061833 - 14 Mar 2026
Abstract
Concept Bottleneck Models facilitate interpretable image classification by predicting human-understandable concepts prior to class labels. However, when constructed upon CLIP, they exhibit unreliable concept scores stemming from CLIP’s global representation bias and insufficient region-level sensitivity, which severely constrain their effectiveness in sensor-driven applications like remote sensing and medical imaging where localized visual evidence is critical. To mitigate this, we propose the Local–Global Aware Concept Bottleneck Model (LGA-CBM), which improves concept prediction through a training-free refinement pipeline. Building on initial CLIP-derived concept scores, LGA-CBM incorporates three key components: a Dual Masking Guided Concept Score Refinement (DMCSR) module that exploits attention weights to strengthen region–concept alignment; a Local-to-Global Concept Reidentification (L2GCR) strategy to harmonize local and global activations; and a Similar Concepts Correction Mechanism (SCCM) integrating Grounding DINO for fine-grained disambiguation. A sparse linear layer then maps the refined concepts to class labels, enabling highly interpretable classification with minimal concept usage. Experiments across six benchmark datasets demonstrate that LGA-CBM consistently achieves state-of-the-art performance in both accuracy and interpretability, producing explanations that align closely with human cognition. Full article
(This article belongs to the Special Issue AI for Emerging Image-Based Sensor Applications)
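The two-stage prediction a CLIP-based concept bottleneck model performs, cosine concept scores followed by a sparse linear read-out, can be sketched as below. This is a generic CBM sketch, not LGA-CBM's refined pipeline (the DMCSR, L2GCR, and SCCM refinements are omitted); the embedding sizes and the hand-set sparse weights are assumptions.

```python
import numpy as np

def concept_bottleneck(img_emb, concept_embs, w_sparse):
    """Generic CBM prediction (illustrative): cosine similarity between the
    image embedding and each concept embedding gives concept scores; a sparse
    linear layer maps scores to class logits, so every prediction is
    explained by a handful of concepts."""
    def unit(v):
        return v / np.linalg.norm(v, axis=-1, keepdims=True)
    scores = unit(concept_embs) @ unit(img_emb)      # (num_concepts,)
    return scores, w_sparse @ scores                 # class logits

rng = np.random.default_rng(9)
img = rng.normal(size=8)
concepts = rng.normal(size=(5, 8))
w = np.zeros((2, 5))
w[0, [0, 2]] = 1.0      # class 0 explained by concepts 0 and 2 only
w[1, [1, 4]] = 1.0      # class 1 explained by concepts 1 and 4 only
scores, logits = concept_bottleneck(img, concepts, w)
print(scores.shape, logits.shape)
```

The sparsity of the final layer is what makes the classifier interpretable: each class logit is a sum of a few named concept scores.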

27 pages, 14900 KB  
Article
TreeDGS: Aerial Gaussian Splatting for Distant DBH Measurement
by Belal Shaheen, Minh-Hieu Nguyen, Bach-Thuan Bui, Shubham, Tim Wu, Michael Fairley, Matthew Zane, Michael Wu and James Tompkin
Remote Sens. 2026, 18(6), 867; https://doi.org/10.3390/rs18060867 - 11 Mar 2026
Abstract
Aerial remote sensing efficiently surveys large areas, but accurate direct object-level measurement remains difficult in complex natural scenes. Advancements in 3D computer vision, particularly radiance field representations such as NeRF and 3D Gaussian splatting, can improve reconstruction fidelity from posed imagery. Nevertheless, direct aerial measurement of important attributes like tree diameter at breast height (DBH) remains challenging. Trunks in aerial forest scans are distant and sparsely observed in image views; at typical operating altitudes, stems may span only a few pixels. Under these constraints, conventional reconstruction methods produce inaccurate breast-height trunk geometry. TreeDGS is an aerial image reconstruction method that uses 3D Gaussian splatting as a continuous scene representation for trunk measurement. After SfM–MVS initialization and Gaussian optimization, we extract a dense point set from the Gaussian field using RaDe-GS’s depth-aware cumulative-opacity integration and associate each sample with a multi-view opacity reliability score. Then, we isolate trunk points and estimate DBH using opacity-weighted solid-circle fitting. Evaluated on 10 plots with field-measured DBH, TreeDGS reaches 4.79 cm RMSE (about 2.6 pixels at this GSD) and outperforms a LiDAR baseline (7.66 cm RMSE). This shows that TreeDGS can enable accurate, low-cost aerial DBH measurement. Full article
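The weighted circle-fitting step at the core of the DBH estimate can be sketched with a standard algebraic (Kåsa) fit. This is an illustrative stand-in for the paper's opacity-weighted solid-circle fitting, not the authors' code; the synthetic trunk slice, noise level, and weight range are assumptions.

```python
import numpy as np

def weighted_circle_fit(xy, w):
    """Weighted algebraic (Kasa) circle fit.

    xy: (N, 2) trunk cross-section points; w: (N,) reliability weights.
    Solves  x^2 + y^2 = 2*a*x + 2*b*y + c  in weighted least squares,
    then recovers center (a, b) and radius sqrt(c + a^2 + b^2).
    """
    A = np.column_stack([2 * xy[:, 0], 2 * xy[:, 1], np.ones(len(xy))])
    b = (xy ** 2).sum(axis=1)
    W = np.sqrt(w)[:, None]                         # weight both sides
    sol, *_ = np.linalg.lstsq(W * A, W[:, 0] * b, rcond=None)
    cx, cy, c = sol
    return (cx, cy), np.sqrt(c + cx ** 2 + cy ** 2)

# Synthetic trunk cross-section: radius 0.15 m (DBH = 30 cm) plus noise.
rng = np.random.default_rng(2)
t = rng.uniform(0, 2 * np.pi, 200)
pts = np.column_stack([1.0 + 0.15 * np.cos(t), 2.0 + 0.15 * np.sin(t)])
pts += rng.normal(scale=0.003, size=pts.shape)
center, r = weighted_circle_fit(pts, rng.uniform(0.5, 1.0, 200))
print(round(2 * r * 100, 1), "cm DBH")  # diameter close to 30 cm
```

In the paper the weights come from the multi-view opacity reliability scores, so unreliable Gaussians contribute less to the fitted diameter.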

20 pages, 21647 KB  
Article
Spatial Orthogonal and Boundary-Aware Network for Rotated and Elongated-Target Detection
by Yong Liu, Zhengbiao Jing, Yinghong Chang and Donglin Jing
Algorithms 2026, 19(3), 206; https://doi.org/10.3390/a19030206 - 9 Mar 2026
Abstract
In recent years, the refinement of bounding box representations has emerged as a major research focus in remote sensing. Nevertheless, mainstream detection algorithms typically ignore the disruptive impacts induced by the diverse morphologies and arbitrary orientations of high-aspect-ratio aerial objects throughout model training, thereby giving rise to several critical technical challenges: (1) Anisotropic information distribution: Target features are highly concentrated in one spatial dimension but sparse in the other, with significant feature differences across bounding box parameters, breaking the symmetry of feature distribution. (2) Missing high-quality positive samples: IoU-based assignment strategies fail to adequately capture the symmetric structural characteristics of elongated targets, resulting in incomplete coverage of critical features. (3) Loss function gradient instability: Small deviations in large-aspect-ratio bounding boxes cause drastic loss value fluctuations, as the asymmetric gradient changes hinder stable optimization directions during training. To address the challenges, we propose a Spatial Orthogonal and Boundary-Aware Network (SOBA-Net) for rotated and elongated target detection, leveraging symmetry-aware designs to enhance feature representation. Specifically, spatial staggered convolutions are constructed to fuse local and directional contextual features, effectively modeling long-range symmetric information across multiple spatial scales and reducing background noise interference. Secondly, the designed Symmetric-Constrained Label Assignment (SC-LA) introduces an IoU-weighted function, ensuring high-quality samples with symmetric structural features are classified as positive samples. Ultimately, the designed Gradient Dynamic Equilibrium Loss Function mitigates the problem of unstable gradients associated with high-aspect-ratio objects by enforcing symmetrical gradient regulation across samples with negligible localization deviations. 
Comprehensive evaluations across three representative remote sensing benchmarks (DOTA, UCAS-AOD, and HRSC2016) corroborate the superiority of the proposed symmetry-aware enhancement schemes, which are straightforward to implement and efficient to deploy at inference. Full article
(This article belongs to the Special Issue Advances in Deep Learning-Based Data Analysis)
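Challenge (2) above rests on a simple geometric fact: for high-aspect-ratio boxes, a small localization shift collapses IoU, so fixed IoU thresholds starve elongated targets of positive samples. A minimal axis-aligned IoU computation (the paper works with rotated boxes, used here only to illustrate the aspect-ratio sensitivity) makes this concrete:

```python
import numpy as np

def iou(a, b):
    """Axis-aligned IoU of boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

square = (0, 0, 10, 10)     # aspect ratio 1:1
thin = (0, 0, 100, 1)       # aspect ratio 100:1
shift = 2.0                 # the same 2-unit vertical shift for both
iou_sq = iou(square, (0, shift, 10, 10 + shift))
iou_thin = iou(thin, (0, shift, 100, 1 + shift))
print(round(iou_sq, 2), iou_thin)  # square stays ~0.67, thin box drops to 0.0
```

An IoU-weighted assignment like the paper's SC-LA compensates for exactly this asymmetry when selecting positive samples.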

19 pages, 6143 KB  
Article
Research on Density-Adaptive Feature Enhancement and Lightweight Spectral Fine-Tuning Algorithm for 3D Point Cloud Analysis
by Wenquan Huang, Teng Li, Qing Cheng, Ping Qi and Jing Zhu
Information 2026, 17(2), 184; https://doi.org/10.3390/info17020184 - 11 Feb 2026
Abstract
To address fragile feature representation in sparse regions and detail loss in occluded scenes caused by uneven sampling density in 3D point cloud semantic segmentation on the SemanticKITTI dataset, this article proposes an innovative framework that integrates density-adaptive feature enhancement with lightweight spectral fine-tuning, in which frequency-domain transformations (e.g., the Fast Fourier Transform) are applied to point cloud features to optimize computational efficiency and enhance robustness in sparse regions. The method begins by accurately calculating each point’s local neighborhood density using a KD-tree radius search, subsequently injecting this as an additional feature channel to enable the network’s adaptation to density variations. A density-aware loss function is then employed, dynamically increasing the classification loss weights by approximately 40% in low-density areas to strongly penalize misclassifications and enhance feature robustness for sparse points. Additionally, a multi-view projection fusion mechanism is introduced that projects point clouds onto multiple 2D views, capturing detailed information via mature 2D models. This information is then fused with the original 3D features through backprojection, thereby complementing geometric relationships and texture details to effectively alleviate occlusion artifacts. Experiments on the SemanticKITTI dataset for semantic segmentation show significant performance improvements over the baseline, achieving a Precision of 0.91, Recall of 0.89, and F1-Score of 0.90. In low-density regions, the F1-Score improved from 0.73 to 0.80. Ablation studies highlight the contributions of density feature injection, multi-view fusion, and density-aware loss, which enhance the F1-Score by 3.8%, 2.5%, and 5.0%, respectively.
This framework offers an effective approach for accurate and robust point cloud analysis through optimized density techniques and spectral domain fine-tuning. Full article
(This article belongs to the Section Artificial Intelligence)
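The density-channel and density-aware-weighting steps described above can be sketched as follows. This is an illustrative NumPy sketch: neighbor counting is done brute force here (the paper uses a KD-tree radius search, which computes the same quantity at scale), the 40% boost comes from the abstract, and the radius, threshold, and synthetic clusters are assumptions.

```python
import numpy as np

def local_density(points, radius):
    """Per-point neighbor count within `radius` (brute force stand-in for a
    KD-tree radius search)."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    return (d2 <= radius ** 2).sum(axis=1) - 1   # exclude the point itself

def density_loss_weight(density, low_thresh, boost=0.4):
    """Density-aware weighting: up-weight the classification loss by ~40%
    for points in sparse regions (boost value taken from the abstract)."""
    return np.where(density < low_thresh, 1.0 + boost, 1.0)

rng = np.random.default_rng(3)
dense = rng.normal(0.0, 0.05, size=(50, 3))     # tight cluster
sparse = rng.normal(5.0, 2.0, size=(10, 3))     # scattered far-away points
pts = np.vstack([dense, sparse])
dens = local_density(pts, radius=0.2)
w = density_loss_weight(dens, low_thresh=5)
print(w[:50].mean(), w[50:].mean())  # dense points ~1.0, sparse points ~1.4
```

In the full pipeline the density value is also appended as an extra input channel, so the network itself can condition on sampling density.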

19 pages, 15356 KB  
Article
Enhanced UWB-FMCW-SAR RFI Suppression via Joint Time–Frequency LRSR-TTV and Coherence Factor Weighting
by Wenjie Li, Haibo Tang, Yuchen Luan, Fubo Zhang and Longyong Chen
Electronics 2026, 15(4), 735; https://doi.org/10.3390/electronics15040735 - 9 Feb 2026
Abstract
This study addresses the challenge of suppressing radio frequency interference (RFI) in ultra-wideband (UWB) synthetic aperture radar (SAR) operating within complex electromagnetic environments, and proposes an innovative time–frequency signal extraction method. The proposed approach integrates a low-rank and sparse representation (LRSR) model in the time–frequency domain with a time total variation (TTV) constraint. The core contributions are twofold: (1) constructing a time–frequency LRSR model of the frequency-modulated continuous-wave (FMCW) signal, and (2) incorporating spectral continuity as a prior via TTV regularization into a joint low-rank sparse optimization framework. This effectively reduces the aliasing of RFI components into the target components caused by improper hyperparameters, which is particularly pronounced under low signal-to-interference-plus-noise ratio (SINR) conditions. To enhance robustness, the incoherence of interference across frequency bands is exploited, and a sub-band coherence factor (CF) weighting technique is introduced to further suppress RFI residues in the image domain. Experimental results demonstrate that the proposed method significantly outperforms existing robust principal component analysis (RPCA)-based techniques, offering a more adaptive and robust solution for RFI mitigation in UWB SAR systems. Full article
(This article belongs to the Special Issue Recent Advances and Applications of Radar Signal Processing)
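The coherence-factor weighting exploits the fact that a real target stays phase-coherent across frequency sub-bands while RFI residues do not. A minimal per-pixel version of the standard coherence factor can be sketched as below; the sub-band count and synthetic phasors are assumptions, and the paper applies this per image pixel across its sub-band reconstructions.

```python
import numpy as np

def coherence_factor(subband_pixels):
    """Sub-band coherence factor for one image pixel.

    subband_pixels: (N,) complex values of the same pixel reconstructed from
    N frequency sub-bands. A coherent target gives CF -> 1; incoherent RFI
    residue gives CF -> 1/N, so multiplying the image by CF suppresses it.
    """
    num = np.abs(subband_pixels.sum()) ** 2
    den = len(subband_pixels) * (np.abs(subband_pixels) ** 2).sum()
    return num / den

rng = np.random.default_rng(5)
N = 16
target = np.full(N, 1.0 + 0.0j)                     # identical phase in every band
rfi = np.exp(1j * rng.uniform(0, 2 * np.pi, N))     # random phase per band
print(coherence_factor(target), coherence_factor(rfi) < 0.5)
```

Weighting the fused image by this factor is what removes the RFI residues that survive the LRSR-TTV separation.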

27 pages, 6439 KB  
Article
Contrastive–Transfer-Synergized Dual-Stream Transformer for Hyperspectral Anomaly Detection
by Lei Deng, Jiaju Ying, Qianghui Wang, Yue Cheng and Bing Zhou
Remote Sens. 2026, 18(3), 516; https://doi.org/10.3390/rs18030516 - 5 Feb 2026
Abstract
Hyperspectral anomaly detection (HAD) aims to identify pixels that significantly differ from the background without prior knowledge. While deep learning-based reconstruction methods have shown promise, they often suffer from limited feature representation, inefficient training cycles, and sensitivity to imbalanced data distributions. To address these challenges, this paper proposes a novel contrastive–transfer-synergized dual-stream transformer for hyperspectral anomaly detection (CTDST-HAD). The framework integrates contrastive learning and transfer learning within a dual-stream architecture, comprising a spatial stream and a spectral stream, which are pre-trained separately and synergistically fine-tuned. Specifically, the spatial stream leverages general visual and hyperspectral-view datasets with adaptive elastic weight consolidation (EWC) to mitigate catastrophic forgetting. The spectral stream employs a variational autoencoder (VAE) enhanced with the RossThick–LiSparseR (R-L) physical-kernel-driven model for spectrally realistic data augmentation. During fine-tuning, spatial and spectral features are fused for pixel-level anomaly detection, with focal loss addressing class imbalance. Extensive experiments on nine real hyperspectral datasets demonstrate that CTDST-HAD outperforms state-of-the-art methods in detection accuracy and efficiency, particularly in complex backgrounds, while maintaining competitive inference speed. Full article
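The focal loss used here to handle the extreme anomaly/background imbalance is the standard one (Lin et al.): well-classified examples are down-weighted by a factor (1 - p_t)^gamma. A minimal binary version, with the usual default gamma and alpha values as assumptions:

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss.

    p: predicted probability of the positive (anomaly) class; y: 0/1 label.
    (1 - p_t)^gamma shrinks the contribution of easy, confident examples so
    the rare anomaly pixels dominate the gradient.
    """
    p_t = np.where(y == 1, p, 1 - p)
    alpha_t = np.where(y == 1, alpha, 1 - alpha)
    return -alpha_t * (1 - p_t) ** gamma * np.log(np.clip(p_t, 1e-12, None))

# An easy background pixel contributes far less than a hard anomaly pixel.
easy = focal_loss(np.array([0.05]), np.array([0]))   # confident and correct
hard = focal_loss(np.array([0.05]), np.array([1]))   # confident but wrong
print(float(easy), float(hard))
```

With gamma = 2 the easy example's loss is suppressed by (1 - 0.95)^2 = 0.0025, several orders of magnitude below the hard example's.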

22 pages, 6723 KB  
Article
An Enhanced SegNeXt with Adaptive ROI for a Robust Navigation Line Extraction in Multi-Growth-Stage Maize Fields
by Yuting Zhai, Zongmei Gao, Jian Li, Yang Zhou and Yanlei Xu
Agriculture 2026, 16(3), 367; https://doi.org/10.3390/agriculture16030367 - 4 Feb 2026
Abstract
Navigation line extraction is essential for visual navigation in agricultural machinery, yet existing methods often perform poorly in complex environments due to challenges such as weed interference, broken crop rows, and leaf adhesion. To enhance the accuracy and robustness of crop row centerline identification, this study proposes an improved segmentation model based on SegNeXt with integrated adaptive region of interest (ROI) extraction for multi-growth-stage maize row perception. Improvements include constructing a Local module via pooling layers to refine contour features of seedling rows and enhance complementary information across feature maps. A multi-scale fusion attention (MFA) module is also designed for adaptive weighted fusion during decoding, improving detail representation and generalization. Additionally, Focal Loss is introduced to mitigate background dominance and strengthen learning from sparse positive samples. An adaptive ROI extraction method was also developed to dynamically focus on navigable regions, thereby improving efficiency and localization accuracy. The proposed model achieves a segmentation accuracy of 95.13% and an IoU of 93.86%, with a processing speed of 27 frames per second (fps) on GPU and 16.8 fps on an embedded Jetson TX2 platform. This performance meets the real-time requirements for agricultural machinery operations. This study offers an efficient and reliable perception solution for vision-based navigation in maize fields. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)
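Going from a segmentation mask to a navigation line is typically a least-squares fit through per-row centroids; the sketch below illustrates that post-processing step. It is a generic stand-in, not the paper's exact pipeline; the mask geometry, row width, and fitting model (a straight line) are assumptions.

```python
import numpy as np

def centerline_from_mask(mask):
    """Fit a crop-row centerline to a binary segmentation mask.

    mask: (H, W) 0/1 array. For each image row, take the centroid column of
    the crop pixels, then fit col = a*row + b by least squares.
    """
    rows, cols = [], []
    for r in range(mask.shape[0]):
        c = np.flatnonzero(mask[r])
        if c.size:
            rows.append(r)
            cols.append(c.mean())
    a, b = np.polyfit(rows, cols, 1)
    return a, b

# Synthetic slanted crop row, 3 px wide, following col ~ 0.5*row + 10.
mask = np.zeros((40, 50), dtype=int)
for r in range(40):
    c = int(round(0.5 * r + 10))
    mask[r, c - 1:c + 2] = 1
a, b = centerline_from_mask(mask)
print(round(a, 2), round(b, 1))  # close to 0.5 and 10
```

Restricting the fit to an adaptive ROI, as the paper does, simply limits which rows and columns feed into this kind of estimate.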

24 pages, 10940 KB  
Article
A Few-Shot Object Detection Framework for Remote Sensing Images Based on Adaptive Decision Boundary and Multi-Scale Feature Enhancement
by Lijiale Yang, Bangjie Li, Dongdong Guan and Deliang Xiang
Remote Sens. 2026, 18(3), 388; https://doi.org/10.3390/rs18030388 - 23 Jan 2026
Abstract
Given the high cost of acquiring large-scale annotated datasets, few-shot object detection (FSOD) has emerged as an increasingly important research direction. However, existing FSOD methods face two critical challenges in remote sensing images (RSIs): (1) features of small targets within remote sensing images are incompletely represented due to extremely small scale and cluttered backgrounds, which weakens discriminability and leads to significant detection degradation; (2) unified classification boundaries fail to handle the distinct confidence distributions between well-sampled base classes and sparsely sampled novel classes, leading to ineffective knowledge transfer. To address these issues, we propose TS-FSOD, a Transfer-Stable FSOD framework with two key innovations. First, the proposed detector integrates a Feature Enhancement Module (FEM) leveraging hierarchical attention mechanisms to alleviate small target feature attenuation, and an Adaptive Fusion Unit (AFU) utilizing spatial-channel selection to strengthen target feature representations while mitigating background interference. Second, a Dynamic Temperature-scaling Learnable Classifier (DTLC) employs separate learnable temperature parameters for base and novel classes, combined with difficulty-aware weighting and dynamic adjustment, to adaptively calibrate decision boundaries for stable knowledge transfer. Experiments on the DIOR and NWPU VHR-10 datasets show that TS-FSOD achieves competitive or superior performance compared to state-of-the-art methods, with improvements of up to 4.30% mAP, particularly excelling in 3-shot and 5-shot scenarios. Full article
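The class-wise temperature idea behind the DTLC can be sketched with a softmax whose logits are divided by a per-class temperature. This is an illustrative sketch only: in the paper the temperatures are learned and combined with difficulty-aware weighting, whereas here the values are fixed by hand.

```python
import numpy as np

def temp_scaled_probs(logits, is_novel, t_base=1.0, t_novel=2.0):
    """Class-wise temperature scaling (sketch of the DTLC idea: base and
    novel classes get separate temperatures; the values here are
    illustrative stand-ins for learned parameters).

    logits: (C,); is_novel: (C,) boolean mask of novel classes.
    """
    temps = np.where(is_novel, t_novel, t_base)
    z = logits / temps
    e = np.exp(z - z.max())          # numerically stable softmax
    return e / e.sum()

logits = np.array([2.0, 1.0, 2.0, 1.0])
novel = np.array([False, False, True, True])
p = temp_scaled_probs(logits, novel)
print(p.round(3))
```

Classes 0 and 2 share the same raw logit, but the higher novel-class temperature softens class 2's score, which is the calibration effect that keeps sparsely sampled novel classes from being drowned out by overconfident base-class boundaries.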

22 pages, 7186 KB  
Article
Multi-Frequency GPR Image Fusion Based on Convolutional Sparse Representation to Enhance Road Detection
by Liang Fang, Feng Yang, Yuanjing Fang and Junli Nie
J. Imaging 2026, 12(1), 52; https://doi.org/10.3390/jimaging12010052 - 22 Jan 2026
Abstract
Single-frequency ground penetrating radar (GPR) systems are fundamentally constrained by a trade-off between penetration depth and resolution, alongside issues like narrow bandwidth and ringing interference. To break this limitation, we have developed a multi-frequency data fusion technique grounded in convolutional sparse representation (CSR). The proposed methodology involves spatially registering multi-frequency GPR signals and fusing them via a CSR framework, where the convolutional dictionaries are derived from simulated high-definition GPR data. Extensive evaluation using information entropy, average gradient, mutual information, and visual information fidelity demonstrates the superiority of our method over traditional fusion approaches (e.g., weighted average, PCA, 2D wavelets). Tests on simulated and real data confirm that our CSR-based fusion successfully synergizes the deep penetration of low frequencies with the fine resolution of high frequencies, leading to substantial gains in GPR image clarity and interpretability. Full article
(This article belongs to the Section Image and Video Processing)
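Two of the fusion-quality metrics named above, information entropy and average gradient, are standard image statistics and can be sketched directly. This is a generic NumPy sketch (8-bit grey levels and the synthetic test images are assumptions), not the paper's evaluation code.

```python
import numpy as np

def entropy(img, bins=256):
    """Shannon entropy of the grey-level histogram (higher = more information)."""
    hist, _ = np.histogram(img, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

def average_gradient(img):
    """Mean local gradient magnitude (higher = sharper detail)."""
    gx = np.diff(img.astype(float), axis=1)[:-1, :]
    gy = np.diff(img.astype(float), axis=0)[:, :-1]
    return np.sqrt((gx ** 2 + gy ** 2) / 2).mean()

rng = np.random.default_rng(6)
flat = np.full((64, 64), 128)                      # no detail: entropy and gradient 0
textured = rng.integers(0, 256, size=(64, 64))     # rich detail: entropy near 8 bits
print(entropy(flat), entropy(textured) > 7.0)
```

A fused GPR image that scores higher on both metrics than either single-frequency input is, by these criteria, carrying more of the combined information.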

37 pages, 1432 KB  
Article
MDM-GANSA: A Multi-Distribution Generative Shilling Attack for Recommender Systems
by Quanqiang Zhou, Xiaoyue Zhang and Xi Zhao
Information 2026, 17(1), 77; https://doi.org/10.3390/info17010077 - 12 Jan 2026
Abstract
Shilling attacks pose a significant threat to collaborative filtering recommender systems. However, fake user profiles generated by mainstream attack models often lack diversity and realism. Furthermore, the static noise strategies and statistical dependency modeling used in advanced frameworks like the Multi-Distribution Mixture Generative Adversarial Network (MDM-GAN) are ill-suited for high-dimensional, sparse attack scenarios. To address these challenges, we propose MDM-GANSA, a specialized attack model tailored for shilling attacks. First, it replaces the static mixture with a dynamic adaptive noise strategy by incorporating a weight predictor network. This network dynamically adjusts the weights of multiple noise sources based on the current training state, generating more diverse user latent representations. Second, it employs an autoencoder for data-driven dependency modeling, replacing the traditional statistical method. This allows the model to learn and generate profiles with inherent logical dependencies directly from genuine data. Consequently, it enhances the realism of the generated fake user profiles in terms of both statistical properties and internal logic. Additionally, the model utilizes an optimized two-stage generative architecture and fine-grained loss constraints to ensure training stability and high-quality outputs. Experimental results on two public datasets demonstrate that MDM-GANSA significantly outperforms various baseline models in both attack effectiveness and stealthiness. This study provides a concrete implementation for building a shilling-attack generation model targeting collaborative filtering recommender systems, and it also offers a feasible pathway for adapting general-purpose deep generative models to specialized security-oriented scenarios. Full article
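The dynamic adaptive noise strategy can be sketched as a softmax gate over several noise sources, driven by a state vector. Everything concrete here is an assumption for illustration: the gate is a single hypothetical linear layer standing in for the paper's weight-predictor network, and the three noise distributions are arbitrary examples.

```python
import numpy as np

def mix_noise(state, gate_w, noise_sources):
    """Dynamic adaptive noise mixing (sketch): a weight predictor maps the
    current training state to softmax weights over several noise sources,
    replacing a fixed static mixture. `gate_w` stands in for the predictor
    network's parameters (hypothetical single linear layer)."""
    logits = state @ gate_w                      # (K,) one logit per source
    w = np.exp(logits - logits.max())
    w /= w.sum()                                 # softmax mixing weights
    z = sum(wk * nk for wk, nk in zip(w, noise_sources))
    return w, z

rng = np.random.default_rng(8)
state = rng.normal(size=6)                       # e.g. a summary of training stats
gate_w = rng.normal(size=(6, 3))                 # 3 candidate noise distributions
sources = [rng.normal(size=16), rng.uniform(-1, 1, 16), rng.laplace(size=16)]
w, z = mix_noise(state, gate_w, sources)
print(w.round(3), z.shape)
```

Because the weights depend on the training state, the generator's latent noise distribution can shift over the course of training instead of staying fixed.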

16 pages, 2092 KB  
Article
Bidirectional Temporal Attention Convolutional Networks for High-Performance Network Traffic Anomaly Detection
by Feng Wang, Yufeng Huang and Yifei Shi
Information 2026, 17(1), 61; https://doi.org/10.3390/info17010061 - 9 Jan 2026
Abstract
Deep learning-based network traffic anomaly detection, particularly using Recurrent Neural Networks (RNNs), often struggles with high computational overhead and difficulties in capturing long-range temporal dependencies. To address these limitations, this paper proposes a Bidirectional Temporal Attention Convolutional Network (Bi-TACN) for robust and efficient network traffic anomaly detection. Specifically, dilated causal convolutions with expanding receptive fields and residual modules are employed to capture multi-scale temporal patterns while effectively mitigating the vanishing gradient. Furthermore, a bidirectional structure integrated with Efficient Channel Attention (ECA) is designed to adaptively weight contextual features, preventing sparse attack indicators from being overwhelmed by dominant normal traffic. A Softmax-based classifier then leverages these refined representations to execute high-performance anomaly detection. Extensive experiments on the NSL-KDD and UNSW-NB15 datasets demonstrate that Bi-TACN achieves average accuracies of 88.51% and 82.5%, respectively, significantly outperforming baseline models such as Bi-TCN and Bi-GRU in terms of both precision and convergence speed. Full article
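The dilated causal convolutions at the heart of the TCN backbone can be sketched in a few lines: the output at time t sees only x[t], x[t-d], x[t-2d], ..., and stacking layers with d = 1, 2, 4, ... grows the receptive field exponentially. This is a generic 1D sketch, not Bi-TACN's layers; the kernel and dilation values are assumptions.

```python
import numpy as np

def dilated_causal_conv(x, kernel, dilation):
    """1D dilated causal convolution via left-only zero padding, so no output
    ever depends on future inputs."""
    k = len(kernel)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])       # left-pad only => causal
    return np.array([
        sum(kernel[j] * xp[t + pad - j * dilation] for j in range(k))
        for t in range(len(x))
    ])

x = np.zeros(16)
x[8] = 1.0                                        # unit impulse at t = 8
y = dilated_causal_conv(x, kernel=np.array([1.0, 1.0]), dilation=4)
print(np.flatnonzero(y))  # the impulse influences t = 8 and t = 12, never t < 8
```

The bidirectional structure in the paper runs this kind of stack in both temporal directions and fuses the two, so context from both sides of a flow record is available.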

20 pages, 15504 KB  
Article
O-Transformer-Mamba: An O-Shaped Transformer-Mamba Framework for Remote Sensing Image Haze Removal
by Xin Guan, Runxu He, Le Wang, Hao Zhou, Yun Liu and Hailing Xiong
Remote Sens. 2026, 18(2), 191; https://doi.org/10.3390/rs18020191 - 6 Jan 2026
Abstract
Although Transformer-based and state-space models (e.g., Mamba) have demonstrated impressive performance in image restoration, they remain deficient in remote sensing image dehazing. Transformer-based models tend to distribute attention evenly, making it difficult for them to handle the uneven distribution of haze. While Mamba excels at modeling long-range dependencies, it lacks fine-grained spatial awareness of complex atmospheric scattering. To overcome these limitations, we present a new O-shaped dehazing architecture that combines a Sparse-Enhanced Self-Attention (SE-SA) module with a Mixed Visual State Space Model (Mix-VSSM), balancing haze-sensitive details in remote sensing images with long-range context modeling. The SE-SA module introduces a dynamic soft masking mechanism that adaptively adjusts attention weights based on the local haze distribution, enabling the network to more effectively focus on severely degraded regions while suppressing redundant responses. Furthermore, the Mix-VSSM enhances global context modeling by combining sequential processing of 2D perception with local residual information. This design mitigates the loss of spatial detail in the standard VSSM and improves the feature representation of haze-degraded remote sensing images. Thorough experiments verify that our O-shaped framework outperforms existing methods on several benchmark datasets. Full article
(This article belongs to the Special Issue Deep Learning for Remote Sensing Image Enhancement)
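A soft mask that biases attention toward degraded regions can be sketched as an additive term on the attention logits. This is an illustrative sketch of the general mechanism, not the SE-SA module itself; the per-position "haze" scores and all shapes are assumptions.

```python
import numpy as np

def soft_masked_attention(q, k, v, mask_scores):
    """Self-attention with a dynamic additive soft mask: per-position scores
    (here standing in for estimated haze severity) are added to the logits,
    so attention concentrates on degraded regions while low-scored positions
    are softly suppressed rather than hard-masked out."""
    logits = q @ k.T / np.sqrt(q.shape[-1]) + mask_scores[None, :]
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    attn = e / e.sum(axis=-1, keepdims=True)
    return attn @ v, attn

rng = np.random.default_rng(10)
n, d = 6, 4
q, k, v = (rng.normal(size=(n, d)) for _ in range(3))
haze = np.array([0.0, 0.0, 5.0, 0.0, 0.0, 0.0])   # position 2 heavily degraded
out, attn = soft_masked_attention(q, k, v, haze)
print(attn[:, 2].mean() > attn[:, 0].mean())
```

Because the mask enters before the softmax, its effect is graded: strongly degraded positions draw most of the attention mass without zeroing out the rest.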

26 pages, 1071 KB  
Article
FC-SBAAT: A Few-Shot Image Classification Approach Based on Feature Collaboration and Sparse Bias-Aware Attention in Transformers
by Min Wang, Chengyu Yang, Lin Sha, Jiaqi Li and Shikai Tang
Symmetry 2026, 18(1), 95; https://doi.org/10.3390/sym18010095 - 5 Jan 2026
Abstract
Few-shot classification aims to generalize from very limited samples, providing an effective solution for data-scarce scenarios. From a symmetry viewpoint, an ideal few-shot classifier should be invariant to class permutations and treat support and query features in a balanced manner, preserving intra-class cohesion while enlarging inter-class separation in the embedding space. However, existing methods often violate this symmetry because prototypes are estimated from few noisy samples, which induces asymmetric representations and task-dependent biases under complex inter-class relations. To address this, we propose FC-SBAAT, a Feature Collaboration and Sparse Bias-Aware Attention Transformer framework that explicitly leverages symmetry in feature collaboration and prototype construction. First, we enhance symmetric interactions between support and query samples in both attention and contrastive subspaces and adaptively fuse these complementary representations via learned weights. Second, we refine prototypes by symmetrically aggregating intra-class features with learned importance weights, improving prototype quality while maintaining intra-class symmetry and increasing inter-class discrepancy. For matching, we introduce a Sparse Bias-Aware Attention Transformer that corrects asymmetric task bias through bias-aware attention with a low computational overhead. Extensive experiments show that FC-SBAAT achieves 55.71% and 73.87% accuracy for 1-shot and 5-shot tasks on MiniImageNet and 70.37% and 83.86% on CUB, outperforming prior methods. Full article
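The prototype-refinement step above builds on the standard prototypical-network baseline: class prototypes as (optionally weighted) means of support embeddings, with queries assigned to the nearest prototype. A minimal sketch with uniform weights (the paper learns per-sample importance weights instead; the synthetic clusters are assumptions):

```python
import numpy as np

def prototypes(support, labels, weights=None):
    """Class prototypes as weighted means of support embeddings. Uniform
    weights reduce to the standard prototypical-network mean; the abstract's
    refinement step corresponds to learning these weights."""
    classes = np.unique(labels)
    if weights is None:
        weights = np.ones(len(support))
    return classes, np.stack([
        np.average(support[labels == c], axis=0, weights=weights[labels == c])
        for c in classes
    ])

def classify(query, protos):
    """Nearest-prototype assignment by Euclidean distance."""
    d = np.linalg.norm(protos[None, :, :] - query[:, None, :], axis=2)
    return d.argmin(axis=1)

rng = np.random.default_rng(7)
sup = np.vstack([rng.normal(0, 0.1, (5, 4)), rng.normal(3, 0.1, (5, 4))])
lab = np.array([0] * 5 + [1] * 5)
classes, protos = prototypes(sup, lab)
pred = classify(np.array([[0.1, 0, 0, 0], [3, 3, 2.9, 3]]), protos)
print(pred)  # [0 1]
```

Down-weighting noisy support samples when averaging is exactly what protects the prototype from the few-sample estimation bias the abstract describes.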