Search Results (82)

Search Parameters:
Keywords = global feature extractor network

20 pages, 9200 KB  
Article
A Hybrid Model for Ultrasound Image-Based Breast Cancer Diagnosis Using EfficientNet-V2 and Vision Transformer
by Zainab Qahtan Mohammed, Amel Tuama Alhussainy, Ihsan Salman Jasim and Asraf Mohamed Moubark
Diagnostics 2026, 16(8), 1176; https://doi.org/10.3390/diagnostics16081176 - 15 Apr 2026
Viewed by 413
Abstract
Background/Objectives: Breast cancer remains one of the most serious and common diseases affecting women worldwide. Although ultrasound imaging is effective for detecting abnormalities in dense breast tissue, it has several drawbacks, including the subjective nature of interpretation and the variability introduced by the cognitive biases and experience of the interpreting expert. These factors increase the need for AI-driven models for diagnostic analysis. In this research, we provide a hybrid deep learning-based framework for cancer classification on the breast ultrasound image dataset (‘BUSI dataset’). Methods: The proposed architecture combines a light ViT encoder with an EfficientNetV2-RW-S feature extractor. This combination leverages the complementary strengths of convolutional neural networks (CNNs) and global-reasoning networks (i.e., transformers). EfficientNetV2 captures the fine-grained morphological components of the lesions, edges, and echogenic variations of the tissue, whereas the transformer models the long-range dependencies between the lesions and the surrounding tissues. Results: The proposed hybrid model achieves a classification accuracy of 97.95%, surpassing both the standalone ViT model (89%) and the standalone CNN model (80%). Furthermore, the hybrid architecture reduces the overall self-attention computational complexity by cutting the number of tokens from 197 to 10. This further leads to a substantial decrease in the memory and cost expended during attention. Conclusions: Overall, this study proposes a method for improved diagnostic and computational analysis, suggesting that the proposed architecture is a potential framework for use in contemporary clinical environments. Full article
(This article belongs to the Special Issue The Role of AI in Ultrasound, 2nd Edition)
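The token reduction reported in the abstract above (197 tokens down to 10) can be put in rough numbers. The sketch below only counts query-key score entries per attention head, which grow with the square of the token count; reading 197 as 196 patches plus one class token is an assumption based on the standard ViT layout, not something the abstract states.

```python
def attention_pairs(num_tokens: int) -> int:
    """Number of query-key score entries per head in full self-attention."""
    return num_tokens * num_tokens

# 197 tokens: assumed to be 196 image patches + 1 class token (standard ViT layout).
full_vit = attention_pairs(197)
# 10 tokens: the reduced count reported in the abstract.
reduced = attention_pairs(10)

# Ratio of score-matrix sizes; an illustration, not a measured speed-up.
reduction_factor = full_vit / reduced
print(full_vit, reduced, round(reduction_factor, 2))
```

This counts only the attention score matrix; projection layers and the CNN backbone have their own costs that this estimate does not cover.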

22 pages, 5250 KB  
Article
Hybrid Deep Learning Method for Vibration-Based Gear Fault Diagnosis in Shearer Rocker Arm
by Joshua Fenuku, Hua Ding, Gertrude Selase Gosu, Xiaochun Sun and Ning Li
Electronics 2026, 15(8), 1587; https://doi.org/10.3390/electronics15081587 - 10 Apr 2026
Viewed by 205
Abstract
In underground coal mining, the gear of a shearer’s rocker arm endures extreme stress and environmental fluctuations. Failures in this vital component can pose serious safety hazards, cause prolonged operational downtime, and result in significant financial losses. Therefore, accurate gear fault diagnosis is crucial. However, conventional diagnostic methods often struggle with limited feature extraction and poor performance when dealing with non-stationary, noisy signals typical of this environment. To address these challenges, a hybrid model consisting of Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM) network, and Markov Transition Model (MTM) is proposed. In this framework, the CNN is used to extract both global and local features related to gear fault. A time-distributed feature extractor is then integrated with the LSTM to capture the temporal progression of these features, aiding in effective modeling of fault evolution over time. Finally, the MTM further refines classification by incorporating probabilistic state transition between fault conditions, thereby improving diagnostic stability and robustness under noise. Experimental validation was done using vibration data from the Taizhong Coal Machinery rocker arm test platform and gear data from Southeast University and achieved up to 99.79% accuracy. These results show this proposed method outperformed other advanced diagnostic methods, offering dependable fault diagnosis and strong noise resistance even under extreme noise conditions of −5 dB SNR. Full article
(This article belongs to the Section Computer Science & Engineering)
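One way to picture how a Markov transition model can stabilize per-window classifications is forward filtering over the classifier outputs. The sketch below is a generic illustration, not the authors' MTM: the transition matrix, prior, and probabilities are hypothetical, and treating classifier probabilities as likelihoods is a simplifying assumption.

```python
def markov_smooth(frame_probs, transition, prior):
    """Forward filtering: fuse per-window class probabilities with a transition matrix."""
    n = len(prior)
    belief = list(prior)
    labels = []
    for probs in frame_probs:
        # Predict step: propagate the previous belief through the transition matrix.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
        # Update step: weight by the classifier output and renormalize.
        unnorm = [predicted[j] * probs[j] for j in range(n)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        labels.append(max(range(n), key=lambda j: belief[j]))
    return labels

# Hypothetical two-state example (0 = healthy, 1 = faulty) with "sticky" states.
transition = [[0.95, 0.05], [0.05, 0.95]]
prior = [0.5, 0.5]
frame_probs = [[0.9, 0.1], [0.9, 0.1], [0.4, 0.6], [0.9, 0.1]]

raw = [max(range(2), key=lambda j: p[j]) for p in frame_probs]
smoothed = markov_smooth(frame_probs, transition, prior)
print(raw, smoothed)
```

The isolated, weakly supported flip in the third window is suppressed because the transition matrix makes a one-step state change unlikely.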

26 pages, 2590 KB  
Article
GCA-Net: Geometric-Contextual Alignment Network for Lightweight and Robust Local Feature Extraction in Visual Localization
by Yujuan Deng, Liang Tian, Xiaohui Hou, Xiaoling Zhao, Yonggang Wang, Xin Liu, Xingchao Liu and Chunyuan Liao
Appl. Sci. 2026, 16(7), 3330; https://doi.org/10.3390/app16073330 - 30 Mar 2026
Viewed by 295
Abstract
Lightweight local feature extractors are essential for real-time SLAM. However, they frequently struggle with perceptual aliasing and low localization accuracy in texture-sparse or repetitive environments. This paper introduces the Geometric-Contextual Alignment Network (GCA-Net), a framework designed to address these instabilities through a Geometric-Contextual Alignment (GCA) module. The proposed GCA module integrates global contextual priors into the feature stream. By employing Context-based Feature-wise Linear Modulation (C-FiLM), the network mitigates perceptual aliasing by prioritizing structurally reliable regions. To enhance spatial precision, we incorporate a Depthwise Separable Atrous Spatial Pyramid Pooling (DS-ASPP) stage to expand the Effective Receptive Field (ERF). This design provides robust multi-scale anchoring, which significantly reduces localization jitter under large viewpoint shifts. Extensive evaluations on MegaDepth, ScanNet and HPatches demonstrate that GCA-Net achieves high sub-pixel precision and robust cross-domain generalization. On the MegaDepth benchmark, GCA-Net outperforms the vanilla XFeat by 8.0% in Area Under the Curve (AUC) at a 5° threshold (AUC@5°). Furthermore, it yields a 23.6% relative improvement over the SuperPoint baseline while using compact 64-floating-point (64-f) descriptors. These results indicate that the GCA mechanism helps capture complex spatial structures that typically require much heavier architectures. By balancing matching accuracy with computational efficiency, GCA-Net provides an effective framework for autonomous navigation on edge computing platforms. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
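Feature-wise Linear Modulation, on which the C-FiLM component named above is based, scales and shifts each feature channel with parameters predicted from a conditioning signal. The sketch below shows only the generic FiLM operation on plain lists; how GCA-Net derives the scale and shift from global context is not described in the abstract, so the values here are hypothetical.

```python
def film(features, gamma, beta):
    """Per-channel scale and shift: out[c][i] = gamma[c] * features[c][i] + beta[c]."""
    return [[g * x + b for x in channel] for channel, g, b in zip(features, gamma, beta)]

# Two channels with two spatial positions each (hypothetical values).
features = [[1.0, 2.0], [3.0, 4.0]]
gamma = [2.0, 0.5]   # assumed to come from a context encoder
beta = [0.0, 1.0]    # assumed to come from a context encoder

modulated = film(features, gamma, beta)
print(modulated)
```

A channel with a scale near zero is effectively suppressed, which is one way a context signal could down-weight unreliable regions.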

27 pages, 9112 KB  
Article
MSWKN: Multi-Scale Wavelet Kolmogorov–Arnold Network with Spectral–Spatial and Frequency Domain Optimization for Hyperspectral Crop Classification
by Ziwei Li, Bingjie Liang, Weizhen Zhang, Zhenqiang Xu, Baowei Zhang, Ning Li, Weiran Luo and Jianzhong Guo
Agriculture 2026, 16(7), 740; https://doi.org/10.3390/agriculture16070740 - 27 Mar 2026
Viewed by 402
Abstract
Accurate crop classification provides fundamental data for agricultural resource management and ecological research. Hyperspectral image (HSI) classification is the core technique for achieving precise crop mapping. However, existing models often suffer from excessive parameters, limited robustness under few-shot conditions, and a trade-off between efficiency and robustness. To address these issues, this paper proposes a Multi-Scale Wavelet Kolmogorov–Arnold Network (MSWKN). The model combines four components: a Two-Branch Feature Extractor (TBFE) to capture both spectral correlations and spatial textures; a Channel Cross-Spatial (CCS) module to suppress background clutter and highlight discriminative regions; a group convolution-based Fixed Wavelet Multi-Scale Convolutional Layer (FW-MSCL) that leverages the time–frequency localization of wavelets and learnable linear combinations to enhance robustness against spectral distortion while reducing parameters; and a Fourier-based Transformer encoder to enable global frequency–space modeling. Experiments on the WHU-Hi-HanChuan and WHU-Hi-HongHu hyperspectral crop datasets show that MSWKN achieves high overall accuracy and performs favorably on few-shot categories. With a lower parameter count and fast inference, the model demonstrates a reasonable trade-off between accuracy and computational efficiency. Ablation studies and wavelet kernel comparisons further confirm the contribution of each module and the advantage of the wavelet. The proposed framework provides an efficient and robust solution for fine-grained hyperspectral crop classification. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)

22 pages, 35239 KB  
Article
TBDDQN: Imbalanced Fault Diagnosis for Blast Furnace Ironmaking Process via Transformer–BiLSTM Double Deep Q-Networks
by Jinlong Zheng, Ping Wu, Ruirui Zuo, Xin Su, Yinzhu Liu and Nabin Kandel
Machines 2026, 14(3), 276; https://doi.org/10.3390/machines14030276 - 2 Mar 2026
Viewed by 371
Abstract
The blast furnace ironmaking process (BFIP) is a highly complex and dynamic industrial system where strong spatiotemporal coupling and severe data imbalance pose substantial challenges for fault diagnosis. To address these issues, this study proposes a Transformer–BiLSTM Double Deep Q-Network (TBDDQN) framework for intelligent fault diagnosis. The framework employs a dual-branch architecture that integrates a Transformer-based spatial encoder with a BiLSTM-attention temporal extractor to capture global dependencies and dynamic patterns from multivariate time-series data. To mitigate class imbalance and asymmetric fault costs, a cost-sensitive reinforcement learning scheme based on Double DQN is incorporated, featuring prioritized experience replay and adaptive misclassification penalties. Experiments on real blast furnace datasets show that TBDDQN achieves a macro-averaged precision of 0.970 and a macro-averaged F1-score of 0.929, outperforming conventional CNN, LSTM, and DQN-based baselines. These results demonstrate that TBDDQN offers a robust and interpretable solution for imbalanced industrial fault diagnosis in the BFIP. Full article
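The Double DQN update mentioned above selects the next action with the online network and evaluates it with the target network. The sketch below shows that target computation only; the reward value, discount factor, and Q-values are hypothetical, and the paper's specific cost-sensitive penalty scheme is not reproduced.

```python
def double_dqn_target(reward, gamma, q_online_next, q_target_next, done):
    """Double DQN target: action chosen by the online net, valued by the target net."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]

# Hypothetical cost-sensitive reward: a larger penalty for missing a rare fault class.
reward = -5.0
target = double_dqn_target(
    reward,
    gamma=0.9,
    q_online_next=[1.0, 3.0, 2.0],   # online network picks action 1
    q_target_next=[0.5, 2.0, 4.0],   # target network values action 1 at 2.0
    done=False,
)
print(target)
```

Decoupling selection from evaluation is what distinguishes this from plain DQN, which would have used the target network's own maximum (4.0 here).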

43 pages, 1324 KB  
Article
Explainable Kolmogorov–Arnold Networks for Zero-Shot Human Activity Recognition on TinyML Edge Devices
by Ismail Lamaakal, Chaymae Yahyati, Yassine Maleh, Khalid El Makkaoui and Ibrahim Ouahbi
Mach. Learn. Knowl. Extr. 2026, 8(3), 55; https://doi.org/10.3390/make8030055 - 26 Feb 2026
Cited by 1 | Viewed by 859
Abstract
Human Activity Recognition (HAR) on wearable and IoT devices must jointly satisfy four requirements: high accuracy, the ability to recognize previously unseen activities, strict memory and latency constraints, and interpretable decisions. In this work, we address all four by introducing an explainable Kolmogorov–Arnold Network for Human Activity Recognition (TinyKAN-HAR) with a zero-shot learning (ZSL) module, designed specifically for TinyML edge devices. The proposed KAN replaces fixed activation functions by learnable one-dimensional spline operators applied after linear mixing, yielding compact yet expressive feature extractors whose internal nonlinearities can be directly visualized. On top of the KAN latent space, we learn a semantic projection and cosine-based compatibility function that align sensor features with class-level semantic embeddings, enabling both pure and generalized zero-shot recognition of unseen activities. We evaluate our method on three benchmark datasets (UCI HAR, WISDM, PAMAP2) under subject-disjoint and zero-shot splits. TinyKAN-HAR consistently achieves over 97% macro-F1 on seen classes and over 96% accuracy on unseen activities, with harmonic mean above 96% in the generalized ZSL setting, outperforming CNN, LSTM and Transformer-based ZSL baselines. For explainability, we combine gradient-based attributions, SHAP-style global relevance scores and inspection of the learned spline functions to provide sensor-level, temporal and neuron-level insights into each prediction. After 8-bit quantization and TinyML-oriented optimizations, the deployed model occupies only 145 kB of flash and 26 kB of RAM, and achieves an average inference latency of 4.1 ms (about 0.32 mJ per window) on a Cortex-M4F-class microcontroller, while preserving accuracy within 0.2% of the full-precision model. These results demonstrate that explainable, zero-shot HAR with near state-of-the-art accuracy is feasible on severely resource-constrained TinyML edge devices. 
Full article
(This article belongs to the Section Learning)
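The cosine-based compatibility function and the harmonic mean used in generalized zero-shot evaluation, both mentioned above, can be sketched in a few lines. The class names, embedding vectors, and projected feature below are hypothetical; only the two formulas are standard.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def predict_class(projected_feature, class_embeddings):
    """Pick the class whose semantic embedding is most similar to the projected feature."""
    return max(class_embeddings, key=lambda name: cosine(projected_feature, class_embeddings[name]))

def harmonic_mean(seen_acc, unseen_acc):
    """Generalized zero-shot score: harmonic mean of seen and unseen accuracy."""
    return 2 * seen_acc * unseen_acc / (seen_acc + unseen_acc)

# Hypothetical semantic embeddings; "cycling" plays the role of an unseen class.
class_embeddings = {"walking": [1.0, 0.0, 0.2], "cycling": [0.0, 1.0, 0.1]}
prediction = predict_class([0.1, 0.9, 0.0], class_embeddings)
h_score = harmonic_mean(0.97, 0.96)
print(prediction, h_score)
```

Because unseen classes only need an embedding, not training samples, new activities can be added by extending the dictionary.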

21 pages, 2928 KB  
Article
No Trade-Offs: Unified Global, Local, and Multi-Scale Context Modeling for Building Pixel-Wise Segmentation
by Zhiyu Zhang, Debao Yuan, Yifei Zhou and Renxu Yang
Remote Sens. 2026, 18(3), 472; https://doi.org/10.3390/rs18030472 - 2 Feb 2026
Viewed by 382
Abstract
Building extraction from remote sensing imagery plays a pivotal role in applications such as smart cities, urban planning, and disaster assessment. Although deep learning has significantly advanced this task, existing methods still struggle to strike an effective balance among global semantic understanding, local detail recovery, and multi-scale contextual awareness—particularly when confronted with challenges including extreme scale variations, complex spatial distributions, occlusions, and ambiguous boundaries. To address these issues, we propose TriadFlow-Net, an efficient end-to-end network architecture. First, we introduce the Multi-scale Attention Feature Enhancement Module (MAFEM), which employs parallel attention branches with varying neighborhood radii to adaptively capture multi-scale contextual information, thereby alleviating the problem of imbalanced receptive field coverage. Second, to enhance robustness under severe occlusion scenarios, we innovatively integrate a Non-Causal State Space Model (NC-SSD) with a Densely Connected Dynamic Fusion (DCDF) mechanism, enabling linear-complexity modeling of global long-range dependencies. Finally, we incorporate a Multi-scale High-Frequency Detail Extractor (MHFE) along with a channel–spatial attention mechanism to precisely refine boundary details while suppressing noise. Extensive experiments conducted on three publicly available building segmentation benchmarks demonstrate that the proposed TriadFlow-Net achieves state-of-the-art performance across multiple evaluation metrics, while maintaining computational efficiency—offering a novel and effective solution for high-resolution remote sensing building extraction. Full article

28 pages, 5166 KB  
Article
Hyperspectral Image Classification Using SIFANet: A Dual-Branch Structure Combining CNN and Transformer
by Yuannan Gui, Lu Xu, Dongping Ming, Yanfei Wei and Ming Huang
Remote Sens. 2026, 18(3), 398; https://doi.org/10.3390/rs18030398 - 24 Jan 2026
Viewed by 834
Abstract
The hyperspectral image (HSI) is rich in spectral information and has important applications in the field of ground objects classification. However, HSI data have high dimensions and variable spatial–spectral features, which make it difficult for some models to adequately extract the effective features. Recent studies have shown that fusing spatial and spectral features can significantly improve accuracy by exploiting multi-dimensional correlations. Based on this, this article proposes a spectral integration and focused attention network (SIFANet) with a two-branch structure. SIFANet captures the local spatial features and global spectral dependencies through the parallel-designed spatial feature extractor (SFE) and spectral sequence Transformer (SST), respectively. A cross-module attention fusion (CMAF) mechanism dynamically integrates features from both branches before final classification. Experiments on the Salinas dataset and Xiong’an hyperspectral dataset show that the overall accuracy on these two datasets is 99.89% and 99.79%, which is higher than the other models compared. The proposed method also had the lowest standard deviation of category accuracy and optimal computational efficiency metrics, demonstrating robust spatial–spectral feature integration for improved classification. Full article

21 pages, 1300 KB  
Article
CAIC-Net: Robust Radio Modulation Classification via Unified Dynamic Cross-Attention and Cross-Signal-to-Noise Ratio Contrastive Learning
by Teng Wu, Quan Zhu, Runze Mao, Changzhen Hu and Shengjun Wei
Sensors 2026, 26(3), 756; https://doi.org/10.3390/s26030756 - 23 Jan 2026
Viewed by 341
Abstract
In complex wireless communication environments, automatic modulation classification (AMC) faces two critical challenges: the lack of robustness under low-signal-to-noise ratio (SNR) conditions and the inefficiency of integrating multi-scale feature representations. To address these issues, this paper proposes CAIC-Net, a robust modulation classification network that integrates a dynamic cross-attention mechanism with a cross-SNR contrastive learning strategy. CAIC-Net employs a dual-stream feature extractor composed of ConvLSTM2D and Transformer blocks to capture local temporal dependencies and global contextual relationships, respectively. To enhance fusion effectiveness, we design a Dynamic Cross-Attention Unit (CAU) that enables deep bidirectional interaction between the two branches while incorporating an SNR-aware mechanism to adaptively adjust the fusion strategy under varying channel conditions. In addition, a Cross-SNR Contrastive Learning (CSCL) module is introduced as an auxiliary task, where positive and negative sample pairs are constructed across different SNR levels and optimized using InfoNCE loss. This design significantly strengthens the intrinsic noise-invariant properties of the learned representations. Extensive experiments conducted on two standard datasets demonstrate that CAIC-Net achieves competitive classification performance at moderate-to-high SNRs and exhibits clear advantages in extremely low-SNR scenarios, validating the effectiveness and strong generalization capability of the proposed approach. Full article
(This article belongs to the Section Communications)
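The InfoNCE loss named above rewards a high similarity between an anchor and its positive relative to a set of negatives. The sketch below computes it for a single anchor from precomputed similarities; the temperature and similarity values are hypothetical, and how CAIC-Net builds its cross-SNR pairs is not shown.

```python
import math

def info_nce(sim_pos, sim_negs, temperature=0.1):
    """InfoNCE for one anchor: -log(exp(s_pos/t) / sum over all candidates of exp(s/t))."""
    logits = [sim_pos / temperature] + [s / temperature for s in sim_negs]
    m = max(logits)  # subtract the maximum for numerical stability
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denom - logits[0]

# Hypothetical similarities between an anchor and samples at other SNR levels.
well_aligned = info_nce(0.9, [0.1, 0.0])
poorly_aligned = info_nce(0.2, [0.1, 0.0])
print(well_aligned, poorly_aligned)
```

The loss shrinks as the positive pair's similarity rises above the negatives, which is the pressure toward noise-invariant representations described in the abstract.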

30 pages, 8453 KB  
Article
PBZGNet: A Novel Defect Detection Network for Substation Equipment Based on Gradual Parallel Branch Architecture
by Mintao Hu, Yang Zhuang, Jiahao Wang, Yaoyi Hu, Desheng Sun, Dawei Xu and Yongjie Zhai
Sensors 2026, 26(1), 300; https://doi.org/10.3390/s26010300 - 2 Jan 2026
Viewed by 702
Abstract
As power systems expand and grow smarter, the safe and steady operation of substation equipment has become a prerequisite for grid reliability. In cluttered substation scenes, however, existing deep learning detectors still struggle with small targets, multi-scale feature fusion, and precise localization. To overcome these limitations, we introduce PBZGNet, a defect-detection network that couples a gradual parallel-branch backbone, a zoom-fusion neck, and a global channel-recalibration module. First, BiCoreNet is embedded in the feature extractor: dual-core parallel paths, reversible residual links, and channel recalibration cooperate to mine fault-sensitive cues. Second, cross-scale ZFusion and Concat-CBFuse are dynamically merged so that no scale loses information; a hierarchical composite feature pyramid is then formed, strengthening the representation of both complex objects and tiny flaws. Third, an attention-guided decoupled detection head (ADHead) refines responses to obscured and minute defect patterns. Finally, within the Generalized Focal Loss framework, a quality rating scheme suppresses background interference while distribution regression sharpens the localization of small targets. Across all scales, PBZGNet clearly outperforms YOLOv11. Its lightweight variant, PBZGNet-n, attains 83.9% mAP@50 with only 2.91 M parameters and 7.7 GFLOPs—9.3% above YOLOv11-n. The full PBZGNet surpasses the current best substation model, YOLO-SD, by 7.3% mAP@50, setting a new state of the art (SOTA). Full article
(This article belongs to the Special Issue Deep Learning Based Intelligent Fault Diagnosis)

19 pages, 2577 KB  
Article
A Hybrid Large-Kernel CNN and Markov Feature Framework for Remaining Useful Life Prediction
by Yuke Wang, Che Su, Peng Wang, Junquan Zhen and Dong Wang
Machines 2026, 14(1), 57; https://doi.org/10.3390/machines14010057 - 1 Jan 2026
Cited by 1 | Viewed by 510
Abstract
Remaining Useful Life (RUL) prediction has become a crucial component in predictive maintenance and condition-based operation with the rapid advancement of industrial automation and the increasing complexity of mechanical systems. Although existing deep learning models, such as Long Short-Term Memory (LSTM) networks and conventional Convolutional Neural Networks (CNNs), have demonstrated effectiveness in modeling equipment degradation from multivariate sensor data, they still face several limitations. Recurrent architectures often suffer from vanishing gradients and struggle to capture long-term dependencies, while CNN-based methods typically rely on small convolutional kernels and deterministic feature extractors, limiting their ability to model long-range dependencies and stochastic degradation transitions. To address these challenges, this study proposes a novel hybrid deep learning framework that integrates large-kernel convolutional feature extraction with Markov transition modeling for RUL prediction. Specifically, the large-kernel CNN captures both local and global degradation patterns, while the Markov feature module encodes probabilistic state transitions to characterize the stochastic evolution of equipment health. Furthermore, a lightweight channel attention mechanism is incorporated to adaptively emphasize degradation-sensitive sensor information, thereby enhancing feature discriminability. Extensive experiments conducted on the NASA C-MAPSS turbofan engine dataset demonstrate that the proposed model consistently outperforms conventional CNN, LSTM, and hybrid baselines in terms of Root Mean Square Error (RMSE) and the NASA scoring metric. The results verify that combining deep convolutional representations with probabilistic transition information significantly enhances prediction accuracy and robustness in industrial RUL estimation tasks. Full article
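A simple way to encode probabilistic state transitions, as described above, is to discretize a health indicator into states and estimate a row-normalized transition matrix from consecutive pairs. The sketch below assumes the discretization has already been done; the state sequence is hypothetical and this is not the paper's exact Markov feature module.

```python
def transition_matrix(states, num_states):
    """Estimate first-order transition probabilities from a discrete state sequence."""
    counts = [[0] * num_states for _ in range(num_states)]
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows with no observed outgoing transitions are left as zeros.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

# Hypothetical degradation states: 0 = healthy, 1 = degrading, 2 = near failure.
states = [0, 0, 1, 1, 1, 2]
matrix = transition_matrix(states, 3)
print(matrix)
```

The flattened matrix could then be concatenated with convolutional features, which is one plausible reading of combining the two information sources.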

29 pages, 6668 KB  
Article
IoT Network Security Threat Detection Algorithm Integrating Symmetric Routing and a Sparse Mixture-of-Experts Model
by Jiawen Yang, Kunsan Zhang, Renguang Zheng, Chaopeng Li and Jiachun Zheng
Symmetry 2026, 18(1), 63; https://doi.org/10.3390/sym18010063 - 30 Dec 2025
Viewed by 661
Abstract
With the rapid deployment of the Internet of Things (IoT) in critical domains such as power and industrial systems, the number of IoT devices has surged, accompanied by increasingly severe network security risks. IoT networks face diverse threats, including distributed denial-of-service attacks, advanced persistent threats, and data theft or tampering. Traditional detection and defense methods lack deep feature analysis and struggle with complex and unknown attacks, which degrades the detection of security threat events. To this end, this paper proposes an IoT network security threat detection algorithm that integrates symmetric linear routing with a sparse mixture-of-experts model. The algorithm consists of a ConvNeXt feature extractor and a sparse BiLSTM expert layer, with symmetric linear routing embedded in the gating module. ConvNeXt provides refined global and local representations, Top-K gated BiLSTM experts model sequence-level dependencies among ordered features, and symmetric linear routing suppresses routing bias, enabling efficient and robust detection of IoT security threats. Experimental results on the CIC-IDS2018, TON-IoT, and BoT-IoT datasets indicate that the proposed algorithm achieves accuracies of 94.08%, 99.99±0.01%, and 99.78%, respectively. Comparative experiments show that the proposed algorithm outperforms baseline and state-of-the-art models, while the ablation and Top-K studies confirm module effectiveness for IoT intrusion detection. Full article
(This article belongs to the Section Engineering and Materials)
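Top-K gating in a sparse mixture-of-experts layer keeps only the K highest-scoring experts per input and renormalizes their weights. The sketch below shows that generic step; the router logits are hypothetical, and the paper's symmetric linear routing is not reproduced here.

```python
import math

def top_k_gate(logits, k):
    """Return {expert_index: weight} for the k largest logits, softmax-normalized."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    m = max(logits[i] for i in top)  # subtract the maximum for numerical stability
    exps = {i: math.exp(logits[i] - m) for i in top}
    z = sum(exps.values())
    return {i: e / z for i, e in exps.items()}

# Hypothetical router logits for four experts; only two are activated.
gates = top_k_gate([1.0, 3.0, 0.2, 3.0], k=2)
print(gates)
```

Only the selected experts are evaluated, so compute per input scales with K rather than with the total number of experts.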

17 pages, 2779 KB  
Article
Image Restoration Based on Semantic Prior Aware Hierarchical Network and Multi-Scale Fusion Generator
by Yapei Feng, Yuxiang Tang and Hua Zhong
Technologies 2025, 13(11), 521; https://doi.org/10.3390/technologies13110521 - 13 Nov 2025
Viewed by 828
Abstract
As a fundamental low-level vision task, image restoration plays a pivotal role in reconstructing authentic visual information from corrupted inputs, directly impacting the performance of downstream high-level vision systems. Current approaches frequently exhibit two critical limitations: (1) progressive texture degradation and blurring during iterative refinement, particularly in irregular damage patterns, and (2) structural incoherence when handling cross-domain artifacts. To address these challenges, we present a semantic-aware hierarchical network (SAHN) that synergistically integrates multi-scale semantic guidance with structural consistency constraints. First, we construct a Dual-Stream Feature Extractor. Based on a modified U-Net backbone with dilated residual blocks, this skip-connected encoder–decoder module simultaneously captures hierarchical semantic contexts and fine-grained texture details. Second, we propose a semantic prior mapper that establishes spatial–semantic correspondences between damaged areas and multi-scale features, using predefined semantic prototypes and adaptive attention pooling. Third, we construct a multi-scale fusion generator that employs cascaded association blocks with structural similarity constraints. This unit progressively aggregates features from different semantic levels using deformable convolution kernels, effectively bridging the gap between global structure and local texture reconstruction. Compared to existing methods, our algorithm attains the highest overall PSNR of 34.99 and the best visual authenticity (the lowest FID of 11.56). Comprehensive evaluations on three datasets demonstrate its leading performance in restoring visual realism. Full article

15 pages, 1171 KB  
Article
Person Re-Identification Under Non-Overlapping Cameras Based on Advanced Contextual Embeddings
by Chi-Hung Chuang, Tz-Chian Huang, Chong-Wei Wang, Jung-Hua Lo and Chih-Lung Lin
Algorithms 2025, 18(11), 714; https://doi.org/10.3390/a18110714 - 12 Nov 2025
Viewed by 912
Abstract
Person Re-identification (ReID), a critical technology in intelligent surveillance, aims to accurately match specific individuals across non-overlapping camera networks. However, real-world factors such as variations in illumination, viewpoint, and pose continue to challenge the matching accuracy of existing models. Although Transformer-based models like TransReID have demonstrated a strong capability for capturing global context in feature extraction, the features they produce still leave room for optimization at the metric matching stage. To address this issue, this study proposes a hybrid framework that combines advanced feature extraction with post-processing optimization. We employed a fixed, pre-trained TransReID model as the feature extractor and introduced a camera-aware Jaccard distance re-ranking algorithm (CA-Jaccard) as a post-processing module. Without retraining the main model, this framework refines the initial distance matrix by analyzing the local neighborhood topology among feature vectors and incorporating camera information. Experiments were conducted on two major public datasets, Market-1501 and MSMT17. The results show that our framework significantly improved the overall ranking quality of the model, raising the mean Average Precision (mAP) on Market-1501 from 88.2% to 93.58% relative to TransReID alone and achieving a gain of nearly 4% in mAP on MSMT17. This research confirms that advanced post-processing techniques can effectively complement powerful feature extraction models, providing an efficient pathway to enhance the robustness of ReID systems in complex scenarios. Additionally, it is the first study to analyze how the modified distance metric improves the ReID task when used specifically with the ViT-based feature extractor TransReID.
(This article belongs to the Special Issue Machine Learning for Pattern Recognition (3rd Edition))
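The re-ranking idea described in this abstract (refining a distance matrix from neighborhood overlap plus camera labels) can be illustrated with a simplified sketch. This is not the CA-Jaccard algorithm itself: building each neighbor set from the top-k same-camera and top-k other-camera samples, the function names, and the blending weight `lam` are all assumptions chosen for illustration.

```python
import numpy as np

def pairwise_euclidean(feats):
    """Dense Euclidean distance matrix for row-wise feature vectors."""
    sq = np.sum(feats ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * feats @ feats.T
    return np.sqrt(np.maximum(d2, 0.0))

def neighbor_sets(dist, cams, k_intra, k_inter):
    """Assumed rule: union of top-k same-camera and top-k other-camera neighbors."""
    sets = []
    for i in range(dist.shape[0]):
        order = np.argsort(dist[i], kind="stable")
        same = [int(j) for j in order if cams[j] == cams[i]][:k_intra]
        other = [int(j) for j in order if cams[j] != cams[i]][:k_inter]
        sets.append(set(same) | set(other))
    return sets

def jaccard_rerank(feats, cams, k_intra=2, k_inter=2, lam=0.3):
    """Blend the original distance with a Jaccard distance over neighbor sets.

    Assumes k_intra >= 1 so that every neighbor set is non-empty.
    """
    dist = pairwise_euclidean(feats)
    sets = neighbor_sets(dist, cams, k_intra, k_inter)
    n = len(sets)
    jac = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            inter = len(sets[i] & sets[j])
            union = len(sets[i] | sets[j])
            jac[i, j] = 1.0 - inter / union
    # Scale the original distances to [0, 1] before blending.
    scale = dist.max() if dist.max() > 0 else 1.0
    return lam * dist / scale + (1.0 - lam) * jac
```

In this sketch the feature extractor stays frozen: only the distance matrix is recomputed, which mirrors the abstract's point that the main model is not retrained.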

21 pages, 17739 KB  
Article
Re_MGFE: A Multi-Scale Global Feature Embedding Spectrum Sensing Method Based on Relation Network
by Jiayi Wang, Fan Zhou, Jinyang Ren, Lizhuang Tan, Jian Wang, Peiying Zhang and Shaolin Liao
Computers 2025, 14(11), 480; https://doi.org/10.3390/computers14110480 - 4 Nov 2025
Viewed by 689
Abstract
The growing number of Internet of Things devices is making the shortage of spectrum resources increasingly prominent. Spectrum sensing technology can alleviate this problem by monitoring the spectrum in real time. In practical applications, however, it is difficult to obtain a large number of labeled samples, so the neural network model cannot be fully trained and its performance suffers. Moreover, existing few-shot methods focus on capturing spatial features and ignore how features are represented at different scales, which reduces feature diversity. To address these issues, this paper proposes a few-shot spectrum sensing method based on multi-scale global features. To enhance feature diversity, the method employs a multi-scale feature extractor to extract features at multiple scales. This improves the model's ability to distinguish signals and helps the network avoid overfitting. In addition, to make full use of the frequency features at different scales, a learnable weight feature reinforcer is constructed to enhance the frequency features. The simulation results show that, when the SNR is between 0 and 10 dB, the recognition accuracy of the network exceeds 81% under all task modes, which is better than existing methods. The method thus achieves accurate spectrum sensing under few-shot conditions.
(This article belongs to the Section Internet of Things (IoT) and Industrial IoT)
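The general pattern this abstract refers to, extracting features at several scales and combining them with learnable weights, can be sketched as follows. This is only an illustration of the pattern, not the Re_MGFE architecture: the moving-average "scales", the softmax weighting, and all function names are assumptions, and the weights are fixed here rather than trained.

```python
import numpy as np

def moving_average(x, width):
    """Smooth a 1-D signal with a box filter; width must not exceed len(x)."""
    kernel = np.ones(width) / width
    return np.convolve(x, kernel, mode="same")

def multiscale_features(x, widths=(1, 2, 4)):
    """Stack views of the signal at several smoothing scales: (n_scales, len(x))."""
    return np.stack([moving_average(x, w) for w in widths])

def weighted_fusion(features, logits):
    """Fuse per-scale features with softmax-normalized weights.

    In a trained model the logits would be learnable parameters; here they
    are supplied by the caller.
    """
    w = np.exp(logits - np.max(logits))
    w = w / w.sum()
    return np.tensordot(w, features, axes=1)
```

With equal logits the fusion reduces to a plain average across scales; larger logits for one scale let that scale dominate the fused feature.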