Search Results (191)

Search Parameters:
Keywords = class-wise features

22 pages, 25750 KB  
Article
Rainforest Monitoring Using Deep Learning and Short Time Series of Sentinel-1 IW Data
by Ricardo Dal Molin, Laetitia Thirion-Lefevre, Régis Guinvarc’h and Paola Rizzoli
Remote Sens. 2026, 18(4), 598; https://doi.org/10.3390/rs18040598 (registering DOI) - 14 Feb 2026
Abstract
The latest advances in remote sensing play a central role in providing Earth observation (EO) data for numerous applications in the scope of reaching environmentally sustainable goals. However, over tropical rainforests, optical imaging is often hindered by extensive cloud coverage, which means that analysis-ready images are mostly restricted to the dry season. In this study, we propose combining radar features extracted from short time series of Sentinel-1 Interferometric Wide Swath (IW) data with a deep learning-based classification scheme to continuously monitor the state of forests. The proposed methodology is based on the joint use of SAR backscatter and interferometric coherences at different temporal baselines to perform pixel-wise classification of land cover classes of interest. However, we show that for a sequence of Sentinel-1 time series, different land cover classes exhibit particular seasonal-dependent variations. Another challenge in performing short-term predictions stems from the fact that ground truths are usually available only on a yearly basis. To address these challenges, we propose a seasonal sampling of the training data, masked by potential deforestation, along with a classification based on a modified U-Net model. The classification results show that overall accuracies above 90% can be achieved throughout the whole year with the proposed method, emerging as a potential tool for mapping rainforests with unprecedented temporal resolution. Full article

30 pages, 5659 KB  
Article
Adversarially Robust and Explainable Insulator Defect Detection for Smart Grid Infrastructure
by Mubarak Alanazi
Energies 2026, 19(4), 1013; https://doi.org/10.3390/en19041013 (registering DOI) - 14 Feb 2026
Abstract
Automated insulator inspection systems face critical challenges from small object sizes, complex backgrounds, and vulnerability to adversarial attacks, a security concern largely unaddressed in safety-critical power infrastructure. We introduce Faster-YOLOv12n, integrating a FasterNet backbone with SGC2f attention modules and Wise-ShapeIoU loss for enhanced small defect localization. Our architecture achieves 98.9% mAP@0.5 on the CPLID, improving baseline YOLOv12n by 1.3% in precision (97.8% vs. 96.5%), 4.7% in recall (95.1% vs. 90.4%), and 1.8% in mAP@0.5. Through differential data augmentation, we expand training samples from 678 to 3900 images, achieving balanced class distribution and robust generalization across fog, adverse weather, and complex transmission line backgrounds. Comparative evaluation demonstrates superior performance over RT-DETR, Faster R-CNN, YOLOv7, YOLOv8, and YOLOv9, with per-class analysis revealing 99.8% AP@0.5 for defect detection. We provide the first comprehensive adversarial robustness evaluation for insulator defect detection, systematically assessing FGSM, PGD, and C&W attacks across perturbation budgets. Through adversarial training with mixed-batch strategies, our robust model maintains 93.2% mAP@0.5 under the strongest FGSM attacks (ϵ = 48/255), 94.5% under PGD attacks, and 95.1% under C&W attacks (τ = 3.0) while preserving 98.9% clean accuracy, demonstrating no trade-off between accuracy and robustness. Grad-CAM visualizations demonstrate that attacks disrupt confidence calibration while preserving spatial attention on defect regions, providing interpretable insights into model decision-making under adversarial conditions and validating learned feature representations for safety-critical smart grid monitoring applications. Full article
23 pages, 1202 KB  
Article
Image-Based Malware Classification Using DCGAN-Augmented Data and a CNN–Transformer Hybrid Model
by Manya Dhingra, Achin Jain, Niharika Thakur, Anurag Choubey, Massimo Donelli, Arun Kumar Dubey and Arvind Panwar
Future Internet 2026, 18(2), 102; https://doi.org/10.3390/fi18020102 (registering DOI) - 14 Feb 2026
Abstract
With the rapid growth and diversification of malware, accurate multi-class detection remains challenging due to severe class imbalance and limited labeled data. This work presents an image-based malware classification framework that converts executable binaries into 64×64 grayscale images, employs class-wise DCGAN augmentation to mitigate severe imbalance (initial imbalance ratio >12 across 31 families, N9300), and trains a hybrid CNN–Transformer model that captures both local texture features and long-range contextual dependencies. The DCGAN generator produces high-fidelity synthetic samples, evaluated using Inception Score (IS) =3.43, Fréchet Inception Distance (FID) =10.99, and Kernel Inception Distance (KID) =0.0022, and is used to equalize class counts before classifier training. On the blended dataset the proposed GAN-balanced CNN–Transformer achieves an overall accuracy of 95% and a macro-averaged F1-score of 0.95; the hybrid model also attains validation accuracy of ≈94% while substantially improving minority-class recognition. Compared to CNN-only and Transformer-only baselines, the hybrid approach yields more stable convergence, reduced overfitting, and stronger per-class performance, while remaining feasible for practical deployment. These results demonstrate that DCGAN-driven balancing combined with CNN–Transformer feature fusion is an effective, scalable solution for robust malware family classification. Full article
(This article belongs to the Section Cybersecurity)

29 pages, 6070 KB  
Article
Clastic Rock Lithology Identification Based on Multivariate Feature Enhancement and Dynamic Confidence-Weighted Ensemble
by Kang Chen, Guoyun Zhong and Fan Diao
Appl. Sci. 2026, 16(4), 1808; https://doi.org/10.3390/app16041808 - 12 Feb 2026
Viewed by 102
Abstract
The strong heterogeneity of clastic reservoirs and the phenomenon of similar log responses for different lithologies (i.e., “same spectrum, different rocks”) significantly weaken feature separability. Furthermore, distribution shifts between different wells cause traditional models to suffer from severe generalization bottlenecks in cross-well applications. To address this critical challenge, this paper proposes a dual-driven framework comprising “Multivariate Feature Enhancement + Dynamic Ensemble”. At the feature level, physics-informed enhancement and multi-scale statistics are introduced to construct a Multivariate high-dimensional feature system, thereby strengthening the representation of geological patterns. At the model level, a sample-aware Dynamic Confidence-Weighted Ensemble (DCWE) strategy is designed to achieve sample-wise adaptive decision-making based on prediction uncertainty, fundamentally breaking through the limitations of fixed weights in static ensembles. This method combines the complementary advantages of Gradient Boosting Decision Trees (GBDT) and deep sequence networks, enabling the simultaneous capture of local textural variations and continuous trends across depths. Based on rigorous Leave-One-Group-Out (LOGO) cross-validation, the proposed framework achieves a maximum accuracy of 84.58%. It significantly reduces the misclassification rate in lithology transition zones and for minority class samples, while maintaining the geological continuity of prediction results. These results verify the significant advantages of the proposed method in cross-well generalization scenarios. Full article

35 pages, 5808 KB  
Article
Dynamic Mode Decomposition-Based Clustered Pattern Projection for Reliable Alzheimer’s Disease Detection from EEG
by Jong-Hyeon Seo, Hunseok Kang, Jacob Kang and Aymen I. Zreikat
Diagnostics 2026, 16(4), 530; https://doi.org/10.3390/diagnostics16040530 - 10 Feb 2026
Viewed by 120
Abstract
Background/Objectives: Detecting Alzheimer’s disease (AD) from normal aging using eyes-open (EO) EEG is challenging due to stimulus-driven nonstationarity and fragmented oscillatory responses. This study aims to determine whether prototype-based representations derived from Dynamic Mode Decomposition (DMD) can improve AD detection from EO photostimulation EEG. Methods: We propose a DMD-based framework termed DMD-based Clustered Pattern Projection (DMD-CPP). Segment-wise DMD representations were clustered to learn class-specific medoid prototypes, and each EEG epoch was encoded as a vector of cosine-similarity coordinates with respect to these prototypes. A linear SVM classifier was trained on the resulting DMD-CPP features and evaluated under strict leave-one-subject-out validation. Results: The DMD-CPP model achieved competitive classification accuracy and, importantly, enhanced margin-based reliability. In EO photostimulation, AD versus healthy control classification showed a pronounced improvement, characterized by wider and more asymmetric decision margins, particularly assigning low confidence to normal epochs misclassified as AD. Tasks involving frontotemporal dementia also showed improvement, although the effect was less pronounced than for AD. Conclusions: Clustering-based pattern projection has been shown to stabilize EEG dynamics and provide an interpretable, confidence-aware feature representation. These findings suggest that DMD-CPP offers a promising framework for reliable AD detection from EO EEG, where conventional spectral methods typically struggle. Full article
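The encoding step summarized in this abstract, where each epoch is represented by its cosine similarity to a set of class prototypes, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the toy real-valued 3-dimensional vectors, and the two prototypes are assumptions, and the DMD and medoid-clustering stages are omitted entirely.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length real vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch when a vector is all zeros
    return dot / (norm_a * norm_b)

def project_onto_prototypes(feature, prototypes):
    """Encode one feature vector as its similarities to every prototype."""
    return [cosine_similarity(feature, p) for p in prototypes]

# Hypothetical prototypes (one per class) and one hypothetical epoch feature.
prototypes = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
coords = project_onto_prototypes([2.0, 0.0, 0.0], prototypes)
```

The resulting `coords` vector has one entry per prototype and could then be passed to a linear classifier, as the abstract describes for the SVM stage.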

40 pages, 3023 KB  
Article
Molecular Informatics, Chemometrics, and Sensory Omics for Constructing an Umami Peptide Cluster Library Across the Entire Lager Beer Brewing Process
by Yashuai Wu, Ruiyang Yin, Wenjing Tian, Wanqiu Zhao, Jiayang Luo, Mingtao Huang and Dongrui Zhao
Foods 2026, 15(4), 641; https://doi.org/10.3390/foods15040641 - 10 Feb 2026
Viewed by 120
Abstract
Umami taste in lager beer not only determined body fullness and the backbone of aftertaste, but also affected the controllability and interpretability of flavor expression across the entire brewing process. Based on stage-wise sampling, peptidomic profiles were established on wort fermentation day 0, day 1, day 3, and day 9. A total of 25,592 peptides were identified by reversed-phase liquid chromatography–quadrupole time-of-flight mass spectrometry (RPLC-QTOF-MS). Molecular informatics screening was performed using UMPred-FRL (a feature representation learning-based meta-predictor for umami peptides) and TastePeptides-Meta (a one-stop platform for taste peptides and prediction models), yielding 7255 potential umami peptides. From these, 145 peptides were further selected for molecular docking. In addition, 6 representative umami peptides were selected for receptor-level validation and structural analysis. Mechanistically, the umami receptor taste receptor type 1 member 1/taste receptor type 1 member 3 (T1R1/T1R3) belonged to class C G protein-coupled receptor (GPCR) and relied on the extracellular Venus flytrap (VFT) domain for ligand capture. Ligand-induced VFT conformational convergence transmitted changes to the transmembrane region and triggered signal transduction. Docking and energy decomposition indicated that the ionic group primarily contributed to orientation and anchoring. Salt-bridge or hydrogen-bond networks were formed around Lys228, Arg240, Glu206, Asp210, Asn141, and Gln138, thereby reducing conformational freedom. Meanwhile, hydrophobic side chains obtained major binding gains within a hydrophobic microenvironment formed by Val135, Ile137, Leu165, Tyr166, Trp78, and His79. These results reflected a synergistic mode in which charge pairing enabled positioning and hydrophobic complementarity promoted VFT closure.
To experimentally confirm sensory relevance, 6 representative peptides were individually spiked into 4 brewing-stage beer samples, which produced a clear stratification pattern across stages. Notably, peptides with favorable docking-derived binding propensity did not necessarily enhance umami perception, and several longer peptides showed persistent negative sensory shifts, supporting that binding affinity alone could not be treated as a proxy for perceived umami in the beer matrix. At the node level, the cumulative abundance of umami peptides showed a significant positive correlation with umami scores, with a Pearson correlation coefficient of r = 0.963 and p = 0.037. This result indicated good linear consistency between umami peptide content and the upward shift in umami taste in lager beer. Umami peptide clusters were further proposed as a more appropriate functional unit, and an umami peptide cluster database spanning the full process was constructed. This database provided a reusable resource for process control and flavor prediction. Full article
(This article belongs to the Section Food Analytical Methods)

23 pages, 5641 KB  
Article
Lightweight Multi-Scale Framework for Human Pose and Action Classification
by Alireza Saber, Mohammad-Mehdi Hosseini, Amirreza Fateh, Mansoor Fateh and Vahid Abolghasemi
Sensors 2026, 26(4), 1102; https://doi.org/10.3390/s26041102 - 8 Feb 2026
Viewed by 191
Abstract
Human pose classification, along with related tasks such as action recognition, is a crucial area in deep learning due to its wide range of applications in assisting human activities. Despite significant progress, it remains a challenging problem because of high inter-class similarity, dataset noise, and the large variability in human poses. In this paper, we propose a lightweight yet highly effective modular attention-based architecture for human pose classification, built upon a Swin Transformer backbone for robust multi-scale feature extraction. The proposed design integrates the Spatial Attention module, the Context-Aware Channel Attention Module, and a novel Dual Weighted Cross Attention module, enabling effective fusion of spatial and channel-wise cues. Additionally, explainable AI techniques are employed to improve the reliability and interpretability of the model. We train and evaluate our approach on two distinct datasets: Yoga-82 (in both main-class and subclass configurations) and Stanford 40 Actions. Experimental results show that our model outperforms state-of-the-art baselines across accuracy, precision, recall, F1-score, and mean average precision, while maintaining an extremely low parameter count of only 0.79 million. Specifically, our method achieves accuracies of 90.40% and 87.44% for the 6-class and 20-class Yoga-82 configurations, respectively, and 94.28% for the Stanford 40 Actions dataset. Full article
(This article belongs to the Section Sensing and Imaging)

27 pages, 70264 KB  
Article
TaDP-Det: Semi-Supervised Texture-Aware Dynamic Pseudo-Labeling Detector for Industrial Surface Defect Detection
by Qiwu Luo, Weiyu Zhan and Jiaojiao Su
Sensors 2026, 26(4), 1085; https://doi.org/10.3390/s26041085 - 7 Feb 2026
Viewed by 153
Abstract
Surface defect detection is essential for industrial quality control, but obtaining reliable labeled data remains costly due to the need for expert annotation. Semi-supervised object detection (SSOD) mitigates this need by leveraging unlabeled data through pseudo-labeling. However, industrial surface imagery presents specific challenges, including texture-ambiguous, low-contrast backgrounds that cause foreground–background confusion and strong class-dependent detection difficulty, which renders global confidence thresholds ineffective, often yielding noisy and imbalanced pseudo labels. To overcome these limitations, we propose TaDP-Det, a semi-supervised detector that improves pseudo-label quality through dual enhancements in feature representation and label filtering. We first introduce a Texture Enhance Module (TEM), designed as a texture-aware patch-level mixture-of-experts applied at shallow backbone stages, which amplifies discriminative low-level texture cues to generate more reliable pseudo labels in ambiguous regions. Second, the class-wise dynamic pseudo-label filtering (CDPF) scheme uses lightweight 1D Gaussian mixture models to adaptively determine per-class thresholds, preserving challenging defects and suppressing spurious predictions. Comprehensive evaluations on the NEU-DET, GC10-DET, and PCB-DEFECT datasets show that TaDP-Det consistently outperforms state-of-the-art SSOD baselines in mean average precision (mAP) with only modest computational overhead. The results underscore the effectiveness of our method for robust semi-supervised defect detection in industrial applications. Full article
(This article belongs to the Special Issue Advanced Sensing Technologies in Industrial Defect Detection)

25 pages, 3917 KB  
Article
Hierarchical Attention Fused CNN-LSTM Using Structured 2D Indicator Matrices for Stock Trading Action Detection
by Hao Feng, Xian Li, Dongjie Zhao and Hui Kong
Appl. Sci. 2026, 16(4), 1672; https://doi.org/10.3390/app16041672 - 7 Feb 2026
Viewed by 100
Abstract
Accurate detection of trading actions (buy, sell, and hold) is critical for portfolio optimization and risk management in volatile stock markets. However, existing approaches often suffer from deficiencies in feature representation, spatiotemporal modeling, and class balancing, which limit their effectiveness. To address these issues, we propose HA-CL, a deep learning framework that integrates a hierarchical attention mechanism with CNN-LSTM. Specifically, technical indicators are encoded into a structured 2D matrix to preserve the inherent characteristics of stocks. Features extracted by ResNet are processed by a channel-wise LSTM equipped with an attention core to adaptively fuse spatial, temporal, and channel-level importance. To mitigate class imbalance, we design a customized extrema labeling strategy augmented with extrema oversampling, an importance-aware focal loss, and a heuristic action recalibration. Experiments on 63 Chinese A-share stocks show that HA-CL achieves an average accuracy of 68.89% with an annualized return of 111.01%, substantially outperforming all baselines. Risk-adjusted return metrics such as the Sharpe Ratio and the Maximum Drawdown further validate its robustness across market conditions. Together, they highlight the potential of HA-CL to translate complex market patterns into profitable trading actions. Full article

16 pages, 6191 KB  
Article
A Hybrid Millimeter-Wave Radar–Ultrasonic Fusion System for Robust Human Activity Recognition with Attention-Enhanced Deep Learning
by Liping Yao, Kwok L. Chung, Luxin Tang, Tao Ye, Shiquan Wang, Pingchuan Xu, Yuhao Bi and Yaowen Wu
Sensors 2026, 26(3), 1057; https://doi.org/10.3390/s26031057 - 6 Feb 2026
Viewed by 222
Abstract
To address the tradeoff between environmental robustness and fine-grained accuracy in single-sensor human behavior recognition, this paper proposes a non-contact system fusing 77 GHz SIFT (mmWave) radar and a 40 kHz ultrasonic array. The system leverages radar’s long-range penetration and low-visibility adaptability, paired with ultrasound’s centimeter-level short-range precision and electromagnetic clutter immunity. A synchronized data acquisition platform ensures multi-modal signal consistency, while wavelet transform (for radar) and STFT (for ultrasound) extract complementary time–frequency features. The proposed Attention-CNN-BiLSTM architecture integrates local spatial feature extraction, bidirectional temporal dependency modeling, and salient cue enhancement. Experimental results on 1600 synchronized sequences (four behaviors: standing, sitting, walking, falling) show a 98.6% mean class accuracy with subject-wise generalization, outperforming single-sensor baselines and traditional deep learning models. As a privacy-preserving, lighting-agnostic solution, it offers promising applications in smart homes, healthcare monitoring, and intelligent surveillance, providing a robust technical foundation for contactless behavior recognition. Full article
(This article belongs to the Special Issue Electromagnetic Sensors and Their Applications)

21 pages, 2169 KB  
Article
Enhancing Early Detection of Alzheimer’s Disease via Vision Transformer Machine Learning Architecture Using MRI Images
by Wided Hechkel, Marco Leo, Pierluigi Carcagnì, Marco Del-Coco and Abdelhamid Helali
Information 2026, 17(2), 163; https://doi.org/10.3390/info17020163 - 6 Feb 2026
Viewed by 191
Abstract
Computer-aided diagnosis (CAD) systems based on deep learning have shown significant potential for Alzheimer’s disease (AD) stage classification from Magnetic Resonance Imaging (MRI). Nevertheless, challenges such as class imbalance, small sample sizes, and the presence of multiple slices per subject may lead to biased evaluation and statistically unreliable performance, particularly for minority classes. In this study, a Vision Transformer (ViT)-based framework is proposed for multi-class AD classification using a Kaggle dataset containing 6400 MRI slices across four cognitive stages. A subject-wise data-splitting strategy is employed to prevent information leakage between the training and testing sets, and the statistical unreliability of near-perfect scores in underrepresented classes is critically examined. An ablation study is conducted to assess the contribution of key architectural components, demonstrating the effectiveness of self-attention and patch embedding in capturing discriminative features. Furthermore, attention-based visualization maps are incorporated to highlight brain regions influencing the model’s decisions and to illustrate subtle anatomical differences between MildDemented and VeryMildDemented cases. The proposed approach achieves a test accuracy of 97.98%, outperforming existing methods on the same dataset while providing improved interpretability. It supports early and accurate AD stage identification. Full article

22 pages, 2833 KB  
Article
A Hybrid HOG-LBP-CNN Model with Self-Attention for Multiclass Lung Disease Diagnosis from CT Scan Images
by Aram Hewa, Jafar Razmara and Jaber Karimpour
Computers 2026, 15(2), 93; https://doi.org/10.3390/computers15020093 - 1 Feb 2026
Viewed by 203
Abstract
Resource-limited settings continue to face challenges in the identification of COVID-19, bacterial pneumonia, viral pneumonia, and normal lung conditions because of the overlap of CT appearance and inter-observer variability. We justify a hybrid architecture of deep learning which combines hand-designed descriptors (Histogram of Oriented Gradients, Local Binary Patterns) and a 20-layer Convolutional Neural Network with dual self-attention. Handcrafted features were then trained with Support Vector Machines, and ensemble averaging was used to integrate the results with the CNN. The confidence level of 0.7 was used to mark suspicious cases to be reviewed manually. On a balanced dataset of 14,000 chest CT scans (3500 per class), the model was trained and cross-validated five-fold on a patient-wise basis. It had 97.43% test accuracy and a macro F1-score of 0.97, which was statistically significant compared to standalone CNN (92.0%), ResNet-50 (90.0%), multiscale CNN (94.5%), and ensemble CNN (96.0%). A further 2–3% enhancement was added by the self-attention module that targets the diagnostically salient lung regions. The predictions that were below the confidence limit amounted to only 5 percent, which indicated reliability and clinical usefulness. The framework provides an interpretable and scalable method of diagnosing multiclass lung disease, especially applicable to be deployed in healthcare settings with limited resources. The further development of the work will involve the multi-center validation, optimization of the model, and greater interpretability to be used in the real world. Full article
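The decision rule outlined in this abstract (averaging the outputs of two classifiers and marking low-confidence cases for manual review) might look something like the sketch below. It is illustrative only: equal weighting of the two models, the function name `ensemble_predict`, the interpretation of 0.7 as a lower bound on the averaged top-class probability, and the example probabilities are all assumptions rather than details confirmed by the abstract.

```python
def ensemble_predict(cnn_probs, svm_probs, threshold=0.7):
    """Average two per-class probability vectors and flag uncertain cases.

    Returns (predicted_class_index, confidence, needs_review).
    """
    if len(cnn_probs) != len(svm_probs):
        raise ValueError("both models must score the same set of classes")
    # Assumed equal-weight averaging of the two probability vectors.
    avg = [(c + s) / 2.0 for c, s in zip(cnn_probs, svm_probs)]
    label = max(range(len(avg)), key=avg.__getitem__)
    confidence = avg[label]
    # Assumed rule: anything below the threshold goes to manual review.
    return label, confidence, confidence < threshold

# Hypothetical four-class outputs (e.g. COVID-19, bacterial, viral, normal).
confident_case = ensemble_predict([0.90, 0.05, 0.03, 0.02], [0.80, 0.10, 0.05, 0.05])
uncertain_case = ensemble_predict([0.50, 0.30, 0.10, 0.10], [0.40, 0.40, 0.10, 0.10])
```

In this toy run the first case is accepted automatically, while the second falls below the threshold and would be routed to a human reader.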
(This article belongs to the Special Issue AI in Bioinformatics)

23 pages, 2318 KB  
Article
Transformer Tokenization Strategies for Network Intrusion Detection: Addressing Class Imbalance Through Architecture Optimization
by Gulnur Aksholak, Agyn Bedelbayev, Raiymbek Magazov and Kaplan Kaplan
Computers 2026, 15(2), 75; https://doi.org/10.3390/computers15020075 - 1 Feb 2026
Viewed by 307
Abstract
Network intrusion detection has challenges that fundamentally differ from language and vision tasks typically addressed by Transformer models. In particular, network traffic features lack inherent ordering, datasets are extremely class-imbalanced (with benign traffic often exceeding 80%), and reported accuracies in the literature vary widely (57–95%) without systematic explanation. To address these challenges, we propose a controlled experimental study that isolates and quantifies the impact of tokenization strategies on Transformer-based intrusion detection systems. Specifically, we introduce and compare three tokenization approaches—feature-wise tokenization (78 tokens) based on CICIDS2017, a sample-wise single-token baseline, and an optimized sample-wise tokenization—under identical training and evaluation protocols on a highly imbalanced intrusion detection dataset. We demonstrate that tokenization choice alone accounts for an accuracy gap of 37.43 percentage points, improving performance from 57.09% to 94.52% (100 K data). Furthermore, we show that architectural mechanisms for handling class imbalance—namely Batch Normalization and capped loss weights—yield an additional 15.05% improvement, making them approximately 21× more effective than increasing the training data by 50%. We achieve a macro-average AUC of 0.98, improve minority-class recall by 7–12%, and maintain strong discrimination even for classes with as few as four samples (AUC 0.9811). These results highlight tokenization and imbalance-aware architectural design as primary drivers of performance in Transformer-based intrusion detection and contribute practical guidance for deploying such models in modern network infrastructures, including IoT and cloud environments where extreme class imbalance is inherent. 
This study also presents a practical implementation scheme that recommends sample-wise tokenization, constrained class weighting, and Batch Normalization after the embedding and classification layers to improve stability and performance in highly unstable table-based IDS problems. Full article
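The idea of capped (constrained) class weights mentioned in this abstract can be illustrated with a short sketch. The abstract does not give the weighting formula or the cap, so both are assumptions here: the weights use the common inverse-frequency heuristic n_samples / (n_classes × class_count), and `max_weight=10.0` is an arbitrary illustrative value.

```python
from collections import Counter

def capped_class_weights(labels, max_weight=10.0):
    """Inverse-frequency class weights, clipped so rare classes cannot dominate the loss."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    weights = {}
    for cls, count in counts.items():
        raw = n_samples / (n_classes * count)  # assumed "balanced" heuristic
        weights[cls] = min(raw, max_weight)    # assumed cap value
    return weights

# Hypothetical, heavily imbalanced label set: 96 benign flows and 4 attacks.
labels = ["benign"] * 96 + ["attack"] * 4
weights = capped_class_weights(labels, max_weight=10.0)
```

Without the cap, the minority class in this toy example would receive a weight of 12.5; clipping it to 10.0 limits how strongly a handful of samples can drive the gradient.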

37 pages, 13544 KB  
Article
Attention-Driven Feature Extraction for XAI in Histopathology Leveraging a Hybrid Xception Architecture for Multi-Cancer Diagnosis
by Shirin Shila, Md. Safayat Hossain, Md Fuyad Al Masud, Mohammad Badrul Alam Miah, Afrig Aminuddin and Zia Muhammad
Mach. Learn. Knowl. Extr. 2026, 8(2), 31; https://doi.org/10.3390/make8020031 - 28 Jan 2026
Viewed by 627
Abstract
Automated and accurate classification of histopathology images is necessary for the early detection of cancer, especially common cancers such as Colorectal Cancer (CRC) and Lung Cancer (LC). Nonetheless, classical deep learning frameworks often face challenges because intra-class variation is large, different classes look alike, and image quality is inconsistent. To address these constraints, a multi-layer diagnostic framework is presented in detail. The process starts with a robust preprocessing pipeline involving gamma correction, bilateral filtering, and adaptive CLAHE, which yields statistically significant changes in quantitative image quality measures. A hybrid attention architecture is presented that includes an Xception backbone, a Convolutional Block Attention Module (CBAM), a Transformer block, and an MLP classifier to combine local features with global context. When tested on three publicly available datasets, the proposed model achieved classification accuracies of 99.98%, 99.58%, and 99.33% on LC25000, CRC-VAL-HE-7K, and NCT-CRC-HE-100K, respectively. To enhance transparency, detailed explainability analyses are conducted using layer-wise feature visualization and Grad-CAM. Finally, the framework's real-world applicability is demonstrated through its implementation in a web-based platform, which can serve as a useful and easy-to-use tool to support pathology diagnosis.
(This article belongs to the Section Learning)

27 pages, 4802 KB  
Article
Fine-Grained Radar Hand Gesture Recognition Method Based on Variable-Channel DRSN
by Penghui Chen, Siben Li, Chenchen Yuan, Yujing Bai and Jun Wang
Electronics 2026, 15(2), 437; https://doi.org/10.3390/electronics15020437 - 19 Jan 2026
Viewed by 218
Abstract
With the ongoing miniaturization of smart devices, fine-grained hand gesture recognition using millimeter-wave radar has attracted increasing attention, yet practical deployment remains challenging in continuous-gesture segmentation, robust feature extraction, and reliable classification. This paper presents an end-to-end fine-grained gesture recognition framework based on frequency-modulated continuous-wave (FMCW) millimeter-wave radar, including gesture design, data acquisition, feature construction, and neural network-based classification. Ten gesture types are recorded (eight valid gestures and two return-to-neutral gestures); for classification, the two return-to-neutral gesture types are merged into a single invalid class, yielding a nine-class task. A sliding-window segmentation method is developed using short-time Fourier transform (STFT)-based Doppler-time representations, and a dataset of 4050 labeled samples is collected. Multiple signal classification (MUSIC)-based super-resolution estimation is adopted to construct range–time and angle–time representations, and instance-wise normalization is applied to Doppler and range features to mitigate inter-individual variability without test leakage. For recognition, a variable-channel deep residual shrinkage network (DRSN) is employed to improve robustness to noise, supporting single-, dual-, and triple-channel feature inputs. Results under both subject-dependent evaluation with repeated random splits and subject-independent leave-one-subject-out (LOSO) cross-validation show that the DRSN architecture consistently outperforms the RefineNet-based baseline, and the triple-channel configuration achieves the best performance (98.88% accuracy). Overall, the variable-channel design enables flexible feature selection to meet diverse application requirements.
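The point about instance-wise normalization avoiding test leakage can be made concrete with a small sketch. The abstract does not specify the normalization used, so the z-score form, the function name `instance_normalize`, and the `eps` constant below are assumptions. What matters is that each sample is scaled using only its own statistics, so nothing computed on the training set is applied to the test set and nothing from the test set reaches training.

```python
from statistics import mean, pstdev


def instance_normalize(sample, eps=1e-8):
    """Standardize one flattened feature map using only its own mean and std.

    Z-score normalization is an illustrative assumption; the abstract
    only states that normalization is applied per instance.
    """
    mu = mean(sample)
    sigma = pstdev(sample)
    # eps avoids division by zero for a constant-valued sample.
    return [(x - mu) / (sigma + eps) for x in sample]


# Hypothetical flattened Doppler-time values for a single gesture sample.
sample = [1.0, 2.0, 3.0, 4.0]
normalized = instance_normalize(sample)
```

Because the statistics are recomputed per sample, the same function applies unchanged to training and held-out subjects. That property is what makes it compatible with leave-one-subject-out evaluation.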