MDPI - Publisher of Open Access Journals

31 pages, 30016 KB

Open Access Article

Sensors-Driven Multimodal Deepfake Detection: A Cross-Attention Fusion Approach with Adaptive Modality Gating

by Syeda Sitara Waseem, Noman Shabbir, Syed Rizwan Hassan and KangYoon Lee

Sensors 2026, 26(12), 3695; https://doi.org/10.3390/s26123695 - 10 Jun 2026

Viewed by 83

Deepfakes threaten sensor-based authentication systems, including biometric sensors, surveillance cameras, and IoT edge devices. Unimodal detectors remain vulnerable to modality-specific attacks. We propose a multimodal deepfake detection framework optimized for resource-constrained edge devices, featuring a novel cross-modal attention fusion mechanism with adaptive gating. The architecture combines enhanced Res2Net for audio, temporal 3D CNN with SE attention for video, and bidirectional cross-modal attention with quality-based gates. On our benchmark (5472 audio + 1842 video samples), the fusion model achieves 96.7% accuracy, 96.6% F1-score, 0.988 AUC-ROC, and 3.3% EER. Adversarial testing shows 92.3% accuracy under the Fast Gradient Sign Method (FGSM) attack. The model has a 30.3 MB footprint and runs at 20 FPS on edge hardware. Modality contribution analysis reveals adaptive weighting (72% audio for TTS forgery, 78% video for lip-synced attacks). Cross-dataset evaluation on FakeAVCeleb achieves 92.3% overall accuracy, confirming generalization. Full article
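The adaptive weighting described in the abstract above (for example, 72% audio for TTS forgery) can be illustrated with a small gating sketch. This is not the authors' architecture: it only assumes that each modality yields a scalar quality score and a fake probability, and that the gates are a softmax over the quality scores. The names `modality_gates` and `fuse` are hypothetical.

```python
import math

def modality_gates(quality_scores):
    """Softmax over per-modality quality scores; the weights sum to 1.

    Assumption: one scalar quality score per modality (hypothetical input).
    """
    m = max(quality_scores.values())  # subtract the max for numerical stability
    exps = {k: math.exp(v - m) for k, v in quality_scores.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}

def fuse(fake_probs, quality_scores):
    """Gate-weighted average of per-modality fake probabilities."""
    gates = modality_gates(quality_scores)
    fused = sum(gates[k] * fake_probs[k] for k in fake_probs)
    return fused, gates

if __name__ == "__main__":
    fused, gates = fuse({"audio": 0.9, "video": 0.3},
                        {"audio": 2.0, "video": 0.0})
    print(gates, fused)
```

With a higher audio quality score, the audio gate dominates and the fused score moves toward the audio prediction, which is the qualitative behavior the modality contribution analysis reports.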

(This article belongs to the Special Issue Secure and Resilient Solutions for CCTV, Small Sensor and IoT Device Security)

14 pages, 922 KB

Open Access Article

AggMo-Enhanced Momentum Attack: A Plug-and-Play Framework for Boosting Adversarial Transferability

by Qiaoyi Li, Zhengjie Wang, Chengxiang Ran and Haifeng Shen

Appl. Sci. 2026, 16(10), 4645; https://doi.org/10.3390/app16104645 - 8 May 2026

Viewed by 192

Abstract

Most adversarial attack methods achieve high success rates under the white-box setting. However, these methods often lack transferability when targeting other deep neural network (DNN) models. Momentum-based attacks have emerged as an effective strategy to enhance transferability by incorporating a momentum term to stabilize update directions. While simple constant-momentum methods (e.g., MI-FGSM) or advanced variants (e.g., NI-FGSM, VMI-FGSM) have shown promise, they either use a single momentum decay factor or introduce significant computational overhead. To address this, we propose a novel plug-and-play momentum aggregation framework named AggMo-Attack. Our key insight is that a single momentum term with a fixed decay factor cannot optimally capture the multi-scale temporal correlations in gradients during adversarial optimization. Inspired by the Aggregated Momentum (AggMo) optimizer, we designed a multi-momentum aggregation module that maintains and weightedly combines multiple velocity vectors with different decay factors. This framework can be seamlessly integrated into existing momentum-based attack methods (e.g., MI-FGSM, NI-FGSM, VMI-FGSM) as a drop-in replacement for their standard momentum update step. Extensive experiments demonstrate that integrating our AggMo module significantly improves adversarial transferability. Our work provides a versatile and effective tool for enhancing momentum-based adversarial attacks and opens a new direction for designing adaptive attack strategies. Full article
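The multi-momentum idea in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the L1 gradient normalization and sign step follow the usual MI-FGSM recipe, the decay factors in `betas` and the uniform aggregation weights are placeholder choices, and `aggmo_attack` and `grad_fn` are hypothetical names.

```python
def l1_normalize(g):
    """Scale a gradient so its absolute values sum to 1 (as in MI-FGSM)."""
    s = sum(abs(v) for v in g) or 1.0
    return [v / s for v in g]

def sign(v):
    return (v > 0) - (v < 0)

def aggmo_attack(x, grad_fn, eps, steps, betas=(0.0, 0.9, 0.99), weights=None):
    """Iterative sign attack with several velocity vectors, one per decay factor.

    Assumptions: grad_fn(x) returns the loss gradient w.r.t. the input;
    weights default to a uniform average over the velocities.
    """
    k = len(betas)
    weights = weights or [1.0 / k] * k
    alpha = eps / steps                       # per-step size
    velocities = [[0.0] * len(x) for _ in betas]
    x_adv = list(x)
    for _ in range(steps):
        g = l1_normalize(grad_fn(x_adv))
        # Update every velocity with its own decay factor.
        for i, beta in enumerate(betas):
            velocities[i] = [beta * v + gi for v, gi in zip(velocities[i], g)]
        # Weighted combination of the velocities gives the update direction.
        agg = [sum(w * vel[j] for w, vel in zip(weights, velocities))
               for j in range(len(x))]
        x_adv = [xa + alpha * sign(a) for xa, a in zip(x_adv, agg)]
        # Project back into the L-infinity ball of radius eps around x.
        x_adv = [min(max(xa, xo - eps), xo + eps) for xa, xo in zip(x_adv, x)]
    return x_adv

if __name__ == "__main__":
    # Toy linear loss whose input gradient is constant.
    print(aggmo_attack([0.5, 0.5, 0.5], lambda x: [2.0, -1.0, 0.5],
                       eps=0.1, steps=10))
```

Passing a single decay factor, for example `betas=(0.9,)`, reduces the sketch to a plain single-momentum update, which is why such a module can act as a drop-in replacement for the momentum step.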

21 pages, 2238 KB

Open Access Article

Game-Theoretic Cost-Sensitive Adversarial Training for Robust Cloud Intrusion Detection Against GAN-Based Evasion Attacks

by Jianbo Ding, Zijian Shen and Wenhe Liu

Appl. Sci. 2026, 16(8), 3944; https://doi.org/10.3390/app16083944 - 18 Apr 2026

Cited by 1 | Viewed by 389

Abstract

Cloud-based intrusion detection systems (IDSs) increasingly rely on deep learning classifiers to identify malicious traffic; however, this reliance exposes them to adversarial evasion attacks in which adversaries craft near-imperceptible perturbations to bypass detection. Existing defenses based on conventional adversarial training often recover robustness against known perturbation patterns at the cost of degraded detection accuracy on canonical attack categories—a robustness–accuracy trade-off that remains an open challenge in the field. In this paper, we propose GT-CSAT (Game-Theoretic Cost-Sensitive Adversarial Training), a novel defense framework tailored for cloud security environments. GT-CSAT couples an improved Wasserstein GAN with Gradient Penalty (WGAN-GP) threat generator—conditioned on attack semantics to simulate functionally consistent and highly covert traffic variants—with a minimax adversarial training loop governed by a game-theoretic cost-sensitive loss function. The proposed loss function assigns asymmetric misclassification penalties derived from a two-player zero-sum payoff matrix, enabling the detector to maintain vigilance over both novel adversarial variants and well-characterized conventional threats simultaneously. Specifically, misclassifying an adversarially perturbed attack as benign incurs a strictly higher penalty than the symmetric cross-entropy baseline, while the cost weights are dynamically adapted via a Nash equilibrium-inspired update rule during training. We conduct comprehensive experiments on the Cloud Vulnerabilities Dataset (CVD), CICIDS-2017, and UNSW-NB15, which encompass diverse cloud-specific attack scenarios including denial-of-service, port scanning, brute-force, and SQL injection traffic. 
Under six representative evasion strategies—FGSM, PGD, C&W, BIM, DeepFool, and IDSGAN-style black-box perturbations—GT-CSAT achieves an average robust accuracy of 94.3%, surpassing standard adversarial training by 6.8 percentage points and the undefended baseline by 21.4 percentage points, while preserving clean-traffic detection at 97.1%. These results confirm that the game-theoretic cost structure effectively decouples robustness from accuracy, yielding a Pareto-superior detection profile relative to competing baselines across all evaluated threat models. The source code and experimental configurations have been publicly released to facilitate reproducibility. Full article
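The asymmetric penalty described in the abstract above can be illustrated with a cost-weighted cross-entropy. This is only a sketch: the cost values `miss_cost_adv` and `miss_cost` are made-up constants, the function name is hypothetical, and the Nash equilibrium-inspired weight update mentioned in the abstract is not modeled because the listing does not specify it.

```python
import math

def cost_sensitive_ce(p_attack, is_attack, is_adversarial,
                      miss_cost_adv=3.0, miss_cost=1.5):
    """Binary cross-entropy scaled by an asymmetric misclassification cost.

    p_attack: predicted probability that the flow is an attack.
    Assumption: adversarially perturbed attacks get the largest weight,
    ordinary attacks a smaller one, benign traffic the plain loss.
    """
    p_true = p_attack if is_attack else 1.0 - p_attack
    ce = -math.log(max(p_true, 1e-12))   # clamp to avoid log(0)
    if is_attack:
        weight = miss_cost_adv if is_adversarial else miss_cost
    else:
        weight = 1.0
    return weight * ce

if __name__ == "__main__":
    # The same weak prediction costs more when the sample is an adversarial attack.
    print(cost_sensitive_ce(0.2, True, True),
          cost_sensitive_ce(0.2, True, False))
```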

29 pages, 7713 KB

Open Access Article

Toward Adversarial Robustness Network Intrusion Detection Based on Multi-Model Ensemble Approach

by Thi-Thu-Huong Le, Jaehan Cho, Dawit Shin and Howon Kim

Sensors 2026, 26(8), 2478; https://doi.org/10.3390/s26082478 - 17 Apr 2026

Viewed by 515

Abstract

Machine learning-based network intrusion detection systems (NIDSs) remain vulnerable to adversarial manipulation, but the robustness literature for tabular NIDS data is still dominated by single-model, single-dataset, and non-adaptive evaluations. In this paper, we reposition the manuscript as a comparative robustness study of a four-component defense pipeline rather than as a claim of a universal defense primitive. We evaluate XGBoost, LightGBM, TabNet, and Residual MLP on RT_IOT2022 and Web_IDS23 under standard attacks, representative constrained/adaptive attacks, component-wise ablations, sample-fraction sensitivity, repeated-run significance tests, per-class F1 analysis, and computational-overhead measurements. The results show strong dataset and architecture dependence. On RT_IOT2022, tree-based models close most of the robustness gap under strong attacks but often only after large clean-accuracy reductions; Residual MLP achieves a more favorable balance, while the full defense stack over-regularizes TabNet. On Web_IDS23, aggregate robustness-gap reduction remains positive, yet simpler baselines such as adversarial-training-only or ensemble-only configurations frequently outperform the full four-stage pipeline in absolute clean/attack accuracy. Across both datasets, median filtering is the most fragile component: larger filter windows substantially degrade both clean and attacked accuracy, whereas contamination rate, anomaly-mixing weight, and ensemble size are comparatively stable. Representative constrained/adaptive evaluations reduce performance only modestly relative to standard FGSM/PGD, but per-class and overhead analyses show that minority-class collapse and training cost remain important deployment limitations. These findings support a more cautious conclusion: adversarial defense for tabular NIDS is validation driven and dataset specific, and the full defense stack should not be treated as a universal default. Full article

(This article belongs to the Special Issue Advances and Challenges in Sensor Security Systems)

25 pages, 2805 KB

Open Access Article

CAPG: Context-Aware Perturbation Generation for Multi-Label Adversarial Attacks

by Aidos Askhatuly, Dinara Berdysheva, Azamat Berdyshev, Aigul Adamova and Didar Yedilkhan

Technologies 2026, 14(4), 233; https://doi.org/10.3390/technologies14040233 - 16 Apr 2026

Viewed by 472

Abstract

Multi-label deep learning models are widely used in real-world applications where predictions depend on the joint presence of several semantically correlated labels. However, existing adversarial attacks largely overlook these inter-label dependencies, often perturbing outputs indiscriminately and producing structurally implausible or easily detectable changes. This paper presents CAPG (Context-Aware Perturbation Generation), a white-box, label-space targeted adversarial framework for generating selective and contextually consistent perturbations in multi-label settings. CAPG incorporates correlation-weighted regularization into the adversarial objective, enabling targeted manipulation of specific labels while preserving the contextual integrity of non-target outputs. Using the Pascal VOC 2012 dataset and a ResNet-101 multi-label classifier, we show that CAPG achieves higher Attack Success Rates (ASR) and substantially improved Contextual Consistency Scores (CCSs) than FGSM, PGD, CW, and DeepFool under identical perturbation budgets. CAPG also produces lower perceptual distortion, yielding adversarial examples that better preserve contextual structure. These results highlight the importance of correlation-aware adversarial evaluation for assessing the robustness of modern multi-label deep learning systems. Full article

(This article belongs to the Section Information and Communication Technologies)

19 pages, 9603 KB

Open Access Article

Understanding Modality-Specific Vulnerabilities in Vision–Language Models Under Adversarial Attacks

by Maisha Binte Rashid and Pablo Rivas

AI 2026, 7(4), 135; https://doi.org/10.3390/ai7040135 - 9 Apr 2026

Viewed by 834

Abstract

Vision–language models (VLMs), such as Contrastive Language–Image Pretraining (CLIP), are increasingly deployed in real-world applications, including content moderation, misinformation detection, and fraud analysis, making their robustness to adversarial attacks a critical concern. While adversarial robustness has been widely studied in unimodal models, modality-specific vulnerabilities in multimodal models remain underexplored. In this work, we analyze CLIP by applying gradient-based adversarial attacks to its vision and language modalities, both independently and jointly, and evaluating performance on two multimodal classification benchmarks: the Facebook Hateful Memes dataset and a large-scale Suspicious Car Parts dataset. Using Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD) attacks along with multiple adversarial retraining strategies, we show that adversarial perturbations on the image modality consistently cause the most severe and unstable performance degradation. These results demonstrate that the vision modality is the primary vulnerability in CLIP, highlighting the need for modality-specific defense strategies that focus more on the weaker modality in multimodal systems. Full article

(This article belongs to the Section AI Systems: Theory and Applications)

20 pages, 1454 KB

Open Access Article

Momentum-Based Adversarial Attacks and Multi-Level Denoising Defenses in Deep Learning-Based Wind Power Forecasting

by Yangming Min, Congmei Jiang, Kang Yang, Xiankui Wen and Kexin Chen

Sensors 2026, 26(7), 2073; https://doi.org/10.3390/s26072073 - 26 Mar 2026

Viewed by 592

Abstract

Deep learning (DL) techniques have significantly advanced wind power forecasting by enhancing accuracy. However, these DL models are vulnerable to adversarial attacks, which can lead to severely inaccurate forecasts. Existing studies in wind power forecasting have rarely addressed the stealthiness and effectiveness of adversarial attacks simultaneously, nor have they investigated defense strategies against multiple perturbation strengths or in black-box scenarios. To this end, we propose an attack algorithm for wind power forecasting, i.e., the momentum iterative fast gradient sign method (MI-FGSM). This algorithm generates adversarial samples by incorporating momentum into the iterative process and adding perturbations to the input samples along the gradient direction. To defend against such attacks under varying perturbation strengths, a defense model called multi-level iterative denoising autoencoder (MLI-DAE) is proposed. MLI-DAE is trained using adversarial samples with multiple perturbation levels to effectively restore attacked inputs to their clean forms. Experimental results under both white-box and black-box scenarios demonstrate that MI-FGSM induces significantly larger forecast errors with smaller perturbation magnitudes compared to FGSM. Furthermore, our proposed MLI-DAE effectively defends against multi-level perturbations without compromising the original forecast accuracy. Full article

(This article belongs to the Section Internet of Things)

24 pages, 11178 KB

Open Access Article

FLAMA: Frame-Level Alignment Margin Attack for Scene Text and Automatic Speech Recognition

by Yikun Xu, Zhiheng Xu and Pengwen Dai

Electronics 2026, 15(5), 1064; https://doi.org/10.3390/electronics15051064 - 4 Mar 2026

Cited by 1 | Viewed by 524

Abstract

Scene text recognition (STR) and automatic speech recognition (ASR) translate visual or acoustic signals into linguistic sequences and underpin many modern perception systems. Although their front-ends and decoders differ (e.g., CTC-based, attention-based, or variants), both tasks ultimately rely on aligning input frames to output tokens by deep learning techniques, which exposes a shared vulnerability to adversarial perturbations. Existing attacks commonly optimize global sequence-level objectives. As a result, decisive frames are treated implicitly, and optimization can become unnecessarily diffuse over long input sequences, hindering convergence and perceptual quality. To address the above issues, we propose FLAMA, a unified Frame-Level Alignment Margin Attack, which could be used for both STR and ASR models. FLAMA explicitly targets alignment by maximizing per frame (or per step) recognition margins. The design is decoder-agnostic and applies to both CTC-based and attention-based pipelines. It employs a recognition-score-aware Step/Halt gate that concentrates updates on the most critical frames, and a stabilization stage that suppresses late-iteration oscillations to improve optimization stability and perceptual control. Ablation analyses show that stabilization consistently enhances attack success and reduces distortion. We evaluate FLAMA on STR benchmarks (SVT, CUTE80, and IC13) with CRNN, STAR, and TRBA, and on the ASR benchmark (LibriSpeech) with a Wav2Vec 2.0 model. Across modalities and architectures, FLAMA achieves near-100% attack success while substantially reducing $l_{2}$ distortion and improving perceptual metrics compared with FGSM/PGD baselines. These results highlight frame-level alignment as a shared weak point across visual and audio sequence recognizers and suggest localized margin objectives as a principled route to effective sequence attacks. Full article

15 pages, 551 KB

Open Access Article

Query-Side Adversarial Attacks on Event-Based Person Re-Identification: A First-Order Robustness Analysis

by Jung Heum Woo and Eun-Kyu Lee

Appl. Sci. 2026, 16(5), 2430; https://doi.org/10.3390/app16052430 - 3 Mar 2026

Viewed by 438

Abstract

Event-based person re-identification (Re-ID) has recently emerged as a privacy-friendly alternative to conventional RGB-based surveillance. However, the security and adversarial robustness of these systems remain largely understudied. This paper presents a systematic investigation into the vulnerabilities of event-based person Re-ID models operating on 5-channel event voxels. We evaluate the impact of a one-step FGSM attack on query-side event voxel inputs and measure the resulting retrieval performance. Our experiments demonstrate a significant susceptibility: under subtle perturbations, the Top-1 accuracy drops drastically from 0.462 to 0.154. Critically, these adversarial inputs maintain high perceptual similarity to the original data, with an average SSIM of approximately 0.99 and an average PSNR of 45 dB, rendering the modifications nearly imperceptible. These findings suggest that the sparse and asynchronous nature of event-based person Re-ID, despite its potential privacy advantages, is highly susceptible to gradient-based exploits. This study highlights the need for robustness-aware design and defense mechanisms in event-based surveillance systems. Full article
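To make the one-step FGSM attack and the PSNR figure in the abstract above concrete, here is a minimal sketch on a flat list of values. It assumes inputs scaled to a data range of 1.0 and a gradient supplied by the caller; `fgsm_step` and `psnr` are illustrative helpers, not code from the paper.

```python
import math

def fgsm_step(x, grad, eps):
    """One-step FGSM: move each element by eps in the sign of its gradient."""
    return [xi + eps * ((g > 0) - (g < 0)) for xi, g in zip(x, grad)]

def psnr(x, x_adv, data_range=1.0):
    """Peak signal-to-noise ratio in dB between clean and perturbed inputs."""
    mse = sum((a - b) ** 2 for a, b in zip(x, x_adv)) / len(x)
    if mse == 0:
        return float("inf")
    return 10 * math.log10(data_range ** 2 / mse)

if __name__ == "__main__":
    x = [0.2, 0.5, 0.8, 0.4]
    grad = [1.0, -2.0, 0.5, -0.1]   # made-up gradient values
    x_adv = fgsm_step(x, grad, eps=0.01)
    print(psnr(x, x_adv))           # about 40 dB for eps = 0.01
```

When every element is shifted by exactly `eps`, the mean squared error is `eps ** 2`, so PSNR equals `-20 * log10(eps)` under this data-range assumption. Elements with a zero gradient are left untouched, which raises the PSNR further for sparse inputs.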

(This article belongs to the Special Issue Advanced Cybersecurity Applications: Solutions to Counteract Cyber Threats)

20 pages, 6717 KB

Open Access Article

Unraveling Patch Size Effects in Vision Transformers: Adversarial Robustness in Hyperspectral Image Classification

by Shashi Kiran Chandrappa, Sidike Paheding and Abel A. Reyes-Angulo

Remote Sens. 2026, 18(4), 656; https://doi.org/10.3390/rs18040656 - 21 Feb 2026

Viewed by 768

Abstract

Vision Transformers (ViTs) have demonstrated strong performance in hyperspectral image (HSI) classification; however, their robustness is highly sensitive to patch size. This study investigates the impact of spatial patch size on clean accuracy and adversarial robustness using a standard ViT and a Channel Attention Fusion variant (ViT-CAF). Patch sizes from 1 × 1 to 19 × 19 are evaluated across four benchmark datasets under FGSM, BIM, CW, PGD, and RFGSM attacks. Descriptive results show that smaller patches, particularly 1 × 1 and 3 × 3, generally yield higher adversarial accuracy, while larger patches amplify localized perturbations and degrade robustness. Parameter analysis indicates that patch-size-dependent variations arise mainly from the embedding layer, with the Transformer backbone remaining fixed, confirming that robustness differences are driven primarily by spatial context rather than model capacity. These findings reveal a trade-off between spatial granularity and adversarial resilience and provide guidance for patch size selection in ViT-based HSI applications. Full article

(This article belongs to the Special Issue Deep Neural Networks for Hyperspectral Remote Sensing Image Processing (Second Edition))

29 pages, 5664 KB

Open Access Article

Adversarially Robust and Explainable Insulator Defect Detection for Smart Grid Infrastructure

by Mubarak Alanazi

Energies 2026, 19(4), 1013; https://doi.org/10.3390/en19041013 - 14 Feb 2026

Viewed by 498

Abstract

Automated insulator inspection systems face critical challenges from small object sizes, complex backgrounds, and vulnerability to adversarial attacks, a security concern largely unaddressed in safety-critical power infrastructure. We introduce Faster-YOLOv12n, integrating a FasterNet backbone with SGC2f attention modules and Wise-ShapeIoU loss for enhanced small defect localization. Our architecture achieves 98.9% mAP@0.5 on the CPLID, improving baseline YOLOv12n by 1.3% in precision (97.8% vs. 96.5%), 4.7% in recall (95.1% vs. 90.4%), and 1.8% in mAP@0.5. Through differential data augmentation, we expand training samples from 678 to 3900 images, achieving balanced class distribution and robust generalization across fog, adverse weather, and complex transmission line backgrounds. Comparative evaluation demonstrates superior performance over RT-DETR, Faster R-CNN, YOLOv7, YOLOv8, and YOLOv9, with per-class analysis revealing 99.8% AP@0.5 for defect detection. We provide the first comprehensive adversarial robustness evaluation for insulator defect detection, systematically assessing FGSM, PGD, and C&W attacks across perturbation budgets. Through adversarial training with mixed-batch strategies, our robust model maintains 93.2% mAP@0.5 under the strongest FGSM attacks (ϵ = 48/255), 94.5% under PGD attacks, and 95.1% under C&W attacks (τ = 3.0) while preserving 98.9% clean accuracy, demonstrating no trade-off between accuracy and robustness. Grad-CAM visualizations demonstrate that attacks disrupt confidence calibration while preserving spatial attention on defect regions, providing interpretable insights into model decision-making under adversarial conditions and validating learned feature representations for safety-critical smart grid monitoring applications. Full article

5 pages, 398 KB

Open Access Proceeding Paper

A Lightweight Deep Learning Framework for Robust Video Watermarking in Adversarial Environments

by Antonio Cedillo-Hernandez, Lydia Velazquez-Garcia and Manuel Cedillo-Hernandez

Eng. Proc. 2026, 123(1), 25; https://doi.org/10.3390/engproc2026123025 - 5 Feb 2026

Viewed by 639

Abstract

The widespread distribution of digital videos in social networks, streaming services, and surveillance systems has increased the risk of manipulation, unauthorized redistribution, and adversarial tampering. This paper presents a lightweight deep learning framework for robust and imperceptible video watermarking designed specifically for cybersecurity environments. Unlike heavy architectures that rely on multi-scale feature extractors or complex adversarial networks, our model introduces a compact encoder–decoder pipeline optimized for real-time watermark embedding and recovery under adversarial attacks. The proposed system leverages spatial attention and temporal redundancy to ensure robustness against distortions such as compression, additive noise, and adversarial perturbations generated via Fast Gradient Sign Method (FGSM) or recompression attacks from generative models. Experimental simulations using a reduced Kinetics-600 subset demonstrate promising results, achieving an average PSNR of 38.9 dB, SSIM of 0.967, and Bit Error Rate (BER) below 3% even under FGSM attacks. These results suggest that the proposed lightweight framework achieves a favorable trade-off between resilience, imperceptibility, and computational efficiency, making it suitable for deployment in video forensics, authentication, and secure content distribution systems. Full article

(This article belongs to the Proceedings of First Summer School on Artificial Intelligence in Cybersecurity)

21 pages, 2458 KB

Open Access Article

STS-AT: A Structured Tensor Flow Adversarial Training Framework for Robust Intrusion Detection

by Juntong Zhu, Zhihao Chen, Rong Cong, Hongyu Sun and Yanhua Dong

Sensors 2026, 26(2), 536; https://doi.org/10.3390/s26020536 - 13 Jan 2026

Cited by 1 | Viewed by 678

Abstract

Network intrusion detection is a key technology for ensuring cybersecurity. However, current methods face two major challenges: reliance on manual feature engineering, which leads to the loss of discriminative information, and the vulnerability of deep learning models to adversarial sample attacks. To address these issues, this paper proposes STS-AT, a novel network intrusion detection method that integrates structured tensors with adversarial training. The method consists of three core components: first, structured tensor encoding, which fully converts raw hexadecimal traffic into a numerical representation; second, a hierarchical deep learning model that combines CNN and LSTM networks to simultaneously learn spatial and temporal features of the traffic; third, a multi-strategy adversarial training method that enhances model robustness by adaptively adjusting the mix of adversarial samples in different training phases. Experiments on the CICIDS2017 dataset show that the proposed method achieves an accuracy of 99.6% in normal traffic classification, significantly outperforming classical machine learning baselines such as Random Forest (93.1%) and Support Vector Machine (84.7%). Crucially, under various adversarial attacks (FGSM, PGD, and DeepFool), the accuracy of an undefended model drops to as low as 24.4%, whereas after multi-strategy adversarial training, the defense accuracy rises above 96.8%. Meanwhile, the total training time is reduced by approximately 67.6%. These results verify that structured tensor encoding effectively preserves original traffic information, the hierarchical model achieves comprehensive feature learning, and multi-strategy adversarial training significantly improves training efficiency while ensuring robust defense effectiveness. Full article

(This article belongs to the Special Issue AI, Machine Learning (ML), and Large Language Models (LLMs) for Cybersecurity in Sensor Networks)

28 pages, 4585 KB

Open Access Article

Uncertainty-Aware Adaptive Intrusion Detection Using Hybrid CNN-LSTM with cWGAN-GP Augmentation and Human-in-the-Loop Feedback

by Clinton Manuel de Nascimento and Jin Hou

Safety 2025, 11(4), 120; https://doi.org/10.3390/safety11040120 - 5 Dec 2025

Cited by 1 | Viewed by 1977

Abstract

Intrusion detection systems (IDSs) must operate under severe class imbalance, evolving attack behavior, and the need for calibrated decisions that integrate smoothly with security operations. We propose a human-in-the-loop IDS that combines a convolutional neural network and a long short-term memory network (CNN–LSTM) classifier with a variational autoencoder (VAE)-seeded conditional Wasserstein generative adversarial network with gradient penalty (cWGAN-GP) augmentation and entropy-based abstention. Minority classes are reinforced offline via conditional generative adversarial (GAN) sampling, whereas high-entropy predictions are escalated for analysts and are incorporated into a curated retraining set. On CIC-IDS2017, the resulting framework delivered well-calibrated binary performance (ACC = 98.0%, DR = 96.6%, precision = 92.1%, F1 = 94.3%; baseline ECE ≈ 0.04, Brier ≈ 0.11) and substantially improved minority recall (e.g., Infiltration from 0% to >80%, Web Attack–XSS +25 pp, and DoS Slowhttptest +15 pp, for an overall +11 pp macro-recall gain). The deployed model remained lightweight (~42 MB, <10 ms per batch; ≈32 k flows/s on RTX-3050 Ti), and only approximately 1% of the flows were routed for human review. Extensive evaluation, including ROC/PR sweeps, reliability diagrams, cross-domain tests on CIC-IoT2023, and FGSM/PGD adversarial stress, highlights both the strengths and remaining limitations, notably residual errors on rare web attacks and limited IoT transfer. Overall, the framework provides a practical, calibrated, and extensible machine learning (ML) tier for modern IDS deployment and motivates future research on domain alignment and adversarial defense. Full article
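The entropy-based abstention mentioned in the abstract above can be sketched as a simple routing rule. The threshold value and the function names are assumptions for illustration; the listing does not state how the paper sets its threshold.

```python
import math

def predictive_entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def route(probs, threshold=0.5):
    """Escalate uncertain predictions to an analyst, auto-handle the rest.

    Assumption: threshold is a hypothetical tuning knob, not a value
    reported by the paper.
    """
    if predictive_entropy(probs) > threshold:
        return "escalate"
    return "auto"

if __name__ == "__main__":
    print(route([0.99, 0.01]))  # confident prediction
    print(route([0.5, 0.5]))    # maximally uncertain binary prediction
```

Raising the threshold sends fewer flows to human review, which is the knob that would control a review rate such as the roughly 1% reported above.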

21 pages, 1147 KB

Open Access Article

AI-Based Steganography Method to Enhance the Information Security of Hidden Messages in Digital Images

by Nhi Do Ngoc Huynh, Jiajun Jiang, Chung-Hao Chen and Wen-Chao Yang

Electronics 2025, 14(22), 4490; https://doi.org/10.3390/electronics14224490 - 17 Nov 2025

Cited by 3 | Viewed by 7005

Abstract

With the increasing sophistication of Artificial Intelligence (AI), traditional digital steganography methods face a growing risk of being detected and compromised. Adversarial attacks, in particular, pose a significant threat to the security and robustness of hidden information. To address these challenges, this paper proposes a novel AI-based steganography framework designed to enhance the security of concealed messages within digital images. Our approach introduces a multi-stage embedding process that utilizes a sequence of encoder models, including a base encoder, a residual encoder, and a dense encoder, to create a more complex and secure hiding environment. To further improve robustness, we integrate Wavelet Transforms with various deep learning architectures, namely Convolutional Neural Networks (CNNs), Bayesian Neural Networks (BNNs), and Graph Convolutional Networks (GCNs). We conducted a comprehensive set of experiments on the FashionMNIST and MNIST datasets to evaluate our framework’s performance against several adversarial attacks. The results demonstrate that our multi-stage approach significantly enhances resilience. Notably, while CNN architectures provide the highest baseline accuracy, BNNs exhibit superior intrinsic robustness against gradient-based attacks. For instance, under the Fast Gradient Sign Method (FGSM) attack on the MNIST dataset, our BNN-based models maintained an accuracy of over 98%, whereas the performance of comparable CNN models dropped sharply to between 10% and 18%. This research provides a robust and effective method for developing next-generation secure steganography systems. Full article

(This article belongs to the Special Issue Advanced Machine Learning, Pattern Recognition, and Deep Learning Technologies: Methodologies and Applications, 2nd Edition)

Search Results (64)
