MDPI - Publisher of Open Access Journals

23 pages, 1202 KB

Open Access Article

Image-Based Malware Classification Using DCGAN-Augmented Data and a CNN–Transformer Hybrid Model

by Manya Dhingra, Achin Jain, Niharika Thakur, Anurag Choubey, Massimo Donelli, Arun Kumar Dubey and Arvind Panwar

Future Internet 2026, 18(2), 102; https://doi.org/10.3390/fi18020102 - 14 Feb 2026

Viewed by 179

Abstract

With the rapid growth and diversification of malware, accurate multi-class detection remains challenging due to severe class imbalance and limited labeled data. This work presents an image-based malware classification framework that converts executable binaries into $64 \times 64$ grayscale images, employs class-wise DCGAN augmentation to mitigate severe imbalance (initial imbalance ratio >12 across 31 families, $N \approx 9300$), and trains a hybrid CNN–Transformer model that captures both local texture features and long-range contextual dependencies. The DCGAN generator produces high-fidelity synthetic samples, evaluated using Inception Score (IS) = 3.43, Fréchet Inception Distance (FID) = 10.99, and Kernel Inception Distance (KID) = 0.0022, and is used to equalize class counts before classifier training. On the blended dataset the proposed GAN-balanced CNN–Transformer achieves an overall accuracy of 95% and a macro-averaged F1-score of 0.95; the hybrid model also attains validation accuracy of ≈94% while substantially improving minority-class recognition. Compared to CNN-only and Transformer-only baselines, the hybrid approach yields more stable convergence, reduced overfitting, and stronger per-class performance, while remaining feasible for practical deployment. These results demonstrate that DCGAN-driven balancing combined with CNN–Transformer feature fusion is an effective, scalable solution for robust malware family classification. Full article

(This article belongs to the Section Cybersecurity)
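
The binary-to-image step described in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the authors resize or pad a binary to 64 × 64, so the nearest-neighbour resampling of the byte stream used here is an assumption, and the function name `bytes_to_grayscale` is hypothetical.

```python
def bytes_to_grayscale(data: bytes, side: int = 64) -> list:
    """Map raw bytes to a side x side grid of 0-255 pixel values.

    Assumption: the byte stream is resampled (nearest neighbour) to exactly
    side*side pixels; the paper's own resizing rule is not given in the abstract.
    """
    if not data:
        raise ValueError("cannot build an image from an empty binary")
    n = side * side
    # Pick one source byte per target pixel, spread evenly over the file.
    pixels = [data[i * len(data) // n] for i in range(n)]
    # Cut the flat pixel list into rows.
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


if __name__ == "__main__":
    image = bytes_to_grayscale(bytes(range(256)) * 16)
    print(len(image), len(image[0]))
```

Each byte already lies in the range 0 to 255, so it can be used directly as a grayscale intensity without rescaling.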

21 pages, 3512 KB

Open Access Article

Real-Time Ransomware Detection Using Reinforcement Learning Agents

by Kutub Thakur, Md Liakat Ali, Suzanna Schmeelk, Joan Debello and Md Mustafizur Rahman

Information 2026, 17(2), 194; https://doi.org/10.3390/info17020194 - 13 Feb 2026

Viewed by 208

Abstract

Traditional signature-based anti-malware tools often fail to detect zero-day ransomware attacks due to their reliance on known patterns. This paper presents a real-time ransomware detection framework that models system behavior as a Reinforcement Learning (RL) environment. Behavioral features, including file entropy, CPU usage, and registry changes, are extracted from dynamic analysis logs generated by Cuckoo Sandbox. A Deep Q-Network (DQN) agent is trained to proactively block malicious actions by maximizing long-term rewards based on observed behavior. Experimental evaluation across multiple ransomware families such as WannaCry, Locky, Cerber, and Ryuk demonstrates that the proposed RL agent achieves superior detection accuracy, precision, and F1-score compared to existing static and supervised learning methods. Furthermore, ablation tests and latency analysis confirm the model’s robustness and suitability for real-time deployment. This work introduces a behavior-driven, generalizable approach to ransomware defense that adapts to unseen threats through continual learning. Full article

(This article belongs to the Special Issue Extended Reality and Cybersecurity)
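
File entropy, one of the behavioral features named in the abstract above, is commonly computed as the Shannon entropy of a file's byte distribution. The sketch below shows that standard calculation; how the authors actually derive the feature from Cuckoo Sandbox logs is not stated, so this is an illustration under that assumption rather than their extraction code.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # occurrences of each byte value
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    print(shannon_entropy(b"aaaaaaaa"))        # constant content, lowest entropy
    print(shannon_entropy(bytes(range(256))))  # uniform content, highest entropy
```

Encrypted output tends toward the upper end of this range, which is why a sudden rise in the entropy of written files is often treated as a ransomware indicator.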

33 pages, 745 KB

Open Access Article

XAI-Driven Malware Detection from Memory Artifacts: An Alert-Driven AI Framework with TabNet and Ensemble Classification

by Aristeidis Mystakidis, Grigorios Kalogiannis, Nikolaos Vakakis, Nikolaos Altanis, Konstantina Milousi, Iason Somarakis, Gabriela Mihalachi, Mariana S. Mazi, Dimitris Sotos, Antonis Voulgaridis, Christos Tjortjis, Konstantinos Votis and Dimitrios Tzovaras

AI 2026, 7(2), 66; https://doi.org/10.3390/ai7020066 - 10 Feb 2026

Viewed by 406

Abstract

Modern malware presents significant challenges to traditional detection methods, often leveraging fileless techniques, in-memory execution, and process injection to evade antivirus and signature-based systems. To address these challenges, alert-driven memory forensics has emerged as a critical capability for uncovering stealthy, persistent, and zero-day threats. This study presents a two-stage host-based malware detection framework that integrates memory forensics, explainable machine learning, and ensemble classification, designed as a post-alert asynchronous SOC workflow balancing forensic depth and operational efficiency. Utilizing the MemMal-D2024 dataset, which comprises rich memory forensic artifacts from Windows systems infected with malware samples whose creation metadata spans 2006–2021, the system performs malware detection using features extracted from volatile memory. In the first stage, an Attentive and Interpretable Learning for structured Tabular data (TabNet) model is used for binary classification (benign vs. malware), leveraging its sequential attention mechanism and built-in explainability. In the second stage, a Voting Classifier ensemble, composed of Light Gradient Boosting Machine (LGBM), eXtreme Gradient Boosting (XGB), and Histogram Gradient Boosting (HGB) models, is used to identify the specific malware family (Trojan, Ransomware, Spyware). To reduce memory dump extraction and analysis time without compromising detection performance, only a curated subset of 24 memory features was used during training and runtime. These features were operationally selected to reduce acquisition/extraction time and validated via redundancy inspection, model explainability (SHAP/TabNet), and training data correlation analysis, identifying the best trade-off between memory analysis and detection accuracy. The pipeline, which is triggered from host-based Wazuh Security Information and Event Management (SIEM) alerts, achieved 99.97% accuracy in binary detection and 70.17% multiclass accuracy, resulting in an overall performance of 87.02%. It provides both global and local explainability, ensuring operational transparency and forensic interpretability. This approach provides an efficient and interpretable detection solution that can be used in combination with conventional security tools as an extra layer of defense suitable for modern threat landscapes. Full article
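
The second-stage voting ensemble can be pictured with a small soft-voting sketch. The abstract does not say whether hard or soft voting is used, so averaging class probabilities is an assumption here, and the three probability rows merely stand in for the outputs of the LGBM, XGB, and HGB models; the function name `soft_vote` is hypothetical.

```python
FAMILIES = ["Trojan", "Ransomware", "Spyware"]


def soft_vote(prob_rows, weights=None):
    """Average per-model class probabilities; return (winning index, averages).

    Assumption: soft voting with optional model weights. The paper's actual
    voting rule is not given in the abstract.
    """
    if not prob_rows:
        raise ValueError("need at least one model output")
    weights = weights or [1.0] * len(prob_rows)
    total = sum(weights)
    n_classes = len(prob_rows[0])
    avg = [
        sum(w * row[k] for w, row in zip(weights, prob_rows)) / total
        for k in range(n_classes)
    ]
    best = max(range(n_classes), key=avg.__getitem__)
    return best, avg


if __name__ == "__main__":
    # Illustrative probabilities only, not results from the paper.
    outputs = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.5, 0.4, 0.1]]
    label, averaged = soft_vote(outputs)
    print(FAMILIES[label], averaged)
```

Soft voting lets a confident model outweigh two lukewarm ones, whereas hard voting would count each model's top label once.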

28 pages, 922 KB

Open Access Article

MAESTRO: A Multi-Scale Ensemble Framework with GAN-Based Data Refinement for Robust Malicious Tor Traffic Detection

by Jinbu Geng, Yu Xie, Jun Li, Xuewen Yu and Lei He

Mathematics 2026, 14(3), 551; https://doi.org/10.3390/math14030551 - 3 Feb 2026

Viewed by 314

Abstract

Malicious Tor traffic data contains deep domain-specific knowledge, which makes labeling challenging, and the lack of labeled data degrades the accuracy of learning-based detectors. Real-world deployments also exhibit severe class imbalance, where malicious traffic constitutes a small minority of network flows, which further reduces detection performance. In addition, Tor’s fixed 512-byte cell architecture removes packet-size diversity that many encrypted-traffic methods rely on, making feature extraction difficult. This paper proposes an efficient three-stage framework, MAESTRO v1.0, for malicious Tor traffic detection. In Stage 1, MAESTRO extracts multi-scale behavioral signatures by fusing temporal, positional, and directional embeddings at cell, direction, and flow granularities to mitigate feature homogeneity; it then compresses these representations with an autoencoder into compact latent features. In Stage 2, MAESTRO introduces an ensemble-based quality quantification method that combines five complementary anomaly detection models to produce robust discriminability scores for adaptive sample weighting, helping the classifier to emphasize high-quality samples. MAESTRO also trains three specialized GANs per minority class and applies strict five-model ensemble validation to synthesize diverse high-fidelity samples, addressing extreme class imbalance. We evaluate MAESTRO under systematic imbalance settings, ranging from the natural distribution to an extreme 1% malicious ratio. On the CCS’22 Tor malware dataset, MAESTRO achieves 92.38% accuracy, 64.79% recall, and 73.70% F1-score under the natural distribution, improving F1-score by up to 15.53% compared with state-of-the-art baselines. Under the 1% malicious setting, MAESTRO maintains 21.1% recall, which is 14.1 percentage points higher than the best baseline, while conventional methods drop below 10%. Full article

(This article belongs to the Special Issue New Advances in Network Security and Data Privacy)

6 pages, 915 KB

Open Access Proceeding Paper

Shield-X: Vectorization and Machine Learning-Based Pipeline for Network Traffic Threat Detection

by Claudio Henrique Marques de Oliveira, Marcelo Ladeira, Gustavo Cordeiro Galvao Van Erven and João José Costa Gondim

Eng. Proc. 2026, 123(1), 10; https://doi.org/10.3390/engproc2026123010 - 2 Feb 2026

Viewed by 172

Abstract

This paper presents an integrative methodology combining advanced network packet vectorization techniques, parallel processing with Dask, GPU-optimized machine learning models, and the Qdrant vector database. Our approach achieves a 99.9% detection rate for malicious traffic with only a 1% false-positive rate, setting new performance benchmarks for cybersecurity systems. The methodology establishes an average detection time limit not exceeding 10% of the total response time, maintaining high precision even for sophisticated attacks. The system processes 56 GB of PCAP files from Malware-Traffic-Analysis.net (2020–2024) through a five-stage pipeline: distributed packet processing, feature extraction, vectorization, vector database storage, and GPU-accelerated classification using XGBoost, Random Forest, and K-Nearest Neighbors models. Full article

(This article belongs to the Proceedings of First Summer School on Artificial Intelligence in Cybersecurity)

24 pages, 1253 KB

Open Access Editor’s Choice Article

Re-Evaluating Android Malware Detection: Tabular Features, Vision Models, and Ensembles

by Prajwal Hosahalli Dayananda and Zesheng Chen

Electronics 2026, 15(3), 544; https://doi.org/10.3390/electronics15030544 - 27 Jan 2026

Viewed by 308

Abstract

Static, machine learning-based malware detection is widely used in Android security products, where even small increases in false-positive rates can impose significant burdens on analysts and cause unacceptable disruptions for end users. Both tabular features and image-based representations have been explored for Android malware detection. However, existing public benchmark datasets do not provide paired tabular and image representations for the same samples, limiting direct comparisons between tabular models and vision-based models. This work investigates whether carefully engineered, domain-specific tabular features can match or surpass the performance of state-of-the-art deep vision models under strict false-positive-rate constraints, and whether ensemble approaches justify their additional complexity. To enable this analysis, we construct a large corpus of Android applications with paired static representations and evaluate six popular machine learning models on the exact same samples: two tabular models using EMBER features, two tabular models using extended EMBER features, and two vision-based models using malware images. Our results show that a LightGBM model trained on extended EMBER features outperforms all other evaluated models, as well as a state-of-the-art approach trained on a much larger dataset. Furthermore, we develop an ensemble model combining both tabular and vision-based detectors, which yields a modest performance improvement but at the cost of substantial additional computational and engineering overhead. Full article

(This article belongs to the Special Issue Feature Papers in Networks: 2025–2026 Edition)
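
Evaluating a detector under a strict false-positive-rate constraint, as the abstract above describes, usually means fixing the decision threshold from benign scores first and only then measuring the detection rate. The sketch below shows one common way to do that; the function names and the scores are hypothetical and do not come from the paper.

```python
def threshold_at_fpr(benign_scores, max_fpr):
    """Lowest threshold t such that the share of benign scores > t is <= max_fpr."""
    ranked = sorted(benign_scores, reverse=True)
    allowed = int(max_fpr * len(ranked))  # benign samples we may wrongly flag
    if allowed >= len(ranked):
        return float("-inf")  # constraint is loose enough to flag everything
    # Only scores strictly above ranked[allowed] are flagged: at most `allowed`.
    return ranked[allowed]


def tpr_at_threshold(malware_scores, threshold):
    """Fraction of malware samples scoring strictly above the threshold."""
    return sum(s > threshold for s in malware_scores) / len(malware_scores)


if __name__ == "__main__":
    benign = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
    malware = [0.92, 0.99, 0.5, 0.97]
    t = threshold_at_fpr(benign, max_fpr=0.1)
    print(t, tpr_at_threshold(malware, t))
```

Comparing models by their detection rate at the same fixed false-positive rate avoids rewarding a model that simply flags more samples.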

32 pages, 4159 KB

Open Access Article

APT Malware Detection Model Based on Heterogeneous Multimodal Semantic Fusion

by Chaosen Pu and Liang Wan

Appl. Sci. 2026, 16(2), 1083; https://doi.org/10.3390/app16021083 - 21 Jan 2026

Viewed by 285

Abstract

In recent years, Advanced Persistent Threat (APT) malware, with its high stealth, has made it difficult for unimodal detection methods to accurately identify its disguised malicious behaviors. To address this challenge, this paper proposes an APT Malware Detection Model based on Heterogeneous Multimodal Semantic Fusion (HMSF-ADM). By integrating the API call sequence features of APT malware in the operating system and the RGB image features of PE files, the model constructs multimodal representations with stronger discriminability, thus achieving efficient and accurate identification of APT malicious behaviors. First, the model employs two encoders, namely a Transformer encoder equipped with the DPCFTE module and a CAS-ViT encoder, to encode sequence features and image features, respectively, completing local–global collaborative context modeling. Then, the sequence encoding results and image encoding results are interactively fused via two cross-attention mechanisms to generate fused representations. Finally, a TextCNN-based classifier is utilized to perform classification prediction on the fused representations. Experimental results on two APT malware datasets demonstrate that the proposed HMSF-ADM model outperforms various mainstream multimodal comparison models in core metrics such as accuracy, precision, and F1-score. Notably, the F1-score of the model exceeds 0.95 for the vast majority of APT malware families, and its accuracy and F1-score both remain above 0.986 in the task of distinguishing between ordinary malware and APT malware. Full article

(This article belongs to the Special Issue Advanced Cybersecurity Applications: Solutions to Counteract Cyber Threats)

23 pages, 1750 KB

Open Access Article

LLM-Generated Samples for Android Malware Detection

by Nik Rollinson and Nikolaos Polatidis

Digital 2026, 6(1), 5; https://doi.org/10.3390/digital6010005 - 18 Jan 2026

Viewed by 499

Abstract

Android malware continues to evolve through obfuscation and polymorphism, posing challenges for both signature-based defenses and machine learning models trained on limited and imbalanced datasets. Synthetic data has been proposed as a remedy for scarcity, yet the role of Large Language Models (LLMs) in generating effective malware data for detection tasks remains underexplored. In this study, we fine-tune GPT-4.1-mini to produce structured records for three malware families: BankBot, Locker/SLocker, and Airpush/StopSMS, using the KronoDroid dataset. After addressing generation inconsistencies with prompt engineering and post-processing, we evaluate multiple classifiers under three settings: training with real data only, real-plus-synthetic data, and synthetic data alone. Results show that real-only training achieves near-perfect detection, while augmentation with synthetic data preserves high performance with only minor degradations. In contrast, synthetic-only training produces mixed outcomes, with effectiveness varying across malware families and fine-tuning strategies. These findings suggest that LLM-generated tabular malware feature records can enhance scarce datasets without compromising detection accuracy, but remain insufficient as a standalone training source. Full article

29 pages, 2529 KB

Open Access Article

Enhancing Imbalanced Malware Detection via CWGAN-GP-Based Data Augmentation and TextCNN–Transformer Integration

by Luqiao Liu and Liang Wan

Symmetry 2025, 17(12), 2153; https://doi.org/10.3390/sym17122153 - 15 Dec 2025

Cited by 1 | Viewed by 470

Abstract

With the rapid growth and increasing sophistication of malicious software (malware), traditional detection methods face significant challenges in addressing emerging threats. Machine learning-based detection approaches rely on manual feature engineering, making it difficult for them to adapt to diverse attack patterns. In contrast, while deep learning methods can automatically extract features, they remain vulnerable to data imbalance and sample scarcity, which lead to poor detection performance for minority-class samples. To address these issues, this study proposes a semantic data augmentation approach based on a Conditional Wasserstein Generative Adversarial Network with Gradient Penalty (CWGAN-GP), and designs a malware detection model that combines a Text Convolutional Neural Network (TextCNN) with a Transformer Encoder, termed Mal-CGP-TTN. The proposed model establishes a symmetry between local feature extraction and global semantic representation, where the convolutional and attention-based components complement each other to achieve balanced learning. First, the proposed method enriches the semantic diversity of the training data by generating high-quality synthetic samples. Then, it leverages multi-scale convolution and self-attention mechanisms to extract both local and global features of malicious behaviors, thereby enabling hierarchical semantic modeling and accurate classification of malicious activities. Experimental results on two public datasets demonstrate that the proposed method outperforms traditional machine learning and mainstream deep learning models in terms of accuracy, precision, and F1-score. Notably, it achieves substantial improvements in detecting minority-class samples. Full article

(This article belongs to the Section Computer)
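
For readers unfamiliar with the CWGAN-GP named in the abstract above, the critic objective of a conditional Wasserstein GAN with gradient penalty is usually written as follows. This is the standard textbook form with a class label $y$ added as the condition; the paper's exact formulation and its choice of $\lambda$ are not given in the abstract, so treat the symbols here as assumptions.

```latex
L_D = \mathbb{E}_{\tilde{x} \sim P_g}\big[D(\tilde{x} \mid y)\big]
    - \mathbb{E}_{x \sim P_r}\big[D(x \mid y)\big]
    + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}
      \Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x} \mid y) \rVert_2 - 1\big)^2\Big],
\qquad
\hat{x} = \epsilon x + (1 - \epsilon)\tilde{x}, \quad \epsilon \sim U[0, 1]
```

Here $P_r$ is the real data distribution, $P_g$ the generator distribution, and $\hat{x}$ a random interpolation between a real and a generated sample. The penalty term pushes the critic's gradient norm toward 1, which is what stabilizes training compared with weight clipping.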

29 pages, 9256 KB

Open Access Article

MaSS-Droid: Android Malware Detection Framework Using Multi-Layer Feature Screening and Stacking Integration

by Zihao Zhang, Qiang Han and Zhichao Shi

Entropy 2025, 27(12), 1252; https://doi.org/10.3390/e27121252 - 11 Dec 2025

Viewed by 456

Abstract

In recent years, the frequent emergence of Android malware has posed a significant threat to user security. The redundancy of features in malicious software samples and the instability of individual model performance have also introduced numerous challenges to malware detection. To address these issues, this paper proposes a malware detection framework named MaSS-Droid, based on Multi-feature and Multi-layer Screening for adaptive Stacking integration. First, three types of features are extracted from APK files: permission features, API call features, and opcode sequences. Then, a three-layer feature screening mechanism is designed to effectively eliminate feature redundancy, improve detection accuracy, and reduce the computational complexity of the model. To tackle the problem of high performance fluctuations and limited generalization ability in single models, this paper proposes an adaptive Stacking integration method (Adaptive-Stacking). By dynamically adjusting the weights of base classifiers, this method significantly enhances the stability and generalization performance of the ensemble model when dealing with complex and diverse malware samples. The experimental results demonstrate that the MaSS-Droid framework can effectively mitigate overfitting, improve the model’s generalization capability, reduce feature redundancy, and significantly enhance the overall stability and accuracy of malware detection. Full article

20 pages, 1193 KB

Open Access Article

RepackDroid: An Efficient Detection Model for Repackaged Android Applications

by Tito Leadon and Karim Elish

Information 2025, 16(12), 1075; https://doi.org/10.3390/info16121075 - 4 Dec 2025

Viewed by 553

Abstract

Repackaged Android applications pose a significant threat to mobile ecosystems, acting as common vectors for malware distribution and intellectual property infringement. Addressing the challenges of existing repackaging detection methods—such as scalability, reliance on app pairs, and high computational costs—this paper presents a novel hybrid approach that combines supervised learning and symptom discovery. We develop a lightweight feature extraction and analysis framework that leverages only 20 discriminative features, including inter-component communication (ICC) patterns, sensitive API usage, permission profiles, and a structural anomaly metric derived from string offset order. Our experiments, conducted on 8441 Android applications sourced from the RePack dataset, demonstrate the effectiveness of our approach, achieving a maximum F1 score of 85.9% and recall of 98.8% using Support Vector Machines—outperforming prior state-of-the-art models that utilized over 500 features. We also evaluate the standalone predictive power of AndroidSOO’s string offset order feature and highlight its value as a low-cost repackaging indicator. This work offers an accurate, efficient, and scalable alternative for automated detection of repackaged mobile applications in large-scale Android marketplaces. Full article

27 pages, 56691 KB

Open Access Article

MalVis: Large-Scale Bytecode Visualization Framework for Explainable Android Malware Detection

by Saleh J. Makkawy, Michael J. De Lucia and Kenneth E. Barner

J. Cybersecur. Priv. 2025, 5(4), 109; https://doi.org/10.3390/jcp5040109 - 4 Dec 2025

Cited by 1 | Viewed by 846

Abstract

As technology advances, developers continually create innovative solutions to enhance smartphone security. However, the rapid spread of Android malware poses significant threats to devices and sensitive data. The open-source nature of the Android Operating System (OS) and the availability of its Software Development Kit (SDK) are the main contributors to this alarming growth. Conventional malware detection methods, such as signature-based, static, and dynamic analysis, face challenges in detecting malware that uses obfuscation techniques, including encryption, packing, and compression. Although developers have created several visualization techniques for malware detection using deep learning (DL), they often fail to accurately identify the critical malicious features of malware. This research introduces MalVis, a unified visualization framework that integrates entropy and N-gram analysis to emphasize meaningful structural and anomalous operational patterns within the malware bytecode. By addressing significant limitations of existing visualization methods, such as insufficient feature representation, limited interpretability, small dataset sizes, and restricted data access, MalVis delivers enhanced detection capabilities, particularly for obfuscated and previously unseen (zero-day) malware. The framework leverages the MalVis dataset introduced in this work, a publicly available large-scale dataset comprising more than 1.3 million visual representations in nine malware classes and one benign class. A comprehensive comparative evaluation was performed against existing state-of-the-art visualization techniques using leading convolutional neural network (CNN) architectures: MobileNet-V2, DenseNet201, ResNet50, VGG16, and Inception-V3. To further boost classification performance and mitigate overfitting, the outputs of these models were combined using eight distinct ensemble strategies. To address the issue of imbalanced class distribution in the multiclass dataset, we employed an undersampling technique to ensure balanced learning across all types of malware. MalVis achieved superior results, with 95% accuracy, 90% F1-score, 92% precision, 89% recall, 87% Matthews Correlation Coefficient (MCC), and 98% Receiver Operating Characteristic Area Under Curve (ROC-AUC). These findings highlight the effectiveness of MalVis in providing interpretable and accurate representation features for malware detection and classification, making it valuable for research and real-world security applications. Full article

(This article belongs to the Section Security Engineering & Applications)
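
The abstract above mentions undersampling to balance the multiclass dataset but does not say which variant was used. A minimal sketch of plain random undersampling, where every class is cut down to the size of the smallest one, is shown below as an assumption; the function name `undersample` is hypothetical.

```python
import random
from collections import defaultdict


def undersample(samples, labels, seed=0):
    """Randomly reduce every class to the size of the smallest class.

    Assumption: simple random undersampling; the paper's exact technique
    is not specified in the abstract.
    """
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    target = min(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        for sample in rng.sample(items, target):  # draw without replacement
            balanced.append((sample, label))
    rng.shuffle(balanced)  # avoid class-ordered batches downstream
    return balanced


if __name__ == "__main__":
    labels = ["adware"] * 5 + ["banker"] * 2 + ["benign"] * 9
    subset = undersample(list(range(len(labels))), labels)
    print(sorted(label for _, label in subset))
```

Undersampling discards majority-class data, which is affordable on a corpus of over a million images but would be costly on a small dataset.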

19 pages, 2271 KB

Open Access Article

Improving the Performance of Static Malware Classification Using Deep Learning Models and Feature Reduction Strategies

by Tai-Hung Lai, Yun-Jyun Tsai and Chiang-Lung Liu

Mathematics 2025, 13(23), 3753; https://doi.org/10.3390/math13233753 - 23 Nov 2025

Viewed by 1102

Abstract

The rapid evolution of malware continues to pose severe challenges to cybersecurity, highlighting the need for accurate and efficient detection systems. Traditional signature- and heuristic-based methods are increasingly inadequate against sophisticated threats, which has motivated the use of machine learning and deep learning for static malware classification. In this study, we propose three deep neural network (DNN) architectures tailored for the binary classification of Portable Executable (PE) files. The models were trained and validated on the EMBER 2017 dataset and further tested on the independent REWEMA dataset to evaluate their cross-dataset generalization capabilities. To address the computational burden of high-dimensional feature vectors, two feature reduction strategies were examined: the Kumar method, which selected 276 features, and the LightGBM-based intersection method, which identified 206 shared features. Experimental results showed that the proposed Model III consistently achieved the best overall performance, outperforming LightGBM (v3.3.5) and the other DNN models in terms of accuracy, recall, and F1-score. Notably, its recall exceeded that of LightGBM by 0.73%, highlighting its superiority in reducing false negative rates. Feature reduction further demonstrated that significant dimensionality reduction could be achieved without compromising classification quality, with the Kumar method achieving the best balance between accuracy and efficiency. Cross-dataset validation revealed performance degradation across all models due to distributional shifts, but the decline was less significant for the DNNs, confirming their greater adaptability compared with LightGBM. These findings demonstrate that architectural optimization and appropriate feature selection can significantly improve the performance of static malware classification. This study also provides empirical benchmarks and methodological guidance for developing accurate and efficient malware detection systems that are resilient to evolving threats. Full article

(This article belongs to the Special Issue Emerging Applications of Artificial Intelligence Algorithms in Computer and Network Security)

22 pages, 2460 KB

Open Access Article

AI-Driven Cybersecurity in IoT: Adaptive Malware Detection and Lightweight Encryption via TRIM-SEC Framework

by Ibrahim Mutambik

Sensors 2025, 25(22), 7072; https://doi.org/10.3390/s25227072 - 19 Nov 2025

Cited by 1 | Viewed by 1113

Abstract

The explosive growth in Internet of Things (IoT) technologies has given rise to significant security concerns, especially with the emergence of sophisticated and zero-day malware attacks. Conventional malware detection methods based on static or dynamic analysis often fail to meet the real-time operational needs and limited-resource constraints typical of IoT systems. This paper proposes TRIM-SEC (Transformer-Integrated Malware Security and Encryption for IoT), a lightweight and scalable framework that unifies intelligent threat detection with secure data transmission. The framework begins with Autoencoder-Based Feature Denoising (AEFD) to eliminate noise and enhance input quality, followed by Principal Component Analysis (PCA) for efficient dimensionality reduction. Malware classification is performed using a Transformer-Augmented Neural Network (TANN), which leverages multi-head self-attention to capture both contextual and temporal dependencies, enabling accurate detection of diverse threats such as botnets and zero-day exploits. For secure communication, TRIM-SEC incorporates Lightweight Elliptic Curve Cryptography (LECC), enhanced with Particle Swarm Optimization (PSO) to generate cryptographic keys with minimal computational burden. The framework is rigorously evaluated against advanced baselines, including LSTM-based IDS, CNN-GRU hybrids, and blockchain-enhanced security models. Experimental results show that TRIM-SEC delivers higher detection accuracy, fewer false alarms, and reduced encryption latency, which makes it well-suited for real-time operation in smart IoT ecosystems. Its balanced integration of detection performance, cryptographic strength, and computational efficiency positions TRIM-SEC as a promising solution for securing next-generation IoT environments. Full article

(This article belongs to the Special Issue AI, Machine Learning (ML), and Large Language Models (LLMs) for Cybersecurity in Sensor Networks)
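
The TRIM-SEC abstract above names multi-head self-attention as the core of its classifier. As a rough illustration of that building block only, the following is a minimal single-head scaled dot-product attention in plain Python. It is a sketch under stated assumptions, not the authors' TANN: the function names `softmax` and `attention` are hypothetical, there are no learned projection matrices, and a multi-head version would run several such heads in parallel and concatenate their outputs.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q: Matrix, k: Matrix, v: Matrix) -> Matrix:
    """Single-head scaled dot-product attention: softmax(QK^T / sqrt(d)) V.

    Illustrative only: q, k, v are passed in directly, whereas a real
    Transformer layer derives them from learned linear projections.
    """
    d = len(k[0])  # key dimension, used for scaling
    out: Matrix = []
    for qi in q:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [
            sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k
        ]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output column at a time.
        out.append(
            [
                sum(w * vj[c] for w, vj in zip(weights, v))
                for c in range(len(v[0]))
            ]
        )
    return out


if __name__ == "__main__":
    # Two identical keys give equal weights, so the output is the mean of v.
    q = [[1.0, 0.0]]
    k = [[0.0, 0.0], [0.0, 0.0]]
    v = [[1.0, 2.0], [3.0, 4.0]]
    print(attention(q, k, v))
```

Each output row is a weighted average of the value vectors, with the weights set by how closely the query matches each key. This is how an attention layer lets one position in a feature sequence draw on context from every other position.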

20 pages, 1031 KB

Open Access Article

MalRefiner: Recovering Malware Semantics via Reinforcement Learning-Based Semantic NOP Removal

by Jiankun Sun, Fan Shi, Min Zhang, Miao Hu, Pengfei Xue, Cheng Huang and Chengxi Xu

Appl. Sci. 2025, 15(22), 12015; https://doi.org/10.3390/app152212015 - 12 Nov 2025

Viewed by 578

Abstract

Adversarial evasion against learning-based malware detectors has shifted from feature-space perturbations to semantic-preserving, problem-space manipulations. In this paradigm, attackers inject semantic NOPs (instructions that have no functional effect but shift the static feature distribution) into assembly code to suppress detection confidence. Existing defenses primarily recalibrate classifier decision boundaries, leaving the adversarially modified malware intact and thereby hindering downstream tasks such as malicious API localization and capability attribution. We introduce MalRefiner, a reinforcement-learning agent that automatically identifies and removes adversarially inserted semantic NOPs to restore the original malicious representation. The recovery process is formulated as a Markov Decision Process, where a policy network sequentially decides whether to retain or remove each opcode. The agent is trained with a composite reward function that balances detection confidence recovery with semantic preservation, guided by a lightweight 1D causal convolutional environment providing compact state representations and delayed rewards. Extensive evaluation on the PEMML and RawMal-TF datasets against four state-of-the-art detectors (1D CNN, MalConv, TCN, and MALIGN) demonstrates that MalRefiner restores F1 to within 3.18 ± 0.94% of the clean baseline and achieves a recovery rate exceeding 90% across all models and datasets, without requiring retraining or architectural modification of the target classifier.

(This article belongs to the Section Computing and Artificial Intelligence)
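
The MalRefiner abstract above describes a sequential retain-or-remove decision over opcodes, scored by a composite reward. A toy version of that loop might look like the sketch below. Everything in it is an assumption made for illustration: the names `refine`, `toy_policy`, `composite_reward`, and `ASSUMED_NOP_PATTERNS`, the hand-written rule standing in for the learned policy network, and the particular reward formula (confidence gain minus a weighted removal fraction) are not taken from the paper.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for patterns a learned policy might flag.
ASSUMED_NOP_PATTERNS = {"nop"}


def toy_policy(kept_so_far: Sequence[str], opcode: str) -> bool:
    """Return True to retain the opcode, False to remove it.

    A real agent would use a trained policy network over a state
    representation; this fixed rule only illustrates the interface.
    """
    return opcode not in ASSUMED_NOP_PATTERNS


def refine(
    opcodes: Sequence[str],
    policy: Callable[[Sequence[str], str], bool],
) -> List[str]:
    """Walk the opcode sequence once, making one decision per step."""
    kept: List[str] = []
    for opcode in opcodes:
        if policy(kept, opcode):
            kept.append(opcode)
    return kept


def composite_reward(
    conf_before: float,
    conf_after: float,
    n_removed: int,
    n_total: int,
    lam: float = 0.5,
) -> float:
    """Assumed reward: detector confidence gain minus a removal penalty.

    The penalty term is a crude proxy for semantic preservation: the
    more opcodes are removed, the more the reward is reduced.
    """
    recovery = conf_after - conf_before
    removed_fraction = n_removed / n_total if n_total else 0.0
    return recovery - lam * removed_fraction


if __name__ == "__main__":
    sample = ["push", "nop", "mov", "nop", "call"]
    cleaned = refine(sample, toy_policy)
    print(cleaned)
    # The confidence values here are made-up inputs, not measured results.
    print(composite_reward(0.2, 0.9, len(sample) - len(cleaned), len(sample)))
```

The reward is computed once after the whole sequence has been processed, which loosely mirrors the delayed rewards the abstract mentions. The weight `lam` controls how strongly removals are discouraged relative to the recovered detection confidence.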

Search Results (308)
