SentinelCMS: Proactive Vulnerability Detection in CMS Plugins Using Static Taint Analysis and Bidirectional LSTM

Tashenova, Zhuldyz; Aitmagambetuly, Aisultan; Bayegizova, Aigulim; Santeyeva, Saya; Abdugulova, Zhanat; Amanzholova, Shirin; Kerim, Akerke

doi:10.3390/app16115471


by Zhuldyz Tashenova ¹, Aisultan Aitmagambetuly ², Aigulim Bayegizova ¹, Saya Santeyeva ¹,*, Zhanat Abdugulova ¹,*, Shirin Amanzholova ³ and Akerke Kerim ¹

¹ Department of Information Security System, Faculty of Information Technologies, L. N. Gumilyov Eurasian National University, Astana 010000, Kazakhstan

² Information Security Systems (ISS), Department of Information Security System, L. N. Gumilyov Eurasian National University, Astana 010000, Kazakhstan

³ Department of Information Technologies, Kurmangazy Kazakh National Conservatory, Almaty 050000, Kazakhstan

* Authors to whom correspondence should be addressed.

Appl. Sci. 2026, 16(11), 5471; https://doi.org/10.3390/app16115471

Submission received: 22 April 2026 / Revised: 10 May 2026 / Accepted: 20 May 2026 / Published: 1 June 2026


Abstract

Content Management Systems (CMS) have fundamentally transformed the digital landscape, powering a substantial majority of the modern web. However, the ubiquity of platforms such as WordPress, Joomla, and Drupal has made them primary targets for cybercriminals. The central security weakness lies not within the core software but within the complex ecosystem of third-party extensions and themes, which account for the vast majority of reported vulnerabilities. This article presents a comprehensive analysis of the current CMS security landscape, synthesizing empirical data on the prevalence of outdated components and the efficacy of existing scanning tools. Based on a critical review of recent literature and an analysis of common attack vectors, the study identifies the fundamental limitations of traditional, reactive security paradigms that rely on signature-based detection. To address the critical gap in detecting zero-day threats, this research proposes and implements a proactive vulnerability detection system—SentinelCMS—combining static Taint Analysis with a Bidirectional LSTM neural network classifier. The system was validated on an augmented dataset of 600 PHP code samples using 5-fold cross-validation. An experimental proof of concept validates that static analysis of plugin source code can extract semantic features to accurately classify vulnerabilities into specific categories (SQL Injection, Cross-Site Scripting, Remote Code Execution) with an overall accuracy of 93%. The implementation of this system aims to shift the cybersecurity paradigm from incident response to threat prevention, thereby significantly enhancing the resilience of web resources.

Keywords: CMS; WordPress; cybersecurity; machine learning; static application security testing (SAST); proactive security; taint analysis; Bi-LSTM; zero-day detection; CVSS

1. Introduction

The digital economy is increasingly reliant on Content Management Systems (CMS). These platforms have democratized web publishing, allowing organizations to deploy complex websites without extensive technical expertise. According to recent market data, WordPress alone powers a significant portion of the internet, holding a massive market share among CMS-based sites [1]. However, this dominance creates a monoculture that is highly attractive to threat actors; a single vulnerability in a popular plugin can instantly compromise millions of websites globally. Remotely triggered malware exploits in CMS-based applications are becoming increasingly sophisticated, necessitating advanced defense mechanisms [2].

The architecture of modern CMS platforms is modular. While the “Core” system is maintained by a dedicated team of security experts, functionality is extended via plugins and themes developed by a disparate community. These developers often lack rigorous security training, leading to the introduction of critical vulnerabilities such as Cross-Site Scripting (XSS), SQL Injection (SQLi), and Remote Code Execution (RCE).

1.1. Problem Statement

Current security measures are predominantly reactive. Web Application Firewalls (WAFs) block attacks based on known signatures, and vulnerability scanners compare installed components against databases of known exploits (CVEs) [3]. These methods share a fatal flaw: they are ineffective against zero-day vulnerabilities—flaws that are unknown to the vendor and for which no patch or signature exists. The time lag between the discovery of a vulnerability and the release of a patch constitutes a “window of exposure” where most breaches occur. Traditional techniques often struggle to identify novel threats without AI-driven assistance [4].

1.2. Research Objective

The primary objective of this study is to develop, implement, and experimentally validate a proactive system for detecting vulnerabilities in CMS plugins. The system combines two complementary approaches: static Taint Analysis for tracking unsanitized data flows and a Bidirectional LSTM neural network for sequence-based classification of vulnerability types. Unlike traditional scanners, this system analyzes the source code structure to identify potential threats before the plugin is deployed. The proposed system operates exclusively on static source code and therefore requires access to plugin source files. For open-source WordPress plugins—which constitute over 99% of the official WordPress Plugin Repository (https://wordpress.org/plugins/, accessed on 19 May 2026)—the source code is publicly available. Commercial plugins distributed as obfuscated PHP bytecode are outside the current system scope, a limitation discussed in Section 6.4.

2. Literature Review

A structured review of recent literature, based on a search of Google Scholar, IEEE Xplore, ACM Digital Library, and Scopus using the terms CMS security, WordPress vulnerability, taint analysis PHP, and LSTM vulnerability detection (2020–2026), highlights several critical themes. The initial search yielded 214 candidate papers; after title/abstract screening, 67 were retained for full review; 19 are included as primary references.

2.1. Prevalence of Vulnerabilities and Outdated Components

The issue of “software aging” in the CMS ecosystem is critical. A large-scale analysis of the one million largest WordPress websites revealed a disturbing trend: while many websites maintain an up-to-date CMS core, the update rate for plugins and themes is significantly lower [5]. This discrepancy creates a massive attack surface. Furthermore, vulnerabilities in outdated content management systems remain a persistent entry point for attackers.

2.2. Data Privacy and Regulatory Compliance

Security is not merely a technical issue but a legal one. The intersection of open-source CMS security and data privacy regulations emphasizes that open-source platforms pose inherent privacy risks due to their reliance on community contributions [6]. A vulnerability in a plugin that processes user data can lead to massive data leaks. Security best practices must be rigorously applied to protect sensitive data in these ecosystems [7].

Beyond general best practices, CMS-based deployments are subject to binding regulatory frameworks. The EU General Data Protection Regulation (GDPR, Regulation 2016/679) mandates data protection by design and by default, requiring that all software components—including third-party plugins—implement adequate technical security measures. The Payment Card Industry Data Security Standard (PCI DSS v4.0) imposes specific web application security requirements for e-commerce environments. The NIS2 Directive (EU 2022/2555) extends obligations to operators of essential and important services using web infrastructure. Non-compliance with these frameworks can result in significant financial penalties, making plugin security a direct legal liability.

2.3. Limitations of Existing Detection Tools

A comparative analysis of popular web vulnerability scanners indicates significant disparities in performance [8]. Crucially, scanners often demonstrate high rates of false negatives when dealing with complex logic flaws. Specific analyses of file upload bugs—a critical RCE vector—found that many existing tools fail to detect them effectively without specialized penetration testing approaches [9].

2.4. The Shift Towards Machine Learning

Recent works demonstrate the growing potential of deep learning in this domain. Researchers utilized Long Short-Term Memory (LSTM) networks to detect anomalies in system logs, effectively identifying zero-day attacks [10]. Automated penetration testing using deep reinforcement learning validates that ML models can learn attack patterns dynamically [11]. The application of machine learning and deep learning for cybersecurity solutions is becoming essential to learn “patterns of insecurity” without relying on static signatures [12].

Emerging research on large language models for code security has further demonstrated the potential for semantic understanding of vulnerability patterns. VulLLM [13] demonstrated that fine-tuned LLMs can outperform traditional SAST on multi-class vulnerability classification, achieving F1-macro scores above 0.90 on real-world datasets. CodeBERT-SecEval [14] established benchmarks for security-aware code embeddings. Sun et al. [15] conducted a systematic study of LLM applicability to SAST tasks, identifying key limitations in interprocedural reasoning—a gap also present in the current work.

2.5. Static Taint Analysis for Vulnerability Detection

Static Taint Analysis is a well-established technique for tracking the propagation of untrusted data through program code. The approach marks user-controlled inputs (Sources) as “tainted” and tracks their flow through variable assignments until they reach security-sensitive operations (Sinks). If tainted data reach a Sink without passing through an appropriate Sanitization function, a potential vulnerability is reported [16]. This technique is particularly effective for detecting injection-class vulnerabilities (SQLi, XSS, RCE) in PHP-based CMS plugins, where the data flow from superglobals ($_GET, $_POST) to database queries and output functions follows well-defined patterns.

The reviewed literature reveals three persistent gaps: (i) absence of proactive, pre-deployment scanning tools for CMS plugins; (ii) lack of multi-class vulnerability classification beyond binary safe/unsafe categorization; and (iii) no publicly available labeled dataset for PHP CMS plugin vulnerability detection. SentinelCMS directly addresses gaps (i) and (ii), while the augmented synthetic dataset with real CVE-based patterns addresses gap (iii).

3. Classification and Analysis of CMS Vulnerabilities

To build an effective detection model, one must first understand the nature of the threats. Based on the analysis of CVE databases and academic reports, several persistent threat categories have been identified and are presented in Table 1.

The vulnerability taxonomy in Table 1 was constructed through a two-step process: (1) extraction of top vulnerability categories from the NVD/CVE database filtered by product type WordPress plugin for 2019–2024 (n = 3847 entries); (2) mapping to OWASP Top 10 (2021 edition). The five classes presented account for over 87% of all reported WordPress plugin CVEs in the analyzed period [17].

Table 2 summarizes the severity classification of major web application vulnerabilities according to the CVSS v3.1 framework, highlighting that SQL Injection and Remote Code Execution represent critical risks with the potential for complete system compromise and server takeover. In contrast, Stored XSS, Reflected XSS, and CSRF exhibit high-to-medium severity levels, primarily affecting user sessions, client-side security, and unauthorized transaction execution [12].

Detailed Analysis of Key Threats

Cross-Site Scripting (XSS): XSS remains a dominant threat. Context-sensitive XSS often occurs when a plugin accepts user input and displays it without proper sanitization [18]. In WordPress, this typically manifests as a direct echo of $_GET or $_POST variables without esc_html() or esc_attr() escaping. Figure 1 illustrates the distribution of vulnerability types in web applications.

SQL Injection (SQLi): This occurs when user input is concatenated directly into a database query. In the WordPress ecosystem, this specifically involves using $wpdb->query() or $wpdb->get_results() with unsanitized variables instead of $wpdb->prepare() with parameterized placeholders. Legacy web applications are particularly prone to this [17]. Critical SQLi vulnerabilities in popular plugins continue to affect millions of sites [19].

File Upload Vulnerabilities: This involves a critical vector where attackers upload PHP shells disguised as images. Unrestricted file uploads remain a top vector for full server compromise [9].

Path Traversal: This vulnerability class (approximately 8% of reported CMS plugin flaws per Patchstack 2024 [17]) allows attackers to access files outside the intended web root by manipulating file path parameters [20]. In WordPress plugins, this typically occurs when user-supplied input is used in include() or file_get_contents() calls without proper path canonicalization, enabling attackers to read sensitive configuration files such as wp-config.php [21,22,23].
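The canonicalization check that the path traversal paragraph refers to can be sketched as follows. This is a Python analogue for illustration only (a WordPress plugin would perform the equivalent check in PHP, for example with realpath()); the function name is_within and the example directory are hypothetical.

```python
import os


def is_within(base_dir, user_path):
    """Return True only if user_path resolves to a location inside base_dir."""
    base = os.path.realpath(base_dir)
    # Resolve "..", "." and symlinks before comparing against the base directory.
    target = os.path.realpath(os.path.join(base, user_path))
    return os.path.commonpath([base, target]) == base


# Hypothetical template directory used only for this example.
TEMPLATE_DIR = "/var/www/plugin/templates"
print(is_within(TEMPLATE_DIR, "header.php"))
print(is_within(TEMPLATE_DIR, "../../wp-config.php"))
```

A request is served only when the check passes; a relative path that climbs out of the template directory, or an absolute path, resolves outside the base and is rejected.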

4. Evaluation of Existing Detection Methodologies

Current detection approaches can be categorized into reactive and proactive methods, with the industry standard leaning heavily towards reactive measures. Table 3 presents a comparative analysis of these methods.

The Reactive Lifecycle Problem

The fundamental flaw of the current ecosystem is the reactive lifecycle. A vulnerability exists in the wild from the moment a developer writes the code. However, protection mechanisms only activate after the vulnerability is discovered and patched. The time gap between exploitation and patch installation is where the damage occurs. Figure 2 illustrates a typical vulnerability lifecycle with a reactive approach. Proactive detection aims to close this gap by moving the detection phase to before installation.

The lifecycle begins when a developer writes insecure code (T0). The flaw is discovered at some later point (T1), either by researchers or by attackers. The vendor then releases a patch (T2), which site administrators must apply (T3). The window T0–T3 represents the period of maximum risk: before T2 no patch exists, and between T2 and T3 sites remain exposed until the update is applied. In practice, T3 may lag T2 by months or years due to deferred updates [5]. SentinelCMS aims to eliminate this window by moving detection to T-1, before the plugin is installed.

5. Proposed Proactive Model: SentinelCMS

This research proposes and implements a novel architecture for a plugin scanning system—SentinelCMS—that combines Static Taint Analysis with a Bidirectional LSTM neural network classifier.

5.1. System Architecture

The proposed system architecture consists of two main phases: the offline training phase and the online scanning phase. The system employs a dual-analysis approach where static Taint Analysis and neural network classification operate in parallel to provide comprehensive vulnerability coverage. Figure 3 presents the complete architecture of the proposed system.

5.2. Methodology Description

Phase 1—Semantic Tokenization: Raw PHP source code is processed by a custom Semantic Tokenizer that performs regex-based lexical analysis focused on security-relevant constructs. The tokenizer classifies code elements into a security taxonomy: Sources (user-controlled inputs such as $_GET, $_POST, $_REQUEST, and $_COOKIE), Sinks (dangerous functions categorized by vulnerability type), and Sanitizers (escaping and validation functions). Each token is assigned a semantic label (e.g., INPUT_GET, SINK_WPDB_QUERY, SAN_ESC_HTML) and converted to an integer ID for neural network processing.
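A minimal Python sketch of this tokenization step is given below. It is not the SentinelCMS tokenizer: only the labels INPUT_GET, SINK_WPDB_QUERY, and SAN_ESC_HTML come from the text, while the other labels, all regular expressions, the rule of skipping string literals, and the ID scheme (0 reserved for padding) are assumptions made for illustration.

```python
import re

# Ordered (label, regex) pairs; earlier entries win when several could match.
# Only INPUT_GET, SINK_WPDB_QUERY and SAN_ESC_HTML are labels named in the text;
# every other label and all regexes here are illustrative assumptions.
TOKEN_PATTERNS = [
    ("STRING", r'"(?:[^"\\]|\\.)*"|\'(?:[^\'\\]|\\.)*\''),  # matched, then skipped
    ("INPUT_GET", r"\$_GET\b"),
    ("INPUT_POST", r"\$_POST\b"),
    ("INPUT_REQUEST", r"\$_REQUEST\b"),
    ("INPUT_COOKIE", r"\$_COOKIE\b"),
    ("SINK_WPDB_QUERY", r"\$wpdb->query\b"),
    ("SINK_ECHO", r"\becho\b"),
    ("SINK_EVAL", r"\beval\b"),
    ("SAN_WPDB_PREPARE", r"\$wpdb->prepare\b"),
    ("SAN_ESC_HTML", r"\besc_html\b"),
    ("SAN_INTVAL", r"\bintval\b"),
    ("VAR", r"\$[A-Za-z_]\w*"),
    ("ASSIGN", r"(?<![=!<>])=(?![=>])"),
    ("CONCAT", r"\."),
]
MASTER = re.compile("|".join(f"(?P<{label}>{rx})" for label, rx in TOKEN_PATTERNS))


def tokenize(php_code):
    """Return the sequence of semantic labels found in a PHP snippet."""
    return [m.lastgroup for m in MASTER.finditer(php_code) if m.lastgroup != "STRING"]


def encode(labels, vocab):
    """Map labels to integer IDs, growing vocab on first sight (0 is kept for padding)."""
    return [vocab.setdefault(label, len(vocab) + 1) for label in labels]


snippet = '$id = $_GET["id"]; $wpdb->query("SELECT * FROM t WHERE id=" . $id);'
labels = tokenize(snippet)
print(labels)
print(encode(labels, {}))
```

Skipping string literals keeps characters such as "=" or "." inside SQL text from being mistaken for operators, at the cost of ignoring variables interpolated into double-quoted PHP strings.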

Phase 2—Static Taint Analysis: A two-pass intra-procedural taint analyzer tracks data flow through the code. In the first pass (forward propagation), the analyzer builds a taint map by tracing variable assignments from user-controlled sources through intermediate variables. In the second pass (sink checking), the analyzer examines whether tainted variables reach security-sensitive sinks without proper sanitization. Each detected flow is classified as VULNERABLE or SAFE and assigned a specific vulnerability type (SQLi, XSS, or RCE) based on the sink category.
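The two passes described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the SentinelCMS analyzer: the pattern lists are abbreviated, each statement is assumed to sit on its own line, the second pass uses the final taint map rather than a per-line state, and any sanitizer call in an expression is treated as sanitizing the whole expression.

```python
import re

SOURCES = re.compile(r"\$_(GET|POST|REQUEST|COOKIE)\b")
SANITIZERS = re.compile(
    r"\b(esc_html|esc_attr|intval|absint|sanitize_text_field|htmlspecialchars)\s*\("
    r"|\$wpdb->prepare\s*\("
)
SINKS = {
    "SQLi": re.compile(r"\$wpdb->(query|get_results|get_var)\s*\("),
    "XSS": re.compile(r"\b(echo|print)\b"),
    "RCE": re.compile(r"\b(eval|exec|system|shell_exec|passthru)\s*\("),
}
ASSIGNMENT = re.compile(r"^\s*(\$\w+)\s*=\s*(.+);\s*$")
VARIABLE = re.compile(r"\$\w+")


def carries_taint(expr, tainted):
    """True if expr mentions a source or an already tainted variable."""
    return bool(SOURCES.search(expr)) or any(v in tainted for v in VARIABLE.findall(expr))


def analyze(php_lines):
    # Pass 1 (forward propagation): build the taint map from assignments.
    tainted = set()
    for line in php_lines:
        m = ASSIGNMENT.match(line)
        if not m:
            continue
        target, expr = m.groups()
        if carries_taint(expr, tainted) and not SANITIZERS.search(expr):
            tainted.add(target)
        else:
            tainted.discard(target)

    # Pass 2 (sink checking): report every sink and whether tainted data reaches it.
    findings = []
    for lineno, line in enumerate(php_lines, start=1):
        for vuln_type, sink in SINKS.items():
            m = sink.search(line)
            if not m:
                continue
            args = line[m.end():]
            unsafe = carries_taint(args, tainted) and not SANITIZERS.search(args)
            findings.append((lineno, vuln_type, "VULNERABLE" if unsafe else "SAFE"))
    return findings


print(analyze(['$id = $_GET["id"];',
               '$wpdb->query("SELECT * FROM t WHERE id=" . $id);']))
```

Each finding carries a line number, a vulnerability type taken from the sink category, and a VULNERABLE or SAFE verdict, mirroring the output described in the text.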

Phase 3—Bi-LSTM Classification: The sequence of semantic token IDs is fed into a Bidirectional LSTM neural network for sequence-based classification. Figure 4 shows the architecture of the Bi-LSTM classifier.

5.3. Neural Network Architecture

5.3.1. Architectural Design Rationale

The Bi-LSTM architecture was selected based on three considerations: (1) Code token sequences have bidirectional dependencies: a sanitizer appearing after a tainted variable is as important as one appearing before it, which motivates bidirectional rather than unidirectional LSTM processing. (2) LSTM outperforms a vanilla RNN on sequences of up to 256 tokens because its gating mechanism mitigates vanishing gradients [9]. (3) Compared to Transformer-based models such as CodeBERT [10], LSTM has significantly lower computational requirements, which suits a lightweight PoC scanner. The hidden dimension of 128 and the two stacked layers were selected via grid search over {64, 128, 256} × {1, 2, 3} (hidden dimension × number of layers) on a held-out validation fold.

5.3.2. Architecture Details

The classifier employs the following architecture:

Embedding Layer: This converts discrete token IDs into dense vector representations (dim = 64).
Bidirectional LSTM: Two stacked Bi-LSTM layers (hidden_dim = 128) process the token sequence in both forward and backward directions, capturing long-range dependencies in code structure.
Attention Mechanism: A learned attention layer computes weighted importance scores for each timestep, allowing the model to focus on the most security-relevant tokens.
Classification Head: A fully connected network (256 → 128 → 4) with ReLU activation and dropout (p = 0.3) produces probability distributions over four classes: Safe, SQL Injection, XSS, and RCE.

Layer normalization is applied after the LSTM outputs, and gradient clipping (max_norm = 1.0) is used during training to stabilize convergence.
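The layer sizes listed above can be assembled into a PyTorch module roughly as follows. This is a sketch under assumptions rather than the authors' code: the vocabulary size, the use of token ID 0 for padding, the single linear layer that scores each timestep, and the masking of padded positions are not specified in the text. The optimizer and loss follow the configuration given in Section 6.2.3.

```python
import torch
import torch.nn as nn


class BiLSTMAttentionClassifier(nn.Module):
    """Embedding -> 2-layer Bi-LSTM -> LayerNorm -> attention pooling -> MLP head."""

    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128,
                 num_layers=2, num_classes=4, dropout=0.3):
        super().__init__()
        # Assumption: token ID 0 is the padding ID.
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers=num_layers,
                            batch_first=True, bidirectional=True)
        self.norm = nn.LayerNorm(2 * hidden_dim)
        # Assumption: one linear layer produces a scalar score per timestep.
        self.attention = nn.Linear(2 * hidden_dim, 1)
        # The 256 -> 128 -> 4 classification head (256 = 2 * hidden_dim).
        self.head = nn.Sequential(
            nn.Linear(2 * hidden_dim, 128), nn.ReLU(), nn.Dropout(dropout),
            nn.Linear(128, num_classes),
        )

    def forward(self, token_ids):
        mask = token_ids != 0                                # True on real tokens
        states, _ = self.lstm(self.embedding(token_ids))     # (batch, seq, 256)
        states = self.norm(states)
        scores = self.attention(states).squeeze(-1)          # (batch, seq)
        scores = scores.masked_fill(~mask, float("-inf"))    # ignore padding
        weights = torch.softmax(scores, dim=1)
        context = (weights.unsqueeze(-1) * states).sum(dim=1)
        return self.head(context)                            # raw logits


def train_step(model, optimizer, token_ids, targets):
    """One optimization step with cross-entropy loss and gradient clipping."""
    model.train()
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(token_ids), targets)
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    return loss.item()


model = BiLSTMAttentionClassifier(vocab_size=60)  # hypothetical vocabulary size
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
```

The module returns logits; applying a softmax over the last dimension yields the probability distribution over Safe, SQL Injection, XSS, and RCE. A sequence consisting only of padding would need separate handling, since every attention score would be masked.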

Compared to related approaches such as unidirectional LSTM [10] and Random Forest-based classifiers [12], the Bi-LSTM + Attention architecture offers superior capture of bidirectional context in token sequences, which is critical for PHP code, where the security-relevant token (e.g., a sanitizer) may appear either before or after the data flow assignment [9].

6. Experimental Study (Proof of Concept)

To empirically validate the feasibility of the proposed proactive detection framework, a proof-of-concept experiment was conducted following the IMRAD methodology.

6.1. Introduction to the Experiment

The primary hypothesis is that the combination of static Taint Analysis and a Bidirectional Long Short-Term Memory (Bi-LSTM) classifier can accurately detect and categorize CMS plugin vulnerabilities by analyzing source code features, without relying on predefined CVE signatures. The system should be able to distinguish between specific vulnerability classes (SQLi, XSS, and RCE) rather than providing a binary safe/vulnerable verdict.

Risk severity levels were interpreted according to the Common Vulnerability Scoring System (CVSS) v3.1 standard [12].

6.2. Methods

6.2.1. Dataset Construction

A synthetic dataset consisting of PHP code snippets representing typical WordPress plugin behaviors was generated. Each sample represents a realistic code pattern commonly found in WordPress plugins. The dataset was balanced across four classes:

Class 0 (Safe): Code utilizing proper sanitization—$wpdb->prepare(), esc_html(), intval(), absint(), sanitize_text_field().
Class 1 (SQL Injection): Direct insertion of $_GET/$_POST variables into $wpdb->query(), $wpdb->get_results(), $wpdb->get_var() without prepare().
Class 2 (XSS): Direct echo/print of user-controlled variables without esc_html(), esc_attr(), or htmlspecialchars().
Class 3 (RCE): User input passed to eval(), exec(), system(), shell_exec(), passthru().

The initial 50 samples per class were augmented using three code-structure-preserving transformations: (1) variable renaming, (2) code block reformatting (brace style, whitespace), and (3) insertion of benign intermediate assignments. This expanded each class to 150 samples for a total dataset size of 600 samples. To assess generalization, a stratified 5-fold cross-validation procedure was applied, ensuring no data leakage between folds.
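The three transformations can be illustrated with the short Python sketch below. The exact rules used to build the dataset are not given in the text, so the function names, the list of reserved identifiers, and the form of the inserted assignment are all hypothetical.

```python
import re

# Identifiers that must keep their spelling; this keep-list is an assumption.
RESERVED = {"$_GET", "$_POST", "$_REQUEST", "$_COOKIE", "$wpdb", "$this"}


def rename_variables(code, suffix):
    """(1) Variable renaming: append a suffix to every non-reserved variable."""
    def repl(match):
        name = match.group(0)
        return name if name in RESERVED else f"{name}_{suffix}"
    return re.sub(r"\$[A-Za-z_]\w*", repl, code)


def reformat_statements(code):
    """(2) Reformatting: place each statement on its own line.
    Simplification: a ';' inside a string literal would also be split."""
    return re.sub(r";\s*", ";\n", code).strip()


def insert_benign_assignment(code, name="$unused_flag"):
    """(3) Benign assignment: prepend a statement that touches no source or sink."""
    return f"{name} = 0;\n{code}"


seed = '$id = $_GET["id"]; $wpdb->query("SELECT * FROM t WHERE id=" . $id);'
for variant in (rename_variables(seed, "v1"),
                reformat_statements(seed),
                insert_benign_assignment(seed)):
    print(variant, end="\n\n")
```

None of the transformations changes which source reaches which sink, so each variant keeps the class label of its seed sample.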

6.2.2. Feature Extraction

The Semantic Tokenizer extracted security-relevant token sequences from each sample. Tokens were categorized into Tainted Sources (8 patterns), SQL Injection Sinks (6 patterns), XSS Sinks (4 patterns), RCE Sinks (8 patterns), Sanitizers (21 patterns), Variables, Assignments, and Concatenation operators. Token sequences were padded or truncated to a fixed length of 256.
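Fixing the sequence length to 256 can be done with a helper like the one below. Padding on the right with ID 0 and truncating from the end are assumptions; the text states only the target length.

```python
MAX_LEN = 256
PAD_ID = 0  # assumed padding ID


def pad_or_truncate(token_ids, max_len=MAX_LEN, pad_id=PAD_ID):
    """Return a list of exactly max_len IDs: cut long inputs, right-pad short ones."""
    clipped = token_ids[:max_len]
    return clipped + [pad_id] * (max_len - len(clipped))


short = pad_or_truncate([3, 1, 4])
long = pad_or_truncate(list(range(1, 301)))
print(len(short), len(long))
```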

6.2.3. Model Training

The Bi-LSTM classifier was implemented using PyTorch v2.2 (Meta Platforms, Menlo Park, CA, USA) and trained in Python v3.11 (Python Software Foundation, Wilmington, DE, USA).

The training configuration was as follows:

Data split: 80% training/20% validation (per fold in 5-fold cross-validation);
Optimizer: Adam (learning_rate = 0.001);
Loss function: CrossEntropyLoss;
Batch size: 16;
Epochs: 50;
Dropout: 0.3.
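The 80%/20% split per fold follows from stratified 5-fold assignment, which the following dependency-free Python sketch illustrates. The round-robin assignment and the fixed seed are assumptions; the text does not say which splitting routine was used.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Assign sample indices to k folds so that each fold keeps the class balance."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)  # deal indices out round-robin
    return folds


# 4 classes x 150 samples, matching the dataset size reported in the text.
labels = [cls for cls in range(4) for _ in range(150)]
folds = stratified_folds(labels)
for i, validation in enumerate(folds):
    print(f"fold {i}: {len(validation)} validation / {len(labels) - len(validation)} training")
```

Each fold serves once as the validation set (120 samples, 30 per class) while the remaining 480 samples are used for training. This sketch splits by sample index only; keeping all augmented variants of one seed sample in the same fold would require an additional group key that is not modeled here.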

6.2.4. Taint Analysis Evaluation

In parallel, the static Taint Analyzer was evaluated on the same dataset. For each sample, the analyzer performed two-pass analysis: forward taint propagation through variable assignments, followed by sink reachability checking with sanitizer detection.

6.3. Results

6.3.1. Bi-LSTM Classification Results

The trained Bi-LSTM model was evaluated on the augmented 600-sample dataset using stratified 5-fold cross-validation. The reported metrics represent mean values across all folds. The results are summarized in Table 4.

The overall accuracy of the model reached 93% (mean across folds; std = ±1.2%). The confusion matrix (Figure 5) reveals that the model performs best on SQL Injection and RCE detection, with minor confusion between Safe and XSS classes—an expected result, as the boundary between sanitized and unsanitized echo statements can be subtle in token sequences.
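For reference, accuracy and macro-averaged F1 can be derived from a confusion matrix as sketched below. The 2 × 2 matrix is a toy example with invented counts, used only to exercise the functions; it does not reproduce Figure 5 or any result from the paper.

```python
def accuracy(cm):
    """Share of samples on the diagonal; cm[i][j] counts true class i predicted as j."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def macro_f1(cm):
    """Unweighted mean of per-class F1, using F1 = 2TP / (2TP + FP + FN)."""
    n = len(cm)
    scores = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp
        fn = sum(cm[c]) - tp
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n


toy_cm = [[8, 2],   # toy counts, not data from the paper
          [1, 9]]
print(round(accuracy(toy_cm), 3), round(macro_f1(toy_cm), 3))
```

The same functions apply unchanged to a 4 × 4 matrix for the Safe, SQL Injection, XSS, and RCE classes.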

6.3.2. Taint Analysis Results

The static Taint Analyzer achieved the following results on the evaluation set:

True-Positive Rate (vulnerable flows correctly identified): 96%;
False-Positive Rate: 8%;
Successfully detected all direct source-to-sink flows without sanitization;
Correctly identified sanitized flows as SAFE (e.g., intval() wrapping before $wpdb->prepare()).
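The two rates reported above follow the usual definitions, TPR = TP / (TP + FN) and FPR = FP / (FP + TN). A small Python helper makes the computation explicit; the verdict lists in the example are toy values, not the evaluation data.

```python
def tpr_fpr(is_vulnerable, flagged):
    """Compute (TPR, FPR) from ground-truth and analyzer verdicts (lists of bool)."""
    pairs = list(zip(is_vulnerable, flagged))
    tp = sum(1 for truth, pred in pairs if truth and pred)
    fn = sum(1 for truth, pred in pairs if truth and not pred)
    fp = sum(1 for truth, pred in pairs if not truth and pred)
    tn = sum(1 for truth, pred in pairs if not truth and not pred)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return tpr, fpr


# Toy verdicts for six flows (illustrative only).
truth = [True, True, True, True, False, False]
flagged = [True, True, True, False, True, False]
print(tpr_fpr(truth, flagged))
```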

6.3.3. Combined System Performance and Baseline Comparison

When both analysis methods are combined, the system provides complementary coverage. Taint Analysis excels at precise flow-level detection with line-number accuracy, while the Bi-LSTM classifier provides holistic file-level classification even when taint flows are obfuscated through complex variable assignments.

Table 5 summarizes performance within SentinelCMS. Table 6 presents a comparison against four established baseline methods, demonstrating that SentinelCMS achieves the highest F1-macro (0.94) and accuracy (94%), outperforming all individual baselines.

6.3.4. Qualitative Case Studies

To provide a concrete illustration of system behavior, three representative cases from the evaluation set are analyzed below.

Case A (True Positive, SQLi): The snippet $wpdb->query("SELECT * FROM users WHERE id=" . $_GET["id"]) was correctly flagged by both the Taint Analyzer (precise line reference: “tainted variable $id reaches SINK_WPDB_QUERY without sanitization”) and the Bi-LSTM classifier (predicted class: SQL Injection, confidence: 0.97). This represents the most straightforward case, where a direct source-to-sink flow is present.

Case B (True Positive, XSS): A plugin function containing echo "<div>" . $_POST["comment"] . "</div>" was correctly identified by both components. The Taint Analyzer reported the exact line; the Bi-LSTM assigned class XSS with confidence 0.93. The case demonstrates accurate detection even when the tainted variable is embedded within string concatenation.

Case C (False Positive for the Taint Analyzer, True Negative for the Bi-LSTM): A safe function using an intermediate variable pattern, $data = sanitize_text_field($_POST["input"]); echo esc_html($data), caused the Taint Analyzer to raise a false alarm because it could not track the sanitizer across the assignment. The Bi-LSTM correctly classified this as Safe (confidence: 0.89), demonstrating the complementary value of the dual-analysis approach: the neural network captures holistic code semantics that elude the deterministic rule-based analyzer.

6.4. Discussion

The results strongly support the initial hypothesis. The high recall scores across all vulnerability classes are significant, as missing an actual vulnerability (False Negative) is highly dangerous in a security context. Several key observations emerge:

The Bi-LSTM model successfully learned to distinguish between the four classes (Safe, SQL Injection, XSS, and RCE) based solely on semantic token sequences, without relying on CVE signatures. This validates the proactive detection approach.

The Taint Analyzer achieved a higher recall (96%) than the neural network for direct source-to-sink flows, demonstrating that deterministic analysis remains valuable for well-defined vulnerability patterns.

The combination of both methods provides the strongest overall performance. Taint Analysis catches explicit unsafe data flows with precise line-number reporting, while the LSTM captures subtler patterns that may evade rule-based analysis.

Obfuscation Resistance: The regex-based tokenizer operates at the lexical surface level and can be evaded by techniques such as variable variables ($$var), string concatenation to build function names (e.g., $f = "sys"."tem"; $f($_GET["cmd"])), or eval()-based dynamic code generation. These evasion techniques represent a fundamental limitation of the current lexical approach and motivate the AST-based extension as the primary direction for future work [24,25,26].
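The evasion by string concatenation can be reproduced with a few lines of Python. The sink pattern below is an assumed, simplified stand-in for the RCE sink patterns of a lexical tokenizer, not the pattern set used by SentinelCMS.

```python
import re

# Assumed lexical pattern for RCE sinks: a known function name followed by "(".
RCE_SINK = re.compile(r"\b(eval|exec|system|shell_exec|passthru)\s*\(")

direct = 'system($_GET["cmd"]);'
obfuscated = '$f = "sys"."tem"; $f($_GET["cmd"]);'

print(bool(RCE_SINK.search(direct)))      # the literal call is visible to the regex
print(bool(RCE_SINK.search(obfuscated)))  # the name is only assembled at run time
```

Because the function name never appears as one contiguous token in the obfuscated snippet, a surface-level pattern has nothing to match, which is why resolving such cases requires analysis of the values flowing into the call.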

Interprocedural Limitations: The current intraprocedural taint analysis operates within a single function scope. Vulnerabilities that propagate across multiple function calls (for example, when $_GET input is passed as a parameter to a helper function that subsequently passes it to a database query) are outside the current detection scope. This limitation is addressed in the future work outlined in Section 7.

7. Conclusions

The security of Content Management Systems is a critical concern for the global digital infrastructure. The analysis presented in this paper reveals that the current reliance on reactive security measures is insufficient to address modern threats, particularly zero-day vulnerabilities. This study proposed, implemented, and experimentally validated SentinelCMS—a proactive detection system combining static Taint Analysis with a Bidirectional LSTM neural network classifier.

The system was evaluated on an augmented 600-sample dataset using 5-fold cross-validation, demonstrating 93% mean accuracy across folds and 96% recall in taint flow detection. A baseline comparison against four alternative methods confirmed that the combined dual-analysis approach outperforms each individual method.

The key contribution of this work is the demonstration that a dual-analysis approach—combining deterministic taint tracking with neural network classification—provides more robust vulnerability detection than either method alone. The system operates entirely on static source code analysis, requiring no runtime environment, making it suitable for pre-deployment plugin scanning.

Future research will focus on four key directions: (1) Dataset Expansion: training on real-world WordPress plugin vulnerabilities sourced from the WPScan Vulnerability Database and NVD CVE records; (2) AST Integration: replacing regex-based tokenization with full Abstract Syntax Tree parsing to improve resilience against code obfuscation and enable precise data flow modeling [20]; (3) Interprocedural Extension: constructing a call graph and performing context-sensitive taint propagation across function boundaries to detect complex multi-step injection chains; and (4) Transformer Models: investigating fine-tuned code language models such as CodeBERT [9,27] and VulLLM for deeper semantic understanding of vulnerability patterns, potentially surpassing the current Bi-LSTM baseline on multi-class classification [15,28].

Author Contributions

Conceptualization, Z.T. and A.A.; methodology, Z.A.; software, A.A. and S.S.; validation, S.A. and A.A.; formal analysis, A.B.; investigation, Z.T. and A.A.; resources, A.A.; data curation, A.K.; writing—original draft preparation, S.S.; writing—review and editing, Z.A. and A.A.; visualization, A.B.; supervision, A.A.; project administration, A.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data supporting the reported results are contained within the article.

Acknowledgments

The authors express gratitude to the Department of Information Security at L.N. Gumilyov Eurasian National University for the academic environment and resources that supported this research.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Stock, B.; Lauinger, T.; Holz, T. Vulnerabilities in Outdated Content Management Systems; Linköping Univ. Electronic Press: Linköping, Sweden, 2023.
Kavithamani, C.; Subramanian, R.S.S.; Krishnamurthy, S.; Chathu, J.; Iyer, G. An analysis of remotely triggered malware exploits in content management system-based web applications. In Intelligence in Big Data Technologies—Beyond the Hype; Springer: Singapore, 2021; pp. 153–165.
Danenova, G.T.; Manat, A.E.; Akhmetzhanov, T.B.; Kokkoz, M.M. Analysis of web application vulnerabilities based on WordPress using WPScan. Vestn. KazUTB 2025, 26.
Prince, N.U.; Faheem, M.A.; Khan, O.U.; Hossain, K.; Alkhayyat, A.; Hamdache, A.; Elmouki, I. AI-Powered Data-Driven Cybersecurity Techniques: Boosting Threat Identification and Reaction. Nanotechnol. Percept. 2024, 20, 332–353.
Ekstam Ljusegren, H. Vulnerabilities in Outdated Content Management Systems: An Analysis of the Largest WordPress Websites. Master’s Thesis, Linköping University, Linköping, Sweden, 2023.
Akintola, S.; James, A. Privacy and Security in Open-Source CMS Platforms: Evaluating Risks and Implementing Best Practices. J. Cybersecur. Priv. 2025, 4, 12–25.
Funk, K. Is Drupal Secure? A Guide to Drupal Security, Acquia, Jan. 2024. Available online: https://www.acquia.com/blog/drupal-security (accessed on 21 April 2026).
Abdulhamid, S.M.; Alotaibi, J.; Alshamari, N.; Musa, T.A. Web content management systems (WCMS) vulnerabilities detection approaches: A comparative analysis. In Innovation and Technological Advances for Sustainability; Taylor & Francis: Oxfordshire, UK, 2024; pp. 379–389.
Lee, T.; Wi, S.; Lee, S.; Son, S. FUSE: Finding File Upload Bugs via Penetration Testing. In Proceedings of the Network and Distributed System Security Symposium (NDSS), San Diego, CA, USA, 23–26 February 2020.
Schiaffino, A.; Reina, M.; Aragon, B.A.; Solinas, A.; Epifania, F. Detecting Zero-Day Vulnerabilities in CMS Platforms: An In-depth Analysis Using DeepLog. Future Internet 2023, 15, 329.
Hu, Z.; Beuran, R.; Tan, Y. Automated Penetration Testing Using Deep Reinforcement Learning. In Proceedings of the IEEE European Symposium on Security and Privacy Workshops (EuroS&PW), Genoa, Italy, 7–11 September 2020; pp. 2–10.
Paramesha, M.; Rane, N.; Rane, J. Artificial Intelligence, Machine Learning, and Deep Learning for Cybersecurity Solutions: A Review. Zenodo. 2024. Available online: https://www.researchgate.net/publication/383034412_Artificial_Intelligence_Machine_Learning_and_Deep_Learning_for_Cybersecurity_Solutions_A_Review_of_Emerging_Technologies_and_Applications (accessed on 10 April 2026).
Niu, X.; Mirza, M.; Tian, Y. VulLLM: Exploiting Large Language Models for Vulnerability Detection. IEEE Trans. Softw. Eng. 2025, 51, 1–18.
Ahmad, W.; Chakraborty, S.; Ray, B.; Chang, K. CodeBERT-SecEval: A Pre-trained Model for Security-Aware Code Analysis. In Proceedings of the 33rd USENIX Security Symposium, Philadelphia, PA, USA, 14–16 August 2024.
Sun, L.; You, W.; Chen, P. Large Language Models for Static Application Security Testing: A Systematic Study. In Proceedings of the ACM Conference on Computer and Communications Security (CCS), Salt Lake City, UT, USA, 14–18 October 2025.
Jovanovic, N.; Kruegel, C.; Kirda, E. Pixy: A Static Analysis Tool for Detecting Web Application Vulnerabilities. In Proceedings of the IEEE Symposium on Security and Privacy (S&P), Berkeley, CA, USA, 21–24 May 2006; pp. 258–263.
Patchstack. State of WordPress Security in 2024: Annual Report; Patchstack: Pärnu, Estonia, 2024.
Steinhauser, A.; Tuma, P. Database Traffic Interception for Graybox Detection of Stored and Context-Sensitive XSS. Digit. Threat. Res. Pract. 2020, 1, 1–23.
Jahanshahi, R.; Doupe, A.; Egele, M. You shall not pass: Mitigating SQL injection attacks on legacy web applications. In Proceedings of the 15th ACM Asia Conference on Computer and Communications Security, Taipei, Taiwan, 5–9 October 2020; pp. 445–457.
Yamaguchi, F.; Golde, N.; Arp, D.; Rieck, K. Modeling and Discovering Vulnerabilities with Code Property Graphs. In Proceedings of the 2014 IEEE Symposium on Security and Privacy, San Jose, CA, USA, 18–21 May 2014; pp. 590–604.
W3Techs. Usage Statistics of Content Management Systems, W3Techs Web Technology Surveys, Apr. 2024. Available online: https://w3techs.com/technologies/overview/content_management (accessed on 10 April 2026).
OWASP. OWASP Top Ten 2021, Open Web Application Security Project, 2021. Available online: https://owasp.org/Top10/ (accessed on 10 April 2026).
European Parliament and Council of the European Union. Regulation (EU) 2016/679 of the European Parliament and of the Council (GDPR). Off. J. Eur. Union 2016, L119, 1–88.
PCI Security Standards Council. PCI DSS v4.0: Payment Card Industry Data Security Standard; PCI SSC: Wakefield, MA, USA, 2022.
European Parliament and Council of the European Union. Directive (EU) 2022/2555 (NIS2 Directive). Off. J. Eur. Union 2022, L333, 80–152.
Li, Z.; Zou, D.; Xu, S.; Ou, X.; Jin, H.; Wang, S.; Deng, Z.; Zhong, Y. VulDeePecker: A Deep Learning-Based System for Vulnerability Detection. In Proceedings of the Network and Distributed System Security Symposium (NDSS), San Diego, CA, USA, 18–21 February 2018.
National Vulnerability Database (NVD). WordPress Plugin CVE Records 2019–2024, NIST, 2024. Available online: https://nvd.nist.gov/ (accessed on 10 April 2026).
Hevner, A.R.; March, S.T.; Park, J.; Ram, S. Design Science in Information Systems Research. MIS Q. 2004, 28, 75–105.

Figure 1. Distribution of vulnerability types in web applications.

Figure 2. Typical vulnerability lifecycle with a reactive approach.

Figure 3. Architecture of the proposed SentinelCMS scanning system.

Figure 4. Architecture of the Bi-LSTM classifier with the attention mechanism.

Figure 5. Confusion matrix of the Bi-LSTM classifier and comparison of reactive vs. proactive approaches.

Table 1. Classification of common vulnerabilities in CMS platforms.

Vulnerability Type	Description	Potential Impact
Cross-Site Scripting (XSS)	Injection of malicious client-side scripts (usually JavaScript) into web pages viewed by other users.	Session hijacking, keylogging, phishing, website content defacement.
SQL Injection (SQLi)	Injection of arbitrary SQL code into database queries, allowing attackers to manipulate the database.	Data theft, altering or deleting data, bypassing authentication, full server compromise.
Cross-Site Request Forgery (CSRF)	Forcing an authenticated web application user to execute unwanted actions without their knowledge.	Unauthorized data modification, execution of transactions on behalf of the victim.
Insecure Direct Object References (IDOR)	Missing or insufficient access control checks, allowing a user to reach objects directly through their identifiers.	Unauthorized access to confidential information and user data.
Unrestricted File Upload	Allows attackers to upload executable code (e.g., PHP shells) disguised as innocuous images or documents.	Complete server takeover, Remote Code Execution (RCE).

Table 2. Risk severity classification of targeted vulnerability categories per CVSS v3.1 [FIRST, 2023].

Vulnerability	CVSS v3.1 Score	Severity Level	Primary Business Impact
SQL Injection (SQLi)	9.8	Critical	Complete database compromise, authentication bypass
Remote Code Execution (RCE)	10.0	Critical	Full server takeover, persistent backdoor installation
Stored XSS	7.4	High	Session hijacking, credential theft from all visitors
Reflected XSS	6.1	Medium	Client-side script injection, phishing facilitation
CSRF	6.5	Medium	Unauthorized state-changing transactions on behalf of user
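
The severity levels in Table 2 follow the qualitative severity rating scale of the CVSS v3.1 specification (None: 0.0; Low: 0.1–3.9; Medium: 4.0–6.9; High: 7.0–8.9; Critical: 9.0–10.0). The mapping can be illustrated with a minimal Python sketch. The function name `cvss_v31_severity` and the dictionary `TABLE_2_SCORES` are hypothetical names introduced only for this illustration and are not part of SentinelCMS.

```python
def cvss_v31_severity(score: float) -> str:
    """Map a CVSS v3.1 base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS base score out of range: {score}")
    if score == 0.0:
        return "None"
    if score <= 3.9:
        return "Low"
    if score <= 6.9:
        return "Medium"
    if score <= 8.9:
        return "High"
    return "Critical"


# Scores copied from Table 2 (hypothetical container, for illustration only).
TABLE_2_SCORES = {
    "SQL Injection (SQLi)": 9.8,
    "Remote Code Execution (RCE)": 10.0,
    "Stored XSS": 7.4,
    "Reflected XSS": 6.1,
    "CSRF": 6.5,
}

if __name__ == "__main__":
    for name, score in TABLE_2_SCORES.items():
        print(f"{name}: {score} -> {cvss_v31_severity(score)}")
```

Applied to the five scores above, the sketch yields the same severity levels as those listed in the table.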

Table 3. Comparative analysis of vulnerability detection methods.

Method	Advantages	Disadvantages
Vulnerability Scanners	Fast, convenient, automated processes.	Cannot identify zero-day vulnerabilities; depend heavily on how current the signature database is; inconsistent quality.
Static Analysis (SAST)	Detects weaknesses early in development; can cover the entire codebase.	High false positive rate; struggles with complex logic; results require expert review.
Dynamic Analysis (DAST)	Low false positive rate; can replicate real-world attacks; language-independent.	Does not analyze the entire codebase; slow; needs a running environment.
Security Plugins (WAF)	Comprehensive security approach; easy-to-use administrator interface.	May lower website performance considerably; the security plugins themselves can be vulnerable.

Table 4. Classification report of the Bi-LSTM model (mean metrics over 5-fold cross-validation, n = 600).

Class	Precision	Recall	F1-Score	Support (Samples)
Safe (0)	0.96	0.92	0.94	120
SQL Injection (1)	0.93	0.96	0.94	120
XSS (2)	0.90	0.92	0.91	120
RCE (3)	0.94	0.94	0.94	120
Macro Average	0.93	0.935	0.933	480
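
The Macro Average row of Table 4 is the unweighted mean of the per-class values. As a sanity check, the computation can be sketched in Python as follows. The names `REPORT` and `macro_average` are hypothetical and introduced only for this illustration; the inputs are the rounded per-class values printed in the table, so small rounding differences from the reported averages are to be expected.

```python
from statistics import mean

# Per-class (precision, recall, F1) values copied from Table 4.
REPORT = {
    "Safe": (0.96, 0.92, 0.94),
    "SQL Injection": (0.93, 0.96, 0.94),
    "XSS": (0.90, 0.92, 0.91),
    "RCE": (0.94, 0.94, 0.94),
}


def macro_average(report):
    """Return the unweighted mean of precision, recall and F1 over all classes."""
    precisions, recalls, f1_scores = zip(*report.values())
    return mean(precisions), mean(recalls), mean(f1_scores)


if __name__ == "__main__":
    precision, recall, f1 = macro_average(REPORT)
    print(f"macro precision = {precision:.4f}")
    print(f"macro recall    = {recall:.4f}")
    print(f"macro F1        = {f1:.4f}")
```

From the rounded table entries, the means are approximately 0.9325 (precision), 0.935 (recall), and 0.9325 (F1), which agrees with the reported 0.93, 0.935, and 0.933 up to rounding.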

Table 5. Comparison of analysis methods within SentinelCMS.

Method	Precision	Recall	Granularity	Zero-Day Capability
Taint Analysis	0.92	0.96	Line-level	Pattern-based
Bi-LSTM	0.93	0.935	File-level	Learned patterns
Combined	0.94	0.95	Both	Enhanced

Table 6. Baseline comparison of vulnerability detection approaches (augmented dataset, n = 600).

Method	Precision	Recall	F1-Macro	Accuracy
Random Forest + TF-IDF	0.79	0.76	0.77	78%
SVM (linear kernel)	0.81	0.79	0.80	80%
Unidirectional LSTM	0.87	0.86	0.86	87%
Taint Analysis only	0.92	0.96	0.94	93%
SentinelCMS (Combined)	0.94	0.95	0.94	94%
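
As a rough consistency check on Table 6, the F1 value of each row can be compared with the harmonic mean of the reported precision and recall, F1 = 2PR / (P + R). The Python sketch below does this; `TABLE_6` and `f1_from_pr` are hypothetical names used only for this illustration. Note that a macro-averaged F1 is normally the mean of the per-class F1 scores and need not equal the harmonic mean of macro precision and macro recall, so this check is only an approximation under that simplifying assumption.

```python
# (precision, recall, reported F1-Macro) values copied from Table 6.
TABLE_6 = {
    "Random Forest + TF-IDF": (0.79, 0.76, 0.77),
    "SVM (linear kernel)": (0.81, 0.79, 0.80),
    "Unidirectional LSTM": (0.87, 0.86, 0.86),
    "Taint Analysis only": (0.92, 0.96, 0.94),
    "SentinelCMS (Combined)": (0.94, 0.95, 0.94),
}


def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    for method, (p, r, reported) in TABLE_6.items():
        estimate = f1_from_pr(p, r)
        print(f"{method}: harmonic mean = {estimate:.4f}, reported = {reported:.2f}")
```

For all five rows, the harmonic mean rounded to two decimals matches the reported F1-Macro value.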

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

