Semantic Firewalls with Online Ensemble Learning for Secure Agentic RAG Systems in Financial Chatbots
Abstract
1. Introduction
2. Related Works
3. Materials and Methods
- Specific corrections.
- Detailed explanations.
- Alternative recommendations.
- Measurable confidence.
- Fast detection.
- Automatic correction.
3.1. Financial Semantic Firewall
3.2. Meta-Learner
- Algorithm Diversity: Each model offers a different perspective.
- Adaptive Weights: Automatically adjust based on performance.
- Continuous Learning: Improves with each new query.
- Robustness: Resistant to overfitting and individual biases.
- Explainability: Generates detailed and understandable answers.
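
The adaptive-weight idea in the list above can be sketched as follows. The list does not state how the weights are adjusted, so this example assumes weights proportional to an exponentially decayed running accuracy per model; the class name, `decay`, and `threshold` values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class WeightedVoteMetaLearner:
    """Combine per-model malicious probabilities using performance-based weights."""

    model_names: list
    decay: float = 0.9       # hypothetical: how quickly old performance is forgotten
    threshold: float = 0.5   # hypothetical blocking threshold
    accuracy: dict = field(default_factory=dict)

    def __post_init__(self):
        # Every model starts with the same running accuracy, hence equal weights.
        self.accuracy = {name: 0.5 for name in self.model_names}

    def weights(self):
        # Normalize running accuracies so that the weights sum to 1.
        total = sum(self.accuracy.values())
        return {name: acc / total for name, acc in self.accuracy.items()}

    def malicious_score(self, probs):
        # Weighted average of the per-model malicious probabilities.
        w = self.weights()
        return sum(w[name] * probs[name] for name in self.model_names)

    def decide(self, probs):
        return "BLOCKED" if self.malicious_score(probs) >= self.threshold else "ALLOWED"

    def update(self, probs, label):
        # Reward models whose individual vote matched the true label (1 = malicious).
        for name in self.model_names:
            correct = float((probs[name] >= 0.5) == bool(label))
            self.accuracy[name] = (
                self.decay * self.accuracy[name] + (1 - self.decay) * correct
            )


meta = WeightedVoteMetaLearner(["pa", "sgd", "nb", "perceptron"])
votes = {"pa": 0.9, "sgd": 0.8, "nb": 0.2, "perceptron": 0.7}
score_before = meta.malicious_score(votes)
meta.update(votes, label=1)   # the query turned out to be malicious
score_after = meta.malicious_score(votes)
```

After the update, the model that voted "legitimate" loses weight, so the same votes yield a higher combined score on the next query.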
1. Passive-Aggressive Classifier
   - Specialized in rapid learning.
   - Effective for fraud patterns.
2. SGD Classifier
   - Stochastic optimization.
   - Robust for noisy data.
3. Multinomial Naive Bayes
   - Probabilistic analysis.
   - Excellent for text classification.
4. Perceptron
   - Linear learning.
   - Fast and efficient.
- instance:
- prediction:
- correct label:
- loss:
- update:
  - a. set
  - b. update
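
For reference, one round of the basic Passive-Aggressive algorithm, as formulated by Crammer et al. (2006), can be written as below. The notation ($w_t$, $x_t$, $y_t$, $\ell_t$, $\tau_t$) follows that paper. Whether the system described here uses this basic variant or one of the regularized variants (PA-I, PA-II) is not stated in these steps, so the choice shown is an assumption.

```latex
\begin{align*}
\text{instance:} \quad & x_t \in \mathbb{R}^n \\
\text{prediction:} \quad & \hat{y}_t = \operatorname{sign}(w_t \cdot x_t) \\
\text{correct label:} \quad & y_t \in \{-1, +1\} \\
\text{loss:} \quad & \ell_t = \max\{0,\; 1 - y_t\,(w_t \cdot x_t)\} \\
\text{update:} \quad & \text{(a) set } \tau_t = \frac{\ell_t}{\lVert x_t \rVert^2},
\qquad \text{(b) update } w_{t+1} = w_t + \tau_t\, y_t\, x_t
\end{align*}
```

When the margin $y_t\,(w_t \cdot x_t)$ is at least 1, the loss is zero and the weights stay unchanged (the "passive" case); otherwise the weights move just far enough to reach a margin of 1 on the current example (the "aggressive" case).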
- Scalable: It can handle large volumes of prompts without consuming much memory.
- Continuous: Ideal for continuous learning, as it adapts to new fraud patterns as soon as they appear, without needing to retrain the entire model.
- Unlike the Passive-Aggressive algorithm, which allows a margin of tolerance, the Perceptron is more binary and strict in its classification. This can be useful for establishing a hard safety baseline against more direct prompt injections.
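
One way to read this contrast is through the textbook update rules: the Perceptron changes its weights only when the predicted sign is wrong, while Passive-Aggressive also reacts when the prediction is correct but the margin is below 1. The sketch below uses plain Python lists and labels in {-1, +1}; the function names and the example vectors are hypothetical.

```python
def dot(w, x):
    """Inner product of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))


def perceptron_update(w, x, y, lr=1.0):
    """Update only when the predicted sign is wrong (y is -1 or +1)."""
    if y * dot(w, x) <= 0:
        return [wi + lr * y * xi for wi, xi in zip(w, x)]
    return list(w)


def passive_aggressive_update(w, x, y):
    """Update whenever the margin y * (w . x) is below 1, even if the sign is right."""
    loss = max(0.0, 1.0 - y * dot(w, x))
    if loss == 0.0:
        return list(w)            # passive: margin already satisfied
    tau = loss / dot(x, x)        # step size of the basic PA variant
    return [wi + tau * y * xi for wi, xi in zip(w, x)]


# A correctly classified example with a small margin (0.2).
w0 = [0.2, 0.0]
x = [1.0, 1.0]
y = 1

w_perceptron = perceptron_update(w0, x, y)   # sign is correct, so nothing changes
w_pa = passive_aggressive_update(w0, x, y)   # margin < 1, so the weights move
```

On this example the Perceptron leaves the weights untouched, whereas Passive-Aggressive moves them until the margin on `x` reaches 1.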
3.3. Implementation of the Online Ensemble
- n_features: 2^18 (262,144 features)
- ngram_range: (1, 2), i.e., unigrams and bigrams
- alternate_sign: False, which keeps feature values non-negative for compatibility with Multinomial Naive Bayes
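
The parameter names above match those of scikit-learn's `HashingVectorizer`; treating them as such is an assumption. The pure-Python sketch below only illustrates what the three settings mean (a fixed number of hash buckets, unigram and bigram features, and no sign flipping). It is not scikit-learn's implementation: `zlib.crc32` stands in for the MurmurHash3 function that scikit-learn uses, and the helper names are hypothetical.

```python
import re
import zlib

N_FEATURES = 2 ** 18    # 262,144 hash buckets
NGRAM_RANGE = (1, 2)    # unigrams and bigrams


def ngrams(text, ngram_range=NGRAM_RANGE):
    """Yield word n-grams for every n in the inclusive range."""
    tokens = re.findall(r"\w\w+", text.lower())   # words of two or more characters
    lo, hi = ngram_range
    for n in range(lo, hi + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def hash_features(text, n_features=N_FEATURES):
    """Return a sparse {bucket: count} vector with non-negative counts."""
    vec = {}
    for gram in ngrams(text):
        # Stand-in hash; the bucket index is always in [0, n_features).
        bucket = zlib.crc32(gram.encode("utf-8")) % n_features
        # No alternating sign: counts only grow, so every value stays positive.
        vec[bucket] = vec.get(bucket, 0) + 1
    return vec


features = hash_features("transfer money without reporting")
```

Because the vector size is fixed in advance, no vocabulary has to be stored or refitted when new terms appear, which suits a learner that is updated incrementally.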
3.4. Agentic RAG
3.5. APIs to Automate Processes
3.6. Database
3.6.1. Real Data
3.6.2. Synthetic Data
- Credit: Credit history, loans, credit cards, mortgages.
- Investment: Portfolios, mutual funds, diversification, risk management.
- Regulatory: KYC/AML compliance, regulatory disclosures, CNBV requirements.
- Market: Economic conditions, volatility, market trends.
4. Results
4.1. Results of Tests with Synthetic Data
- F0: No firewall.
- F1: Static binary firewall (single classifier).
- F2: Static ensemble (without online learning).
- F3: Online ensemble (with batch updates).
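
The six metrics used to compare these configurations can all be derived from confusion-matrix counts, with "malicious" as the positive class. The sketch below shows the arithmetic. The counts are hypothetical: they assume a test set of 100 queries with 20 malicious ones, a sample size that is not stated here, and were chosen because they are consistent with the rates reported for F0 and F3 on synthetic data.

```python
def firewall_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, F1, FPR and specificity.

    Positive class = malicious query. Precision is None when nothing is flagged.
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    if precision is None or recall is None or precision + recall == 0:
        f1 = 0.0   # convention: report 0 when F1 is undefined
    else:
        f1 = 2 * precision * recall / (precision + recall)
    fpr = fp / (fp + tn) if (fp + tn) else None
    specificity = None if fpr is None else 1 - fpr
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "fpr": fpr,
        "specificity": specificity,
    }


# Hypothetical counts: 100 queries, 20 of them malicious.
no_firewall = firewall_metrics(tp=0, fp=0, tn=80, fn=20)       # nothing is blocked
online_ensemble = firewall_metrics(tp=18, fp=0, tn=80, fn=2)   # 2 malicious queries missed
```

With no firewall, accuracy equals the share of legitimate queries (0.8) while recall is 0, which is why accuracy alone is a poor indicator on imbalanced data.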
4.2. Results of Tests with Real Data
- G0: Baseline without firewall.
- G1: Firewall with single classifier.
- G2: Firewall with static ensemble (without online learning).
- G3: Online ensemble (with batch updates).
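
The "batch updates" in F3 and G3 can be pictured as buffering labeled queries and updating all base learners once the buffer is full. The sketch below is one possible arrangement, not the authors' implementation: `BatchUpdater`, `CountingLearner`, and the batch size are hypothetical, and the `partial_fit(X, y)` method name mirrors scikit-learn's incremental-learning convention (real scikit-learn classifiers also need the `classes` argument on the first call).

```python
class BatchUpdater:
    """Buffer labeled queries and update every learner once per full batch."""

    def __init__(self, learners, batch_size=32):
        self.learners = learners        # objects exposing partial_fit(X, y)
        self.batch_size = batch_size    # hypothetical value
        self._X, self._y = [], []

    def observe(self, features, label):
        """Store one labeled query; trigger an update when the batch is full."""
        self._X.append(features)
        self._y.append(label)
        if len(self._X) >= self.batch_size:
            self.flush()

    def flush(self):
        """Send the buffered samples to all learners and clear the buffer."""
        if not self._X:
            return
        for learner in self.learners:
            learner.partial_fit(self._X, self._y)
        self._X, self._y = [], []


class CountingLearner:
    """Stand-in learner that only records the size of each update it receives."""

    def __init__(self):
        self.batches = []

    def partial_fit(self, X, y):
        self.batches.append(len(X))


learner = CountingLearner()
updater = BatchUpdater([learner], batch_size=3)
for i in range(7):
    updater.observe([float(i)], i % 2)
sizes_before_flush = list(learner.batches)
updater.flush()   # push the remaining partial batch
```

Batching trades a short delay in adaptation for fewer, more stable weight updates than a strictly per-query scheme.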
5. Discussion
6. Conclusions and Future Work
- Algorithm Diversity: Each model offers a unique perspective.
- Adaptive Weighting: Automatic adjustment based on performance.
- Continuous Learning: Improves with each new query.
- System Robustness: Resistant to overfitting and individual failures.
- Advanced Explainability: Detailed answers from the LLM.
- Continuous learning improves F1-Score and reduces FPR.
- Unified query processing flow.
- Multi-layered validation (input, processing, output).
- Consistent responses by integrating all components.
- Centralized monitoring of metrics and performance.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest









| Stage | Legitimate Query | Malicious Query | Ambiguous Query |
|---|---|---|---|
| Query | “What is my account balance?” | “How can I evade taxes on my investment returns?” | “I want to transfer a large amount without reporting” |
| Pattern Detection | No threats | Tax evasion pattern detected | Transaction hiding pattern |
| Risk Assessment | Low risk | High risk | Medium-High risk |
| Meta-Learner | malicious_score = 0.15 | malicious_score = 0.92 | malicious_score = 0.65 |
| Decision | ALLOWED | BLOCKED | BLOCKED (combined score exceeds threshold) |
| Online Learning | Update with label = 0 (legitimate) | Update with label = 1 (malicious) | Update with label = 1 (malicious) |
| Considerations | Brief Description |
|---|---|
| They reflect common banking inquiries | The questions cover real-world use cases such as balance inquiries, transfers, credit applications, and investment advice. |
| They integrate multiple dimensions | They simulate the real complexity of financial inquiries where a client can ask about credit, investment and regulatory compliance simultaneously. |
| They use appropriate financial terminology | They include correct technical terms (KYC, AML, diversification, volatility, etc.). |
| They maintain contextual coherence | The queries are aligned with the user profile and the provided context. |
| Threat Pattern | Brief Description |
|---|---|
| Tax evasion | Queries about how to evade taxes or hide income. |
| KYC/AML bypass | Attempts to avoid identity checks or anti-money-laundering controls. |
| Money laundering | Inquiries about how to launder illicit money. |
| Fraudulent schemes | Investments with guaranteed returns, pyramid schemes. |
| Concealment of transactions | Attempts to conceal financial activity. |
| Fake accounts | Creating accounts with false documents. |
| Security bypass | Evasion of bank security controls. |
| Regulatory violations | Ignoring regulatory compliance. |
| Configuration | Accuracy | Precision | Recall | F1-Score | FPR | Specificity |
|---|---|---|---|---|---|---|
| F0 | 0.800 | N/A | 0.000 | 0.000 | 0.000 | 1.000 |
| F1 | 0.780 | 0.476 | 1.000 | 0.645 | 0.275 | 0.725 |
| F2 | 0.950 | 0.800 | 1.000 | 0.889 | 0.062 | 0.938 |
| F3 | 0.980 | 1.000 | 0.900 | 0.947 | 0.000 | 1.000 |
| Configuration | Accuracy | Precision | Recall | F1-Score | FPR | Specificity |
|---|---|---|---|---|---|---|
| F0 | 0.500 | N/A | 0.000 | 0.000 | 0.000 | 1.000 |
| F1 | 0.946 | 0.961 | 0.930 | 0.945 | 0.038 | 0.962 |
| F2 | 0.978 | 0.990 | 0.966 | 0.978 | 0.010 | 0.990 |
| F3 | 0.985 | 0.998 | 0.972 | 0.985 | 0.002 | 0.998 |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Castro-Maldonado, V.; Aceves-Fernández, M.A.; García-Noguez, L.R.; Pedraza-Ortega, J.C. Semantic Firewalls with Online Ensemble Learning for Secure Agentic RAG Systems in Financial Chatbots. AI 2026, 7, 80. https://doi.org/10.3390/ai7030080

