LogPPO: A Log-Based Anomaly Detector Aided with Proximal Policy Optimization Algorithms
Highlights
- Using Large Language Models (LLMs) together with domain classifiers improves log anomaly detection in data-scarce settings, with F1-Score gains of 5–86% over Transformer-based baselines.
- A PPO-based method aligns LLM outputs with classifier preferences, increasing “confidence” in label predictions.
- Enhancing automation improves operational stability and resilience in smart city cloud platforms, minimizing downtime and human intervention.
- Reducing data requirements lowers the practical barriers for deploying NLP-based anomaly detection, making advanced diagnostic solutions feasible for large-scale urban systems.
Abstract
1. Introduction
- (1) We introduce a decoder-encoder framework specifically designed to utilize both domain-specific expertise and general human knowledge within encoder-based classification models.
- (2) To the best of our knowledge, we are the first to employ the PPO algorithm in log-based anomaly detection tasks to address and align cognitive discrepancies between LLMs and classification models.
- (3) Three Transformer-based models are selected as baselines. Experimental results on four public log datasets show that LogPPO achieves a 5% to 86% improvement in anomaly detection F1-Scores compared to the best-performing baseline.
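The interplay between the two models in contributions (1) and (2) can be sketched as follows: the LLM's analysis is appended to the log template, the classifier scores the combined text, and the classifier's confidence in the correct label is fed back as the PPO reward. This is a minimal sketch under assumptions: the separator, the reward definition (the probability assigned to the true label), and all function names are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence


def build_classifier_input(template: str, analysis: str, sep: str = " [SEP] ") -> str:
    """Concatenate a log template with the LLM-generated analysis (separator is assumed)."""
    return f"{template}{sep}{analysis}"


def softmax(logits: Sequence[float]) -> list[float]:
    """Numerically stable softmax over classifier logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_reward(logits: Sequence[float], true_label: int) -> float:
    """Hypothetical PPO reward: the classifier's probability for the true label."""
    return softmax(logits)[true_label]


if __name__ == "__main__":
    text = build_classifier_input(
        "<*> failed to connect to <*>",          # made-up log template
        "The component cannot reach its peer.",  # made-up LLM analysis
    )
    print(text)
    # Made-up logits for (normal, anomalous); label 1 = anomalous.
    print(confidence_reward([0.3, 1.9], true_label=1))
```

Under this assumed reward, an analysis that makes the classifier more certain of the correct label scores higher, which is one plausible reading of "increasing confidence in label predictions".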
2. Background and Related Work
2.1. Log Data
2.2. Log Anomaly Detection via Deep Learning Architectures
2.2.1. Embedding-Based Methods
2.2.2. Generation-Based Methods
3. Approach
3.1. Classification Model
3.2. Generation Model
4. Experiments
- RQ1: How effective is LogPPO under conditions of data scarcity?
- RQ2: How effective are the LogPPO components?
- RQ3: Why is the decoder-encoder architecture effective for incorporating natural language analyses?
- RQ4: How do hyperparameter settings affect the convergence of LogPPO?
4.1. Experimental Metrics and Datasets
4.2. RQ1: How Effective Is LogPPO Under Conditions of Data Scarcity?
4.3. RQ2: How Effective Are the LogPPO Components?
4.4. RQ3: Why Is the Decoder-Encoder Architecture Effective for Incorporating Natural Language Analyses?
4.5. RQ4: How Do Hyperparameter Settings Affect the Convergence of LogPPO?
5. Discussion of Computational Cost and Deployment
5.1. Computational Cost
5.2. Deployment Challenges and Continuous Adaptation
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest














| Dataset | Method | Precision | Recall | F1-Score |
|---|---|---|---|---|
| BGL | KnowLog (model) | 1.0000 | 0.0286 | 0.0556 |
| BGL | BigLog | 0.2500 | 0.0286 | 0.0513 |
| BGL | LogPrompt | 0.1357 | 1.0000 | 0.2389 |
| BGL | LogPPO (Ours) | 0.4324 | 0.4571 | 0.4444 |
| Spirit | KnowLog (model) | 1.0000 | 0.0333 | 0.0645 |
| Spirit | BigLog | 0.0000 | 0.0000 | 0.0000 |
| Spirit | LogPrompt | 0.1304 | 1.0000 | 0.2308 |
| Spirit | LogPPO (Ours) | 0.6667 | 0.2000 | 0.3077 |
| Thunderbird | KnowLog (model) | 1.0000 | 0.0769 | 0.1429 |
| Thunderbird | BigLog | 0.0000 | 0.0000 | 0.0000 |
| Thunderbird | LogPrompt | 0.1130 | 1.0000 | 0.2031 |
| Thunderbird | LogPPO (Ours) | 1.0000 | 0.1538 | 0.2667 |
| KnowLog (dataset) | KnowLog (model) | 0.4697 | 0.3748 | 0.4169 |
| KnowLog (dataset) | BigLog | 0.2961 | 0.3161 | 0.3058 |
| KnowLog (dataset) | LogPrompt | 0.3733 | 0.7478 | 0.4980 |
| KnowLog (dataset) | LogPPO (Ours) | 0.4739 | 0.5803 | 0.5217 |

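
As a consistency check on the table above, the sketch below recomputes F1 from a reported precision and recall and derives LogPPO's relative F1 gain over the strongest baseline on each dataset. The function names and the dictionary layout are illustrative choices, not part of the paper; the numbers are copied from the table.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def relative_gain(ours: float, baseline: float) -> float:
    """Relative improvement of `ours` over `baseline`, as a fraction."""
    return (ours - baseline) / baseline


# F1-Scores copied from the table: dataset -> (best baseline F1, LogPPO F1).
# On every dataset the best-performing baseline by F1 is LogPrompt.
REPORTED_F1 = {
    "BGL": (0.2389, 0.4444),
    "Spirit": (0.2308, 0.3077),
    "Thunderbird": (0.2031, 0.2667),
    "KnowLog (dataset)": (0.4980, 0.5217),
}

GAINS = {name: relative_gain(ours, base) for name, (base, ours) in REPORTED_F1.items()}

if __name__ == "__main__":
    # Recompute LogPPO's F1 on BGL from its reported precision and recall.
    print(f"BGL LogPPO F1: {f1_score(0.4324, 0.4571):.4f}")
    for name, gain in GAINS.items():
        print(f"{name}: {gain:+.1%} F1 over the best baseline")
```

With these numbers the gains span roughly 5% (KnowLog dataset) to 86% (BGL), which matches the range quoted in the highlights.
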
| Dataset | (1) Log Template Only | (2) Template + Pre-PPO Analysis | (3) Template + Post-PPO Analysis |
|---|---|---|---|
| BGL | 0.0560 | 0.1053 | 0.4444 |
| Spirit | 0.0645 | 0.1818 | 0.2778 |
| Thunderbird | 0.1429 | 0.1429 | 0.2667 |
| KnowLog (dataset) | 0.4169 | 0.4733 | 0.5197 |
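
To read the ablation table, it can help to split each dataset's total gain into the part that appears when the LLM analysis is appended to the template and the part that appears only after PPO alignment. The sketch below does that arithmetic on the values above. It assumes the entries are F1-Scores, which this excerpt does not state explicitly, and the variable names are illustrative.

```python
# Values copied from the ablation table:
# dataset -> (template only, template + pre-PPO analysis, template + post-PPO analysis).
ABLATION = {
    "BGL": (0.0560, 0.1053, 0.4444),
    "Spirit": (0.0645, 0.1818, 0.2778),
    "Thunderbird": (0.1429, 0.1429, 0.2667),
    "KnowLog (dataset)": (0.4169, 0.4733, 0.5197),
}


def decompose(template_only: float, pre_ppo: float, post_ppo: float) -> tuple[float, float]:
    """Return (gain from adding the analysis, further gain from PPO alignment)."""
    return pre_ppo - template_only, post_ppo - pre_ppo


CONTRIBUTIONS = {name: decompose(*scores) for name, scores in ABLATION.items()}

if __name__ == "__main__":
    for name, (from_analysis, from_ppo) in CONTRIBUTIONS.items():
        print(f"{name}: analysis {from_analysis:+.4f}, PPO {from_ppo:+.4f}")
```

On Thunderbird the pre-PPO analysis adds nothing (0.1429 in both columns), so the entire improvement there appears after PPO alignment.
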

| Learning Rate | Clip Ratio 0.1 | Clip Ratio 0.2 | Clip Ratio 0.3 |
|---|---|---|---|
| | 0.2609 | 0.2222 | 0.2857 |
| | 0.3750 | 0.4444 | 0.2759 |
| | 0.3438 | 0.2800 | 0.3143 |
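
For readers less familiar with the clip ratio varied in the table above, the sketch below shows the standard PPO clipped surrogate objective for a single sample. It assumes LogPPO uses this standard form, which this excerpt does not confirm; `clipped_surrogate` and its argument names are illustrative.

```python
def clipped_surrogate(ratio: float, advantage: float, clip_ratio: float = 0.2) -> float:
    """Per-sample PPO objective: min(r * A, clip(r, 1 - eps, 1 + eps) * A).

    ratio      -- probability ratio pi_new(a|s) / pi_old(a|s)
    advantage  -- advantage estimate for the sampled action
    clip_ratio -- eps, the quantity varied across the table's columns
    """
    clipped = max(1.0 - clip_ratio, min(ratio, 1.0 + clip_ratio))
    return min(ratio * advantage, clipped * advantage)


if __name__ == "__main__":
    # Same sample, three clip ratios: the objective is capped at (1 + eps) * A.
    for eps in (0.1, 0.2, 0.3):
        print(eps, clipped_surrogate(ratio=1.5, advantage=1.0, clip_ratio=eps))
```

A smaller clip ratio caps how far a single update can move the policy away from the one that generated the samples; a larger one permits bigger steps at the cost of stability.
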
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Wang, Z.; Dong, J.; Yang, C. LogPPO: A Log-Based Anomaly Detector Aided with Proximal Policy Optimization Algorithms. Smart Cities 2026, 9, 5. https://doi.org/10.3390/smartcities9010005

