DLogParser: An Efficient Dynamic Log Parser with Multiple Grouping Criteria
Abstract
1. Introduction
2. Observation
3. Methodology
3.1. Step 1 Sampling Parsing
3.2. Step 2 Policy Generation
Algorithm 1. Log Policy Generation Algorithm. Source: author’s contribution.
Require: candidate criteria C
Ensure: policy p
3.3. Step 3 Full Parsing
4. Evaluation
4.1. Accuracy
4.2. Robustness
4.3. Efficiency
5. Related Work
6. Discussion
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest





| Log | Length | First Token | Total Templates | Accuracy |
|---|---|---|---|---|
| HDFS | 10 | 10 | 14 | 100% |
| Hadoop | 26 | 96 | 261 | 92.3% |
| ZooKeeper | 30 | 17 | 77 | 99.5% |
| Apache | 18 | 12 | 30 | 100% |
| Linux | 18 | 199 | 441 | 68.6% |
| OpenSSH | 12 | 17 | 26 | 73.5% |
| Mac | 4 | 1221 | 1909 | 76.5% |

| Log | 5% | 10% | 20% | 30% | 50% |
|---|---|---|---|---|---|
| HDFS | 85.7% | 85.7% | 92.8% | 92.8% | 92.8% |
| Hadoop | 52.4% | 69.7% | 75.4% | 79.6% | 92.7% |
| Spark | 71.3% | 72.7% | 75.7% | 83.1% | 97.8% |
| Mac | 8.4% | 15.6% | 28.8% | 41.8% | 55.9% |
| OpenSSH | 80.7% | 84.6% | 84.6% | 84.6% | 84.6% |

| Dataset | AEL | IPLoM | LogCluster | LogMine | LogSig | LFA | Spell | Drain | nDrain+ | LILAC | AdaParser | DLogParser |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HDFS | 0.999 | 0.991 | 0.463 | 0.748 | 0.508 | 0.780 | 0.991 | 1.000 | 0.999 | 1.000 | 1.000 | 0.999 |
| Hadoop | 0.842 | 0.919 | 0.512 | 0.848 | 0.285 | 0.673 | 0.455 | 0.923 | 0.927 | 0.875 | 0.990 | 0.955 |
| Spark | 0.548 | 0.056 | 0.019 | 0.009 | 0.105 | 0.049 | 0.541 | 0.921 | 0.921 | 0.992 | 0.996 | 0.996 |
| ZooKeeper | 0.991 | 0.995 | 0.726 | 0.679 | 0.783 | 0.844 | 0.990 | 0.995 | 0.989 | 1.000 | 1.000 | 0.995 |
| BGL | 0.997 | 0.997 | 0.983 | 0.880 | 0.232 | 0.991 | 0.974 | 0.999 | 0.995 | 0.998 | 0.999 | 0.998 |
| HPC | 0.404 | 0.391 | 0.060 | 0.047 | 0.382 | 0.160 | 0.211 | 0.958 | 0.957 | 1.000 | 1.000 | 0.957 |
| Thunderbird | 0.860 | 0.739 | 0.544 | 0.846 | 0.756 | 0.682 | 0.583 | 0.922 | 0.906 | 0.910 | 0.953 | 0.917 |
| Linux | 0.916 | 0.808 | 0.595 | 0.736 | 0.107 | 0.224 | 0.622 | 0.686 | 0.805 | 0.652 | 0.801 | 0.826 |
| HealthApp | 0.731 | 0.974 | 0.738 | 0.545 | 0.092 | 0.753 | 0.657 | 0.861 | 0.851 | 1.000 | 0.990 | 0.860 |
| Apache | 1.000 | 0.992 | 0.536 | 1.000 | 0.731 | 0.802 | 1.000 | 1.000 | 1.000 | 0.996 | 0.999 | 1.000 |
| Proxifier | 0.973 | 0.800 | 0.661 | 0.503 | 0.494 | 0.351 | 0.521 | 0.692 | 0.773 | 0.521 | 0.946 | 0.692 |
| OpenSSH | 0.439 | 0.392 | 0.345 | 0.341 | 0.441 | 0.328 | 0.444 | 0.735 | 0.711 | 0.732 | 0.999 | 0.952 |
| OpenStack | 0.746 | 0.341 | 0.698 | 0.745 | 0.839 | 0.200 | 0.765 | 0.734 | 0.812 | 0.491 | 1.000 | 0.734 |
| Mac | 0.794 | 0.627 | 0.462 | 0.858 | 0.518 | 0.566 | 0.758 | 0.765 | 0.805 | 0.777 | 0.891 | 0.760 |
| Average | 0.802 | 0.715 | 0.524 | 0.628 | 0.448 | 0.528 | 0.679 | 0.870 | 0.889 | 0.853 | 0.969 | 0.903 |
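
The Average row of the comparison table can be sanity-checked from the per-dataset rows. The sketch below copies the Drain and DLogParser columns as printed (rounded to three decimals) and recomputes their arithmetic means over the 14 datasets. The names `scores`, `reported_average`, and `recomputed_averages` are illustrative, not from the paper. Because the inputs are already rounded, the recomputed means may differ from the reported ones in the last digit.

```python
from statistics import fmean

# Per-dataset scores copied from the comparison table (rounded to 3 decimals),
# in row order: HDFS, Hadoop, Spark, ZooKeeper, BGL, HPC, Thunderbird,
# Linux, HealthApp, Apache, Proxifier, OpenSSH, OpenStack, Mac.
scores = {
    "Drain": [1.000, 0.923, 0.921, 0.995, 0.999, 0.958, 0.922,
              0.686, 0.861, 1.000, 0.692, 0.735, 0.734, 0.765],
    "DLogParser": [0.999, 0.955, 0.996, 0.995, 0.998, 0.957, 0.917,
                   0.826, 0.860, 1.000, 0.692, 0.952, 0.734, 0.760],
}

# Values printed in the table's Average row.
reported_average = {"Drain": 0.870, "DLogParser": 0.903}


def recomputed_averages(table):
    """Return the arithmetic mean of each parser's per-dataset scores."""
    return {name: fmean(values) for name, values in table.items()}


averages = recomputed_averages(scores)
for name, value in averages.items():
    print(f"{name}: recomputed {value:.4f}, reported {reported_average[name]:.3f}")
```

Both recomputed means agree with the reported averages to within 0.001, which is consistent with the Average row being a plain mean over the 14 datasets.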
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Yuan, J.; Wang, C.; Zhou, H.; Zhang, Y.; Wang, Y. DLogParser: An Efficient Dynamic Log Parser with Multiple Grouping Criteria. Appl. Sci. 2026, 16, 811. https://doi.org/10.3390/app16020811

