Predictive Business Process Monitoring Approach Based on Hierarchical Transformer
Abstract
1. Introduction
- We propose a novel model called HiP-Transformer, which learns accurate business process representations from three perspectives on the associated features. The comprehensive representation is built from the internal associations among activities, events, and subsequences in the business process.
- We address the problem of concept drift in long business processes. By capturing the dependencies between the subsequences preceding and following a drift point, our approach achieves higher prediction accuracy and lower time complexity than traditional segmentation strategies.
- Comprehensive experiments on seven public datasets demonstrate the effectiveness of our HiP-Transformer model. Compared with deep learning baselines, including LSTM [10], GRU [11], CNN [21], DNC [22], and ProcessTransformer [16], HiP-Transformer improves next activity prediction accuracy by 6.32% over the baselines' average accuracy and reduces the mean absolute error of remaining time prediction by 21% relative to the baselines' average error.
2. Literature Review
3. Preliminaries and Methods
3.1. Definitions
3.1.1. Event
3.1.2. Business Process Case and Prefix
3.1.3. Business Process Event Log
3.2. Task Description
3.3. Transformer
3.4. Hierarchical Process Transformer
3.4.1. First Layer: Event-Level Transformer
3.4.2. Second Layer: Subsequence-Level Transformer
Concept Drift Detection
Algorithm 1 Candidate Drift Point Search
Input: business process case encoding E, subsequence length threshold θ, event vector modulus m, confidence threshold δ
Output: candidate drift point set D
1: D ← ∅ // Initialization
2: n ← |E| // Initialization
3: if n > θ then // Concept drift detection only when the sequence length exceeds the threshold
4: c ← ⌊n/2⌋ // Calculate the sequence center position
5: E_L ← E[1..c], E_R ← E[c+1..n] // Split left and right subsequences
6: n_L ← |E_L|, n_R ← |E_R| // Calculate the lengths of the left and right subsequences
7: μ_L ← mean(E_L), μ_R ← mean(E_R) // Calculate the means of the left and right subsequence encoding vectors
8: if Drift(μ_L, μ_R, n_L, n_R, m, δ) then // Judge whether there is concept drift between the left and right subsequences according to Equation (18)
9: D ← D ∪ {c} // Record the center as a candidate drift point
10: // Search candidate drift points in the left subsequence iteratively
11: D ← D ∪ CandidateDriftPointSearch(E_L, θ, m, δ)
12: // Search candidate drift points in the right subsequence iteratively
13: D ← D ∪ CandidateDriftPointSearch(E_R, θ, m, δ)
14: return D
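Because the mathematical symbols in Algorithm 1 were lost in the source, the listing above uses reconstructed names, and Equation (18) is not reproduced. The following Python sketch makes the same recursive binary-split logic concrete, with a Hoeffding-style bound on the difference of subsequence means (in the spirit of [28,29]) standing in for Equation (18); all names and the exact test are illustrative assumptions, not the paper's verbatim method.

```python
import numpy as np

def hoeffding_cut(n_left: int, n_right: int, delta: float) -> float:
    # Hoeffding-style threshold for the difference of two sample means
    # (a stand-in for Equation (18); assumes encodings scaled to [0, 1]).
    harmonic = 1.0 / (1.0 / n_left + 1.0 / n_right)
    return np.sqrt(np.log(2.0 / delta) / (2.0 * harmonic))

def candidate_drift_points(E: np.ndarray, theta: int, delta: float,
                           offset: int = 0) -> set:
    """Recursively split the case encoding E (n_events x dim) at its center
    and record the split position whenever the left/right subsequence means
    differ by more than the confidence bound."""
    n = len(E)
    drift_points: set = set()
    if n <= theta:                          # subsequence too short to test
        return drift_points
    c = n // 2                              # sequence center position
    left, right = E[:c], E[c:]              # left and right subsequences
    mu_left, mu_right = left.mean(axis=0), right.mean(axis=0)
    if np.linalg.norm(mu_left - mu_right) > hoeffding_cut(c, n - c, delta):
        drift_points.add(offset + c)        # candidate drift point found
        # iterate the search on both halves
        drift_points |= candidate_drift_points(left, theta, delta, offset)
        drift_points |= candidate_drift_points(right, theta, delta, offset + c)
    return drift_points
```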
Transformer for Subsequence
3.4.3. Third Layer: Case-Level Transformer
3.4.4. State Prediction Layer
- The encoding vector of the last event is selected from the first-layer Transformer.
- The encoding matrix of the last event subsequence is selected from the second-layer Transformer. Since this matrix grows with the subsequence length, its last column is taken as the input.
- The encoding matrix of the business process case is selected from the third-layer Transformer. Since this matrix grows with the case length, its last columns are taken as the input (a fusion sketch follows this list).
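To illustrate how these three representations might feed the prediction heads, here is a minimal PyTorch sketch; the concatenation-based fusion and the linear heads are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class StatePredictionLayer(nn.Module):
    """Fuses the event-, subsequence-, and case-level vectors selected from
    the three Transformer layers into the two task outputs (illustrative)."""
    def __init__(self, d_model: int, num_activities: int):
        super().__init__()
        fused = 3 * d_model                   # one d_model vector per level
        self.next_activity = nn.Linear(fused, num_activities)
        self.remaining_time = nn.Linear(fused, 1)

    def forward(self, h_event, h_subseq, h_case):
        # Each input has shape (batch, d_model): the last event vector and
        # the last columns of the subsequence- and case-level matrices.
        z = torch.cat([h_event, h_subseq, h_case], dim=-1)
        return self.next_activity(z), self.remaining_time(z).squeeze(-1)
```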
4. Results and Discussion
4.1. Experiment Data
- BPI2012 records the loan application process of a Dutch financial institution from October 2011 to March 2012; the W subset contains the states of work items related to the approval process.
- BPI2013i records the incident management process of Volvo IT's back office system from April 2010 to May 2012.
- BPI2015 records a Dutch building permit application process from April 2014 to September 2014.
- BPI2017 records the loan application process of a Dutch financial institution from January 2016 to December 2016.
- Hospital records the patient billing process of a regional hospital from December 2012 to January 2016.
- Sepsis records the hospital visits of sepsis patients from October 2014 to December 2014 (a sketch for recomputing the log statistics follows this list).
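Per-log statistics such as those in the event-log table later in the article can be recomputed directly from the raw XES files; below is a sketch using pm4py and pandas, with an illustrative file name.

```python
import pm4py

# Load an event log and flatten it to a pandas DataFrame
# (the file name is illustrative).
df = pm4py.convert_to_dataframe(pm4py.read_xes("BPI_Challenge_2012.xes"))

cases = df.groupby("case:concept:name")
lengths = cases.size()
durations_days = (
    cases["time:timestamp"].max() - cases["time:timestamp"].min()
).dt.total_seconds() / 86400

print(f"cases:      {len(lengths)}")
print(f"events:     {len(df)}")
print(f"activities: {df['concept:name'].nunique()}")
print(f"avg/max case length: {lengths.mean():.1f} / {lengths.max()}")
print(f"avg/max duration (days): "
      f"{durations_days.mean():.1f} / {durations_days.max():.1f}")
```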
4.2. Experiment Setting
4.3. Comparison Experiments
4.3.1. Next Activity Prediction
- LSTM [10]: next activity prediction using Long Short-Term Memory (LSTM); a minimal sketch of this kind of baseline follows the list.
- GRU [11]: next activity prediction using Gated Recurrent Unit (GRU).
- CNN [21]: next activity prediction using Convolutional Neural Networks (CNN).
- DNC [22]: a memory unit is added to the LSTM model to form a differentiable neural computer (DNC) for next activity prediction.
- Transformer [16]: the next activity is predicted directly using the traditional Transformer.
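For a concrete reference point, here is a minimal sketch of a recurrent baseline of the kind listed above: an embedded activity prefix passed through an LSTM, with logits over the next activity. Hyperparameters and padding handling are illustrative, not the cited authors' exact settings.

```python
import torch
import torch.nn as nn

class NextActivityLSTM(nn.Module):
    """Minimal LSTM baseline: activity prefix -> next-activity logits."""
    def __init__(self, num_activities: int, d_embed: int = 32,
                 d_hidden: int = 64):
        super().__init__()
        # index 0 reserved for padding
        self.embed = nn.Embedding(num_activities + 1, d_embed, padding_idx=0)
        self.lstm = nn.LSTM(d_embed, d_hidden, batch_first=True)
        self.out = nn.Linear(d_hidden, num_activities)

    def forward(self, prefix: torch.Tensor) -> torch.Tensor:
        # prefix: (batch, seq_len) of activity ids
        h, _ = self.lstm(self.embed(prefix))
        return self.out(h[:, -1])          # logits from the last timestep
```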
4.3.2. Remaining Time Prediction
- LSTM [10]: remaining execution time prediction using LSTM.
- Accurate-LSTM [23]: models activity sequences, role sequences, and relative time sequences with separate LSTMs to improve prediction accuracy.
- Transformer [16]: remaining execution time prediction implemented directly with the traditional Transformer (the error metric used in the comparison is sketched after this list).
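The remaining-time comparison below is reported as mean absolute error (MAE) between predicted and actual remaining times; for reference, a minimal implementation (the units follow whatever scale the timestamps use):

```python
import numpy as np

def mae(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean absolute error of remaining-time predictions."""
    return float(np.mean(np.abs(y_true - y_pred)))

# e.g. mae(np.array([5.0, 2.0]), np.array([4.0, 2.5])) -> 0.75
```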
4.4. Experiment Analysis
4.4.1. Ablation Experiment
4.4.2. Effect of Different Sequence Lengths on Prediction Results
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. van der Aalst, W.; Adriansyah, A.; de Medeiros, A.; Arcieri, F.; Wynn, M.T. Process Mining Manifesto; Springer: Berlin/Heidelberg, Germany, 2011; pp. 169–194.
2. Rama-Maneiro, E.; Vidal, J.; Lama, M. Deep learning for predictive business process monitoring: Review and benchmark. IEEE Trans. Serv. Comput. 2021; early access.
3. Becker, J.; Breuker, D.; Delfmann, P.; Matzner, M. Designing and Implementing a Framework for Event-based Predictive Modelling of Business Processes. In Proceedings of the International Workshop on Enterprise Modeling and Information Systems Architectures, Luxembourg, 25–26 September 2014.
4. Rogge-Solti, A.; Weske, M. Prediction of business process durations using non-Markovian stochastic Petri nets. Inf. Syst. 2015, 54, 1–14.
5. Lakshmanan, G.T.; Shamsi, D.; Doganata, Y.N.; Unuvar, M.; Khalaf, R. A Markov prediction model for data-driven semi-structured business processes. Knowl. Inf. Syst. 2015, 42, 97–126.
6. Cabanillas, C.; Di Ciccio, C.; Mendling, J.; Baumgrass, A. Predictive task monitoring for business processes. In International Conference on Business Process Management; Sadiq, S., Soffer, P., Völzer, H., Eds.; Springer: Cham, Switzerland, 2014; Volume 8659, pp. 424–432.
7. Di Francescomarino, C.; Dumas, M.; Maggi, F.M.; Teinemaa, I. Clustering-Based Predictive Process Monitoring. IEEE Trans. Serv. Comput. 2016, 12, 896–909.
8. Evermann, J.; Rehse, J.R.; Fettke, P. A Deep Learning Approach for Predicting Process Behaviour at Runtime. In International Conference on Business Process Management; Dumas, M., Fantinato, M., Eds.; Springer: Cham, Switzerland, 2017; Volume 281, pp. 327–338.
9. Ni, W.J.; Sun, Y.J.; Liu, T. Business process remaining time prediction using bidirectional recurrent neural networks with attention. Comput. Integr. Manuf. Syst. 2020, 26, 1564–1572.
10. Tax, N.; Verenich, I.; La Rosa, M.; Dumas, M. Predictive Business Process Monitoring with LSTM Neural Networks. In International Conference on Advanced Information Systems Engineering; Dubois, E., Pohl, K., Eds.; Springer: Cham, Switzerland, 2017; Volume 10253, pp. 477–492.
11. Hinkka, M.; Lehto, T.; Heljanko, K.; Jung, A. Classifying Process Instances Using Recurrent Neural Networks. In International Conference on Business Process Management; Daniel, F., Sheng, Q., Motahari, H., Eds.; Springer: Cham, Switzerland, 2018; Volume 342, pp. 313–324.
12. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. Adv. Neural Inf. Process. Syst. 2017, 30, 5998–6008.
13. Ma, X.; Zhang, P.; Zhang, S.; Duan, N.; Zhou, M. A tensorized transformer for language modeling. In Proceedings of the 33rd International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019.
14. Sun, F.; Liu, J.; Wu, J.; Pei, C.; Lin, X.; Ou, W.; Jiang, P. BERT4Rec: Sequential recommendation with bidirectional encoder representations from transformer. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, Beijing, China, 3–7 November 2019; pp. 1441–1450.
15. Han, K.; Wang, Y.; Chen, H.; Chen, X.; Guo, J.; Liu, Z.; Tang, Y.; Xiao, A.; Xu, C.; Xu, Y.; et al. A survey on vision transformer. IEEE Trans. Pattern Anal. Mach. Intell. 2022; early access.
16. Bukhsh, Z.A.; Saeed, A.; Dijkman, R.M. ProcessTransformer: Predictive Business Process Monitoring with Transformer Network. arXiv 2021, arXiv:2104.00721.
17. Li, L.; Wen, L.; Wang, J. MM-Pred: A Deep Predictive Model for Multi-attribute Event Sequence. In Proceedings of the 2019 SIAM International Conference on Data Mining, Calgary, AB, Canada, 2–4 May 2019; pp. 118–126.
18. Schönig, S.; Jasinski, R.; Ackermann, L.; Jablonski, S. Deep Learning Process Prediction with Discrete and Continuous Data Features. In Proceedings of the 13th International Conference on Evaluation of Novel Approaches to Software Engineering, Funchal, Portugal, 23–24 March 2018; pp. 314–319.
19. Sato, D.M.V.; Freitas, D.S.C.; Barddal, J.P.; Scalabrin, E.E. A survey on concept drift in process mining. ACM Comput. Surv. 2021, 54, 1–38.
20. Teinemaa, I.; Dumas, M.; La Rosa, M.; Maggi, F.M. Outcome-oriented predictive process monitoring: Review and benchmark. ACM Trans. Knowl. Discov. Data 2019, 13, 1–57.
21. Pasquadibisceglie, V.; Appice, A.; Castellano, G.; Malerba, D. Using Convolutional Neural Networks for Predictive Process Analytics. In Proceedings of the IEEE International Conference on Process Mining, Aachen, Germany, 24–26 June 2019; pp. 129–136.
22. Khan, A.; Le, H.; Do, K.; Tran, T.; Ghose, A.; Dam, H.; Sindhgatta, R. Memory-Augmented Neural Networks for Predictive Process Analytics. arXiv 2018, arXiv:1802.00938v1.
23. Camargo, M.; Dumas, M.; González-Rojas, O. Learning Accurate LSTM Models of Business Processes. In International Conference on Business Process Management; Hildebrandt, T., van Dongen, B., Röglinger, M., Mendling, J., Eds.; Springer: Cham, Switzerland, 2019; Volume 11675, pp. 286–302.
24. Liu, T.; Ni, W.; Sun, Y.; Zeng, Q. Predicting Remaining Business Time with Deep Transfer Learning. Data Anal. Knowl. Discov. 2020, 4, 134–142.
25. Mehdiyev, N.; Evermann, J.; Fettke, P. A Multi-stage Deep Learning Approach for Business Process Event Prediction. In Proceedings of the IEEE 19th Conference on Business Informatics, Thessaloniki, Greece, 24–27 July 2017; pp. 119–128.
26. Ni, W.; Yan, M.; Liu, T.; Zeng, Q. Predicting remaining execution time of business process instances via auto-encoded transition system. Intell. Data Anal. 2022, 26, 543–562.
27. Philipp, P.; Jacob, R.; Robert, S.; Beyerer, J. Predictive Analysis of Business Processes Using Neural Networks with Attention Mechanism. In Proceedings of the IEEE International Conference on Artificial Intelligence in Information and Communication, Fukuoka, Japan, 19–21 February 2020; pp. 225–230.
28. Hoeffding, W. Probability inequalities for sums of bounded random variables. J. Am. Stat. Assoc. 1963, 58, 13–30.
29. Frías-Blanco, I.; del Campo-Ávila, J.; Ramos-Jiménez, G.; Morales-Bueno, R.; Ortiz-Díaz, A.; Caballero-Mota, Y. Online and non-parametric drift detection methods based on Hoeffding's bounds. IEEE Trans. Knowl. Data Eng. 2014, 27, 810–823.
30. Ankerst, M.; Breunig, M.M.; Kriegel, H.P.; Sander, J. OPTICS: Ordering points to identify the clustering structure. In Proceedings of the ACM SIGMOD International Conference on Management of Data, New York, NY, USA, 31 May–3 June 1999; Volume 28, pp. 49–60.
Statistics of the event logs used in the experiments:

Event Log | No. of Cases | No. of Events | No. of Activities | Avg. Length | Max. Length | Avg. Duration (Days) | Max. Duration (Days) |
---|---|---|---|---|---|---|---|
BPI2012 | 13,087 | 262,200 | 24 | 20 | 175 | 8.6 | 137.2 |
BPI2012W | 9658 | 170,107 | 7 | 17.6 | 156 | 10.5 | 132 |
BPI2013i | 568 | 6499 | 13 | 11.4 | 123 | 11.9 | 768 |
BPI2015 | 1199 | 27,409 | 38 | 22.8 | 61 | 95.9 | 1486 |
BPI2017 | 31,509 | 1,202,267 | 26 | 35 | 180 | 18 | 395 |
Hospital | 3494 | 97,794 | 18 | 27.9 | 217 | 127.2 | 1034 |
Sepsis | 1049 | 15,214 | 16 | 14.5 | 185 | 28.5 | 422.3 |
Next activity prediction accuracy (%):

Dataset | LSTM [10] | GRU [11] | CNN [21] | DNC [22] | Transformer [16] | HiP-Transformer |
---|---|---|---|---|---|---|
BPI2012 | 85.46 | 86.65 | 83.25 | 77.70 | 85.20 | 86.30 |
BPI2012W | 85.35 | 84.78 | 81.19 | 60.15 | 89.30 | 89.88 |
BPI2013i | 70.09 | 74.69 | 46.03 | 51.91 | 62.11 | 79.33 |
BPI2015 | 68.58 | 71.02 | 58.43 | 70.56 | 71.98 | 73.02 |
BPI2017 | 83.15 | 84.25 | 78.45 | 84.75 | 81.88 | 85.88 |
Hospital | 78.50 | 79.46 | 75.89 | 70.26 | 80.98 | 84.70 |
Sepsis | 64.22 | 63.50 | 56.15 | 21.01 | 45.98 | 62.54 |
Avg. | 76.48 | 77.76 | 68.48 | 62.33 | 73.92 | 80.24 |
Mean absolute error (MAE) of remaining time prediction:

Dataset | LSTM [10] | Accurate-LSTM [23] | Transformer [16] | HiP-Transformer |
---|---|---|---|---|
BPI2012 | 383.00 | 29.95 | 4.60 | 3.89 |
BPI2012W | 157.05 | 30.484 | 4.87 | 5.56 |
BPI2013i | 30.08 | 28.13 | 8.36 | 4.17 |
BPI2015 | 42.36 | 40.10 | 28.42 | 24.88 |
BPI2017 | 127.80 | 16.85 | 10.28 | 9.93 |
Hospital | 28.30 | 38.77 | 24.87 | 14.28 |
Sepsis | 421.60 | 17.82 | 19.47 | 17.26 |
Avg. | 170.02 | 28.87 | 14.41 | 11.4 |