Translating Workflow Nets to Process Trees: An Algorithmic Approach

Since their recent introduction, process trees have been frequently used as a process modeling formalism in many process mining algorithms. A process tree is a tree-based model of a process, in which internal vertices represent behavioral control-flow relations and leaves represent process activities. A process tree is easily translated into a sound Workflow net (WF-net); the reverse, however, is not the case. Yet, an algorithm that translates a WF-net into a process tree is of great interest, e.g., explicit knowledge of the control-flow hierarchy in a WF-net allows one to reason more easily about its behavior. Hence, in this paper, we present such an algorithm, i.e., it detects whether a WF-net corresponds to a process tree and, if so, constructs it. We prove that, if a process tree is discovered, the language of the process tree equals the language of the original WF-net. Experiments show that the corresponding implementation has quadratic time complexity in the size of the WF-net. Furthermore, the experiments show strong evidence of process tree rediscoverability.


Introduction
The research field of process mining [4] is concerned with distilling knowledge about the execution of processes by analyzing the event data generated during their execution, i.e., as stored in modern-day information systems. In the field, different semi-automated techniques have been developed that are able to distill process knowledge, ranging from automated process discovery algorithms to conformance checking algorithms. As processes are the cornerstone of process mining, so are the models that allow us to represent them and/or reason about their behavior/quality. Various process modeling formalisms exist, e.g., BPMN [8], EPCs [3], etc., some of which are heavily used in industry.
Recently, process trees were introduced [5]. A process tree is a hierarchical representation of a process, corresponding to a rooted tree, i.e., a connected, undirected acyclic graph with a designated root vertex. The internal vertices represent how their children are related to each other in terms of control-flow behavior. The leaves of the tree represent the activities of the process. Consider Fig. 1b, in which we depict an example process tree. Its root vertex has label →, specifying that first its left-most child, i.e., activity a, needs to be executed, then its middle child, and finally its right-most child.
Process trees are easily translated into other process modeling formalisms, e.g., Workflow nets (WF-nets). By definition, a process tree corresponds to a sound WF-net, i.e., a WF-net with desirable behavioral properties, e.g., the absence of deadlocks. The reverse, i.e., given a WF-net, translating it into a process tree (if possible), is less trivial. At the same time, obtaining such a translation is of interest, as it allows us to discover control-flow aware hierarchical structures within a WF-net. Such structures can, for example, be used to hide certain parts of the model, leading to a more comprehensible view of the process model. Furthermore, any algorithm optimized for process trees, e.g., by exploiting the hierarchical structure, is applicable to WF-nets of such a type. For example, in [12], it is shown that the computation time of alignments [17], i.e., explanations of observed behavior in terms of a reference model, can be significantly reduced by applying Petri net decomposition, e.g., driven by model hierarchies.
In this paper, we present an algorithm that determines whether a given WF-net corresponds to a process tree and, if so, constructs it. We prove that, if a process tree is found, the original WF-net is sound, and that the obtained process tree's language is equal to the language of the original WF-net. A corresponding implementation, extending the process mining framework PM4Py [7], is publicly available. Using the implementation, we conducted several experiments, which show a quadratic time complexity in terms of the WF-net size. Furthermore, our experiments indicate rediscoverability of process trees.
The remainder of this paper is structured as follows. In Section 2, we present preliminary concepts and notation. In Section 3, we present the proposed algorithm, including the proofs w.r.t. soundness preservation and language preservation. In Section 4, we evaluate our approach. In Section 5, we discuss related work. Section 6 concludes the paper.

Workflow Nets
Workflow nets (WF-nets) [2] extend the more general notion of Petri nets [14]. A Petri net is a directed bipartite graph containing two types of vertices, i.e., places and transitions. Places are visualized as circles, transitions are visualized as boxes. Places only connect to transitions, and vice versa. Consider Fig. 1a, depicting an example Petri net (which is also a WF-net). We let N = (P, T, F, ℓ) denote a (labelled) Petri net, where P denotes a set of places, T denotes a set of transitions and F ⊆ (P×T) ∪ (T×P) represents the arcs. Furthermore, given a set of labels Σ and a symbol τ ∉ Σ, ℓ: T → Σ∪{τ} is the net's labelling function, e.g., in Fig. 1a, ℓ(t_1) = a, ℓ(t_2) = b, etc. Given an element x ∈ P∪T, •x = {y | (y, x) ∈ F} denotes the pre-set of x, whereas x• = {y | (x, y) ∈ F} denotes its post-set, e.g., in [...].
The state of a Petri net is expressed by means of a marking, i.e., a multiset of places. A marking is visualized by drawing the corresponding number of dots in the place(s) of the marking, e.g., the marking in Fig. 1a [...] denotes the reachable markings.
Given a Petri net N = (P, T, F, ℓ) and designated initial and final marking [...]. A WF-net is a special type of Petri net, i.e., it has one unique start place and one unique end place. Furthermore, every place/transition in the net is on a path from the start place to the end place.
We let 𝒲 ⊂ 𝒩 denote the universe of WF-nets.
Of particular interest are sound WF-nets, i.e., WF-nets that are, by definition, guaranteed to be free of deadlocks and livelocks. We formalize the notion of soundness in Definition 2.
Observe that the Petri net depicted in Fig. 1a is a sound WF-net.
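The pre-set/post-set notation and the notion of a marking as a multiset of places can be made concrete with a few lines of Python. The following is a minimal sketch under stated assumptions: the arc set `F` describes a small hypothetical net (not the net of Fig. 1a), the function names are illustrative, and the firing rule is the standard Petri net one (consume one token from every pre-set place, produce one in every post-set place).

```python
from collections import Counter

# Arc relation of a tiny hypothetical net: p_i -> t1 -> p1 -> t2 -> p_o
F = {("p_i", "t1"), ("t1", "p1"), ("p1", "t2"), ("t2", "p_o")}


def preset(x, arcs):
    """Pre-set of x: all y with an arc (y, x)."""
    return {y for (y, z) in arcs if z == x}


def postset(x, arcs):
    """Post-set of x: all y with an arc (x, y)."""
    return {z for (y, z) in arcs if y == x}


def enabled(t, marking, arcs):
    """A transition is enabled if every place in its pre-set holds a token."""
    return all(marking[p] >= 1 for p in preset(t, arcs))


def fire(t, marking, arcs):
    """Fire t: remove a token from each pre-set place, add one to each post-set place."""
    if not enabled(t, marking, arcs):
        raise ValueError(f"transition {t} is not enabled")
    new_marking = Counter(marking)
    new_marking.subtract(preset(t, arcs))
    new_marking.update(postset(t, arcs))
    return +new_marking  # unary plus drops places with zero tokens


initial = Counter({"p_i": 1})          # the marking [p_i]
after_t1 = fire("t1", initial, F)      # the marking [p1]
```

Using `Counter` keeps the multiset reading of a marking explicit; a place that holds several tokens simply has a count greater than one.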

Process Trees
Process trees allow us to model processes that comprise a control-flow hierarchy. A process tree is a mathematical tree, where the internal vertices are operators and the leaves are (possibly non-observable) activities. Operators specify how their children, i.e., sub-trees, need to be combined from a control-flow perspective.
Definition 3 (Process Tree). Let Σ denote the universe of (activity) labels and let τ ∉ Σ. Let ⨁ denote the universe of process tree operators. A process tree Q is defined (recursively) as any of: (1) x ∈ Σ∪{τ}, i.e., a (non-observable) activity; (2) ⊕(Q_1, ..., Q_n), where ⊕ ∈ ⨁ and Q_1, ..., Q_n are process trees. We let 𝒬 denote the universe of process trees.
Several operators (elements of ⨁) can be defined; however, in this work, we focus on four basic operators, i.e., the →, ×, ∧ and ⟲-operator. The sequence operator (→) specifies sequential behavior. First its left-most child is executed, then its second left-most child, ..., and finally its right-most child. For example, the root operator in Fig. 1b specifies that first activity a is executed, then its second sub-tree (⟲) and then its third sub-tree (×). The exclusive choice operator (×) specifies an exclusive choice, i.e., one (and exactly one) of its sub-trees is executed. Concurrent/parallel behavior is represented by the concurrency operator (∧), i.e., all children are executed simultaneously/in any order. Repeated behavior is represented by the loop operator (⟲). The →, × and ∧-operator have an arbitrary number of children. The ⟲-operator has two children. Its left child (the "do-child") is always executed. Subsequently, executing its right child (the "re-do-child") is optional. After executing the re-do-child, we again execute the do-child. We are allowed to repeat this, yet we always finish with the do-child.
Given a process tree Q ∈ 𝒬, its language is of the form L_𝒬(Q) ⊆ Σ*.
Definition 4 (Process Tree Language). Let Q ∈ 𝒬 be a process tree. The language of Q, i.e., L_𝒬(Q) ⊆ Σ*, is defined recursively as: [...]
The process tree operators considered (→, ×, ∧ and ⟲) are easily translated to sound WF-nets, cf. Fig. 2. Hence, we define a generic process tree to WF-net translation function, s.t. the language of the two is the same.
Definition 5 (Process Tree Transformation Function). Let Q ∈ 𝒬 be a process tree. A process tree transformation function λ is a function λ: 𝒬 → 𝒲, s.t. [...]
Given an arbitrary process tree Q ∈ 𝒬, there are several ways to translate it to a (sound) WF-net W, s.t. L^ν_𝒩(W) = L_𝒬(Q), i.e., instantiating λ and λ̂. As an example, consider the translation functions depicted in Fig. 2. Note that each transformation function in Fig. 2 is sound by construction. Hence, we deduce that their recursive composition is also sound [1, Theorem 3.3].
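To illustrate the operator semantics described in this section, the following sketch enumerates a bounded part of a process tree's language. It is an illustration under assumptions, not the formal definition: trees are encoded as nested tuples, the ASCII names `"->"`, `"X"`, `"+"` and `"*"` stand in for →, ×, ∧ and ⟲, `None` stands in for τ, and the parameter `max_redo` (hypothetical) caps the number of loop repetitions so that the enumerated set stays finite.

```python
from itertools import product

TAU = None  # stand-in for the silent activity


def shuffle(u, v):
    """All interleavings of the traces u and v."""
    if not u:
        return {v}
    if not v:
        return {u}
    return ({(u[0],) + w for w in shuffle(u[1:], v)}
            | {(v[0],) + w for w in shuffle(u, v[1:])})


def language(tree, max_redo=1):
    """Traces of `tree`, with every loop repeated at most `max_redo` times."""
    if tree is TAU:
        return {()}                      # tau contributes the empty trace
    if isinstance(tree, str):
        return {(tree,)}                 # a single visible activity
    op, *children = tree
    langs = [language(child, max_redo) for child in children]
    if op == "->":                       # sequence: concatenate left to right
        return {sum(parts, ()) for parts in product(*langs)}
    if op == "X":                        # exclusive choice: union
        return set().union(*langs)
    if op == "+":                        # concurrency: all interleavings
        result = {()}
        for lang in langs:
            result = {w for u in result for v in lang for w in shuffle(u, v)}
        return result
    if op == "*":                        # loop: do (redo do)^k, k <= max_redo
        do, redo = langs
        result, frontier = set(do), set(do)
        for _ in range(max_redo):
            frontier = {u + r + d for u in frontier for r in redo for d in do}
            result |= frontier
        return result
    raise ValueError(f"unknown operator {op!r}")


# The process tree ->(a, loop(->(and(x(b, c), d), e), f), x(g, h))
EXAMPLE_TREE = ("->", "a",
                ("*", ("->", ("+", ("X", "b", "c"), "d"), "e"), "f"),
                ("X", "g", "h"))
```

With `max_redo=0` the loop is executed exactly once, which yields eight traces for `EXAMPLE_TREE` (four orderings of the concurrent block times two final choices); each additional permitted repetition adds traces in which f is followed by another pass through the do-child.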

Translating Workflow Nets to Process Trees
In this section, we describe our main approach. In Section 3.1, we sketch the main idea of the approach. In Section 3.2, we present PTree-nets, i.e., Petri nets with rng(ℓ) = 𝒬, which we exploit in our approach. In Section 3.3, we present Petri net fragments, used to identify process tree operators within the net, together with a generic reduction function. Finally, in Section 3.4, we provide an algorithmic description that allows us to find process trees, including correctness proofs.

Overview
The core idea of the approach is to search for fragments in the given WF-net that represent behavior expressible as a process tree. The patterns we look for bear great similarity to the translation patterns defined in Fig. 2, i.e., they can be considered a generalized reverse of those patterns. When we find a pattern, we replace it with a smaller net fragment that represents the process tree that was just found. We continue to search for patterns in the reduced net until we are not able to find any more patterns. As we prove in Section 3.4, in case the final WF-net contains just one transition, its label carries a process tree with the same labelled language as the original WF-net.
Consider Fig. 3, in which we sketch the basic idea of the algorithm, applied to the example WF-net W_1 (Fig. 1a). First, the algorithm detects two choice constructs, i.e., one between the transitions labelled b and c, and one between the transitions labelled g and h. The algorithm replaces the fragments by two new transitions, carrying labels ×(b, c) and ×(g, h), respectively (Fig. 3a). Subsequently, a parallel construct is detected, i.e., between the transitions labelled ×(b, c) and d. Again, the pattern is replaced (Fig. 3b). A sequential pattern is detected and replaced (Fig. 3c), after which a loop construct is detected (Fig. 3d). The resulting process tree, i.e., carried by the remaining transition in Fig. 3e, →(a, ⟲(→(∧(×(b, c), d), e), f), ×(g, h)), is equal to Fig. 1b.

PTree-Nets and their Unfolding
The core idea of the proposed algorithm is to find Petri net fragments in the WF-net that represent behavior equivalent to a process tree. As illustrated in Section 3.1, the patterns found in the WF-net are replaced by transitions with a label carrying a corresponding process tree. In the upcoming section, we present four different fragment characterizations, corresponding to the basic process tree operators considered. In this section, however, we first briefly present PTree-nets, i.e., a trivial extension of Petri nets in which labels are process trees.
Definition 6 (Process Tree-Labelled Petri net (PTree-net)). Let 𝒬 denote the universe of process trees. Let P̃ denote a set of places, let T̃ denote a set of transitions and let F̃ ⊆ (P̃×T̃) ∪ (T̃×P̃) denote the arc relation. Let ℓ̃: T̃ → 𝒬. [...] i.e., the definition of L_𝒬 ignores τ ∉ Σ.
Clearly, since PTree-nets extend the labelling function to 𝒬, PTree-System-nets and PTree-WF-nets are readily defined. We let 𝒮𝒩_𝒬 and 𝒲_𝒬 represent their respective universes.
Since a PTree-net contains process trees as labels, which can be translated into a Petri net fragment, we define a PTree-net unfolding, cf. Definition 7, which maps a PTree-net onto a corresponding conventional Petri net. Let λ(ℓ̃(t)) = (P_t, T_t, F_t, p_i^t, p_o^t, ℓ_t) and λ̂(ℓ̃(t)) = (P̂_t, T̂_t, F̂_t, ℓ̂_t), ∀t ∈ T̃, [...]
Note that the WF-net in Fig. 1a is the unfolding of all PTree-WF-nets in Fig. 3.

Pattern Reduction
In this section, we describe four patterns, used to identify and replace process tree behavior. Furthermore, we propose a corresponding overarching reduction function, which shows how to reduce a PTree-WF-net containing any of these patterns. However, first, we present the general notion of a feasible pattern.
In the upcoming paragraphs, we characterize an instantiation of a feasible ⊕-pattern for each operator considered.
Sequential Pattern. The →-operator, i.e., →(Q_1, ..., Q_n), describes sequential behavior; hence, any subnet describing strictly sequential behavior describes the same language. If a transition t_1 always, uniquely, enables transition t_2, which in turn enables transition t_3, ..., t_n, and, whenever t_1 has fired, the only way to consume all tokens from t_1• is by firing t_2, and similarly, the only way to consume all tokens from t_2• is by firing t_3, etc., then t_1, ..., t_n are in a sequential relation. We visualize the →-pattern in the left-hand side of Fig. 4a.
Definition 9 (→-Pattern). Let Ñ = (P̃, T̃, F̃, ℓ̃) ∈ 𝒩_𝒬 and let {t_1, ..., t_n} ⊆ T̃ (n ≥ 2). If and only if: [...]
Parallel Pattern. The parallel pattern is the most complicated pattern. In such a fragment, inference between its transitions is possible. This is achieved by requiring that the pre-sets and post-sets of the transitions do not have any overlap. Furthermore, the pre-set of the transitions' pre-set places needs to be shared by all of these places, and, symmetrically, the post-set of the transitions' post-set places needs to be shared by all of these places. That is, the enabling of the transitions in the pattern needs to be the same, and their post-sets should jointly block any further action until all places in their joint post-set are marked.
Pattern Reduction. The reduction rules for the patterns defined in the previous paragraphs, depicted in the right-hand side of Fig. 4, are very similar. Except for the →-pattern, we "copy" all places into the new PTree-net; for the →-pattern, we remove the places inter-connecting transitions t_1, ..., t_n. The transitions involved in the pattern (and their connecting arcs) are removed. A new transition is inserted, with a label depending on the pattern found. The connecting arcs of the newly added transition differ slightly per pattern.
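The reduction step described above can be sketched for the simplest case, a binary sequential pattern. The code below is an illustration under assumptions rather than a transcription of Definition 9: it uses a simplified structural condition (the post-set of t_1 equals the pre-set of t_2, and the places in between connect to no other transition), a hypothetical net encoding (a set of places, a dictionary from transition names to labels, a set of arcs), and the ASCII tag `"->"` for the sequence operator.

```python
def pre(x, arcs):
    return {y for (y, z) in arcs if z == x}


def post(x, arcs):
    return {z for (y, z) in arcs if y == x}


def find_sequence(labels, arcs):
    """Return a pair (t1, t2) satisfying the simplified binary condition, or None."""
    for t1 in sorted(labels):
        for t2 in sorted(labels):
            if t1 == t2:
                continue
            middle = post(t1, arcs)
            if (middle and middle == pre(t2, arcs)
                    and all(pre(p, arcs) == {t1} and post(p, arcs) == {t2}
                            for p in middle)):
                return t1, t2
    return None


def reduce_sequence(places, labels, arcs, t1, t2):
    """Replace t1 and t2 by one transition labelled ('->', label(t1), label(t2))."""
    middle = post(t1, arcs)
    new_t = f"{t1};{t2}"
    new_places = places - middle                      # drop inter-connecting places
    new_labels = {t: l for t, l in labels.items() if t not in (t1, t2)}
    new_labels[new_t] = ("->", labels[t1], labels[t2])
    new_arcs = {(x, y) for (x, y) in arcs
                if x not in (t1, t2) and y not in (t1, t2)}
    new_arcs |= {(p, new_t) for p in pre(t1, arcs)}   # inherit pre-set of t1
    new_arcs |= {(new_t, p) for p in post(t2, arcs)}  # inherit post-set of t2
    return new_places, new_labels, new_arcs


# Hypothetical two-transition net: p_i -> t1 (a) -> p1 -> t2 (b) -> p_o
places = {"p_i", "p1", "p_o"}
labels = {"t1": "a", "t2": "b"}
arcs = {("p_i", "t1"), ("t1", "p1"), ("p1", "t2"), ("t2", "p_o")}

match = find_sequence(labels, arcs)
reduced = reduce_sequence(places, labels, arcs, *match)
```

The other patterns follow the same shape (detect, remove the transitions involved, insert one transition with a composed label), but keep all places, in line with the description in the text.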

Algorithm
Here, we present an algorithm that translates a WF-net into a process tree. We prove that, if the algorithm finds a process tree, the input WF-net is sound. Moreover, we show that the language of the input WF-net equals the language of the process tree found. Consider Alg. 1 (WF-net Reduction), in which we present an algorithmic description of the reduction algorithm. As input, the algorithm takes any WF-net W. Initially, the elements of W, excluding the initial and final place, are copied into variable Ñ. In case a pattern of the form θ_⊕(Ñ, SN) is found in Ñ, the corresponding reduction Θ_⊕(Ñ, SN) is applied (line 3). If no more patterns are found, the algorithm returns (Ñ, p_i, p_o, ℓ̃).
In case the obtained PTree-WF-net consists of just one transition, i.e., connected to place p_i (incoming) and place p_o (outgoing), cf. Fig. 3e, the label of the transition represents a process tree describing the same language as the original WF-net. Furthermore, we are able to conclude that the original WF-net is, in fact, a sound WF-net. We prove these observations in Theorem 1; prior to this, however, we present two useful lemmas. In Lemma 1, we prove that the proposed reduction rules are bidirectionally soundness preserving, i.e., if a PTree-WF-net is sound, then the reduced PTree-WF-net is also sound, and vice versa. In Lemma 2, we prove that, if we are able, from the initial marking [p_i], to enable the observed fragment (enabling differs per fragment), then the language of the original net and the reduced net is equal (and vice versa). Observe that, trivially, the reduction rules applied to a PTree-WF-net yield a PTree-WF-net, i.e., none of the requirements of Definition 1 are violated in the resulting net. [...]
Proof. Observe that W̃ is sound. Lemma 1 implies that, if we (continuously) revert the reductions applied by Alg. 1, all intermediate assignments of W̃ in Alg. 1 are sound. Observe that Lemma 2 proves that the language of the unfoldings of all the intermediate WF-nets found is the same as well. Since the labels of the initial WF-net are all members of Σ∪{τ}, their unfolding remains the same. Hence, we deduce L_𝒬(ℓ̃(t)) = L^ν_𝒩(W).
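The overall control flow of the reduction algorithm (apply a reduction while some pattern is found, then inspect the remaining net) can be sketched as follows. This is a minimal sketch under assumptions, not Alg. 1 itself: only simplified binary choice and sequence rules are included (the parallel and loop rules are omitted), the net is encoded as a dictionary mapping each transition name to a triple (label, pre-set, post-set), and all function names are hypothetical.

```python
def producers(net, place):
    return {t for t, (_, _, post) in net.items() if place in post}


def consumers(net, place):
    return {t for t, (_, pre, _) in net.items() if place in pre}


def try_choice(net):
    """Merge two transitions with identical pre-sets and post-sets (simplified)."""
    for t1 in sorted(net):
        for t2 in sorted(net):
            if t1 < t2 and net[t1][1:] == net[t2][1:]:
                (l1, pre, post), (l2, _, _) = net[t1], net[t2]
                rest = {t: v for t, v in net.items() if t not in (t1, t2)}
                rest[f"({t1}|{t2})"] = (("X", l1, l2), pre, post)
                return rest
    return None


def try_sequence(net):
    """Merge t1 and t2 if the places after t1 lead only, and exclusively, to t2."""
    for t1 in sorted(net):
        for t2 in sorted(net):
            (l1, pre1, post1), (l2, pre2, post2) = net[t1], net[t2]
            if (t1 != t2 and post1 and post1 == pre2
                    and all(producers(net, p) == {t1} and consumers(net, p) == {t2}
                            for p in post1)):
                rest = {t: v for t, v in net.items() if t not in (t1, t2)}
                rest[f"({t1};{t2})"] = (("->", l1, l2), pre1, post2)
                return rest
    return None


def reduce_net(net, rules=(try_choice, try_sequence)):
    """Apply reduction rules until none of them matches any more."""
    changed = True
    while changed:
        changed = False
        for rule in rules:
            reduced = rule(net)
            if reduced is not None:
                net, changed = reduced, True
                break
    return net


def extract_tree(net, source="p_i", sink="p_o"):
    """Return the process tree if exactly one transition from source to sink remains."""
    if len(net) == 1:
        (label, pre, post), = net.values()
        if pre == {source} and post == {sink}:
            return label
    return None


fs = frozenset
# Hypothetical WF-net: a, then a choice between b and c, then d
example = {
    "t1": ("a", fs({"p_i"}), fs({"p1"})),
    "t2": ("b", fs({"p1"}), fs({"p2"})),
    "t3": ("c", fs({"p1"}), fs({"p2"})),
    "t4": ("d", fs({"p2"}), fs({"p_o"})),
}
tree = extract_tree(reduce_net(example))
```

For the example net, the sketch yields the nested binary tree `("->", "a", ("->", ("X", "b", "c"), "d"))`. A net that needs a rule not included here (for instance, a concurrent block) is left unreduced, and `extract_tree` returns `None`.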

Evaluation
In this section, we evaluate the proposed algorithm. We briefly present the implementation, after which we discuss the experimental setup and the results.
Implementation. An implementation of Alg. 1 is available, built on top of the process mining framework PM4Py [7]. Note that the size of the patterns identified has no influence on the correctness of the algorithm. Hence, the implementation searches for binary patterns, yielding binary trees. Such a tree can be further reduced, e.g., →(Q_1, →(Q_2, →(Q_3, τ))) corresponds to →(Q_1, Q_2, Q_3).
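The post-processing of binary trees mentioned above can be sketched as a small recursive function. The sketch rests on assumptions that are not taken from the implementation: trees are nested tuples with the ASCII tags `"->"`, `"X"` and `"+"` for →, × and ∧, `"*"` for the loop operator, and `None` for τ; nested occurrences of the same associative operator are merged, and τ children are dropped only under sequence and concurrency, where they do not change the language.

```python
TAU = None  # stand-in for the silent activity
ASSOCIATIVE = {"->", "X", "+"}


def flatten(tree):
    """Merge nested occurrences of the same operator and drop redundant tau leaves."""
    if tree is TAU or isinstance(tree, str):
        return tree
    op, *children = tree
    children = [flatten(child) for child in children]
    if op in ASSOCIATIVE:
        merged = []
        for child in children:
            if isinstance(child, tuple) and child[0] == op:
                merged.extend(child[1:])      # lift grandchildren one level up
            else:
                merged.append(child)
        if op in {"->", "+"}:
            # tau is neutral for sequence and concurrency, but not for choice
            merged = [child for child in merged if child is not TAU]
            if not merged:
                return TAU
        if len(merged) == 1:
            return merged[0]
        children = merged
    return (op, *children)


nested = ("->", "Q1", ("->", "Q2", ("->", "Q3", TAU)))
flat = flatten(nested)  # ("->", "Q1", "Q2", "Q3")
```

Loop operators are left as they are, since the do-child and re-do-child play different roles. Comparing trees up to the order of children under × and ∧ would additionally require sorting those children, which is not done in this sketch.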
Experimental Setup. Here, we briefly discuss the experimental setup. Consider Fig. 5, in which we present a graphical overview. Using an implementation of PTandLogGenerator [9, 10], we generate process trees, using two triangular distributions for the number of activities, i.e., {10, 20, 30} and {40, 50, 60}. The process trees are translated to WF-nets, using two different translations. One translation creates invisible start and end transitions for each operator; the other translation only does so when required (similar to Fig. 2). The first translation generates larger nets in terms of transitions/places/arcs. For each triangular distribution/translation combination, we generate 50,000 process trees (yielding 200,000 experiments). Finally, we compare the generated process tree in canonical form to the resulting process tree in canonical form.
Results. Here, we briefly discuss the results of the conducted experiments. Consider Fig. 6, in which we present the average time performance of the implementation. We plot the time performance against the size of the input WF-net. Additionally, we plot a polynomial trend-line, computed using polynomial least squares. As is observable in Fig. 6, the time performance is quadratic in the size of the net (|P| + |T|). This is confirmed by the R²-score of the trend-line, i.e., ∼0.988. In all experiments, the canonical form of the generated process tree equals the canonical form of the (re)discovered process tree.
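The trend-line analysis described above (a quadratic least-squares fit and its R²-score) can be reproduced in a few lines. The sketch below is pure Python and uses synthetic, noise-free data generated from a known quadratic purely to exercise the code; the numbers are not measurements from the experiments, and the function names are hypothetical.

```python
def fit_quadratic(xs, ys):
    """Least-squares coefficients (c0, c1, c2) of y ~ c0 + c1*x + c2*x^2."""
    # Normal equations for the basis (1, x, x^2), stored as an augmented 3x4 matrix
    power_sums = [sum(x ** k for x in xs) for k in range(5)]
    rhs = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [[power_sums[i + j] for j in range(3)] + [rhs[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def r_squared(xs, ys, coeffs):
    """Coefficient of determination of the fitted polynomial."""
    mean_y = sum(ys) / len(ys)
    predicted = [sum(c * x ** k for k, c in enumerate(coeffs)) for x in xs]
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Synthetic data: net sizes and run times following an exact quadratic
sizes = list(range(1, 11))
times = [0.5 + 0.1 * n + 0.002 * n * n for n in sizes]

coefficients = fit_quadratic(sizes, times)
score = r_squared(sizes, times, coefficients)
```

On real measurements, an R²-score close to 1 for the quadratic fit, together with a clearly worse score for a linear fit, is what supports reading the trend as quadratic; a high score for the quadratic alone does not rule out lower-order growth.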

Conclusion
In this paper, we presented an algorithm to construct a process tree on the basis of a Workflow net (WF-net). The proposed algorithm replaces fragments of the WF-net that correspond to a process tree operator by means of reduction rules. If the algorithm reduces the WF-net to a net containing just one transition, there exists a corresponding process tree for the given WF-net, with the same language. The reduction rules proposed are bidirectionally soundness preserving; hence, in case a process tree is found, the original WF-net is sound. We have conducted experiments using a prototypical implementation, indicating quadratic time complexity in the size of the net, as well as process tree rediscoverability.
Future Work. We aim to provide diagnostics w.r.t. the reason why a given WF-net cannot be reduced further, e.g., by assessing whether removal of certain elements of the WF-net allows for further reduction. Alternatively, it is interesting to "wrap" certain fragments of the net into an encapsulating transition, after which the search for process tree fragments is continued. Another interesting direction is the search for structural properties of WF-nets that directly indicate whether a given WF-net corresponds to a process tree.